Keywords

1 Introduction

Many applications of automated deduction require reasoning with respect to a combination of a background theory, say integer arithmetic, and a foreground theory that extends the background theory by new sorts such as \( list \), new operators, such as \( cons : int \times list \rightarrow list \) and \( length : list \rightarrow int \), and first-order axioms. Developing corresponding automated reasoning systems that are also able to deal with quantified formulas has recently been an active area of research. One major line of research is concerned with extending (SMT-based) solvers [24] for the quantifier-free case by instantiation heuristics for quantifiers [17, 18, e. g.]. Another line of research is concerned with adding black-box reasoners for specific background theories to first-order automated reasoning methods (resolution [1, 5, 19], sequent calculi [26], instantiation methods [8, 9, 16], etc). In both cases, a major unsolved research challenge is to provide reasoning support that is “reasonably complete” in practice, so that the systems can be used more reliably for both proving theorems and finding counterexamples.

In [5], Bachmair, Ganzinger, and Waldmann introduced the hierarchical superposition calculus as a generalization of the superposition calculus for black-box style theory reasoning. Their calculus works in a framework of hierarchic specifications. It tries to prove the unsatisfiability of a set of clauses with respect to interpretations that extend a background model such as the integers with linear arithmetic conservatively, that is, without identifying distinct elements of old sorts (“confusion”) and without adding new elements to old sorts (“junk”). While confusion can be detected by first-order theorem proving techniques, junk can not – in fact, the set of logical consequences of a hierarchic specifications is usually not recursively enumerable. Refutational completeness can therefore only be guaranteed if one restricts oneself to sets of formulas where junk can be excluded a priori. The property introduced by Bachmair, Ganzinger, and Waldmann for this purpose is called “sufficient completeness with respect to simple instances”. Given this property, their calculus is refutationally complete for clause sets that are fully abstracted (i. e., where no literal contains both foreground and background symbols). Unfortunately their full abstraction rule may destroy sufficient completeness with respect to simple instances. We show that this problem can be avoided by using a new form of clause abstraction and a suitably modified hierarchical superposition calculus. Since the new calculus is still refutationally complete and the new abstraction rule is guaranteed to preserve sufficient completeness with respect to simple instances, the new combination is strictly more powerful than the old one.

In practice, sufficient completeness is a rather restrictive property. While there are application areas where one knows in advance that every input is sufficiently complete, in most cases this does not hold. As a user of an automated theorem prover, one would like to see a best effort behavior: The prover might for instance try to make the input sufficiently complete by adding further theory axioms. In the calculus from [5], this does not help at all: The restriction to a particular kind of instantiations (“simple instances”) renders theory axioms essentially unusable in refutations. We show that this can be prevented by introducing two kinds of variables of the background theory sorts, that can be instantiated in different ways, making our calculus significantly “more complete” in practice. We also include a definition rule in the calculus that can be used to establish sufficient completeness by linking foreground terms to background parameters, thus allowing the background prover to reason about these terms.

The following trivial example demonstrates the problem. Consider the clause set \(N = \{ C\}\) where \(C = \mathsf {{f}}(1) < \mathsf {{f}}(1)\). Assume that the background theory is integer arithmetic and that \(\mathsf {{f}}\) is an integer-sorted operator from the foreground (free) signature. Intuitively, one would expect N to be unsatisfiable. However, N is not sufficiently complete, and it admits models in which \(\mathsf {{f}}(1)\) is interpreted as some junk element , an element of the domain of the integer sort that is not a numeric constant. So both the calculus in [5] and ours are excused to not find a refutation. To fix that, one could add an instance \(C' = \lnot (\mathsf {{f}}(1) < \mathsf {{f}}(1))\) of the irreflexivity axiom \(\lnot (x < x)\). The resulting set \(N' = \{ C,\ C' \}\) is (trivially) sufficiently complete as it has no models at all. However, the calculus in [5] is not helped by adding \(C'\), since the abstracted version of \(N'\) is again not sufficiently complete and admits a model that interprets \(\mathsf {{f}}(1)\) as . Our abstraction mechanism always preserves sufficient completeness and our calculus will find a refutation.

With this example one could think that replacing the abstraction mechanism in [5] with ours gives all the advantages of our calculus. But this is not the case. Let \(N'' = \{ C,\ \lnot (x < x) \}\) be obtained by adding the more realistic axiom \(\lnot (x < x)\). The set \(N''\) is still sufficiently complete with our approach thanks to having two kinds of variables at disposal, but it is not sufficiently complete in the sense of [5]. Indeed, in that calculus adding background theory axioms never helps to gain sufficient completeness, as variables there have only one kind.

Another alternative to make N sufficiently complete is by adding a clause that forces \(\mathsf {{f}}(1)\) to be equal to some background domain element. For instance, one can add a “definition” for \(\mathsf {{f}}(1)\), that is, a clause \(\mathsf {{f}}(1) \approx \alpha \), where \(\alpha \) is a fresh symbolic constant belonging to the background signature (a “parameter”). The set \(N''' = \{ C, \mathsf {{f}}(1) \approx \alpha \}\) is sufficiently complete and it admits refutations with both calculi. The definition rule in our calculus mentioned above will generate this definition automatically. Moreover, the set N belongs to a syntactic fragment for which we can guarantee not only sufficient completeness (by means of the definition rule) but also refutational completeness.

We present the new calculus in detail and provide a general completeness result, modulo compactness of the background theory, and two specific completeness results for clause sets that do not require compactness – one for the fragment where all background-sorted terms are ground and another one for a special case of linear (integer or rational) arithmetic as a background theory.

We also report on experiments with a prototypical implementation on the TPTP problem library [27].

Sections 17, 910, and 12 of this paper are a substantially expanded and revised version of [11]. A preliminary version of Sect. 11 has appeared in [10]. However, we omit from this paper some proofs that are not essential for the understanding of the main ideas. They can be found in a slightly extended version of this paper at http://arxiv.org/abs/1904.03776 [12].

Related Work. The relation with the predecessor calculus in [5] is discussed above and also further below. What we say there also applies to other developments rooted in that calculus, [1, e. g.]. The specialized version of hierarchic superposition in [22] will be discussed in Sect. 9 below. The resolution calculus in [19] has built-in inference rules for linear (rational) arithmetic, but is complete only under restrictions that effectively prevent quantification over rationals. Earlier work on integrating theory reasoning into model evolution [8, 9] lacks the treatment of background-sorted foreground function symbols. The same applies to the sequent calculus in [26], which treats linear arithmetic with built-in rules for quantifier elimination. The instantiation method in [16] requires an answer-complete solver for the background theory to enumerate concrete solutions of background constraints, not just a decision procedure. All these approaches have in common that they integrate specialized reasoning for background theories into a general first-order reasoning method. A conceptually different approach consists in using first-order theorem provers as (semi-)decision procedures for specific theories in DPLL(T)(-like) architectures [2, 13, 14]. Notice that in this context the theorem provers do not need to reason modulo background theories themselves, and indeed they don’t. The calculus and system in [14], for instance, integrates superposition and DPLL(T). From DPLL(T) it inherits splitting of ground non-unit clauses into their unit components, which determines a (backtrackable) model candidate M. The superposition inference rules are applied to elements from M and a current clause set F. The superposition component guarantees refutational completeness for pure first-order clause logic. Beyond that, for clauses containing background-sorted variables, (heuristic) instantiation is needed. Instantiation is done with ground terms that are provably equal w.r.t. the equations in M to some ground term in M in order to advance the derivation. The limits of that method can be illustrated with an (artificial but simple) example. Consider the unsatisfiable clause set \(\{ i \le j\,\vee \, \mathsf {{P}}(i\,+\,1, x)\, \vee \, \mathsf {{P}}(j\,+\,2, x),\) \(i \le j \,\vee \, \lnot \mathsf {{P}}(i\,+\,3, x) \vee \, \lnot \,\mathsf {{P}}(j\,+\,4,x) \}\) where i and j are integer-sorted variables and x is a foreground-sorted variable. Neither splitting into unit clauses, superposition calculus rules, nor instantiation applies, and so the derivation gets stuck with an inconclusive result. By contrast, the clause set belongs to a fragment that entails sufficient completeness (“no background-sorted foreground function symbols”) and hence is refutable by our calculus. On the other hand, heuristic instantiation does have a place in our calculus, but we leave that for future work.

2 Signatures, Clauses, and Interpretations

We work in the context of standard many-sorted logic with first-order signatures comprised of sorts and operator (or function) symbols of given arities over these sorts. A signature is a pair \(\varSigma = (\varXi ,\varOmega )\), where \(\varXi \) is a set of sorts and \(\varOmega \) is a set of operator symbols over \(\varXi \). If \(\mathcal {X}\) is a set of sorted variables with sorts in \(\varXi \), then the set of well-sorted terms over \(\varSigma = (\varXi ,\varOmega )\) and \(\mathcal {X}\) is denoted by \(\mathrm {T}_\varSigma (\mathcal {X})\); \(\mathrm {T}_\varSigma \) is short for \(\mathrm {T}_\varSigma (\emptyset )\). We require that \(\varSigma \) is a sensible signature, i. e., that \(\mathrm {T}_\varSigma \) has no empty sorts. As usual, we write t[u] to indicate that the term u is a (not necessarily proper) subterm of the term t. The position of u in t is left implicit.

A \(\varSigma \)-equation is an unordered pair (st), usually written \(s \approx t\), where s and t are terms from \(\mathrm {T}_\varSigma (\mathcal {X})\) of the same sort. For simplicity, we use equality as the only predicate in our language. Other predicates can always be encoded as a function into a set with one distinguished element, so that a non-equational atom is turned into an equation \(P(t_1,\ldots ,t_n) \approx { true}_P^{}\); this is usually abbreviated by \(P(t_1,\ldots ,t_n)\).Footnote 1 A literal is an equation \(s \approx t\) or a negated equation \(\lnot (s \approx t)\), also written as \(s \not \approx t\). A clause is a multiset of literals, usually written as a disjunction; the empty clause, denoted by \(\Box \) is a contradiction. If F is a term, equation, literal or clause, we denote by \(\mathrm {vars}(F)\) the set of variables that occur in F. We say F is ground if \(\mathrm {vars}(F) = \emptyset \).

A substitution \(\sigma \) is a mapping from variables to terms that is sort respecting, that is, maps each variable \(x \in \mathcal {X}\) to a term of the same sort. Substitutions are homomorphically extended to terms as usual. We write substitution application in postfix form. A term s is an instance of a term t if there is a substitution \(\sigma \) such that \(t\sigma = s\). All these notions carry over to equations, literals and clauses in the obvious way. The composition \(\sigma \tau \) of the substitutions \(\sigma \) and \(\tau \) is the substitution that maps every variable x to \((x\sigma )\tau \).

The domain of a substitution \(\sigma \) is the set \(\mathrm {dom}(\sigma ) = \{ x \mid x \ne x\sigma \}\). We use only substitutions with finite domains, written as \(\sigma = [x_1\mapsto t_1,\ldots ,x_n\mapsto t_n]\) where \(\mathrm {dom}(\sigma ) = \{x_1,\ldots ,x_n\}\). A ground substitution is a substitution that maps every variable in its domain to a ground term. A ground instance of F is obtained by applying some ground substitution with domain (at least) \(\mathrm {vars}(F)\) to it.

A \(\varSigma \)-interpretation I consists of a \(\varXi \)-sorted family of carrier sets \(\{I_\xi \}_{\xi \in \varXi }\) and of a function \(I_f : I_{\xi _1} \times \cdots \times I_{\xi _n} \rightarrow I_{\xi _0}\) for every \(f : \xi _1 \ldots \xi _n \rightarrow \xi _0\) in \(\varOmega \). The interpretation \(t^I\) of a ground term t is defined recursively by \(f(t_1,\ldots ,t_n)^I = I_f(t_1^I,\ldots ,t_n^I)\) for \(n \ge 0\). An interpretation I is called term-generated, if every element of an \(I_\xi \) is the interpretation of some ground term of sort \(\xi \). An interpretation I is said to satisfy a ground equation \(s \approx t\), if s and t have the same interpretation in I; it is said to satisfy a negated ground equation \(s \not \approx t\), if s and t do not have the same interpretation in I. The interpretation I satisfies a ground clause C if at least one of the literals of C is satisfied by I. We also say that a ground clause C is true in I, if I satisfies C; and that C is false in I, otherwise. A term-generated interpretation I is said to satisfy a non-ground clause C if it satisfies all ground instances \(C\sigma \); it is called a model of a set N of clauses, if it satisfies all clauses of N.Footnote 2 We abbreviate the fact that I is a model of N by \(I\, \models \, N\); \(I\,\models \,C\) is short for \(I \,\models \, \{C\}\). We say that N entails \(N'\), and write \(N \,\models \, N'\), if every model of N is a model of \(N'\); \(N \,\models \, C\) is short for \(N \,\models \, \{C\}\). We say that N and \(N'\) are equivalent, if \(N \,\models \, N'\) and \(N' \,\models \, N\).

If \(\mathcal {J}\) is a class of \(\varSigma \)-interpretations, a \(\varSigma \)-clause or clause set is called \(\mathcal {J}\)-satisfiable if at least one \(I \in \mathcal {J}\) satisfies the clause or clause set; otherwise it is called \(\mathcal {J}\)-unsatisfiable.

A specification is a pair \( SP = (\varSigma ,\mathcal {J})\), where \(\varSigma \) is a signature and \(\mathcal {J}\) is a class of term-generated \(\varSigma \)-interpretations called models of the specification \( SP \). We assume that \(\mathcal {J}\) is closed under isomorphisms.

We say that a class of \(\varSigma \)-interpretations \(\mathcal {J}\) or a specification \((\varSigma ,\,\mathcal {J})\) is compact, if every infinite set of \(\varSigma \)-clauses that is \(\mathcal {J}\)-unsatisfiable has a finite subset that is also \(\mathcal {J}\)-unsatisfiable.

3 Hierarchic Theorem Proving

In hierarchic theorem proving, we consider a scenario in which a general-purpose foreground theorem prover and a specialized background prover cooperate to derive a contradiction from a set of clauses. In the sequel, we will usually abbreviate “foreground” and “background” by “FG” and “BG”.

The BG prover accepts as input sets of clauses over a BG signature \(\varSigma _\mathrm {B}= (\varXi _\mathrm {B},\varOmega _\mathrm {B})\). Elements of \(\varXi _\mathrm {B}\) and \(\varOmega _\mathrm {B}\) are called BG sorts and BG operators, respectively. We fix an infinite set \(\mathcal {X}_\mathrm {B}\) of BG variables of sorts in \(\varXi _\mathrm {B}\). Every BG variable has (is labeled with) a kind, which is either “abstraction” or “ordinary”. Terms over \(\varSigma _\mathrm {B}\) and \(\mathcal {X}_\mathrm {B}\) are called BG terms. A BG term is called pure, if it does not contain ordinary variables; otherwise it is impure. These notions apply analogously to equations, literals, clauses, and clause sets.

The BG prover decides the satisfiability of \(\varSigma _\mathrm {B}\)-clause sets with respect to a BG specification \((\varSigma _\mathrm {B},\mathcal {B})\), where \(\mathcal {B}\) is a class of term-generated \(\varSigma _\mathrm {B}\)-interpretations called BG models. We assume that \(\mathcal {B}\) is closed under isomorphisms.

In most applications of hierarchic theorem proving, the set of BG operators \(\varOmega _\mathrm {B}\) contains a set of distinguished constant symbols \(\varOmega _\mathrm {B}^D \subseteq \varOmega _\mathrm {B}\) that has the property that \(d_1^I \ne d_2^I\) for any two distinct \(d_1, d_2 \in \varOmega _\mathrm {B}^D\) and every BG model \(I \in \mathcal {B}\). We refer to these constant symbols as (BG) domain elements.

While we permit arbitrary classes of BG models, in practice the following three cases are most relevant:

  1. (1)

    \(\mathcal {B}\) consists of exactly one \(\varSigma _\mathrm {B}\)-interpretation (up to isomorphism), say, the integer numbers over a signature containing all integer constants as domain elements and \({\le }, {<}, +, -\) with the expected arities. In this case, \(\mathcal {B}\) is trivially compact; in fact, a set N of \(\varSigma _\mathrm {B}\)-clauses is \(\mathcal {B}\)-unsatisfiable if and only if some clause of N is \(\mathcal {B}\)-unsatisfiable.

  2. (2)

    \(\varSigma _\mathrm {B}\) is extended by an infinite number of parameters, that is, additional constant symbols. While all interpretations in \(\mathcal {B}\) share the same carrier sets \(\{I_\xi \}_{\xi \in \varXi _\mathrm {B}}\) and interpretations of non-parameter symbols, parameters may be interpreted freely by arbitrary elements of the appropriate \(I_\xi \). The class \(\mathcal {B}\) obtained in this way is in general not compact; for instance the infinite set of clauses \(\{n \le \beta \mid n \in \mathbb {N}\}\), where \(\beta \) is a parameter, is unsatisfiable in the integers, but every finite subset is satisfiable.

  3. (3)

    \(\varSigma _\mathrm {B}\) is again extended by parameters, however, \(\mathcal {B}\) is now the class of all interpretations that satisfy some first-order theory, say, the first-order theory of linear integer arithmetic.Footnote 3 Since \(\mathcal {B}\) corresponds to a first-order theory, compactness is recovered. It should be noted, however, that \(\mathcal {B}\) contains non-standard models, so that for instance the clause set \(\{n \le \beta \mid n \in \mathbb {N}\}\) is now satisfiable (e. g., \(\mathbb {Q} \times \mathbb {Z}\) with a lexicographic ordering is a model).

The FG theorem prover accepts as inputs clauses over a signature \(\varSigma = (\varXi ,\varOmega )\), where \(\varXi _\mathrm {B}\subseteq \varXi \) and \(\varOmega _\mathrm {B}\subseteq \varOmega \). The sorts in \(\varXi _\mathrm {F}= \varXi \setminus \varXi _\mathrm {B}\) and the operator symbols in \(\varOmega _\mathrm {F}= \varOmega \setminus \varOmega _\mathrm {B}\) are called FG sorts and FG operators. Again we fix an infinite set \(\mathcal {X}_\mathrm {F}\) of FG variables of sorts in \(\varXi _\mathrm {F}\). All FG variables have the kind “ordinary”. We define \(\mathcal {X}= \mathcal {X}_\mathrm {B}\,\cup \,\mathcal {X}_\mathrm {F}\).

In examples we will use \(\{0, 1, 2, \dots \}\) to denote BG domain elements, \(\{{+}, {-}, {<},\) \({\le }\}\) to denote (non-parameter) BG operators, and the possibly subscripted letters \(\{x,y\}\), \(\{X,Y\}\), \(\{\alpha ,\beta \}\), and \(\{\mathsf {{a}}, \mathsf {{b}}, \mathsf {{c}}, \mathsf {{f}}, \mathsf {{g}}\}\) to denote ordinary variables, abstraction variables, parameters, and FG operators, respectively. The letter \(\zeta \) denotes an ordinary variable or an abstraction variable.

We call a term in \(\mathrm {T}_\varSigma (\mathcal {X})\) a FG term, if it is not a BG term, that is, if it contains at least one FG operator or FG variable (and analogously for literals or clauses). We emphasize that for a FG operator \(\mathsf {{f}} : \xi _1 \ldots \xi _n \rightarrow \xi _0\) in \(\varOmega _\mathrm {F}\) any of the \(\xi _i\) may be a BG sort, and that consequently FG terms may have BG sorts.

If I is a \(\varSigma \)-interpretation, the restriction of I to \(\varSigma _\mathrm {B}\), written \(I{|_{\varSigma _\mathrm {B}}}\), is the \(\varSigma _\mathrm {B}\)-interpretation that is obtained from I by removing all carrier sets \(I_\xi \) for \(\xi \in \varXi _\mathrm {F}\) and all functions \(I_f\) for \(f \in \varOmega _\mathrm {F}\). Note that \(I{|_{\varSigma _\mathrm {B}}}\) is not necessarily term-generated even if I is term-generated. In hierarchic theorem proving, we are only interested in \(\varSigma \)-interpretations that extend some model in \(\mathcal {B}\) and neither collapse any of its sorts nor add new elements to them, that is, in \(\varSigma \)-interpretations I for which \(I{|_{\varSigma _\mathrm {B}}} \in \mathcal {B}\). We call such a \(\varSigma \)-interpretation a \(\mathcal {B}\)-interpretation.

Let N and \(N'\) be two sets of \(\varSigma \)-clauses. We say that N entails \(N'\) relative to \(\mathcal {B}\) (and write \(N \,\models _\mathcal {B}\,N'\)), if every model of N whose restriction to \(\varSigma _\mathrm {B}\) is in \(\mathcal {B}\) is a model of \(N'\). Note that \(N \,\models _\mathcal {B}\, N'\) follows from \(N \,\models \, N'\). If \(N \,\models _\mathcal {B}\, \Box \), we call N \(\mathcal {B}\)-unsatisfiable; otherwise, we call it \(\mathcal {B}\)-satisfiable.Footnote 4

Our goal in refutational hierarchic theorem proving is to check whether a given set of \(\varSigma \)-clauses N is false in all \(\mathcal {B}\)-interpretations, or equivalently, whether N is \(\mathcal {B}\)-unsatisfiable.

We say that a substitution \(\sigma \) is simple if \(X\sigma \) is a pure BG term for every abstraction variable \(X \in \mathrm {dom}(\sigma )\). For example, \([x \mapsto 1\,+\,Y\,+\,\alpha ]\), \([X \mapsto 1\,+\,Y\,+\,\alpha ]\) and \([x \mapsto \mathsf {{f}}(1)]\) all are simple, whereas \([X \mapsto 1\,+\,y\,+\,\alpha ]\) and \([X \mapsto \mathsf {{f}}(1)]\) are not. Let F be a clause or (possibly infinite) clause set. By \(\mathrm {sgi}(F)\) we denote the set of simple ground instances of F, that is, the set of all ground instances of (all clauses in) F obtained by simple ground substitutions.

For a BG specification \((\varSigma _\mathrm {B},\mathcal {B})\), we define \(\mathrm {GndTh}(\mathcal {B})\) as the set of all ground \(\varSigma _\mathrm {B}\)-formulas that are satisfied by every \(I \in \mathcal {B}\).

Definition 3.1

(Sufficient completeness). A \(\varSigma \)-clause set N is called sufficiently complete w.r.t. simple instances if for every \(\varSigma \)-model J of \(\mathrm {sgi}(N) \,\cup \,\mathrm {GndTh}(\mathcal {B})\)Footnote 5 and every ground BG-sorted FG term s there is a ground BG term t such that \(J \,\models \, s \approx t\).Footnote 6    \(\square \)

For brevity, we will from now on omit the phrase “w.r.t. simple instances” and speak only of “sufficient completeness”. It should be noted, though, that our definition differs from the classical definition of sufficient completeness in the literature on algebraic specifications.

4 Orderings

A hierarchic reduction ordering is a strict, well-founded ordering on terms that is compatible with contexts, i. e., \(s \succ t\) implies \(u[s] \succ u[t]\), and stable under simple substitutions, i. e., \(s \succ t\) implies \(s\sigma \succ t\sigma \) for every simple \(\sigma \). In the rest of this paper we assume such a hierarchic reduction ordering \(\succ \) that satisfies all of the following: (i) \(\succ \) is total on ground terms, (ii) \(s \succ d\) for every domain element d and every ground term s that is not a domain element, and (iii) \(s \succ t\) for every ground FG term s and every ground BG term t. These conditions are easily satisfied by an LPO with an operator precedence in which FG operators are larger than BG operators and domain elements are minimal with, for example, \(\cdots \succ -2 \succ 2 \succ -1 \succ 1 \succ 0\) to achieve well-foundedness.

Condition (iii) and stability under simple substitutions together justify to always order \(s \succ X\) where s is a non-variable FG term and X is an abstraction variable. By contrast, \(s \succ x\) can only hold if \(x \in \mathrm {vars}(s)\). Intuitively, the combination of hierarchic reduction orderings and abstraction variables affords ordering more terms.

The ordering \(\succ \) is extended to literals over terms by identifying a positive literal \(s \approx t\) with the multiset \(\{s,t\}\), a negative literal \(s \not \approx t\) with \(\{s,s,t,t\}\), and using the multiset extension of \(\succ \). Clauses are compared by the multiset extension of \(\succ \), also denoted by \(\succ \).

The non-strict orderings \(\succeq \) are defined as \(s \succeq t\) if and only if \(s \succ t\) or \(s = t\) (the latter is multiset equality in case of literals and clauses). A literal L is maximal (strictly maximal) in a clause \(L \vee C\) if there is no \(K \in C\) with \(K \succ L\) (\(K \succeq L\)).

5 Weak Abstraction

To refute an input set of \(\varSigma \)-clauses, hierarchic superposition calculi derive BG clauses from them and pass the latter to a BG prover. In order to do this, some separation of the FG and BG vocabulary in a clause is necessary. The technique used for this separation is known as abstraction: One (repeatedly) replaces some term q in a clause by a new variable and adds a disequations to the clause, so that C[q] is converted into the equivalent clause \(\zeta \not \approx q \vee C[\zeta ]\), where \(\zeta \) is a new (abstraction or ordinary) variable.

The calculus by Bachmair, Ganzinger, and Waldmann [5] works on “fully abstracted” clauses: Background terms occurring below a FG operator or in an equation between a BG and a FG term or vice versa are abstracted out until one arrives at a clause in which no literal contains both FG and BG operators.

A problematic aspect of any kind of abstraction is that it tends to increase the number of incomparable terms in a clause, which leads to an undesirable growth of the search space of a theorem prover. For instance, if we abstract out the subterms t and \(t'\) in a ground clause \(\mathsf {{f}}(t) \approx \mathsf {{g}}(t')\), we get \({x \not \approx t} \vee {y \not \approx t'} \vee {\mathsf {{f}}(x) \approx \mathsf {{g}}(y)}\), and the two new terms \(\mathsf {{f}}(x)\) and \(\mathsf {{g}}(y)\) are incomparable in any reduction ordering. In [5] this problem is mitigated by considering only instances where BG-sorted variables are mapped to BG terms: In the terminology of the current paper, all BG-sorted variables in [5] have the kind “abstraction”. This means that, in the example above, we obtain the two terms \(\mathsf {{f}}(X)\) and \(\mathsf {{g}}(Y)\). If we use an LPO with a precedence in which \(\mathsf {{f}}\) is larger than \(\mathsf {{g}}\) and \(\mathsf {{g}}\) is larger than every BG operator, then for every simple ground substitution \(\tau \), \(\mathsf {{f}}(X)\tau \) is strictly larger that \(\mathsf {{g}}(Y)\tau \), so we can still consider \(\mathsf {{f}}(X)\) as the only maximal term in the literal.

The advantage of full abstraction is that this clause structure is preserved by all inference rules. There is a serious drawback, however: Consider the clause set \( N = \{\,1 + \mathsf {{c}} \not \approx 1 + \mathsf {{c}}\,\} \). Since N is ground, we have \(\mathrm {sgi}(N) = N\), and since \(\mathrm {sgi}(N)\) is unsatisfiable, N is trivially sufficiently complete. Full abstraction turns N into \( N' = \{\,X \not \approx \mathsf {{c}} \,\vee \, 1 + X \not \approx 1 + X\,\} \). In the simple ground instances of \(N'\), X is mapped to all pure BG terms. However, there are \(\varSigma \)-interpretations of \(\mathrm {sgi}(N')\) in which \(\mathsf {{c}}\) is interpreted differently from any pure BG term, so \(\mathrm {sgi}(N') \,\cup \,\mathrm {GndTh}(\mathcal {B})\) does have a \(\varSigma \)-model and \(N'\) is no longer sufficiently complete. In other words, the calculus of [5] is refutationally complete for clause sets that are fully abstracted and sufficiently complete, but full abstraction may destroy sufficient completeness. (In fact, the calculus is not able to refute \(N'\).)

The problem that we have seen is caused by the fact that full abstraction replaces FG terms by abstraction variables, which may not be mapped to FG terms later on. The obvious fix would be to use ordinary variables instead of abstraction variables whenever the term to be abstracted out is not a pure BG term, but as we have seen above, this would increase the number of incomparable terms and it would therefore be detrimental to the performance of the prover.

Full abstraction is a property that is stronger than actually necessary for the completeness proof of [5]. In fact, it was claimed in a footnote in [5] that the calculus could be optimized by abstracting out only non-variable BG terms that occur below a FG operator. This is incorrect, however: Using this abstraction rule, neither our calculus nor the calculus of [5] would be able to refute \(\{\,1\,+\,1 \approx 2,\) \((1\,+\,1)\,+\,\mathsf {{c}} \not \approx 2\, +\, \mathsf {{c}}\,\}\), even though this set is unsatisfiable and trivially sufficiently complete. We need a slightly different abstraction rule to avoid this problem:

Definition 5.1

A BG term q is a target term in a clause C if q is neither a domain element nor a variable and if C has the form \(C[f(s_1,\dots ,q,\dots ,s_n)]\), where f is a FG operator or at least one of the \(s_i\) is a FG or impure BG term.Footnote 7

A clause is called weakly abstracted if it does not have any target terms.

The weakly abstracted version of a clause is the clause that is obtained by exhaustively replacing C[q] by

  • \(C[X] \vee X \not \approx q\), where X is a new abstraction variable, if q is a pure target term in C,

  • \(C[y] \vee y \not \approx q\), where y is a new ordinary variable, if q is an impure target term in C.

The weakly abstracted version of a clause C is denoted by \(\mathrm {abstr}(C)\); if N is a set of clauses then \(\mathrm {abstr}(N) = \{\,\mathrm {abstr}(C) \mid C \in N\,\}\).    \(\square \)

For example, weak abstraction of the clause \(\mathsf {{g}}(1, \alpha , \mathsf {{f}}(1) + (\alpha +1), z) \approx \beta \) yields \(\mathsf {{g}}(1, X, \mathsf {{f}}(1)\,+\,Y, z) \approx \beta \,\vee \, X \not \approx \alpha \,\vee \, Y \not \approx \alpha \,+\,1\). Note that the terms 1, \(\mathsf {{f}}(1)\,+\,(\alpha \,+\,1)\), z, and \(\beta \) are not abstracted out: 1 is a domain element; \(\mathsf {{f}}(1)\,+\,(\alpha \,+\,1)\) has a BG sort, but it is not a BG term; z is a variable; and \(\beta \) is not a proper subterm of any other term. The clause \(\mathsf {{write}}(\mathsf {{a}}, 2, \mathsf {{read}}(\mathsf {{a}}, 1)\,+\,1) \approx \mathsf {{b}}\) is already weakly abstracted. Every pure BG clause is trivially weakly abstracted.

Nested abstraction is only necessary for certain impure BG terms. For instance, the clause \(\mathsf {{f}}(z+\alpha ) \approx 1\) has two target terms, namely \(\alpha \) (since z is an impure BG term) and \(z+\alpha \) (since \(\mathsf {{f}}\) is a FG operator). If we abstract out \(\alpha \), we obtain \(\mathsf {{f}}(z+X) \approx 1 \vee X \not \approx \alpha \). The new term \(z + X\) is still a target term, so one more abstraction step yields \(\mathsf {{f}}(y) \approx 1 \vee X \not \approx \alpha \vee y \not \approx z+X\). (Alternatively, we can first abstract out \(z+\alpha \), yielding \(\mathsf {{f}}(y) \approx 1 \vee y \not \approx z+\alpha \), and then \(\alpha \). The final result is the same.)

It is easy to see that the abstraction process described in Definition 5.1 terminates by comparing the multisets of the numbers of non-variable occurrences in the left and right-hand sides of all literals before and after an abstraction step.

Proposition 5.2

If N is a set of clauses and \(N'\) is obtained from N by replacing one or more clauses by their weakly abstracted versions, then \(\mathrm {sgi}(N)\) and \(\mathrm {sgi}(N')\) are equivalent and \(N'\) is sufficiently complete whenever N is.

Proof

Let us first consider the case of a single abstraction step applied to a single clause. Let C[q] be a clause with a target term q and let \(D \,=\, C[\zeta ] \,\vee \,\zeta \not \approx q\) be the result of abstracting out q (where \(\zeta \) is a new abstraction variable, if q is pure, and a new ordinary variable, if q is impure). We will show that \(\mathrm {sgi}(C)\) and \(\mathrm {sgi}(D)\) have the same models.

In one direction let I be an arbitrary model of \(\mathrm {sgi}(C)\). We have to show that I is also a model of every simple ground instance \(D\tau \) of D. If I satisfies the disequation \(\zeta \tau \not \approx q\tau \) then this is trivial. Otherwise, \(\zeta \tau \) and \(q\tau \) have the same interpretation in I. Since \(\mathrm {dom}(\tau ) \supseteq \mathrm {vars}(D) = \mathrm {vars}(C) \,\cup \,\{\zeta \}\), \(C\tau \) is a simple ground instance of C, so I is a model of \(C\tau = C\tau [q\tau ]\). By congruence, we conclude that I is also a model of \(C\tau [\zeta \tau ]\), hence it is a model of \(D\tau \,=\, C\tau [\zeta \tau ] \,\vee \, \zeta \tau \not \approx q\tau \).

In the other direction let I be an arbitrary model of \(\mathrm {sgi}(D)\). We have to show that I is also a model of every simple ground instance \(C\tau \) of C. Without loss of generality assume that \(\zeta \notin \mathrm {dom}(\tau )\). If \(\zeta \) is an abstraction variable, then q is a pure BG term, and since \(\tau \) is a simple substitution, \(q\tau \) is a pure BG term as well. Consequently, the substitutions \([\zeta \mapsto q\tau ]\) and \(\tau ' = \tau [\zeta \mapsto q\tau ]\) are again simple substitutions and \(D\tau '\) is a simple ground instance of D. This implies that I is a model of \(D\tau '\). The clause \(D\tau '\) has the form \(D\tau ' \,=\, C\tau '[\zeta \tau '] \,\vee \, \zeta \tau ' \not \approx q\tau '\); since \(\zeta \tau ' = q\tau \), \(C\tau ' = C\tau \) and \(q\tau ' = q\tau \), this is equal to \(C\tau [q\tau ] \,\vee \, q\tau \not \approx q\tau \). Obviously, the literal \(q\tau \not \approx q\tau \) must be false in I, so I must be a model of \(C\tau [q\tau ] = C[q]\tau = C\tau \).

By induction over the number of abstraction steps we conclude that for any clause C, \(\mathrm {sgi}(C)\) and \(\mathrm {sgi}(\mathrm {abstr}(C))\) are equivalent. The extension to clause sets N and \(N'\) follows then from the fact that I is a model of \(\mathrm {sgi}(N)\) if and only if it is a model of \(\mathrm {sgi}(C)\) for all \(C \in N\). Moreover, the equivalence of \(\mathrm {sgi}(N)\) and \(\mathrm {sgi}(N')\) implies obviously that \(N'\) is sufficiently complete whenever N is.    \(\square \)

In contrast to full abstraction, the weak abstraction rule does not require abstraction of FG terms (which can destroy sufficient completeness if done using abstraction variables, and which is detrimental to the performance of a prover if done using ordinary variables). BG terms are usually abstracted out using abstraction variables. The exception are BG terms that are impure, i. e., that contain ordinary variables themselves. In this case, we cannot avoid to use ordinary variables for abstraction, otherwise, we might again destroy sufficient completeness. For example, the clause set \(\{\mathsf {{P}}(1+y),\ \lnot \mathsf {{P}}(1+\mathsf {{c}})\}\) is sufficiently complete. If we used an abstraction variable instead of an ordinary variable to abstract out the impure subterm \(1+y\), we would get \(\{\mathsf {{P}}(X) \,\vee \, X \not \approx 1+y ,\ \lnot \mathsf {{P}}(1+\mathsf {{c}})\}\), which is no longer sufficiently complete.

In input clauses (that is, before abstraction), BG-sorted variables may be declared as “ordinary” or “abstraction”. As we have seen above, using abstraction variables can reduce the search space; on the other hand, abstraction variables may be detrimental to sufficient completeness. Consider the following example: The set of clauses \(N = \{\lnot \mathsf {{f}}(x) > \mathsf {{g}}(x) \,\vee \, \mathsf {{h}}(x) \approx 1,\) \(\lnot \mathsf {{f}}(x) \le \mathsf {{g}}(x) \,\vee \, \mathsf {{h}}(x) \approx 2,\) \(\lnot \mathsf {{h}}(x) > 0\}\) is unsatisfiable w.r.t. linear integer arithmetic, but since it is not sufficiently complete, the hierarchic superposition calculus does not detect the unsatisfiability. Adding the clause \(X > Y \,\vee \, X \le Y\) to N does not help: Since the abstraction variables X and Y may not be mapped to the FG terms \(\mathsf {{f}}(x)\) and \(\mathsf {{g}}(x)\) in a simple ground instance, the resulting set is still not sufficiently complete. However, if we add the clause \(x > y \,\vee \, x \le y\), the set of clauses becomes (vacuously) sufficiently complete and its unsatisfiability is detected.

One might wonder whether it is also possible to gain anything if the abstraction process is performed using ordinary variables instead of abstraction variables. The following proposition shows that this is not the case:

Proposition 5.3

Let N be a set of clauses, let \(N'\) be the result of weak abstraction of N as defined above, and let \(N''\) be the result of weak abstraction of N where all newly introduced variables are ordinary variables. Then \(\mathrm {sgi}(N')\) and \(\mathrm {sgi}(N'')\) are equivalent and \(\mathrm {sgi}(N')\) is sufficiently complete if and only if \(\mathrm {sgi}(N'')\) is.

Proof

By Proposition 5.2, we know already that \(\mathrm {sgi}(N)\) and \(\mathrm {sgi}(N')\) are equivalent. Moreover, it is easy to check the proof of Proposition 5.2 is still valid if we assume that the newly introduced variable \(\zeta \) is always an ordinary variable. (Note that the proof requires that abstraction variables are mapped only to pure BG terms, but it does not require that a variable that is mapped to a pure BG term must be an abstraction variable.) So we can conclude in the same way that \(\mathrm {sgi}(N)\) and \(\mathrm {sgi}(N'')\) are equivalent, and hence, that \(\mathrm {sgi}(N')\) and \(\mathrm {sgi}(N'')\) are equivalent. From this, we can conclude that \(N'\) is sufficiently complete whenever \(N''\) is.    \(\square \)

6 Base Inference System

An inference system \(\mathcal {I}\) is a set of inference rules. By an \(\mathcal {I}\) inference we mean an instance of an inference rule from \(\mathcal {I}\) such that all conditions are satisfied.

The base inference system \(\mathrm {HSP}_{\mathrm {Base}}\) of the hierarchic superposition calculus consists of the inference rules , , , , and defined below. The calculus is parameterized by a hierarchic reduction ordering \(\succ \) and by a “selection function” that assigns to every clause a (possibly empty) subset of its negative FG literals. All inference rules are applicable only to weakly abstracted premise clauses.

figure g

if (i) \(\sigma \) is a simple mgu of s and t, (ii) \(s\sigma \) is not a pure BG term, and (iii) if the premise has selected literals, then \(s \not \approx t\) is selected in the premise, otherwise \((s \not \approx t)\sigma \) is maximal in \((s \not \approx t \,\vee \, C)\sigma \).Footnote 8

For example, is applicable to \(1+ \mathsf {{c}} \not \approx 1+x\) with the simple mgu \([x \mapsto \mathsf {{c}}]\), but it is not applicable to \(1+ \alpha \not \approx 1+x\), since \(1+ \alpha \) is a pure BG term.

figure i

if (i) u is not a variable, (ii) \(\sigma \) is a simple mgu of l and u, (iii) \(l\sigma \) is not a pure BG term, (iv) \(r\sigma \not \succeq l\sigma \), (v) \((l \approx r)\sigma \) is strictly maximal in \((l \approx r \,\vee \, C)\sigma \), (vi) the first premise does not have selected literals, (vii) \(t\sigma \not \succeq s\sigma \), and (viii) if the second premise has selected literals, then \(s \not \approx t\) is selected in the second premise, otherwise \((s \not \approx t)\sigma \) is maximal in \((s \not \approx t \,\vee \, D)\sigma \).

figure j

if (i) u is not a variable, (ii) \(\sigma \) is a simple mgu of l and u, (iii) \(l\sigma \) is not a pure BG term, (iv) \(r\sigma \not \succeq l\sigma \), (v) \((l \approx r)\sigma \) is strictly maximal in \((l \approx r \,\vee \, C)\sigma \), (vi) \(t\sigma \not \succeq s\sigma \), (vii) \((s \not \approx t)\sigma \) is strictly maximal in \((s \approx t \,\vee \, D)\sigma \), and (viii) none of the premises has selected literals.

figure k

where (i) \(\sigma \) is a simple mgu of s and l, (ii) \(s\sigma \) is not a pure BG term, (iii) \((s \approx t)\sigma \) is maximal in \((s \approx t \,\vee \, l \approx r \,\vee \, C)\sigma \), (iv) \(t\sigma \not \succeq s\sigma \), (v) \(l\sigma \not \succeq r\sigma \), and (vi) the premise does not have selected literals.

figure l

if \(C_1,\ldots ,C_n\) are BG clauses and \(\{ C_1,\ldots ,C_n\}\) is \(\mathcal {B}\)-unsatisfiable, i. e., no interpretation in \(\mathcal {B}\) is a \(\varSigma _\mathrm {B}\)-model of \(\{ C_1,\ldots ,C_n\}\).

Notice that is not restricted to take pure BG clauses only. The reason is that also impure BG clauses admit simple ground instances that are pure.

Theorem 6.1

The inference rules of \(\mathrm {HSP}_{\mathrm {Base}}\) are sound w.r.t. \(\models _\mathcal {B}\), i. e., for every inference with premises in N and conclusion C, we have \(N \,\models _\mathcal {B}\, C\).

Proof

, , , and are clearly sound w.r.t. \(\models \), and therefore also sound w.r.t. \(\models _\mathcal {B}\). For , soundness w.r.t. \(\models _\mathcal {B}\) follows immediately from the definition.    \(\square \)

All inference rules of \(\mathrm {HSP}_{\mathrm {Base}}\) involve (simple) mgus. Because of the two kinds of variables, abstraction and ordinary ones, the practical question arises if standard unification algorithms can be used without or only little modification. For example, the terms Z and \((x+y)\) admit a simple mgu \(\sigma = [x \mapsto X,\, y \mapsto Y,\, Z \mapsto X+Y]\). This prompts for the use of weakening substitutions as in many-sorted logics with subsorts [28]. A closer inspection of the inference rules shows, however, that such substitutions never need to be considered: All unifiers computed in the inference rules have the property that abstraction variables are only mapped to abstraction variables or domain elements; apart from this additional restriction, we can use a standard unification algorithm.

In contrast to [5], the inference rules above include an explicit weak abstraction in their conclusion. Without it, conclusions would not be weakly abstracted in general. For example applied to the weakly abstracted clauses \(\mathsf {{f}}(X) \approx 1 \,\vee \, X \not \approx \alpha \) and \(\mathsf {{P}}(\mathsf {{f}}(1) + 1)\) would then yield \(\mathsf {{P}}(1 + 1) \,\vee \, 1 \not \approx \alpha \), whose \(\mathsf {{P}}\)-literal is not weakly abstracted. Additionally, the side conditions of our rules differ somewhat from the corresponding rules of [5], this is due on the one hand to the presence of impure BG terms (which must sometimes be treated like FG terms), and on the other hand to the fact that, after weak abstraction, literals may still contain both FG and BG operators.

The inference rules are supplemented by a redundancy criterion, that is, a mapping \(\mathcal {R}_\mathrm {Cl}\) from sets of formulae to sets of formulae and a mapping \(\mathcal {R}_\mathrm {Inf}\) from sets of formulae to sets of inferences that are meant to specify formulae that may be removed from N and inferences that need not be computed. (\(\mathcal {R}_\mathrm {Cl}(N)\) need not be a subset of N and \(\mathcal {R}_\mathrm {Inf}(N)\) will usually also contain inferences whose premises are not in N.)

Definition 6.2

A pair \(\mathcal {R}= (\mathcal {R}_\mathrm {Inf},\mathcal {R}_\mathrm {Cl})\) is called a redundancy criterion (with respect to an inference system \(\mathcal {I}\) and a consequence relation \(\models \)), if the following conditions are satisfied for all sets of formulae N and \(N'\):

(i):

\(N \setminus \mathcal {R}_\mathrm {Cl}(N) \,\models \, \mathcal {R}_\mathrm {Cl}(N)\).

(ii):

If \(N \subseteq N'\), then \(\mathcal {R}_\mathrm {Cl}(N) \subseteq \mathcal {R}_\mathrm {Cl}(N')\) and \(\mathcal {R}_\mathrm {Inf}(N) \subseteq \mathcal {R}_\mathrm {Inf}(N')\).

(iii):

If \(\iota \) is an inference and its conclusion is in N, then \(\iota \in \mathcal {R}_\mathrm {Inf}(N)\).

(iv):

If \(N' \subseteq \mathcal {R}_\mathrm {Cl}(N)\), then \(\mathcal {R}_\mathrm {Cl}(N) \subseteq \mathcal {R}_\mathrm {Cl}(N \setminus N')\) and \(\mathcal {R}_\mathrm {Inf}(N) \subseteq \mathcal {R}_\mathrm {Inf}(N \setminus N')\).

The inferences in \(\mathcal {R}_\mathrm {Inf}(N)\) and the formulae in \(\mathcal {R}_\mathrm {Cl}(N)\) are said to be redundant with respect to N.    \(\square \)

Let \(\mathrm {SSP}\) be the ground standard superposition calculus using the inference rules equality resolution, negative superposition, positive superposition, and equality factoring (Bachmair and Ganzinger [3], Nieuwenhuis [23], Nieuwenhuis and Rubio [25]). To define a redundancy criterion for \(\mathrm {HSP}_{\mathrm {Base}}\) and to prove the refutational completeness of the calculus, we use the same approach as in [5] and relate \(\mathrm {HSP}_{\mathrm {Base}}\) inferences to the corresponding \(\mathrm {SSP}\) inferences.

For a set of ground clauses N, we define \(\mathcal {R}_\mathrm {Cl}^\mathcal {S}(N)\) to be the set of all clauses C such that there exist clauses \(C_1, \ldots , C_n \in N\) that are smaller than C with respect to \(\succ \) and \(C_1, \ldots , C_n \,\models \, C\). We define \(\mathcal {R}_\mathrm {Inf}^\mathcal {S}(N)\) to be the set of all ground \(\mathrm {SSP}\) inferences \(\iota \) such that either a premise of \(\iota \) is in \(\mathcal {R}_\mathrm {Cl}^\mathcal {S}(N)\) or else \(C_0\) is the conclusion of \(\iota \) and there exist clauses \(C_1, \ldots , C_n \in N\) that are smaller with respect to \(\succ ^\mathrm {c}\) than the maximal premise of \(\iota \) and \(C_1, \ldots , C_n \,\models \, C_0\).

The following results can be found in [3] and [23]:

Theorem 6.3

The (ground) standard superposition calculus \(\mathrm {SSP}\) and \(\mathcal {R}^\mathcal {S}= (\mathcal {R}_\mathrm {Inf}^\mathcal {S},\mathcal {R}_\mathrm {Cl}^\mathcal {S})\) satisfy the following properties:

(i):

\(\mathcal {R}^\mathcal {S}\) is a redundancy criterion with respect to \(\models \).

(ii):

\(\mathrm {SSP}\) together with \(\mathcal {R}^\mathcal {S}\) is refutationally complete.

Let \(\iota \) be an \(\mathrm {HSP}_{\mathrm {Base}}\) inference with premises \(C_1,\ldots ,C_n\) and conclusion \(\mathrm {abstr}(C)\), where the clauses \(C_1,\ldots ,C_n\) have no variables in common. Let \(\iota '\) be a ground \(\mathrm {SSP}\) inference with premises \(C'_1,\ldots ,C'_n\) and conclusion \(C'\). If \(\sigma \) is a simple substitution such that \(C' = C\sigma \) and \(C'_i = C_i\sigma \) for all i, and if none of the \(C'_i\) is a BG clause, then \(\iota '\) is called a simple ground instance of \(\iota \). The set of all simple ground instances of an inference \(\iota \) is denoted by \(\mathrm {sgi}(\iota )\).

Definition 6.4

Let N be a set of weakly abstracted clauses. We define \(\mathcal {R}_\mathrm {Inf}^\mathcal {H}(N)\) to be the set of all inferences \(\iota \) such that either \(\iota \) is not a inference and \(\mathrm {sgi}(\iota ) \subseteq \mathcal {R}_\mathrm {Inf}^\mathcal {S}(\mathrm {sgi}(N)\,\cup \,\mathrm {GndTh}(\mathcal {B}))\), or else \(\iota \) is a inference and \(\Box \in N\). We define \(\mathcal {R}_\mathrm {Cl}^\mathcal {H}(N)\) to be the set of all weakly abstracted clauses C such that \(\mathrm {sgi}(C) \subseteq \mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {sgi}(N)\,\cup \,\mathrm {GndTh}(\mathcal {B})) \,\cup \,\mathrm {GndTh}(\mathcal {B})\).Footnote 9    \(\square \)

7 Refutational Completeness

To prove that \(\mathrm {HSP}_{\mathrm {Base}}\) and \(\mathcal {R}^\mathcal {H}= (\mathcal {R}_\mathrm {Inf}^\mathcal {H}, \mathcal {R}_\mathrm {Cl}^\mathcal {H})\) are refutationally complete for sets of weakly abstracted \(\varSigma \)-clauses and compact BG specifications \((\varSigma _\mathrm {B},\mathcal {B})\), we use the same technique as in [5]:

First we show that \(\mathcal {R}^\mathcal {H}\) is a redundancy criterion with respect to \(\models _\mathcal {B}\), and that a set of clauses remains sufficiently complete if new clauses are added or if redundant clauses are deleted. The proofs are rather technical and can be found in [12]. They are similar to the corresponding ones in [5]; the differences are due, on the one hand, to the fact that we include \(\mathrm {GndTh}(\mathcal {B})\) in the redundancy criterion and in the definition of sufficient completeness, and, on the other hand, to the explicit abstraction steps in our inference rules.

Lemma 7.1

If \(\mathrm {sgi}(N) \,\cup \,\mathrm {GndTh}(\mathcal {B}) \,\models \, \mathrm {sgi}(C)\), then \(N \,\models _\mathcal {B}\,C\).

Proof

Suppose that \(\mathrm {sgi}(N) \,\cup \,\mathrm {GndTh}(\mathcal {B}) \,\models \, \mathrm {sgi}(C)\). Let \(I'\) be a \(\varSigma \)-model of N whose restriction to \(\varSigma _\mathrm {B}\) is contained in \(\mathcal {B}\). Clearly, \(I'\) is also a model of \(\mathrm {GndTh}(\mathcal {B})\). Since \(I'\) does not add new elements to the sorts of \(I = I'{|_{\varSigma _\mathrm {B}}}\) and I is a term-generated \(\varSigma _\mathrm {B}\)-interpretation, we know that for every ground \(\varSigma \)-term \(t'\) of a BG sort there is a ground BG term t such that t and \(t'\) have the same interpretation in \(I'\). Therefore, for every ground substitution \(\sigma '\) there is an equivalent simple ground substitution \(\sigma \); since \(C\sigma \) is valid in \(I'\), \(C\sigma '\) is also valid.    \(\square \)

We call the simple most general unifier \(\sigma \) that is computed during an inference \(\iota \) and applied to the conclusion the pivotal substitution of \(\iota \). (For ground inferences, the pivotal substitution is the identity mapping.) If L is the literal \([\lnot ]\, s \approx t\) or \([\lnot ]\, s[u] \approx t\) of the second or only premise that is eliminated in \(\iota \), we call \(L\sigma \) the pivotal literal of \(\iota \), and we call \(s\sigma \) or \(s[u]\sigma \) the pivotal term of \(\iota \).

Lemma 7.2

Let \(\iota \) be an \(\mathrm {HSP}_{\mathrm {Base}}\) inference

figure w

from weakly abstracted premises with pivotal substitution \(\sigma \). Let \(\iota '\) be a simple ground instance of \(\iota \) of the form

figure x

Then there is a simple ground instance of \(\mathrm {abstr}(C_0\sigma )\) that has the form \(C_0\sigma \tau \,\vee \, E\), where E is a (possibly empty) disjunction of literals \(s \not \approx s\), and each literal of E is smaller than the pivotal literal of \(\iota '\).

As \(M \subseteq M'\) implies \(\mathcal {R}_\mathrm {Inf}^\mathcal {S}(M) \subseteq \mathcal {R}_\mathrm {Inf}^\mathcal {S}(M')\), we obtain \(\mathcal {R}_\mathrm {Inf}^\mathcal {S}(\mathrm {sgi}(N) \setminus \mathrm {sgi}(N')) \subseteq \mathcal {R}_\mathrm {Inf}^\mathcal {S}(\mathrm {sgi}(N \setminus N'))\). Furthermore, it is fairly easy to see that \(\mathrm {sgi}(N) \setminus (\mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {sgi}(N)\,\cup \,\mathrm {GndTh}(\mathcal {B})) \,\cup \,\mathrm {GndTh}(\mathcal {B})) \subseteq \mathrm {sgi}(N \setminus \mathcal {R}_\mathrm {Cl}^\mathcal {H}(N))\). Using these two results we can prove the following lemmas:

Lemma 7.3

\(\mathcal {R}^\mathcal {H}= (\mathcal {R}_\mathrm {Inf}^\mathcal {H}, \mathcal {R}_\mathrm {Cl}^\mathcal {H})\) is a redundancy criterion with respect to \(\models _\mathcal {B}\).

Lemma 7.4

Let N, \(N'\) and M be sets of weakly abstracted clauses such that \(N' \subseteq \mathcal {R}_\mathrm {Cl}^\mathcal {H}(N)\). If N is sufficiently complete, then so are \(N \,\cup \,M\) and \(N \setminus N'\).

We now encode arbitrary term-generated \(\varSigma _\mathrm {B}\)-interpretation by sets of unit ground clauses in the following way: Let \(I \in \mathcal {B}\) be a term-generated \(\varSigma _\mathrm {B}\)-interpretation. For every \(\varSigma _\mathrm {B}\)-ground term t let m(t) be the smallest ground term of the congruence class of t in I. We define a rewrite system \(\mathrm {E}_I'\) by \(\mathrm {E}_I' = \{t \rightarrow m(t) \mid t \in \mathrm {T}_\varSigma ,\, t \not = m(t)\}\). Obviously, \(\mathrm {E}_I'\) is right-reduced; since all rewrite rules are contained in \(\succ \), \(\mathrm {E}_I'\) is terminating; and since every ground term t has m(t) as its only normal form, \(\mathrm {E}_I'\) is also confluent. Now let \(\mathrm {E}_I\) be the set of all rules \(l \rightarrow r\) in \(\mathrm {E}_I'\) such that l is not reducible by \(\mathrm {E}_I' \setminus \{l \rightarrow r\}\). Clearly every term that is reducible by \(\mathrm {E}_I\) is also reducible by \(\mathrm {E}_I'\); conversely every term that is reducible by \(\mathrm {E}_I'\) has a minimal subterm that is reducible by \(\mathrm {E}_I'\) and the rule in \(\mathrm {E}_I'\) that is used to rewrite this minimal subterm is necessarily contained in \(\mathrm {E}_I\). Therefore \(\mathrm {E}_I'\) and \(\mathrm {E}_I\) define the same set of normal forms, and from this we can conclude that \(\mathrm {E}_I\) and \(\mathrm {E}_I'\) induce the same equality relation on ground \(\varSigma _\mathrm {B}\)-terms. We identify \(\mathrm {E}_I\) with the set of clauses \(\{t \approx t' \mid t \rightarrow t' \in \mathrm {E}_I\}\). Let \(\mathrm {D}_I\) be the set of all clauses \(t \not \approx t'\), such that t and \(t'\) are distinct ground \(\varSigma _\mathrm {B}\)-terms in normal form with respect to \(\mathrm {E}_I\).Footnote 10

Lemma 7.5

Let \(I \in \mathcal {B}\) be a term-generated \(\varSigma _\mathrm {B}\)-interpretation and let C be a ground BG clause. Then C is true in I if and only if there exist clauses \(C_1,\ldots ,C_n\) in \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\) such that \(C_1,\ldots ,C_n \,\models \, C\) and \(C \succeq C_i\) for \(1 \le i \le n\).

Proof

The “if” part follows immediately from the fact that \(I \,\models \, \mathrm {E}_I\,\cup \,\mathrm {D}_I\). For the “only if” part assume that the ground BG clause C is true in I. Consequently, there is some literal \(s \approx t\) or \(s \not \approx t\) of C that is true in I. Then this literal follows from (i) the rewrite rules in \(\mathrm {E}_I\) that are used to normalize s to its normal form \(s'\), (ii) the rewrite rules in \(\mathrm {E}_I\) that are used to normalize t to its normal form \(t'\), and, in the case of a negated literal \(s \not \approx t\), (iii) the clause \(s' \not \approx t' \in \mathrm {D}_I\). It is routine to show that all these clauses are smaller than or equal to \(s \approx t\) or \(s \not \approx t\), respectively, and hence smaller than or equal to C.    \(\square \)

Corollary 7.6

Let \(I \in \mathcal {B}\) be a term-generated \(\varSigma _\mathrm {B}\)-interpretation. Then \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\,\models \, \mathrm {GndTh}(\mathcal {B})\).

Proof

Since \(I \in \mathcal {B}\), we have \(I \,\models \, \mathrm {GndTh}(\mathcal {B})\), hence \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\,\models \, \mathrm {GndTh}(\mathcal {B})\) by Lemma 7.5.    \(\square \)

Let N be a set of weakly abstracted clauses and \(I \in \mathcal {B}\) be a term-generated \(\varSigma _\mathrm {B}\)-interpretation, then \(N_I\) denotes the set \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\,\cup \,\{\, C\sigma \mid \sigma \) simple, reduced with respect to \(\mathrm {E}_I\), \(C \in N\), \(C\sigma \) ground\(\,\}\).

Lemma 7.7

If N is a set of weakly abstracted clauses, then \(\mathcal {R}_\mathrm {Inf}^\mathcal {S}(\mathrm {sgi}(N)\,\cup \,\mathrm {GndTh}(\mathcal {B})) \subseteq \mathcal {R}_\mathrm {Inf}^\mathcal {S}(N_I)\).

Proof

By part (i) of Theorem 6.3 we have obviously \(\mathcal {R}_\mathrm {Inf}^\mathcal {S}(\mathrm {sgi}(N)) \subseteq \mathcal {R}_\mathrm {Inf}^\mathcal {S}(\mathrm {E}_I\,\cup \,\mathrm {D}_I\,\cup \,\mathrm {sgi}(N)\,\cup \,\mathrm {GndTh}(\mathcal {B}))\). Let C be a clause in \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\,\cup \,\mathrm {sgi}(N) \,\cup \,\mathrm {GndTh}(\mathcal {B})\) and not in \(N_I\). If \(C \in \mathrm {GndTh}(\mathcal {B})\), then it is true in I, so by Lemma 7.5 it is either contained in \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\subseteq N_I\) or it follows from smaller clauses in \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\) and is therefore in \(\mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {E}_I\,\cup \,\mathrm {D}_I\,\cup \,\mathrm {sgi}(N))\). If \(C \notin \mathrm {GndTh}(\mathcal {B})\), then \(C = C'\sigma \) for some \(C' \in N\), so it follows from \(C'\rho \) and \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\), where \(\rho \) is the substitution that maps every variable \(\zeta \) to the \(\mathrm {E}_I\)-normal form of \(\zeta \sigma \). Since C follows from smaller clauses in \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\,\cup \,\mathrm {sgi}(N)\), it is in \(\mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {E}_I\,\cup \,\mathrm {D}_I\,\cup \,\mathrm {sgi}(N))\). Hence \(\mathcal {R}_\mathrm {Inf}^\mathcal {S}(\mathrm {E}_I\,\cup \,\mathrm {D}_I\,\cup \,\mathrm {sgi}(N) \,\cup \,\mathrm {GndTh}(\mathcal {B})) \subseteq \mathcal {R}_\mathrm {Inf}^\mathcal {S}(N_I)\).    \(\square \)

A clause set N is called saturated (with respect to an inference system \(\mathcal {I}\) and a redundancy criterion \(\mathcal {R}\)) if \(\iota \in \mathcal {R}_\mathrm {Inf}(N)\) for every inference \(\iota \) with premises in N.

Theorem 7.8

Let \(I \in \mathcal {B}\) be a term-generated \(\varSigma _\mathrm {B}\)-interpretation and let N be a set of weakly abstracted \(\varSigma \)-clauses. If I satisfies all BG clauses in \(\mathrm {sgi}(N)\) and N is saturated with respect to \(\mathrm {HSP}_{\mathrm {Base}}\) and \(\mathcal {R}^\mathcal {H}\), then \(N_I\) is saturated with respect to \(\mathrm {SSP}\) and \(\mathcal {R}^\mathcal {S}\).

Proof

We have to show that every \(\mathrm {SSP}\)-inference from clauses of \(N_I\) is contained in \(\mathcal {R}_\mathrm {Inf}^\mathcal {S}(N_I)\). We demonstrate this in detail for the equality resolution and the negative superposition rule. The analysis of the other rules is similar. Note that by Lemma 7.5 every BG clause that is true in I and is not contained in \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\) follows from smaller clauses in \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\), thus it is in \(\mathcal {R}_\mathrm {Cl}^\mathcal {S}(N_I)\); every inference involving such a clause is in \(\mathcal {R}_\mathrm {Inf}^\mathcal {S}(N_I)\).

The equality resolution rule is obviously not applicable to clauses from \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\). Suppose that \(\iota \) is an equality resolution inference with a premise \(C\sigma \), where \(C \in N\) and \(\sigma \) is a simple substitution and reduced with respect to \(\mathrm {E}_I\). If \(C\sigma \) is a BG clause, then \(\iota \) is in \(\mathcal {R}_\mathrm {Inf}^\mathcal {S}(N_I)\). If the pivotal term of \(\iota \) is a pure BG term then the pivotal literal is pure BG as well. Because the pivotal literal is maximal in \(C\sigma \) it follows from properties of the ordering that \(C\sigma \) is a BG clause. Because we have already considered this case we can assume from now on that the pivotal term of \(\iota \) is not pure BG and that \(C\sigma \) is an FG clause. It follows that \(\iota \) is a simple ground instance of a hierarchic inference \(\iota '\) from C. Since \(\iota '\) is in \(\mathcal {R}_\mathrm {Inf}^\mathcal {H}(N)\), \(\iota \) is in \(\mathcal {R}_\mathrm {Inf}^\mathcal {S}(\mathrm {sgi}(N)\,\cup \,\mathrm {GndTh}(\mathcal {B}))\), by Lemma 7.7, this implies again \(\iota \in \mathcal {R}_\mathrm {Inf}^\mathcal {S}(N_I)\).

Obviously a clause from \(\mathrm {D}_I\) cannot be the first premise of a negative superposition inference. Suppose that the first premise is a clause from \(\mathrm {E}_I\). The second premise cannot be a FG clause, since the maximal sides of maximal literals in a FG clause are reduced; as it is a BG clause, the inference is redundant. Now suppose that \(\iota \) is a negative superposition inference with a first premise \(C\sigma \), where \(C \in N\) and \(\sigma \) is a simple substitution and reduced with respect to \(\mathrm {E}_I\). If \(C\sigma \) is a BG clause, then \(\iota \) is in \(\mathcal {R}_\mathrm {Inf}^\mathcal {S}(N_I)\). Otherwise, with the same arguments as for the equality resolution case above, the pivotal term is not pure BG and \(C\sigma \) is a FG clause. Hence we can conclude that the second premise can be written as \(C'\sigma \), where \(C' \in N\) is a FG clause (without loss of generality, C and \(C'\) do not have common variables). If the overlap takes place below a variable occurrence, the conclusion of the inference follows from \(C\sigma \) and some instance \(C'\rho \), which are both smaller than \(C'\sigma \). Otherwise, \(\iota \) is a simple ground instance of a hierarchic inference \(\iota '\) from C. In both cases, \(\iota \) is contained in \(\mathcal {R}_\mathrm {Inf}^\mathcal {S}(N_I)\).    \(\square \)

The crucial property of abstracted clauses that is needed in the proof of this theorem is that there are no superposition inferences between clauses in \(\mathrm {E}_I\) and FG ground instances \(C\sigma \) in \(N_I\), or in other words, that all FG terms occurring in ground instances \(C\sigma \) are reduced w.r.t. \(\mathrm {E}_I\). This motivates the definition of target terms in Definition 5.1: Recall that two different domain elements must always be interpreted differently in I and that a domain element is smaller in the term ordering than any ground term that is not a domain element. Consequently, any domain element is the smallest term in its congruence class, so it is reduced by \(\mathrm {E}_I\). Furthermore, by the definition of \(N_I\), \(\zeta \sigma \) is reduced by \(\mathrm {E}_I\) for every variable \(\zeta \). So variables and domain elements never need to be abstracted out. Other BG terms (such as parameters \(\alpha \) or non-constant terms \(\zeta _1 + \zeta _2\)) have to be abstracted out if they occur below a FG operator, or if one of their sibling terms is a FG term or an impure BG term (since \(\sigma \) can map the latter to a FG term). On the other hand, abstracting out FG terms as in [5] is never necessary to ensure that FG terms are reduced w.r.t. \(\mathrm {E}_I\).

If N is saturated with respect to \(\mathrm {HSP}_{\mathrm {Base}}\) and \(\mathcal {R}^\mathcal {H}\) and does not contain the empty clause, then cannot be applicable to N. If \((\varSigma _\mathrm {B},\mathcal {B})\) is compact, this implies that there is some term-generated \(\varSigma _\mathrm {B}\)-interpretation \(I \in \mathcal {B}\) that satisfies all BG clauses in \(\mathrm {sgi}(N)\). Hence, by Theorem 7.8, the set of reduced simple ground instances of N has a model that also satisfies \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\). Sufficient completeness allows us to show that this is in fact a model of all ground instances of clauses in N and that I is its restriction to \(\varSigma _\mathrm {B}\):

Theorem 7.9

If the BG specification \((\varSigma _\mathrm {B},\mathcal {B})\) is compact, then \(\mathrm {HSP}_{\mathrm {Base}}\) and \(\mathcal {R}^\mathcal {H}\) are statically refutationally complete for all sufficiently complete sets of clauses, i. e., if a set of clauses N is sufficiently complete and saturated w.r.t. \(\mathrm {HSP}_{\mathrm {Base}}\) and \(\mathcal {R}^\mathcal {H}\), and \(N \,\models _\mathcal {B}\, \Box \), then \(\Box \in N\).

Proof

Let N be a set of weakly abstracted clauses that is sufficiently complete, and saturated w.r.t.the hierarchic superposition calculus and \(\mathcal {R}^\mathcal {H}\) and does not contain \(\Box \). Consequently, the rule is not applicable to N. By compactness, this means that the set of all \(\varSigma _\mathrm {B}\)-clauses in \(\mathrm {sgi}(N)\) is satisfied by some term-generated \(\varSigma _\mathrm {B}\)-interpretation \(I \in \mathcal {B}\). By Theorem 7.8, \(N_I\) is saturated with respect to the standard superposition calculus. Since \(\Box \notin N_I\), the refutational completeness of standard superposition implies that there is a \(\varSigma \)-model \(I'\) of \(N_I\). Since N is sufficiently complete, we know that for every ground term \(t'\) of a BG sort there exists a BG term t such that \(t' \approx t\) is true in \(I'\). Consequently, for every ground instance of a clause in N there exists an equivalent simple ground instance, thus \(I'\) is also a model of all ground instances of clauses in N. To see that the restriction of \(I'\) to \(\varSigma _\mathrm {B}\) is isomorphic to I and thus in \(\mathcal {B}\), note that \(I'\) satisfies \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\), preventing confusion, and that N is sufficiently complete, preventing junk. Since \(I'\) satisfies N and \(I'{|_{\varSigma _\mathrm {B}}} \in \mathcal {B}\), we have \(N \not \models _\mathcal {B}\Box \)    \(\square \)

A theorem proving derivation \(\mathcal {D}\) is a finite or infinite sequence of weakly abstracted clause sets \(N_0, N_1, ...\), such that \(N_i\) and \(N_{i+1}\) are equisatisfiable w.r.t. \(\models _\mathcal {B}\) and \(N_i \setminus N_{i+1} \subseteq \mathcal {R}_\mathrm {Inf}^\mathcal {H}(N_{i+1})\) for all indices i. The set \(N_\infty = \bigcup _{i\ge 0} \bigcap _{j\ge i} N_j\) is called the limit of \(\mathcal {D}\); the set \(N^\infty = \bigcup _{i\ge 0} N_i\) is called the union of \(\mathcal {D}\). It is easy to show that every clause in \(N^\infty \) is either contained in \(N_\infty \) or redundant w.r.t. \(N_\infty \). The derivation \(\mathcal {D}\) is said to be fair, if every \(\mathrm {HSP}_{\mathrm {Base}}\)-inference with (non-redundant) premises in \(N_\infty \) becomes redundant at some point of the derivation. The limit of a fair derivation is saturated [4]; this is the key result that allows us to deduce dynamic refutational completeness from static refutational completeness:

Theorem 7.10

If the BG specification \((\varSigma _\mathrm {B},\mathcal {B})\) is compact, then \(\mathrm {HSP}_{\mathrm {Base}}\) and \(\mathcal {R}^\mathcal {H}\) are dynamically refutationally complete for all sufficiently complete sets of clauses, i. e., if \(N \,\models _\mathcal {B}\, \Box \), then every fair derivation starting from \(\mathrm {abstr}(N)\) eventually generates \(\Box \).

In the rest of the paper, we consider only theorem proving derivations where each set \(N_{i+1}\) results from from \(N_i\) by either adding the conclusions of inferences from \(N_i\), or by deleting clauses that are redundant w.r.t. \(N_{i+1}\), or by applying the following generic simplification rule for clause sets:

figure aa

if \(n \ge 0\) and (i) \(D_i\) is weakly abstracted, for all \(i=1,\ldots ,n\), (ii) \(N \,\cup \,\{ C \} \,\models _\mathcal {B}\, D_i\), and (iii) \(C \in \mathcal {R}_\mathrm {Cl}^\mathcal {H}(N \,\cup \,\{ D_1,\ldots ,D_n \})\).

Condition (ii) is needed for soundness, and condition (iii) is needed for completeness. The rule covers the usual simplification rules of the standard superposition calculus, such as demodulation by unit clauses and deletion of tautologies and (properly) subsumed clauses. It also covers simplification of arithmetic terms, e. g., replacing a subterm \((2+3)+\alpha \) by \(5+\alpha \) and deleting an unsatisfiable BG literal \(5+\alpha < 4+\alpha \) from a clause. Any clause of the form \(C \,\vee \, \zeta \not \approx d\) where d is domain element can be simplified to \(C[\zeta \mapsto d]\). Notice, though, that impure BG terms or FG terms can in general not be simplified by BG tautologies. Although, e. g., \(\mathsf {{f}}(X)\,+\,1 \not \approx y\,+\,1\) is larger than \(1\,+\,\mathsf {{f}}(X) \not \approx y\,+\,1\) (with a LPO), such a “simplification” is not justified by the redundancy criterion. Indeed, in the example it destroys sufficient completeness.

We have to point out a limitation of the calculus described above. The standard superposition calculus \(\mathrm {SSP}\) exists in two variants: either using the rule, or using the and rules. Only the first of these variants works together with weak abstraction. Consider the following example. Let \(N = \{\,\alpha + \beta \approx \alpha ,\) \(\mathsf {{c}} \not \approx \beta \,\vee \, \mathsf {{c}} \not \approx 0,\) \(\mathsf {{c}} \approx \beta \,\vee \, \mathsf {{c}} \approx 0\,\}\). All clauses in N are weakly abstracted. Since the first clause entails \(\beta \approx 0\) relative to linear arithmetic, the second and the third clause are obviously contradictory. The \(\mathrm {HSP}_{\mathrm {Base}}\) calculus as defined above is able to detect this by first applying to the third clause, yielding \(\mathsf {{c}} \approx 0 \,\vee \, \beta \not \approx 0\), followed by two steps and . If is replaced by and , however, the refutational completeness of \(\mathrm {HSP}_{\mathrm {Base}}\) is lost. The only inference that remains possible is a inference between the third and the second clause. But since the conclusion of this inference is a tautology, the inference is redundant, so the clause set is saturated. (Note that the clause \(\beta \approx 0\) is entailed by N, but it is not explicitly present, so there is no way to perform a inference with the smaller side of the maximal literal of the third clause.)

8 Local Sufficient Completeness

The definition of sufficient completeness w.r.t. simple instances that was given in Sect. 3 requires that every ground BG-sorted FG term s is equal to some ground BG term t in every \(\varSigma \)-model J of \(\mathrm {sgi}(N) \,\cup \,\mathrm {GndTh}(\mathcal {B})\). It is rather evident, however, that this condition is sometimes stronger than needed. For instance, if the set of input clauses N is ground, then we only have to consider the ground BG-sorted FG terms that actually occur in N [22] (analogously to the Nelson-Oppen combination procedure). A relaxation of sufficient completeness that is also useful for non-ground clauses and that still ensures refutational completeness was given by Kruglov [21]:

Definition 8.1

(Smooth ground instance). We say that a substitution \(\sigma \) is smooth if for every variable \(\zeta \in \mathrm {dom}(\sigma )\) all BG-sorted (proper or non-proper) subterms of \(\zeta \sigma \) are pure BG terms. If \(F\sigma \) is a ground instance of a term or clause F and \(\sigma \) is smooth, \(F\sigma \) is called a smooth ground instance. (Recall that every ground BG term is necessarily pure.) If N is a set of clauses, \(\mathrm {smgi}(N)\) denotes the set of all smooth ground instances of clauses in N.    \(\square \)

Every smooth substitution is a simple substitution, but not vice versa. For instance, if x is a FG-sorted variable and y is an ordinary BG-sorted variable, then \(\sigma _1 = [x \mapsto \mathsf {{cons}}(\mathsf {{f}}(1)+2,\mathsf {{empty}})]\) and \(\sigma _2 = [y \mapsto \mathsf {{f}}(1)]\) are simple substitutions, but neither of them is smooth, since \(x\sigma _1\) and \(y\sigma _2\) contain the BG-sorted FG subterm \(\mathsf {{f}}(1)\).

Definition 8.2

(Local sufficient completeness). Let N be a \(\varSigma \)-clause set. We say that N is locally sufficiently complete w.r.t. smooth instances if for every \(\varSigma _\mathrm {B}\)-interpretation \(I \in \mathcal {B}\), every \(\varSigma \)-model J of \(\mathrm {sgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I\), and every BG-sorted FG term s occurring in \(\mathrm {smgi}(N) \setminus \mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I)\) there is a ground BG term t such that \(J \,\models \, s \approx t\). (Again, we will from now on omit the phrase “w.r.t. smooth instances” for brevity.)    \(\square \)

Example 8.3

The clause set \(N = \{\,X \not \approx \alpha \,\vee \, \mathsf {{f}}(X) \approx \beta \,\}\) is locally sufficiently complete: The smooth ground instances have the form \(s' \not \approx \alpha \,\vee \, \mathsf {{f}}(s') \approx \beta \), where \(s'\) is a pure BG term. We have to show that \(\mathsf {{f}}(s')\) equals some ground BG term t whenever the smooth ground instance is not redundant. Let \(I \in \mathcal {B}\) be a \(\varSigma _\mathrm {B}\)-interpretation and J be a \(\varSigma \)-model of \(\mathrm {sgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I\). If \(I \,\models \, s' \not \approx \alpha \), then \(s' \not \approx \alpha \) follows from some clauses in \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\), so \(s' \not \approx \alpha \,\vee \, \mathsf {{f}}(s') \approx \beta \) is contained in \(\mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I)\) and \(\mathsf {{f}}(s')\) need not be considered. Otherwise \(I \,\models \, s' \approx \alpha \), then \(\mathsf {{f}}(s')\) occurs in a non-redundant smooth ground instance of a clause in N and \(J \,\models \, f(s') \approx \beta \), so \(t := \beta \) has the desired property. On the other hand, N is clearly not sufficiently complete, since there are models of \(\mathrm {sgi}(N) \,\cup \,\mathrm {GndTh}(\mathcal {B})\) in which \(\mathsf {{f}}(\beta )\) is interpreted by some junk element that is different from the interpretation of any ground BG term.

The example demonstrates that local sufficient completeness is significantly more powerful than sufficient completeness, but this comes at a price. For instance, as shown by the next example, local sufficient completeness is not preserved by abstraction:

Example 8.4

Suppose that the BG specification is linear integer arithmetic (including parameters \(\alpha \), \(\beta \), \(\gamma \)), the FG operators are \(\mathsf {{f}} : int \rightarrow int \), \(\mathsf {{g}} : int \rightarrow data \), \(\mathsf {{a}} : \, \rightarrow data \), the term ordering is an LPO with precedence \(\mathsf {{g}}> \mathsf {{f}}> \mathsf {{a}}> \gamma> \beta> \alpha> 3> 2 > 1\), and the clause set N is given by

figure an

Since all clauses in N are ground, \(\mathrm {smgi}(N) = \mathrm {sgi}(N) = N\). Clause (8) is redundant w.r.t. \(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I\) (for any I): it follows from clauses (6) and (7), and both are smaller than (8). The BG-sorted FG terms in non-redundant clauses are \(\mathsf {{f}}(2)\), \(\mathsf {{f}}(3)\), \(\mathsf {{f}}(\alpha )\), and \(\mathsf {{f}}(\beta )\), and in any \(\varSigma \)-model J of \(\mathrm {sgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I\), these are necessarily equal to the BG terms 2 or 3, respectively, so N is locally sufficiently complete.

Let \(N' = \mathrm {abstr}(N)\), let I be a BG-model such that \(\mathrm {E}_I\) contains \(\alpha \approx 3\), \(\beta \approx 2\), and \(\gamma \approx 1\) (among others), \(\mathrm {D}_I\) contains \(1 \not \approx 2\), \(1 \not \approx 3\), and \(2 \not \approx 3\) (among others), and let J be a \(\varSigma \)-model of \(\mathrm {sgi}(N') \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I\) in which \(\mathsf {{f}}(1)\) is interpreted by some junk element. The set \(N'\) contains the clause \(\mathsf {{g}}(\mathsf {{f}}(X)) \approx \mathsf {{a}} \;\vee \; \mathsf {{g}}(\mathsf {{f}}(Y)) \approx \mathsf {{a}} \;\vee \; \gamma \not \approx X \;\vee \; \beta \not \approx Y\) obtained by abstraction of (8). Its smooth ground instance \(C \ =\ \mathsf {{g}}(\mathsf {{f}}(1)) \approx \mathsf {{a}} \;\vee \; \mathsf {{g}}(\mathsf {{f}}(2)) \approx \mathsf {{a}} \;\vee \; \gamma \not \approx 1 \;\vee \; \beta \not \approx 2\) is not redundant: it follows from other clauses in \(\mathrm {smgi}(N') \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I\), namely

figure ao

but the ground instances \((6')\) and \((7')\) that are needed here are larger than C. Since C contains the BG-sorted FG term \(\mathsf {{f}}(1)\) which is interpreted differently from any BG term in J, \(N'\) is not locally sufficiently complete.

Local sufficient completeness of a clause set suffices to ensure refutational completeness. Kruglov’s proof [21] works also if one uses weak abstraction instead of strong abstraction and ordinary as well as abstraction variables, but it relies on an additional restriction on the term ordering.Footnote 11 We give an alternative proof that works without this restriction.

The proof is based on a transformation on \(\varSigma \)-interpretations. Let J be an arbitrary \(\varSigma \)-interpretation. We transform J into a term-generated \(\varSigma \)-interpretation \(\mathrm {nojunk}(J)\) without junk in two steps. In the first step, we define a \(\varSigma \)-interpretation \(J'\) as follows:

  • For every FG sort \(\xi \), define \(J'_\xi = J_\xi \).

  • For every BG sort \(\xi \), define \(J'_\xi = \{\,t^J \mid t \text { is a ground BG term of sort }\xi \,\}\).

  • For every \(f : \xi _1 \ldots \xi _n \rightarrow \xi _0\) the function \(J'_f : J'_{\xi _1} \times \cdots \times J'_{\xi _n} \rightarrow J'_{\xi _0}\) maps \((a_1,\dots ,a_n)\) to \(J_f(a_1,\dots ,a_n)\), if \(J_f(a_1,\dots ,a_n) \in J'_{\xi _0}\), and to an arbitrary element of \(J'_{\xi _0}\) otherwise.

That is, we obtain \(J'\) from J be deleting all junk elements from \(J_\xi \) if \(\xi \) is a BG sort, and by redefining the interpretation of f arbitrarily whenever \(J_f(a_1,\dots ,a_n)\) is a junk element.

In the second step, we define the \(\varSigma \)-interpretation \(\mathrm {nojunk}(J) = J''\) as the term-generated subinterpretation of \(J'\), that is,

  • For every sort \(\xi \), \(J''_\xi = \{\,t^{J'} \mid t \text { is a ground term of sort }\xi \,\}\),

  • For every \(f : \xi _1 \ldots \xi _n \rightarrow \xi _0\), the function \(J''_f : J''_{\xi _1} \times \cdots \times J''_{\xi _n} \rightarrow J''_{\xi _0}\) satisfies \(J''_f(a_1,\dots ,a_n) = J'_f(a_1,\dots ,a_n)\).

Lemma 8.5

Let J, \(J'\), and \(\mathrm {nojunk}(J) = J''\) be given as above. Then the following properties hold:

(i):

\(t^{J''} = t^{J'}\) for every ground term t.

(ii):

\(J''_\xi = J'_\xi \) for every BG sort \(\xi \).

(iii):

\(J''\) is a term-generated \(\varSigma \)-interpretation and \(J''{|_{\varSigma _\mathrm {B}}}\) is a term-generated \(\varSigma _\mathrm {B}\)-interpretation.

(iv):

If \(t = f(t_1,\dots ,t_n)\) is ground, \(t_i^{J'} = t_i^J\) for all i, and \(t^J \in J'_\xi \), then \(t^{J'} = t^J\).

(v):

If t is ground and all BG-sorted subterms of t are BG terms, then \(t^{J'} = t^J\).

(vi):

If C is a ground BG clause, then \(J \,\models \, C\) if and only if \(J'' \,\models \, C\) if and only if \(J''{|_{\varSigma _\mathrm {B}}} \,\models \, C\).

Proof

Properties (i)–(iv) follow directly from the definition of \(J'\) and \(J''\). Property (v) follows from (iv) and the definition of \(J'\) by induction over the term structure. By (i) and (v), every ground BG term is interpreted in the same way in J and \(J''\), moreover it is obvious that every ground BG term is interpreted in the same way in \(J''\) and \(J''{|_{\varSigma _\mathrm {B}}}\); this implies (vi).    \(\square \)

Lemma 8.6

If J is a \(\varSigma \)-interpretation and \(I = \mathrm {nojunk}(J)\), then for every ground term s there exists a ground term t such that \(s^I = t^I\) and all BG-sorted (proper or non-proper) subterms of t are BG terms.

Proof

If s has a BG sort \(\xi \), then this follows directly from the fact that \(s^I \in I_\xi \) and that every element of \(I_\xi \) equals \(t^I\) for some ground BG term t of sort \(\xi \). If s has a FG sort, let \(s_1,\dots ,s_n\) be the maximal BG-sorted subterms of \(s = s[s_1,\dots ,s_n]\). Since for every \(s_i\) there is a ground BG term \(t_i\) with \(s_i^I = t_i^I\), we obtain \(s^I = (s[s_1,\dots ,s_n])^I = (s[t_1,\dots ,t_n])^I\). Set \(t := s[t_1,\dots ,t_n]\).    \(\square \)

Corollary 8.7

Let J be a \(\varSigma \)-interpretation and \(I = \mathrm {nojunk}(J)\). Let \(C\sigma \) by a ground instance of a clause C. Then there is a smooth ground instance \(C\tau \) of C such that \((t\sigma )^I = (t\tau )^I\) for every term occurring in C and such that \(I \,\models \, C\sigma \) if and only if \(I \,\models \, C\tau \).

Proof

Using the previous lemma, we define \(\tau \) such that for every variable \(\zeta \) occurring in C, \((\zeta \tau )^I = (\zeta \sigma )^I\) and all BG-sorted (proper or non-proper) subterms of \(\zeta \tau \) are BG terms. Clearly \(\tau \) is smooth. The other properties follow immediately by induction over the term or clause structure.    \(\square \)

Lemma 8.8

Let N be a set of \(\varSigma \)-clauses that is locally sufficiently complete. Let \(I \in \mathcal {B}\) be a \(\varSigma _\mathrm {B}\)-interpretation, let J be a \(\varSigma \)-model of \(\mathrm {sgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I\), and let \(J'' = \mathrm {nojunk}(J)\). Let \(C \in N\) and let \(C\tau \) by a smooth ground instance in \(\mathrm {smgi}(N) \setminus \mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I)\). Then \((t\tau )^J = (t\tau )^{J''}\) for every term t occurring in C and \(J \,\models \, C\tau \) if and only if \(J'' \,\models \, C\tau \).

Proof

Let \(J'\) be defined as above, then \((t\tau )^{J'} = (t\tau )^{J''}\) for any term t occurring in C by Lemma 8.5-(i). We prove that \((t\tau )^J = (t\tau )^{J'}\) by induction over the term structure: If t is a variable, then by smoothness all BG-sorted subterms of \(t\tau \) are BG terms, hence \((t\tau )^{J'} = (t\tau )^J\) by Lemma 8.5-(v). Otherwise let \(t = f(t_1,\dots ,t_n)\). If \(t\tau \) is a BG term, then again \((t\tau )^{J'} = (t\tau )^J\) by Lemma 8.5-(v). If \(t\tau \) is a FG term of sort \(\xi \), then t must be a FG term of sort \(\xi \) as well. By the induction hypothesis, \((t_i\tau )^J = (t_i\tau )^{J'}\) for every i. If \(\xi \) is a FG sort, then trivially \((t\tau )^J = J_f((t_1\tau )^J,\dots ,(t_n\tau )^J)\) is contained in \(J'_\xi \), so \((t\tau )^{J'} = (t\tau )^J\) by Lemma 8.5-(iv). Otherwise, \(t\tau \) is a BG-sorted FG term occurring in \(\mathrm {smgi}(N) \setminus \mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I)\). By local sufficient completeness, there exists a ground BG term s such that \(s^J = (t\tau )^J\), hence \((t\tau )^J \in J'_\xi \). Again, Lemma 8.5-(iv) yields \((t\tau )^{J'} = (t\tau )^J\).

Since all left and right-hand sides of equations in \(C\tau \) are evaluated in the same way in \(J''\) and J, it follows that \(J \,\models \, C\tau \) if and only if \(J'' \,\models \, C\tau \).    \(\square \)

Lemma 8.9

Let N be a set of \(\varSigma \)-clauses that is locally sufficiently complete. Let \(I \in \mathcal {B}\) be a \(\varSigma _\mathrm {B}\)-interpretation, let J be a \(\varSigma \)-model of \(\mathrm {sgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I\), and let \(J'' = \mathrm {nojunk}(J)\). Then \(J''\) is a model of N.

Proof

The proof proceeds in three steps. In the first step we show that \(J''\) is a model of \(\mathrm {smgi}(N) \setminus \mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I)\): Let \(C \in N\) and let \(C\tau \) be a smooth ground instance in \(\mathrm {smgi}(N) \setminus \mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I)\). Since every smooth ground instance is a simple ground instance and J is a \(\varSigma \)-model of \(\mathrm {sgi}(N)\), we know that \(J \,\models \, C\tau \). By Lemma 8.8, this implies \(J'' \,\models \, C\tau \).

In the second step we show that \(J''\) is a model of \(\mathrm {smgi}(N)\). Since we already know that \(J''\) is a model of \(\mathrm {smgi}(N) \setminus \mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I)\), it is clearly sufficient to show that \(J''\) is a model of \(\mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I)\): First we observe that by Lemma 8.5 \(J'' \,\models \, \mathrm {E}_I\,\cup \,\mathrm {D}_I\). Using the result of the first step, this implies that \(J''\) is a model of \((\mathrm {smgi}(N) \setminus \mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I)) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I\), and since this set is a superset of \((\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I) \setminus \mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I)\), \(J''\) is also a model of the latter. By Definition 6.2-(i), \((\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I) \setminus \mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I) \,\models \, \mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I)\). So \(J''\) is a model of \(\mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I)\).

We can now show the main statement: We know that \(J''\) is a term-generated \(\varSigma \)-interpretation, so \(J'' \,\models \, N\) holds if and only if \(J''\) is a model of all ground instances of clauses in N. Let \(C\sigma \) be an arbitrary ground instance of \(C \in N\). By Corollary 8.7, there is a smooth ground instance \(C\tau \) such that \(J'' \,\models \, C\sigma \) if and only if \(J'' \,\models \, C\tau \). As the latter has been shown in the second step, the result follows.    \(\square \)

Theorem 8.10

If the BG specification \((\varSigma _\mathrm {B},\mathcal {B})\) is compact and if the clause set N is locally sufficiently complete, then \(\mathrm {HSP}_{\mathrm {Base}}\) and \(\mathcal {R}^\mathcal {H}\) are dynamically refutationally complete for \(\mathrm {abstr}(N)\), i. e., if \(N\, \models _\mathcal {B}\,\Box \), then every fair derivation starting from \(\mathrm {abstr}(N)\) eventually generates \(\Box \).

Proof

Let \(\mathcal {D}= (N_i)_{i\ge 0}\) be a fair derivation starting from \(N_0 = \mathrm {abstr}(N)\), and let \(N_\infty \) be the limit of \(\mathcal {D}\). By fairness, \(N_\infty \) is saturated w.r.t. \(\mathrm {HSP}_{\mathrm {Base}}\) and \(\mathcal {R}^\mathcal {H}\). If \(\Box \notin N_\infty \), then the rule is not applicable to \(N_\infty \). Since \((\varSigma _\mathrm {B},\mathcal {B})\) is compact, this means that the set of all \(\varSigma _\mathrm {B}\)-clauses in \(\mathrm {sgi}(N_\infty )\) is satisfied by some term-generated \(\varSigma _\mathrm {B}\)-interpretation \(I \in \mathcal {B}\). By Theorem 7.8, \((N_\infty )_I\) is saturated with respect to the standard superposition calculus. Since \(\Box \notin (N_\infty )_I\), the refutational completeness of standard superposition implies that there is a \(\varSigma \)-model J of \((N_\infty )_I\), and since \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\subseteq (N_\infty )_I\), J is also a \(\varSigma \)-model of \(\mathrm {sgi}(N_\infty ) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I\). Since every clause in \(N_0\) is either contained in \(N_\infty \) or redundant w.r.t. \(N_\infty \), every simple ground instance of a clause in \(N_0\) is a simple ground instance of a clause in \(N_\infty \) or contained in \(\mathrm {GndTh}(\mathcal {B})\) or redundant w.r.t. \(\mathrm {sgi}(N_\infty )\,\cup \,\mathrm {GndTh}(\mathcal {B})\). We conclude that J is a \(\varSigma \)-model of \(\mathrm {sgi}(N_0)\), and since \(\mathrm {sgi}(N_0)\) and \(\mathrm {sgi}(N)\) are equivalent, J is a \(\varSigma \)-model of \(\mathrm {sgi}(N)\). Now define \(J'' = \mathrm {nojunk}(J)\). By Lemma 8.5, \(J''\) is a term-generated \(\varSigma \)-interpretation, \(J''{|_{\varSigma _\mathrm {B}}}\) is a term-generated \(\varSigma _\mathrm {B}\)-interpretation, and \(J''{|_{\varSigma _\mathrm {B}}}\) satisfies \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\). Consequently, \(J''{|_{\varSigma _\mathrm {B}}}\) is isomorphic to I and thus contained in \(\mathcal {B}\). Finally, \(J''\) is a model of N by Lemma 8.9.    \(\square \)

If all BG-sorted FG terms in a set N of clauses are ground, local sufficient completeness can be established automatically by adding a “definition” of the form \(t \approx \alpha \), where t is a ground BG-sorted FG term and \(\alpha \) is a parameter. The following section explains this idea in a more general way.

9 Local Sufficient Completeness by Define

The \(\mathrm {HSP}_{\mathrm {Base}}\) inference system will derive a contradiction if the input clause set is inconsistent and (locally) sufficiently complete (cf. Sect. 8). In this section we extend this functionality by adding an inference rule, , which can turn input clause sets that are not sufficiently complete into locally sufficiently complete ones. Technically, the rule derives “definitions” of the form \(t \approx \alpha \), where t is a ground BG-sorted FG term and \(\alpha \) is a parameter of the proper sort. For economy of reasoning, definitions are introduced only on a by-need basis, when t appears in a current clause, and \(t \approx \alpha \) is used to simplify that clause immediately.

We need one more preliminary definition before introducing formally.

Definition 9.1

(Unabstracted clause). A clause is unabstracted if it does not contain any disequation \(\zeta \not \approx t\) between a variable \(\zeta \) and a term t unless \(t \ne \zeta \) and \(\zeta \in \mathrm {vars}(t)\).    \(\square \)

Any clause can be unabstracted by repeatedly replacing \(C \,\vee \, \zeta \not \approx t\) by \({C[\zeta \mapsto t]}\) whenever \(t = \zeta \) or \(\zeta \notin \mathrm {vars}(t)\). Let \(\mathrm {unabstr}(C)\) denote an unabstracted version of C obtained this way. If \(t = t[\zeta _1,\ldots ,\zeta _n]\) is a term in C and \(\zeta _i\) is finally instantiated to \(t_i\), we denote its unabstracted version \(t[t_1,\ldots ,t_n]\) by \(\mathrm {unabstr}(t[\zeta _1,\ldots ,\zeta _n], C)\). For a clause set N let \(\mathrm {unabstr}(N) = \{\mathrm {unabstr}(C) \mid C \in N\}\).

The full inference system \(\mathrm {HSP}\) of the hierarchic superposition calculus consists of the inference rules of \(\mathrm {HSP}_{\mathrm {Base}}\) and the following inference rule. As for the other inference rules we suppose that all premises are weakly abstracted.

figure au

if

(i):

\(t[\zeta _1,\ldots ,\zeta _n]\) is a minimal BG-sorted non-variable term with a toplevel FG operator,

(ii):

\(t[t_1,\ldots ,t_n] = \mathrm {unabstr}(\{t[\zeta _1,\ldots ,\zeta _n], L[t[\zeta _1,\ldots ,\zeta _n]] \,\vee \, D\})\),

(iii):

\(t[t_1,\ldots ,t_n]\) is ground,

(iv):

\(\alpha _{t[t_1,\ldots ,t_n]}\) is a parameter, uniquely determined by \(t[t_1,\ldots ,t_n]\), and

(v):

\(L[t[\zeta _1,\ldots ,\zeta _n]]\,\vee \, D \in \mathcal {R}_\mathrm {Cl}^\mathcal {H}(N \,\cup \,\mathrm {abstr}(\{ t[t_1,\ldots ,t_n] \approx \alpha _{t[t_1,\ldots ,t_n]},\ L[\alpha _{t[t_1,\ldots ,t_n]}] \,\vee \, D\}))\).

In (i), by minimality we mean that no proper subterm of \(t[\zeta _1,\ldots ,\zeta _n]\) is a BG-sorted non-variable term with a toplevel FG operator. In effect, the rule eliminates such terms inside-out. Conditions (iii) and (iv) are needed for soundness. Condition (v) is needed to guarantee that is a simplifying inference rule, much like the rule in Sect. 7.Footnote 12 In particular, it makes sure that cannot be applied to definitions themselves.

Theorem 9.2

The inference rules of \(\mathrm {HSP}\) are satisfiability-preserving w.r.t. \(\models _\mathcal {B}\), i. e., for every inference with premise N and conclusion \(N'\) we have \(N \,\models _\mathcal {B}\, \Box \) if and only if \(N'\, \models _\mathcal {B}\,\Box \). Moreover, \(N' \,\models _\mathcal {B}\,N\).

Proof

For the inference rules of \(\mathrm {HSP}_{\mathrm {Base}}\), the result follows from Theorem 6.1.

For , we observe first that condition (ii) implies that \(L[t[\zeta _1,\ldots ,\zeta _n]] \,\vee \, D\) and \(L[t[t_1,\ldots ,t_n]] \,\vee \, D\) are equivalent. If \(N \,\cup \,\{ L[t[t_1,\ldots ,t_n]] \,\vee \, D \}\) is \(\mathcal {B}\)-satisfiable, let I be a \(\varSigma \)-model of all ground instances of \(N \,\cup \,\{ L[t[t_1,\ldots ,t_n]] \,\vee \, D \}\) such that \(I{|_{\varSigma _\mathrm {B}}}\) is in \(\mathcal {B}\). By condition (iii), \(t[t_1,\ldots ,t_n]\) is ground. Let J be the \(\varSigma \)-interpretation obtained from J by redefining the interpretation of \(\alpha _{t[t_1,\ldots ,t_n]}\) in such a way that \(\alpha _{t[t_1,\ldots ,t_n]}^J = t[t_1,\ldots ,t_n]^I\), then J is a \(\varSigma \)-model of every ground instance of N, \(t[t_1,\ldots ,t_n] \approx \alpha _{t[t_1,\ldots ,t_n]}\) and \(L[\alpha _{t[t_1,\ldots ,t_n]}] \,\vee \, D\), and hence also a model of the abstractions of these clauses. Conversely, every model of \(t[t_1,\ldots ,t_n] \approx \alpha _{t[t_1,\ldots ,t_n]}\) and \(L[\alpha _{t[t_1,\ldots ,t_n]}] \,\vee \, D\) is a model of \(L[t[t_1,\ldots ,t_n]] \,\vee \, D\).    \(\square \)

Example 9.3

Let \(C = \mathsf {{g}}(\mathsf {{f}}(x, y)+1, x, y) \approx 1 \,\vee \, x \not \approx 1 + \beta \,\vee \, y \not \approx \mathsf {{c}}\) be the premise of a inference. We get \(\mathrm {unabstr}(C) = \mathsf {{g}}(\mathsf {{f}}(1\,+\,\beta , \mathsf {{c}})\,+\,1, 1\,+\,\beta , \mathsf {{c}}) \approx 1\). The (unabstracted) conclusions are the definition \(\mathsf {{f}}(1\,+\,\beta , \mathsf {{c}}) \approx \alpha _{\mathsf {{f}}(1\,+\,\beta , \mathsf {{c}})}\) and the clause \(\mathsf {{g}}(\alpha _{\mathsf {{f}}(1\,+\,\beta , \mathsf {{c}})}\,+\,1, x, y) \approx 1 \,\vee \, x \not \approx 1 \,+\, \beta \,\vee \, y \not \approx \mathsf {{c}}\). Abstraction yields \(\mathsf {{f}}(X, \mathsf {{c}}) \approx \alpha _{\mathsf {{f}}(1\,+\,\beta , \mathsf {{c}})} \,\vee \, X \not \approx 1\,+\,\beta \) and \(\mathsf {{g}}(Z, x, y) \approx 1 \,\vee \, x \not \approx 1 \,+ \,\beta \,\vee \, y \not \approx \mathsf {{c}} \,\vee \, Z \not \approx \alpha _{\mathsf {{f}}(1\,+\,\beta , \mathsf {{c}})}\,+\,1\).

One might be tempted to first unabstract the premise C before applying . However, unabstraction may eliminate FG terms (\(\mathsf {{c}}\) in the example) which is not undone by abstraction. This may lead to incompleteness.   \(\square \)

Example 9.4

The following clause set demonstrates the need for condition (v) in . Let \(N = \{ \mathsf {{f}}(\mathsf {{c}}) \approx 1\}\) and suppose condition (v) is absent. Then we obtain \(N' = \{ \mathsf {{f}}(\mathsf {{c}}) \approx \alpha _{\mathsf {{f}}(\mathsf {{c}})},\ \alpha _{\mathsf {{f}}(\mathsf {{c}})} \approx 1\}\). By demodulating the first clause with the second clause we get \(N'' = \{ \mathsf {{f}}(\mathsf {{c}}) \approx 1,\ \alpha _{\mathsf {{f}}(\mathsf {{c}})} \approx 1\}\). Now we can continue with \(N''\) as with N. The problem is, of course, that the new definition \(\mathsf {{f}}(\mathsf {{c}}) \approx \alpha _{\mathsf {{f}}(\mathsf {{c}})}\) is greater w.r.t. the term ordering than the parent clause, in violation of condition (v).    \(\square \)

Example 9.5

Consider the weakly abstracted clauses \(\mathsf {{P}}(0)\), \(\mathsf {{f}}(x)>0 \,\vee \, \lnot \mathsf {{P}}(x)\), \(\mathsf {{Q}}(\mathsf {{f}}(x))\), \(\lnot \mathsf {{Q}}(x) \,\vee \, 0>x\). Suppose \(\lnot \mathsf {{P}}(x)\) is maximal in the second clause. By superposition between the first two clauses we derive \(\mathsf {{f}}(0)>0\). With we obtain \(\mathsf {{f}}(0)\approx \alpha _{\mathsf {{f}}(0)}\) and \(\alpha _{\mathsf {{f}}(0)}>0\), the latter replacing \(\mathsf {{f}}(0)>0\). From the third clause and \(\mathsf {{f}}(0)\approx \alpha _{\mathsf {{f}}(0)}\) we obtain \(\mathsf {{Q}}(\alpha _{\mathsf {{f}}(0)})\), and with the fourth clause \(0>\alpha _{\mathsf {{f}}(0)}\). Finally we apply to \(\{ \alpha _{\mathsf {{f}}(0)}>0,\ 0>\alpha _{\mathsf {{f}}(0)}\}\).    \(\square \)

It is easy to generalize Theorem 8.10 to the case that local sufficient completeness does not hold initially, but is only established with the help of inferences:

Theorem 9.6

Let \(\mathcal {D}= (N_i)_{i\ge 0}\) be a fair \(\mathrm {HSP}\) derivation starting from \(N_0 = \mathrm {abstr}(N)\), let \(k \ge 0\), such that \(N_k = \mathrm {abstr}(N')\) and \(N'\) is locally sufficiently complete. If the BG specification \((\varSigma _\mathrm {B},\mathcal {B})\) is compact, then the limit of \(\mathcal {D}\) contains \(\Box \) if and only if N is \(\mathcal {B}\)-unsatisfiable.

Proof

Since every derivation step in an \(\mathrm {HSP}\) derivation is satisfiability-preserving, the “only if” part is again obvious.

For the “if” part, we assume that \(N_\infty \), the limit of \(\mathcal {D}\), does not contain \(\Box \). By fairness, \(N_\infty \) is saturated w.r.t. \(\mathrm {HSP}\) and \(\mathcal {R}^\mathcal {H}\). We start by considering the subderivation \((N_i)_{i\ge k}\) starting with \(N_k = \mathrm {abstr}(N')\). Like in the proof of Theorem 8.10, we can show that \(N'\) is \(\mathcal {B}\)-satisfiable, that is, there exists a model J of \(N'\) that is a term-generated \(\varSigma \)-interpretation, and whose restriction \(J{|_{\varSigma _\mathrm {B}}}\) is contained in \(\mathcal {B}\). From Lemma 7.1 and Proposition 5.2 we see that \(N'\, \models _\mathcal {B}\,N_k\), and similarly \(N_0\, \models _\mathcal {B}\,N\). Furthermore, since every clause in \(N_0 \setminus N_k\) must be redundant w.r.t. \(N_k\), we have \(N_k \,\models _\mathcal {B}\, N_0\). Combining these three entailments, we conclude that \(N' \,\models _\mathcal {B}\, N\), so N is \(\mathcal {B}\)-satisfiable and J is a model of N.    \(\square \)

Condition (v) of the rule requires that the clause that is deleted during a inference must be redundant with respect to the remaining clauses. This condition is needed to preserve refutational completeness. There are cases, however, where condition (v) prevents us from introducing a definition for a subterm. Consider the clause set \(N = \{ C \}\) where \(C = \mathsf {{f}} (\mathsf {{c}}) \approx 1 \,\vee \, \mathsf {{c}} \approx \mathsf {{d}}\), the constants \(\mathsf {{c}}\) and \(\mathsf {{d}}\) are FG-sorted, \(\mathsf {{f}}\) is a BG-sorted FG operator, and \(\mathsf {{c}} \succ \mathsf {{d}} \succ 1\). The literal \(\mathsf {{f}} (\mathsf {{c}}) \approx 1\) is maximal in C. The clause set \(N = \mathrm {abstr}(N)\) is not locally sufficient complete (the BG-sorted FG-term \(\mathsf {{f}} (\mathsf {{c}})\) may be interpreted differently from all BG terms in a \(\varSigma \)-model). Moreover, it cannot be made locally sufficient complete using the rule, since the definition \(\mathsf {{f}} (\mathsf {{c}}) \approx \alpha _{\mathsf {{f}} (\mathsf {{c}})}\) is larger w.r.t. the clause ordering than C, in violation of condition (v) of .

However, at the beginning of a derivation, we may be a bit more permissive. Let us define the inference rule in the same way as the rule except that the applicability condition (v) is dropped. Clearly, in the example above, the reckless rule allows us to derive the locally sufficiently complete clause set \(N' = \{ \alpha _{\mathsf {{f}} (\mathsf {{c}})} \approx 1 \,\vee \, \mathsf {{c}} \approx \mathsf {{d}},\ \mathsf {{f}} (\mathsf {{c}}) \approx \alpha _{\mathsf {{f}} (\mathsf {{c}})}\}\) as desired. In fact, we can show that this is always possible if N is a finite clause set in which all BG-sorted FG terms are ground.

Definition 9.7

(Pre-derivation). Let \(N_0\) be a weakly abstracted clause set. A pre-derivation (of a clause set \(N^\mathrm {pre}\)) is a derivation of the form \(N_0,N_1,\) \(\ldots ,(N_k = N^\mathrm {pre})\), for some \(k \ge 0\), with the inference rule reckless only, and such that each clause \(C \in N^\mathrm {pre}\) either does not contain any BG-sorted FG operator or \(C = \mathrm {abstr}(C')\) and \(C'\) is a definition, i. e., a ground positive unit clause of the form \(\mathsf {{f}}(t_1,\ldots ,t_n) \approx t\) where \(\mathsf {{f}}\) is a BG-sorted FG operator, \(t_1,\ldots ,t_n\) do not contain BG-sorted FG operators, and t is a background term.    \(\square \)

Lemma 9.8

Let N be a finite clause set in which all BG-sorted FG terms are ground. Then there is a pre-derivation starting from \(N_0 = \mathrm {abstr}(N)\) such that \(N^\mathrm {pre}\) is locally sufficiently complete.

Proof

Since every term headed by a BG-sorted FG operator in \(\mathrm {unabstr}(N_0)\) is ground, we can incrementally eliminate all occurrences of terms headed by BG-sorted FG operators from \(N_0\), except those in abstractions of definitions. Let \(N_0,N_1,\ldots ,(N_k = N^\mathrm {pre})\) be the sequence of sets of clauses obtained in this way. We will show that \(N^\mathrm {pre}\) is locally sufficiently complete.

Let \(I \in \mathcal {B}\) be a \(\varSigma _\mathrm {B}\)-interpretation, let J be a \(\varSigma \)-model of \(\mathrm {sgi}(N^\mathrm {pre}) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I\) and let \(C\theta \) be a smooth ground instance in \(\mathrm {smgi}(N) \setminus \mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I)\). We have to show that for every BG-sorted FG term s occurring in \(C\theta \) there is a ground BG term t such that \(J \,\models \, s \approx t\).

If C does not contain any BG-sorted FG operator, then there are no BG-sorted FG terms in \(C\theta \), so the property is vacuously true. Otherwise \(C = \mathrm {abstr}(C')\) and \(C'\) is a definition \(\mathsf {{f}}(t_1,\ldots ,t_n) \approx t\) where \(\mathsf {{f}}\) is a BG-sorted FG operator, \(t_1,\ldots ,t_n\) do not contain BG-sorted FG operators, and t is a background term. In this case, C must have the form \(\mathsf {{f}}(u_1,\ldots ,u_n) \approx u \,\vee \, E\), such that E is a BG clause, \(u_1,\ldots ,u_n\) do not contain BG-sorted FG operators, and u is a BG term. The only BG-sorted FG term in the smooth instance \(C\theta \) is therefore \(\mathsf {{f}}(u_1\theta ,\ldots ,u_n\theta )\). If any literal of \(E\theta \) were true in J, then it would follow from \(\mathrm {E}_I\,\cup \,\mathrm {D}_I\), therefore \(C\theta \in \mathcal {R}_\mathrm {Cl}^\mathcal {S}(\mathrm {smgi}(N) \,\cup \,\mathrm {E}_I\,\cup \,\mathrm {D}_I)\), contradicting the assumption. Hence \(J \,\models \, \mathsf {{f}}(u_1\theta ,\ldots ,u_n\theta ) \approx u\theta \), and since \(u\theta \) is a ground BG term, the requirement is satisfied.    \(\square \)

Lemma 9.8 will be needed to prove a completeness result for the fragment defined in the next section.

10 The Ground BG-Sorted Term Fragment

According to Theorem 8.10, the \(\mathrm {HSP}_{\mathrm {Base}}\) calculus is refutationally complete provided that the clause set is locally sufficiently complete and the BG specification is compact. We have seen in the previous section that the (reckless) rule can help to establish local sufficient completeness by introducing new parameters. In fact, finite clause sets in which all BG-sorted FG terms are ground can always be converted into locally sufficiently complete clause sets (cf. Lemma 9.8). On the other hand, as noticed in Sect. 3, the introduction of parameters can destroy the compactness of the BG specification. In this and the following section, we will identify two cases where we can not only establish local sufficient completeness, but where we can also guarantee that compactness poses no problems. The ground BG-sorted term fragment (GBT fragment) is one such case:

Definition 10.1

(GBT fragment). A clause C is a GBT clause if all BG-sorted terms in C are ground. A finite clause set N belongs to the GBT fragment if all clauses in N are GBT clauses.    \(\square \)

Clearly, by Lemma 9.8 for every clause set N that belongs to the GBT fragment there is a pre-derivation that converts \(\mathrm {abstr}(N)\) into a locally sufficiently complete clause set. Moreover, pre-derivations also preserve the GBT property:

Lemma 10.2

If \(\mathrm {unabstr}(N)\) belongs to the GBT fragment and \(N'\) is obtained from N by a reckless inference, then \(\mathrm {unabstr}(N')\) also belongs to the GBT fragment.

The proof can be found in [12].

As we have seen, \(N^\mathrm {pre}\) is locally sufficiently complete. At this stage this suggests to exploit the completeness result for locally sufficiently complete clause sets, Theorem 8.10. However, Theorem 8.10 requires compact BG specifications, and the question is if we can avoid this. We can indeed get a complete calculus under rather mild assumptions on the rule:

Definition 10.3

(Suitable inference). Let \(\succ _{\mathrm {fin}}\) be a strict partial term ordering such that for every ground BG term s only finitely many ground BG terms t with \(s \succ _{\mathrm {fin}} t\) exist.Footnote 13 We say that a inference with premise \(N \,\cup \,\{ C\}\) and conclusion \(N \,\cup \,\{ D\}\) is suitable (for the GBT fragment) if

(i):

for every BG term t occurring in \(\mathrm {unabstr}(D)\) there is a BG term \(s \in \mathrm {unabstr}(C)\) such that \(s \succeq _{\mathrm {fin}} t\),

(ii):

every occurrence of a BG-sorted FG operator \(\mathsf {{f}}\) in \(\mathrm {unabstr}(D)\) is of the form \(\mathsf {{f}}(t_1,\ldots ,t_n) \approx t\) where t is a ground BG term,

(iii):

every BG term in D is pure, and

(iv):

if every BG term in \(\mathrm {unabstr}(C)\) is ground then every BG term in \(\mathrm {unabstr}(D)\) is ground.

We say the inference rule is suitable if every inference is.    \(\square \)

Expected simplification techniques like demodulation, subsumption deletion and evaluation of BG subterms are all covered as suitable inferences. Also, evaluation of BG subterms is possible, because simplifications are not only decreasing w.r.t. \(\succ \) but additionally also decreasing w.r.t. \(\succeq _{\mathrm {fin}}\), as expressed in condition (i). Without it, e. g., the clause \(\mathsf {{P}}(1\,+\,1,0)\) would admit infinitely many simplified versions \(\mathsf {{P}}(2,0)\), \(\mathsf {{P}}(2,0+0)\), \(\mathsf {{P}}(2,0+(0+0))\), etc.

The \(\mathrm {HSP}_{\mathrm {Base}}\) inferences do in general not preserve the shape of the clauses in \(\mathrm {unabstr}(N^\mathrm {pre})\); they do preserve a somewhat weaker property – cleanness – which is sufficient for our purposes.

Definition 10.4

(Clean clause). A weakly abstracted clause C is clean if

(i):

every BG term in C is pure,

(ii):

every BG term in \(\mathrm {unabstr}(C)\) is ground, and

(iii):

every occurrence of a BG-sorted FG operator \(\mathsf {{f}}\) in \(\mathrm {unabstr}(C)\) is in a positive literal of the form \(\mathsf {{f}}(t_1,\ldots ,t_n) \approx t\) where t is a ground BG term.

For example, if \(\mathsf {{c}}\) is FG-sorted, then \(\mathsf {{P}}(\mathsf {{f}}(\mathsf {{c}}) + 1)\) is not clean, while \(\mathsf {{f}}(x) \approx 1+\alpha \,\vee \, \mathsf {{P}}(x)\) is. A clause set is called clean if every clause in N is. Notice that \(N^\mathrm {pre}\) is clean.

Lemma 10.5

Let \(C_1,\ldots ,C_n\) be clean clauses. Assume a \(\mathrm {HSP}_{\mathrm {Base}}\) inference with premises \(C_1,\ldots ,C_n\) and conclusion C. Then C is clean and every BG term occurring in \(\mathrm {unabstr}(C)\) also occurs in some clause \(\mathrm {unabstr}(C_1)\),..., \(\mathrm {unabstr}(C_n)\).

The proof can be found in [12].

Thanks to conditions (ii)–(iv) in Definition 10.3, suitable inferences preserves cleanness:

Lemma 10.6

Let \(N \,\cup \,\{C\}\) be a set of clean clauses. If \(N \,\cup \,\{D\}\) is obtained from \(N \,\cup \,\{C\}\) by a suitable inference then D is clean.

Proof

Suppose \(N \,\cup \,\{D\}\) is obtained from \(N \,\cup \,\{C\}\) by a suitable inference. We need to show properties (i)–(iii) of cleanness for D. That every BG term in D is pure follows from Definition 10.3-(iii). That every BG term in \(\mathrm {unabstr}(D)\) is ground follows from Definition 10.3-(iv) and cleanness of C. Finally, property (iii) follows from Definition 10.3-(ii).    \(\square \)

With the above lemmas we can prove our main result:

Theorem 10.7

The \(\mathrm {HSP}\) calculus with a suitable inference rule is dynamically refutationally complete for the ground BG-sorted term fragment. More precisely, let N be a finite set of GBT clauses and \(\mathcal {D}= (N_i)_{i\ge 0}\) a fair \(\mathrm {HSP}\) derivation such that reckless is applied only in a pre-derivation \((N_0 = \mathrm {abstr}(N)),\ldots ,(N_k = N^\mathrm {pre})\), for some \(k \ge 0\). Then the limit of \(\mathcal {D}\) contains \(\Box \) if and only if N is \(\mathcal {B}\)-unsatisfiable.

Notice that Theorem 10.7 does not appeal to compactness of BG specifications.

Proof

Our goal is to apply Theorem 9.6 and its proof, in a slightly modified way. For that, we first need to know that \(N^\mathrm {pre}= \mathrm {abstr}(N')\) for some clause set \(N'\) that is locally sufficiently complete.

We are given that N is a set of GBT clauses. Recall that weak abstraction (recursively) extracts BG subterms by substituting fresh variables and adding disequations. Unabstraction reverses this process (and possibly eliminates additional disequations). It follows that with N being a set of GBT clauses, so is \(\mathrm {unabstr}(\mathrm {abstr}(N)) = \mathrm {unabstr}(N_0)\). From Lemma 10.2 it follows that \(\mathrm {unabstr}(N^\mathrm {pre})\) is also a GBT clause set.

Now chose \(N'\) as the clause set that is obtained from \(N^\mathrm {pre}\) by replacing every clause \(C \in N^\mathrm {pre}\) such that \(\mathrm {unabstr}(C)\) is a definition by \(\mathrm {unabstr}(C)\). By construction of definitions, unabstraction reverses weak abstraction of definitions. It follows \(N^\mathrm {pre}= \mathrm {abstr}(N')\). By definition of pre-derivations, all BG-sorted FG terms occurring in \(\mathrm {unabstr}(N^\mathrm {pre})\) occur in definitions. Hence, with \(\mathrm {unabstr}(N^\mathrm {pre})\) being a set of GBT clauses so is \(N'\). It follows easily that \(N'\) is locally sufficiently complete, as desired.

We cannot apply Theorem 9.6 directly now because it requires compactness of the BG specification, which cannot be assumed. However, we can use the following argumentation instead.

Let \(N^\infty = \bigcup _{i\ge 0} N_i\) be the union of \(\mathcal {D}\). We next show that \(\mathrm {unabstr}(N^\infty )\) contains only finitely many different BG terms and each of them is ground. Recall that \(\mathrm {unabstr}(N^\mathrm {pre})\) is a GBT clause set, and so every BG term in \(\mathrm {unabstr}(N^\mathrm {pre})\) is ground. Because is disabled in \(\mathcal {D}\), only \(\mathrm {HSP}_{\mathrm {Base}}\) and (suitable) inferences need to be analysed. Notice that \(N^\mathrm {pre}\) is clean and both the \(\mathrm {HSP}_{\mathrm {Base}}\) and inferences preserve cleanness, as per Lemmas 10.5-(1) and 10.6, respectively.

With respect to \(\mathrm {HSP}_{\mathrm {Base}}\) inferences, together with Definition 10.4-(ii) it follows that every BG term t in the unabstracted version \(\mathrm {unabstr}(C)\) of the inference conclusion C is ground. Moreover, t also occurs in the unabstracted version of some premise clause by Lemma 10.5-(2). In other words, \(\mathrm {HSP}_{\mathrm {Base}}\) inferences do not grow the set of BG terms w.r.t. unabstracted premises and conclusions.

With respect to inferences, \(\mathrm {unabstr}(N^\mathrm {pre})\) provide an upper bound w.r.t. the term ordering \(\succ _{\mathrm {fin}}\) for all BG terms generated in inferences. There can be only finitely many such terms, and each of them is ground, which follows from Definition 10.3-(i).

Because every BG term occurring in \(\mathrm {unabstr}(N^\infty )\) is ground, every BG clause in \(\mathrm {unabstr}(N^\infty )\) is a multiset of literals of the form \(s \approx t\) or \(s \not \approx t\), where s and t are ground BG terms. With only finitely many BG terms available, there are only finitely many BG clauses in \(\mathrm {unabstr}(N^\infty )\), modulo equivalence. Because unabstraction is an equivalence transformation, there are only finitely many BG clauses in \(N^\infty \) as well, modulo equivalence.

Let \(N_\infty = \bigcup _{i\ge 0} \bigcap _{j\ge i} N_j\) be the limit clause set of the derivation \(\mathcal {D}\), which is saturated w.r.t. the hierarchic superposition calculus and \(\mathcal {R}^\mathcal {H}\). Because \(\mathcal {D}\) is not a refutation, it does not contain \(\Box \). Consequently the rule is not applicable to \(N_\infty \). The set \(N^\infty \), and hence also \(N_\infty \subseteq N^\infty \), contains only finitely many BG clauses, modulo equivalence. This entails that the set of all \(\varSigma _\mathrm {B}\)-clauses in \(\mathrm {sgi}(N_\infty )\) is satisfied by some term-generated \(\varSigma _\mathrm {B}\)-interpretation \(I \in \mathcal {B}\). Now, the rest of the proof is literally the same as in the proof of Theorem 9.6.    \(\square \)

Because unabstraction can also be applied to fully abstracted clauses, it is possible to equip the hierarchic superposition calculus of [5] with a correspondingly modified rule and get Theorem 10.7 in that context as well.

Kruglov and Weidenbach [22] have shown how to use hierarchic superposition as a decision procedure for ground clause sets (and for Horn clause sets with constants and variables as the only FG terms). Their method preprocesses the given clause set by “basification”, a process that removes BG-sorted FG terms similarly to our reckless rule. The resulting clauses then are fully abstracted and hierarchic superposition is applied. Some modifications of the inference rules make sure derivations always terminate. Simplification is restricted to subsumption deletion. The fragment of [22] is a further restriction of the GBT fragment. We expect we can get decidability results for that fragment with similar techniques.

11 Linear Arithmetic

For the special cases of linear integer arithmetic (LIA) and linear rational arithmetic as BG specifications, the result of the previous section can be extended significantly: In addition to ground BG-sorted terms, we can also permit BG-sorted variables and, in certain positions, even variables with offsets.

Recall that we have assumed that equality is the only predicate symbol in our language, so that a non-equational atom, say \(s < t\), is to be taken as a shorthand for the equation \((s < t) \approx true \). We refer to the terms that result from this encoding of atoms as atom terms; other terms are called proper terms.

Theorem 11.1

Let N be a set of clauses over the signature of linear integer arithmetic (with parameters \(\alpha \), \(\beta \), etc.), such that every proper term in these clauses is either (i) ground, or (ii) a variable, or (iii) a sum \(\zeta + k\) of a variable \(\zeta \) and a number \(k \ge 0\) that occurs on the right-hand side of a positive literal \(s < \zeta \,+\,k\). If the set of ground terms occurring in N is finite, then N is satisfiable in LIA over \(\mathbb {Z}\) if and only if N is satisfiable w.r.t.the first-order theory of LIA.

Proof

Let N be a set of clauses with the required properties, and let T be the finite set of ground terms occurring in N. We will show that N is equivalent to some finite set of clauses over the signature of linear integer arithmetic, which implies that it is satisfiable in the integer numbers if and only if it is satisfiable in the first-order theory of LIA.

In a first step, we replace every negative ordering literal \(\lnot s < t\) or \(\lnot s \le t\) by the equivalent positive ordering literal \(t \le s\) or \(t < s\). All literals of clauses in the resulting set \(N_0\) have the form \(s \approx t\), \(s \not \approx t\), \(s < t\), \(s \le t\), or \(s < \zeta + k\), where s and t are either variables or elements of T and \(k \in \mathbb {N}\). Note that the number of variables in clauses in \(N_0\) may be unbounded.

In order to handle the various inequality literals in a more uniform way, we introduce new binary relation symbols \(<_k\) (for \(k \in \mathbb {N}\)) that are defined by \(a <_k b\) if and only if \(a < b + k\). Observe that \(s <_k t\) entails \(s <_n t\) whenever \(k \le n\). Obviously, we may replace every literal \(s < t\) by \(s <_0 t\), every literal \(s \le t\) by \(s <_1 t\), and every literal \(s < \zeta + k\) by \(s <_k \zeta \). Let \(N_1\) be the resulting clause set.

We will now transform \(N_1\) into an equivalent set \(N_2\) of ground clauses. We start by eliminating all equality literals that contain variables by exhaustively applying the following transformation rules:

figure cm

All variables in inequality literals are then eliminated in a Fourier-Motzkin-like manner by exhaustively applying the transformation rule

figure cn

where \(\zeta \) does not occur in C and one of the index sets I and J may be empty.

The clauses in \(N_2\) are constructed over the finite set T of proper ground terms, but the length of the clauses in \(N_2\) is potentially unbounded. In the next step, we will transform the clauses in such a way that any pair of terms st from T is related by at most one literal in any clause: We apply one of the following transformation rules as long as two terms s and t occur in more than one literal:

figure co

The length of the clauses in the resulting set \(N_3\) is now bounded by \(\frac{1}{2}m(m\,+\,1)\), where m is the cardinality of T. Still, due to the indices of the relation symbols \(<_k\), \(N_3\) may be infinite. We introduce an equivalence relation \(\sim \) on clauses in \(N_3\) as follows: Define \(C \sim C'\) if for all \(s,t \in T\) (i) \(s \approx t \in C\) if and only if \(s \approx t \in C'\), (ii) \(s \not \approx t \in C\) if and only if \(s \not \approx t \in C'\), and (iii) \(s <_k t \in C\) for some k if and only if \(s <_n t \in C'\) for some n. This relation splits \(N_3\) into at most \((\frac{1}{2}m(m\,+\,1))^5\) equivalence classes.Footnote 14

We will now show that each equivalence class is logically equivalent to a finite subset of itself. Let M be some equivalence class. Since any two clauses from M differ at most in the indices of their \(<_k\)-literals, we can write every clause \(C_i \in M\) in the form

figure cp

where C and the \(s_l\) and \(t_l\) are the same for all clauses in M. As we have mentioned above, \(s_l <_{k_{il}} t_l\) entails \(s_l <_{k_{jl}} t_l\) whenever \(k_{il} \le k_{jl}\); so a clause \(C_i \in M\) entails \(C_j \in M\) whenever the n-tuple \((k_{i1},\dots ,k_{in})\) is pointwise smaller or equal to the n-tuple \((k_{j1},\dots ,k_{jn})\) (that is, \(k_{il} \le k_{jl}\) for all \(1 \le l \le n\)).

Let Q be the set of n-tuples of natural numbers corresponding to the clauses in M. By Dickson’s lemma [15], for every set of tuples in \(\mathbb {N}^n\) the subset of minimal tuples (w.r.t.the pointwise extension of \(\le \) to tuples) is finite. Let \(Q'\) be the subset of minimal tuples in Q, and let \(M'\) be the set of clauses in M that correspond to the tuples in \(Q'\). Since for every tuple in \(Q \setminus Q'\) there is a smaller tuple in \(Q'\), we know that every clause in \(M \setminus M'\) is entailed by some clause in \(M'\). So the equivalence class M is logically equivalent to its finite subset \(M'\). Since the number of equivalence classes is also finite and all transformation rules are sound, this proves our claim.    \(\square \)

In order to apply this theorem to hierarchic superposition, we must again impose some restrictions on the calculus. Most important, we have to change the definition of weak abstraction slightly: We drop the requirement that target terms are not domain elements from Definition 5.1, i. e., we abstract out a non-variable BG term q occurring in a clause \(C[f(s_1,\dots ,q,\dots ,s_n)]\), where f is a FG operator or at least one of the \(s_i\) is a FG or impure BG term, even if q is a domain element. As we mentioned, all results obtained so far hold also for the modified definition of weak abstraction. In addition, we must again restrict to suitable inferences (Definition 10.3). With these restrictions, we can prove our main result:

Theorem 11.2

The hierarchic superposition calculus is dynamically refutationally complete w.r.t. LIA over \(\mathbb {Z}\) for finite sets of \(\varSigma \)-clauses in which every proper BG-sorted term is either (i) ground, or (ii) a variable, or (iii) a sum \(\zeta + k\) of a variable \(\zeta \) and a number \(k \ge 0\) that occurs on the right-hand side of a positive literal \(s < \zeta + k\).

Proof

Let N be a finite set of \(\varSigma \)-clauses with the required properties. By Lemma 9.8, a pre-derivation starting from \(N_0 = \mathrm {abstr}(N)\) yields a locally sufficiently complete finite set \(N_0\) of abstracted clauses.

Now we run the hierarchic superposition calculus on \(N_0\) (with the same restrictions on simplifications as in Sect. 10). Let \(N_1\) be the (possibly infinite) set of BG clauses generated during the run. By unabstracting these clauses, we obtain an equivalent set \(N_2\) of clauses that satisfy the conditions of Theorem 11.1, so \(N_2\) is satisfiable in LIA over \(\mathbb {Z}\) if and only if N is satisfiable w.r.t.the first-order theory of LIA. Since the hierarchic superposition calculus is dynamically refutationally complete w.r.t. the first-order theory of LIA, the result follows.    \(\square \)

Analogous results hold for linear rational arithmetic. Let n be the least common divisor of all numerical constants in the original clause set; then we define \(a <_{2i} b\) by \(a < b + \frac{i}{n}\) and \(a <_{2i+1} b\) by \(a \le b + \frac{i}{n}\) for \(i \in \mathbb {N}\) and express every inequation literal in terms of \(<_k\). The Fourier-Motzkin transformation rule is replaced by

figure cr

where \(\zeta \) does not occur in C, one of the index sets I and J may be empty, and \(k \bullet n\) is defined as \(k + n - 1\) if both k and n are odd, and \(k + n\) otherwise. The rest of the proof proceeds in the same way as before.

12 Experiments

We implemented the \(\mathrm {HSP}\) calculus in the theorem prover Beagle.Footnote 15 Beagle is a testbed for rapidly trying out theoretical ideas but it is not a high-performance prover (in particular it lacks indexing of any form). The perhaps most significant calculus feature not yet implemented is the improvement for linear integer and rational arithmetic of Sect. 11.

Beagle ’s proof procedure and background reasoning, in particular for linear integer arithmetic, and experimental results have been described in [7]. Here we only provide an update on the experiments and report on complementary aspects not discussed in [7]. More specifically, our new experiments are based on a more recent version of the TPTP problem library [27] (by four years), and we discuss in more detail the impact of the various calculus variants introduced in this paper. We also compare Beagle ’s performance to that of other provers.

We tested Beagle on the first-order problems from the TPTP library, version 7.2.0,Footnote 16 that involve some form of arithmetic, including non-linear, rational and real arithmetics. The problems in the TPTP are organized in categories, and the results for some of them are quickly dealt with: none of the HWV-problems in the TPTP library was solvable within the time limit and we ignore these below. We ignore also the SYN category as its sole problem is merely a syntax test, and the GEG category as all problems are zero-rated and easily solved by Beagle.

The experiments were run on a MacBook Pro with a 2.3 GHz Intel i7 processor and 16 GB main memory. The CPU time limit was 120 s (a higher time limit does not help much solving more problems). Tables 1 and 2 summarize the results for the problems with a known “theorem” or “unsatisfiable” status with non-zero rating. Beagle can also solve some satisfiable problems, but most of them are rather easy and can be solved by the BG solver alone. Unfortunately, the TPTP does not contain reasonably difficult satisfiable problems from the GBT-fragment, which would be interesting for exploiting the completeness result of Sect. 10.

Table 1. Number of TPTP version 7.2.0 problems solved, of all non-zero rated “theorem” or “unsatisfiable” problems involving any form of arithmetic. The flag settings giving the best result are in typeset in bold. The CPU time limit was 120 s. The column “Any” is the number of problems solved in the union of the four setting to its left. For the “Auto” column see the description of auto-mode in the main text further below. For auto-mode only, the CPU time limit was increased to 300 s.

Table 1 is a breakdown of Beagle ’s performance by TPTP problem categories and four flag settings. Beagle features a host of flags for controlling its search, but in Table 1 we varied only the two most influential ones: one that controls whether input arithmetic variables are taken as ordinary variables or as abstraction variables. (Sect. 5 discusses the trade-off between these two kinds of variables.) The other controls whether simplification of BG terms is done cautiously or aggressively.

To explain, the cautious simplification rules comprise evaluation of arithmetic terms, e. g. \(3\cdot 5\), \(3 < 5\), \(\alpha +1 < \alpha +1\) (equal lhs and rhs terms in inequations), and rules for TPTP-operators, e. g., \(\mathsf {{to\_rat}}(5)\), \(\mathsf {{is\_int}}(3.5)\). For aggressive simplification, integer sorted subterms are brought into a polynomial-like form and are evaluated as much as possible. For example, the term \(5\cdot \alpha \,+ \, \mathsf {{f}}(3\,+\,6, \alpha \cdot 4) - \alpha \cdot 3\) becomes \(2\cdot \alpha \,+\, \mathsf {{f}}(9, 4\cdot \alpha )\). These conversions exploit the associativity and commutativity laws for \(+\) and \(\cdot \). We refer the reader to [7] for additional aggressive simplification rules, but we note here that aggressive simplification does not always preserve sufficient completeness. For example, in the clause set \(N = \{ \mathsf {{P}}(1\,+\,(2\,+\,\mathsf {{f}}(X))),\ \lnot \mathsf {{P}}(1\,+\,(X\,+\,\mathsf {{f}}(X)))\}\) the first clause is aggressively simplified, giving \(N' = \{ \mathsf {{P}}(3\,+\,\mathsf {{f}}(X)),\ \lnot \mathsf {{P}}(1\,+\,(X\,+\,\mathsf {{f}}(X)))\}\). Both N and \(N'\) are LIA-unsatisfiable, \(\mathrm {sgi}(N) \,\cup \,\mathrm {GndTh}(\mathrm {LIA})\) is unsatisfiable, but \(\mathrm {sgi}(N') \,\cup \,\mathrm {GndTh}(\mathrm {LIA})\) is satisfiable. Thus, N is (trivially) sufficiently complete while \(N'\) is not.

These two flag settings, in four combinations in total, span a range from “most complete but larger search space” by using ordinary variables and cautious simplification, to “most incomplete but smaller search space” by using abstraction variables and aggressive simplification. As the results in Table 1 show, the flag setting “abstraction variables” solves more problems than “ordinary variables”, but not uniformly so. Indeed, as indicated by the “Any” column in Table 1, there are problems that are solved only with either ordinary or abstraction variables.

Some more specific comments, by problem categories:

ARI. Of the 362 solved problems, 14 are not solved in every setting. Of these, four problems require cautious simplification, and five problems require aggressive simplification. This is independent from whether abstraction or ordinary variables are used.

DAT. The DAT category benefits significantly from using ordinary variables. There is only one problem, DAT075=1.p, that is not solved with ordinary variables. Two problems, DAT072=1.p and DAT086=1.p are solvable only with ordinary variables and aggressive simplification.

Many problems in the DAT category, including DAT086=1.p, state existentially quantified theorems about data structures such as arrays and lists. If they are of an arithmetic sort, these existentially quantified variables must be taken as ordinary variables. This way, they can be unified with BG-sorted FG terms such as \(\mathsf {{head}}(\mathsf {{cons}}(x, y))\) (which appear in the list axioms) which might be necessary for getting a refutation at all.

A trivial example for this phenomenon is the entailment \(\{\mathsf {{P}}(\mathsf {{f}}(1))\} \,\models \, \exists x\ \mathsf {{P}}(x)\), where \(\mathsf {{f}}\) is BG-sorted, which is provable only with ordinary variables.

NUM. This category requires abstraction variables. With it, four of the problems can be solved in the NUM category (NUM859=1.p, NUM860=1.p, NUM861=1.p, NUM862=1.p), as the search space with ordinary variables is too big.

SWW. By and large, cautious BG simplification fares slightly better on the SWW problems. Of the 97 problems solved, 16 are not solved in every setting, and the settings that do solve it do not follow an obvious pattern.

We were also interested in Beagle ’s performance, on the same problems, broken down by the calculus features introduced in this paper. Table 2 summarizes our findings for five configurations obtained by progressively enabling these features. In order to assess the usefulness of the features we filtered the results by problem rating. The column “\(\ge \)0.75”, for instance, lists the number of solved problems, of all 80 known “theorem” or “unsatisfiable” problems with a rating 0.75 or higher and that involve some form of arithmetic.

Table 2. Number of “theorem” or “unsatisfiable” problems solved, by calculus features and problem rating, excluding the HWV-problems.

The predecessor calculus of [5] uses an exhaustive abstraction mechanism that turns every side of an equation into either a pure BG or pure FG term. All BG variables are always abstraction variables. Configuration implements this calculus, with the only deviation of an added splitting rule. The splitting rule [29] breaks apart a clause into variable-disjoint parts and leads to a branching search space for finding corresponding sub-proofs. See again [7] for more details.

In our experiments splitting is always enabled, in particular also for configuration for better comparability of result. Cautious BG simplification is enabled for configuration and the subsequent configurations .

Configuration differs from configuration only by an additional rule. (As said earlier, the rule can be added without problems to the previous calculus.) By comparing the results for and it becomes obvious that adding improves performance dramatically. This applies to the new calculus as well. The rule stands out and should always be enabled.

Configuration replaces the standard abstraction mechanism of [5] by the new weak abstraction mechanism of Sect. 5. Weak abstraction seems more effective than standard abstraction for problems with a higher rating, but the data set supporting this conclusion is very small.

There are five problems, all from the SWW categoryFootnote 17 that re solved only with configuration , and there is one problem, SWW607=2.p, that is solved only by configurations and .

There are four solvable problems with rating 0.75. These are ARI595=1.p – ARI598=1.p, which are “simple” problems involving a free predicate symbols over the integer background theory. The problem ARI595=1.p, for instance, is to prove the validity of the formula .Footnote 18 The calculus and implementation techniques needed for solving such problems are rather different to those needed for solving combinatory problems involving trivial arithmetics only, like, e.g., the HWV-problems.

Configuration is the same as except that it includes the results for general variables instead of abstraction variables. Similarly, configuration is the same as except that it includes the results for aggressive BG simplification. It is the union of all results in Table 1.

For comparison with other implemented theorem provers for first-order logic with arithmetics, we ran Beagle on the problem set used in the 2018 edition of the CADE ATP system competition (CASC-J9).Footnote 19. The competing systems were CVC4 [6], Princess [26], and two versions of Vampire [20].

In the competition, the systems were given 200 problems from the TPTP problem library, 125 problems over the integers as the background theory (TFI category), and 75 over the reals (TFE category). The system that solves the most problems in the union of the TFI and TFE categories within a CPU time limit of 300 s wins. We applied Beagle in an “auto” mode, which time-slices (at most) three parameter settings. These differ mainly in their use of abstraction variables or ordinary variables, and the addition of certain arithmetic lemmas.

Table 3. CADE ATP system competition results 2018 and Beagle ’s performance on the same problem sets.

The results are summarized in Table 3. We note that Beagle was run on different hardware but the same timeout of 300 s. The results are thus only indicative of Beagle ’s performance, but we do not expect significantly different result had it participated. In the TFI category, of the 36 problems solved, 5 require the use of ordinary variables. In the TFE category, 16 problems involve the ceiling or floor function, which is currently not implemented, and hence cannot be attempted.

In general, many problems used in the competition are rather large in size or search space and would require a more sophisticated implementation of Beagle.

13 Conclusions

The main theoretical contribution of this paper is an improved variant of the hierarchic superposition calculus. One improvement over its predecessor [5] is a different form of “abstracted” clauses, the clauses the calculus works with internally. Because of that, a modified completeness proof is required. We have argued informally for the benefits over the old calculus in [5]. They concern making the calculus “more complete” in practice. It is hard to quantify that exactly in a general way, as completeness is impossible to achieve in presence of background-sorted foreground function symbols (e. g., “head” of integer-sorted lists). To compensate for that to some degree, we have reported on initial experiments with a prototypical implementation on the TPTP problem library. These experiments clearly indicate the benefits of our concepts, in particular the definition rule and the use of ordinary variables. There is no problem that is solved only by the old calculus only. Certainly more experimentation and an improved implementation is needed to also solve bigger-sized problems with a larger combinatorial search space.

We have also obtained two new completeness results for certain clause logic fragments that do not require compactness of the background specification, cf. Sects. 10 and 11. The former is loosely related to the decidability results in [22], as discussed in Sect. 9. It is also loosely related to results in SMT-based theorem proving. For instance, the method in [18] deals with the case that variables appear only as arguments of, in our words, foreground operators. It works by ground-instantiating all variables in order to being able to use an SMT-solver for the quantifier-free fragment. Under certain conditions, finite ground instantiation is possible and the method is complete, otherwise it is complete only modulo compactness of the background theory (as expected). Treating different fragments, the theoretical results are mutually non-subsuming with ours. Yet, on the fragment they consider we could adopt their technique of finite ground instantiation before applying Theorem 10.7 (when it applies). However, according to Theorem 10.7 our calculus needs instantiation of background-sorted variables only, this way keeping reasoning with foreground-sorted terms on the first-order level, as usual with superposition.