1 Introduction

Satisfiability (SAT) solvers have been successfully used for a broad spectrum of applications ranging from formal verification [1] and security [2] to classical mathematics [3, 4]. This success came as a slight surprise because the translation of problem instances from application domains into propositional logic can lead to a loss of domain-specific information. However, this loss of information is often harmless since many domain-specific reasoning techniques (e.g., for Boolean circuits or number theory) can be simulated by SAT-preprocessing techniques such as blocked-clause elimination [5]. In this paper, we present further evidence for this observation by introducing a new propositional reasoning technique that simulates the removal of redundant inputs from Boolean circuits.

Our reasoning technique, which we call globally-blocked-clause elimination, is strongly related to the concept of conditional autarkies [6]—a generalization of autarkies [7, 8]. Given a propositional formula in conjunctive normal form, an autarky is a truth assignment that satisfies every clause it touches, that is, it satisfies every clause of which it assigns at least one variable. For example, given the formula \((\overline{a} \vee b) \wedge (a \vee \overline{b} \vee \overline{c} \vee d) \wedge (c \vee \overline{d})\), the (partial) assignment that makes both a and b true is an autarky. In contrast, neither the assignment that makes only a true nor the assignment that makes only b true is an autarky because each of them touches a clause without satisfying it.

A conditional autarky is an assignment that can be split into two parts—the conditional part and the autarky part—such that the autarky part becomes an autarky after applying the conditional part to the formula. For example, after making a true in the formula \((a \vee \overline{b} \vee \overline{c}) \wedge (\overline{a} \vee b \vee \overline{d}) \wedge (\overline{a} \vee \overline{b} \vee c) \wedge (\overline{a} \vee d)\) we obtain the formula \((b \vee \overline{d}) \wedge (\overline{b} \vee c) \wedge (d)\) for which the assignment that makes b and c true is an autarky. Hence, the assignment that makes a, b, and c true is a conditional autarky with conditional part a although it is not an autarky. In fact, every truth assignment is a conditional autarky with an empty autarky part, but we are particularly interested in conditional autarkies with non-empty autarky parts. We show that such conditional autarkies help us find redundant clauses (i.e., clauses that can be removed from a formula without affecting satisfiability). More specifically, we present globally-blocked clauses, a novel type of redundant clauses that is strongly related to the conditional-autarky concept.

Globally-blocked clauses are a strict generalization of set-blocked clauses [9], which themselves are a strict generalization of blocked clauses [10]. This means that every set-blocked clause (and thus every blocked clause) is a globally-blocked clause but not vice versa. The elimination of blocked clauses can improve the efficiency of SAT solvers [11], first-order theorem provers [12], and solvers for quantified Boolean formulas (QBF) [13, 14]. Moreover, it can simulate several reasoning techniques for Boolean circuits on the propositional level. Set-blocked clauses form the basis of the satisfaction-driven clause learning [6] paradigm for SAT solving, which can lead to exponential speed-ups on hard formulas compared to traditional conflict-driven clause learning [15].

We show how the elimination of globally-blocked clauses simulates circuit-reasoning techniques that could not previously be performed by SAT solvers. In a preprocessing step before the actual solving, our approach takes a formula and tries to find conditional autarkies with large autarky parts. It then uses the resulting conditional autarkies to identify and eliminate globally-blocked clauses from the formula, which results in a simplified formula that is easier to solve. In a more theoretical part of the paper, we present several properties of conditional autarkies and pin down their relationship with globally-blocked clauses and other existing types of redundant clauses.

The main purpose of this invited paper is to discuss the concept of globally-blocked clauses and relate it to conditional autarkies [6]. This concept was partially described in section 2.2.4 of the PhD thesis of the first author [16]. In this paper, we provide more details and also describe an algorithm for fast computation of multiple globally-blocked clauses from one arbitrary total assignment. This algorithm has also been implemented in the SAT solver CaDiCaL. The second purpose of this paper is to describe the history and applications of such notions of redundancy as well as how these are used in clausal proofs.

2 Preprocessing, Redundancy, and Proofs

Preprocessing (simplifying a formula before search) and inprocessing (simplifying a formula during search) [17] are crucial techniques in state-of-the-art SAT solvers. This line of research started with bounded variable elimination, which makes it possible to significantly reduce the size of large industrial verification problems [18]. Bounded variable elimination removes variables from a formula—by combining variable elimination with substitution—if this removal does not increase the size of the formula. More recently, some solvers even allow a small increase in formula size.

Another popular approach to preprocessing and inprocessing is the use of so-called clause-elimination techniques, which remove clauses from a formula without affecting satisfiability. Examples are the two particularly well-known clause-elimination techniques of subsumed-clause elimination [18] and blocked-clause elimination [11]. Blocked-clause elimination simulates many circuit simplification techniques on the CNF level and it allows the simulation of other high-level reasoning techniques, such as the elimination of redundant Pythagorean triples, which was crucial to solving the Pythagorean triples problem [19]. Several generalizations of subsumed clauses and blocked clauses have been proposed to further reduce the size of propositional formulas [20, 21, 22]. Moreover, clause-elimination techniques boost the effectiveness of variable-elimination techniques since the removal of clauses often enables further variable-elimination steps.

Although preprocessing techniques have contributed significantly to the improvement of SAT solvers in the last decade, they can also be expensive. To deal with this issue, SAT solvers have shifted their focus from preprocessing to inprocessing, meaning that only limited preprocessing is done initially and that later on the solver interleaves additional variable and clause-elimination techniques with the search process. One advantage of this is that inprocessing can simplify a formula after important clauses have been learned, allowing for further simplifications compared to preprocessing. As a matter of fact, inprocessing solvers have been dominating the SAT competitions since 2013.

As a drawback, the incorporation of inprocessing has made SAT solvers more complex and thus more prone to implementation errors and conceptual errors [17]. Various communities that use SAT solvers have therefore expressed interest in verifiable SAT solver output. For example, SAT solvers are used in industry to show the correctness of safety-critical hardware and software, or in mathematics to solve long-standing open problems. In both cases, it is crucial that a SAT solver does not just return a simple yes/no answer but instead produces verifiable output that certifies the correctness of its result—a so-called proof.

Constructing such proofs is a non-trivial issue as several inprocessing techniques cannot be expressed succinctly in the resolution proof system, which was commonly used in the past. This led to the search for stronger proof systems that are well-suited for practical SAT solving, and it turned out that clause-redundancy notions that form the theoretical foundation of clause-elimination techniques can also act as ideal building blocks for stronger proof systems. The DRAT proof system [23], which is based on the notion of resolution asymmetric tautologies, has since become the de facto standard proof system in SAT solving. DRAT is now supported by all state-of-the-art SAT solvers and there exist formally-verified tools—in ACL2, Coq, and Isabelle [24, 25]—that check the correctness of DRAT proofs. Such tools have not only been used for validating the unsatisfiability results of recent SAT competitions [26] but also for verifying the solutions of some long-standing math problems, including the Erdős discrepancy conjecture, the Pythagorean triples problem, and Schur number five [3, 4, 19].

To strengthen the DRAT proof system even further, proof systems based on stronger redundancy notions than resolution asymmetric tautologies have been proposed, leading to the Propagation Redundancy (PR) proof system [27]. This proof system is surprisingly strong even without the introduction of new variables, which usually is a key technique to obtaining short proofs. As has been shown, there exist short PR proofs without new variables for pigeonhole formulas, Tseitin formulas, and mutilated chessboard problems [28, 29]—these problem families are notorious for admitting only resolution proofs of exponential size. Moreover, the PR proofs for these problems can be found automatically using the satisfaction-driven clause learning (SDCL) paradigm [6, 30], which is a generalization of conflict-driven clause learning.

As DRAT has been shown to polynomially simulate the PR proof system [28], it is possible to transform PR proofs into DRAT proofs and then check their correctness using a formally-verified checker, meaning that one can have high confidence in the correctness of results obtained by SDCL.

Research on preprocessing, clause redundancy, and proofs has also expanded beyond propositional logic. When solving quantified Boolean formulas, QBF generalizations of blocked-clause elimination have been used successfully [13]. Also, the QRAT proof system [31] (a generalization of DRAT) allows the succinct expression of virtually all QBF inprocessing techniques, and QRAT has given rise to various new QBF preprocessing techniques, such as the elimination of blocked literals [32] as well as of QRAT clauses and their generalizations [33, 34]. The research on generalizing redundancy notions has also been extended to first-order logic [35], where especially the elimination of blocked clauses has proven to be a valuable preprocessing technique [12].

In the following, we present globally-blocked clauses, a new kind of redundant clauses that generalizes existing notions of redundancy in propositional logic.

3 Motivating Example

Consider a single-output circuit F(I) (where I is a set of inputs) that can be decomposed syntactically into \(F(I) = G (J, H (K))\), where both G and H are single-output sub-circuits, and J and K partition the inputs I. If we want to solve the satisfiability problem for F, i.e., the problem of deciding whether there exists an assignment to the inputs such that F produces 1 as its output (CIRCUIT-SAT), we can proceed as follows: We show that \(G(J, x)\) is satisfiable, where x is a new input variable replacing the output of H, and that H(K) can produce both 0 and 1 as its output. In many situations, if the sub-circuit H is given, the second requirement can be shown easily using random simulation. Checking the satisfiability of G remains, but this is hopefully easier since G is smaller than F.

This circuit-level technique is also called “unique sensitization” in the FAN algorithm [36] and H would be called a “headline”. However, if we are only given a CNF encoding of F, previously known CNF preprocessing techniques are in general not able to perform such a simplification, whereas globally-blocked-clause elimination in essence makes it possible to remove all clauses of the CNF encoding of H.

To continue the example, assume for simplicity that the top-level gate of H is an AND gate (the same arguments apply to arbitrary top gates of H). After introducing Tseitin variables x for H, y and z for the AND gate inputs, etc., the Tseitin encoding \(F'\) of F has the following structure

$$ F' \;=\; G'(J, S, x) \;\wedge\; \underbrace{(\overline{x} \vee y) \wedge (\overline{x} \vee z) \wedge (x \vee \overline{y} \vee \overline{z}) \wedge H''(K, T, y, z)}_{H'} $$

where \(S \cup T \cup \{x, y, z\}\) are new Tseitin variables. Note that \(F'\), \(G'\), \(H'\) and \(H''\) are in CNF. Further assume we find two assignments \(\alpha \) and \(\beta \) over the variables of \(H'\), which both satisfy \(H'\), i.e., \(\alpha (H') = \beta (H') = 1\), and \(\alpha (x) = 1\), \(\beta (x) = 0\).

It is not that hard to find such assignments through local search or random decisions and unit propagation. Actually, one can also start with a total assignment with these properties, which will then—by our algorithms—be pruned down to range only over variables in \(H'\). These assignments are conditional autarkies where the conditional part consists of the assignment to x and the autarky part consists of the assignments to the other variables of \(H'\).

It turns out that the first two binary clauses encoding the top AND gate of H contain the negation \(\overline{x}\) of the condition in \(\alpha \), and the last ternary clause of the AND gate contains the negation x of the condition in \(\beta \). Moreover, these three clauses are satisfied by the autarky parts of \(\alpha \) and \(\beta \). As we are going to prove, this situation allows us to deduce that the clauses are globally blocked and thus redundant. After removing the three AND-gate clauses, both conditional autarkies \(\alpha \) and \(\beta \) become autarkies, allowing us to remove \(H'\) as well.

Alternatively, blocked-clause elimination [11] or bounded variable elimination [18] would also remove \(H'\), since, after removing the clauses of the top gate of H, cone-of-influence reduction applies, which is simulated by both techniques [5]. Thus, for this example the key aspect of globally-blocked-clause elimination is that it allows removing the clauses of the headline gate connecting the two parts of the CNF, in fact simulating unique sensitization on the CNF level.

4 Preliminaries

Here, we present the background necessary for understanding the rest of the paper. We consider propositional formulas in conjunctive normal form (CNF), which are made up of variables, literals, and clauses, as defined in the following. A literal is either a variable x (a positive literal) or the negation \(\overline{x}\) of a variable x (a negative literal). The complement \(\overline{l}\) of a literal l is defined as \(\overline{l} = \overline{x}\) if \(l = x\) and as \(\overline{l} = x\) if \(l = \overline{x}\). For a literal l, we denote the variable of l by \( var (l)\). A clause is a finite disjunction of the form \((l_1 \vee \dots \vee l_n)\) where \(l_1, \dots , l_n\) are literals. A tautology is a clause that contains both a literal and its complement. If not stated otherwise, we assume that clauses are not tautologies. A formula is a finite conjunction of the form \(C_1 \wedge \dots \wedge C_m\) where \(C_1, \dots , C_m\) are clauses. Clauses can be viewed as sets of literals and formulas can be viewed as sets of clauses. For a set L of literals and a formula F, we define \(F_L = \{ C \in F \mid C \cap L \ne \emptyset \}\). We sometimes write \(F_l\) for \(F_{\{l\}}\).
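For a set L of literals, we write \(\overline{L}\) for the set \(\{\overline{l} \mid l \in L\}\) of complements of its literals. For instance, for the formula \(F = (a \vee b) \wedge (\overline{a} \vee c) \wedge (\overline{c} \vee d)\), we have \(F_a = \{(a \vee b)\}\) and \(F_{\overline{a}} = \{(\overline{a} \vee c)\}\); for \(L = \{a, \overline{c}\}\), we have \(F_L = \{(a \vee b), (\overline{c} \vee d)\}\) and \(F_{\overline{L}} = \{(\overline{a} \vee c)\}\).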

A truth assignment (or short, assignment) is a function from a set of variables to the truth values 1 (true) and 0 (false). An assignment is total with respect to a formula if it assigns a truth value to all variables occurring in the formula. If not stated otherwise, we do not require assignments to be total. We denote the domain of an assignment \(\alpha \) by \( var (\alpha )\). A literal l is satisfied by an assignment \(\alpha \) if l is positive and \(\alpha ( var (l)) = 1\) or if it is negative and \(\alpha ( var (l)) = 0\). A literal is falsified by an assignment if its complement is satisfied by the assignment. An assignment touches a clause if it assigns a truth value to at least one of its literals. A clause is satisfied by an assignment \(\alpha \) if it contains a literal that is satisfied by \(\alpha \). Finally, a formula is satisfied by an assignment \(\alpha \) if all its clauses are satisfied by \(\alpha \). A formula is satisfiable if there exists an assignment that satisfies it. Two formulas are logically equivalent if they are satisfied by the same total assignments; they are satisfiability-equivalent if they are either both satisfiable or both unsatisfiable. We often view assignments as the sets of literals they satisfy and denote them as sequences of literals. For instance, given an assignment \(\alpha \) that makes x true and y false, we would denote \(\alpha \) by \(x\overline{y}\) and write things like \(x \in \alpha \).

We denote the empty clause by \(\bot \) and the satisfied clause by \(\top \). Given an assignment \(\alpha \) and a clause C, we define \(C\vert_\alpha = \top\) if \(\alpha \) satisfies C; otherwise, \(C\vert_\alpha\) denotes the result of removing from C all the literals falsified by \(\alpha \). Moreover, for a formula F, we define \(F\vert_\alpha = \{C\vert_\alpha \mid C \in F \text{ and } C\vert_\alpha \ne \top\}\).
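For example, for \(F = (a \vee \overline{b} \vee c) \wedge (\overline{a} \vee b)\) and \(\alpha = a\overline{c}\), we have \((a \vee \overline{b} \vee c)\vert_\alpha = \top\) and \((\overline{a} \vee b)\vert_\alpha = (b)\), and thus \(F\vert_\alpha = (b)\).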

We consider a clause to be redundant with respect to a formula if the clause can be removed from the formula without affecting the formula’s satisfiability or unsatisfiability:

Definition 1

A clause C is redundant with respect to a formula F if F and \(F \wedge C\) are satisfiability-equivalent.

For instance, the clause \(C = (a \vee b)\) is redundant with respect to the formula \(F = (\overline{a} \vee \overline{b})\) since F and \(F \wedge C\) are satisfiability-equivalent (although they are not logically equivalent).

5 Conditional Autarkies

In the following, we discuss the notions of autarkies and conditional autarkies from the literature. We then present new theoretical results for conditional autarkies and use these results to develop an algorithm that identifies particular conditional autarkies. This section provides the basis for our SAT-preprocessing approach. We start with autarkies (remember that we do not require assignments to be total) [7, 8]:

Definition 2

An assignment \(\alpha \) is an autarky for a formula F if \(\alpha \) satisfies every clause \(C \in F\) for which \( var (\alpha ) \cap var (C) \ne \emptyset \).

In other words, an autarky satisfies every clause it touches.

Example 3

Let \(F = (a \vee b \vee \overline{c}) \wedge (\overline{b} \vee c \vee d) \wedge (\overline{a} \vee \overline{d})\) and let \(\alpha = b c\). Then, \(\alpha \) touches only the first two clauses. Since it satisfies them, it is an autarky for F.

One crucial property of autarkies, which follows easily from the definition, is that they can be applied to a formula without affecting the formula’s satisfiability:

Theorem 4

If an assignment \(\alpha \) is an autarky for a formula F, then F and \(F\vert_\alpha\) are satisfiability-equivalent.

Theorem 4 can be viewed as follows: If \(\alpha = l_1 \dots l_n\) is an autarky for F, then F and \(F \wedge (l_1) \wedge \dots \wedge (l_n)\) are satisfiability-equivalent. This view is useful in the context of conditional autarkies [6]. Informally, a conditional autarky is an assignment that can be partitioned into two parts such that one part becomes an autarky after the other part has been applied to the formula:

Definition 5

An assignment \(\alpha_{\mathsf{c}} \cup \alpha_{\mathsf{a}}\) (with \(\alpha_{\mathsf{c}} \cap \alpha_{\mathsf{a}} = \emptyset \)) is a conditional autarky for a formula F if \(\alpha_{\mathsf{a}}\) is an autarky for \(F\vert_{\alpha_{\mathsf{c}}}\).

We call \(\alpha _{\mathsf {c}}\) the conditional part and \(\alpha _{\mathsf {a}}\) the autarky part of \(\alpha _{\mathsf {c}} \cup \alpha _{\mathsf {a}}\). Observe that every assignment is a conditional autarky with an empty autarky part. We are mainly interested in conditional autarkies with non-empty autarky parts:

Example 6

Consider the formula \(F = (a \vee \overline{b} \vee \overline{c}) \wedge (\overline{a} \vee b \vee \overline{d}) \wedge (\overline{a} \vee \overline{b} \vee c) \wedge (\overline{a} \vee d)\) and the assignments \(\alpha_{\mathsf{c}} = a\) and \(\alpha_{\mathsf{a}} = bc\). The assignment \(\alpha_{\mathsf{c}} \cup \alpha_{\mathsf{a}}\) is a conditional autarky for F since \(\alpha_{\mathsf{a}}\) is an autarky for \(F\vert_{\alpha_{\mathsf{c}}} = (b \vee \overline{d}) \wedge (\overline{b} \vee c) \wedge (d)\). Notice that neither \(\alpha_{\mathsf{a}}\) alone nor abc (nor any non-empty subset of abc) is an autarky for F.

Theorem 4 above tells us that the application of an autarky to a formula does not affect the formula’s satisfiability. The following statement, which is a simple consequence of Theorem 4 and the fact that \(\alpha_{\mathsf{a}}\) is an autarky for \(F\vert_{\alpha_{\mathsf{c}}}\), generalizes this to conditional autarkies:

Corollary 7

Let F be a formula and \(\alpha_{\mathsf{c}} \cup \alpha_{\mathsf{a}}\) a conditional autarky for F with conditional part \(\alpha_{\mathsf{c}}\) and autarky part \(\alpha_{\mathsf{a}}\). Then, \(F\vert_{\alpha_{\mathsf{c}}}\) and \(F\vert_{\alpha_{\mathsf{c}} \cup \alpha_{\mathsf{a}}}\) are satisfiability-equivalent.

As for ordinary autarkies, where we can add the unit clauses \((l)\) for all literals \(l \in \alpha \) of an autarky \(\alpha \) to a formula F, we get a similar result for conditional autarkies:

Given a conditional autarky \(c_1 \dots c_m a_1 \dots a_n\) (with conditional part \(c_1 \dots c_m\)) for a formula F, we can safely add to F the clause form of the implication

$$ c_1 \wedge \dots \wedge c_m \;\rightarrow \; a_1 \wedge \dots \wedge a_n. $$

This will later allow us to prove the redundancy of globally-blocked clauses:

Theorem 8

Let \(c_1 \dots c_ma_1 \dots a_n\) be a conditional autarky (with conditional part \(c_1 \dots c_m\)) for a formula F. Then, F and \(F \wedge \bigwedge _{1 \le i \le n} (\overline{c}_1 \vee \dots \vee \overline{c}_m \vee a_i)\) are satisfiability-equivalent.

Proof

We have to show that the satisfiability of F implies the satisfiability of \(F \wedge \bigwedge _{1 \le i \le n} (\overline{c}_1 \vee \dots \vee \overline{c}_m \vee a_i)\). Assume that F is satisfiable and let \(\tau \) be a satisfying assignment of F. If \(\tau \) falsifies one of the literals \(c_1, \dots , c_m\), the statement trivially holds. Assume thus that \(\tau \) satisfies all of \(c_1, \dots , c_m\) and define \(\tau '(a_i) = 1\) for \(1 \le i \le n\) and \(\tau '(x) = \tau (x)\) for all remaining variables x. Since \(c_1 \dots c_m a_1 \dots a_n\) is a conditional autarky for F with conditional part \(c_1 \dots c_m\), we know that \(a_1 \dots a_n\) is an autarky for \(F\vert_{c_1 \dots c_m}\) and thus satisfies every clause of \(F\vert_{c_1 \dots c_m}\) it touches. Hence, since \(\tau \) satisfies F and all of \(c_1, \dots, c_m\), the clauses that were affected by making \(a_1, \dots , a_n\) true must also be satisfied by \(\tau '\), and all other clauses remain satisfied as under \(\tau \). We conclude that \(\tau '\) satisfies \(F \wedge \bigwedge _{1 \le i \le n} (\overline{c}_1 \vee \dots \vee \overline{c}_m \vee a_i)\).
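For instance, recall from Example 6 that abc is a conditional autarky (with conditional part a) for the formula \(F = (a \vee \overline{b} \vee \overline{c}) \wedge (\overline{a} \vee b \vee \overline{d}) \wedge (\overline{a} \vee \overline{b} \vee c) \wedge (\overline{a} \vee d)\). By Theorem 8, we can thus add the clauses \((\overline{a} \vee b)\) and \((\overline{a} \vee c)\) to F without affecting its satisfiability.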

We already mentioned that we are interested in conditional autarkies with non-empty autarky parts. In fact, for our preprocessing approach, we try to find the smallest conditional parts (and thus the largest autarky parts) for given assignments. As we show next, the smallest conditional part of a given assignment is unique and we can find it efficiently. For this, we need the notion of the least conditional part of an assignment:

Definition 9

Given an assignment \(\alpha \) and a formula F, the least conditional part of \(\alpha \) on F is the assignment \(\alpha _{\mathsf {c}}\) such that (1) \(\alpha \) is a conditional autarky for F with conditional part \(\alpha _{\mathsf {c}}\) and (2) for all assignments \(\alpha _{\mathsf {c}}'\) such that \(\alpha \) is a conditional autarky for F with conditional part \(\alpha _{\mathsf {c}}'\), it holds that \(\alpha _{\mathsf {c}} \subseteq \alpha _{\mathsf {c}}'\).

The least conditional part of an assignment is unique. To see this, assume \(\alpha _1\) and \(\alpha _2\) are least conditional parts for \(\alpha \) on a formula F. Then, \(\alpha _1 \subseteq \alpha _2\) and \(\alpha _2 \subseteq \alpha _1\) and thus \(\alpha _1 = \alpha _2\).

The algorithm \(\mathsf {LeastConditionalPart}\) in Fig. 1 computes the least conditional part of a given assignment for a formula. In a greedy fashion, the algorithm iterates over all clauses of the formula and, whenever it encounters a clause that is touched but not satisfied by the given assignment, it moves all literals of the assignment that touch the clause (i.e., the complements of the clause’s falsified literals) into the conditional part.

Fig. 1. Compute the least conditional part of an assignment.
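To make the procedure concrete, the following minimal C++ sketch illustrates one possible realization of the greedy computation described above. The representation—literals as DIMACS-style signed integers and an assignment as the set of literals it satisfies—is chosen purely for illustration and differs from the data structures of the actual implementation discussed in Section 8.

#include <set>
#include <vector>

// Literals are non-zero signed integers (DIMACS style), clauses are
// vectors of literals, a formula is a vector of clauses, and an
// assignment is represented as the set of literals it satisfies.
using Clause = std::vector<int>;
using Formula = std::vector<Clause>;
using Assignment = std::set<int>;

// Compute the least conditional part of 'alpha' on 'formula': for every
// clause that 'alpha' touches without satisfying it, all literals of
// 'alpha' touching the clause are put into the conditional part.
Assignment least_conditional_part(const Assignment &alpha, const Formula &formula) {
  Assignment conditional;
  for (const Clause &clause : formula) {
    bool touched = false, satisfied = false;
    for (int lit : clause) {
      if (alpha.count(lit)) { satisfied = true; break; } // alpha satisfies 'lit'
      if (alpha.count(-lit)) touched = true;             // alpha falsifies 'lit'
    }
    if (touched && !satisfied)
      for (int lit : clause)
        if (alpha.count(-lit))
          conditional.insert(-lit); // literal of alpha that touches the clause
  }
  return conditional;
}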

Theorem 10

Let \(\alpha _{\mathsf {c}} = \mathsf {LeastConditionalPart}(\alpha ,F)\) given a CNF formula F and an assignment \(\alpha \). Then, \(\alpha _{\mathsf {c}}\) is the least conditional part of \(\alpha \) on F.

Proof

Clearly, \(\alpha \) is a conditional autarky for F with conditional part \(\alpha_{\mathsf{c}}\): Whenever \(\alpha \) touches a clause without satisfying it, all literals of \(\alpha \) touching that clause are added to \(\alpha_{\mathsf{c}}\) (in line 4). Thus a clause in \(F\vert_{\alpha_{\mathsf{c}}}\) that is touched by \(\alpha \setminus \alpha_{\mathsf{c}}\) is also satisfied by it.

It remains to show that for every assignment \(\alpha_{\mathsf{c}}'\) such that \(\alpha \) is a conditional autarky with conditional part \(\alpha_{\mathsf{c}}'\), it holds that \(\alpha_{\mathsf{c}} \subseteq \alpha_{\mathsf{c}}'\). Let \(l \in \alpha_{\mathsf{c}}\). Then, the complement \(\overline{l}\) occurs in a clause C that is touched but not satisfied by \(\alpha \). Now, assume that l is not contained in \(\alpha_{\mathsf{c}}'\). It follows that \(l \in \alpha \setminus \alpha_{\mathsf{c}}'\). But then \(\alpha \setminus \alpha_{\mathsf{c}}'\) touches \(C\vert_{\alpha_{\mathsf{c}}'}\) without satisfying it and so it is not an autarky for \(F\vert_{\alpha_{\mathsf{c}}'}\), a contradiction. It follows that \(\alpha_{\mathsf{c}} \subseteq \alpha_{\mathsf{c}}'\).

6 Globally-Blocked Clauses

We now have an algorithm that identifies the least conditional part of a given conditional autarky. In the following, we use this algorithm to find redundant clauses in a propositional formula. To this end, we introduce globally-blocked clauses—a type of redundant clauses that generalizes the existing notion of blocked clauses [10] (note that in our notation, the set operators have precedence over logical operators, i.e., \(D \setminus \{\overline{l}\} \vee C\) means \((D \setminus \{\overline{l}\}) \vee C\)):

Definition 11

A clause C is blocked by a literal \(l \in C\) in a formula F if for every clause \(D \in F_{\overline{l}}\), the clause \(D \setminus \{\overline{l}\} \vee C\) is a tautology.

Example 12

Let \(F = (a \vee b) \wedge (\overline{a} \vee c) \wedge (\overline{b} \vee \overline{a})\) and \(C = a \vee b\). The literal b blocks C in F since the only clause in \(F_{\overline{b}}\) is the clause \(D = \overline{b} \vee \overline{a}\), and \(D \setminus \{\overline{b}\} \vee C = \overline{a} \vee a \vee b\) is a tautology.

Blocked clauses are redundant clauses and according to [6] are related to conditional autarkies as follows:

Theorem 13

A clause \((c_1 \vee \dots \vee c_n \vee l)\) is blocked by l in F iff the assignment \(\overline{c}_1 \dots \overline{c}_n l\) is a conditional autarky (with conditional part \(\overline{c}_1 \dots \overline{c}_n\)) for F.
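For instance, in Example 12 the clause \((a \vee b)\) is blocked by b in F; accordingly, Theorem 13 tells us that \(\overline{a}b\) is a conditional autarky for F with conditional part \(\overline{a}\): indeed, \(F\vert_{\overline{a}} = (b)\), for which b is an autarky.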

Globally-blocked clauses generalize blocked clauses by not only considering a single literal l, but a set L of literals:

Definition 14

A clause C is globally blocked by a set L of literals in a formula F if \(L \cap C \ne \emptyset \) and, for all \(D \in F_{\overline{L}} \setminus F_L\), the clause \(D \setminus \overline{L} \vee C\) is a tautology.

We say a clause is globally blocked if there exists some set L of literals by which the clause is globally blocked.

Example 15

Consider \(F = (a \vee \overline{b} \vee \overline{c}) \wedge (\overline{a} \vee b \vee \overline{d}) \wedge (\overline{a} \vee \overline{b} \vee c) \wedge (\overline{a} \vee d)\) from Example 6. The clause \(C = (\overline{a} \vee c)\) is globally blocked in F. To see this, consider the set \(L = \{b, c\}\) and the formulas \(F_{\overline{L}} = (a \vee \overline{b} \vee \overline{c}) \wedge (\overline{a} \vee \overline{b} \vee c)\) and \(F_{L} = (\overline{a} \vee b \vee \overline{d}) \wedge (\overline{a} \vee \overline{b} \vee c)\). We then have \(F_{\overline{L}} \setminus F_L = (a \vee \overline{b} \vee \overline{c})\). Let \(D = (a \vee \overline{b} \vee \overline{c})\). Then, \(D \setminus \overline{L} \vee C = (a \vee \overline{a} \vee c)\) is a tautology and so C is globally blocked by L in F. Note that globally-blocked clauses are in general not blocked by a single literal (they strictly generalize blocked clauses). In a similar manner, \((\overline{a} \vee b)\) is globally blocked.

Remember that we showed in the previous section (Example 6) that abc is a conditional autarky for F with conditional part a and autarky part bc. Now in Example 15, to demonstrate that C is globally blocked, we used the literals of the autarky part as the set L and we could observe that the literal a of the conditional part together with its complement \(\overline{a}\) caused the clause \(D \setminus \overline{L} \vee C\) to be a tautology. This is a consequence of the following statement, which will help us with finding globally-blocked clauses using conditional autarkies:

Theorem 16

Let F be a formula, let C be a clause, let L be a set of literals such that \(L \cap C \ne \emptyset \), and define the assignments \(\alpha _{\mathsf {c}} = \overline{C \setminus L}\) and \(\alpha _{\mathsf {a}} = L\). Then, C is globally blocked by L in F if and only if \(\alpha _{\mathsf {c}} \cup \alpha _{\mathsf {a}}\) is a conditional autarky (with conditional part \(\alpha _{\mathsf {c}}\)) for F.

Proof

For the “only if” direction, assume C is globally blocked by L in F. We show that \(\alpha_{\mathsf{a}}\) is an autarky for \(F\vert_{\alpha_{\mathsf{c}}}\). Let \(D\vert_{\alpha_{\mathsf{c}}} \in F\vert_{\alpha_{\mathsf{c}}}\). Then, D is not satisfied by \(\alpha_{\mathsf{c}}\). Since \(\alpha_{\mathsf{c}}\) falsifies exactly the literals of \(C \setminus L\), it follows that D cannot contain the complement of a literal in \(C \setminus L\). This implies that C cannot contain the complement of a literal in \(D \setminus \overline{L}\) and so \(D \setminus \overline{L} \vee C\) is not a tautology. But then D cannot be contained in \(F_{\overline{L}} \setminus F_L\), meaning that if D is touched by \(\alpha_{\mathsf{a}}\) (which satisfies exactly the literals of L), D is also satisfied by \(\alpha_{\mathsf{a}}\). Hence, since \(\alpha_{\mathsf{a}}\) assigns only variables that are not assigned by \(\alpha_{\mathsf{c}}\), it cannot be the case that \(\alpha_{\mathsf{a}}\) touches \(D\vert_{\alpha_{\mathsf{c}}}\) without satisfying it. We thus conclude that \(\alpha_{\mathsf{a}}\) is an autarky for \(F\vert_{\alpha_{\mathsf{c}}}\).

For the “if” direction, assume \(\alpha_{\mathsf{c}} \cup \alpha_{\mathsf{a}}\) is a conditional autarky for F with conditional part \(\alpha_{\mathsf{c}}\). We show that for every clause \(D \in F_{\overline{L}} \setminus F_L\), the clause \(D \setminus \overline{L} \vee C\) is a tautology. Let \(D \in F_{\overline{L}} \setminus F_L\). Then, D is a clause that is touched but not satisfied by \(\alpha_{\mathsf{a}}\). Hence, \(\alpha_{\mathsf{c}}\) must satisfy a literal l of D, for otherwise \(D\vert_{\alpha_{\mathsf{c}}}\) would be touched but not satisfied by \(\alpha_{\mathsf{a}}\). Moreover, since \(\alpha_{\mathsf{c}}\) assigns no literals of \(\overline{L}\), it must actually be the case that \(l \in D \setminus \overline{L}\). But then, since \(\alpha_{\mathsf{c}}\) falsifies only literals of C, it follows that \(\overline{l} \in C\) and so \(D \setminus \overline{L} \vee C\) is a tautology. It follows that C is globally blocked by L in F.

Before we focus on finding and removing globally-blocked clauses, we have to show that they are indeed redundant:

Theorem 17

If a clause C is globally blocked in a formula F, it is redundant with respect to F.

Proof

Assume that C is globally blocked by some set \(L = \{l_1, \dots , l_n\}\) in F and that F is satisfiable. We show that the formula \(F \wedge C\) is satisfiable. First, observe that C is of the form \((c_1 \vee \dots \vee c_m \vee l_1 \vee \dots \vee l_k)\) where \(\{l_1, \dots , l_k\} \subseteq L\) and \(k \ge 1\). By Theorem 16, we know that the assignment \(\alpha_{\mathsf{c}} \cup \alpha_{\mathsf{a}}\), with \(\alpha_{\mathsf{c}} = \overline{c}_1 \dots \overline{c}_m\) and \(\alpha_{\mathsf{a}} = l_1 \dots l_n\), is a conditional autarky (with conditional part \(\alpha_{\mathsf{c}}\)) for F. Hence, by Theorem 8, F and \(F' = F \wedge \bigwedge _{1 \le i \le n} (c_1 \vee \dots \vee c_m \vee l_i)\) are satisfiability-equivalent and so \(F'\) must be satisfiable. But then, as C is subsumed by each clause \((c_1 \vee \dots \vee c_m \vee l_i)\) with \(i \in \{1, \dots , k\}\), every satisfying assignment of \(F'\) must also satisfy \(F \wedge C\). It follows that C is redundant with respect to F.

Finally, we note that globally-blocked clauses are a subclass of propagation-redundant clauses (for details, see [27]) but we omit the proof here [16]:

Theorem 18

If a clause C is globally blocked in a formula F, it is propagation-redundant with respect to F.

7 Detecting Globally-Blocked Clauses

We have seen that globally-blocked clauses are redundant and that they correspond closely to conditional autarkies. In the next step, we use this correspondence to find globally-blocked clauses in a formula. The idea is as follows: We take an assignment (we will see later how this assignment can be obtained) and then check for all clauses whether the assignment witnesses that the clause is globally blocked. We start with a formal notion of a witness:

Definition 19

Given a clause C and a formula F, a conditional autarky \(\alpha _{\mathsf {c}} \cup \alpha _{\mathsf {a}}\) (with conditional part \(\alpha _{\mathsf {c}}\)) for F witnesses that C is globally blocked in F if \(\alpha _{\mathsf {a}} \cap C \ne \emptyset \) and \(\alpha _{\mathsf {c}} \subseteq \overline{C}\).

Suppose we have a conditional autarky \(\alpha _{\mathsf {c}} \cup \alpha _{\mathsf {a}}\) with conditional part \(\alpha _{\mathsf {c}}\) for F and we want to use it for checking if a clause C is globally blocked. We know from Theorem 16 that C is globally blocked by \(L = \alpha _{\mathsf {a}}\) in F if \(\alpha _{\mathsf {a}} \cap C \ne \emptyset \) and \(\alpha _{\mathsf {c}} = \overline{C \setminus L}\). However, a closer look reveals that the requirement \(\alpha _{\mathsf {c}} = \overline{C \setminus L}\) is needlessly restrictive for our purpose: Theorem 20 below implies that it suffices if \(\alpha _{\mathsf {c}}\) is a subset of \(\overline{C \setminus L}\) (and thus of \(\overline{C}\), since \(\alpha _{\mathsf {c}}\) assigns no variables of \(L = \alpha _{\mathsf {a}}\)) to guarantee that C is globally blocked. Hence, if we have a conditional autarky which witnesses (as defined above) that C is globally blocked, we can be sure that the clause is indeed globally blocked.

Theorem 20

Let F be a formula, \(\alpha \) a conditional autarky for F with autarky part \(\alpha _{\mathsf {a}}\), and \(\tau \) an assignment such that \(\alpha \subseteq \tau \). Then, \(\tau \) is a conditional autarky for F with autarky part \(\alpha _{\mathsf {a}}\).

Proof

Let \(\alpha_{\mathsf{c}} = \alpha \setminus \alpha_{\mathsf{a}}\) and \(\tau_c = \tau \setminus \alpha_{\mathsf{a}}\). We know that \(\alpha_{\mathsf{a}}\) is an autarky for \(F\vert_{\alpha_{\mathsf{c}}}\). Since \(\alpha \subseteq \tau \), it follows that \(\alpha_{\mathsf{c}} \subseteq \tau_c\). Now, let \(D \in F\). If D is not satisfied by \(\tau_c\), then it is also not satisfied by \(\alpha_{\mathsf{c}}\). Thus, if \(D\vert_{\tau_c}\) is touched by \(\alpha_{\mathsf{a}}\), then so is \(D\vert_{\alpha_{\mathsf{c}}}\), and it must be satisfied by \(\alpha_{\mathsf{a}}\), for otherwise \(\alpha_{\mathsf{a}}\) is not an autarky for \(F\vert_{\alpha_{\mathsf{c}}}\). Since \(\tau_c\) does not assign the variables of \(\alpha_{\mathsf{a}}\), the satisfying literal also occurs in \(D\vert_{\tau_c}\). It follows that \(\alpha_{\mathsf{a}}\) is an autarky for \(F\vert_{\tau_c}\).

We can now present the algorithm (Fig. 2) for finding globally-blocked clauses. The algorithm repeatedly computes the least conditional part \(\alpha_{\mathsf{c}}\) of the current assignment (line 1) and then removes from the assignment all literals of \(\alpha_{\mathsf{c}}\) that are not in \(\overline{C}\) (line 2), as dictated by the requirement \(\alpha_{\mathsf{c}} \subseteq \overline{C}\). Once the algorithm reaches a fixpoint, meaning that \(\alpha_{\mathsf{c}} \subseteq \overline{C}\) already holds, it returns whether the autarky part has a non-empty intersection with C (line 3), which is necessary to guarantee that the assignment witnesses that C is globally blocked.

Fig. 2. Algorithm for detecting globally-blocked clauses.
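The following C++ sketch illustrates this fixpoint computation, reusing the type aliases and the function least_conditional_part from the sketch in Section 5. It is only meant to mirror the abstract algorithm of Fig. 2, not the refined implementation discussed in Section 8; the clause C is assumed to be given as the set of its literals.

// Reuses the type aliases and least_conditional_part() from the sketch
// in Section 5. Checks whether some subassignment of 'alpha' witnesses
// that the clause 'c' (given as the set of its literals) is globally
// blocked in 'formula'.
bool is_globally_blocked(const Assignment &c, const Formula &formula,
                         Assignment alpha) {
  for (;;) {
    Assignment conditional = least_conditional_part(alpha, formula);
    // Unassign every conditional literal whose complement does not occur
    // in C, as a witness requires the conditional part to be a subset of
    // the complements of C's literals.
    Assignment pruned;
    for (int lit : alpha)
      if (!conditional.count(lit) || c.count(-lit))
        pruned.insert(lit);
    if (pruned.size() == alpha.size()) {
      // Fixpoint reached: the conditional part is contained in the
      // complements of C; check that the autarky part intersects C.
      for (int lit : alpha)
        if (!conditional.count(lit) && c.count(lit))
          return true;
      return false;
    }
    alpha = pruned; // continue with the pruned assignment
  }
}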

Theorem 21

Let C be a clause, F a formula, and \(\alpha \) an assignment. Then, \(\mathsf {IsGloballyBlocked}(C,F,\alpha ) = \mathsf {TRUE}\) if and only if a subassignment of \(\alpha \) witnesses that C is globally blocked in F.

Proof

In the rest of the proof, we denote by \(\alpha ^i\) the assignment passed to \(\mathsf {IsGloballyBlocked}\) at the i-th recursive call (we denote the initial call as the 0-th recursive call, i.e., \(\alpha ^0 = \alpha \)). The assignments \(\alpha ^i_c\) and \(\alpha ^i_a\) are defined accordingly.

For the “only if” direction, assume that \(\mathsf {IsGloballyBlocked}(C,F,\alpha ) = \mathsf {TRUE}\) and let \(\alpha ^n\) be the assignment passed to the last recursive call (i.e., the call in which the algorithm checks whether \(\alpha ^n_a \cap C \ne \emptyset \) and returns \(\mathsf {TRUE}\)). Then, since the algorithm only modifies the initial assignment \(\alpha \) by unassigning variables, \(\alpha ^n\) is a subassignment of \(\alpha \). Now, since \(\alpha ^n_c = \mathsf {LeastConditionalPart}(\alpha ^n,F)\), we know that \(\alpha ^n\) is a conditional autarky for F with (least) conditional part \(\alpha ^n_c\). Moreover, all literals of \(\alpha ^n_c\) are contained in \(\overline{C}\) due to line 2 of the algorithm. Finally, since \(\alpha ^n_a\) and C have a non-empty intersection, \(\alpha ^n\) witnesses that C is globally blocked in F.

For the “if” direction, suppose some subassignment \(\tau = \tau _c \cup \tau _a\) of \(\alpha \) witnesses that C is globally blocked in F. Below, we show by induction on i that \(\tau \subseteq \alpha ^i\) and \(\tau _a \subseteq \alpha ^i_a\). From this, the statement then follows easily: Denote by \(\alpha ^n\) the assignment passed to the final recursive call (it can be easily seen that the algorithm terminates). Since \(\tau _a \subseteq \alpha ^n_a\) and since \(\tau _a \cap C \ne \emptyset \), it must then be the case that \(\alpha ^n_a \cap C \ne \emptyset \), and so the algorithm returns \(\mathsf {TRUE}\). We conclude with the induction proof of the mentioned statement:

Induction start (\(i = 0\)): In this case, \(\alpha ^0 = \alpha \), and \(\tau \subseteq \alpha \) holds by assumption. Thus, by Theorem 20, we know that \(\tau _a\) is an autarky for \(F\vert_{\alpha \setminus \tau _a}\). Hence, as \(\alpha ^0_c\) is the least conditional part of \(\alpha \), it follows that \(\alpha ^0_c \subseteq \alpha \setminus \tau _a\) and thus \(\tau _a \subseteq \alpha \setminus \alpha ^0_c = \alpha ^0_a\).

Induction step (\(i > 0\)): We assume that \(\tau \subseteq \alpha ^{i-1}\) and \(\tau _a \subseteq \alpha ^{i-1}_a\). We first show that \(\tau \subseteq \alpha ^{i}\). The assignment \(\alpha ^i\) is obtained as \(\alpha ' = \alpha ^{i-1}_a \cup \{l \mid l \in \alpha ^{i-1}_c \text { and } l \in \overline{C}\}\) in the \((i-1)\)-th recursive call. Thus, since \(\tau _a \subseteq \alpha ^{i-1}_a\), we know that \(\tau _a \subseteq \alpha ^i\). Moreover, the only literals that are contained in \(\alpha ^{i-1}\) but not in \(\alpha ^{i}\) are literals of \(\alpha ^{i-1}_c\) that are not in \(\overline{C}\). But such literals cannot be contained in \(\tau _c\) since \(\tau _c \subseteq \overline{C}\). It follows that \(\tau \subseteq \alpha ^i\). Hence, by Theorem 20 and the same argument as in the induction start, it follows that \(\tau _a \subseteq \alpha ^i_a\).

8 Implementation

The abstract algorithm presented in the previous section connects well to the presented theory of globally-blocked clauses, but it is hard to implement efficiently. Figure 3 describes a refinement of the algorithm and further discusses implementation details which are crucial for efficiency. Without giving a detailed analysis, it is easy to see that, for each candidate clause, the running time of the algorithm is comparable to, and thus bounded by, the time it would take to propagate all conditional variables obtained during the first step of the algorithm.

Fig. 3. Algorithm to extract globally-blocked clauses from a given assignment.

This variant has been implemented in C++ in CaDiCaL [37] and is available at http://fmv.jku.at/globalblocking (see “condition.cpp”). We experimented on SAT Competition benchmarks and also in an incremental bounded model checking setting [38]. Our algorithm in Fig. 3 does find non-trivial globally blocked clauses, but at this point we have not found an instance or application where the removal of globally-blocked clauses results in an overall improvement in running time. It thus remains to be seen whether or not the idea of removing globally-blocked clauses can be beneficial in practice.

One issue is particularly problematic: For some instances with many globally-blocked clauses, the large number of literals on the reconstruction stack requires too much memory, particularly if the autarky part—which serves as witness—contains a substantial fraction of all variables.

9 Conclusion

We introduced globally-blocked clauses, a new kind of redundant clauses that generalizes the existing notions of blocked clauses and set-blocked clauses. As we have shown, globally-blocked clauses correspond closely to conditional autarkies, which are special assignments that can be partitioned into two parts such that one part becomes an autarky once the other part has been applied to the formula.

Since finding globally-blocked clauses is non-trivial, we presented an algorithm that takes as input a formula and a clause together with a candidate assignment and then checks if the assignment (or a subassignment thereof) can witness that the clause is globally blocked in the formula.

Our algorithm simulates a well-known circuit preprocessing technique, known as unique sensitization, on the CNF level. Although our algorithm is conceptually simple, implementing it efficiently is far from straightforward. We thus presented an implementation of our algorithm, which we ran on a range of formulas to evaluate its effectiveness in practice.