1 Introduction

This paper is concerned with analyzing Horn formulas from the perspectives of Read-once resolution (ROR) and Unit Read-once resolution (UROR). A Horn clause is a clause with at most one positive literal. A Horn formula is a conjunction of Horn clauses. These formulas find applications in a number of domains, including but not limited to program verification [8, 11], logic programming [4], abstract interpretation [3] and econometrics [1]. Resolution is a sound and complete proof system introduced by Robinson [10] for checking the satisfiability of boolean formulas in conjunctive normal form (CNF). As an algorithmic strategy, resolution is not particularly efficient. However, it continues to be used extensively in theorem provers on account of its simplicity and ease of implementation. A resolution proof system typically declares a CNF system to be feasible, or (alternatively) finds a sequence of resolution steps which lead to the derivation of a contradiction. Such a sequence of resolution steps is called a refutation since it acts as a certificate, which certifies the infeasibility of the given clausal system. There are a number of variants of resolution. Each variant puts limitations on the type of resolution step that is permitted or how the resolution steps are to be counted. Three variants which are particularly important from the program verification perspective are tree-like resolution, dag-like resolution and read-once resolution. In tree-like resolution refutations input clauses can be used in multiple resolution steps, however each derived clause can only be used once. If a resolution step requires the reuse of a derived clause then that clause needs to be re-derived. Note that this re-derivation increases the length of the refutation. In dag-like resolution refutations both input and derived clauses can be used in multiple resolution steps. Finally, in read-once resolution refutations each input and derived clause can be used by only one resolution step. Note that clauses can be re-derived, and thus reused, if they can be derived from a different set of input clauses. A complete exposition of different types of resolutions can be found in [9] and [5].

Although the resolution proof system is both sound and complete (in general), several of its variants are not. For instance, read-once resolution (ROR) is incomplete, in that there exist unsatisfiable CNF formulas, which do not have read-once resolutions.

Example 1

Consider the following 2CNF formula:

We now show that this formula does not have a read-once refutation.

To derive \((x_1)\) we need to derive \((\lnot x_2)\). Similarly, to derive \((x_2)\) we need to derive \((\lnot x_1)\). However, the derivations of both \((\lnot x_1)\) and \((\lnot x_2)\) require the use of the clause \((x_3 \vee x_4)\).

To derive \((x_3)\) we need to derive \((\lnot x_4)\). Similarly, to derive \((x_4)\) we need to derive \((\lnot x_3)\). However, the derivations of both \((\lnot x_3)\) and \((\lnot x_4)\) require the use of the clause \((x_1 \vee x_2)\).

Another variant of resolution is unit resolution. In unit resolution, for each resolution step one of the two parent clauses must be a unit clause, i.e. a clause of the form \((x_i)\) or \((\lnot x_i)\). Although unit resolution is incomplete in general, it has been shown to be complete for Horn formulas [4]. In this paper, we study the effect of combining read-once resolution with unit resolution with respect to Horn formulas. We refer to this resolution system as Unit read-once resolution (UROR).

A natural question to ask is: How many times should a given clause be copied so that a read-once refutation can be extracted? This question was first investigated by Iwama and Miyano in [6] and leads naturally to the notion of copy complexity (see Sect. 2). This paper investigates the copy complexity of Horn formulas with respect to unit resolution.

The principal contributions of this paper are as follows:

  1. 1.

    A proof that the problem of finding the shortest read-once refutation for a Horn formula is NP-hard (Sect. 3).

  2. 2.

    An algorithm that can determine if a Horn formula with m clauses and at most two literals per clause has a read-once unit resolution refutation in \(O(m^2)\) time (Sect. 4).

  3. 3.

    Showing that the copy complexity of Horn formulas with respect to read-once unit resolution is \(2^{n-1}\) where n is the number of variables (Sect. 5).

The rest of the paper is organized as follows. Section 2 details the problems under consideration. In Sect. 3 we cover the optimal length ROR problem for Horn formulas. Section 4 describes our work on the UROR problem for 2-Horn formulas. In Sect. 5, we examine the copy complexity of Horn formulas with respect to read-once unit resolution. Finally, Sect. 6 summarizes our results.

2 Statement of Problems

In this section, we briefly discuss the terms used in this paper. We assume that the reader is familiar with elementary propositional logic.

Definition 1

A literal is a variable x or its complement \(\lnot x\). x is called a positive literal and \(\lnot x\) is called a negative literal.

Definition 2

A CNF clause is a disjunction of literals. The empty clause, which is always false, is denoted as \(\sqcup \).

Definition 3

A k -CNF clause is a CNF clause with at most k literals.

Definition 4

A Horn clause is a CNF clause which contains at most one positive literal.

For a single resolution step with parent clauses \((\alpha \vee x)\) and \((\lnot x \vee \beta )\) and with resolvent \((\alpha \vee \beta )\), we write

$$\begin{aligned} (\alpha \vee x),(\lnot x \vee \beta ) \ \mid \!\!\!\frac{1}{\ {\mathop {\ }\limits ^{\text{ RES }}}\ }\ (\alpha \vee \beta ). \end{aligned}$$

The variable x is called the matching or resolution variable. If for initial clauses \(\alpha _1, \ldots , \alpha _n\), a clause \(\pi \) can be generated by a sequence of resolution steps we write

$$\begin{aligned} \alpha _1, \ldots , \alpha _n \ \mid \!\!\!\frac{}{\ {\mathop {\ }\limits ^{\text{ RES }}}\ }\ \pi . \end{aligned}$$

If a resolution step involves a unit clause, a clause of the form (x) or \((\lnot x)\), then it is called a unit resolution step. If a resolution refutation consists of only unit resolution steps, then it is called a unit resolution refutation.

We now formally define the types of resolution refutation discussed in this paper.

Definition 5

A Dag-like resolution refutation is a refutation in which each clause, \(\pi \), can be used in any number of resolution steps. This applies to clauses present in the original formula and those derived as a result of previous resolution steps.

Note that Dag-like refutations are unrestricted and thus are equivalent to general resolution. We now introduce several restricted forms of resolution.

Definition 6

A Tree-like resolution refutation is a refutation in which each derived clause, \(\pi \), can be used in only one resolution step. However, clauses in the original formula can be reused and clauses can be re-derived if necessary.

Definition 7

A Read-Once resolution refutation is a refutation in which each clause, \(\pi \), can be used in only one resolution step. This applies to clauses present in the original formula and those derived as a result of previous resolution steps.

In a read-once refutation, a clause can be reused if can be re-derived from a set of unused input clauses.

More formally, a resolution derivation \(\varPhi \ \mid \!\!\!\frac{}{\ {\mathop {\ }\limits ^{\text{ RES }}}\ }\ \pi \) is a read-once resolution derivation, if for all resolution steps \(\pi _1 \wedge \pi _2 \ \mid \!\!\!\frac{1}{\ {\mathop {\ }\limits ^{\text{ RES }}}\ }\ \pi \), we delete one instance of the clauses \(\pi _1\) and \(\pi _2\) from, and add a copy of the resolvent \(\pi \) to, the current multi-set of clauses. In other words, if U is the current multi-set of clauses, we obtain \(U = (U \setminus \{\pi _1, \pi _2\}) \cup \{\pi \}\).

It is important to note Read-Once resolution is an incomplete refutation procedure.

We can similarly define read-once unit resolution.

Definition 8

A Read-Once Unit resolution refutation is a unit resolution refutation in which each clause, \(\pi \), can be used in only one unit resolution step. This applies to clauses present in the original formula and those derived as a result of previous unit resolution steps.

This lets us define the concept of copy complexity with respect to read-once unit resolution.

Definition 9

A CNF formula \(\varPhi \) has copy complexity at most k, with respect to unit resolution, if there exists a multi-set of CNF clauses, \(\varPhi '\) such that:

  1. 1.

    Every clause in \(\varPhi \) appears at most k times in \(\varPhi '\).

  2. 2.

    Every clause in \(\varPhi '\) appears in \(\varPhi \).

  3. 3.

    \(\varPhi '\) has a read-once unit resolution refutation.

Let k-UROR denote the set of CNF formulas with copy complexity k with respect to unit resolution.

For any type of resolution refutation, we can define the length of that refutation in terms of the number of resolution steps.

Definition 10

The length of a resolution refutation R, is the number of resolution steps in R.

We now define the problems under consideration.

figure a
figure b
figure c
figure d

3 The OLROR Problem for Horn Formulas

In this section, we discuss the problem of finding the optimal read-once refutation of a Horn formula.

Let \(\varPhi \) be an unsatisfiable Horn formula. We know that \(\varPhi \) has a read-once refutation [12]. The question whether \(\varPhi \) has a read-once resolution of length less than k is equivalent to the question whether \(\varPhi \) contains a minimal unsatisfiable formula consisting of at most k clauses.

Theorem 1

Let R denote an OLROR of an unsatisfiable Horn formula \(\varPhi \) and let \(\varPhi _R \subseteq \varPhi \) be the set of clauses used by R. R is also an optimal tree-like refutation of \(\varPhi \) and an optimal dag-like refutation of \(\varPhi \). Additionally, \(\varPhi _R\) is a minimum unsatisfiable subset of \(\varPhi \).

Proof

Since R is a read-once refutation of \(\varPhi \), R is also a tree-like and a dag-like refutation of \(\varPhi \).

Assume that T is a tree like refutation of \(\varPhi \) such that \(|T| < |R|\). Let \(\varPhi _T \subseteq \varPhi \) be the set of clauses used by T. It follows that \(|\varPhi _T| \le |T|+1 < |R|+1\).

\(\varPhi _T\) must have a read-once refutation, \(R_T\) [12]. However, \(|R_T| \le |\varPhi _T|-1 < |R|\). This contradicts the fact that R is the optimal read-once refutation of \(\varPhi \). Thus, R is also an optimal tree-like refutation of \(\varPhi \). Similarly, R is an optimal dag-like refutation of \(\varPhi \).

Let \(\varPhi ' \subseteq \varPhi \) be an unsatisfiable Horn formula such that \(|\varPhi '| < |\varPhi _R|\). \(\varPhi '\) must have a read-once refutation, \(R'\). However, \(|R'| \le |\varPhi '|-1 < |\varPhi _R|-1 = |R|\). This contradicts the fact that R is the optimal read-once refutation of \(\varPhi \). Thus, \(\varPhi _R\) is a minimum unsatisfiable subset of R.    \(\square \)

We conclude that, for unsatisfiable Horn formulas, the length of the shortest resolution refutation equals the length of the shortest read-once refutation and the shortest tree-like resolution. Moreover, if k is the number of clauses of a minimal unsatisfiable subformula, then the shortest resolution refutation uses \((k-1)\) resolution steps.

The following result is a corollary of Theorem 5.2 in [7]. However, it is included here for completeness.

Theorem 2

The problem of deciding whether a Horn formula contains an unsatisfiable sub-formula with at most k clauses is NP-complete.

Proof

We show this by a reduction from Vertex Cover. Let \(G=(V,E)\) be an undirected graph where \(V=\{v_1, \ldots , v_n\}\) and \(E=\{e_1, \ldots , e_m\}\). We associate with G the Horn formula:

$$\begin{aligned} (v_1) \wedge \ldots \wedge (v_n) \wedge (\lnot e_1\vee \ldots \vee \lnot e_m) \wedge \bigwedge _{1 \le i \le m, e_i =(v_{i_1}, v_{i_2})} ((\lnot v_{i_1} \vee e_i) \wedge (\lnot v_{i_2} \vee e_i)) \end{aligned}$$

Then there exists a subset \(V' \subseteq V\) such that \(|V'| \le r\) and \(V' \cap e_i\) is non-empty for every \(1 \le i \le m\) if and only if the associated formula contains an unsatisfiable sub-formula with at most \((1 + m + r)\) clauses. (the negative clause, the clause \((\lnot v \vee e_i)\) for each \(1 \le i \le m\), and r unit clauses).

Since the problem of finding a vertex cover of size at most r is NP-complete, the problem of finding a read-once resolution refutation of length at most \(k=(1+m+r)\) is NP-complete for Horn formulas.    \(\square \)

4 The UROR Problem for 2-Horn Formulas

In this section, we look at the UROR problem for 2-Horn formulas. In particular, we show that checking whether a 2-Horn formula has a unit read-once refutation is in P.

As in the case of Horn formulas, not every unsatisfiable 2-Horn formula has a read-once unit resolution refutation.

Example 2

Consider the formula

$$\begin{aligned} (x_1) \wedge (\lnot x_1 \vee x_2) \wedge (\lnot x_1 \vee \lnot x_2). \end{aligned}$$

This formula is an unsatisfiable Horn formula. However, the clause \((x_1)\) needs to be used twice in any unit resolution refutation. Thus, this formula does not have a read-once unit resolution refutation.

Let \(\varPhi \) be a 2-Horn formula with m clauses over n variables. We can divide \(\varPhi \) into a set of positive unit clauses (PU), a set of implications (\(\phi \)), a set of negative unit clauses (NU), and a set of clauses with two negative literals (N). We write \(\varPhi = PU \cup \phi \cup NU \cup N\). Note that any part (PU, \(\phi \), NU, or N) can be empty.

For resolution or restricted resolution calculi we say a formula \(\varPhi \) is minimal with respect to the calculus, if there exists a refutation for \(\varPhi \) and no proper sub-formula of \(\varPhi \) has a refutation.

Let \(\mathbf{R}\) be a read-once unit resolution of \(\varPhi \). Let \(\varPhi _k = PU_k \cup \phi _k \cup NU_k \cup N_k\) be the set of clauses derived from \(\varPhi \) after k steps of \(\mathbf{R}\). Let us consider the \((k+1)^{th}\) resolution step of \(\mathbf{R}\).

Note that this resolution step must resolve one of the following clause pairs:

  1. 1.

    A clause of the form \((x_i) \in PU_k\) and a clause of the form \((\lnot x_i)\in {NU}_k\). In this case, this resolution step produces the empty cause. Note that this is the last step of the refutation.

  2. 2.

    A clause of the form \((x_i) \in PU_k\) and a clause of the form \((\lnot x_i \vee \lnot x_j) \in {N}_k\). In this case, the resolution produces the negative unit clause \((\lnot x_j)\). Thus, \(PU_{k+1} = PU_k \setminus \{(x_i)\}\), \({N}_{k+1} = {N}_k \setminus \{(\lnot x_i \vee \lnot x_j)\}\), and \({NU}_{k+1} = {NU}_k \cup \{(\lnot x_2)\}\). Note that \(|PU_{k+1} \cup NU_{k+1}| = |PU_k \cup NU_k|\) and \(|NU_{k+1} \cup N_{k+1}| = |NU_k \cup N_k|\).

  3. 3.

    A clause of the form \((x_i) \in PU_k\) and a clause of the form \((\lnot x_i \vee x_j) \in \phi _k\). In this case, the resolution produces the positive unit clause \((x_j)\). Thus, \(PU_{k+1} = (PU_k \setminus \{(x_i)\})\cup \{(x_j)\}\), \({N}_{k+1} = {N}_k\), and \({NU}_{k+1} = {NU}_k\). Note that \(|PU_{k+1} \cup NU_{k+1}| = |PU_k \cup NU_k|\) and \(|NU_{k+1} \cup N_{k+1}| = |NU_k \cup N_k|\).

  4. 4.

    A clause of the form \((\lnot x_i) \in NU_k\) and a clause of the form \((x_i \vee \lnot x_j) \in \phi _k\). In this case, the resolution produces the negative unit clause \((\lnot x_j)\). Thus, \(PU_{k+1} = PU_k\), \({N}_{k+1} = {N}_k\), and \({NU}_{k+1} = ({NU}_k \setminus \{(\lnot x_i)\})\cup \{(\lnot x_j)\}\). Note that \(|PU_{k+1} \cup NU_{k+1}| = |PU_k \cup NU_k|\) and \(|NU_{k+1} \cup N_{k+1}| = |NU_k \cup N_k|\).

Let \(\varPhi = PU \cup \phi \cup NU \cup N\) be minimal with respect to read-once unit resolution. Let \(\mathbf{R}\) be a read-once unit resolution refutation of \(\varPhi \) and let \((x_1) \wedge (\lnot x_1) \ \mid \!\!\!\frac{1}{\ {\mathop {\ }\limits ^{\text{ RES }}}\ }\ \sqcup \), be the last resolution step of \(\mathbf{R}\). Based on the preceding arguments, \(\varPhi \) must have the following properties:

  1. 1.

    \(|NU \cup N|=1\). Note that every prior resolution step of \(\mathbf{R}\) preserves the size of \(NU \cup N\). Before the last resolution step of \(\mathbf{R}\), \(NU \cup N = \{(\lnot x_i)\}\). Thus, before the last resolution step \(|NU \cup N|=1\). This means that this must have been true for the original formula.

  2. 2.

    \(|PU \cup NU|=2\). Note that every prior resolution step of \(\mathbf{R}\) preserves the size of \(PU \cup NU\). Before the last resolution step of \(\mathbf{R}\), \(PU \cup NU = \{(x_i),(\lnot x_i)\}\). Thus, before the last resolution step \(|PU \cup NU|=2\). This means that this must have been true for the original formula.

Thus, \(\varPhi \) must satisfy one of the following conditions:

  1. 1.

    \(|PU|=1\), \(|NU|=1\), and \(|N|=0\).

  2. 2.

    \(|PU|=2\), \(|NU|=0\), and \(|N|=1\).

This means that we only need to consider two types of read-once unit resolution refutations.

  1. 1.

    Refutations which use a single positive unit clause from PU and a single negative unit clause from NU. We refer to this as a Type 1 refutation.

  2. 2.

    Refutations which use two positive unit clauses from PU and a single negative clause from N. We refer to this as a Type 2 refutation.

From an arbitrary 2-Horn formula \(\varPhi =PU \cup \phi \cup NU \cup N\), we can create a directed graph \(\mathbf{G} =\langle \mathbf{V,E} \rangle \) as follows:

  1. 1.

    For each variable \(x_i\) in \(\varPhi \), add the vertex \(v_i\) to \(\mathbf{V}\).

  2. 2.

    For each clause \((x_i \vee \lnot x_j)\), add the edge \((v_j,v_i)\) to \(\mathbf{E}\).

  3. 3.

    Let \(S =\{v_i|(x_i) \in PU\}\) and \(T =\{v_j|(\lnot x_j)\in NU\}\).

We utilize \(\mathbf{G}\) to find read-once unit resolution refutations.

Lemma 1

The clause \((x_i)\) is derivable from \(\varPhi \) by a read-once unit resolution if and only if \(v_i\) is reachable in \(\mathbf{G}\) from a vertex in S.

Proof

Assume that \((x_i)\) is derivable from \(\varPhi \) by a read-once unit resolution \(\mathbf{R}\). We will show that \(v_i\) is reachable from a vertex in S by induction based on the number of resolution steps in \(\mathbf{R}\).

If \(\mathbf{R}\) has no resolution steps then \((x_i) \in \varPhi \). Thus, \((x_i) \in PU\) and, by construction, \(v_i \in S\). Thus, \(v_i\) is trivially reachable from a vertex in S.

Now suppose that this holds true of all resolutions with k steps. If \(\mathbf{R}\) has \((k+1)\) steps, then the last resolution step of \(\mathbf{R}\) must be \((x_j) \wedge (\lnot x_j \vee x_i) \ \mid \!\!\!\frac{1}{\ {\mathop {\ }\limits ^{\text{ RES }}}\ }\ (x_i)\) for some clause \((x_j)\). This means that \((x_j)\) must be derivable from \(\varPhi \) by a read-once unit resolution with k steps. Thus, by the induction hypothesis, \(v_j\) is reachable from a vertex in S. Thus, \((\lnot x_j \vee x_i) \in \varPhi \) since non-unit clauses are not derivable from 2-Horn clauses by unit resolution. This means that \((\lnot x_j \vee x_i) \in \phi \) and, by construction, the edge \((v_j,v_i)\) is in \(\mathbf{E}\). Thus, \(v_i\) is also reachable from a vertex in S.

Now assume that \(v_i\) is reachable from a vertex \(v_j \in S\). Thus, there must be a simple path p in \(\mathbf{G}\) from \(v_j\) to \(v_i\). Let the edges in p be \((v_j,v_{p_1})\), \((v_{p_1},v_{p_2})\), \(\ldots \), \((v_{p_k},v_i)\). By construction of \(\mathbf{G}\), the clauses \((x_j)\), \((\lnot x_j \vee x_{p_1})\), \((\lnot x_{p_1} \vee x_{p_2})\), \(\ldots \), \((\lnot x_{p_k} \vee x_i)\) are all in \(\varPhi \).

It is easy to see that these clauses can be used to derive \((x_i)\) through unit resolution. Since p was simple no clause is used more than once. Thus, this is a read-once unit resolution of \((x_i)\).    \(\square \)

Theorem 3

Whether the formula \(\varPhi \) has a Type 1 refutation can be determined in linear time.

Proof

\(\varPhi \) has a Type 1 refutation if and only if for some clause \((\lnot x_t) \in \varPhi \), we can derive the clause \((x_t)\) from \(\varPhi \) by a read-once unit resolution \(\mathbf{R}\). Thus, by Lemma 1, some vertex \(v_t \in T\) must be reachable from a vertex in S. This can be checked by doing a depth-first search in \(\mathbf{G}\).    \(\square \)

Theorem 4

Whether the formula \(\varPhi \) has a Type 2 refutation can be determined in \(O(m^2)\) time.

Proof

\(\varPhi \) has a Type 2 refutation if and only if for some clause \((\lnot x_{t_1} \vee \lnot x_{t_2}) \in \varPhi \), we can derive the clauses \((x_{t_1})\) and \((x_{t_2})\) from \(\varPhi \) by read-once unit resolution \(\mathbf{R}\). Thus, by Lemma 1, both \(v_{t_1}\) and \(v_{t_2}\) must be reachable from verticies in S. Thus, there is a path \(p_1\) from a vertex in S to \(v_{t_1}\) and there is a path \(p_2\) from a vertex in S to \(v_{t_2}\). Since \(\mathbf{R}\) is a read-once unit resolution, \(p_1\) and \(p_2\) cannot share any edges and cannot have the same starting vertex.

If we are given verticies \(v_{t_1}\) and \(v_{t_2}\), then we need to check if \(\mathbf{G}\) has such a pair of edge-disjoint paths.

From \(\mathbf{G}\), S, \(v_{t_1}\), and \(v_{t_2}\), we can construct flow network \(\mathbf{G'}\) as follows:

  1. 1.

    For each vertex \(v_i\) in \(\mathbf{G}\), add the node \(v_i\) to \(\mathbf{G'}\).

  2. 2.

    For each edge \((v_i,v_j)\) in \(\mathbf{G}\) add the edge \((v_i,v_j)\) to \(\mathbf{G}'\) with capacity 1.

  3. 3.

    Add a source s and the edges \((s,v_i)\) for \(v_i \in S\) to \(\mathbf{G'}\).

  4. 4.

    Add a sink t and the edges \((v_{t_1},t)\), and \((v_{t_2},t)\) to \(\mathbf{G'}\).

The desired paths exist if and only if a flow of 2 can be pushed from s to t in \(\mathbf{G'}\).

Thus, \(\varPhi \) has a Type 2 refutation if and only if, for some clause \((\lnot x_{t_1} \vee \lnot x_{t_2}) \in \varPhi \), the graph \(\mathbf{G'}\) has a max flow of 2.

Note that all edges in \(\mathbf{G'}\) have capacity 1, and that t has only two incoming edges. This means that the maximum flow is at most 2. Thus, we can find the max flow from s to t in \(\mathbf{G'}\) in O(m) time where m is the number of clauses in \(\varPhi \) [2]. Since we need to find this flow for each clause of the form \((\lnot x_{t_1} \vee \lnot x_{t_2})\), determining if \(\varPhi \) has a Type 2 refutation can be accomplished in \(O(m^2)\) time.    \(\square \)

Thus, checking if \(\varPhi \) has a read-once unit resolution refutation can be accomplished by checking to see if it has either a Type 1 refutation or a Type 2 refutation. This can be accomplished in \(O(m^2)\) time since checking for Type 2 refutations is the more time consuming process.

5 UROR Copy Complexity of Horn Formulas

We now examine the copy complexity of Horn formulas with respect to unit resolution.

Theorem 5

The copy complexity of Horn formulas with respect to unit resolution is at most \(2^{n-1}\) where n is the number of variables.

Proof

Suppose \(\varPhi \) is an unsatisfiable Horn formula. Note that adding clauses to a system cannot increase the copy complexity. Thus, we can assume without loss of generality that \(\varPhi \) is minimal unsatisfiable. Let CC(n) denote the copy complexity of a minimal unsatisfiable Horn formula with n variables. We will show that \(CC(n) \le 2^{n-1}\). For a clause \(\phi _j\) of \(\varPhi \) let \(N_c(\phi _j)\) be the number of copies of \(\phi _j\) needed for a read-once unit resolution refutation.

Let \(\varPhi \) be a minimal unsatisfiable Horn formula with n variables. Thus, \(\varPhi \) has \((n+1)\) clauses. If \(n=1\), then \(\varPhi \) has the form \((x) \wedge (\lnot x)\). This formula has a read-once unit resolution refutation. Thus \(CC(1)=1\le 2^0\). Also note that \(\sum _{\phi _j\in \varPhi } N_c(\phi _j)=2\le 2^1\).

Now assume that \(CC(k) \le 2^{k-1}\), and that for each minimal unsatisfiable formula \(\varPhi '\) with k variables, \(\sum _{\phi '_j\in \varPhi '} N_c(\phi '_j)\le 2^k\). If \(n=k+1\), then \(\varPhi \) has the form \((x) \wedge (\lnot x \vee \alpha _1) \wedge \ldots \wedge (\lnot x \vee \alpha _t) \wedge \sigma _{t+1}\ldots \wedge \sigma _{k+1}\). A read-once unit resolution refutation needs to use the clause (x) to eliminate each instance of \(\lnot x\). Thus, we need a copy of the clause (x) for each copy of \((\lnot x \vee \alpha _i)\) for \(i=1 \ldots t\). Let \(\varPhi '\) be the formula \(\alpha _1 \wedge \ldots \alpha _t \wedge \sigma _{t+1}\ldots \wedge \sigma _{k+1}\). Note that \(\varPhi '\) is a minimal unsatisfiable formula with \(n-1=k\) variables. Thus,

$$\begin{aligned} \sum _{j=1}^{t} N_c(\alpha _i)\le \sum _{\phi '_j \in \varPhi '} N_c(\phi '_j)\le 2^k. \end{aligned}$$

This means that we need at most \(2^k\) copies of the clause (x). Thus, \(CC(k+1) \le 2^k\) and \(\sum _{\phi _j\in \varPhi } N_c(\phi _j)\le 2^k+2^k=2^{k+1}.\)    \(\square \)

Theorem 6

There exists a Horn formula with copy complexity \(2^{n-1}\) where n is the number of variables.

Proof

Consider the following clauses:

$$\begin{aligned} \begin{array}{ccccc} \left( \lnot x_1 \vee \lnot x_2 \vee \ldots \vee \lnot x_n\right) &{} &{} \left( x_1 \vee \lnot x_2 \vee \ldots \vee \lnot x_n\right) &{} &{} \left( x_2 \vee \lnot x_3 \vee \ldots \vee \lnot x_n\right) \\ \ldots &{} &{} \left( x_{n-1} \vee \lnot x_n\right) &{} &{} \left( x_n\right) \end{array} \end{aligned}$$

Let us consider the clause \(\left( \lnot x_1 \vee \lnot x_2 \vee \ldots \vee \lnot x_n\right) \). To eliminate this clause through unit resolution, we need to derive the clauses \((x_1)\), \((x_2)\), \(\ldots \), \((x_n)\). We will prove by induction, that for each \(i =1 \ldots n\), \(2^{i-1}\) copies of the clause \(\left( x_i \vee \lnot x_{i+1} \vee \ldots \vee \lnot x_n\right) \) are required.

To eliminate \(\lnot x_1\) from the clause we only need one copy of the clause \((x_1)\), thus we only need \(1=2^0\) copies of the clause \(\left( x_1 \vee \lnot x_2 \vee \ldots \vee \lnot x_n\right) \).

Now assume that for each \(i<k\), we need to use \(2^{i-1}\) copies of the clause \(\left( x_i \vee \lnot x_{i+1} \vee \ldots \vee \lnot x_n\right) \). Each of these clauses uses the literal \(\lnot x_k\), as does the clause \(\left( \lnot x_1 \vee \lnot x_2 \vee \ldots \vee \lnot x_n\right) \). Thus, we need a total of \(1 + \sum _{i=1}^{k-1} 2^{i-1} = 2^{k-1}\) copies of the clause \((x_k)\) to cancel every instance of \(\lnot x_k\). The only clause with the literal \(x_k\) is the clause \(\left( x_k \vee \lnot x_{k+1} \vee \ldots \vee \lnot x_n\right) \). Thus, we need to use \(2^{k-1}\) copies of this clause, as desired.

This means that we need a total of \(2^{n-1}\) copies of the clause \((x_n)\). Thus, the copy complexity of this formula is \(2^{n-1}\) with respect to unit resolution.    \(\square \)

From these two results, it is easy to see that the copy complexity of Horn formulas with respect to read-once unit resolution refutation is \(2^{n-1}\).

6 Conclusion

In this paper, we studied two proof systems, viz., Read-once resolution (ROR) and Unit read-once resolution (UROR) from the perspective of Horn clause systems. Our work is motivated by two important factors, viz., the ubiquitousness of the resolution proof system in SMT solvers and the wide applicability of Horn formulas. As discussed before, the ROR and UROR proof systems are incomplete for general CNF formualas, although ROR is complete for Horn formulas. Note that if an unsatisfiable formula has a read-once refutation or unit read-once refutation, then this refutation is necessarily short. This is in contrast to general resolution proofs, which could be exponentially large with respect to input size. Our investigations established that the problem of checking whether a Horn clause system has a read-once refutation of length at most k is NP-hard. We also showed that the problem of finding a UROR for 2-Horn formulas is in P. Finally, we discussed the copy complexity of Horn formulas with respect to unit resolution and obtained an exponential lower bound.