1 Introduction

This paper is concerned with the problem of determining bounds on the copy complexity of Horn constraint systems (HCSs) under the ADD refutation system [10]. A linear constraint of the form \(\sum _{i=1}^{n}a_{i}\cdot x_{i} \ge b\), \(b \in \mathbb {Z}\), is said to be Horn, if \(\forall i, a_{i} \in \{0,1,-1\}\) and at most one of the \(a_{i} =1\). A conjunction of such constraints is called a Horn constraint system (HCS). Horn constraints arise in a number of application domains such as program verification [2, 3], lattice programming [9] and econometrics. The ADD refutation system is a refutation system with a single inference rule, viz., if two constraints \(l_{1}\) and \(l_{2}\) are part of the HCS or can be inferred from the HCS, then so can their sum. It is well-known that the ADD refutation system is both sound and complete from the perspective of establishing infeasibility in polyhedral constraint systems [10]. Furthermore, this system enables the extraction of the actual refutation. When it comes to establishing infeasibility, the goal is clearly to find “short” certificates, since such certificates can be effectively verified. However, not all constraint systems have compact certificates in the ADD refutation system. In our quest for compactness, we attempt to minimize the number of times a constraint can be used by the refutation system, in order to infer a contradiction. This leads to the notion of copy complexity of a constraint system under the ADD refutation system. For the rest of the paper, we will assume that the constraint system under consideration is Horn and that the refutation system is the ADD refutation system (see Sect. 2). Accordingly, we will use the phrase “copy complexity” without reference to the accompanying refutation system.

The problem of determining the copy complexity of an HCS is known to be NP-hard [8].

In this paper, we investigate width 3 HCSs. In most program verification applications, the width of Horn clauses or constraints is bounded by a small constant. Accordingly, this investigation is well-motivated.

2 Statement of Problems

In this section, we define the problems studied in this paper.

Definition 1

A system of constraints \(\mathbf{A \cdot x \ge b}\) is said to be a Horn Constraint system (HCS) if:

  1. 1.

    The entries in \(\mathbf {A}\) belong to the set \(\{0,1,-1\}\).

  2. 2.

    Each row of \(\mathbf{A}\) contains at most one positive entry.

  3. 3.

    \(\mathbf{x}\) is a real valued vector.

  4. 4.

    \(\mathbf{b}\) is an integral vector.

In the constraint \(\mathbf{a\cdot x }\ge b_{j}\), \(b_{j}\) is called the defining constant. If a Horn constraint has at most w non-zero coefficients, then it is called a width w Horn constraint. A system of width w Horn constraints is known as a width w HCS. In a Horn constraint we refer to the terms \(x_i\) and \(-x_i\) as literals.

If a Horn constraint has only one non-zero coefficient, then it is called an absolute constraint. If that coefficient is 1, then it is called a positive absolute constraint.

We are interested in certificates of infeasibility. In this paper, we utilize an inference rule known as the ADD rule [10]. This inference rule derives a new constraint by summing a pair of constraints (either from the original system or derived by previous inferences) and is defined as follows:

$$\begin{aligned} \text {ADD}:\dfrac{ { \sum _{i=1}^n a_i\cdot x_i} \ge b_1 \qquad {\sum _{i=1}^n a'_i \cdot x_i} \ge b_2 }{ \sum _{i=1}^n (a_i+a'_i)\cdot x_i \ge b_1 + b_2 } \end{aligned}$$
(1)

This inference rule plays a similar role to the role played by resolution in clausal formulas.

Using Rule (1), we can now define a linear refutation.

Definition 2

A linear refutation is a sequence of applications of the ADD rule that results in a contradiction of the form \(0 \ge b\), \(b > 0\).

The form of refutation defined in Definition 2 is both sound and complete when used as a proof system for linear feasibility. It is sound since any assignment that satisfies the constraints used by an application of the ADD rule also satisfies the constraint derived by that application. Additionally, ADD rule based linear refutation is complete. This means that repeated application of the ADD rule will eventually result in a contradiction of the form: \(0 \ge b\), \(b> 0\) for any linearly infeasible system. The completeness of ADD rule based linear refutations was established by Farkas [4], in a lemma that is famously known as Farkas’ Lemma for systems of linear inequalities [12].

Of particular interest is a restricted form of refutation known as read-once refutation.

Definition 3

A read-once refutation is a refutation in which each constraint, l, can be used in only one inference. This applies to constraints present in the original systems and those derived as a result of previous inferences.

Example 1

Consider the HCS \(\mathbf{H}\) defined by System (2).

$$\begin{aligned}&l_1:x_1 - x_2 -x_3 \ge 0 \quad \ l_2:x_2 - x_3 \ge -1&\nonumber \\&l_3:x_3 - x_1 \ge 1 \quad \ l_4:x_3 \ge 1&\end{aligned}$$
(2)

System (2) has the following read-once refutation:

  1. 1.

    Apply the ADD rule to \(l_1\) and \(l_2\) to get \(l_5:x_1 - 2 \cdot x_3 \ge -1\).

  2. 2.

    Apply the ADD rule to \(l_5\) and \(l_3\) to get \(l_6:- x_3 \ge 0\).

  3. 3.

    Apply the ADD rule to \(l_6\) and \(l_4\) to get the contradiction \(0 \ge 1\).

We can now define copy complexity in terms of read-once refutations.

Definition 4

A HCS \(\mathbf{H}\) has copy complexity k if k is the smallest integer for which there exists a multi-set of Horn constraints, \(\mathbf{H'}\) such that:

  1. 1.

    Every constraint in \(\mathbf{H}\) appears at most k times in \(\mathbf{H'}\).

  2. 2.

    Every constraint in \(\mathbf{H'}\) appears in \(\mathbf{H}\).

  3. 3.

    \(\mathbf{H'}\) has a read-once refutation using the ADD rule.

In this paper, we examine the following problems related to copy complexity:

Definition 5

The copy complexity (CC\(_D\)) problem: given an HCS \(\mathbf{H}\) and an integer k, is the copy complexity of \(\mathbf{H}\) at most k?

Definition 6

The optimal copy complexity (CC\(_{Opt}\)) problem: given an HCS \(\mathbf{H}\), what is the smallest k such that the copy complexity of \(\mathbf{H}\) is at most k?

In this paper, we focus on these problems in width 3 HCSs, for the most part.

The principal contributions of this paper are as follows:

  1. 1.

    Establishing a lower bound on the copy complexity of bounded width HCSs (Theorem 2).

  2. 2.

    Establishing that the CC\(_D\) problem for width 3 HCSs is NP-complete (Theorem 4).

  3. 3.

    Establishing that no algorithm for the CC\(_D\) problem for width 3 HCSs can run in time \(2^{o(n)}\) unless the Exponential Time Hypothesis (ETH) fails (Theorem 6).

  4. 4.

    Establishing that the CC\(_{Opt}\) problem for width 3 HCSs is NPO-complete [7] (Theorem 7).

3 Observations on Copy Complexity

In this section, we observe several properties of the copy complexity of bounded width HCSs.

First, we show that the copy complexity of a width w HCS with \(((w-1) \cdot n'+1)\) variables can be as large as \(2^{(w-2) \cdot n'}\).

Theorem 1

For each integer \(n' \ge 0\), there exists a width w HCS \(\mathbf{H}\) with \(((w-1) \cdot n' +1)\) variables such that \(\mathbf{H}\) has copy complexity \(2^{(w-2) \cdot n'}\).

Proof

Let \(\mathbf{H}\) be the HCS constructed as follows:

  1. 1.

    The constraint \(l_1\) is \(-x_1 \ge 1\).

  2. 2.

    For \(r = 2, \ldots ,(w-1) \cdot n'+1\), the constraint \(l_r\) is

    $$\begin{aligned} x_{r-1} - \sum _{j=r}^{\left\lceil \frac{r-1}{w-1}\right\rceil \cdot (w-1)+1} x_j \ge 0. \end{aligned}$$
  3. 3.

    The constraint \(l_{(w-1) \cdot n' +2}\) is \(x_{(w-1) \cdot n'+1} \ge 0\).

We will show that for each \(i=0 \ldots n'\), the constraint \(l_{(w-1) \cdot i+2}\) must be used at least \(2^{(w-2) \cdot i}\) times by any linear refutation of \(\mathbf{H}\).

Note that constraint \(l_1\) is the only constraint in \(\mathbf{H}\) that has a positive defining constant. Thus, \(l_1\) must be in any linear refutation of \(\mathbf{H}\). Additionally, the only constraint in \(\mathbf{H}\) with the literal \(x_1\) is \(l_2\). Thus, the constraint \(l_2\) must be used at least \(2^0\) times by any linear refutation of \(\mathbf{H}\).

Now assume that the constraint \(l_{(w-1) \cdot i+2}\) must be used at least \(2^{(w-2) \cdot i}\) times by any linear refutation of \(\mathbf{H}\). Note that this constraint contains the literal \(-x_{(w-1) \cdot i+2}\) and that this is the only constraint in \(\mathbf{H}\) containing that literal. The only constraint in \(\mathbf{H}\) with the literal \(x_{(w-1) \cdot i+2}\) is \(l_{(w-1) \cdot i+3}\). Thus, this constraint must also be used at least \(2^{(w-2) \cdot i}\) times by any linear refutation of \(\mathbf{H}\).

Both constraints \(l_{(w-1) \cdot i+2}\) and \(l_{(w-1) \cdot i+3}\) contain the literal \(-x_{(w-1) \cdot i+3}\) and these are the only constraints in \(\mathbf{H}\) containing that literal. Thus, the literal \(-x_{(w-1) \cdot i+3}\) is used by at least \(2^{(w-2) \cdot i+1}\) constraints in any linear refutation of \(\mathbf{H}\).

For each \(j = 1 \dots w-1\), the constraints \(l_{(w-1) \cdot i+2}\) through \(l_{(w-1) \cdot i+j+1}\) contain the literal \(-x_{(w-1) \cdot i+j+1}\) and these are the only constraints in \(\mathbf{H}\) containing that literal. Thus, the literal \(-x_{(w-1) \cdot i+j+1}\) is used by at least \(2^{(w-2) \cdot i+j-1}\) constraints in any linear refutation of \(\mathbf{H}\).

Note that the only constraint in \(\mathbf{H}\) with the literal \(x_{(w-1) \cdot i+w}=x_{(w-1) \cdot (i+1)+1}\) is \(l_{(w-1) \cdot (i+1)+2}\). Thus, any linear refutation of \(\mathbf{H}\) must use the constraint \(l_{(w-1) \cdot (i+1)+2}\) at least \(2^{(w-2) \cdot (i+1)}\) times as desired.

Thus, for each \(i=0 \ldots n'\), the constraint \(l_{(w-1) \cdot i+2}\) must be used at least \(2^{(w-2) \cdot i}\) times by any linear refutation of \(\mathbf{H}\). In particular, the constraint \(l_{(w-1) \cdot n'+2}\) must be used at least \(2^{(w-2) \cdot n'}\) times by any linear refutation of \(\mathbf{H}\).

We can construct a linear refutation of \(\mathbf{H}\) by using each constraint \(l_r\), \(2^{r-1 - \left\lfloor \frac{r-1}{w-1} \right\rfloor }\) times.    \(\square \)

From, Theorem 1, there is a width w HCS \(\mathbf{H}\) with \(n \equiv 1 \mod (w-1)\) variables such that \(\mathbf{H}\) has copy complexity at least \(2^{(w-2) \cdot \frac{n-1}{w-1}}\). Utilizing a different construction, we can obtain a tighter bound based on a generalization of the Fibonacci Sequence.

Recall that the Fibonacci Sequence \(F_n\) is the sequence in which each element is the sum of the two previous elements. More formally, the Fibonacci Sequence is defined as follows: \(F_0 =0\), \(F_1 = 1\), and \(F_n = F_{n-1} + F_{n-2}\) for \(n \ge 2\). We can generalize this definition by having each element depend on more than just the previous two elements. Our result on width w HCSs utilizes the following generalization of the Fibonacci Sequence.

For each \(w \ge 1\), the width w Fibonacci Sequence \(F_{w,n}\) is defined as follows: \(F_{w,0} = 0\), \(F_{w,1} = 1\), \(F_{w,n} = \sum _{i=\max \{0, n-w\}}^{n-1} F_{w,i}\) for \(n \ge 2\). Thus, in the width w Fibonacci Sequence each element depends on the sum of the previous w elements, not just the previous 2 elements. Note that the width 2 Fibonacci Sequence is simply the regular Fibonacci Sequence.

We now make a structural observation about \(F_{w,n}\).

Lemma 1

For \(w \ge 1\) and \(2 \le n \le w+1\), \(F_{w,n} = 2^{n-2}\).

Proof

Let w be a positive integer. By definition, \(F_{w,0}=0\) and \(F_{w,1}=1\). Additionally, for \(n =2 \ldots w\), \(F_{w,n} = \sum _{i=0}^{n-1} F_{w,i}\). Since \(F_{w,0}=0\), we have that for \(n =2 \ldots w+1\), \(F_{w,n} = \sum _{i=1}^{n-1} F_{w,i}\).

When \(n=2\), we have

$$\begin{aligned} F_{w,n} = F_{w,2} = F_{w,0} + F_{w,1} =1 = 2^{n-2}. \end{aligned}$$

Let n be an integer such that \( 2 \le n \le w+1\). Assume that \(F_{w,i} = 2^{i-2}\) for \(2 \le i < n\). Recall that,

$$\begin{aligned} F_{w,n} = \sum _{i=1}^{n-1} F_{w,i} = 1 + \sum _{i=2}^{n-1} 2^{i-2} = 1 + (2^{n-2}-1) = 2^{n-2}. \end{aligned}$$

Thus, for \(w \ge 1\) and \(2 \le n \le w+1\), \(F_{w,n} = 2^{n-2}\).    \(\square \)

We will now utilize width w Fibonacci Sequences to establish a stronger lower bound on the copy complexity of HCSs with bounded constraint width.

Theorem 2

For each \(w \ge 2\) and \(n \ge 2\), there exists a width w HCS \(\mathbf{H}_{w,n}\) with n variables such that the copy complexity of \(\mathbf{H}_{w,n}\) is \(2 \cdot F_{(w-1),n}\).

Proof

For each \(w \ge 2\) and \(n \ge 2\), Let \(\mathbf{H}_{w,n}\) be the HCS constructed as follows:

  1. 1.

    The constraint \(l_0\) is \(-x_1 - x_{2} - \ldots - x_{w} \ge 1\).

  2. 2.

    For \(r = 1, \ldots ,n-1\), the constraint \(l_r\) is \(x_{r} - \sum _{j=r+1}^{\min \{r+w-1,n\}} x_j \ge 0\).

  3. 3.

    The constraint \(l_{n}\) is \(x_{n} \ge 0\).

Let R be a linear refutation of \(\mathbf{H}_{w,n}\) and let \(C(l_i)\) be the number of times the constraint \(l_i\) is used by R. We will show that for any refutation R of \(\mathbf{H}_{w,n}\), \(C(l_i) \ge 2 \cdot F_{(w-1),i}\) for \(i=2 \ldots n\).

We make the following observations about R and the structure of \(\mathbf{H}_{w,n}\):

  1. 1.

    \(l_0\) is the only constraint with positive defining constant. Thus, \(l_0\) must be used by R and \(C(l_0) \ge 1\)

  2. 2.

    For each \(i=1 \ldots n\), the constraint \(l_i\) is the only constraint to use the literal \(x_i\). Additionally, the constraints \(l_{i-w+1}\) through \(l_{i-1}\) are the only constraints in \(\mathbf{H}_{w,n}\) to use the literal \(-x_i\). The only exception to this is the literal \(-x_w\) which also appears in the constraint \(l_0\). Since R is a refutation of \(\mathbf{H}_{w,n}\), the number of constraints in R that use the literal \(x_i\) and the number of constraints in R that use the literal \(-x_i\) are equal. Thus, for \(i =1 \ldots n\), \(i \ne w\) we have that \(C(l_i) = \sum _{j=\max \{0, i-w+1\}}^{i-1} C(l_j)\) and \(C(l_w) = \sum _{j=0}^{w-1} C(l_j)\).

  3. 3.

    For \(i=1 \ldots w\), \(C(l_i) = \sum _{j=0}^{i-1} C(l_j)\). Thus, \(C(l_i) \ge 2^{i-1}\). From Lemma 1, we have that for \(i = 2 \ldots w\), \(F_{(w-1),i} = 2^{i-2}\). Thus, for \(i=2 \ldots w\), \(C(l_i) \ge 2 \cdot F_{(w-1),i}\).

  4. 4.

    Let i be an integer such that \(w < i \le n\). Assume that for each \(j = 2 \ldots i-1\), \(C(l_j) \ge 2 \cdot F_{(w-1),j}\). We have that \(C(l_i) = \sum _{j=i-w+1}^{i-1} C(l_j)\). Since \((i-w+1) \ge 2\), \(\sum _{j=i-w+1}^{i-1} C(l_j) \ge \sum _{j=i-w+1}^{i-1} 2 \cdot F_{(w-1),j} = 2 \cdot F_{(w-1),i}\). Thus, \(C(l_i) \ge 2 \cdot F_{(w-1),i}\) as desired.

From the above observations, for each \(i=2 \ldots n\), \(C(l_i) \ge 2 \cdot F_{(w-1),i}\). In particular, \(C(l_n) \ge 2 \cdot F_{(w-1),n}\). Thus, the copy complexity of \(\mathbf{H}_{w,n}\) is at least \(2 \cdot F_{(w-1),n}\).

Let R be such that \(C(l_0)=1\), \(C(l_1)=1\), and for each \(i =2 \ldots n\), \(C(l_i) = 2 \cdot F_{(w-1),i}\). It can be algebraically verified that R is a linear refutation of \(\mathbf{H}_{w,n}\). Thus, copy complexity of \(\mathbf{H}_{w,n}\) is \(2 \cdot F_{(w-1),n}\) as desired.    \(\square \)

Let \(S_l\) be the following set of constraints:

$$\begin{aligned}&l_1: x_{2 \cdot l+1} - x_{2 \cdot l} - x_{2 \cdot l-1} \ge 0 \qquad \ \ l_2: x_{2 \cdot l}-x_{2 \cdot l-1} \ge 0&\\&l_3: x_{2 \cdot l-1} - x_{2 \cdot l-2} -x_{2 \cdot l-3} \ge 0 \qquad \ \ l_4: x_{2 \cdot l-2} - x_{2 \cdot l-3} \ge 0 \qquad \ \ \ldots&\\&l_{2 \cdot l-1}: x_{3} -x_{2} - x_{1} \ge 0 \qquad \ \ l_{2 \cdot l}: x_{2} - x_{1} \ge 0 \qquad \ \ l_{2 \cdot l+1}: x_1 \ge 0&\end{aligned}$$

Theorem 1, when applied to width 3 HCSs, can be extended to the following result which will be utilized later in the paper.

Theorem 3

Let \(\mathbf{H}\) be an HCS and let \(\{x_1, \ldots , x_{2 \cdot l+1}\}\) be a subset of the variables in \(\mathbf{H}\) such that for each \(i=1 \ldots (2 \cdot l+1)\), the only constraint in \(\mathbf{H}\) that uses the literal \(x_i\) belongs to the set \(S_l\). If a linear refutation R of \(\mathbf{H}\) uses a constraint \(x-\sum _{i \in S} x_i \ge b\) for some set \(S \subseteq \{1,3, \ldots , 2 \cdot l+1\}\), then R must use the constraint \(x_1 \ge 0\) at least \(\sum _{2 \cdot i+1 \in S} 2^{i}\) times.

Proof

Let \(\mathbf{H}\) be an appropriately constructed HCS. For some subset \(S \subseteq \{1,3, \ldots , 2 \cdot l+1\}\) let \(x-\sum _{i \in S} x_i \ge b\) be a constraint in \(\mathbf{H}\) used by a linear refutation R of \(\mathbf{H}\).

Let \(x_i\) be a variable such that \(2 \cdot i+1 \in S\). By the definition of \(\mathbf{H}\), the only constraint in \(\mathbf{H}\) with the literal \(x_{2 \cdot i+1}\) is \(x_{2 \cdot i+1} - x_{2 \cdot i} -x_{2 \cdot i-1} \ge 0\). Thus, this constraint must be used by R. Observe the following:

  1. 1.

    The constraint \(x_{2 \cdot i} -x_{2 \cdot i-1} \ge 0\) is the only constraint with the literal \(x_{2 \cdot i}\). Thus, it needs to be in R. Consequently, R has at least 2 constraints with the literal \(-x_{2 \cdot i-1}\).

  2. 2.

    The constraint \(x_{2 \cdot i-1} - x_{2 \cdot i-2} -x_{2 \cdot i-3} \ge 0\) is the only constraint with the literal \(x_{2 \cdot i-1}\). Thus, R needs to use this constraint at least 2 times. Consequently, R has at least 2 constraints with the literal \(-x_{2 \cdot i-2}\).

  3. 3.

    The constraint \( x_{2 \cdot i-2} -x_{2 \cdot i-3} \ge 0\) is the only constraint with the literal \(x_{2 \cdot i-2}\). Thus, R needs to use this constraint at least 2 times. Consequently, R has at least 4 constraints with the literal \(-x_{2 \cdot i-3}\).

  4. 4.

    For each r, R uses at least \(2^{r}\) constraints with the literal \(-x_{2 \cdot (i-r)+1}\). Thus, the constraint \(x_{2 \cdot (i-r)+1} - x_{2 \cdot (i-r)} -x_{2 \cdot (i-r-1)+1} \ge 0\) needs to be in the refutation at least \(2^{r}\) times. Consequently, R uses at least \(2^{r}\) constraints with the literal \(-x_{2 \cdot (i-r)}\).

  5. 5.

    For each r, R uses at least \(2^{r}\) constraints with the literal \(-x_{2 \cdot (i-r)}\). Thus, the constraint \(x_{2 \cdot (i-r)} -x_{2 \cdot (i-r-1)+1} \ge 0\) needs to be in the refutation at least \(2^{r}\) times. Consequently, R uses at least \(2^{r+1}\) constraints with the literal \(-x_{2 \cdot (i-r-1)+1}\).

  6. 6.

    The constraint \(x_1 \ge 0\) needs to be used at least \(2^{i}\) times by R.

Thus, for each \(2 \cdot i +1 \in S\), R must use the constraint \(x_1 \ge 0\) at least \(2^{i}\) times. Consequently, R must use the constraint \(x_1 \ge 0\) at least \(\sum _{2 \cdot i +1 \in S} 2^{i}\) times.    \(\square \)

4 Computational Complexity of the CC\(_D\) Problem

In this section, we explore the computational and approximation complexities of the copy complexity problem for width 3 HCSs.

First, we show that the problem of determining the copy complexity of an HCS is NP-complete even when each constraint in the HCS has at most 3 non-zero coefficients.

Let \(\varPhi \) be a 3-CNF formula with \(m'\) clauses over \(n'\) variables and let \(\mathbf{H}\) be the HCS constructed as follows:

  1. 1.

    For each variable \(x_i\) of \(\varPhi \), create the variables \(x_i\) and \(y_i\). Create the constraints \(- x_1 -y_1 \ge 0\), \(y_1 - x_2 -y_2 \ge 0, \ldots , y_{n'-2} - x_{n'-1} - y_{n'-1} \ge 0\), \(y_{n'-1} - x_n' \ge 1-m'\). These constraints are equivalent to the constraint \(-\sum _{i=1}^{n'} x_i \ge 1-m'\).

  2. 2.

    For each clause \(\phi _j \in \varPhi \), create the variable \(c_j\). Additionally, create the constraints \(c_j \ge 1\), \(c_j \ge 0\), and \(c_j \ge 0\).

  3. 3.

    For each clause \(\phi _j \in \varPhi \) and each variable \(x_i\) in clause \(\phi _j\), create the variable \(z_{i,j}\). Since each clause has at most 3 literals, there are at most \(3 \cdot m'\) such variables.

  4. 4.

    For each variable \(x_i\), let \(Pos(i)=\{\phi _{j_1}, \ldots , \phi _{j_{|Pos(i)|}}\}\) be the set of clauses containing the literal \(x_i\). Create the constraints \(x_i - z_{i,j_1} \ge 0\), \(z_{i,j_1} - c_{j_1} - z_{i,j_2} \ge 0, \ldots , z_{i,j_{|Pos(i)|}} - c_{j_{|Pos(i)|}} \ge 0\). This is equivalent to the constraint \(x_i - \sum _{\phi _j \in Pos(i)} c_j \ge 0\).

  5. 5.

    For each variable \(x_i\), let \(Neg(i)=\{\phi _{j'_1}, \ldots , \phi _{j'_{|Neg(i)|}}\}\) be the set of clauses containing the literal \(\lnot x_i\). Create the constraints \(x_i - z_{i,j'_1} \ge 0\), \(z_{i,j'_1} - c_{j'_1} - z_{i,j'_2} \ge 0, \ldots , z_{i,j'_{|Neg(i)|}} - c_{j'_{|Neg(i)|}} \ge 0\). This is equivalent to the constraint \(x_i - \sum _{\phi _j \in Neg(i)} c_j \ge 0\).

Note that \(\mathbf{H}\) has \(n\le (4 \cdot m'+2 \cdot n')\) variables.

We now show that a 3-CNF formula \(\varPhi \) has a solution if and only if the HCS \(\mathbf{H}\) has a copy complexity of 1.

Lemma 2

Let \(\varPhi \) be a 3-CNF formula and let \(\mathbf{H}\) be the HCS constructed from \(\varPhi \). \(\varPhi \) has a solution if and only if \(\mathbf{H}\) has a copy complexity of 1.

Proof

First, assume that \(\varPhi \) has a solution \(\mathbf{x}\). We will show that \(\mathbf{H}\) has copy complexity 1 by showing that \(\mathbf{H}\) has a read-once linear refutation R.

We construct R as follows:

  1. 1.

    Add the constraints \(- x_1 -y_1 \ge 0, \ldots , y_{n'-1} - x_n' \ge 1-m'\) to R. Note that summing these constraints results in the constraint \(-\sum _{i=1}^{n'} x_i \ge 1-m'\).

  2. 2.

    For each variable \(x_i\), if \(x_i\) is assigned true by \(\mathbf{x}\), then add the constraints \(x_i - z_{i,j_1} \ge 0, \ldots , z_{i,j_{|Pos(i)|}} - c_{j_{|Pos(i)|}} \ge 0\) to R. If \(x_i\) is assigned false by \(\mathbf{x}\), then add the constraints \(x_i - z_{i,j'_1} \ge 0, \ldots , z_{i,j'_{|Neg(i)|}} - c_{j'_{|Neg(i)|}} \ge 0\) to R.

  3. 3.

    For each clause \(\phi _r \in \varPhi \), let C(r) be the number of times the literal \(-c_r\) is used by a constraint in R so far. Since \(\phi _r\) has at most 3 literals, \(C(r) \le 3\). Additionally, since \(\mathbf{x}\) satisfies \(\varPhi \), the clause \(\phi _r\) must contain a literal T(r) set to true by \(\mathbf{x}\).

  4. 4.

    If T(r) is the literal \(x_i\), then the variable \(x_i\) is assigned true by \(\mathbf{x}\). Thus, by construction, R contains the equivalent of the constraint \(x_i - \sum _{\phi _r \in Pos(i)} c_r \ge 0\). Since \(\varPhi _r \in Pos(i)\), the literal \(-c_r\) is used by a constraint in R. Thus, \(C(r) \ge 1\).

  5. 5.

    If T(r) is the literal \(\lnot x_i\), then the variable \(x_i\) is assigned false by \(\mathbf{x}\). Thus, by construction, R contains the equivalent of the constraint \(x_i - \sum _{\phi _r \in Neg(i)} c_r \ge 0\). Since \(\varPhi _r \in Neg(i)\), the literal \(-c_r\) is used by a constraint in R. Thus, \(C(r) \ge 1\). Consequently, for each clause \(\phi _r \in \varPhi \), \(1 \le C(r) \le 3\).

  6. 6.

    For each clause \(\phi _r \in \varPhi \), add the constraint \(c_r \ge 1\) and \((C(r)-1)\) copies of the constraint \(c_r \ge 0\) to R. Note that \(\mathbf{H}\) has 2 copies of the constraint \(c_r \ge 0\) and that \(0 \le C(r)-1 \le 2\).

It is easy to see that summing all of the constraints in R results in a contradiction of the form \(0 \ge 1\). Thus, R is a read-once refutation of \(\mathbf{H}\).

Now assume that \(\mathbf{H}\) has a refutation R that uses no constraint more than once. We construct an assignment \(\mathbf{x}\) to \(\varPhi \) as follows: for each variable \(x_i\), if R contains a constraint of the form \(x_i - z_{i,j_1} \ge 0\) such that \(\phi _{j_1} \in Pos(x_i)\), then set \(x_i\) to true. Otherwise set \(x_i\) to false.

We make the following observations about R:

  1. 1.

    If the constraint \(- x_1 -y_1 \ge 0\) is removed from \(\mathbf{H}\), then \(\mathbf{H}\) is feasible. Thus, this constraint must be used by R. To cancel each \(y_i\) variable, all of the constraints \(- x_1 -y_1 \ge 0, \ldots , y_{n'-1} - x_n' \ge 1-m'\) must be in R. These constraints are equivalent to the constraint \(-\sum _{i=1}^{n'} x_i \ge 1-m'\).

  2. 2.

    To get a contradiction, the defining constant of the derived constraint must be positive. Note that the only constraints with positive defining constant in \(\mathbf{H}\) are of the form \(c_r \ge 1\). There are \(m'\) such constraints, thus they must all be used by R.

  3. 3.

    Consider the constraint \(c_r \ge 1\). The only constraints with \(-c_r\) in \(\mathbf{H}\) are of the form \(z_{i,r} - c_r - z_{i,r'} \ge 0\) where \(\phi _r \in Pos(i)\) or \(\phi _r \in Neg(i)\). Thus, R must contain a constraint of this form.

  4. 4.

    If R contains a constraint of the form \(z_{i,r} - c_r - z_{i,r'} \ge 0\) where \(\phi _r \in Pos(i)\), then to cancel all the \(z_{i,r}\) variables R must contain the constraints \(x_i - z_{i,j_1} \ge 0, \ldots , z_{i,j_{|Pos(i)|}} - c_{j_{|Pos(i)|}} \ge 0\). Thus, \(x_i\) is set to true by \(\mathbf{x}\). Note that \(\phi _r \in Pos(i)\) if and only if \(\phi _r\) contains the literal \(x_i\). Thus, \(\phi _r\) is satisfied by \(\mathbf{x}\).

  5. 5.

    If R contains a constraint of the form \(z_{i,r} - c_r - z_{i,r'} \ge 0\) where \(\phi _j \in Neg(i)\), then to cancel all the \(z_{i,r}\) variables R must contain the constraints \(x_i - z_{i,j'_1} \ge 0, \ldots , z_{i,j'_{|Neg(i)|}} - c_{j'_{|Neg(i)|}} \ge 0\). By construction, \(\mathbf{H}\) has only one constraint with the literal \(-x_i\). Thus, R cannot contain both the constraint \(x_i - z_{i,j'_1} \ge 0\) and a constraint of the form \(x_i - z_{i,j_1} \ge 0\) such that \(\phi _{j_1} \in Pos(x_i)\). Consequently, \(x_i\) is set to false by \(\mathbf{x}\). Note that \(\phi _r \in Neg(i)\) if and only if \(\phi _r\) contains the literal \(\lnot x_i\). Thus, \(\phi _r\) is satisfied by \(\mathbf{x}\).

Note that \(\mathbf{x}\) satisfies every clause in \(\varPhi \). Thus, \(\varPhi \) is satisfiable.    \(\square \)

Theorem 4

The CC\(_D\) problem for width 3 HCSs is NP-complete.

Proof

For each integer k we can establish that the copy complexity of a width 3 HCS is at most k by providing a Farkas vector \(\mathbf{y}\) such that \(||\mathbf{y}||_\infty \le k\). Note that \(||\mathbf{y}||_\infty \) is the largest element of \(\mathbf{y}\) and is called the \(L_\infty \) norm of \(\mathbf{y}\). Thus, the CC\(_D\) problem for width 3 HCSs is in NP.

From Lemma 2, we have that given a 3-CNF formula \(\varPhi \), we can construct a width 3 HCS \(\mathbf{H}\) such that \(\mathbf{H}\) has copy complexity 1 if and only if \(\varPhi \) is feasible. Thus, the CC\(_D\) problem for width 3 HCSs is NP-complete.    \(\square \)

The result in Theorem 4, relies on the fact that the problem of determining if a width 3 HCS has copy complexity 1 is NP-complete. However, this result can be extended to any fixed positive integer C.

Theorem 5

Let C be a positive integer. The problem of determining if a width 3 HCS has copy complexity at most C is NP-complete.

Proof

Let \(\varPhi \) be a 3-CNF formula and let \(\mathbf{H}\) be the HCS constructed from \(\varPhi \). From Lemma 2, we know that \(\mathbf{H}\) has copy complexity 1 if and only if \(\varPhi \) is satisfiable.

We can construct an HCS \(\mathbf{H'}\) from \(\mathbf{H}\) as follows:

  1. 1.

    Initially, \(\mathbf{H'} = \mathbf{H}\).

  2. 2.

    Let \(E \subseteq \mathbb {N}\) be such that \(\sum _{i\in E} 2^{i} = C\). For each \(k=1 \ldots |E|\), let E(k) be the \(k^{th}\) element of E.

  3. 3.

    For each constraint \(l_j\) of \(\mathbf{H}\):

    1. (a)

      Create the variables \(g_{j,1}\) through \(g_{j,2 \cdot \lfloor \log C \rfloor +1}\) and the constraint \(g_{j,1} \ge 0\). Additionally, create the constraints

      \(g_{j,2 \cdot l+1} - g_{j,2 \cdot l} - g_{j,2 \cdot l-1} \ge 0\) and \(g_{j,2 \cdot l} - g_{j,2 \cdot l-1} \ge 0\) for \(l=1 \ldots \lfloor \log C \rfloor \). Let \(S_j\) denote this set of constraints.

    2. (b)

      Create the variables \(e_{j,k}\) for \(k=0 \ldots |E|\). Additionally, create the constraints \(e_{j,0} - e_{j,1} \ge 0\), \(e_{j,1} - g_{j,2 \cdot E(1)+1} - e_{j,2} \ge 0, \ldots , e_{j,|E|-1} - g_{j,2 \cdot E(|E|-1)+1} - e_{j,|E|} \ge 0\), and \(e_{j,|E|} - g_{j,2 \cdot E(|E|)+1} \ge 0\).

    3. (c)

      Add the literal \(-e_{j,0}\) to the constraint \(l_j\).

We will now show that \(\mathbf{H'}\) has copy complexity at most C if and only if \(\mathbf{H}\) has copy complexity 1.

First assume that \(\mathbf{H}\) has copy complexity 1. Let R be a read-once refutation of \(\mathbf{H}\). We construct a refutation \(R'\) of \(\mathbf{H'}\) as follows:

  1. 1.

    Add each constraint used by R to \(R'\).

  2. 2.

    For each constraint \(l_j\) used by R, add the constraints \(e_{j,0} - e_{j,1} \ge 0, \ldots , e_{j,|E|} - g_{j,2 \cdot E(|E|)+1} \ge 0\) to \(R'\). Additionally, add enough copies of the constraints in \(S_j\) to cancel all of the \(g_{j,l}\) variables. From Theorem 3, this requires \(\sum _{i\in E} 2^{i} = C\) copies of the constraint \(g_{j,1} \ge 0\).

Note that \(R'\) is a refutation of \(\mathbf{H'}\) that uses each constraint at most C times. Thus, \(\mathbf{H'}\) has copy complexity at most C.

Now assume that \(\mathbf{H'}\) has a refutation that uses no constraint more than C times. We construct a read-once refutation R of \(\mathbf{H}\) as follows: for each constraint \(l_j\) in \(\mathbf{H'}\) used by \(R'\) add the corresponding constraint \(l_j\) from \(\mathbf{H}\) to R. By construction, the remaining constraints in \(R'\) are used to eliminate the variable \(e_{j,0}\) from each constraint \(l_j\). Since these variables do not exist in \(\mathbf{H}\), R is a refutation of \(\mathbf{H}\). All that remains is to show that no constraint \(l_j\) is used more than once by \(R'\).

Assume that for some j, the constraint \(l_j\) is used \(r >1\) times by \(R'\). Thus, by construction, the constraint \(e_{j,0} - e_{j,1} \ge 0\) must also be used r times by \(R'\). From Theorem 3, this means that the constraint \(g_{j,1} \ge 0\) needs to be used at least \(r \cdot C > C\) times by \(R'\). This contradicts the fact that R used no constraint more than C times. Thus, each constraint \(l_j\) is used at most once. Consequently, R is a read-once refutation of H.    \(\square \)

Since the CC\(_D\) problem is in NP, there exists a \(2^{p(m,n)}\) algorithm for this problem, where p(mn) is some polynomial in m and n. We now show that there cannot be a \(2^{o(n)}\) algorithm for the CC\(_D\) problem for width 3 HCSs unless the Exponential Time Hypothesis (ETH) fails [5, 6].

The ETH states that for each \(k \ge 3\), there exists a constant \(s_k >0\) such that k-SAT cannot be solved in time less than \(O(2^{s_k \cdot n})\). In particular, this precludes a \(2^{o(n)}\) time algorithm for 3-SAT. We now utilize the reduction used by Lemma 2 to establish a likely lower bound on the running time on any algorithm for solving the copy complexity problem for width 3 HCSs.

Theorem 6

There cannot be a \(2^{o(n)}\) algorithm for the CC\(_D\) problem for width 3 HCSs unless the ETH fails.

Proof

From Lemma 2, if there is a \(2^{o(n)}\) time algorithm for the copy complexity problem for HCSs, then there is a \(2^{o(n'+m')}\) algorithm for 3-CNF feasibility. This violates the ETH [5, 6]. Thus, it is unlikely that a \(2^{o(n)}\) time algorithm exists for the CC\(_D\) problem for HCSs.    \(\square \)

We now show that the problem of finding the copy complexity of a width 3 HCS is NPO complete [1]. We do this by a reduction from the Weighted Min-Ones problem.

The Weighted Min-Ones problem is defined as follows: Given a 3CNF formula \(\varPhi \) and positive integer valued variable weight function w, what is the satisfying assignment to \(\varPhi \) with least weight of variables set to true. This problem is known to be NPO-complete [11].

Let \(\varPhi \) be a CNF formula with m clauses over n variables where each variable \(x_i\) has weight w(i). Additionally, let W be the target weight. We construct the corresponding HCS \(\mathbf{H}\) as follows:

  1. 1.

    Let \(w_{max}\) be the largest weight of any variable \(x_i\) of \(\varPhi \). Additionally let \(f= \lfloor \log w_{max} \rfloor \).

  2. 2.

    For each variable \(x_i\) of \(\varPhi \), create the variables \(x_i\), \(t_i\) and \(y^+_i\).

  3. 3.

    Create the constraints \(-x_1 - t_1 \ge 0\), \(t_1 - x_2 - t_2 \ge 0, \ldots , t_{n-2} - x_{n-1} - t_{n-1} \ge 0\), and \(t_{n-1} - x_n \ge 1-m\). Let S be the set containing these constraints. Note that these constraints are the only constraints to use the variables \(t_i\) for \(i=1 \ldots (n-1)\). If any constraint in S is used more times by a refutation R of \(\mathbf{H}\) than any other constraint in S, then there would be a variable \(t_i\) left over in the resultant constraint. Thus, any refutation of \(\mathbf{H}\) must use all of these constraints an equal number of times. Note that together these constraints are equivalent to the constraint \(-\sum _{i=1}^n x_i \ge m-1\).

  4. 4.

    For each variable \(x_i\), let P(i) be the number of clauses in \(\varPhi \) containing the literal \(x_i\), and let N(i) be the number of clauses in \(\varPhi \) containing the literal \(\lnot x_i\). Create the variables \(z^+_{i,l}\) and \(t^+_{i,l}\) for \(l=1 \ldots P(i)\) and the variables \(z^-_{i,l}\) and \(t^-_{i,l}\) for \(l=1 \ldots N(i)\).

  5. 5.

    For each variable \(x_i\) of \(\varPhi \), create the constraints \(x_i \ge 0\), \(x_i - t^-_{i,1} \ge 0\), \(t^-_{i,1} - z^-_{i,1} \ge 0\), \(t^-_{i,1} - z^-_{i,1} - t^-_{i,2} \ge 0, \ldots , t^-_{i,N(i)-1} - z^-_{i,N(i)-1} \ge 0\), \(t^-_{i,N(i)-1} - z^-_{i,N(i)-1} - t^-_{i,N(i)} \ge 0\), and \(t^-_{i,N(i)} - z^-_{i,N(i)} \ge 0\). For each \(l=0 \ldots N(i)\), let \(S^-_{i,l}\) be the set:

    $$\begin{aligned} \{x_i - t^-_{i,1} \ge 0, t^-_{i,1} - z^-_{i,1} - t^-_{i,2} \ge 0, \ldots , t^-_{i,l} - z^-_{i,l} \ge 0\}. \end{aligned}$$

    Note that the constraints in \(S^-_{i,l}\) are equivalent to the constraint

    \(x_i - \sum _{j=1}^l z^-_{i,j} \ge 0\).

  6. 6.

    For each variable \(x_i\) of \(\varPhi \), create the constraints \(x_i - y^+_{i} \ge 0\), \(x_i - y^+_{i} - t^+_{i,1} \ge 0\), \(t^+_{i,1} - z^+_{i,1} \ge 0\), \(t^+_{i,1} - z^+_{i,1} - t^+_{i,2} \ge 0, \ldots , t^+_{i,P(i)-1} - z^+_{i,P(i)-1} \ge 0\), \(t^+_{i,P(i)-1} - z^+_{i,P(i)-1} - t^+_{i,P(i)} \ge 0\), and \(t^+_{i,P(i)} - z^+_{i,P(i)} \ge 0\). For each \(l=0 \ldots P(i)\), let \(S^+_{i,l}\) be the set:

    $$\begin{aligned} \{x_i - y^+_i - t^+_{i,1} \ge 0, t^+_{i,1} - z^+_{i,1} - t^+_{i,2} \ge 0, \ldots , t^+_{i,l} - z^+_{i,l} \ge 0\}. \end{aligned}$$

    Note that the constraints in \(S^+_{i,l}\) are equivalent to the constraint \(x_i - y^+ - \sum _{j=1}^l z^+_{i,j} \ge 0\).

  7. 7.

    For each clause \(\phi _j \in \varPhi \), create the variables \(c_j\) and \(d_j\). Additionally, create the constraint \(c_j - d_j \ge 1\).

  8. 8.

    For each clause \(\phi _j \in \varPhi \), create the variables \(d_{j,1}\) through \(d_{j,2 \cdot \lfloor \log W \rfloor +1}\) and the constraint \(d_{j,1} \ge 0\). Additionally, create the constraints

    \(d_{j,2 \cdot l+1} - d_{j,2 \cdot l} - d_{j,2 \cdot l-1} \ge 0\) and \(d_{j,2 \cdot l} - d_{j,2 \cdot l-1} \ge 0\) for \(l=1 \ldots \lfloor \log W \rfloor \). Let \(S'_j\) denote this set of constraints.

  9. 9.

    Let \(E_W \subseteq \mathbb {N}\) be such that \(\sum _{j\in E_W} 2^{j} = W\). For each \(k=1 \ldots |E_W|\), let E(Wk) be the \(k^{th}\) element of \(E_W\). For each clause \(\phi _j\), create the variables \(h_{j,k}\) for \(k=1 \ldots |E_W|\). Additionally, create the constraints \(d_j - h_{j,1} \ge 0\), \(h_{j,1} - d_{j,2 \cdot E(W,1)+1} - h_{i,2} \ge 0, \ldots , h_{j,|E_W|-1} - d_{j,2 \cdot E(W,|E_W|-1)+1} - h_{j,|E_W|} \ge 0\), and \(h_{j,|E_W|} - d_{j,2 \cdot E(W,|E_W|)+1} \ge 0\).

  10. 10.

    For each clause \(\phi _j \in \varPhi \) and each variable \(x_i\), if the literal \(x_i\) appears in the clause \(\phi _j\), add the constraints \(z^+_{i,l}-c_j \ge 0\) for \(l=1 \ldots P(i)\) to \(\mathbf{H}\). If the literal \(\lnot x_i\) appears in the clause \(\phi _j\), add the constraints \(z^-_{i,l}-c_j \ge 0\) for \(l=1 \ldots N(i)\) to \(\mathbf{H}\).

  11. 11.

    Create the variables \(g_1\) through \(g_{2 \cdot f+1}\) and the constraint \(g_1 \ge 0\). Additionally, create the constraints

    \(g_{2 \cdot l+1} - g_{2 \cdot l} - g_{2 \cdot l-1} \ge 0\) and \(g_{2 \cdot l} - g_{2 \cdot l-1} \ge 0\) for \(l=1 \ldots f\). Let \(S_f\) denote this set of constraints.

  12. 12.

    For each variable \(x_i\) of \(\varPhi \), let \(E_i \subseteq \mathbb {N}\) be such that \(\sum _{j\in E_i} 2^{j} = w(i)\). For each \(k=1 \ldots |E_i|\), let E(ik) be the \(k^{th}\) element of \(E_i\). Create the variables \(e_{i,k}\) for \(k=1 \ldots |E_i|\). Additionally, create the constraints \(y^+_i - e_{i,1} \ge 0\), \(e_{i,1} - g_{2 \cdot E(i,1)+1} - e_{i,2} \ge 0, \ldots , e_{i,|E_i|-1} - g_{2 \cdot E(i,|E_i|-1)+1} - e_{i,|E_i|} \ge 0\), and \(e_{i,|E_i|} - g_{2 \cdot E(i,|E_i|)+1} \ge 0\).

We now show that a CNF formula \(\varPhi \) has a solution in which the total weight of true variables is at most W if and only if the HCS \(\mathbf{H}\) has a copy complexity of at most W.

Lemma 3

Let \(\varPhi \) be a CNF formula with weighted variables and let \(\mathbf{H}\) be the HCS constructed from \(\varPhi \). \(\varPhi \) has a solution in which the total weight of true variables is at most W if and only if \(\mathbf{H}\) has a copy complexity of at most W.

Proof

First, assume that \(\varPhi \) has a solution \(\mathbf{x}\) such that \(W^*=\sum _{x_i : x_i = \mathbf{true}} w(i) \le W\). We will show that \(\mathbf{H}\) has copy complexity at most W by showing that \(\mathbf{H}\) has a refutation R that uses each constraint at most \(W^*\) times. We construct R as follows:

  1. 1.

    Add the constraints in S to R. Recall that these constraints are equivalent to

    \(-\sum _{i=1}^{n} x_i \ge 1-m\).

  2. 2.

    For each clause \(\phi _j \in \varPhi \) let T(j) be a literal in \(\phi _j\) set to true by \(\mathbf{x}\).

  3. 3.

    For each variable \(x_i\), let \(Pos(i) = \{\phi _j : T(j)=x_i\}\), and let \(Neg(i) = \{\phi _j : T(j)=\lnot x_i\}\).

  4. 4.

    For each variable \(x_i\), if \(x_i\) is assigned true by \(\mathbf{x}\), then add the constraints in \(S^+_{i,|Pos(i)|}\) to R. If \(x_i\) is assigned false by \(\mathbf{x}\), then add the constraints in \(S^-_{i,|Neg(i)|}\) to R.

  5. 5.

    For each variable \(x_i\) set to true by \(\mathbf{x}\), add the constraints \(y^+_i - e_{i,1} \ge 0, \ldots , e_{i,|E_i|} - g_{2 \cdot E(i,|E_i|)+1} \ge 0\) to R. Additionally, add enough copies of the constraints in \(S_f\) to cancel all of the \(g_l\) variables. From Theorem 3, this requires \(\sum _{j\in E_i} 2^{j} = w(i)\) copies of the constraint \(g_1 \ge 0\).

  6. 6.

    For each variable \(x_i\) set to true by \(\mathbf{x}\) and for each \(l=1 \ldots |Pos(i)|\), let \(\phi _j\) be the \(l^{th}\) element of Pos(i). Add the constraint \(z^+_{i,l}-c_j \ge 0\) to R.

  7. 7.

    For each variable \(x_i\) set to false by \(\mathbf{x}\) and for each \(l=1 \ldots |Neg(i)|\), let \(\phi _j\) be the \(l^{th}\) element of Neg(i). Add the constraint \(z^-_{i,l}-c_j \ge 0\) to R.

  8. 8.

    Add the constraints \(c_1 -d_1 \ge 1\) through \(c_m - d_m \ge 1\) to R.

  9. 9.

    For each clause \(\phi _j\), add the constraints \(d_j - h_{j,1} \ge 0, \ldots , h_{j,|E_W|} - d_{j,2 \cdot E(W,|E_W|)+1} \ge 0\) to R. Additionally, add enough copies of the constraints in \(S'_j\) to cancel all of the \(d_{j,l}\) variables. From Theorem 3, this requires \(\sum _{l\in E_W} 2^{l} = W\) copies of the constraint \(d_{j,1} \ge 0\).

It is easy to see that summing all of the constraints in R results in a contradiction of the form \(0 \ge 1\). Thus, R is a refutation of \(\mathbf{H}\). Note that the constraints reused by R belong to the sets \(S_f\) and \(S'_j\) for \(j=1 \ldots m\). From Theorem 3, the constraints reused the most are the constraint \(g_1 \ge 0\) and the constraints \(d_{j,1} \ge 0\) for \(j =1 \ldots m\). These constraints are each used at most W times as desired.

Now assume that \(\mathbf{H}\) has a refutation that uses no constraint more than W times. Thus, \(\mathbf{H}\) has a refutation R such that the constraint \(g_1 \ge 0\) is used \(W^* \le W\) times. We construct an assignment \(\mathbf{x}\) to \(\varPhi \) as follows: for each variable \(x_i\), if R contains the constraint \(x_i - y^+_i \ge 0\) or \(x_i - y^+_i - t^+_{i,1} \ge 0\), then set \(x_i\) to true. Otherwise set \(x_i\) to false.

We make the following observations about R:

  1. 1.

    If the constraints in S are removed from \(\mathbf{H}\), then \(\mathbf{H}\) is feasible, thus these constraints must be used by R. Recall that these constraints are equivalent to \(-\sum _{i=1}^{n} x_i \ge 1-m\).

  2. 2.

    To get a contradiction, the defining constant of the derived constraint must be positive. Note that the only constraints with positive defining constant in \(\mathbf{H}\) are of the form \(c_j -d_j \ge 1\). As noted previously, eliminating \(d_j\) from each of these constraints requires W copies of the constraint \(d_{j,1} \ge 0\). Thus, each of these constraints is used at most once by R. There are m such constraints, thus they must all be used by R. Consequently, the constraints in S can be each used at most once by R.

  3. 3.

    Consider the constraint \(c_j-d_j \ge 1\). The only constraints with \(-c_j\) in \(\mathbf{H}\) are of the form \(z^-_{i,l}-c_j \ge 0\) and \(z^+_{i,l}-c_j \ge 0\). Thus, R must contain a constraint of this form.

  4. 4.

    If R contains a constraint of the form \(z^-_{i,l}-c_j \ge 0\), then it must contain the constraint \(t^-_{i,l} - z^-_{i,l} \ge 0\) or \(t^-_{i,l} - z^-_{i,l} - t^-_{i,l+1}\ge 0\). To cancel the \(t^-_{i,l}\) variables, R must include the constraint \(x_i - t^-_{i,1} \ge 0\). This constraint cancels the variable \(x_i\) from the constraint \(-\sum _{i=1}^{n} x_i \ge 1-m\). Thus, the constraints \(x_i - y^+_i \ge 0\) and \(x_i - y^+_i - t^+_{i,1} \ge 0\) cannot be in R. This means that \(x_i\) is set to false by \(\mathbf{x}\). Note that the constraint \(z^-_{i,l}-c_j \ge 0\) is in \(\mathbf{H}\) if and only if \(\phi _j\) contains the literal \(\lnot x_i\). Thus, \(\phi _j\) is satisfied by \(\mathbf{x}\).

  5. 5.

    If R contains a constraint of the form \(z^+_{i,l}-c_j \ge 0\), then it must contain the constraint \(t^+_{i,l} - z^+_{i,l} \ge 0\) or \(t^+_{i,l} - z^+_{i,l} - t^+_{i,l+1}\ge 0\). To cancel the \(t^+_{i,l}\) variables, R must include the constraint \(x_i - y^+_i - t^+_{i,1} \ge 0\). This means that \(x_i\) is set to true by \(\mathbf{x}\). Note that the constraint \(z^-_{i,l}-c_j \ge 0\) is in \(\mathbf{H}\) if and only if \(\phi _j\) contains the literal \(\lnot x_i\). Thus, \(\phi _j\) is satisfied by \(\mathbf{x}\).

  6. 6.

    As observed previously, canceling \(y^+_i\) from the constraint \(x_i-y_i^+ - z^+_{i,1} \ge 0\) takes at least w(i) uses of the constraint \(g_1 \ge 0\). Thus,

    \(\sum _{x_i : x_i = \mathbf{true}} w(i) \le W^* \le W\) as desired.    \(\square \)

Using Lemma 3, we now show that the CC\(_{Opt}\) problem for width 3 HCSs is NPO-complete.

Theorem 7

The CC\(_{Opt}\) problem for width 3 HCSs is NPO-complete.

Proof

The copy complexity of an HCS can be verified in polynomial time by providing the Farkas vector. Thus, the CC\(_{Opt}\) problem is in NPO. All that remains is to show NPO-hardness.

Let \(\varPhi \) be a CNF formula with m clauses over n variables where each variable \(x_i\) has weight w(i). Using the construction in this section, we can construct a corresponding width 3 HCS \(\mathbf{H}\). From Theorem 3, this HCS has a copy complexity of at most W if and only if \(\varPhi \) has a solution in which the total weight of true variables is at most W. This is a strict (and hence) PTAS reduction [11]. Consequently, the CC\(_{Opt}\) problem for width 3 HCSs is NPO-complete.    \(\square \)

Since the CC\(_{Opt}\) problem for width 3 HCSs is NPO-complete, this problem cannot be approximated to within a polynomial factor unless P \(=\) NP [7].

5 Conclusion

In this paper, we analyzed the problem of determining bounds on the copy complexity bounds of HCSs. We showed that for any HCS, the copy complexity cannot exceed \(2^{n-1}\), where n is the number of variables in the HCS. We also showed that for each n, there exists a family of width 3 HCSs with copy complexity \(2^{\lfloor \frac{n}{2} \rfloor }\). Additionally, we showed that the CC\(_D\) problem for width 3 HCSs is NP-complete.

From our perspective, the following avenues are worth pursuing:

  1. 1.

    The focus of this paper has been copy complexity with respect to the ADD refutation system. However, additional inference rules exist which allow for constraints to be multiplied by and divided by positive integers. We hope to replicate the analysis in this paper when we allow for the use of these additional inference rules.

  2. 2.

    The goal of this paper was to focus on the copy complexity of HCSs. In some refutation models, the goal is not so much to minimize the copy complexity, but to minimize the total number of distinct constraint replications. In other words, the first replication has a cost associated with it, but all other replications are gratis. It would be interesting to study HCSs in this model.