Keywords

1 Introduction

Given a Boolean formula in conjunctive normal form(CNF), the satisfiability problem (SAT), which is a prototype of many NP-complete problems, asks for the existence of a satisfying assignment to the formula. In the past decades, SAT has been one of the most active and prolific research areas. Many problems, such as planning [21] and Pseudo-Boolean Constraints, can be translated into SAT [10]. Recently, the success of SAT research has led to exploring its optimization formalisms, such as the maximum satisfiability problem (MaxSAT) [46, 8, 12, 15, 19, 23] and the minimum satisfiability problem (MinSAT) [14, 16, 17]. MaxSAT asks for a Boolean assignment to maximize the number of satisfied clauses, while MinSAT asks for a Boolean assignment to minimize the number of satisfied clauses.

MaxSAT is considered as one of the fundamental combinatorial optimization problems, with close ties to important problems like max cut or max clique, and with applications in scheduling, routing, etc. For MaxSAT, there is a long tradition of theoretical works, e.g. [6, 12, 15, 19]. Moreover, Coppersmith et al. consider the phase transitions of random Max\(k\)SAT (\(k\ge 2\)) problem, where each clause contains two literals and is selected independently and randomly [9]. They demonstrate that, with increasing of \(r\), i.e. the ratio between number of clauses and number of variables, the expected number of unsatisfied clauses under an optimal assignment quickly changes from \(\varTheta (1/n)\) to \(\varTheta (n)\). Furthermore, they provide the upper and lower bounds of the maximum number of satisfied clauses for MaxSAT. Xu et al. improve the upper bound of Max2SAT by the first moment argument via correcting error items [24]. Achlioptas et al. first studied the \(p\)-satisfiable problem while there exists a truth assignment satisfying a fraction of \(1-2^{-k}+p2^{-k}\) of all clauses. They introduced weighting second moment method to prove the upper and lower bound of \(r\)(clauses/variables) [2]. Zhou et al. improve the lower bound by giving a different weight to the truth assignment if exactly one of \(k\) literals in a clause is satisfied [26].

The research of MaxSAT leads to increasing interest in its counterpart, MinSAT, which is introduced by Kohli, Krishnamurti and Mirchandani in [14]. They show that MinSAT is NP-complete, even when the formula is a 2-CNF formula, i.e. each clause of which contains at most two literals, or a Horn formula, i.e. each clause of which contains at most one positive variable. They also analyze the performances of deterministic greedy and probabilistic greedy heuristics for MinSAT. A reduction from MinSAT to the minimum vertex cover (MinVC) problem is given in [20], to improve the approximation ratio of MinSAT to \(2\). A simple randomized \(1.1037\)-approximation algorithm for Min\(2\)SAT, and a \(1.2136\)-approximation algorithm for Min3SAT, are given by Avidor and Zwick in [1]. The first exact algorithm for MinSAT is by encoding MinSAT to MaxSAT and solving it with a MaxSAT solver [16]. A branch and bound algorithm for solving MinSAT is proposed by Li et al. in [17]. Their experiments show that solving problems like MaxClique and combinatorial auction problems, is faster by encoding them to MinSAT than reducing them to MaxSAT.

Compared with MaxSAT, there is a lack of knowledge about the bounds of random MinSAT. A \(k\)-CNF of \(q\)-satisfiable asks for a truth assignment satisfying a fraction of \(1-2^{-k}-q(1-2^{-k})\) of all clauses. In this paper, we give upper and lower bounds of \(r\) for \(k\)-CNF to be \(q\)-satisfiable. The upper bound is obtained by the first moment method, while the proof of lower bound is weighted second moment used by [2].

We also present experimental results to demonstrate the tightness of these lower and upper bounds. For MinSAT, the experimental results explain the excellent performance of the state-of-the-art solver MinSatz. Moreover, we investigate the relationship between MaxSAT and MinSAT, and propose a conjecture about the expected sum of the maximum number and the minimum number of satisfied clauses for a random SAT instance.

This paper is organized as follows. In the next section, we review some basic definitions about SAT and MinSAT. Then the lower and upper bounds of \(k\)-CNF to be \(q\)-satisfiable are proved respectively. After that, experimental results are presented, as well as a discussion on the relationship between MaxSAT and MinSAT suggested by the experimental results. Finally, we conclude our work and point out future research directions.

2 Preliminaries

A Boolean formula in conjunctive normal form \(F\) is a set of clauses \(\{C_1,C_2,...,C_m\}\), where \(m\) is the number of clauses in \(F\). A clause is a disjunction of literals, \({x_{i_1}} \vee {x_{i_2}} \vee \cdots \vee {x_{i_k}}\), where \(k\) is the length of the clause. A literal is either a Boolean variable \(x\) or its negation \(\overline{x}\). The SAT problem is to determine the existence of an assignment satisfying all the clauses. If there is no assignment to satisfy all the clauses, a natural but more practical question is, how far or how close can one get to satisfiability? This is the optimization version of SAT, which is to find an assignment satisfying the most or the least number of clauses.

Definition 1

(MinSAT) Given a SAT instance \(F\), MinSAT asks for an assignment to all the variables such that the minimum number of clauses are satisfied. This number is called the value of the MinSAT instance.

Definition 2

(\(q\)-satisfiable) Given a \(k\)-CNF formula \(F\), if there exists a truth assignment satisfying a fraction of \(1-2^{-k}-q(1-2^{-k})\) \((0<q<1)\) of all clauses, it is \(q\)-satisfiable.

For simplicity, we use \(x\) in lieu of \(\left\lfloor x \right\rfloor \), the largest integer no more than \(x\). The following standard asymptotic notations will be used in this paper.

$$\begin{aligned} \mathop {\lim }\limits _{x \rightarrow \infty } \frac{{f(x)}}{{g(x)}} = 0\ \Rightarrow \ f(x) = o(g(x)) \end{aligned}$$
$$\begin{aligned} \mathop {\lim }\limits _{x \rightarrow \infty } \frac{{f(x)}}{{g(x)}} = 1\ \Rightarrow \ f(x) \simeq g(x) \end{aligned}$$
$$\begin{aligned} \mathop {\lim }\limits _{x \rightarrow \infty } \text {sup}\frac{{f(x)}}{{g(x)}} \le M(M > 0)\ \Rightarrow \ f(x) = O(g(x)) \end{aligned}$$

Especially, while \(M=1\), \(f(x)\lesssim g(x)\) means \(f(x)\) is less than or equal to \(g(x)\) asymptotically.

3 The Upper Bound

Given a \(k\)-CNF instance \(F\) to be \(q\)-satisfiable. Let \(P_r\) be the probabilistic distribution and let \(N\) denote the solutions number of the instance.

Theorem 1

Let \(r>\frac{2\ln 2}{q^2(2^{k}-1)}\), we have

$$\lim _{n\rightarrow \infty }Pr[F\ is\ q-satisfiable]=0$$

Proof. The expected value of \(N\), denoted as \(E(N)\), is given by

$$\begin{aligned} E(N)={2^n}\sum \limits _{i=0}^{\rho rn}{rn \atopwithdelims ()i} {\left( \frac{1}{2^k}\right) ^{rn - i}}{\left( 1-\frac{1}{2^k}\right) ^i} \end{aligned}$$

where \(\rho =1-2^{-k}-q(1-2^{-k})\).

The last term is maximized for \(q \in (0,1)\), so \(E(N)\) is upper bounded by

$$\begin{aligned} E(N) \le 2^n {(\rho rn+1)}{rn \atopwithdelims ()\rho rn} {\left( \frac{1}{2^k}\right) ^{rn - \rho rn}}{\left( 1-\frac{1}{2^k}\right) ^{\rho rn}} \end{aligned}$$

According to Stirling’s formula \(n! \simeq {(n/e)^{n}}O(n)\), and

$$\begin{aligned} {rn \atopwithdelims ()\rho rn} \simeq {\left( {\rho ^{ - \rho }}{(1 - \rho )^{\rho - 1}}\right) ^{rn}} , \end{aligned}$$

We have

$$\begin{aligned} E(N)\le {2^n}{(\rho rn+1)}{\left( {\rho ^{ - \rho }}{(1 - \rho )^{\rho - 1}}{\left( \frac{1}{2^k}\right) ^{1 - \rho }}{\left( 1-\frac{1}{2^k}\right) ^\rho }\right) ^{rn}} \end{aligned}$$

It is easy to prove that \(E(N)<0\) while \(r>\frac{2\ln 2}{q^2(2^{k}-1)}\). By the Markov inequality \(Pr(SAT)\le E(N)\), the upper bound is obtained.

4 The Lower Bound

Generally, the standard second moment method can be used to prove the lower bound for random problems, such as SAT, CSP. Unfortunately, it fails by using to optimization problem like MaxSAT. To cover this problem, Achlioptas provide a weighting scheme of the second moment to improve the lower bound of kSAT [3] and MaxSAT [2]. We following this line to prove the lower bound of \(q\)-satisfiable problem.

For any truth assignment \(\sigma \in \{0,1\}^n\), let

$$\begin{array}{lll} &{}H=H(\sigma ,F)=Unsat(l,\sigma )-Sat(l,\sigma )\\ &{}S=S(\sigma ,F)=Sat(c,\sigma ) \end{array}$$

where \(Unsat(l,\sigma )(Sat(l,\sigma ))\) is the number of unsatisfied(satisfied) literal, \(Sat(c,\sigma )\) is the number of satisfied clauses.

Let \(N\) defined as the number of solutions to \(q\)-satisfiable problem. Define

$$N=\varSigma _\sigma {\gamma ^{H(\sigma ,F)}\eta ^{S(\sigma ,F)-s_0rn}}$$

where \(s_0=1-q-(1-q)2^{-k}\), \(0<\gamma ,\eta <1\).

Using the following two functions,

$$\begin{array}{lll} &{}f(\alpha ,\gamma ,\eta )\\ =&{}\eta ^{-2s_0}E[\gamma ^{H(\sigma ,F)+S(\tau ,F)}\eta ^{H(\sigma ,F)+S(\tau ,F)}]\\ =&{}\eta ^{2s_0}\left[ {\left( \alpha (\frac{\gamma ^2+\gamma ^{-2}}{2})+1-\alpha \right) ^k} \right. \\ &{} \left. {-~2(1-\eta )\left( \left( \alpha (\frac{\gamma ^2+\gamma ^{-2}}{2})+1-\alpha \right) ^k-\left( \frac{\alpha \gamma ^{-2}+1-\alpha }{2}\right) ^k\right) }\right. \\ &{} \left. {+~(1-\eta )^2\left( \left( \left( \alpha (\frac{\gamma ^2+\gamma ^{-2}}{2})+1-\alpha \right) ^k-2\left( \frac{\alpha \gamma ^{-2}+1-\alpha }{2}\right) ^k\right) +2^{-k}\left( \alpha \gamma ^{-2}\right) ^k\right) } \right] \end{array}$$

Let

$$g_r(\alpha ,\gamma ,\eta )=\frac{f(\alpha ,\gamma ,\eta )^r}{\alpha ^\alpha (1-\alpha )^{1-\alpha }}$$

We have

$$\begin{array}{lll} E^2[N]&=\frac{2^n}{\eta ^{s_0rn}}{\left[ (\frac{\gamma +\gamma ^{-1}}{2})^k-(1-\eta )[\frac{\gamma +\gamma ^{-1}}{2})^k-(2\gamma )^{-k}]\right] }^{rn}=\left( 2g_r\left( \frac{1}{2},\gamma ,\eta \right) \right) ^n \end{array}$$

Theorem 2

Given a \(k\)-CNF formula \(F\), if \(r\le \frac{\ln 2}{(q+(1-q)\ln (1-q))2^k}(1-O(k2^{-k}))\),

$$\lim _{n\rightarrow \infty }[F\ is\ q-satisfiable]=1.$$

The proof is similar to the lower bound for MaxSAT in [2], so we ignore here.

Fig. 1.
figure 1

The upper and lower bound for \(q\)-satisfiable

5 Experimental Results

We conduct experiments to \(k\)-CNF formula to be \(q\)-satisfiable, the results are presented in Fig. 1. The upper bound (upper) is proved by the first moment method, while the proof of the lower bound (lower) is weighted scheme of the second moment method. We also give a poor lower bound (lower*) by algorithm analysis, which was used to MaxSAT in [9].

Fig. 2.
figure 2

The bounds of \(n = 80\) for Min3SAT

From Fig. 1, we can see that the space between upper bond and lower bound become smaller as \(q\) increase. That is to say, the bounds is tighter while the number of satisfying clauses is less. Based on this, the lower and upper bounds of Min\(k\)SAT is presented in Fig. 2, which indicate that the bounds for MinSAT (\(k=3\)) provided by this paper is very tight. However, the second moment is poor to upper bound for Min2SAT, so algorithm analysis is considered in Fig. 3.

Fig. 3.
figure 3

The bounds of \(n = 120\) for Min2SAT

In the second experiment, we apply our work to the state-of-the-art solver MinSatz. In MinSatz, UB is the largest number of falsified clauses while extending the current partial assignment to a complete one, and LB is the number of clauses falsified in the best assignment found so far. if LB \( < \) UB, MinSatz select a variable and instantiate it, otherwise, the solver backtracks. Besides, MinSatz introduces both clique partitioning algorithms and MaxSAT technology to improve the UB so as to prune the search tree quickly. However, we focus on the other side, and give LB an initial number so as to reduce the branching number. We conduct an experiment to test the performances before and after adding our work to MinSatz in Table 1. The branching number can be reduced while giving UB a initial value computed by this paper, but the improvements is not so obvious. Further analysis indicate that the genuine MinSatz is an excellent solver for MinSAT. However, if the MinSAT solver is not so excellent as MinSatz, such as a trivial MinSAT solver with no inference rules (minsat), the improvements will be more obvious. The results are presented in Table 2, which indicates that a MinSAT solver with our work outperforms the one without.

Table 1. Comparing with MinSATz with branching number
Table 2. Application to MinSAT solver

Li et al. have found from experiments that the relationship between MaxSAT and MinSAT is counter-intuitive [17, 18]: for the same instances, the bigger the MaxSAT value is, the smaller the MinSAT value is, the opposite is also true. They focus these instances at the threshold (\(c = 4.25\) for 3SAT). We follow this line, and conjecture that, the sum of the MaxSAT and MinSAT value is a constant value. In our experiments, the number of variables in random instances ranges from \(40\) to \(140\), and the density \(r\) considered are \(6\), \(7\) and \(8\). We compare the sum of exact MinSAT value and MaxSAT value (‘sum’ in Table 3) with our conjectured value (‘conjecture’ in Table 3). From experiments, the accuracy of our conjecture is found to be over 99 % (‘accuracy’ in Table 3). This indicates that our conjecture is close to the exact value. For \(k\)SAT instances, the sum of MinSAT value and MaxSAT value is approximately \((2-2^{1-k})cn\). If the relationship between MinSAT and MaxSAT values is clear, this conjecture can be used to get a better upper bound for MaxSAT. In other words, give the exact value of MinSAT, the value of MaxSAT for the same instance can be approximated. Besides, this value is significantly better than the upper bound obtained by [9], see Table 4.

Table 3. The relationship of MinSAT and MaxSAT
Table 4. Conjecture about the upper bound of MaxSAT. ‘MinSAT value’ (‘MaxSAT value’) is the exact value of MinSAT (MaxSAT), ‘ub04’ is the upper bound of MaxSAT obtained by [9], \({ub}^{\star }\) is our guess value.

6 Conclusions and Future Work

We have presented upper and lower bounds of the minimization versions of the SAT problem. For the upper bound, the first moment argument is used, while the lower bound is derived by weighting second moment. The experimental results confirm the correctness and accuracy of our work. Furthermore, we consider the relationship between MinSAT and MaxSAT, and an interesting conjecture is presented. As for future work, the bounds of other optimization problems such as Min-CUT of random graphs [7] and Min-CSP [13] of random CSPs [25] may be considered.