
1 Introduction

Propositional Satisfiability is the problem of determining, for a formula of the propositional calculus, whether there is an assignment of truth values to its variables under which the formula evaluates to true. By SAT, we mean the problem of propositional satisfiability for formulas in conjunctive normal form (CNF) [5]. The SAT problem was the first problem proved to be NP-complete [4], and it plays an important role in computational complexity and artificial intelligence [1]. The problem has been investigated from many directions, including approximation algorithms, randomized algorithms, heuristic algorithms, and exact and parameterized algorithms. In this paper, we study exact algorithms for SAT parameterized by the input length.

To measure the running time bound for the SAT problem, there are three frequently used parameters: the number of variables n, the number of clauses m, and the input length L. The input length L is defined as the sum of the number of literals in each clause. The number of variables n is the most basic parameter. The simple brute-force algorithm that tries all \(2^n\) possible assignments of the n variables gives a running time bound of \(O^*(2^n)\), where the \(O^*\) notation suppresses polynomial factors in the input size. Despite decades of effort, this trivial bound has not been broken. The Strong Exponential Time Hypothesis conjectures that the SAT problem cannot be solved in time \(O^*(c^n)\) for any constant \(c<2\) [11]. For a restricted version, the k-SAT problem (the length of each clause in the formula is bounded by a constant k), better results have been developed. For example, 3-SAT can be solved in \(O^*(1.3279^n)\) time [13], 4-SAT can be solved in \(O^*(1.4986^n)\) time [13], and k-SAT can be solved in \(O^*(c(k)^n)\) time for some value c(k) depending on k [13]. For the parameter m, Monien et al. first gave an algorithm with time complexity \(O^*(1.260^m)\) in 1981 [14]. Later, the bound was improved to \(O^*(1.239^m)\) by Hirsch in 1998 [9], and then to \(O^*(1.234^m)\) by Yamamoto in 2005 [19]. Currently, the best known bound is \(O^*(1.2226^m)\), obtained by Chu, Xiao, and Zhang [3].

The input length L is another important and frequently studied parameter. It is probably the most precise parameter for describing the input CNF-formula. Starting from the first algorithm, with running time bound \(O^*(1.0927^L)\), by Van Gelder in 1988 [16], the bound has been improved several times. In 1997, it was improved to \(O^*(1.0801^L)\) by Kullmann and Luckhardt [12]. In 1998, it was improved to \(O^*(1.0758^L)\) by Hirsch [9], and improved again by Hirsch to \(O^*(1.074^L)\) in 2000 [10]. Then Wahlström gave an \(O^*(1.0663^L)\)-time algorithm in 2005 [17]. In 2009, Chen and Liu [2] used the measure-and-conquer method to analyze the running time bound and further improved the result to \(O^*(1.0652^L)\). We list the major progress and our result in Table 1.

Table 1. Previous and our upper bound for SAT

Our algorithm, as well as most algorithms for the SAT problem, is based on the branch-and-search process. The idea of branch-and-search is simple and practical: for a given CNF-formula \(\mathcal {F}\), we iteratively branch on a variable or literal x into two branches by assigning value 1 or 0 to it. Let \(\mathcal {F}_{x=1}\) and \(\mathcal {F}_{\overline{x}=1}\) be the resulting CNF-formulas after assigning value 1 and 0 to x, respectively. It holds that \(\mathcal {F}\) is satisfiable if and only if at least one of \(\mathcal {F}_{x=1}\) and \(\mathcal {F}_{\overline{x}=1}\) is satisfiable. To get a running time bound, we need to analyze how much the parameter L decreases in each branch. To break some bottlenecks in the direct analysis, some references [2, 17] analyzed their algorithms with respect to other measures and gave the relation between the new measure and L. The current best result [2] was obtained by using the measure-and-conquer method, which also relies on a new measure; that work was the first to bring the measure-and-conquer method to this line of research. In this paper, we further improve the running time bound, again using the measure-and-conquer method. Similar to many measure-and-conquer algorithms, our algorithm and the algorithm in [2] deal with variables from high degree to low degree. The algorithm in [2] carefully analyzed branching operations for variables of degree 4. Our algorithm simplifies the branching operations for degree-4 variables and carefully analyzes the branching operations for degree-5 variables. Finally, we improve the bound to \(O^*(1.0646^L)\).
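
To make the branch-and-search idea concrete, here is a minimal Python sketch of the generic process, without the reduction rules and the careful variable selection used by our algorithm; the encoding of clauses as frozensets of nonzero integers (a negative integer denoting a negated variable, DIMACS style) is an assumption of this sketch only.

```python
def assign(formula, literal):
    # F_{x=1}: drop clauses containing the literal; delete its negation elsewhere.
    return [frozenset(C - {-literal}) for C in formula if literal not in C]

def satisfiable(formula):
    """A minimal branch-and-search sketch: F is satisfiable if and only if
    at least one of F_{x=1} and F_{x=0} is satisfiable."""
    if not formula:
        return True                          # every clause is satisfied
    if any(len(C) == 0 for C in formula):
        return False                         # an empty clause cannot be satisfied
    x = next(iter(next(iter(formula))))      # pick any literal to branch on
    return satisfiable(assign(formula, x)) or satisfiable(assign(formula, -x))

# Example: (x1 or x2) and (not x1) and (not x2 or x3) is satisfiable.
print(satisfiable([frozenset({1, 2}), frozenset({-1}), frozenset({-2, 3})]))
```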

Due to the limited space, the proofs of the lemmas marked with (*) are omitted; they can be found in the full version of this paper [15].

2 Preliminaries

Let \(V = \{x_1, x_2, ..., x_n\}\) denote a set of n boolean variables. Each variable \(x_i\) (\(i \in \{1, 2, ..., n\}\)) has two corresponding literals: positive literal \(x_i\) and negative literal \(\overline{x_i}\) (we use \(\overline{x}\) to denote the negation of a literal x, and \(\overline{\overline{x}}=x\)). A clause on V consists of some literals on V. Note that we allow a clause to be empty. A clause \(\{z_1,z_2,\dots , z_q\}\) is also simply written as \(z_1z_2\dots z_q\). Thus, we use zC to denote the clause containing literal z and all literals in clause C. We also use \(C_1C_2\) to denote the clause containing all literals in clauses \(C_1\) and \(C_2\). We use \(\overline{C}\) to denote a clause that contains the negation of every literal in clause C. That is, if \(C=z_1z_2...z_q\), then \(\overline{C} = \overline{z_1}\overline{z_2}...\overline{z_q}\). A CNF-formula on V is the conjunction of a set of clauses \(\mathcal {F} = \{C_1, C_2, ..., C_m\}\). When we say a variable x is contained in a clause (or a formula), it means that the clause (at least one clause of the formula) contains a literal x or its negative \(\overline{x}\).

An assignment for V is a map \(A: V\rightarrow \{0, 1\}\). A clause \(C_j\) is satisfied by an assignment if and only if there exists at least one literal in \(C_j\) to which the assignment gives the value 1. A CNF-formula is satisfied by an assignment A if and only if each clause in it is satisfied by A. We say a CNF-formula is satisfiable if it can be satisfied by at least one assignment. Assigning value 0 or 1 to a literal means assigning a value to its variable so that the literal evaluates to 0 or 1, respectively.

A literal z is called an (i, j)-literal (resp., an \((i^+, j)\)-literal or \((i^-, j)\)-literal) in a formula \(\mathcal {F}\) if z appears i (resp., at least i or at most i) times and \(\overline{z}\) appears j times in the formula \(\mathcal {F}\). Similarly, we can define \((i, j^+)\)-literal, \((i, j^-)\)-literal, \((i^+, j^+)\)-literal, \((i^-, j^-)\)-literal, and so on. Note that literal z is an (i, j)-literal if and only if literal \(\overline{z}\) is a (j, i)-literal. A variable x is an (i, j)-variable if the positive literal x is an (i, j)-literal. For a variable or a literal x in formula \(\mathcal {F}\), the degree of it, denoted by deg(x), is the number of occurrences of x in \(\mathcal {F}\) plus the number of occurrences of \(\overline{x}\) in \(\mathcal {F}\), i.e., \(deg(x)=i+j\) for an (i, j)-variable or (i, j)-literal x. A d-variable (resp., \(d^+\)-variable or \(d^-\)-variable) is a variable with degree exactly d (resp., at least d or at most d). The degree of a formula \(\mathcal {F}\) is the maximum degree of all variables in \(\mathcal {F}\). For a clause or a formula C, the set of variables whose literals appear in C is denoted by var(C).

The \(\textit{length}\) of a clause C, denoted by |C|, is the number of literals in C. A clause is a k-clause or \(k^+\)-clause if its length is k or at least k, respectively. We use \(L(\mathcal {F})\) to denote the length of a formula \(\mathcal {F}\). It is the sum of the lengths of all clauses in \(\mathcal {F}\), which is also the sum of the degrees of all variables in \(\mathcal {F}\). A formula \(\mathcal {F}\) is called a k-CNF formula if each clause in \(\mathcal {F}\) has length at most k.
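
As a small illustration of these definitions, the following sketch computes deg(x) for every variable and the length \(L(\mathcal {F})\); the integer encoding of literals is the same assumption as in the sketch in Section 1.

```python
from collections import Counter

def degrees(formula):
    # deg(x) = number of occurrences of x plus number of occurrences of its negation.
    occ = Counter(z for C in formula for z in C)
    return {v: occ[v] + occ[-v] for v in {abs(z) for z in occ}}

def formula_length(formula):
    # L(F): the sum of clause lengths, which equals the sum of all variable degrees.
    return sum(len(C) for C in formula)

# Example: F = (x1 or not x2) and (x2 or x3) has L(F) = 4 and deg(x2) = 2.
F = [frozenset({1, -2}), frozenset({2, 3})]
assert formula_length(F) == 4 and degrees(F)[2] == 2
```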

In a formula \(\mathcal {F}\), a literal x is called a neighbor of a literal z if there is a clause containing both z and x. The set of neighbors of a literal z in a formula \(\mathcal {F}\) is denoted by \(N(z, \mathcal {F})\). We also use \(N^{(k)}(z, \mathcal {F})\) (resp., \(N^{(k+)}(z, \mathcal {F})\)) to denote the set of neighbors of z in k-clauses (resp., \(k^+\)-clauses) in \(\mathcal {F}\), i.e., for any \(z'\in N^{(k)}(z, \mathcal {F})\) (resp., \(z'\in N^{(k+)}(z, \mathcal {F})\)), there exists a k-clause (resp., \(k^+\)-clause) containing both z and \(z'\).

3 Branch-and-Search and Measure-and-Conquer

Our algorithm is a standard branch-and-search algorithm, which first applies some reduction rules to reduce the instance as much as possible and then searches for a solution by branching. The branching operations may exponentially increase the running time. We will use a measure to evaluate the size of the search tree generated by the algorithm. For the SAT problem, the number of variables or clauses of the formula is a commonly used measure. More on the fundamentals of branching heuristics for SAT can be found in [1].

We use \(T(\mu )\) to denote the maximum size, i.e., number of leaves, of the search tree generated by the algorithm for any instance with measure at most \(\mu \). For a branching operation that branches on the current instance into l branches with the measure decreasing by at least \(a_i\) in the i-th branch, we get a recurrence relation

$$\begin{aligned} T(\mu )\le T(\mu -a_1)+T(\mu -a_2)+\dots + T(\mu -a_l). \end{aligned}$$

The recurrence relation can also be simply represented by a branching vector \([a_1,a_2,\dots ,a_l]\). The largest root of the function \(f(x) = 1-\sum _{i=1}^lx^{-a_i}\) is called the branching factor of the recurrence. If the maximum branching factor for all branching operations in the algorithm is at most \(\gamma \), then \(T(\mu )=O(\gamma ^\mu )\). If on each node of the search tree, the algorithm runs in polynomial time, then the total running time of the algorithm is \(O^*(\gamma ^\mu )\). For two branching vectors \(\mathbf {a}=[a_1,a_2,\dots ,a_l]\) and \(\mathbf {b}=[b_1,b_2,\dots ,b_l]\), if \(a_i\ge b_i\) holds for all \(i=1,2\dots ,l\), then the branching factor of \(\mathbf {a}\) is not greater than that of \(\mathbf {b}\). For this case, we say \(\mathbf {b}\) dominates \(\mathbf {a}\). This property will be used in many places to simplify some arguments in the paper. More details about analyzing recurrences can be found in the monograph [8].
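
For concreteness, the branching factor of a given vector can be computed numerically; the bisection below is only one convenient way to find the root and is not prescribed by the analysis.

```python
def branching_factor(vector, tol=1e-12):
    """Largest root of f(x) = 1 - sum(x ** -a for a in vector), found by bisection
    (for a vector with at least two positive entries this root is greater than 1)."""
    lo, hi = 1.0, 2.0
    while sum(hi ** -a for a in vector) > 1.0:   # grow hi until f(hi) >= 0
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if sum(mid ** -a for a in vector) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi

# Example: the vector [1, 2] has branching factor (1 + sqrt(5)) / 2, about 1.618.
print(round(branching_factor([1, 2]), 4))
```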

The measure-and-conquer method [7] is a powerful tool for analyzing branch-and-search algorithms. The main idea of the method is to adopt a new measure in the analysis of the algorithm. For example, instead of using the number of variables as the measure, it may assign weights to different variables and use the sum of all variable weights as the measure. This method may be able to capture more structural properties and then lead to further improvements. Nowadays, the fastest exact algorithms for many NP-hard problems have been designed by using this method. In this paper, we will also use the measure-and-conquer method.

We assign a weight to each variable in the formula according to the degree of the variable, via a function \(w:\mathbb {Z}^+ \rightarrow \mathbb {R}^+\), where \(\mathbb {Z}^+\) and \(\mathbb {R}^+\) denote the sets of nonnegative integers and nonnegative reals, respectively. Let \(w_i\) denote the weight of a variable with degree i. A variable with lower degree never receives a higher weight, i.e., \(w_i\ge w_{i-1}\). In our algorithm, the measure of a formula \(\mathcal {F}\) is defined as

$$\begin{aligned} \mu (\mathcal {F}) = \sum _{x} w_{deg(x)}. \end{aligned}$$
(1)

In other words, \(\mu (\mathcal {F})\) is the sum of the weights of all variables in \(\mathcal {F}\). Let \(n_i\) denote the number of i-variables in \(\mathcal {F}\). Then we also have that \(\mu (\mathcal {F})=\sum _{i} w_in_i\).
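
As an illustration, a small sketch of this measure under the weight setting (2) given below; the numeric value used for \(w_3\) is the one reported in Section 7 and serves only as a placeholder here.

```python
from collections import Counter

W3 = 1.9234132344759123          # the value of w_3 reported in Section 7

def weight(d):
    # Weight setting (2): w_1 = w_2 = 0, 0 < w_3 < 2, w_4 = 2 * w_3, w_i = i for i >= 5.
    if d <= 2:
        return 0.0
    if d == 3:
        return W3
    if d == 4:
        return 2 * W3
    return float(d)

def measure(formula):
    # mu(F) as in (1): the sum of w_{deg(x)} over all variables x of F.
    occ = Counter(z for C in formula for z in C)
    return sum(weight(occ[v] + occ[-v]) for v in {abs(z) for z in occ})
```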

One important step is to set the values of the weights \(w_i\). Different values of \(w_i\) generate different branching vectors and factors. We need to find a good setting of \(w_i\) so that the worst branching factor is as small as possible. We will obtain the values of \(w_i\) by solving a quasiconvex program after listing all our branching vectors. However, we pre-specify some requirements on the weights to simplify the arguments. Similar assumptions were used in previous measure-and-conquer algorithms. We set the weights such that

$$\begin{aligned} \begin{aligned}&w_1=w_2=0,\\&0<w_3< 2, w_4= 2w_3, ~\text {and}\\&w_i = i ~\text {for }~ i\ge 5. \end{aligned} \end{aligned}$$
(2)

We use \(\delta _i\) to denote the difference between \(w_i\) and \(w_{i-1}\) for \(i > 0\), i.e., \(\delta _i=w_i-w_{i-1}\). By (2), we have

$$\begin{aligned} w_3=\delta _3=\delta _4. \end{aligned}$$
(3)

We also assume that

$$\begin{aligned} \begin{aligned}&\delta _i \le \delta _{i-1} ~\text {for}~i\ge 3,~\text {and}\\&w_3\ge \delta _5.&\end{aligned} \end{aligned}$$
(4)

Under these assumptions, it holds that \(w_i\le i\) for each i. Thus, we have

$$\begin{aligned} \mu (\mathcal {F}) \le L(\mathcal {F}). \end{aligned}$$
(5)

This tells us that if we can get a running time bound of \(O^*(c^{\mu (\mathcal {F})})\) for a real number c, then we also get a running time bound of \(O^*(c^{L(\mathcal {F})})\) for this problem. To obtain a running time bound in terms of the formula length \(L(\mathcal {F})\), we consider the measure \(\mu (\mathcal {F})\) and show how much the measure \(\mu (\mathcal {F})\) decreases in the branching operations of our algorithm and find the worst branching factor among all branching vectors.

4 The Algorithm

We will first introduce our algorithm and then analyze its running time bound by using the measure-and-conquer method. Our algorithm consists of reduction operations and branching operations. When no reduction operations can be applied anymore, the algorithm will search for a solution by branching. We first introduce our reduction rules.

4.1 Reduction Rules

We have ten reduction rules. They are well known and frequently used in the literature (see [2, 17] for examples), so we omit the correctness proofs of some of them. We introduce the reduction rules in a fixed order, and a reduction rule is applied in our algorithm only when none of the previous reduction rules can be applied to the instance.

R-Rule 1 (Elimination of duplicated literals)

If a clause C contains a duplicated literal z, remove all but one occurrence of z from C.

R-Rule 2 (Elimination of subsumptions)

If there are two clauses C and D such that \(C\subseteq D\), remove clause D.

R-Rule 3 (Elimination of tautology)

If a clause C contains two opposite literals z and \(\overline{z}\), remove clause C.

R-Rule 4 (Elimination of 1-clauses and pure literals)

If there is a 1-clause \(\{x\}\) or a \((1^+, 0)\)-literal x, assign \(x=1\).
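
As an illustration, a minimal sketch of R-Rules 1–4 only (our algorithm applies all ten rules exhaustively and in order); representing clauses as sets makes R-Rule 1 implicit, and an empty clause in the result signals an unsatisfiable formula. This is an assumption-laden sketch, not the implementation used in the paper.

```python
def reduce_basic(formula):
    """R-Rules 2-4 on a set of frozenset clauses (R-Rule 1 is implicit)."""
    changed = True
    while changed:
        changed = False
        # R-Rule 3: remove tautological clauses (containing both z and -z).
        formula = {C for C in formula if not any(-z in C for z in C)}
        # R-Rule 2: remove clauses subsumed by another clause.
        formula = {C for C in formula if not any(D < C for D in formula)}
        # R-Rule 4: a 1-clause {z} or a pure literal z forces z = 1.
        literals = {z for C in formula for z in C}
        forced = {next(iter(C)) for C in formula if len(C) == 1}
        forced |= {z for z in literals if -z not in literals}
        if any(-z in forced for z in forced):
            return {frozenset()}             # contradictory unit clauses: unsatisfiable
        if forced:
            changed = True
            negs = {-z for z in forced}
            formula = {frozenset(C - negs) for C in formula if not (C & forced)}
    return formula

# Example: {x1}, {not x1 or x2}, {x2 or x3} reduces to the empty formula (satisfiable).
print(reduce_basic({frozenset({1}), frozenset({-1, 2}), frozenset({2, 3})}))
```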

Davis-Putnam Resolution, proposed in [6], is a classic and frequently used technique for SAT. Let \(\mathcal {F}\) be a CNF-formula and x be a variable in \(\mathcal {F}\). Assume that the clauses containing literal x are \(xC_1,xC_2,...,xC_a\) and the clauses containing literal \(\overline{x}\) are \(\overline{x}D_1, \overline{x}D_2, ..., \overline{x}D_b\). A Davis-Putnam resolution on x constructs a new CNF-formula \(DP_{x}(\mathcal {F})\) as follows: initially \(DP_{x}(\mathcal {F}) = \mathcal {F}\); add new clauses \(C_iD_j\) for each \(1\le i\le a\) and \(1\le j\le b\); and remove \(xC_1,xC_2,...,xC_a, \overline{x}D_1, \overline{x}D_2,...,\overline{x}D_b\) from the formula. It is known that

Proposition 1

([6]). A CNF-formula \(\mathcal {F}\) is satisfiable if and only if \(DP_{x}(\mathcal {F})\) is satisfiable.

In the resolution operation, each new clause \(C_iD_j\) is called a resolvent. A resolvent is trivial if it contains both a literal and its negation. Since trivial resolvents are always satisfied, we can simply delete them from the instance. So whenever we perform resolutions, we assume that all trivial resolvents are deleted.
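
A minimal sketch of this operation, with trivial resolvents discarded as just described (same assumed clause encoding as in the earlier sketches):

```python
def dp_resolve(formula, x):
    """Davis-Putnam resolution on variable x (Proposition 1), keeping only
    non-trivial resolvents."""
    pos = [C for C in formula if x in C]
    neg = [C for C in formula if -x in C]
    rest = {C for C in formula if x not in C and -x not in C}
    for Ci in pos:
        for Dj in neg:
            resolvent = (Ci - {x}) | (Dj - {-x})
            if not any(-z in resolvent for z in resolvent):
                rest.add(resolvent)
    return rest
```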

R-Rule 5 (Trivial resolution)

If there is a variable x with at most one non-trivial resolvent, then apply resolution on x.

R-Rule 6

([2]). If there are a 2-clause \(z_1z_2\) and a clause C containing both \(z_1\) and \(\overline{z_2}\), then remove \(\overline{z_2}\) from C.

R-Rule 7

If there are two clauses \(z_1z_2C_1\) and \(z_1\overline{z_2}C_2\), where literal \(\overline{z_2}\) appears in no other clauses, then remove \(z_1\) from clause \(z_1z_2C_1\).

Lemma 1

(*) Let \(\mathcal {F}\) be a CNF-formula and \(\mathcal {F}'\) be the resulting formula after applying R-Rule 7 on \(\mathcal {F}\). Then \(\mathcal {F}\) is satisfiable if and only if \(\mathcal {F}'\) is satisfiable.

R-Rule 8

([2]). If there is a 2-clause \(z_1z_2\) and a clause \(\overline{z_1}\overline{z_2}C\) such that literal \(\overline{z_1}\) appears in no other clauses, remove the clause \(z_1z_2\) from \(\mathcal {F}\).

R-Rule 9

([2]). If there is a 2-clause \(z_1z_2\) such that either literal \(z_1\) appears only in this clause or there is another 2-clause \(\overline{z_1}\overline{z_2}\), then replace \(z_1\) with \(\overline{z_2}\) in \(\mathcal {F}\) and then apply R-Rule 3 as often as possible.

R-Rule 10

([2]). If there are two clauses \(CD_1\) and \(CD_2\) such that \(|D_1|, |D_2|\ge 1\) and \(|C|\ge 2\), then remove \(CD_1\) and \(CD_2\) from \(\mathcal {F}\), and add three new clauses xC, \(\overline{x}D_1\), and \(\overline{x}D_2\), where x is a new 3-variable.

This is like the Davis-Putnam resolution in reverse and thus it is correct.

Definition 1 (Reduced formulas)

A CNF-formula \(\mathcal {F}\) is called reduced, if none of the above reduction rules can be applied on it.

Our algorithm first iteratively applies the above reduction rules, in the stated order, until it obtains a reduced formula. We will use \(R(\mathcal {F})\) to denote the resulting reduced formula obtained from \(\mathcal {F}\). Next, we show some properties of reduced formulas.

Lemma 2

(*) In a reduced CNF-formula \(\mathcal {F}\), all variables are \(3^+\)-variables.

Lemma 3

(*) In a reduced CNF-formula \(\mathcal {F}\), if there is a 2-clause xy, then no other clause in \(\mathcal {F}\) contains xy, \(\overline{x}y\), or \(x\overline{y}\).

Lemma 4

(*) In a reduced CNF-formula \(\mathcal {F}\), if there is a clause xyC, then

  1. (i)

    no other clause contains xy;

  2. (ii)

    no other clause contains \(\overline{x}y\) or \(\overline{x}\overline{y}\) if x is a 3-variable.

Lemma 5

(*) In a reduced CNF-formula \(\mathcal {F}\), if there is a (1, i)-literal x and xC is the only clause containing x, then

  1. (i)

    \(|C|\ge 2\);

  2. (ii)

    all variables in C are different from all variables in \(N^{(2)}(\overline{x}, \mathcal {F})\), that is, if \(y\in N^{(2)}(\overline{x}, \mathcal {F})\), then \(y, \overline{y}\notin C\).

4.2 Branching Rules and the Algorithm

After getting a reduced formula, we will search for a solution by branching. In a branching operation, we will generate two smaller CNF-formulas such that the original formula is satisfiable if and only if at least one of the two new formulas is satisfiable. The two smaller formulas are generated by specifying the value of a set of literals in the original formula.

The simplest branching rule is that we pick a variable or literal x from \(\mathcal {F}\) and branch into two branches \(\mathcal {F}_{x=1}\) and \(\mathcal {F}_{x=0}\), where \(\mathcal {F}_{x=1}\) and \(\mathcal {F}_{x=0}\) are the formulas after assigning \(x=1\) and \(x=0\) in \(\mathcal {F}\), respectively. When the picked literal x is a \((1,1^+)\)-literal, we will apply a stronger branching. Assume that xC is the only clause containing x. Then we branch into the two branches \(\mathcal {F}_{x=1, C=0}\) and \(\mathcal {F}_{x=0}\), where \(\mathcal {F}_{x=1, C=0}\) is the resulting formula after assigning 1 to x and 0 to all literals in C in \(\mathcal {F}\). The correctness of this branching operation is easy to observe: we need to assign 1 to x only in the case that all literals in C are assigned 0.

Algorithm 1 (pseudocode figure)

The main steps of our algorithm for the SAT problem are given in Algorithm 1. The algorithm executes one step only when none of the previous steps can be applied. In Step 2, the algorithm first reduces the formula by applying the reduction rules. Step 3 branches on a variable of degree \(\ge 6\) if one exists. Steps 4–8 deal with 5-variables. Note that if Steps 1–7 do not apply, then \(\mathcal {F}\) can be written as \(\mathcal {F}=\mathcal {F}^* \wedge \mathcal {F}'\), where \(\mathcal {F}^*\) is a 3-CNF formula with \(var(\mathcal {F}^*)\) being the set of 5-variables in \(\mathcal {F}\) and \(var(\mathcal {F}^*)\cap var(\mathcal {F}') = \emptyset \). So we can do Step 8. Steps 9–10 deal with 4-variables. When the algorithm reaches the last step, all variables have degree 3, and the algorithm deals with this special case.
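
Since the pseudocode figure is not reproduced here, the following Python sketch reconstructs only the variable-selection priority of the branching steps from the descriptions above and in Section 6; Steps 1–2 (termination and reduction) and the calls to the 3-SAT algorithms of [13] (Step 8) and [18] (Step 11) are indicated only by comments, and the clause encoding is again the assumed DIMACS style.

```python
from collections import Counter

def pick_branching_literal(formula):
    """Reconstructed selection priority of Steps 3-7, 9, and 10 of Algorithm 1.
    For Steps 4 and 9 the (1, d-1)-literal itself is returned, since the
    algorithm applies the stronger branching on it."""
    occ = Counter(z for C in formula for z in C)
    deg = lambda v: occ[v] + occ[-v]
    variables = {abs(z) for z in occ}
    for v in variables:                      # Step 3: a variable of degree >= 6
        if deg(v) >= 6:
            return v
    fives = [v for v in variables if deg(v) == 5]
    for v in fives:                          # Step 4: a (1, 4)-literal
        for z in (v, -v):
            if occ[z] == 1 and occ[-z] == 4:
                return z
    for v in fives:                          # Step 5: a 5-variable in a 2-clause
        if any(len(C) == 2 and (v in C or -v in C) for C in formula):
            return v
    for v in fives:                          # Step 6: a 5-variable in a 4+-clause
        if any(len(C) >= 4 and (v in C or -v in C) for C in formula):
            return v
    for v in fives:                          # Step 7: a 5-variable next to a 4^- variable
        if any((v in C or -v in C) and any(deg(abs(z)) <= 4 for z in C if abs(z) != v)
               for C in formula):
            return v
    # Step 8: the remaining 5-variables form a 3-SAT instance, solved with [13].
    fours = [v for v in variables if deg(v) == 4]
    for v in fours:                          # Step 9: a (1, 3)-literal
        for z in (v, -v):
            if occ[z] == 1 and occ[-z] == 3:
                return z
    for v in fours:                          # Step 10: a (2, 2)-variable
        if occ[v] == 2 and occ[-v] == 2:
            return v
    # Step 11: only 3-variables remain; they are handled with the algorithm of [18].
    return None
```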

We compare our algorithm with the previous algorithm by Chen and Liu [2]. We can see that they used a simple and uniform branching rule to deal with variables of degree at least 5 and used careful and complicated branching rules for 4-variables. Their bottlenecks contain one case of branching on (2, 3)-variables (or (3, 2)-variables) and one case of dealing with 4-variables. To get further improvements, we carefully design and analyze the branching rules for 5-variables to avoid one previous bottleneck, and also refine the branching rules for 4-variables.

5 Framework of the Analysis

We use the measure-and-conquer method to analyze the running time bound of our algorithm, and adopt \(\mu (\mathcal {F})\) defined in (1) as the measure to construct recurrence relations for our branching operations. Before analyzing each detailed step of the algorithm, we first introduce the general framework of our analysis.

In each sub-branch of a branching operation, we assign value 1 or 0 to some literals and remove some clauses and literals. If we assign value 1 to a literal x in the formula \(\mathcal {F}\), then we will remove all clauses containing x and all \(\overline{x}\) literals from the clauses containing \(\overline{x}\). The assignment and removing together are called an assignment operation. We may assign values to more than one literal and we do assignment operations for each literal. Let S be a subset of literals. We use \(\mathcal {F}_{S=1}\) to denote the resulting formula after assigning 1 to each literal in S and doing assignment operations. Note that \(\mathcal {F}_{S=1}\) may not be a reduced formula and we will apply our reduction rules to reduce it. We use \(\mathcal {F}'_{S=1}\) to denote the reduced formula obtained from \(\mathcal {F}_{S=1}\), i.e., \(\mathcal {F}'_{S=1}=R(\mathcal {F}_{S=1})\). We analyze how much we can reduce the measure in each branch by establishing some lower bounds for

$$\begin{aligned} \varDelta _{S} = \mu (\mathcal {F}) - \mu (\mathcal {F}_{S=1}'). \end{aligned}$$

We also define

$$\begin{aligned} \xi _{S}^{(1)} = \mu (\mathcal {F}) - \mu (\mathcal {F}_{S=1}); \end{aligned}$$
$$\begin{aligned} \xi _{S}^{(2)} = \mu (\mathcal {F}_{S=1}) - \mu (\mathcal {F}_{S=1}'). \end{aligned}$$

Thus, \(\varDelta _{S} = \xi _{S}^{(1)} + \xi _{S}^{(2)}\).
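
A minimal sketch of the assignment operation \(\mathcal {F}_{S=1}\), in the same assumed clause encoding as the earlier sketches; the reduced formula \(\mathcal {F}'_{S=1}=R(\mathcal {F}_{S=1})\) is then obtained by applying the reduction rules on top of it.

```python
def assign_set(formula, S):
    """F_{S=1}: remove every clause containing a literal of S, and remove the
    negation of each literal of S from the remaining clauses."""
    S = set(S)
    neg = {-z for z in S}
    return {frozenset(C - neg) for C in formula if not (C & S)}

# Delta_S is then mu(F) - mu(R(assign_set(F, S))), with R the reduction of Section 4.1.
```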

In a branching operation, we branch into two sub-branches. Assume that the literals in \(S_1\) are assigned the value 1 in the first sub-branch and the literals in \(S_2\) are assigned the value 1 in the second sub-branch. If we can show

$$ \min (\varDelta _{S_1}, \varDelta _{S_2}) \ge a ~~\text {and}~~ \varDelta _{S_1}+ \varDelta _{S_2} \ge b,$$

then we can always get a branching vector covered by one of

$$[a, b-a]~~~\text {and}~~~[b-a,a].$$

Indeed, entries larger than these only decrease the branching factor, and for a fixed sum a more unbalanced branching vector has a larger branching factor, so the worst case is attained at \([a, b-a]\). This technique will be frequently used in our analysis.

5.1 Some Lower Bounds

Next, we show some detailed lower bounds for \(\varDelta _{S}\) (as well as for \(\varDelta _{S_1}+\varDelta _{S_2}\)). We first consider \(\xi _{S}^{(1)}\) and \(\xi _{S}^{(2)}\).

According to the assignment operation, we know that all variables of the literals in S will be deleted in \(\mathcal {F}_{S=1}\). So we have a trivial bound

$$\begin{aligned} \xi _{S}^{(1)} \ge \sum _{v\in S}w_{deg(v)}. \end{aligned}$$
(6)

To get better bounds, we first define some notations. For a literal x in a reduced formula \(\mathcal {F}\), we define:

  • \(n_i(x)\): the number of i-variables whose literals appear in \(N(x, \mathcal {F})\);

  • \(n'_i(x)\): the number of i-variables whose literals appear in \(N^{(2)}(x, \mathcal {F})\);

  • \(n''_i(x)\): the number of i-variables whose literals appear in \(N^{(3+)}(x, \mathcal {F})\).

Note that by the definition, we always have that \(n_i(x)=n'_i(x)+n''_i(x)\).

Next, we give some lower bounds on \(\xi _{S}^{(1)}\), \(\xi _{S}^{(2)}\), and \(\varDelta _{S_1}+\varDelta _{S_2}\), which will be used to prove our main results.

Lemma 6

(*) Assume that \(\mathcal {F}\) is a reduced CNF-formula. Let \(S = \{x\}\), where x is a literal in \(\mathcal {F}\). It holds that

$$\begin{aligned} \xi _{S}^{(1)} \ge w_{deg(x)} + \sum _{i\ge 3}n_i(x)\delta _i. \end{aligned}$$
(7)

Lemma 7

(*) Assume that \(\mathcal {F}\) is a reduced CNF-formula of degree d. Let \(S = \{x\}\), where x is a \((j, d-j)\)-literal in \(\mathcal {F}\). It holds that

$$\begin{aligned} \xi _{S}^{(1)} \ge w_d + j\delta _d. \end{aligned}$$
(8)

Lemma 8

(*) Assume that \(\mathcal {F}\) is a reduced CNF-formula of degree d. Let \(S = \{x\}\), where x is a literal in \(\mathcal {F}\). It holds that

$$\begin{aligned} \xi _{S}^{(2)}\ge n'_3(\overline{x})w_3 + \sum _{4\le i\le d}n'_i(\overline{x})w_{i-1}. \end{aligned}$$
(9)

Lemma 9

(*) Assume that \(\mathcal {F}\) is a reduced CNF-formula of degree d. Let \(S_1 = \{x\}\) and \(S_2 = \{\overline{x}\}\), where the corresponding variable of x is a d-variable in \(\mathcal {F}\). It holds that

$$\begin{aligned} \varDelta _{S_1}+\varDelta _{S_2}\ge 2w_d + 2d\delta _d + (n'_3(x)+n'_3(\overline{x}))(2w_3-2\delta _d) + \sum _{4\le i\le d}{(n'_i(x)+n'_i(\overline{x}))(w_i-2\delta _d)}. \end{aligned}$$
(10)

Lemma 10

(*) Assume that \(\mathcal {F}\) is a reduced CNF-formula of degree d. Let x be a \((1, d-1)\)-literal and xC be the only clause containing x in \(\mathcal {F}\). Let \(S=\{x\}\cup \overline{C}\). It holds that

$$\begin{aligned} \varDelta _{S}\ge w_d+2w_3+\sum _{3\le i\le d}{n'_i(\overline{x})w_i}. \end{aligned}$$
(11)

Lemma 11

(*) Assume that \(\mathcal {F}\) is a reduced CNF-formula of degree d. Let x be a \((1, d-1)\)-literal and xC be the only clause containing x in \(\mathcal {F}\). Let \(S_1=\{x\}\cup \overline{C}\) and \(S_2=\{\overline{x}\}\). It holds that

$$\begin{aligned} \varDelta _{S_1}+\varDelta _{S_2}\ge 2w_d+2w_3+2(d-1)\delta _d. \end{aligned}$$
(12)

Lemma 12

(*) Assume that \(\mathcal {F}\) is a reduced CNF-formula of degree \(d=5\). Let \(S_1=\{x\}\) and \(S_2=\{\overline{x}\}\), where the corresponding variable of x is a 5-variable in \(\mathcal {F}\). If all clauses containing x or \(\overline{x}\) are \(3^+\)-clauses, it holds that

$$\begin{aligned} \varDelta _{S_1}+\varDelta _{S_2}\ge 2w_5+(\sum _{3\le i\le 5}{(n_i(x)+n_i(\overline{x}))})\delta _5+(\sum _{3\le i\le 4}{(n_i(x)+n_i(\overline{x}))})(w_3-\delta _5). \end{aligned}$$
(13)

6 Step Analysis

Equipped with the above lower bounds, we are ready to analyze the branching vector of each step in the algorithm.

6.1 Step 2

In this step, we do not branch and only apply reduction rules to reduce the formula. However, it is still important to show that the measure never increases when applying the reduction rules and that the reduction operations take only polynomial time.

Lemma 13

(*) For any CNF-formula \(\mathcal {F}\), it holds that

$$\mu (R(\mathcal {F}))\le \mu (\mathcal {F}).$$

Lemma 14

(*) For any CNF-formula \(\mathcal {F}\), we can apply the reduction rules in polynomial time to transform it into \(R(\mathcal {F})\).

6.2 Step 3

In this step, we branch on a variable x of degree at least 6. The two sub-branches are: \(S_1=\{x\}\); \(S_2=\{\overline{x}\}\). We have the following result:

Lemma 15

The branching vector generated by Step 3 is covered by

$$\begin{aligned}{}[w_6+\delta _6, w_6+11\delta _6]~\text {or}~[w_6+11\delta _6, w_6+\delta _6]. \end{aligned}$$
(14)

Proof

Since R-Rule 4 is not applicable, both x and \(\overline{x}\) are \((1^+,1^+)\)-literals. By the condition of this case, we have \(d\ge 6\) and \(\delta _d=\delta _6\) by (2).

By Lemma 7, we can get that \(\varDelta _{S_1}\ge \xi _{S_1}^{(1)} \ge w_d + j\delta _d \ge w_6 + \delta _6\) since x is a \((j, d-j)\)-literal with \(j\ge 1\). Also, we can get \(\varDelta _{S_2}\ge w_6+\delta _6\) by the same method.

By Lemma 9, we have that \(\varDelta _{S_1}+\varDelta _{S_2}\ge 2w_d + 2d\delta _d + (n'_3(x)+n'_3(\overline{x}))(2w_3-2\delta _d) + \sum _{4\le i\le d}{(n'_i(x)+n'_i(\overline{x}))(w_i-2\delta _d)}\ge 2w_6 + 12\delta _d\) since \(w_3>\delta _d\) and \(w_i>2\delta _d\) for \(4\le i\le d\).

With \(\min (\varDelta _{S_1},\varDelta _{S_2})\ge w_6+\delta _6\) and \(\varDelta _{S_1}+\varDelta _{S_2}\ge 2w_6+12\delta _6\), we can know that the branching vector of this case is covered by \([w_6+\delta _6, w_6+11\delta _6]\) or \([w_6+11\delta _6, w_6+\delta _6]\).   \(\square \)

6.3 Step 4

In this step, the algorithm will consider a (1, 4)-literal x. Assume that xC is the only clause containing x. The two sub-branches are: \(S_1=\{x\}\cup \overline{C}\); \(S_2=\{\overline{x}\}\). We have the following result:

Lemma 16

The branching vector generated by Step 4 is covered by

$$\begin{aligned}{}[w_5+2w_3, w_5+8\delta _5]~\text {or}~[w_5+8\delta _5, w_5+2w_3]. \end{aligned}$$
(15)

Proof

By Lemma 10, we get that \(\varDelta _{S_1}\ge w_d+2w_3+\sum _{3\le i\le d}{n'_i(\overline{x})w_i}\ge w_5+2w_3\). By Lemma 7, we have that \(\varDelta _{S_2}\ge \xi _{S_2}^{(1)} \ge w_d + j\delta _d= w_5+4\delta _5\) since \(\overline{x}\) is a (4, 1)-literal.

By Lemma 11, we can get that \(\varDelta _{S_1}+\varDelta _{S_2}\ge 2w_d+2w_3+2(d-1)\delta _d= 2w_5+2w_3+8\delta _5\).

Since \(w_3<2\) and \(w_5=5\) by (2), we have \(2w_5> 5w_3\), i.e., \(2w_5-4w_3> w_3\). Since \(w_4=2w_3\) by (2), we have \(2w_5-2w_4> w_3\Rightarrow 2\delta _5> w_3\). So \(\min (\varDelta _{S_1}, \varDelta _{S_2})\ge w_5+2w_3\). Since \(\varDelta _{S_1}+\varDelta _{S_2}\ge 2w_5+2w_3+8\delta _5\), we know that the branching vector of this case is covered by \([w_5+2w_3, w_5+8\delta _5]\) or \([w_5+8\delta _5, w_5+2w_3]\).   \(\square \)

6.4 Step 5

In this step, we branch on a 5-variable x such that either x or \(\overline{x}\) is in a 2-clause. The two sub-branches are: \(S_1=\{x\}\); \(S_2=\{\overline{x}\}\). We have the following result:

Lemma 17

The branching vector generated by Step 5 is covered by one of

$$\begin{aligned}{}[w_5+2\delta _5, w_5+4w_3+4\delta _5]\text {,}~[w_5+4w_3+4\delta _5, w_5+2\delta _5]\text {,}\\ [w_5+3\delta _5, w_5+2w_3+5\delta _5],~\text {and}~[w_5+2w_3+5\delta _5, w_5+3\delta _5]. \end{aligned}$$

Proof

We will consider two subcases:

Case 1. There are at least two 2-clauses containing literal x or \(\overline{x}\). Now it holds that \(\sum _{3\le i\le d}n'_i(x)+n'_i(\overline{x})\ge 2\). By Lemma 7, we get that \(\varDelta _{S_1}\ge \xi _{S_1}^{(1)} \ge w_d + j\delta _d\ge w_5+2\delta _5\) since x is a (2, 3)-literal or (3, 2)-literal. In a similar way, we can get that \(\varDelta _{S_2}\ge w_5+2\delta _5\).

By Lemma 9, we have \(\varDelta _{S_1}+\varDelta _{S_2}\ge 2w_d + 2d\delta _d + (n'_3(x)+n'_3(\overline{x}))(2w_3-2\delta _d) + \sum _{4\le i\le d}{(n'_i(x)+n'_i(\overline{x}))(w_i-2\delta _d)}\ge 2w_5 + 10\delta _5 + \sum _{3\le i\le 5}{(n'_i(x)+n'_i(\overline{x}))(2w_3-2\delta _5)}\ge 2w_5+10\delta _5+2(2w_3-2\delta _5)= 2w_5+4w_3+6\delta _5\) since \(w_4,w_5\ge 2w_3\) and \(\sum _{3\le i\le 5}n'_i(x)+n'_i(\overline{x})\ge 2\).

By \(\min (\varDelta _{S_1}, \varDelta _{S_2})\ge w_5+2\delta _5\) and \(\varDelta _{S_1}+\varDelta _{S_2}\ge 2w_5+4w_3+6\delta _5\), we know that the branching vector of this case is covered by \([w_5+2\delta _5, w_5+4w_3+4\delta _5]\) or \([w_5+4w_3+4\delta _5, w_5+2\delta _5]\).

Case 2. There is only one 2-clause containing literal x or \(\overline{x}\). Note that \(\sum _{3\le i\le d}n'_i(x)+n'_i(\overline{x})=1\). Literal x is contained in at least two clauses, and at most one of them is a 2-clause. So \(\sum _{3\le i\le d}n_i(x)\ge 3\) holds.

By Lemma 6, we get that \(\varDelta _{S_1}\ge \xi _{S_1}^{(1)} \ge w_5 + \sum _{3\le i\le d}n_i(x)\delta _i\ge w_5 + (\sum _{3\le i\le d}n_i(x))\delta _5\ge w_5+3\delta _5\). Similarly, we also can get that \(\varDelta _{S_2}\ge w_5+3\delta _5\). By Lemma 9, we get that \(\varDelta _{S_1}+\varDelta _{S_2}\ge 2w_d + 2d\delta _d + (n'_3(x)+n'_3(\overline{x}))(2w_3-2\delta _d) + \sum _{4\le i\le d}{(n'_i(x)+n'_i(\overline{x}))(w_i-2\delta _d)}\ge 2w_5 + 10\delta _5 + \sum _{3\le i\le 5}{(n'_i(x)+n'_i(\overline{x}))(2w_3-2\delta _5)}\ge 2w_5+10\delta _5+(2w_3-2\delta _5) \ge 2w_5+2w_3+8\delta _5\) since \(w_4,w_5\ge 2w_3\) and \(\sum _{3\le i\le d}n'_i(x)+n'_i(\overline{x})= 1\).

Since \(\min (\varDelta _{S_1},\varDelta _{S_2})\ge w_5+3\delta _5\) and \(\varDelta _{S_1}+\varDelta _{S_2}\ge 2w_5+2w_3+8\delta _5\), we know that the branching vector of this subcase is covered by \([w_5+3\delta _5, w_5+2w_3+5\delta _5]\) or \([w_5+2w_3+5\delta _5, w_5+3\delta _5]\).

These two cases complete the proof.    \(\square \)

6.5 Step 6

In this step, all clauses containing a 5-variable are \(3^+\)-clauses now. We branch on a 5-variable x contained in a \(4^+\)-clause. The two sub-branches are: \(S_1=\{x\}\); \(S_2=\{\overline{x}\}\). We have the following result:

Lemma 18

The branching vector generated by Step 6 is covered by

$$\begin{aligned}{}[w_5+4\delta _5, w_5+7\delta _5]~\text {or}~[w_5+7\delta _5, w_5+4\delta _5]. \end{aligned}$$
(16)

Proof

Literal x is contained in at least two \(3^+\)-clauses. So \(\sum _{3\le i\le d}n_i(x)\ge 4\) holds. By Lemma 6, we get that \(\varDelta _{S_1}\ge \xi _{S_1}^{(1)} \ge w_5 + (\sum _{3\le i\le d}n_i(x))\delta _d\ge w_5+4\delta _5\). Similarly, we get that \(\varDelta _{S_2}\ge w_5+4\delta _5\).

Let \(m_4\) be the number of \(4^+\)-clauses containing x. We have that \(\sum _{3\le i\le 5}n_i(x)+n_i(\overline{x})\ge 2(5-m_4)+3m_4\ge 10+m_4\ge 11\) since \(m_4\ge 1\). By Lemma 12, we get that \(\varDelta _{S_1}+\varDelta _{S_2}\ge 2w_5+(\sum _{3\le i\le 5}{(n_i(x)+n_i(\overline{x}))})\delta _5+(\sum _{3\le i\le 4}{(n_i(x)+n_i(\overline{x}))})(w_3-\delta _5) \ge 2w_5+11\delta _5\) since \(w_3\ge \delta _5\).

Since \(\min (\varDelta _{S_1}, \varDelta _{S_2})\ge w_5+4\delta _5\) and \(\varDelta _{S_1}+\varDelta _{S_2}\ge 2w_5+11\delta _5\), we know that the branching vector of this case is covered by \([w_5+4\delta _5, w_5+7\delta _5]\) or \([w_5+7\delta _5, w_5+4\delta _5]\).   \(\square \)

6.6 Step 7

In Step 7, all clauses containing a 5-variable are 3-clauses. We branch on a 5-variable x whose literal and a literal of a \(4^-\)-variable are in the same clause. The two sub-branches are: \(S_1=\{x\}\); \(S_2=\{\overline{x}\}\). We have that

Lemma 19

The branching vector generated by Step 7 is covered by

$$\begin{aligned}{}[w_5+4\delta _5, w_5+w_3+5\delta _5]~\text {or}~[w_5+w_3+5\delta _5, w_5+4\delta _5]. \end{aligned}$$
(17)

Proof

There is at least one \(4^-\)-variable whose literal is in \(N(x, \mathcal {F})\cup N(\overline{x}, \mathcal {F})\). So it holds that \(\sum _{3\le i\le 4}(n_i(x)+n_i(\overline{x}))\ge 1\).

For literal x, it is contained in at least two 3-clauses, which means that \(\sum _{3\le i\le d}n_i(x)\ge 4\) holds. By Lemma 6, we get that \(\varDelta _{S_1}\ge \xi _{S_1}^{(1)} \ge w_5 + \sum _{3\le i\le 5}n_i(x)\delta _i\ge w_5 + (\sum _{3\le i\le d}n_i(x))\delta _5\ge w_5 + 4\delta _5\). Similarly, we can get that \(\varDelta _{S_2}\ge w_5+4\delta _5\).

By Lemma 12, we get that \(\varDelta _{S_1}+\varDelta _{S_2}\ge 2w_5+(\sum _{3\le i\le 5}{(n_i(x)+n_i(\overline{x}))})\delta _5+(\sum _{3\le i\le 4}{(n_i(x)+n_i(\overline{x}))})(w_3-\delta _5)\ge 2w_5+w_3+9\delta _5\) since \(\sum _{3\le i\le 4}n_i(x)+n_i(\overline{x})\ge 1\).

Since \(\min (\varDelta _{S_1},\varDelta _{S_2})\ge w_5+4\delta _5\) and \(\varDelta _{S_1}+\varDelta _{S_2}\ge 2w_5+w_3+9\delta _5\), we know that the branching vector is covered by \([w_5+4\delta _5, w_5+w_3+5\delta _5]\) or \([w_5+w_3+5\delta _5, w_5+4\delta _5]\).   \(\square \)

6.7 Step 8

In Step 8, the literals of all 5-variables form a 3-SAT instance \(\mathcal {F}^*\). We apply the \(O^*(1.3279^n)\)-time algorithm in [13] for 3-SAT to solve our problem, where n is the number of variables in the instance. Since \(w_5=5\), we have that \(n=\mu (\mathcal {F}^*)/w_5=\mu (\mathcal {F}^*)/5\). So the running time for this part will be

$$O^*(1.3279^{\mu (\mathcal {F}^*)/w_5})=O^*(1.0584^{\mu (\mathcal {F}^*)}).$$
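
A quick numeric check of this conversion, using \(w_5=5\) from (2):

```python
# n = mu(F*) / w_5 with w_5 = 5, so 1.3279 ** n = (1.3279 ** (1 / 5)) ** mu(F*).
print(round(1.3279 ** (1 / 5), 4))   # 1.0584
```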

6.8 Step 9

In this step, we branch on a (1, 3)-literal x. The two sub-branches are: \(S_1=\{x\}\); \(S_2=\{\overline{x}\}\). We have the following result:

Lemma 20

(*) The branching vector generated by Step 9 is covered by

$$\begin{aligned}{}[w_4+2w_3, w_4+6\delta _4]~\text {or}~[w_4+6\delta _4, w_4+2w_3]. \end{aligned}$$
(18)

6.9 Step 10

In this step, we branch on a (2, 2)-literal x. The two sub-branches are: \(S_1=\{x\}\); \(S_2=\{\overline{x}\}\). We have the following result:

Lemma 21

The branching vector generated by Step 10 is covered by

$$\begin{aligned}{}[w_4+2\delta _4, w_4+6\delta _4]~\text {or}~[w_4+6\delta _4, w_4+2\delta _4]. \end{aligned}$$
(19)

Proof

By Lemma 7, we get that \(\varDelta _{S_1}\ge \xi _{S_1}^{(1)} \ge w_d + j\delta _d=w_4+2\delta _4\) since x is a (2, 2)-literal. Similarly, we can get that \(\varDelta _{S_2}\ge w_4+2\delta _4\).

By Lemma 9, we have \(\varDelta _{S_1}+\varDelta _{S_2}\ge 2w_d + 2d\delta _d + (n'_3(x)+n'_3(\overline{x}))(2w_3-2\delta _d) + \sum _{4\le i\le d}{(n'_i(x)+n'_i(\overline{x}))(w_i-2\delta _d)}= 2w_4 + 8\delta _4 + (n'_3(x)+n'_3(\overline{x}))(2w_3-2\delta _4) + (n_4'(x)+n_4'(\overline{x}))(w_4-2\delta _4)= 2w_4+8\delta _4\).

Since \(\min (\varDelta _{S_1},\varDelta _{S_2})\ge w_4+2\delta _4\) and \(\varDelta _{S_1}+\varDelta _{S_2}\ge 2w_4+8\delta _4\), we know that the branching vector is covered by \([w_4+2\delta _4, w_4+6\delta _4]\) or \([w_4+6\delta _4, w_4+2\delta _4]\).

   \(\square \)

6.10 Step 11

All variables are 3-variables now. We apply the \(O^*(1.1279^n)\)-time algorithm by Wahlström [18] to solve this special case, where n is the number of variables. For this case, we have that \(n=\mu (\mathcal {F})/w_3\). So the running time of this part is

$$O^*((1.1279^{1/w_3})^{\mu (\mathcal {F})}).$$

7 The Final Result

Each of the branching vectors above generates a constraint in our quasiconvex program for finding the best values of \(w_3\) and \(w_4\). Let \(\alpha _i\) denote the branching factor of the i-th branching vector derived in Lemmas 15–21 (counting the two cases of Lemma 17 separately). We want to find the minimum value \(\alpha \) such that \(\alpha _i\le \alpha \) for all i and \(1.1279^{1/w_3}\le \alpha \) (the constraint generated by Step 11) under the assumptions (2) and (4). By solving this quasiconvex program, we get \(\alpha =1.0646\) by letting \(w_3=1.9234132344759123\) and \(w_4=3.8468264689518246\). Note that \(\alpha = 1.0646\) is greater than 1.0584, the base generated in Step 8. So 1.0646 is the worst branching factor in the whole algorithm. By (5), we get the following result.
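
The following rough numeric sketch (reusing the bisection idea from the sketch in Section 3) recomputes the worst factor over the branching vectors of Lemmas 15–21 together with the Step 11 constraint; the coarse grid over \(w_3\) and the feasible range \([5/3, 2)\) implied by (2) and (4) are choices made only for this sketch, to illustrate how the reported values arise.

```python
def factor(vec, tol=1e-10):
    # Bisection for the root x > 1 of sum(x ** -a for a in vec) == 1.
    lo, hi = 1.0, 2.0
    while sum(hi ** -a for a in vec) > 1.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        lo, hi = (mid, hi) if sum(mid ** -a for a in vec) > 1.0 else (lo, mid)
    return hi

def worst_factor(w3):
    # Under (2)-(4): w4 = 2*w3, w5 = 5, w6 = 6, so d4 = w3, d5 = 5 - 2*w3, d6 = 1.
    w4, w5, w6, d4, d5, d6 = 2 * w3, 5.0, 6.0, w3, 5 - 2 * w3, 1.0
    vectors = [
        [w6 + d6, w6 + 11 * d6],               # Step 3  (Lemma 15)
        [w5 + 2 * w3, w5 + 8 * d5],            # Step 4  (Lemma 16)
        [w5 + 2 * d5, w5 + 4 * w3 + 4 * d5],   # Step 5, case 1 (Lemma 17)
        [w5 + 3 * d5, w5 + 2 * w3 + 5 * d5],   # Step 5, case 2 (Lemma 17)
        [w5 + 4 * d5, w5 + 7 * d5],            # Step 6  (Lemma 18)
        [w5 + 4 * d5, w5 + w3 + 5 * d5],       # Step 7  (Lemma 19)
        [w4 + 2 * w3, w4 + 6 * d4],            # Step 9  (Lemma 20)
        [w4 + 2 * d4, w4 + 6 * d4],            # Step 10 (Lemma 21)
    ]
    step11 = 1.1279 ** (1 / w3)                # constraint from Step 11
    return max(max(factor(v) for v in vectors), step11)

# Coarse search over the feasible range of w_3 given by (2) and (4).
best = min(worst_factor(5 / 3 + i * 1e-4) for i in range(3333))
print(round(best, 4))                          # about 1.0646, matching Theorem 1
```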

Theorem 1

Algorithm 1 solves the SAT problem in \(O^*(1.0646^L)\) time.

Table 2. The weight setting
Table 3. The branching vector and factor for each step

We also show the whole weight setting in Table 2 and the branching vector of each step under the setting in Table 3. From Table 3, we can see that we have four bottlenecks: Steps 7, 9, 10, and 11. In fact, Steps 9, 10, and 11 have the same branching vector \([4w_3, 8w_3]\) under the assumption that \(w_4=2w_3\) (for Step 11, the worst branching vector in [18] is [4, 8]). The branching factor for these three steps will decrease if the value of \(w_3\) increases. On the other hand, the branching factor for Step 7 will decrease if the value of \(w_3\) decreases. We set the best value of \(w_3\) to balance them. If we can either improve Step 7 or improve Steps 9, 10, and 11 together, then we may get a further improvement. However, the improvement is very limited and several other bottlenecks will appear.

8 Concluding Remarks

In this paper, we show that the SAT problem can be solved in \(O^*(1.0646^L)\) time, improving the previous bound in terms of the input length, which was obtained more than ten years ago. Nowadays, such improvements are becoming harder and harder to obtain. However, SAT is one of the most important problems in exact and parameterized algorithms, and the state-of-the-art algorithms for it are frequently mentioned in the literature. Furthermore, in order to give a neat and clear analysis, we introduce a general analysis framework, which may also be used to simplify the analysis of other similar algorithms based on the measure-and-conquer method.