Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

LMHS is a weighted partial maximum satisfiability (MaxSAT) solver. Weighted partial MaxSAT is a common generalization of maximum satisfiability that allows some clauses to be designated as mandatory and assigns weights to clauses that may be left unsatisfied. LMHS implements the so-called implicit hitting set approach [16, 18] for weighted partial MaxSAT, and can be viewed as an independent from-scratch re-implementation of the MaxHS solver [810]. On top of the main algorithm, LMHS integrates MaxSAT preprocessing [36] into the solver, and offers solution enumeration, an incremental API, as well as the use of a choice of SAT and IP solvers. The solver entered the 2015 MaxSAT Evaluation [2], where it solved the most problems (among non-portfolio solvers) in the crafted and industrial weighed partial MaxSAT categories. This paper gives an overview of key features of the LMHS solver as well as the effects of preprocessing and the choice of the SAT and IP solvers on its performance.

2 Overview of LMHS

LMHS implements an instantiation of an implicit hitting set algorithm [16] for weighted partial MaxSAT, following MaxHS [810]. Given an unsatisfiable CNF formula \(\mathcal {F}\), the MaxSAT problem is to identify a minimum (minimum-cost for weighted problems) set H of (soft) clauses such that \(\mathcal {F}{\setminus }H\) is satisfiable. A connection to implicit hitting set problems comes from the unsatisfiable subsets of clauses, or cores \(\kappa \), or the formula. The set of clauses \(H\) must hit a clause from each core \(\kappa \), so an optimal MaxSAT solution can be obtained by computing a minimum-cost hitting set (MCHS) over the set of all cores. When the set of all cores is not known, this becomes an implicit hitting set problem.

The implicit hitting set approach for weighted partial MaxSAT, given hard clauses \(\mathcal {F}_h\), soft clauses \(\mathcal {F}_s\), and cost function \(c : \mathcal {F}_s \rightarrow \mathbb R^+\), is described in more detail in Algorithm 1 and Fig. 1. It uses both a SAT and an IP solver. During the solving process it accumulates a set \(\mathcal {K}\) of cores and stores a MCHS of \(\mathcal {K}\) in \(H\). Starting with \(H= \emptyset \), the algorithm tests the satisfiability of \(\mathcal {F}{\setminus }H\) using the SAT solver. If satisfiable, the variable assignment \(\tau \) returned by the SAT solver is optimal. If unsatisfiable, a new core \(\kappa \) of \(\mathcal {F}{\setminus }H\) is obtained from the SAT solver and added to \(\mathcal {K}\). Finally, the IP solver is used to update \(H\) to a new hitting set by computing a MCHS of \(\mathcal {K}\).

figure a

In practice, every soft clause \(C_i \in \mathcal {F}_s\) is augmented with a unique auxiliary variable \(a_i\) so that if \(a_i = 1\), then \(C_i\) need not be satisfied, i.e., creating the clause \(C_i \vee a_i\). Arbitrary sets of soft clauses can then be excluded from the formula by assuming \(a_i = 1\) for the corresponding auxiliary variables in a SAT solver call. To obtain a MCHS of \(\mathcal {K}\), the IP solver minimizes \(\sum _{C_i \in \mathcal {F}_s} a_i \cdot c(C_i)\) subject to the constraint \(\sum _{C_i \in \kappa } a_i \ge 1\) for each core \(\kappa \in \mathcal {K}\), enforcing that each core in \(\mathcal {K}\) is hit.

Besides the features elaborated on in Sect. 3, some design choices differentiate LMHS from the MaxHS solver of Davies and Bacchus (http://maxhs.org). LMHS never enforces the equivalence \(\lnot a_i \leftrightarrow C_i\) of auxiliary variables and clauses explicitly in CNF. Instead, the value of each \(a_i\) is fixed via the assumptions interface for every SAT solver call, which ensure that \(\lnot a_i \leftrightarrow C_i\) implicitly holds. In terms of heuristic optimizations, by default LMHS finds a maximal disjoint set of cores on each iteration and uses a greedy hitting set algorithm in place of an IP solver call whenever possible. At each iteration, the greedy hitting set algorithm is used in place of an IP solver call as long as this results in an unsatisfiable formula (i.e., a core is produced). When the greedy method does not yield a core, the IP solver is used to compute a minimum-cost hitting set. The 2015 MaxSAT Evaluation versions LMHS-I and LMHS-C differ slightly in this regard: LMHS-I did not use the greedy algorithm, while LMHS-C finds a set of possibly overlapping cores at each iteration. Furthermore, as a consequence of the integration of SAT-based preprocessing, auxiliary variables may not be limited to a single unique \(a_i\) per soft clause.

Fig. 1.
figure 1

Information flow in the implicit hitting set approach to MaxSAT

3 Features

Here we give an overview of the main features offered by LMHS on top of the main algorithm it implements.

Integrated Preprocessing. The use of SAT preprocessing techniques for \(\text {MaxSAT}\) [4] is integrated into LMHS using the Coprocessor 2.0 SAT preprocessor [17]. Many SAT preprocessing techniques, such as bounded variable elimination [11], are not sound for \(\text {MaxSAT}\) on their own [4]. However, they can be made sound by introducing a layer of auxiliary variables (labels) and forbidding their removal during preprocessing [3, 4]. Concretely, a new variable \(l_i\) is introduced for each soft clause \(C_i\) prior to applying preprocessing. The original soft clause replaced by a hard clause \((C_i \vee l_i)\) with the restriction that the variable \(l_i\) may not be eliminated from the formula. After preprocessing, soft clauses \((\lnot l_i)\) with the weights of the original clauses \(C_i\) are added to the formula.

Efficient integration of SAT-based preprocessing in LMHS is enabled by the observation that the MaxHS algorithm is sound even in cases where an assumption variable is shared between clauses or a clause contains more than one assumption variable [5]. This allows LMHS to re-use the variables \(l_i\) introduced by preprocessing in place of the auxiliary variables \(a_i\), avoiding the introduction of a new layer of assumption variables for SAT-based core extraction within the main algorithm.

Following [6], we further avoid the addition of unnecessary auxiliary variables in LMHS by detecting variables in the original instance which can be reused already in the preprocessing phase. Any literal \(l \in \{x, \lnot x\}\) which occurs only in a single unit soft clause \((\lnot l)\) and some hard clauses \((C_1 \vee l), \cdots , (C_n \vee l)\) of the input instance is detected by simple pattern matching and re-used by the preprocessor and thereafter by the main algorithm. Such variables are introduced by, e.g., a straightforward encoding of group constraints [13].

Solution Enumeration. LMHS offers command-line options for enumerating MaxSAT solutions. The solver can enumerate the k best solutions or all optimal solutions. Enumeration can be based on variable assignments or satisfied clauses. In the latter case, only solutions which satisfy a unique set of soft clauses are considered. Solution enumeration in LMHS is implemented as Algorithm 2. The MaxHS algorithm is enclosed within the loop on Line 3. When the first solution is found, Line 10 records its cost as the optimal cost. On subsequent optimal solutions, Line 14 adds a single clause which forbids the latest obtained optimal solution. The termination condition on Line 11 is met when all optimal solutions have been found.

figure b

To enumerate unique solutions in terms of satisfied clauses, the refinement of \(\mathcal {F}\) on Line 14 is replaced by adding the constraint \(\sum _{C_i \in H} a_i - \sum _{C_i \in \mathcal {F}_s {\setminus } H} a_i < |H|\) to the hitting set IP, followed by a re-computation of the hitting set. A fixed number of best solutions can be found by modifying the condition of Line 11 to only break after enough solutions have been found. An application of the solution enumeration interface—and the incremental API described next—is presented in [19], where LMHS is used for MaxSAT to deriving cutting planes in an IP-based approach to learning optimal Bayesian network structures.

Incremental API. LMHS also implements a more general type of incrementality. Through a C or C++ API, the working formula can be incrementally extended with arbitrary clauses and the solver subsequently incrementally used for finding optimal solutions to the altered formula without starting search from scratch. In terms of Algorithm 2, operations performed through the API in effect replace Line 14. An overview of the interface follows.

  • reset Resets the internal state of LMHS, allowing a new instance to be started.

  • initialize Initializes LMHS and its components. Three variants of this method are offered. An instance can be initialized from a file, from clauses in memory, or as an empty instance to be built using the API.

  • getNewVariable Requests a new variable from the internal SAT solver.

  • addHardClause Adds a hard clause to the working \(\text {MaxSAT}\) instance.

  • addSoftClause Adds a soft clause to the working \(\text {MaxSAT}\) instance. This automatically internally creates a blocking (auxiliary) variable for the clause. This variable is returned by the function in case the user wishes to make use of it. As a rule, the blocking variable created will always have a larger index number than the last variable created with getNewVariable.

  • addCoreConstraint If a subset of soft clauses is known to be unsatisfiable, it can be explicitly added to the set of cores, expressed using the blocking variables of the soft clauses.

  • forbidLastModel Internally creates a SAT constraint forbidding the previously found variable assignment.

  • forbidLastSolution Internally creates an IP constraint forbidding the previously found set of satisfied soft clauses.

  • getOptimalSolution Optimally solves the current \(\text {MaxSAT}\) instance.

Choice of SAT and IP Solvers. A lightweight interface between LMHS and its SAT solver component allows for flexibility in the choice of solver. Any solver which provides an assumption-based incremental interface can be integrated into LMHS by implementing a small interface class and making minor modifications to the build process. Interfaces for two such solvers, MiniSat 2.2 [12] and the inprocessing [15] SAT solver Lingeling [7], are included in the current release of LMHS. Similarly, LMHS was also designed to allow for the use of different IP solvers. Currently LMHS includes interfaces to the state-of-the-art commercial IP solver IBM CPLEX [14] and the open-source non-commercial IP solver SCIP [1].

Input Format. In addition to adhering to the DIMACS WCNF input format for weighted partial MaxSAT, LMHS also supports the use of floats (without preprocessing) in the input WCNF, i.e., MaxSAT with cost functions associating real-valued non-negative weights to clauses. Within the solver, the costs are handled by the IP solver.

Fig. 2.
figure 2

Effect of integrated preprocessing on LMHS on 2015 MaxSAT evaluation instances.

Fig. 3.
figure 3

A comparison of SAT (right) and IP (left) solver components within LMHS.

4 Performance Overview

This section examines some interesting aspects of the performance of LMHS. We evaluate the solver on the complete set of 2209 crafted and industrial partial and weighted partial benchmarks of the 2015 MaxSAT Evaluation [2]. The experiments were run on machines with 32-GB memory and Intel Xeon E5540 processors under Ubuntu Linux 12.04. A per-instance time limit of 1800 seconds (30 min) was enforced. Figure 2 is a plot of the number of instances solved at different per-instance timeouts, showing the impact of integrating preprocessing into the solver by reusing auxiliary variables. It shows an interesting effect, in that the ordinary application of MaxSAT preprocessing to the instance (LMHS+pre) degrades solver performance compared to no preprocessing (LMHS), but the tighter integration of preprocessing by reusing variables (LMHS+pre+reuse) produces a clear improvement. While the reasons for this effect are not entirely clear at present, we suspect it to be at least in part due to the larger search space resulting from the added auxiliary variables. The extra layer of variables can also be detrimental in terms of potential additional constraints available for the IP solver; see [5, Example 1].

Figure 3 compares the use of the SCIP 3.0.1 IP solver to CPLEX 12.6.0 (left) and the MiniSat 2.2 solver to Lingeling (right) as the MIP and SAT components, respectively, in LMHS. The combination of CPLEX and MiniSat was mainly used during the development of LMHS, so these components can be expected to perform better in the default configuration. Figure 3 plots per-instance runtimes for solved instances, and clearly shows better results with CPLEX and MiniSat. However, although SCIP and Lingeling results in worse performace overall compared to CPLEX and MiniSat, there is significant number of instances which they enable solving faster. This suggests that choosing a combination of a SAT solver and an IP solver on a per-instance basis could result in improved performance. Additionally, more in-depth analysis of which instances are best suited to each solver component could yield interesting further insights.

5 Availability and Conclusions

LMHS is competitive with the current state-of-the-art in MaxSAT solvers, recently having reached top positions in the 2015 MaxSAT Evaluation. LMHS integrates MaxSAT preprocessing into the solver. LMHS was designed to be flexible in allowing for integrating different SAT and IP solvers. The solver is open source and released under the MIT license at http://www.cs.helsinki.fi/group/coreo/lmhs/.