1 Introduction and Background

If the performance of a constraint model is found to be inadequate, a natural step is to consider adding constraints to the model in order to assist the constraint solver in detecting dead ends in search and therefore reducing overall search effort. One approach is to add implied constraints, which can be inferred from the initial model and are therefore guaranteed to be sound. Effective implied constraints have been found both by hand [17, 18] and via automated methods [10, 11, 19]. If implied constraints cannot be found, or improve performance insufficiently, for satisfiable problemsFootnote 1 a more aggressive step is to add streamliner constraints [22], which are not guaranteed to be sound but are designed to reduce significantly the search space while permitting at least one solution. For several problem classes, effective streamliners have been found by hand by looking for patterns in the solutions of small instances of those classes [22, 24, 26, 27].

Wetter et al. [40] demonstrated how to generate effective streamliners automatically from the specification of a constraint problem class in the abstract constraint specification language Essence [14,15,16]. This method, which we expand upon, exploits the structure apparent in an Essence specification, such as that of the Progressive Party Problem (Fig. 1), to produce candidate streamliners via a set of streamliner generation rules. An effective streamliner that we automatically generate for this problem class constrains approximately half of the entries in the sched set variable to be monotonically increasing functions.

Using training instances drawn from the problem class under consideration, streamliner candidates are evaluated via a toolchain consisting of the automated constraint modelling tools Conjure [1,2,3,4] and Savile Row [31,32,33], and the constraint solver Minion [21]. Promising candidates, which retain at least one solution to the training instances while significantly reducing search, are used to solve more difficult instances from the same problem class. Candidate streamliners are often most effectively used in combination. For example, Wetter et al. presented an effective combination of three streamliners for the Van der Waerden numbers problem. Hence, the space of streamlined models forms a lattice where the root is the original Essence specification and an edge represents the addition of a streamliner to the combination associated with the parent node.

A shortcoming of Wetter et al.’s work is the uninformed manner in which the streamliner lattice is explored using depth- or breadth-first search. Our principal contribution is a new method for exploring the lattice via Monte Carlo-style search, allowing more effective streamlined models to be found in a time budget. A second contribution is a set of new streamliner generator rules for sequence and matrix Essence type constructors to complement those presented by Wetter et al. Finally, we demonstrate the efficacy of our approach on a variety of problems.

Fig. 1.
figure 1

The Progressive Party Problem [36] in Essence. There are two abstract decision variables, a set of host boats, and a set of functions from boats to boats representing the assignment of guests to hosts in each period. From this very concise, structured statement of the problem, 160 candidate streamliners can be generated by our system.

2 Essence Specifications and Streamliner Generators

An Essence specification such as that presented in Fig. 1 identifies: the input parameters of the problem class (given), whose values define an instance; the combinatorial objects to be found (find); the constraints the objects must satisfy (such that); identifiers declared (letting); and an (optional) objective function (min/maximising). The key feature of the language is its support for abstract decision variables, such as set, multiset, relation and function, as well as nested types, such as the set of functions found in Fig. 1.

A concise, structured specification of a problem class directly supports the generation of powerful candidate streamliners: in the example, it is readily apparent that the problem requires us to find a set of boats and a set of functions assigning guest boat crews to hosts. Hence, streamliners related to sets and functions, such as that given in the introduction, can be generated straightforwardly. In contrast an equivalent constraint model in, for example, MiniZinc [30] or Essence Prime [32] has to represent these abstract decision variables with constrained collections of more primitive (e.g. integer domain) variables, such as the matrix model [12, 13] proposed by Smith et al. [36]. In this context, it is significantly more difficult to recognise the structure (i.e. the set and set of functions) in the problem and generate the equivalent streamliners.

Fig. 2.
figure 2

Streamliner generators for sequence and matrix domains.

Wetter et al. present a set of streamliner generation rules for the Essence type constructors set, function and partition, as well as simple integer domains [40]. Our first contribution is to extend this set also to cover sequence and matrix type constructors. These are summarised in Fig. 2. For decision variables with matrix domains, one generator (matrix all) takes another streamliner generator as a parameter (R, as a simple example: constrain an integer variable to take an even value) and lifts it to operate on all entries in a matrix. This rule can be applied to higher dimensional matrix domains as well, in which case the multi-dimensional matrix domain is interpreted in the same way as a series of nested one-dimensional matrix domains. The generators matrix most (and matrix half and matrix approximately half) operate in a similar way. In contrast to the matrix all streamliner generator, these generators first reify the result of applying R, and then restrict the number of places the constraint must hold.

For sequence domains, we present two sets of first-order streamliner generators: monotonically increasing (or decreasing) and smallest (or largest) first. These do not take another generator as a parameter but directly post constraints on the sequence decision variable. The sequence on range and sequence on defined generators take an existing streamliner generator as a parameter and lift it to work on the range or the defined set of the sequence domain respectively.

3 Monte Carlo Search for Streamliner Combinations

Le Bras et al. [25] and Wetter et al. [40] both observed that by applying several streamlining constraints to a model simultaneously the search required for finding the first solution can be reduced further than by applying the streamliners in isolation. Finding an effective streamliner combination involves searching the streamliner lattice, the size of which is determined by the power set of all candidate streamliners for a given problem class. Table 1 presents the number of candidate streamliners our current set of generation rules produces for a number of problem classes. In some cases the number of candidates generated is small. However, the cost of evaluating each combination on a set of test instances means that it is typically not feasible to evaluate all possible streamliner combinations.

Wetter et al. employed breadth-first and depth-first search to explore the streamliner lattice in an uninformed manner. The motivation for our work is the hypothesis that a best-first search can allow more effective streamliner combinations to be discovered within a given time budget. Our approach is to focus the search onto areas of the lattice where the streamliners combine to give the greatest reduction in search while retaining at least one solution.

For a given problem class, we have no prior knowledge of the performance of the set of candidate streamliners, either individually or in combination. This raises the issue of the exploration/exploitation problem: if we can identify a combination of streamliners that performs well, should we try and exploit that combination further by evaluating the addition of further streamliners, or should we explore other parts of the lattice that may at present seem less promising?

The exploration/exploitation tradeoff can be formalised in several reinforcement learning variants, including via markov decision processes [7]. We model this situation as a multi-armed bandit problem [5], which allows us to employ regret minimising algorithms to deal with the exploration/exploitation dilemma. The multi-armed bandit can be seen as a set of real distributions, each distribution being associated with the rewards delivered by one of the K levers. Since the multi-armed bandit problem assumes that each lever corresponds to an independent action, in order to use it directly we would have to associate a lever with each point in the streamliner lattice, which is infeasible in general. Instead, we use the bandit algorithm to guide the exploration of the lattice in a process reminiscent of Monte Carlo Tree Search (MCTS) [8], as described below.

3.1 Algorithm Outline

Our algorithm has the same four basic steps as MCTS. It uses Upper Confidence bound applied to Trees (UCT) [8] to balance exploration and exploitation.

  1. 1.

    Selection: Starting at the root node, the UCT policy is recursively applied to traverse the explored part of the lattice until an unexpanded, but expandable node is reached. A node is expandable if it has at least one child that is not marked as pruned (Sect. 3.5). A child node is selected to maximise:

    $$ UCT = X_j + 2C_p \sqrt{\frac{2\ln n}{n_j}}$$

    where n is the number of times the current (parent) node in the lattice has been visited, \(n_j\) the number of times child j has been visited, \(X_j\) is the mean reward associated with child j and \(C_p>\) 0 is a constant [8].

  2. 2.

    Expansion: Enumerate the children of the Selected node and choose one to expand according to the heuristic explained in Sect. 3.4.

  3. 3.

    Simulation: The collection of streamliners associated with the expanded node are evaluated using the Conjure , Savile Row , and Minion toolchain.

  4. 4.

    BackPropagation: The result of the evaluation is propagated back up through the lattice to update reward values, as explained below.

3.2 Back Propagation

Since our search is operating over a lattice, a node may have multiple parents. This requires an alteration to the back propagation employed in MCTS: when we perform back propagation that reward value is back propagated up all paths from that node to the root. To illustrate consider a problem with two streamliners {A,B} and we are back propagating from a node in the lattice representing the combination {AB}. There are two paths by which this node could have been reached, {root \(\rightarrow \) A \(\rightarrow \) B} and {root \(\rightarrow \) B \(\rightarrow \) A}. Although the algorithm will have only descended one of these paths, because the reward value of a node in the lattice is representative of the ability of the streamliner combination represented by that node to combine and produce effective reductions in search, the node in the lattice that represents streamliner combination {B} should also receive this reward. For this reason both paths are rewarded accordingly and the reward generated is back propagated up all paths from that node to the root. We also ensure that a node that lies on more than one such path is rewarded only once. The cost of back propagation thus grows exponentially with depth. However, since each level of the lattice represents an additional constraint it is unlikely that satisfiability is maintained at great enough depths for this to become an issue. Empirically, the cost is insignificant relative to solving the training instances.

We must also consider the situation where a node in a path back to the root has not yet been expanded. If we ignore such nodes, their true reward is not reflected in their reward values because all reward values back propagated from child nodes prior to their creation are lost. Our approach is that when a node is expanded, it absorbs the reward value and visit count of its immediate children that have already been expanded. This avoids caching a potentially large set of values while maintaining reward values for nodes around the focus of our search.

3.3 Simulation Reward

The performance of our best first search algorithm is heavily reliant on how the reward is produced from the simulation step. Initially we assigned rewards as follows: if the majority of the instances evaluated are unsatisfiable a reward of −1 is back-propagated, otherwise a reward of one minus the average reduction in search space (expressed in search nodes) is returned. While this is valid, our initial experiments revealed that its effect was to produce a search strategy similar to breadth-first search - i.e. too strong a bias towards exploration.

The reason for this is that the penalty value is too punitive when evaluating larger streamliner combinations. Intuitively, the penalty should be sensitive to the depth we have travelled into the lattice: as we add streamliners we reduce the search space and we expect the probability of such failure to rise. Therefore we divide the penalty value by the depth of the node being evaluated, allowing the prolonged exploration of promising paths.

3.4 Expansion Heuristic

The order of expansion of child nodes is an important factor in performance. An expanded child consumes time to perform simulation and, because the simulation reward is back propagated, if a penalty is awarded it can affect the likelihood of the parent node being selected on the next iteration. During the expansion phase of our algorithm child nodes are expanded in descending likelihood of the application of the associated streamliner combination resulting in a satisfiable problem. In order to facilitate this, when a successful simulation is performed, for a representative instance the solution found is stored, along with the approximate size of its search space (via the product of the domains of the decision variables in the model) and the proportion of the space explored to find the solution.

Upon expansion all potential children are enumerated and for each we check whether the additional associated streamliner invalidates the solution stored at the expanding parent. If the solution remains valid then the child is preferred for expansion because we know pre-simulation that the associated streamliner combination satisfies at least one instance and the additional streamliner might reduce search. If the solution is invalidated then the search space explored by the child is smaller than the expanding parent. We use the proportion of search space explored to find the solution associated with the expanding parent to estimate the likelihood of a solution existing in that subspace. Intuitively, if the parent explored a large fraction of the space then it is less likely that a solution will be found when adding the streamliner associated with the child node.

3.5 Pruning the Streamliner Lattice

As per Sect. 3.2, when a simulation for a streamliner combination reveals that the majority of training instances are unsatisfiable, a penalty is back propagated up the lattice. We also mark the node associated with the simulation as pruned and never consider any of its children for expansion. In addition, we prune nodes whose additional streamliner is determined to be redundant in combination with those inherited from the expanding parent, in the sense that it causes no further reduction in search on the evaluated instances. Pruning the lattice by these two methods typically reduces the number of nodes to be expanded very significantly.

4 Empirical Evaluation

We evaluate two hypotheses empirically. First, that the best-first search is more effective in exploring the streamliner lattice than the simpler depth- and breadth-first search methods employed in [40]. Second, that our method is able to automatically find streamliner combinations that drastically reduce the search space across a variety of problem classes.

We experiment with thirteen problem classes, eight from Wetter et al. and the remainder selected for variety, particularly problem classes requiring significantly more instance data such as SONET [28]. Streamlining can aid in the search for feasible solutions of optimisation problems, but not the proof of optimality. Hence, in our experiments we transformed optimisation into decision problems by the standard method of bounding the objective and searching for a satisfying solution. The results we obtain are very positive, as presented in Table 1.

Table 1. For each of the thirteen problem classes, fifteen instances were split 70/30 into training and test sets. All three methods received the same training budget of six hours per problem class, a cost which is amortised over the entire problem class for which the streamliners are applicable. We record the number of candidate streamliners and the mean time to solution for the test instances using non-streamlined models in columns 2 and 3. For the substantial majority of the problem classes the most effective streamliner discovered is composed of a combination of individual streamlining constraints, as presented in columns 4, 6 and 8. In these cases the Monte Carlo search method is able to discover larger combinations that yield superior results. For the problem classes where a single streamliner was found to be the most effective all methods are equally effective, as through pruning (Sect. 3.5) we were able to exhaustively search the space of all streamliner combinations. Mean reduction for both time and search nodes is measured by comparing the search required to find the first solution on the non-streamlined model with the model streamlined with the most effective streamliner combination found using the respective search method. The selected streamliners for each problem class retained satisfiability on all test instances. Through the addition of streamlining constraints we obtain a uniformly vast reduction in both search nodes and time. All computational experiments were run on a 32-core AMD Opteron 6272 at 2.1 GHz and an 8 core Intel Core i7-6920HQ at 2.90 GHz. All experiments for an individual problem class were run on a single machine.

5 Conclusion

We have presented a new method for the automated generation of streamlined constraint models from a large set of candidates via Monte Carlo search. Our method is efficient in searching the space of candidates, producing more effective streamlined models in less time than less informed approaches. Our empirical results demonstrate a vast reduction in search across a variety of benchmarks.

As part of future work we plan to explore the generation of streamlined versions of alternative models generated by Conjure . We expect the utility of particular streamlining constraints to vary depending on the model.