
1 Informal Introduction to CSP

For a formal definition please skip to the next section. A constraint satisfaction problem consists of a set of variables, and each variable must be assigned one value from its finite set of values, called its domain. A set of constraints restricts certain simultaneous assignments. In most CSPs, the objective is to search for a simultaneous assignment of all the variables such that all constraints are satisfied, i. e., no forbidden simultaneous assignment from the set of constraints is used.

A famous example is the SEND MORE MONEY puzzle, where each letter must be replaced by a unique digit such that the following sum holds [5]:

$$\mathrm{SEND} + \mathrm{MORE} = \mathrm{MONEY}.$$

In this CSP, the variables are $S, E, N, D, M, O, R, Y$; the domains are $\{1, \ldots, 9\}$ for $S$ and $M$, and $\{0, \ldots, 9\}$ for $E, N, D, O, R, Y$. The constraint can also be written as $1000 \times S + 100 \times E + 10 \times N + D + 1000 \times M + 100 \times O + 10 \times R + E = 10000 \times M + 1000 \times O + 100 \times N + 10 \times E + Y$. Every CSP $A$ can be rewritten into another CSP $B$ such that a bijective mapping exists between the solutions of $A$ and $B$, which follows from the reducibility theorem from complexity theory [6]. The solution to this CSP is the assignment $S = 9$, $E = 5$, $N = 6$, $D = 7$, $M = 1$, $O = 0$, $R = 8$, $Y = 2$, which uniquely satisfies the constraint.
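To make the search space concrete, the following minimal Python sketch solves the puzzle by brute force over all injective digit assignments; it illustrates the constraint itself, not how a practical constraint solver proceeds.

```python
# Brute-force sketch for SEND + MORE = MONEY: enumerate every injective
# assignment of digits to the eight letters and test the sum constraint.
from itertools import permutations

def solve_send_more_money():
    letters = "SENDMORY"
    for digits in permutations(range(10), len(letters)):
        a = dict(zip(letters, digits))
        if a["S"] == 0 or a["M"] == 0:  # leading digits come from {1,...,9}
            continue
        send = 1000 * a["S"] + 100 * a["E"] + 10 * a["N"] + a["D"]
        more = 1000 * a["M"] + 100 * a["O"] + 10 * a["R"] + a["E"]
        money = (10000 * a["M"] + 1000 * a["O"] + 100 * a["N"]
                 + 10 * a["E"] + a["Y"])
        if send + more == money:
            return a  # the unique solution

print(solve_send_more_money())
# {'S': 9, 'E': 5, 'N': 6, 'D': 7, 'M': 1, 'O': 0, 'R': 8, 'Y': 2}
```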

Other very well-known constraint satisfaction problems are map coloring, more commonly known as vertex coloring (Sect. 65.5.2), and the recreational game Sudoku, which is equivalent to completing a graph 9-coloring problem on a specific graph with 81 vertices. A specific EC solution is provided by Lewis [7]. Many constraint satisfaction problems exist; we will first look at CSPs in general within the context of EC as problem solvers. Then we will discuss several specific constraint satisfaction problems and the particular EC approaches applied to these problems. Last, we will provide a brief overview on using EC for generating problem instances for CSPs.

2 Formal Definitions

Slightly different, but equivalent, formal definitions of a CSP exist. The most common definition is:

Definition 65.1 (Constraint Satisfaction Problem)

A constraint satisfaction problem is a triple $\langle V, D, C \rangle$ where:

  • $V$ is an $n$-tuple of variables $V = \langle v_1, v_2, \ldots, v_n \rangle$,

  • each $v \in V$ has a corresponding $m$-tuple of values called its domain, $D_v = \langle d_1, d_2, \ldots, d_m \rangle$, from which it can be assigned one value, and

  • $C = \langle C_1, \ldots, C_t \rangle$ is a $t$-tuple of constraints, where each $c \in C$ restricts certain simultaneous variable assignments from occurring.

In the literature on generic CSPs, the definition of a constraint is often reversed: constraints are defined as the set of assignments that are allowed rather than restricted. Note that in generic CSP literature variables are often denoted by $X$, whereas in graph-oriented problem domains such as graph coloring and maximum clique, $V$ is adopted.

Definition 65.2 (Solution to a CSP)

A solution to a CSP is an assignment of variables $(d_1, \ldots, d_n) \in D_1 \times \cdots \times D_n$ such that for every constraint $c \in C$ over the variables $x_{i_1}, \ldots, x_{i_m}$: $(d_{i_1}, \ldots, d_{i_m}) \in c$.

In the context of one constraint $c$, we say an assignment of variables satisfies the constraint $c$ if the assignment is in $c$, and violates the constraint $c$ if the assignment is not in $c$. A CSP can be insoluble – more commonly written as insolvable – which means that every assignment of variables violates at least one constraint.

A constraint solver is an algorithm that takes as input a CSP and produces as output either a solution, a proof that no solution exists, or a notification of failure. The input is often referred to as a problem instance, as a CSP is often defined to cover a class of problems, such as 3-satisfiability. The output can be more than one solution; in fact, it could be every solution. However, as EC techniques are based on sampling, in principle they cannot prove that every solution has been found, which is referred to as being not complete. Moreover, they cannot prove that no solution exists, which is referred to as being not sound. Therefore, constraint solvers based on EC and other heuristic approaches often terminate after a certain criterion is met, e. g., a predefined budget is exhausted in terms of the number of solutions evaluated or the computation time spent, or a certain convergence of the population is reached.

We recommend the following books for further reading on constraint satisfaction: for the foundations of the problem and basic algorithms, Tsang [8]; for an introduction with a comprehensive overview of constraint programming techniques, Dechter [9] and Lecoutre [10]; and for a more theoretical approach, Apt [1] and Chen [11].

3 Solving CSP with Evolutionary Algorithms

In this chapter we restrict ourselves to covering the conceptual mapping required to solve a CSP with an evolutionary algorithm. This mapping consists of choosing a representation for the problem and a corresponding fitness function to determine the quality of a solution. Once this mapping is complete, the evolutionary algorithm will require other components, such as appropriate variation operators, selection mechanisms, a suitable initialization method for the population, and termination criteria. All these, and other optional variants, can be found elsewhere in the handbook.

We will explain the two most common mappings using the well-known n-queens problem on an $n \times n$ chessboard. These mappings are direct encoding and indirect encoding. First we introduce a conceptual definition of the problem.

The n-queens problem requires placing $n$ queens on an $n \times n$ chessboard such that no queen attacks any of the other $n - 1$ queens. Thus, a solution requires that no two queens share the same row, column, or diagonal. Several common formal definitions of the problem exist. The most common is to define $n$ variables $\{q_1, \ldots, q_n\}$, where each variable $q_i$ holds the row position of the queen placed in its corresponding unique column, i. e., $q_i \in \{1, \ldots, n\}$ for $i = 1, \ldots, n$. The set of constraints consists of $q_i \neq q_j$ (i. e., not in the same row) and $|q_i - q_j| \neq |i - j|$ (i. e., not on the same diagonal) for all $i \neq j$, $i, j = 1, \ldots, n$.

The n-queens problem is no longer considered a challenging problem, as it has a structure that can be exploited to solve very large instances of over 9 million queens by repeating a pattern [12]. It is, however, an excellent problem for explaining characteristics of constraint satisfaction problems and their solvers, due to its simple two-dimensional spatial nature. For instance, to explain symmetry in CSPs, the 8-queens problem can be used: of its 92 distinct solutions, only the 12 shown in Fig. 65.1 remain when removing variants due to rotational and reflection symmetry.

Fig. 65.1 The 12 unique solutions under symmetry via rotations and reflections for the 8-queens problem

3.1 Direct Encoding

With a direct encoding, the genotype consists of a vector $g$ in which each element corresponds uniquely to one variable of the CSP; an element $g_i$ contains values directly from the domain $D_i$ of its corresponding variable. A wide variety of genetic operators, both for mutation and recombination, are applicable to this encoding and can be found in [13]. Most of these operators are called discrete or mixed-integer operators.

The genotype is mapped to the phenotype by taking the constraints into consideration; this requires a measure for determining the quality of candidate solutions. Thus, we need to introduce a fitness function. The most common fitness function takes the sum of all constraints violated by a candidate solution:

$$\mathrm{fitness}(g) = \sum_{c \in C} \mathrm{violated}(c), \quad \text{where} \quad \mathrm{violated}(c) = \begin{cases} 1 & \text{if } c \text{ is violated by } g, \\ 0 & \text{if } c \text{ is satisfied by } g. \end{cases}$$

The fitness should be minimized; once it reaches zero, a solution has been found.
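As a sketch of this mapping, the following Python fragment represents an n-queens candidate as a vector of row positions (the direct encoding) and computes the fitness above by counting violated row and diagonal constraints.

```python
# Direct encoding for n-queens: g[i] is the row of the queen in column i.
# The fitness counts violated constraints and is minimized towards zero.
import random

def fitness(g):
    n = len(g)
    violations = 0
    for i in range(n):
        for j in range(i + 1, n):
            if g[i] == g[j] or abs(g[i] - g[j]) == abs(i - j):
                violations += 1  # queens i and j share a row or a diagonal
    return violations

g = [random.randint(1, 8) for _ in range(8)]  # a random 8-queens genotype
print(g, fitness(g))  # fitness 0 means g is a solution
```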

3.2 Indirect Encoding

With an indirect encoding, the genotype first needs to be transformed into a full or partial assignment of the variables of the CSP. Depending on the level of sophistication, this transformation is also referred to as local search; these transformations range from as simple as a greedy assignment all the way to sound search algorithms evaluating a small part of the CSP.

The most common approach for this representation takes as genotype a permutation of the variables of the CSP. Many genetic operators are designed to maintain a permutation, and several are explained in the Handbook of Evolutionary Computation [13]. The permutation is the input to the local search and determines the order in which variables are processed; processing a variable involves trying to assign a value such that no constraint is violated, and perhaps further steps if no value can be assigned without violating at least one constraint.
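A minimal sketch of such a decoder for n-queens: the genotype is a permutation of the columns, and a greedy local search assigns each queen the first row that violates no constraint, leaving the variable unassigned if no such row exists.

```python
# Indirect encoding sketch: decode a permutation of columns into a
# partial assignment by greedily picking the first conflict-free row.
def decode(perm, n):
    assignment = {}  # maps column -> row for the queens placed so far
    for col in perm:
        for row in range(1, n + 1):
            if all(row != r and abs(row - r) != abs(col - c)
                   for c, r in assignment.items()):
                assignment[col] = row  # first feasible value wins
                break
    return assignment  # columns absent from the dict remain unassigned

perm = [3, 0, 4, 1, 6, 2, 7, 5]          # a genotype: one order of columns 0-7
sol = decode(perm, 8)
print(sol, "unassigned:", 8 - len(sol))  # fitness = number of unassigned variables
```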

More advanced encodings may also include the order in which to consider values from each variable's domain. From constraint programming we know that the order in which variables and values are considered has a huge impact on the efficiency of search algorithms [14]; more often it is the search method that determines the order using a particular heuristic, such as choosing the next vertex with the maximum saturation degree, as used in DSatur [15]. The saturation degree of a vertex is defined as the number of distinct colors used for coloring its neighbors. The principle has been used in many algorithms since its introduction in 1979.

The most common fitness function used with an indirect encoding simply counts the number of unassigned variables after the local search terminates. Note that two different strategies will influence the resolution of this function. If the local search terminates as soon as it encounters a variable it cannot assign, then many candidate solutions will have the same fitness but can still be very different. On the other hand, terminating after all variables have been considered gives a richer landscape, but may incur more computational effort. See [16] for a comprehensive theoretical and empirical analysis of sampling in EC.

3.3 General Techniques to Improve Performance

Over the past two decades, many techniques have been developed to improve the efficiency and/or the effectiveness of EC for solving constraint satisfaction problems. Only a handful of these techniques have been evaluated on more than one problem. Hence, we cannot draw any general conclusions about the success of these techniques. Even worse, many studies show improvement only over their authors' previous results, or compare their results with an algorithm that has already been superseded in terms of performance by many other techniques. Often the set of competitor algorithms is chosen to fall within EC, which severely limits the strength of the competition. Therefore, we will discuss techniques for improving performance in the context of the problems they were developed for. Section 65.5 reviews several popular CSPs used for developing more efficient and effective evolutionary algorithms.

One approach that has been applied to several CSPs with varying success is that of assigning weights to constraints, allowing the search to be biased towards satisfying certain constraints; in the first experiments this approach was referred to as penalty functions [17]. Moreover, the search can be influenced dynamically by adapting weights according to heuristics, such as increasing the weight of the constraint that has been satisfied the least number of times recently [18]. The origin of this idea can be found in the self-adaptation used in evolution strategies [19].

With penalty functions, the optimization objectives replacing the constraints are traditionally viewed as penalties for constraint violation, hence to be minimized [20]. There are two basic types of penalties:

  1. Penalty for violated constraints

  2. Penalty for wrongly instantiated variables.

Formally, let us assume that we have constraints $c_i$ ($i = 1, \ldots, m$) and variables $v_j$ ($j = 1, \ldots, n$). Let $C_j$ be the set of constraints involving variable $v_j$. Then the penalties relative to the two options described above can be expressed as follows:

  1. $f_1(s) = \sum_{i=1}^{m} w_i \times \chi(s, c_i)$, where

     $$\chi(s, c_i) = \begin{cases} 1 & \text{if } s \text{ violates } c_i \\ 0 & \text{otherwise;} \end{cases}$$

  2. $f_2(s) = \sum_{j=1}^{n} w_j \times \chi(s, C_j)$, where

     $$\chi(s, C_j) = \begin{cases} 1 & \text{if } s \text{ violates at least one } c \in C_j \\ 0 & \text{otherwise,} \end{cases}$$

where the $w_i$ and $w_j$ are weights that correspond to a constraint and a variable, respectively. These weights will become important later on; for now we assume they all equal 1.

Obviously, for each of the above functions $f \in \{f_1, f_2\}$ and for each $s \in S$ we have that $\phi(s) = \mathrm{true}$ if and only if $f(s) = 0$. For instance, in the graph 3-coloring problem the vertices of a given graph $G = (V, E)$, $E \subseteq V \times V$, have to be colored with three colors in such a way that no neighboring vertices, i. e., graph nodes connected by an edge, have the same color. This problem can be formalized by means of a CSP with $n = |V|$ variables, each with the same domain $D = \{1, 2, 3\}$. Furthermore, we have $m = |E|$ constraints, one for each edge $e = (k, l) \in E$, with $c_e(s) = \mathrm{true}$ if and only if $s_k \neq s_l$. The corresponding CSP is then $\langle S, \phi \rangle$, where $S = D^n$ and $\phi(s) = \bigwedge_{e \in E} c_e(s)$. Using the constraint-oriented penalty function $f_1$ with $w_i = 1$ for all $i = 1, \ldots, m$ amounts to counting the incorrect edges that connect two vertices with the same color. The variable-oriented penalty function $f_2$ with $w_j = 1$ for all $j = 1, \ldots, n$ amounts to counting the incorrect vertices that have a neighbor with the same color.
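The following Python sketch implements both penalty functions for graph 3-coloring with all weights equal to 1, matching the two counting interpretations just described.

```python
# Penalty functions for graph 3-coloring: f1 counts monochromatic edges
# (violated constraints), f2 counts vertices that have a same-colored
# neighbor (wrongly instantiated variables). All weights are 1 here.
def f1(s, edges):
    return sum(1 for (k, l) in edges if s[k] == s[l])

def f2(s, edges):
    bad = {v for (k, l) in edges if s[k] == s[l] for v in (k, l)}
    return len(bad)

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]  # a small example graph
s = {0: 1, 1: 2, 2: 1, 3: 1}              # a candidate 3-coloring
print(f1(s, edges), f2(s, edges))         # 2 incorrect edges, 3 incorrect vertices
```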

Advantages of the penalty function approach:

  • It is generic, e. g., $f_1$ and $f_2$ are problem-independent penalty functions

  • It reduces the problem to simple optimization

  • It allows user preferences to be expressed by weights.

Disadvantages of the penalty function approach:

  • Loss of information by packing everything into a single number

  • In the case of constrained optimization (as opposed to the CSPs we are handling here) $f_1$ and $f_2$ are reported to be weak [21].

4 Performance Indicators

An understanding of efficiency and effectiveness is vital when choosing which solver to use or when developing an algorithm to deal with a specific CSP. In this section we briefly explain measures for determining these properties in the context of solving CSPs. However, these properties must be measured using a suite of benchmark instances and, as EAs are generally randomized algorithms, with multiple independent runs of the algorithm on each instance. Choosing an appropriate suite of benchmark instances is paramount to making decisions on which algorithm, parameter setting, or next algorithmic feature to add.

In a sense, the search for a good algorithm is in itself an optimization problem. The suite of benchmark instances only represents the problem, just as training data in a machine learning problem represents all the data possibly encountered. Changing an algorithm and tuning its parameters on the same small suite of instances can lead to over-fitting [22, 23], which in turn means the algorithm will perform worse in the general case. Therefore, the first step should be to characterize the problem well and have a good representation, e. g., a good spread, of the instances possibly encountered when the algorithm is deployed.

4.1 Efficiency

The time taken by an algorithm to provide a solution is an important factor, even more so in situations where solutions are required in real time. Much research is devoted to speeding up algorithms, whether by cleverly exploiting properties of the problem, by parallelization, or by trading off aspects of solution quality.

The most common approach to measuring the efficiency of evolutionary algorithms is to count the number of evaluations, i. e., the number of times the fitness function is executed. This approach has several drawbacks. First, it allows comparison only with algorithms that use the exact same fitness function and spend the most significant part of their time computing that function. Second, the computational complexity of the evolutionary algorithm may not depend on the fitness function. For instance, with the indirect encoding described in Sect. 65.3.2, much computational effort goes into the local search, whereas the computation of the fitness is trivial.

Another common approach is to measure the time spent as reported by the operating system. This has even more drawbacks, as the reported numbers depend on the programming language used for implementing the algorithm, the compiler and its settings for translating the implementation into machine code, the architecture of the computer executing the machine code, and the operating system hosting the execution environment. Variations in any of these will have an effect on the reported results; moreover, as these environments themselves change over time, future studies will find it hard to reproduce results accurately or even to make meaningful comparisons with reported results.

A more meaningful solution is to count all the atomic operations that are directly related to the problem. The operations included should be those whose number in theory grows exponentially with larger problems, as CSPs fall under the class of nondeterministic polynomial-time (NP) problems. The most common such operation is the conflict check; this is also referred to as a constraint check, but in the strictest sense a constraint check consists of multiple conflict checks [8]. For example, when solving the n-queens problem, every time the algorithm checks $q_i \neq q_j$ for any $q_i$ and $q_j$, this should be recorded as one check. The same procedure should be followed for the constraint concerning diagonal attacks, $|q_i - q_j| \neq |i - j|$. The sum of all checks when the algorithm terminates is the computational effort spent.

By reporting the number of conflict checks we ensure that future studies can compare with current results, as this measurement is not affected by future changes in hardware and software environments. We are measuring a property of the algorithm, as opposed to a property of one implementation of the algorithm running in one particular environment.
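As a sketch, conflict checks can be counted by instrumenting the primitive checks themselves, so that the reported effort is independent of hardware and implementation details. The n-queens checks below follow the two constraints defined in Sect. 65.3.

```python
# Count conflict checks for n-queens: every row test and every diagonal
# test is recorded as one conflict check, regardless of its outcome.
class CheckCounter:
    def __init__(self):
        self.count = 0

    def no_attack(self, qi, qj, i, j):
        self.count += 1
        row_ok = qi != qj                     # one conflict check
        self.count += 1
        diag_ok = abs(qi - qj) != abs(i - j)  # another conflict check
        return row_ok and diag_ok

counter = CheckCounter()
g = [1, 5, 8, 6, 3, 7, 2, 4]  # a known 8-queens solution
ok = all(counter.no_attack(g[i], g[j], i, j)
         for i in range(8) for j in range(i + 1, 8))
print(ok, counter.count)  # True 56: 28 pairs, two checks per pair
```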

It is important to note that there are subtle differences in the reporting used in different studies. Some studies report the average number of operations over all independent runs, including runs that are unsuccessful, i. e., where no solution was found. Other studies report the average number of operations to a solution, where only the runs that yield a solution are taken into account. The former method will produce higher averages than the latter if the success rate is less than 1.

4.2 Effectiveness

Efficiency is only one aspect by which to measure the success of a constraint solver. The other most important aspect is effectiveness, which measures how successful an algorithm is in finding or approximating a solution. The easiest and most commonly used measurement is the success rate, which is defined for an experiment as the number of runs in which an algorithm finds a solution divided by the total number of runs of the same algorithm in that experiment. As no prior knowledge is required about whether problem instances are insolvable, this measurement is straightforward to implement.

Another popular measurement in combinatorial optimization is the distance to the optimal solution. This measurement poses two challenges in the context of constraint satisfaction. Unlike a combinatorial optimization problem, which has a function to optimize, a CSP has no such function. As an alternative we could use the fitness function, but that is not an inherent property of the problem. Also, we often do not know whether a CSP has a solution, and when it does not, we do not know the optimal fitness value. Due to these impracticalities, distance to the optimal solution is rarely used when solving CSPs.

5 Specific Constraint Satisfaction Problems

Many specific constraint satisfaction problems have been addressed in the literature. A full overview of these would not provide much benefit, as the most likely scenario is that one is looking for papers that describe algorithms and their results on a certain problem. The exceptions are several problems that are used in the literature to drive the development of algorithms in terms of efficiency and effectiveness. These core problems are used over and over to test whether new algorithms are better than existing ones.

Several reasons exist for the choice of these problems. Their compact definitions mean that the problems are easy to replicate and quick to introduce in papers. The most popular problems were in use in the 1970s when the theory of nondeterministic polynomial-time (NP) problems was developed, and they were consequently seen as important intelligent building blocks. Also, test sets and later problem generators were released into the public domain, thereby providing easy access to test suites.

We will use several of these core problems to describe the progress of development in evolutionary computation for constraint satisfaction problems. For each problem we will provide a quick introduction, a justification of its importance in terms of practical applications, and a set of pointers to problem suites before describing the approaches used.

5.1 Boolean Satisfiability Problem

Given a Boolean formula ϕ, determine whether an assignment of the variables in ϕ exists that makes it TRUE. This problem is often referred to as satisfiability and abbreviated to SAT [24]. In SAT, a variable or its negation is referred to as a literal. Most often the problem is studied in conjunctive normal form (CNF), where ϕ is a conjunction of clauses and each clause is a disjunction of literals. Every SAT problem can be reduced to a 3-CNF-SAT (three variables per clause, conjunctive normal form, satisfiability) problem [25], where each clause has three literals.

3-CNF-SAT was the first problem to be shown to be NP-complete [26]. It serves as an important basis for proving that other problems are NP-complete, such as the maximum clique problem. Such a proof involves a polynomial-time reduction from 3-CNF-SAT to the other problem [6].

The following is an example of 3-CNF-SAT:

  • $\phi = (x_1 \vee \neg x_3 \vee x_4) \wedge (\neg x_2 \vee x_1 \vee \neg x_6) \wedge (x_3 \vee x_2 \vee \neg x_5)$

  • A solution: $x_1 = 1$, $x_2 = 0$, $x_3 = 1$, $x_4 = 0$, $x_5 = 0$, $x_6 = 0$.

Important practical applications of SAT are model checking [27], for example in mathematical proof planning [28]; generic planning problems, especially using the planning domain definition language (PDDL) [29]; test pattern generation [30]; and haplotyping in the scientific field of bioinformatics [31].

As far as the development of efficient and effective CSP solvers goes, SAT is the most active field. It has an annual conference – the International Conference on Theory and Applications of Satisfiability Testing – which also hosts an annual competition to determine the current best solvers. The latter also ensures that new problem instances are continuously added, which prevents what is called overfitting [32] of the solvers to an existing set of problem instances.

The general approach to solving satisfiability with EC is to represent the variables in ϕ directly and assign each either TRUE or FALSE, i. e., these two values form the domain. The fitness function used is the number of clauses violated, which should be minimized.
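A minimal sketch of this fitness function, using the DIMACS convention of signed integers for literals, applied to the 3-CNF-SAT example given above:

```python
# EC fitness for CNF-SAT: count the violated clauses (to be minimized).
# Clauses are tuples of signed integers; -3 stands for "not x3".
def violated_clauses(clauses, assignment):
    def literal_true(lit):
        value = assignment[abs(lit)]
        return value if lit > 0 else not value
    return sum(1 for clause in clauses
               if not any(literal_true(l) for l in clause))

# The example formula phi and its solution from the text:
clauses = [(1, -3, 4), (-2, 1, -6), (3, 2, -5)]
assignment = {1: True, 2: False, 3: True, 4: False, 5: False, 6: False}
print(violated_clauses(clauses, assignment))  # 0: all clauses satisfied
```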

The earliest evolutionary algorithm for SAT was reported in 1994 [33] and was soon followed by the work of Gottlieb and Voss [34, 35], who sought to improve its performance. Soon after, independent efforts led to parallelized algorithms [36, 37]. In 2000, the first adaptive evolutionary algorithms were applied [38], three years after they were applied to graph coloring (Sect. 65.5.2).

The introduction of hybrid evolutionary algorithms with local search created a real boost in research activity [39, 40, 41, 42, 43]. However, a major issue remains with research on solving satisfiability with EC: studies compare only against local search and evolutionary algorithms, without comparing to the state-of-the-art DPLL-based and heuristic solvers from the annual satisfiability competitions. This holds true even for recent studies such as [44]. Due to this major gap between the EC and CP communities, we do not comment on the comparison in terms of effectiveness and efficiency.

Newer research [45] focuses on using EC to evolve parameter settings for existing sound SAT solvers, mostly ones based on the Davis–Putnam–Logemann–Loveland (DPLL) algorithm [46]. All modern SAT solvers have many parameters that tune how the search is organized. These parameters are often tuned manually, which allows for only a small exploration. Using EC, a much larger space can be explored in order to create fast SAT solvers for a given benchmark.

5.2 Graph Coloring

Graph coloring has several variants. The most commonly used definition is that of graph k-coloring, also known as the vertex coloring problem. Given a graph $\langle V, E \rangle$ of vertices and edges, the goal is to find a coloring of the vertices $V$ such that no two adjacent vertices have the same color: if $c(v)$ denotes the color assigned to $v$, then $c(v) \neq c(w)$ for every edge $(v, w) \in E$. The objective is to use $k$ or fewer colors. The problem is known to be NP-complete for $k \geq 3$ and to be decidable in linear time for $k \leq 2$.

Graph coloring is an abstract problem that lies at the core of many applications. Well-known applications are scheduling, most specifically timetabling [47], register allocation in compilers [48], and frequency assignment in wireless communication [49]. It is a well-studied problem, as shown by the best-kept bibliography source, which by April 2010 listed over 450 publications contributing to vertex coloring [50].

The Second DIMACS Implementation Challenge in 1992–1993 focused on maximum clique, graph coloring, and satisfiability. The challenge provided not only a standard format for graph k-coloring problem instances, but also a set of problem instances that is still popular today. Soon after, in 1994, Culberson and Luo [51] created a problem instance generator, which can create problem instances with a known $k$ and various other properties. Several other generators with specific goals exist, such as hiding cliques [52], creating register-interference graphs [53], and creating timetabling problems (Sect. 65.5.4).

The most straightforward approach to solving graph k-coloring with EC is to represent a genome as a vector of all variables of the problem. This vector can then undergo genetic operators suitable for integer representations. The fitness function is simply the number of violated constraints, which should be minimized; a solution is found when the fitness equals zero. Unfortunately, this approach leads to algorithms that are inefficient and ineffective [54].

To make EC more efficient and effective for solving graph k-coloring, new algorithms have been developed; these broadly fall into two categories. The first category consists of mechanisms that prevent stagnation of the search due to premature convergence. The second category consists of alternative representations that use decoders to map genotypes to phenotypes. The two categories are not mutually exclusive, and studies have included algorithms that combine mechanisms from both.

The earliest work on solving graph k-coloring with EC includes the following. Fleurent and Ferland successfully considered various hybrid evolutionary algorithms with tabu search [55] and extended their work into a general implementation of heuristic search methods [56]. Von Laszewski looked at structured operators and used adaptation to improve the convergence rate of a genetic algorithm [57]. Davis designed an algorithm [58] to maximize the total weight of nodes in a graph colored with a fixed number of colors. Coll et al. [59] discussed graph coloring and crossover operators in a more general context.

Juhos and van Hemert introduced several heuristics [60, 61] for guiding the search of an evolutionary algorithm. All these heuristics depend on their novel representation, which collapses the graph by merging nodes assigned the same color into one hypernode, thereby speeding up further constraint checking as edges are merged into hyperedges [62]. This representation benefits both complete and heuristic methods.

Moreover, as shown in the results in Fig. 65.2, the evolutionary algorithms developed by Juhos and van Hemert are able to outperform a complete method (Backtracking-DSatur) on very difficult problem instances where the chromatic number is 10 or 20. These algorithms are unable to compete with the complete method for smaller chromatic numbers of 3 and 5.

Fig. 65.2a–d Results of several evolutionary algorithms against the complete method Backtracking-DSatur; average minimum number of colors used through the phase transition

5.3 Binary Constraint Satisfaction Problems

A binary constraint satisfaction problem (BINCSP) is a CSP where every constraint $c \in C$ restricts at most two variables [63]. Often, network graphs are used to visualize CSP instances. In Fig. 65.3, we provide an example of a restricting hypergraph of a BINCSP. It consists of three variables $V = \{v_1, v_2, v_3\}$, all of which have the domain $D = \{a, b\}$. In the hypergraph, every vertex corresponds to a possible variable assignment $\langle v, d \rangle$, where $v \in V$ and $d \in D_v$. Every edge indicates a pair of variable assignments that is forbidden by the set of constraints $C$. In the example, we show all the edges that correspond to the following set of forbidden value pairs: $C = \{ \{\langle v_1, a \rangle, \langle v_2, a \rangle\}, \{\langle v_1, a \rangle, \langle v_3, b \rangle\}, \{\langle v_1, b \rangle, \langle v_2, a \rangle\}, \{\langle v_1, b \rangle, \langle v_2, b \rangle\}, \{\langle v_1, b \rangle, \langle v_3, a \rangle\}, \{\langle v_1, b \rangle, \langle v_3, b \rangle\}, \{\langle v_2, a \rangle, \langle v_3, a \rangle\}, \{\langle v_2, a \rangle, \langle v_3, b \rangle\} \}$.

Fig. 65.3 Example of a $|V|$-partite hypergraph of a BINCSP with one solution: $\{\langle v_1, a \rangle, \langle v_2, b \rangle, \langle v_3, a \rangle\}$

For problem instances, studies on BINCSP generally create large sets of instances using one of many problem instance generators. Several models to randomly create BINCSPs have been designed and analyzed [63, 64, 65]. All of these incorporate a set of parameters that control the size and difficulty of the problems. Often, these parameters can be used to create a set of problems that goes through a phase transition. That is, we order the set on the parameters and observe how the algorithms behave as we move through the parameter space. For most constraint satisfaction problems we observe that the performance drops gradually until it reaches a minimum, after which it rises again. Most researchers test their algorithms in the region around this minimum, as this is where the most difficult problem instances are found. We will discuss these models next.

The model most often used in empirical research on binary constraint satisfaction problems uses four parameters to control, to some degree, the difficulty of an instance. By varying these global parameters one can characterize instances that are likely to be more or less difficult to solve. These parameters are: the number of variables $n = |V|$, the size of each variable's domain $m = |D_{v_1}| = |D_{v_2}| = \cdots = |D_{v_n}|$, the density of constraints $p_1$, and the average tightness of all the constraints $p_2$. There are two ways of looking at the parameters $p_1$ and $p_2$. We will use the following definitions.

Definition 65.3 (Density)

The density of a BINCSP is the ratio between the actual number of constraints $|C|$ and the maximum number of constraints $\binom{n}{2}$:

$$p_1 = \frac{|C|}{\binom{n}{2}}.$$

Definition 65.4 (Tightness)

The tightness of a constraint $c \in C$ over the variables $v, w \in V$ of a BINCSP $\langle V, D, C \rangle$ is the ratio between the number of forbidden variable assignments $|c|$ and the total number of possible combinations of variable assignments $|D_v| \times |D_w| = m^2$:

$$p_2(c) = \frac{|c|}{m^2}.$$

Definition 65.5 (Average Tightness)

The average tightness of a BINCSP $\langle V, D, C \rangle$ is the sum of the tightness over all constraints divided by the number of constraints:

$$p_2 = \frac{\sum_{c \in C} p_2(c)}{|C|}.$$
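Assuming constraints are stored as a mapping from variable pairs to their sets of forbidden value pairs, Definitions 65.3–65.5 translate directly into code; the example below reproduces the instance of Fig. 65.3.

```python
# Compute the density p1 and the average tightness p2 of a BINCSP whose
# constraints map a pair of variables to its set of forbidden value pairs.
def density(n, constraints):
    return len(constraints) / (n * (n - 1) / 2)  # |C| over C(n, 2)

def avg_tightness(constraints, m):
    return sum(len(nogoods) / m ** 2
               for nogoods in constraints.values()) / len(constraints)

# The instance of Fig. 65.3: three variables, all with domain {a, b}.
constraints = {
    (1, 2): {("a", "a"), ("b", "a"), ("b", "b")},
    (1, 3): {("a", "b"), ("b", "a"), ("b", "b")},
    (2, 3): {("a", "a"), ("a", "b")},
}
print(density(3, constraints), avg_tightness(constraints, 2))  # 1.0 0.666...
```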

These definitions give the density and tightness as ratios, or in other words, as percentages of the maximum. Another way of looking at these two properties uses probabilities [66]. We could define the density of a BINCSP as the probability that a constraint exists between two variables. The tightness can be defined in an analogous way, as the probability that a conflict exists between two instantiations of two variables. The difference between these viewpoints becomes apparent in the different implementations of algorithms that generate BINCSPs: with uniform generation the ratio in an instance is fixed beforehand, while with probabilities the realized ratio varies around the chosen value. When comparing studies in which probabilities are used, it is important to know whether the reported results are given against the probability that was set or against the actual measured ratio in the whole instance.

The simplest way to empirically test the performance of an algorithm on solving CSPs is to generate instances using different settings of the four main parameters $n$, $m$, $p_1$, and $p_2$. However, there are two ways of choosing where to put constraints in a constraint network. We can choose the number of constraints beforehand and then distribute them uniformly over the constraint network. Alternatively, we can decide for each possible edge in the constraint network with probability $p_1$ whether this edge is inserted, i. e., whether a constraint is added. We call the first the uniform model and the second the probability model. The same categorization holds for nogoods: given a constraint, we can either distribute $p_2 m^2$ nogoods uniformly or decide with probability $p_2$ which value pairs become nogoods. This gives four different models, which we name according to the models in [63, 65]; a sketch of the two ways of placing constraints follows below. The models are shown in Table 65.1.
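The difference between the two placement strategies is easiest to see in code. The following minimal Python sketch generates only the constraint network; the same two choices apply analogously to the nogoods within each constraint.

```python
# Uniform model: fix the number of constraints beforehand and spread them
# uniformly. Probability model: insert each possible edge with chance p1.
import random
from itertools import combinations

def uniform_network(n, p1):
    pairs = list(combinations(range(n), 2))
    return random.sample(pairs, round(p1 * len(pairs)))  # exact count

def probability_network(n, p1):
    return [pair for pair in combinations(range(n), 2)
            if random.random() < p1]  # count varies around p1 * C(n, 2)

print(len(uniform_network(20, 0.3)), len(probability_network(20, 0.3)))
```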

Definition 65.6 (Parameter Vector of a BINCSP)

The parameter vector of a binary constraint satisfaction problem (BINCSP) with $n$ variables and $m$ as each variable's domain size is a 4-tuple $\langle n, m, p_1, p_2 \rangle$ of four parameters: the number of variables $n$, the domain size $m$ of each variable, the density $p_1$, and the average tightness $p_2$.

We can also characterize a set of binary constraint satisfaction problems using the parameter vector, as a set $B$ of BINCSP instances where

$$\forall \langle n, m, p_1, p_2 \rangle, \langle n', m', p_1', p_2' \rangle \in B: \quad n = n' \,\wedge\, m = m' \,\wedge\, p_1 = p_1' \,\wedge\, p_2 = p_2'.$$

Such a set we call a suite of problem instances.

Table 65.1 Different models for the general method for generating binary constraint satisfaction problems

Achlioptas et al. prove in [64] that, as the number of variables becomes large, almost all instances created by Models A–D become unsolvable. The reason lies in the existence of flawed variables. Whenever a variable $v$ is involved in a constraint and has all its values incompatible with a value of an adjacent variable $w$, this variable is called flawed. In terms of compound labels, using the constraint $c$ over variables $v$ and $w$, this is written as

$$\exists c \in C: \exists \bar{w} \in D_w: \forall \bar{v} \in D_v: \neg\,\mathrm{satisfies}((\langle v, \bar{v} \rangle \langle w, \bar{w} \rangle), c).$$

When the number of variables is increased without changing the other parameters, the number of flawed variables increases, making it easy to prove that instances have no solution. To overcome this problem, a new model was proposed [64]:

Definition 65.7 (Model E)

The graph $C_\Pi$ is a random $n$-partite graph with $m$ nodes in each part, constructed by uniformly, independently, and with repetitions selecting $p_e \binom{n}{2} m^2$ edges out of the $\binom{n}{2} m^2$ possible ones.

The idea behind this model is that the difficulty is controlled by the tightness and is not influenced by the structure of the constraint network. The parameter $p_e$ is responsible for the average tightness of the BINCSP. However, it is not the same as the average tightness $p_2$: because repetitions are allowed in the process, we end up with an average tightness smaller than or at most equal to $p_e$.
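Under this definition, Model E can be sketched as follows; because conflict edges are drawn with repetition, duplicates collapse and the realized average tightness stays at or below $p_e$.

```python
# Model E sketch: draw p_e * C(n,2) * m^2 conflict edges uniformly,
# independently, and with repetition from all possible ones.
import random
from itertools import combinations

def model_e(n, m, p_e):
    pairs = list(combinations(range(n), 2))  # the C(n, 2) variable pairs
    total = len(pairs) * m * m               # all possible conflict edges
    conflicts = set()                        # duplicates collapse here
    for _ in range(round(p_e * total)):
        v, w = random.choice(pairs)
        conflicts.add(((v, random.randrange(m)), (w, random.randrange(m))))
    return conflicts  # the set of nogoods (forbidden value pairs)

nogoods = model_e(10, 5, 0.3)
print(len(nogoods))  # at most round(0.3 * 45 * 25) = 338; repetitions reduce it
```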

The parameter $p_e$ also influences the value of $p_1$. In [65] we find a proof that using Model E with even fairly small values ($p_e < 0.05$) results in a fully connected constraint network ($p_1 = 1$). This is seen as a flaw in Model E, as many problems do not require a fully connected constraint network. This has led to yet another model.

MacIntyre et al. propose a more generalized version of Model E called Model F [65]. This model starts out the same way as Model E by generating $p_1 p_2 \binom{n}{2} m^2$ nogoods. Afterwards, a constraint network is generated with exactly $p_1 \binom{n}{2}$ edges in the uniform way. All nogoods that do not fall within a constraint of the constraint network are removed from the problem instance. Model E is the special case of Model F where $p_1 = 1$. The benefit of Model F is the ability to generate problems where $p_1 < 1$, which is more realistic with respect to real-world problems.

Craenen et al. [67] present the largest comparison study of EC and CP approaches for the BINCSP. In this study they compare the success rate and the average number of conflict checks to a solution of 11 evolutionary algorithms. The best four evolutionary algorithms are compared with forward checking with conflict-directed backjumping [68], and the authors conclude that the latter has superior performance on every problem instance in the benchmark.

The following heuristic approaches are included in the study. In [69, 70], Eiben et al. propose to incorporate existing CSP heuristics into genetic operators. A study on the performance of these heuristic-based operators when solving binary CSPs was published in [71]. Two heuristic-based genetic operators are specified: an asexual operator that transforms one individual into a new one and a multi-parent operator that generates one offspring using a number of parents. In [72, 73, 74], Riff-Rojas introduced an EA for solving CSPs that uses information about the constraint network in the fitness function and in the genetic operators (crossover and mutation). The fitness function is based on the notion of the error evaluation of a constraint. Marchiori et al. introduced and investigated EAs for solving CSPs based on pre-processing and post-processing techniques [75, 76, 77]. Included in the comparison is the variant [75, 78] that transforms constraints into a canonical form in such a way that there is only one single (type of) primitive constraint; we call this algorithm glass-box. This approach is used in constraint programming, where CSPs are given in implicit form by means of formulas of a given specification language. In [79, 80], Handa et al. formulate a coevolutionary algorithm in which a population of schemata is parasitic on the host population. Schemata in this algorithm are individuals in which a portion of the variables have values while all other variables carry do-not-care symbols, represented by asterisks.

The following approaches with an emphasis on adaptive features are included in the comparison: a co-evolutionary approach invented by Paredis and evaluated on different problems, such as neural net learning [81], constraint satisfaction [81, 82], and searching for cellular automata that solve the density classification task [83]. Furthermore, results on the performance of the co-evolutionary approach on binary CSPs are reported in [84, 85]. In the co-evolutionary approach for CSPs, two populations evolve according to a predator–prey model: a population of candidate solutions and a population of constraints. In the approach proposed by Dozier et al. in [86], and further refined and applied in [87, 88, 89], information about the constraints is incorporated both in the genetic operators and in the fitness function. In the microgenetic iterative descent algorithm, the fitness function is adaptive and employs Morris' breakout-creating mechanism [90] to escape from local optima. The stepwise adaptation of weights mechanism was introduced by Eiben and van der Hauw [91, 92] as an improved version of the weight adaptation mechanism of Eiben et al. [93, 94]. The approach has been studied in several comparisons and often proved to be a robust technique for solving several specific CSPs [95, 96, 97]. A comprehensive study of different parameters and genetic operators can be found in [98]. The basic idea is that constraints that are still not satisfied, or variables still causing constraint violations, after a certain number of steps must be hard, and thus must be given a high weight (penalty) in the fitness function.
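A minimal sketch of the stepwise adaptation of weights idea is given below; the update rule, the increment, and the toy constraints are assumptions for illustration only.

```python
# Stepwise adaptation of weights (SAW) sketch: constraints still violated
# by the current best individual are deemed hard and gain weight, so the
# weighted fitness increasingly steers the search towards satisfying them.
def weighted_fitness(s, constraints, weights):
    return sum(w for c, w in zip(constraints, weights) if c(s))

def saw_update(best, constraints, weights, delta=1):
    return [w + delta if c(best) else w
            for c, w in zip(constraints, weights)]

# Toy example: a constraint is a predicate that is True when VIOLATED.
constraints = [lambda s: s[0] == s[1], lambda s: abs(s[0] - s[2]) == 2]
weights = [1, 1]
best = [3, 3, 1]  # violates both constraints
weights = saw_update(best, constraints, weights)
print(weights, weighted_fitness(best, constraints, weights))  # [2, 2] 4
```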

5.4 Examination Timetabling

Examination timetabling has been studied for many years, as it is a common problem in many organizations. Already in 1986, Carter gave an extended survey of work on automated timetabling [99]. He is also responsible for providing problem instances that are still available and popular today [100], although a more diverse benchmark is used in the annual timetabling competition [101]. Burke et al. provide the most extensive recent surveys of automated timetabling in [102, 103]. Examination timetabling is just one of many problems under the topic of timetabling [104].

Timetabling as a problem has many different definitions due to different kinds of constraints and objectives. The definition most relevant for constraint satisfaction is often referred to as examination timetabling. The most abstract definition simply consists of a matrix $C$ where $C_{i,j} = 1$ if exam $i$ conflicts with exam $j$ by having common students that must take both exams, and $C_{i,j} = 0$ otherwise. This definition is equivalent to a graph coloring problem if the objective is to minimize the number of exam slots required, where the number of slots equals the number of colors required for coloring the graph with adjacency matrix $C$. Hence, an appropriate approach to performance testing is via graph coloring instances based on examination timetabling, such as the problem instances labeled SCH (school) in the graph coloring instance suite provided by Lewandowski [105].

Many problem instances and problem instance generators exist. Periodically, an International Timetabling Competition is organized by the International Series of Conferences on the Practice and Theory of Automated Timetabling. At each event, a different definition of timetabling problems is tackled. The differences between definitions lie in the objectives and in the soft and hard constraints used. Hard constraints are treated the same as in constraint satisfaction, whereas soft constraints may be violated but will either incur an additional penalty on the objective function or be used otherwise to prioritize solutions, for instance using a Pareto front. Corne et al. [106] identified five categories of constraints: unary, binary, capacity, event spread, and agent preference.

Three approaches exist to solving timetabling problems. The first approach is called one-stage optimization. It aggregates all types of constraints of one problem, often by summation, into one objective function in which each type is assigned a weight. The advantage is that, in principle, the approach can be applied to any set of constraints. In practice, it may prove difficult to optimize such a function. Representations of the problem fall into the two main categories of direct encoding (Sect. 65.3.1) [107] and indirect encoding (Sect. 65.3.2) [106, 108].

The second approach is called two-stage optimization. It first solves the problem of finding a feasible solution in which all the hard constraints are satisfied. In the second stage it searches within the space delimited by these hard constraints and optimizes only against the soft constraints. The benefit is that during the second stage we do not have to distinguish between feasible and infeasible solutions and, therefore, are not in danger of the search wandering off into an infeasible part of the search space. Thompson and Dowsland [109] were the first to report on this approach using simulated annealing, closely followed by the first EA by Yu and Sung [110].

The third approach uses relaxation of constraints . Typically, relaxation in timetabling is achieved by not assigning events to slots or by adding additional time slots. An early example of an EA is by Burke etal [111], where an indirect encoding is used and additional time slots are used to relax the problem.

6 Creating Rather than Solving Problems

So far we have covered evolutionary computation for solving CSPs. A contrasting idea, first proposed for constraint satisfaction in [112], is to use evolutionary computation to generate problem instances. Such an approach allows a search for problem instances that adhere to certain properties, as long as these properties can be measured efficiently by a fitness function.

A straightforward use of such an approach is to evolve problem instances that are difficult to solve for a particular algorithm. By measuring the efficiency with which an algorithm solves instances of a certain problem, we can change the instances with the aim of decreasing that efficiency. Measurements for the efficiency of EC for CSPs are discussed in Sect. 65.4.1. It is important to note that the algorithm we are evolving problem instances for can be of any kind, as long as we can execute it on the generated problem instances and measure its efficiency.

Such hard problem instances identify the weak spots of the algorithm that tries to solve them. Moreover, if we can characterize a set of problem instances whose members are all hard for an algorithm, then we can use that characterization to decide which algorithm is suitable for solving a new problem instance – provided that obtaining the characteristics of an instance takes less effort than solving the instance itself [113].

6.1 Evolving Binary Constraint Satisfaction Problem Instances

The first application to constrained problems was for the binary constraint satisfaction problem (Sect. 65.5.3), where problem instances are represented as a binary vector with each element corresponding to one entry of a conflict matrix between two variables [114]. Even the small instances investigated in the study led to large vectors, i. e., with 15 variables, each with a domain of size 15, the corresponding vector has $\binom{15}{2} \times 15^2 = 23625$ elements. Results with problem instances of this size show that instances can be created that are far more difficult to solve than any in a much larger set of randomly generated instances [112]. Furthermore, analysis of these instances provides insight into what structure is responsible for making instances difficult for the algorithm; two well-known algorithms from constraint programming were tested: chronological backtracking [115] and forward checking with conflict-directed backjumping [116].

6.2 Evolving Boolean Satisfiability Problem Instances

In [114] an evolutionary algorithm is used to evolve solvable Boolean satisfiability problem instances that are in conjunctive normal form and have three variables per clause. A 3-SAT problem is represented by a list of natural numbers. A number in the list, i. e., a gene, corresponds to a unique clause with three different literals. The number of possible unique clauses depends on the number of variables and the size of the clause; here, the number of variables is set to 100 and the size of each clause is 3, giving 1313400 unique clauses. This representation has strong advantages over a simple one-gene-for-every-literal approach. Most importantly, it prevents duplicate variables in clauses, which reduces the state space and could otherwise introduce trivial clauses, e. g., $(x \vee \neg x \vee y)$, or 2-SAT clauses, e. g., $(x \vee x \vee y)$. Also, the variation operators simply become mutation and uniform crossover for lists of natural numbers over a fixed domain.
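To illustrate, a gene can be decoded into a clause as in the hypothetical sketch below; this is an assumption for illustration only, as the exact enumeration scheme (and hence the exact clause count) used in [114] may differ.

```python
# Hypothetical decoding of a gene (a natural number) into a unique clause
# with three distinct variables; three sign bits select the negations.
from itertools import combinations

TRIPLES = list(combinations(range(1, 101), 3))  # 100 variables

def decode_clause(gene):
    triple = TRIPLES[gene // 8]  # which three variables appear
    signs = gene % 8             # which of the 2^3 negation patterns
    return tuple(v if (signs >> k) & 1 else -v
                 for k, v in enumerate(triple))

print(decode_clause(42))  # (-1, 2, -8) in DIMACS-style signed notation
```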

Two problem solvers from the annual SAT competition [117] are used; both are based on the Davis–Putnam procedure [4]. zChaff [118] is based on Chaff [119], a SAT solver that employs a particularly efficient implementation of Boolean constraint propagation and a novel low-overhead decision strategy. Relsat [120] is explained in [121, 122]. In both solvers, the number of instantiation states enumerated is counted to determine the search effort required.

The change of certain structural properties over the duration of evolution was analyzed. Two established properties were used: the number of solutions [123, 124] and the backbone size [125]. No clear relationship was identified with these properties.

However, a new relationship was identified: when problem instances become more difficult to solve, the variance in the frequency of variable usage decreases. In other words, the distribution of variables throughout an instance is more uniform when the instance is more difficult to solve.

6.3 Further Investigations

The application of evolutionary computation in problem generation is widespread. Smith-Miles and Lopes [126] provide an extensive review in terms of measuring instance difficulty in combinatorial optimization problems, which also discusses studies that evolve problem instances for constrained optimization as well as for constraint satisfaction problems.

The maximization of the effort required to solve a problem instance highlights only one aspect of problem difficulty. Another aspect, which looks at effectiveness, is to maximize the distance to the optimal solution that a solver is able to reach. To compute this distance, we require the fitness of the optimal solution a priori. Note, however, that we do not need to know the optimal solution itself, only its fitness. Yet another approach is to compare solvers directly by maximizing the difference in some aspect, e. g., efficiency or effectiveness, between two solvers.

7 Conclusions and Future Directions

Research on solving constraint satisfaction problems with evolutionary computation has produced a rich set of research papers that contribute solvers, insights into solvers and their performance, and heuristic subroutines. One major flaw in this research has remained consistent over the past 20 years: most studies compare performance results only with other evolutionary or closely related techniques. Even recent studies, such as [127, 128, 129], restrict themselves to comparing only against results from other heuristic methods, or do not include alternative techniques at all.

Many studies report on the promising performance of a particular evolutionary algorithm over another existing heuristic technique. The few systematic studies that do compare evolutionary and constraint programming techniques conclude that constraint programming is superior in terms of efficiency [60, 67]. Also, constraint programming techniques are generally sound and, therefore, given sufficient time, will always find a solution or prove that none exists. Hence, these solvers are more effective unless they are bounded by time. Recent efforts have shown success in speeding up modern DPLL-based techniques using heuristics for guiding the search [130, 131].

In Sect. 65.5 we reviewed many techniques that were developed and studied for the purpose of improving EC in terms of efficiency and effectiveness. The vast majority of these techniques were applied to one problem only. A huge benefit would come from studies that show the success of a technique across several CSPs. Such studies would be especially opportune for the SAT problem, which is still the most actively used CSP for benchmarking algorithms [132].