1 Introduction

Constrained optimisation problems (COPs), especially non-linear ones, are important and widespread in real-world applications [1]. This has motivated the introduction of various algorithms for solving COPs, whose focus is on handling the constraints involved. Various mechanisms have been adopted by evolutionary algorithms to deal with constraints, including penalty functions, decoder-based methods and special operators that treat constraints and objective functions separately. For an overview of the different types of methods we refer the reader to Mezura-Montes and Coello Coello [6].

With the increasing number of evolutionary algorithms, it is hard to predict which algorithm performs best on a newly given COP. Various benchmark sets, such as CEC'10 [3] and BBOB'10 [2], have been proposed to evaluate algorithm performance on continuous optimisation problems. The aim of these benchmarks is to find out which algorithm performs well on which classes of problems. For constrained continuous optimisation problems, there has been increasing interest in understanding problem features from a theoretical perspective [9]. The feature-based analysis of hardness for certain classes of algorithms is a relatively new research area. Such studies classify problems as hard or easy for a given algorithm based on the features of the given instances. Initial studies in the context of continuous optimisation have recently been carried out in [4, 5]. With enough knowledge of the problem properties that make an instance hard or easy, we may choose the algorithm best suited to solve it. To this end, a two-step approach has been proposed by Mersmann et al. [4]: first, extract the important features from a group of investigated problems; second, analyse the performance of various algorithms with respect to these features in order to build a prediction model. Feature-based analysis has also been used to gain new insights into algorithm performance for discrete optimisation problems [7, 10].

In this paper, we carry out a feature-based analysis for constrained continuous optimisation and generate a variety of problem instances, from easy to hard, by evolving constraints. This ensures that the knowledge obtained by analysing problem features covers a wide range of problem instances of particular interest. Although no single feature by itself makes a problem hard to solve, constraints are certainly assumed to be important in COPs. Evolving constraints is a new technique for generating hard and easy instances. So far, only the influence of a single linear constraint has been studied [8]. However, real-world problems typically have more than one constraint (linear, quadratic or a combination of both). Hence, our study generates COP instances in order to investigate which features of the linear and quadratic constraints make a COP hard to solve. To obtain this knowledge, we need a common, suitable evolutionary algorithm that handles the constraints. We use the \(\varepsilon \)-constrained differential evolution with an archive and gradient-based mutation (\(\varepsilon \)DEag) [12], the winner of the CEC'10 special session on constrained problems, to generate hard and easy instances and to analyse the impact of sets of constraints on its performance.

Our results provide evidence that features of the constraints (linear, quadratic or combinations of both) can be used to classify problem instances as easy or hard. Analysing these features on instances generated for \(\varepsilon \)DEag gives us knowledge about the influence of constraints on problem hardness, which could later be used to design a successful prediction model for algorithm selection.

The rest of the paper is organised as follows. In Sect. 2, we introduce constrained optimisation problems and discuss the \(\varepsilon \)DEag algorithm that we use to solve the generated problem instances. Section 3 describes our approach to evolving and generating problem instances and discusses the constraint features. In Sect. 4, we carry out the analysis of the linear and quadratic constraint features. Finally, we conclude with some remarks.

2 Preliminaries

2.1 Constrained Continuous Optimisation Problems

Constrained continuous optimisation problems are optimisation problems where a function f(x) on real-valued variables should be optimised with respect to a given set of constraints. Constraints are usually given by a set of inequalities and/or equalities. Without loss of generality, we present our approach for minimization problems.

Formally, we consider single-objective functions \(f :S \rightarrow \mathbb {R}\), with \(S \subseteq \mathbb {R}^n\). The constraints impose a feasible subset \(F \subseteq S\) of the search space S and the goal is to find an element \(x \in S \cap F\) that minimizes f.

We consider problems of the following form:

$$\begin{aligned} \begin{aligned} \text {minimize }&\quad f(x), \quad x = (x_1,\ldots ,x_n) \in \mathbb {R}^n \\ \text {subject to}&\quad g_i(x) \le 0 \quad \forall i \in \{1,\ldots ,q\}\\&\quad h_j(x) = 0 \quad \forall j \in \{q + 1,\ldots , p\} \end{aligned} \end{aligned}$$
(1)

where \(x = (x_1,x_2,\dots ,x_n)\) is an n-dimensional vector and \(x \in S \cap F\). Here, \(g_{i}(x)\) and \(h_{j}(x)\) are inequality and equality constraints respectively; both may be linear or nonlinear. Equality constraints are usually handled by transforming them into inequality constraints \(|h_j(x)| \le \varepsilon \), where \(\varepsilon = 10^{-4}\) (as used in [3]). Also, the feasible region \(F \subseteq S\) of the search space S is defined by

$$\begin{aligned} l_i \le x_i \le u_i,\quad \quad 1 \le i \le n \end{aligned}$$
(2)

where \(l_i\) and \(u_i\) denote the lower and upper bounds of the ith variable (\(1\le i \le n\)), respectively.
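To make the problem statement concrete, the following minimal Python sketch encodes a COP of the form of Eqs. (1) and (2). The objective and constraint functions used here are arbitrary placeholders chosen for illustration, not instances used in the paper.

```python
import numpy as np

def f(x):
    """Illustrative objective (Sphere): minimise the sum of squares."""
    return float(np.sum(np.asarray(x) ** 2))

def g1(x):
    """Illustrative inequality constraint, feasible where g(x) <= 0."""
    return x[0] + 2.0 * x[1] - 1.0

def h1(x):
    """Illustrative equality constraint h(x) = 0, relaxed to |h(x)| <= eps."""
    return x[0] - x[1]

def is_feasible(x, ineqs=(g1,), eqs=(h1,), lower=-5.0, upper=5.0, eps=1e-4):
    """Check the bounds of Eq. (2) and the constraints of Eq. (1)."""
    x = np.asarray(x, dtype=float)
    in_bounds = np.all((x >= lower) & (x <= upper))
    ineq_ok = all(g(x) <= 0.0 for g in ineqs)
    eq_ok = all(abs(h(x)) <= eps for h in eqs)
    return bool(in_bounds and ineq_ok and eq_ok)

print(is_feasible([0.2, 0.2]))  # True for this toy instance
```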

2.2 \(\varepsilon \)DEag Algorithm

One of the most prominent evolutionary algorithms for COPs is the \(\varepsilon \)-constrained differential evolution with an archive and gradient-based mutation (\(\varepsilon \)DEag), the winner of the 2010 CEC competition on constrained continuous problems [3]. The \(\varepsilon \)DEag uses the \(\varepsilon \)-constrained method to transform algorithms for unconstrained problems into algorithms for constrained ones. It adopts an \(\varepsilon \)-level comparison instead of the ordinary comparison to order candidate solutions: a lexicographic order in which the constraint violation \(\phi (x)\) takes priority over the objective value f(x), so that feasibility is more important. Let \(f_{1}\), \(f_{2}\) and \(\phi _{1}\), \(\phi _{2}\) be the objective values and constraint violations at \(x_{1}\), \(x_{2}\) respectively. Then, for any \(\varepsilon \ge 0\), the \(\varepsilon \)-level comparison of two candidates \((f_{1},\phi _{1})\) and \((f_{2},\phi _{2})\) is defined as follows:

$$\begin{aligned} (f_{1},\phi _{1}) <_{\varepsilon } (f_{2},\phi _{2}) \iff {\left\{ \begin{array}{ll} f_{1} < f_{2}, &amp; \quad \text {if} \quad \phi _{1},\phi _{2} \le \varepsilon \\ f_{1} < f_{2}, &amp; \quad \text {if} \quad \phi _{1} = \phi _{2} \\ \phi _{1} < \phi _{2}, &amp; \quad \text {otherwise} \end{array}\right. } \end{aligned}$$

In order to improve the usability, efficiency and stability of the algorithm, an archive is employed; it improves the diversity of individuals. For a detailed presentation of the algorithm, we refer the reader to [12].
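The \(\varepsilon \)-level comparison above can be sketched in a few lines of Python. The helper constraint_violation uses the common sum-of-violations definition of \(\phi (x)\); the exact definition used by \(\varepsilon \)DEag is given in [12].

```python
def constraint_violation(x, ineqs, eqs=(), eps_eq=1e-4):
    """phi(x): total violation of inequality and (relaxed) equality constraints."""
    phi = sum(max(0.0, g(x)) for g in ineqs)
    phi += sum(max(0.0, abs(h(x)) - eps_eq) for h in eqs)
    return phi

def eps_less_than(f1, phi1, f2, phi2, eps):
    """Return True if (f1, phi1) <_eps (f2, phi2) under the epsilon-level order."""
    if (phi1 <= eps and phi2 <= eps) or phi1 == phi2:
        return f1 < f2        # both epsilon-feasible (or equally violated): compare f
    return phi1 < phi2        # otherwise the smaller constraint violation wins

# Example: the epsilon-feasible candidate ranks better despite a worse objective.
print(eps_less_than(5.0, 0.0, 1.0, 0.3, eps=1e-4))  # True
```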

3 Evolving Constraints

The role of constraints in problem difficulty is assumed to be central for COPs. Hence, it is necessary to analyse the various effects that constraints can impose on a constrained problem. Evolving constraints is a novel methodology for generating hard and easy instances based on the performance of the problem solver (the optimisation algorithm).

3.1 Algorithm

In order to analyse the effects of constraints, a variety of constraints needs to be studied over a fixed objective function. First, constraint coefficients are chosen at random to construct a problem instance. Second, the generated COP is solved by a solver algorithm (\(\varepsilon \)DEag). Then, the function evaluation number (FEN) required to solve this instance is used as the fitness value of the evolving algorithm. This process is repeated until hard and easy instances of the constrained problem are generated (see Fig. 1).

To generate hard and easy instances, we use the approach outlined in [8]. It uses the fast and robust differential evolution (DE) proposed in [11] to evolve problem instances by generating various constraint coefficients. Note that the aim is to optimise (maximise/minimise) the FEN required by a solver to solve the generated problem; to solve each generated instance and obtain the required FEN we use \(\varepsilon \)DEag as the solver. Each solver run terminates when it reaches \(FEN_{max}\) function evaluations or finds a solution close enough to the feasible optimum, as follows:

$$\begin{aligned} |f(x_{optimum})-f(x_{best})| \le e^{-12} \end{aligned}$$
(3)

This process generates harder and easier problem instances until the DE algorithm (the evolver) reaches a certain number of generations. Once the two distinct sets of easy and hard instances are ready, we start analysing various features of the constraints for these two categories. This gives us the knowledge to understand which features of the constraints contribute most to problem difficulty.

Fig. 1. Evolving constraints process
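The sketch below illustrates the overall loop of Fig. 1 under the assumptions above. The function dummy_solver only mimics the interface of \(\varepsilon \)DEag (returning the best objective value and the FEN used) and is not the real algorithm; the DE update is a simplified scheme using the CR and scaling factor reported in Sect. 4.

```python
import numpy as np

FEN_MAX = 300_000
TOL = np.exp(-12)  # closeness threshold of Eq. (3), as written there

def dummy_solver(coeffs, fen_max=FEN_MAX, tol=TOL):
    """Stand-in for the epsilon-DEag solver.

    The real solver would build the COP from the coefficient vector `coeffs`,
    run until Eq. (3) holds or fen_max evaluations are spent, and return
    (best objective value, FEN used). Here we only mimic that interface.
    """
    rng = np.random.default_rng(abs(hash(coeffs.tobytes())) % (2 ** 32))
    return 0.0, int(rng.integers(1_000, fen_max))

def evolve_instances(solver, pop, generations, cr=0.5, f_scale=0.9,
                     maximise=True, seed=0):
    """DE-style evolver over constraint-coefficient vectors (rows of `pop`).

    The fitness of an instance is the FEN the solver needs: maximised to
    obtain hard instances, minimised to obtain easy ones.
    """
    rng = np.random.default_rng(seed)
    fitness = np.array([solver(ind)[1] for ind in pop], dtype=float)
    for _ in range(generations):
        for i in range(len(pop)):
            a, b, c = pop[rng.choice(len(pop), 3, replace=False)]
            trial = np.where(rng.random(pop.shape[1]) < cr,
                             a + f_scale * (b - c), pop[i])
            trial_fit = solver(trial)[1]
            better = trial_fit > fitness[i] if maximise else trial_fit < fitness[i]
            if better:
                pop[i], fitness[i] = trial, trial_fit
    return pop

# Ten random coefficient vectors evolved for a few generations (toy sizes).
pop = np.random.default_rng(1).uniform(-5.0, 5.0, size=(10, 4))
hard_instances = evolve_instances(dummy_solver, pop, generations=3)
```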

3.2 Evolving a Set of Inequality Constraints

We focus on analysing the effects of constraints (linear, quadratic and their combination) on problem and algorithm difficulty. We extract features of the constraints and analyse their effect on problem difficulty. The constraints studied are linear and quadratic, of the form:

$$\begin{aligned} \text {linear constraint} \quad g(x) = b + a_{1}x_{1} + \ldots + a_{n}x_n \end{aligned}$$
(4)
$$\begin{aligned} \text {quadratic constraint} \quad g(x) = b + a_{1}x_{1}^{2} + a_{2}x_{1} + \ldots + a_{2n-1}x_n^{2} + a_{2n}x_n \end{aligned}$$
(5)

or a combination of them. We also consider various numbers of these constraints. Here, \(x_{1},x_{2}, \dots ,x_{n}\) are the variables from Eq. 1 and the \(a_{i}\) are coefficients within the lower and upper bounds (\(l_{c}, u_{c}\)). We construct COPs in which the optimum of the investigated unconstrained problem is feasible. We use quadratic constraints of the form of Eq. 5, without cross terms between variables, since this form is common in recent constrained problem benchmarks and the influence of each variable \(x_{i}\) can be analysed independently (through its squared term). The optimum of the investigated problems is \(x^*=(0, \ldots, 0)\), and we ensure that this point is feasible by requiring \(b \le 0\) when evolving the constraints.
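As an illustration of Eqs. (4) and (5), the sketch below builds constraint functions from coefficient vectors. The coefficient range \([-5,5]\) and the condition \(b \le 0\) follow the description above, while all concrete numbers are arbitrary.

```python
import numpy as np

def make_linear_constraint(a, b):
    """Eq. (4): g(x) = b + a_1 x_1 + ... + a_n x_n, feasible where g(x) <= 0."""
    a = np.asarray(a, dtype=float)
    return lambda x: float(b + a @ np.asarray(x, dtype=float))

def make_quadratic_constraint(quad, lin, b):
    """Eq. (5): g(x) = b + sum_i (quad_i * x_i^2 + lin_i * x_i), no cross terms."""
    quad = np.asarray(quad, dtype=float)
    lin = np.asarray(lin, dtype=float)
    return lambda x: float(b + quad @ np.asarray(x, float) ** 2
                           + lin @ np.asarray(x, float))

rng = np.random.default_rng(0)
n = 30
a = rng.uniform(-5.0, 5.0, n)      # coefficients within (l_c, u_c) = (-5, 5)
b = -rng.uniform(0.0, 5.0)         # b <= 0 keeps x* = (0, ..., 0) feasible
g_lin = make_linear_constraint(a, b)
g_quad = make_quadratic_constraint(rng.uniform(-5, 5, n), rng.uniform(-5, 5, n), b)
print(g_lin(np.zeros(n)) <= 0, g_quad(np.zeros(n)) <= 0)  # True True
```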

3.3 Constraint Features

We study a set of statistics-based features of the constraints in the generated hard and easy problem instances. These features are discussed below; an illustrative computation of them is sketched after the list.

  • Constraint Coefficients Relationship: Statistics such as the standard deviation, population standard deviation and variance of the constraint coefficients are likely to capture the influence of the constraints on problem difficulty. These constraint coefficients are \((b, a_{1}, a_{2},\dots ,a_{n})\) in Eqs. 4 and 5.

  • Shortest Distance: This feature is the shortest distance between the optimum of the objective function and a constraint boundary. In this paper, the shortest distance from each constraint to the known optimum, and the relation of these distances to each other, is discussed. To find the shortest distance of the optimum point \((x_{01},x_{02},\dots ,x_{0n})\) to the linear constraint hyperplane (\(a_{1}x_{1}+a_{2}x_{2}+ \dots +a_{n}x_{n}+ b=0\)) we use Eq. 6. For the quadratic constraint boundary (\(a_{1}x_{1}^{2} + a_{2}x_{1} + \ldots + a_{2n-1}x_n^{2} + a_{2n}x_n + b = 0\)) we need to find the minimum of Eq. 7.

    $$\begin{aligned} d_{\bot } = \frac{a_{1}x_{01}+a_{2}x_{02}+ \dots + a_{n}x_{0n}+ b}{\sqrt{{a_{1}}^2 +{a_{2}}^2+ \dots +{a_{n}}^2 }} \end{aligned}$$
    (6)
    $$\begin{aligned} d_{\bot } = \sqrt{(x_{1}-x_{01})^2 +(x_{2}-x_{02})^2+\dots +(x_{n}-x_{0n})^2 } \end{aligned}$$
    (7)

    where \(d_{\bot }\) in Eq. 7 is the distance from the optimum to a point on the quadratic constraint boundary; minimising the squared distance \(d_{\bot }^2\) over this boundary is equivalent to minimising the distance \(d_{\bot }\).

  • Angle: This feature describes the angle between the constraint hyperplanes. It is assumed that the angle between the constraints can influence problem difficulty. To calculate the angle between two linear hyperplanes, we find their normal vectors and the angle between them using the following equation:

    $$\begin{aligned} \theta = \arccos \frac{n_{1} \cdot n_{2}}{|n_{1}||n_{2}|} \end{aligned}$$
    (8)

    where \(n_{1}\), \(n_{2}\) are the normal vectors of the two hyperplanes. The angle between two quadratic constraints is defined as the angle between the tangent hyperplanes at their intersection, which can again be found using Eq. 8.

  • Number of Constraints: The number of constraints plays an important role in problem difficulty. We analyse the number of constraints and its effect on making problem instances easy or hard.

  • Optimum-local Feasibility Ratio: Although the global feasibility ratio matters for finding an initial feasible point, it should not affect the convergence rate while solving the problem. Therefore, the feasibility ratio of a generated COP is calculated by sampling random points within the vicinity of the optimum and reporting the ratio of feasible points to all sampled points. In our experiments, the vicinity of the optimum extends over 1/10 of each variable's range around the optimum.
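The features listed above could be computed as in the following sketch, assuming the setting of Sect. 3.2 (optimum at the origin, \(b \le 0\)). The linear distance follows Eq. 6 (signed), the quadratic distance minimises Eq. 7 numerically with SciPy, the angle follows Eq. 8, and the sampling box for the local feasibility ratio is one possible reading of the 1/10-of-the-range description; the paper's exact implementation may differ in detail.

```python
import numpy as np
from scipy.optimize import minimize

def coefficient_std(coeffs):
    """Coefficient-relationship feature: standard deviation of (b, a_1, ..., a_n)."""
    return float(np.std(coeffs))

def signed_distance_linear(a, b, x0):
    """Eq. (6): signed shortest distance from x0 to the hyperplane a.x + b = 0."""
    a = np.asarray(a, dtype=float)
    return float((a @ np.asarray(x0, dtype=float) + b) / np.linalg.norm(a))

def distance_quadratic(g, x0):
    """Eq. (7): distance from x0 to the boundary g(x) = 0, found numerically."""
    x0 = np.asarray(x0, dtype=float)
    res = minimize(lambda x: float(np.sum((x - x0) ** 2)), x0 + 1.0,
                   constraints=[{"type": "eq", "fun": g}])
    return float(np.sqrt(res.fun))

def angle_between(n1, n2):
    """Eq. (8): angle between the normal vectors of two linear hyperplanes."""
    n1, n2 = np.asarray(n1, float), np.asarray(n2, float)
    cos_theta = n1 @ n2 / (np.linalg.norm(n1) * np.linalg.norm(n2))
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

def local_feasibility_ratio(constraints, lower, upper, x0, samples=10_000, seed=0):
    """Fraction of feasible points in a box of width (upper - lower)/10 around x0.

    The paper samples 10^6 points; a smaller default is used here for speed.
    """
    rng = np.random.default_rng(seed)
    lower, upper, x0 = (np.asarray(v, dtype=float) for v in (lower, upper, x0))
    half = (upper - lower) / 20.0                      # half-width of the box
    pts = rng.uniform(x0 - half, x0 + half, size=(samples, x0.size))
    feasible = np.array([all(g(p) <= 0.0 for g in constraints) for p in pts])
    return float(np.mean(feasible))
```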

4 Experimental Analysis

We now analyse the features of constraints (linear, quadratic and their combination) for easy and hard instances. We generate these instances for the \(\varepsilon \)DEag algorithm using well-known objective functions. In our experiments, we generate two sets of problem instances, easy and hard. Due to the stochastic nature of evolutionary algorithms, for each number of constraints we perform 30 independent runs of evolving easy and hard instances. We set the generation number of the evolving algorithm (DE) to 5000 in order to obtain suitably easy and hard instances. The other parameters of the evolving algorithm are: population size = 40, CR = 0.5, scaling factor = 0.9 and \(FEN_{max}\) = 300,000. These parameter values were obtained by tuning the evolving algorithm so that it produces easier and harder problem instances. For the \(\varepsilon \)DEag algorithm, the best parameters are chosen based on [12]: generation number = 1500, population size = 40, CR = 0.5 and scaling factor = 0.9. The parameters of the \(\varepsilon \)-constrained method are: control generation (Tc) = 1000, initial \(\varepsilon \) level (q) = 0.9, archive size = 100n (n is the dimension), gradient-based mutation rate (Pg) = 0.2 and number of mutation repetitions (Rg) = 3.
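For reference, the parameter settings listed above can be collected as plain dictionaries; the key names below are descriptive labels, not identifiers from the authors' code.

```python
EVOLVER_PARAMS = {           # DE that evolves the constraint coefficients
    "generations": 5000,
    "pop_size": 40,
    "CR": 0.5,
    "scaling_factor": 0.9,
    "FEN_max": 300_000,
    "independent_runs": 30,
}
SOLVER_PARAMS = {            # epsilon-DEag used to solve each generated COP
    "generations": 1500,
    "pop_size": 40,
    "CR": 0.5,
    "scaling_factor": 0.9,
    "Tc": 1000,              # control generation
    "initial_eps_level_q": 0.9,
    "archive_size": "100n",  # n = problem dimension
    "gradient_mutation_rate_Pg": 0.2,
    "mutation_repeats_Rg": 3,
}
```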

4.1 Analysis for Linear Constraints

In order to focus only on the constraints, we carry out our experiments on various well-known objective functions: Sphere (bowl shaped), Ackley (many local optima), Rosenbrock (valley shaped) and Schaffer (many local minima) (see [2]). The linear constraints have the form of Eq. 4 with dimension n = 30 and all coefficients within the range \([-5,5]\). The number of constraints ranges from 1 to 5. To study features such as the shortest distance to the optimum, we assume that the optimum is at zero (so all values of b are non-positive). We use the \(\varepsilon \)DEag algorithm as the solver to generate easy and hard instances. In the following we present our findings based on the various features of the linear constraints.

Figure 2 provides evidence about relationships among the linear constraint coefficients, such as their standard deviation. There is a systematic relationship between the standard deviation of the linear constraint coefficients and problem difficulty. The box plots (see Fig. 2) show the results for easy and hard instances using all objective functions for the \(\varepsilon \)DEag algorithm (the solver). As can be observed, the standard deviation of the coefficients in each constraint (1 to 5) is lower for easy instances than for hard ones, so these coefficient values can play a significant role in making a problem harder or easier to solve. Interestingly, all the objective functions follow the same pattern.

Figure 3 shows the variation of the shortest-distance-to-optimum feature for easy and hard instances using the \(\varepsilon \)DEag algorithm. A lower value corresponds to a greater distance from the optimum; this means the linear hyperplanes in easy instances are further from the optimum. Based on the results, there is a strong relationship between problem hardness and the shortest distance of the constraint hyperplanes to the optimum. In other words, this feature contributes to problem difficulty. As expected, all objective functions follow the same systematic relationship between this feature and problem difficulty, which means the feature can serve as a useful source of knowledge for predicting problem difficulty.

Table 1. The angle feature for the Sphere objective function

The angle between linear constraint hyperplanes also shows a relationship to problem difficulty: the angles between constraints in easier instances are smaller than in harder ones (see Table 1). So this feature, too, contributes to problem difficulty. Table 2 shows the variation for the number-of-constraints feature group: the problem difficulty (required FEN for easy and hard instances) has a strong systematic relationship with the number of constraints for the experimented algorithm. To calculate the optimum-local feasibility ratio, \(10^{6}\) points are generated within the vicinity of the optimum (zero in our problems), and the ratio of feasible points to all generated points is determined for easy and hard instances. The results indicate that increasing the number of linear constraints decreases the feasibility ratio for the experimented algorithm (see Table 4).

In summary, the variation of feature values with problem difficulty is more prominent for some feature groups than for others. Features such as the coefficient standard deviation, shortest distance, angle, number of constraints and feasibility ratio all exhibit a relationship to problem hardness, and this relationship is stronger for some features than for others.

Table 2. The FEN for linear constraints
Table 3. The FEN for quadratic constraints

4.2 Analysis for Quadratic Constraints

In this section, we carry out our experiments on quadratic constraints, using the same objective functions, dimension and coefficient range as in the linear analysis. In the following, the feature groups are studied for easy and hard instances with quadratic constraints.

Observing Fig. 2, we can identify the relationship between the quadratic coefficients and their ability to make a problem hard or easy. Based on the experiments, the quadratic coefficients have the ability to make problems harder or easier for the algorithm. Figure 2 shows the standard deviation of the quadratic coefficients for easy and hard COPs: the standard deviation of the quadratic coefficients in 1 to 5 constraints is lower for easy instances than for harder ones. In contrast, our experiments show no systematic relationship between the linear coefficients within quadratic constraints and problem hardness. In other words, within the same quadratic constraint, the coefficients of the squared terms (\(a_{1}, a_{3}, \dots, a_{2n-1}\)) contribute more to problem difficulty than the coefficients of the linear terms (\(a_{2}, a_{4}, \dots, a_{2n}\)) (see Eq. 5).

Fig. 2. Box plots of the standard deviation of coefficients in linear (A, C, E, G) and quadratic (B, D, F, H) constraints for Sphere (A, B), Ackley (C, D), Rosenbrock (E, F) and Schaffer (G, H). Each subfigure includes the two sets of hard (H) and easy (E) instances with 1 to 5 constraints (a/b/c denotes a: number of constraints, b: easy/hard instances, c: algorithm)

Fig. 3. Box plots of the shortest distance to the optimum for linear (A, C, E, G) and quadratic (B, D, F, H) constraints for Sphere (A, B), Ackley (C, D), Rosenbrock (E, F) and Schaffer (G, H). Each subfigure includes the two sets of hard (H) and easy (E) instances with 1 to 5 constraints (a/b/c denotes a: number of constraints, b: easy/hard instances, c: algorithm)

The box plots in Fig. 3 show the shortest distance of the quadratic constraint boundaries to the optimum. As can be observed, harder instances have constraint boundaries closer to the optimum than easier ones. The angles calculated between constraints do not follow any systematic pattern, so there is no relationship between the angle feature and problem difficulty for quadratic constraints. We also study the number of quadratic constraints: as shown in Table 3, this number contributes to problem difficulty, and increasing it makes a problem harder to solve (increases the FEN). As observed in Table 5, our investigation of the feasibility ratio shows that increasing the number of constraints decreases the optimum-local feasibility ratio for both easy and hard instances. Overall, some groups of features contribute more to problem difficulty than others: the angle feature does not follow any systematic relationship with problem hardness for the considered algorithm in the case of quadratic constraints, whereas the standard deviation, feasibility ratio and number of constraints have more influence on the performance of \(\varepsilon \)DEag.

Table 4. Optimum-local feasibility ratio of the search space near the optimum for 1, 2, 3, 4 and 5 linear constraints
Table 5. Optimum-local feasibility ratio of the search space near the optimum for 1, 2, 3, 4 and 5 quadratic constraints
Table 6. The FEN for combined constraints using the Sphere objective function

4.3 Analysis for Combined Constraints

In this section, we consider the combination of linear and quadratic constraints. The generated COPs have different numbers of linear and quadratic constraints (up to 5 constraints). The obtained results show that quadratic constraints are more influential than linear ones; in other words, they contribute more to problem difficulty. By analysing the various numbers of constraints (see Table 6), we can conclude that the FEN required for sets with more quadratic constraints is higher than for sets with more linear constraints. This pattern holds for both easy and hard instances.

In summary, the variation of the linear and quadratic constraint coefficients with problem difficulty is more pronounced for some groups of features than for others. Considering quadratic constraints only, some features, such as the angle, do not provide useful knowledge about problem difficulty. In general, these experiments point to the relationship between the various constraint features and problem difficulty as instances move from easy to hard. This improves the understanding of constraint structures and their ability to make a problem hard or easy for a specific group of evolutionary algorithms.

5 Conclusions

In this paper, we performed a feature-based analysis of the impact of sets of constraints (linear, quadratic and their combination) on the performance of a well-known evolutionary algorithm (\(\varepsilon \)DEag). Various features of the constraints in easy and hard instances have been analysed to understand which features contribute most to problem difficulty. The sets of constraints have been evolved using an evolutionary algorithm to generate hard and easy problem instances for \(\varepsilon \)DEag. Furthermore, the relationship of the features to problem difficulty has been examined while moving from easy to hard instances. Later on, these results can be used to design an algorithm prediction model.