1 Introduction

The convergence towards the optimum of genetic algorithms and other related search techniques is well established (Holland 1992; Pandey et al. 2014). The objective value of all search techniques shows a typical initial period of rapid improvement, followed by a continually decreasing improvement rate as the search goes on (Rawlins and Sushil 2014).

There are several popular stopping criteria, such as: (1) number of generations (Liu et al. 2014), (2) number of generations without minimal improvement (Ha et al. 2014), (3) maximum optimization time (Glorieux 2015, p. 9), (4) optimization threshold value (Glorieux 2015, p. 9), (5) a threshold value for the difference between the best and average objective values over the generations (Toledo et al. 2014). However, setting the stopping criterion in each case is more an art than a science. Martí et al. (2016) tried to integrate science into this decision, but ended up with complex models based on progress indicators, using a Kalman filter for data gathering. This paper suggests a more structured and reason-based strategy. It first identifies an initial “warmup period” (Rossetti et al. 2005) of significant improvement; then it tests whether the objective function has reached a plateau (Robinson 2007); finally, it adopts the statistical inference of a record-breaking process over a stationary random process (Glick 1978; Krug and Jain 2005).

2 Objectives

The objective of the research is to propose an effective and efficient stopping criterion (Hermadi et al. 2014; Kim 2013) that could be applicable to most search techniques (for example, Genetic Algorithms (GA), Simulated Annealing (SA), Ant Colony (AC), Particle Swarm Optimization (PSO), etc.). The suggested stopping technique should also be based on current scientific tools and on reason. Additionally, the proposed technique should be easy to apply and easily understood. Most importantly, the technique should offer ways to control its reliability.

To fulfil the objectives we set three different goals to be achieved:

  1. The identification of the initial period of rapid improvement in the objective function.

  2. The identification of the second period, in which the objective function converges to a plateau.

  3. The identification of a sufficiently good solution by assuming a stationary homoscedastic process.

3 Identifying the End of the Intensive Warm-up Period

The first step is to identify the end of the initial steep descent, or “warmup period” (Rossetti et al. 2005). For that purpose, we adopt the following definitions:

F_max :

Largest value of the fitness function

F_min :

Smallest value of the fitness function

Total improvement = F_max − F_min

n :

The number of last observations of the fitness function under consideration

δ :

The range of fitness function values over the last n observations

ε :

Threshold value for the ratio between δ and the total improvement

First we need to decide on the value of n (the number of last observations we consider). Once n is known (say n = 6), δ can be computed continuously as follows:

$$\delta = {\text{Max}}(X_{t} ,X_{t - 1} , \ldots ,X_{t - n + 1} ) - {\text{Min}}(X_{t} ,X_{t - 1} , \ldots ,X_{t - n + 1} )$$
(1)

Note that δ is analogous to the total improvement, but relates only to the last n observations. Thus, the ratio between δ and the total improvement is the ratio between the local improvement (over the last n points) and the global improvement. This ratio is expected to decline over time, and comparing it against the threshold ε provides the stopping rule for the warm-up period, as described in Fig. 1.
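As an illustration, the warm-up test above can be sketched in Python (the function name and the sample values in the usage below are hypothetical, not from the paper):

```python
def warmup_ended(history, n=6, eps=0.005):
    """Return True once the range of the last n fitness values (delta)
    is at most eps times the total improvement observed so far.
    Illustrative helper; not the paper's own implementation."""
    if len(history) < n:
        return False                      # not enough observations yet
    total = max(history) - min(history)   # F_max - F_min
    if total == 0:
        return False                      # no improvement measured yet
    window = history[-n:]                 # last n observations
    delta = max(window) - min(window)     # Eq. (1)
    return delta / total <= eps
```

For a minimizing search whose best-so-far values flatten out, the ratio δ / (F_max − F_min) quickly drops below ε, while a trace that is still improving steeply keeps the ratio large.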

Fig. 1

Example of identifying the warm-up period in evolutionary optimization

Reaching the threshold described in Fig. 1 tells us that the process has finished its steep progress towards the optimum. However, it does not mean that the process has reached a stationary condition or hit a plateau. A second procedure is needed to ascertain the steady state of the process; this step is described in Sect. 4.

4 Identifying the Steady State

Once the initial period of intensive improvement has ended, a quest for steadiness begins. A second procedure is applied to verify that the process is stationary with unchanging variance. For this purpose, we adopted a popular quality-control technique known as the method of “batch means” (Robinson 2007), which is widely used by the discrete-event simulation community and is sometimes applied outside it, for example by Rossetti et al. (2005).

In this scheme we define:

n :

the number of observations in each batch.

m :

the number of required batches.

\(\overline{Y}_{j}\) :

the mean of batch j.

\(\overline{\overline{Y}}\) :

the total mean of all observations.

\(S_{\overline{Y}}^{2}\) :

the variance of the batch means.

Setting values for n and m (typically greater than 6) generates m groups of n observations each. The central limit theorem (CLT) ensures that the batch means are approximately i.i.d. Normal, so the procedure tests for outliers outside the following range:

$$\overline{\overline{Y}} \pm t_{(m - 1,1 - \alpha /2)} \frac{{S_{{\overline{Y} }} }}{\sqrt m }$$
(2)

If there are no outliers, we take the end-point of the last (mth) batch as the beginning of the third step of the overall procedure. Otherwise, the m-means procedure is repeated, with the first batch starting right after the last outlier (Fig. 2).
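A minimal sketch of this outlier test, assuming the batches come from a single stream of observations (the function name is illustrative; the critical values are the standard two-sided t values t_{df, 0.975} for α = 0.05, hard-coded to keep the sketch dependency-free):

```python
import statistics

# Two-sided Student-t critical values t_{df, 0.975} (alpha = 0.05)
T_CRIT = {4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365, 9: 2.262}

def batch_means_outliers(observations, n=10, m=5, t_table=T_CRIT):
    """Split the last m*n observations into m batches of n, and return the
    indices of batch means falling outside the interval of Eq. (2):
    grand mean +/- t_{(m-1, 1-alpha/2)} * S_Ybar / sqrt(m).
    An empty list suggests the process can be treated as steady-state."""
    data = observations[-(m * n):]
    batches = [data[i * n:(i + 1) * n] for i in range(m)]
    means = [statistics.fmean(b) for b in batches]         # Ybar_j
    grand = statistics.fmean(means)                        # Ybar-bar
    s = statistics.stdev(means)                            # S_Ybar
    half_width = t_table[m - 1] * s / m ** 0.5
    return [j for j, y in enumerate(means)
            if abs(y - grand) > half_width]
```

A mild drift keeps all batch means inside the interval, while a jump in level flags the offending batch, after which the batching would restart past the outlier.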

Fig. 2

An example for the method of batch means

5 Identifying a Sufficiently Satisfying Record

Once the second step of batch means is done, a steady-state process can be assumed for the observations that follow the last batch. In that region, a third step is taken, using record-breaking theory to decide when to stop the search. To present the record-breaking theory and our proposed stopping rule, the following definitions are given.

Definitions

r :

Record number (r = 2: second record, r = 3: third record, etc.)

n :

Number of generations/observations

E(n) :

Mean number of trials/observations until a certain record number

R_n :

Number of records in n generations/observations

M_r :

Median number of trials/observations related to the rth record

M_r/M_{r−1} :

The ratio of the current record's median to the previous record's median

Table 1 shows the results of an extensive simulation of a process drawing n i.i.d. variables until the 8th record is achieved. For each rth record, Table 1 shows the mean number of observations E(n). However, E(n) grows without bound, while the median is stable; therefore, we prefer dealing with medians. The median is denoted M_r, and the ratio of the current median to the previous one (M_r/M_{r−1}) is based on Glick (1978). This ratio approaches e = 2.718 as r grows.

Table 1 Record number r and its simulated statistical parameters (Glick 1978)

Thus:

$$\frac{{Median\left\{ {n_{r + 1} } \right\}}}{{Median\left\{ {n_{r} } \right\}}} \approx e = 2.718$$
(3)

This means that the median total number of observations up to the next record is 2.718 times the number of observations so far (up to the current record). This makes it easy to set a threshold on the number of generations without improvement. For example, a budget of one million generations corresponds to the median next-record time when the last record was achieved at, or after, a total of 1,000,000/2.718 ≈ 367,918 generations.
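The median-ratio property can be checked with a quick simulation (helper names are illustrative). The sketch exploits the fact that, for i.i.d. draws, trial k is a record with probability 1/k; conditioning on the previous record at trial k gives P(next record time > m) = k/m, so each record time can be sampled in O(1) by inverse transform instead of simulating every draw:

```python
import random
import statistics

def record_times(max_record, rng):
    """Sample the trial indices of successive records of an iid sequence.
    Given the previous record at trial k, P(next record time > m) = k/m,
    so floor(k / U) + 1 with U ~ Uniform(0, 1] has the right distribution."""
    k, times = 1, [1]              # the first observation is always a record
    for _ in range(max_record - 1):
        u = 1.0 - rng.random()     # uniform in (0, 1], avoids division by 0
        k = int(k / u) + 1
        times.append(k)
    return times

rng = random.Random(42)
runs = [record_times(8, rng) for _ in range(20000)]
medians = [statistics.median(run[r] for run in runs) for r in range(8)]
ratio = medians[7] / medians[6]    # M_8 / M_7, expected to be near e = 2.718
```

With 20,000 replications the estimated ratio of consecutive medians lands close to e, in line with Glick (1978), even though the corresponding means diverge.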

An additional way to set a threshold is to use the number of records directly. In that case, the threshold can be determined using Chebyshev's bound, ensuring that the solution is within a certain percentile of the total population of solutions. Table 2 shows the results of several thousand repetitions of generating n i.i.d. observations and recording R_n for each repetition.

Table 2 Expectation, variance, and standard deviation for n i.i.d. observations (Glick 1978)

For each n, the mean E(R_n) and variance V(R_n) of the number of records were computed. This makes it possible to find a number of records that ensures, with high probability, that the best of n observations has been found. The rightmost column of Table 2 uses 3 standard deviations above the mean. Chebyshev's one-sided bound then ensures that the best solution among n is found with a probability of 94.4%:

$$\Pr \{ \text{no better records exist for } n \mid k = 3\} = 1 - \frac{1}{{2k^{2} }} = 1 - \frac{1}{{2 \cdot 3^{2} }} = \frac{17}{{18}} \approx 0.944$$
(4)

Thus, waiting for 15 records is equivalent to observing at least 1200 generations, and waiting for 25 records is equivalent to observing 1,000,000 generations.
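Both record counts in this example can be reproduced from the classical asymptotics for records of an i.i.d. sequence, E(R_n) ≈ ln n + γ and V(R_n) ≈ ln n + γ − π²/6 (the helper name is illustrative):

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant

def records_threshold(n, k=3.0):
    """E(R_n) + k * sd(R_n) for the record count of n iid observations,
    using E(R_n) ~ ln n + gamma and V(R_n) ~ ln n + gamma - pi^2/6.
    Mirrors the 'mean + 3 standard deviations' column of Table 2."""
    mean = math.log(n) + EULER_GAMMA
    var = mean - math.pi ** 2 / 6
    return mean + k * math.sqrt(var)

# One-sided Chebyshev-style guarantee quoted in the text for k = 3
coverage = 1 - 1 / (2 * 3 ** 2)    # 17/18, about 0.944
```

Rounding the threshold gives roughly 15 records for n = 1200 and roughly 25 records for n = 1,000,000, consistent with the values quoted above.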

6 Results

We compared the results of the proposed GA stopping technique with the optimal values of the benchmark RCPSP solutions in PSPLIB (Kolisch and Sprecher 1997; Kolisch et al. 1999). We used 30 solutions for 4 different network sizes: 120, 90, 60 and 30 activities. The stopping parameters were: (1) for the intensive-improvement stage: n = 8, ε = 0.005; (2) for the batch means: n = 10, m = 5 and α = 0.05; (3) for the record stage: setting the start point at the end of step 2 and waiting for 15 more records. The results are summarized in Table 3.

Table 3 Results of the GA stopping rules vs. benchmark optimum (Kolisch et al. 1999)

Thus, the preliminary results (summarized in Table 3) show that the suggested scheme ensures high-quality solutions, albeit at the cost of some additional computation time in certain cases.

7 Conclusion

The paper suggests a new approach to stopping criteria for search techniques. First, the length of the warmup period of intensive improvement is identified. Then, the batch means method is applied to detect a flat part of the fitness function, which can be approximated as a stationary process. This approximation enables the use of record-breaking statistics for the stopping inference rules. Preliminary results show that high-quality solutions can be obtained using this scheme. It is left for future research to examine the impact of the model parameters and the trade-off between solution quality and time performance. For that purpose, far more experimentation is needed, as well as sensitivity analysis for each parameter and their combinations.