
1 Introduction

Dynamic Optimization Problems (DOPs) and Dynamic Constrained Optimization Problems (DCOPs) have drawn the attention of many scientists during the last decade since these two models, unlike their stationary counterparts, take the more realistic view that the objective function, the set of feasible solutions, or both may change over time [1–4].

For the scope of this paper let us assume that \(F^{(t)} : D \longrightarrow \mathbb {R}\) is the dynamic objective function with \(D \subseteq \mathbb {R}^d\), \(d > 0\), \(t \in \mathbb {N}_+\), and that \(G_i^{(t)} : D \longrightarrow \mathbb {R}\) for all \(i = 1, \ldots , m\) are the dynamic constraint functions. The aim is then as follows: for all \(t \in \{ t_1, t_2, \ldots , t_n \} \subset \mathbb {N}_+\) find \(x^{(t)} \in D\) such that

$$\begin{aligned} x^{(t)} = \arg \min \{ F^{(t)}(x) : x \in D \; \wedge \; \forall _{i = 1, \ldots , m} \; G_i^{(t)}(x) \geqslant 0 \} . \end{aligned}$$
(1)
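Concretely, (1) asks for the feasible minimizer of the current objective at each relevant time step. The following minimal Python sketch renders this formulation; the helper names (`is_feasible`, `best_feasible`) and the representation of constraints as callables are our illustrative assumptions, not part of the original formulation.

```python
def is_feasible(x, t, constraints):
    """Feasibility test of problem (1): G_i^(t)(x) >= 0 for all i."""
    return all(G(x, t) >= 0.0 for G in constraints)

def best_feasible(candidates, t, F, constraints):
    """Return the feasible candidate minimizing F^(t), or None if none is feasible."""
    feasible = [x for x in candidates if is_feasible(x, t, constraints)]
    return min(feasible, key=lambda x: F(x, t)) if feasible else None
```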

A commonly used approach in Evolutionary Algorithms (EAs) dedicated to the above-defined D(C)OPs implements the so-called reactive behaviour, which forces them to re-evaluate the population of individuals whenever a change of the landscape is detected. Although such a mechanism often guarantees at least a fairly good level of tracking the moving optima and localizing the newly appearing ones, as shown in [5–8], it is tempting to utilize the knowledge gained during the run of an EA in order to predict the future landscape and act one step ahead of the changes. This alternative approach is usually referred to as proactive behaviour. One of the first proactive EAs was introduced by Hatzakis and Wallace, who used an Auto-Regressive model for anticipating the future shape of the Pareto-optimal front [9]. Bosman presented his learning and anticipation mechanism in [10]. Later on, Simões and Costa proposed an EA equipped with a Markov chain predictor to forecast the future states of the environment [11].

IDEA-ARIMA is a proactive EA that uses the Auto-Regressive Integrated Moving Average [12] model for anticipating the future evaluations of a fitness function. It was demonstrated in [13] that this algorithm can accurately anticipate some periodically changing environments and simultaneously guarantee good constraint handling. However, the computational cost of running IDEA-ARIMA and its demand for huge amounts of memory are barely acceptable in practical applications. A critical analysis of IDEA-ARIMA, including a detailed description of the algorithm and an identification of its weakest parts, is given in Sect. 2.

The contribution of this paper is a number of modifications aimed at making IDEA-ARIMA an efficient and competitive tool by reducing its memory usage and by proposing a new anticipation mechanism which no longer requires maintaining a separate population of individuals but instead directly injects candidate solutions into the most probable future promising regions. The paper also addresses the problem of possibly inaccurate forecasts by introducing a small fraction of random immigrants spread evenly across the search space. All the proposed modifications of IDEA-ARIMA are elaborated in Sect. 3.

The suggested modifications were evaluated on a set of popular benchmark functions. The experimental results are summarized in Sect. 4 and conclusions are given in Sect. 5.

2 Critical Analysis of IDEA-ARIMA

IDEA-ARIMA was first introduced in [13] as an extension of the Infeasibility Driven Evolutionary Algorithm (IDEA), which is known for its robustness in solving constrained optimization problems [14]. The original IDEA deals with constraints by incorporating an additional optimization criterion called the violation measure, which indicates “how far” a given individual is from the nearest feasible region. Using a multi-objective optimization mechanism similar to NSGA-II [15], IDEA simultaneously optimizes the fitness function and minimizes the violation measure, which allows it to find optima located on the boundaries of feasible regions. Moreover, IDEA is able to approach these optima from both sides, i.e. from the feasible and the infeasible one, which typically speeds up the convergence [14]. Note that even though IDEA was initially dedicated to Stationary Optimization Problems (SOPs), it also has the potential to handle some DCOPs, as indicated in [8].

Taking into account the above-mentioned strengths of IDEA, IDEA-ARIMA was designed as a proactive EA that hybridizes the robust constraint handling mechanism of IDEA with a commonly used linear prediction model, the Auto-Regressive Integrated Moving Average (ARIMA) [12], applied to anticipate the most probable future fitness values. This hybrid was expected to form a powerful tool for solving DCOPs effectively. Although some experimental results presented in [13] were very promising, they also revealed the two weakest points of IDEA-ARIMA: its considerable computational cost and its huge memory demands.

Algorithm 1. IDEA-ARIMA.

Let us now shed some light on the anticipation strategy used in IDEA-ARIMA in order to indicate the sources of the two main drawbacks of this EA. First of all, in this approach the dynamism of the environment is perceived through the recurrent evaluation of a set of samples \(S \subset \mathbb {R}^d\) (\(d > 0\)). Every sample \(s \in S\) is associated with the time series of its past evaluations \((X^s_t)_{t \in T}\), i.e.

$$\begin{aligned} \forall _{t \le t_{now}} \quad X^s_t = F^{(t)}(s) . \end{aligned}$$
(2)

In other words, all the historical values of the objective function \(F\) for all the samples \(s \in S\) up to the present moment \(t_{now} \in T\) are collected and made available at any time. On top of that, the ARIMA model is applied to predict the future values of the objective function \(\widetilde{X}^s_{t_{now}+1} = \widetilde{F}^{(t_{now}+1)}(s)\) based on the past observations \((X_t^s)_{t \le t_{now}}\). As a result, the whole future landscape \(\widetilde{F}^{(t_{now}+1)}\) can be anticipated by extrapolating the set

$$\begin{aligned} \{ \widetilde{F}^{(t_{now}+1)}(s) \, ; \; s \in S \}, \end{aligned}$$
(3)

using the \(k > 0\) nearest neighbours method. The problem is that the size of \(S\) tends to grow extremely fast from one iteration of IDEA-ARIMA to another (cf. Algorithm 1), thus consuming more and more memory and, additionally, requiring an increasing number of invocations of the evaluation function to keep all the samples up-to-date with the environment.
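To make this two-stage procedure concrete, the sketch below fits one ARIMA model per sample with statsmodels and extrapolates the forecasts (3) over the search space with a k-NN regressor from scikit-learn. The library choices, the ARIMA order (1, 1, 1) and k = 3 are our illustrative placeholders, not the settings of [13].

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from sklearn.neighbors import KNeighborsRegressor

def anticipate_landscape(samples, histories, order=(1, 1, 1), k=3):
    """One-step-ahead forecast X^s_{t_now+1} per sample, then a k-NN
    approximation of the whole future landscape F~^{(t_now+1)}."""
    forecasts = []
    for series in histories:
        fitted = ARIMA(np.asarray(series, dtype=float), order=order).fit()
        forecasts.append(float(fitted.forecast(steps=1)[0]))
    knn = KNeighborsRegressor(n_neighbors=k)
    knn.fit(np.asarray(samples), np.asarray(forecasts))
    return lambda x: float(knn.predict(np.atleast_2d(x))[0])
```

Note that one model is re-fitted per sample per landscape change, which is precisely why the unbounded growth of \(S\) is so costly.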

Secondly, a proper use of the information concerning the anticipated future landscape gathered by IDEA-ARIMA is assured by introducing a predictive population \(\widetilde{P}_{t+1}\) which comprises \(M > 0\) individuals evolved separately from the regular population \(P_t\) and evaluated with the anticipated future fitness function \(\widetilde{F}^{(t+1)}\) instead of \(F^{(t)}\). Later on, when the next time interval begins (i.e. \(t \leftarrow t + 1\)), the individuals from the predictive population are immediately transferred to \(P_t\) so that the EA can begin to explore the newest promising regions straight away. Nevertheless, the anticipation mechanism first requires some historical data in order to provide accurate forecasts of \(\widetilde{F}^{(t+1)}\); thus for the initial \(N_{train} > 0\) generations it is only fed with the data \((X_t^s)_{t \le N_{train}}\) for \(s \in S\) and produces no output. Even after this presumably short period, however, the predictive population is yet another source of computational cost, since its individuals also require evaluations (although without actually invoking the evaluation function \(F^{(t)}\)) and the application of evolutionary operators.

The entire pseudo-code of IDEA-ARIMA is given in Algorithm 1. It begins by generating the population \(P_1\) of \(M > 0\) randomly picked individuals and taking the empty set of samples \(S_1\). Then the main loop of the EA is run for \(N_{gen} > 0\) generations. Whenever a change of the objective function \(F^{(t)}\) is detected (i.e. the evaluation of at least one randomly chosen individual has just changed), the whole population \(P_t\) is re-evaluated and then added to the set of samples \(S_t\). Provided that the training period of the anticipation mechanism \(t = 1, 2, \ldots , N_{train}\) is over, and thus the population \(\widetilde{P}_t\) is ready, the individuals from \(P_t\) and \(\widetilde{P}_t\) are grouped together and immediately reduced to the fixed population size \(M > 0\). Then, regardless of the changes of \(F^{(t)}\), the original IDEA is run for \(0 < N_{sub} \ll N_{gen}\) iterations (referred to as subiterations). Finally, the predictive population \(\widetilde{P}_{t+1}\) is initialized randomly and evolved within the same number of \(N_{sub}\) subiterations of IDEA, except that the anticipated objective function \(\widetilde{F}^{(t+1)}\) is used instead of \(F^{(t)}\).
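Since the original listing is not reproduced here, the following Python sketch is a schematic reconstruction of Algorithm 1 from the prose above; `random_population`, `change_detected`, `evaluate`, `truncate` and `idea` are assumed helpers, and `anticipate_landscape` is the ARIMA + k-NN procedure sketched earlier.

```python
def idea_arima(F, M, N_gen, N_sub, N_train):
    P = random_population(M)                 # P_1
    S = {}                                   # S_1: sample -> history of values
    P_pred = []                              # predictive population
    for t in range(1, N_gen + 1):
        if change_detected(F, t, P):
            evaluate(P, F, t)                # re-evaluate on the new landscape
            for s in S:                      # keep all old samples up-to-date:
                S[s].append(F(s, t))         # ever more F-invocations per change
            for x in P:                      # S_t = S_t U P_t: unbounded growth
                S.setdefault(tuple(x), []).append(F(tuple(x), t))
            if t > N_train:                  # predictive population is ready
                P = truncate(list(P) + list(P_pred), M)
        P = idea(P, F, t, iters=N_sub)       # N_sub subiterations of IDEA
        if t >= N_train:
            F_next = anticipate_landscape(list(S), list(S.values()))
            P_pred = idea(random_population(M), F_next, t, iters=N_sub)
    return P
```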

3 Proposed Modifications of IDEA-ARIMA

The two drawbacks of IDEA-ARIMA emphasised in the previous section can be overcome in a number of ways. A rather straightforward one is to simply bound the set of samples \(S\) and suggest a strategy for keeping it up-to-date with the environment. Some ideas concerning that approach are discussed first. A further modification proposed in this paper is somewhat more complex. The idea behind it is to change the way information about the anticipated future objective function is spread. Instead of introducing a whole predictive population \(\widetilde{P}\) and evolving it separately, a single sample with the currently best anticipated fitness \(\widetilde{F}\) can be selected out of the finite set \(S\) in order to deliver that information into the population \(P\). However, this scenario can only succeed provided that the forecast is accurate; otherwise it could significantly deteriorate the performance of the EA. This risk can be minimized by introducing a small fraction of individuals located near the estimated future optimum and, additionally, another small fraction of random immigrants spread uniformly across the search space. In this case, though, the proper sizes of these small fractions remain an open issue. That is why a mechanism for the auto-adaptation of the fraction sizes is also introduced further in this section.

3.1 Bounded Set of Samples

IDEA-ARIMA assumes that the set of samples \(S\) grows from generation to generation by \(0\) up to \(M\) new elements, where \(M > 0\) is the size of the population \(P\). It is the consequence of the operation presented in the 7th line of Algorithm 1 that reads \(S_t = S_t \cup P_t\). It is tempting to get rid of this operation and instead select \(M\) samples at random during the initialization step and stick to them throughout the whole run. Unfortunately, this leads to rather mediocre results. However, the set \(S\) can still be bounded to \(M\) elements provided that the least contributing samples are removed any time \(S\) exceeds its maximum size. In terms of time series analysis, the least contributing samples can be those with the longest history trail (i.e. the oldest ones) since they are the most likely to become over-learnt. For the scope of this paper let us refer to this slightly modified IDEA-ARIMA with the set of samples \(S\) permanently bounded to \(M\) elements as IDEA-ARIMA \(M\).
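A minimal sketch of this eviction policy, assuming (as in the earlier sketches) that the samples are kept in a dictionary mapping each sample to its list of past evaluations:

```python
def bound_samples(histories, max_size):
    """Evict the least contributing samples: here, the ones with the
    longest history trail (i.e. the oldest), until max_size remain."""
    while len(histories) > max_size:
        oldest = max(histories, key=lambda s: len(histories[s]))
        del histories[oldest]
    return histories
```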

3.2 Small Fractions Instead of Predictive Population

After the training period (i.e. for \(t > N_{train}\)) IDEA-ARIMA essentially maintains two populations, namely \(P_t\) and \(\widetilde{P}_{t+1}\), each of which needs to be evolved and evaluated separately. As a result, the computational time is doubled. Now that the set of samples is bounded, it can be more efficient to simply compare all of the anticipated fitness values and select a sample \(s^* \in S\) such that \(s^* = \arg \min \{ \widetilde{F}^{(t+1)}(s) \, ; \; s \in S \}\). Of course, introducing a single sample into \(P_t\) may not be enough to move the population towards it, especially since the foreseen fitness value \(\widetilde{F}^{(t+1)}(s^*)\) is likely to be slightly distorted. To alleviate that, a whole fraction of individuals concentrated around \(s^*\) can be introduced instead; let us call it the anticipating fraction. Probably the most appropriate way of generating the anticipating fraction is by sampling a Gaussian distribution \(\mathcal {N}(s_i^*, \varepsilon )\) in each dimension \(i = 1, \ldots , d\), with \(\varepsilon > 0\). It is worth noticing, though, that since the predictive population \(\widetilde{P}_{t+1}\) in IDEA-ARIMA is always created from scratch, it is evenly distributed across the search space. This in turn provides a safety buffer in case of erroneous anticipation, because there are dozens of randomly placed candidate solutions in \(\widetilde{P}_{t+1}\) that may potentially attract the individuals from \(P_t\) when the two populations are eventually grouped together. Fortunately, the same behaviour can be assured by introducing a fraction of random immigrants (let us call them the exploring fraction) into \(P_t\), apart from the anticipating fraction described above. The point is that the proper sizes of these fractions, call them \(0 < size_{anticip} < M\) for the anticipating fraction and \(0 < size_{explore} < M\) for the exploring one, are strictly problem-dependent and thus cannot be estimated once for all possible cases. Finally, it has to be stated that the condition \(size_{anticip} + size_{explore} < M\) must be satisfied at all times.
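A sketch of how the two fractions could be generated is given below; the search-space bounds `lower`/`upper` and the default \(\varepsilon\) are illustrative placeholders.

```python
import numpy as np

def make_fractions(s_star, size_anticip, size_explore, lower, upper, eps=0.05):
    """Anticipating fraction: Gaussian cloud N(s*_i, eps) around the best
    anticipated sample s*; exploring fraction: uniform random immigrants."""
    d = len(s_star)
    anticip = np.random.normal(loc=s_star, scale=eps, size=(size_anticip, d))
    anticip = np.clip(anticip, lower, upper)    # keep individuals inside D
    explore = np.random.uniform(lower, upper, size=(size_explore, d))
    return anticip, explore
```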

3.3 Auto-adaptation of Fraction Sizes

After introducing the two fractions defined above, the population \(P_t\) can be thought of as a mixture of three subsets, namely \(P_t^{anticip} \subset P_t\) built up of the anticipating fraction of \(size_{anticip}\) individuals, \(P_t^{explore} \subset P_t\) built up of the exploring fraction of \(size_{explore}\) individuals, and the remaining fraction \(P_t^{exploit} = P_t {\setminus } (P_t^{anticip} \cup P_t^{explore})\) of \(size_{exploit} = M - size_{anticip} - size_{explore}\) individuals responsible for exploiting the promising regions identified so far.

Algorithm 2. UpdateFractionSizes.
Algorithm 3. mIDEA-ARIMA.

At first, all the fraction sizes are assumed equal, \(size_{explore} = size_{exploit} = size_{anticip} = M / 3\), yet after the training period of \(N_{train}\) generations they can be adapted automatically. The updating rule is presented in Algorithm 2. It begins with finding the single best individual per fraction as its representative. Then, all three fractions are given labels according to the fitness of their respective representatives: the fraction containing the best representative is labeled best, the second best medium, and the last one worst. Next, the size of the best fraction is increased by \(0 < \delta \ll M\). The remaining \(M - size_{best}\) “vacant slots” are distributed between the medium and worst fractions proportionally to the differences between the fitness of their representatives and that of the representative of the best fraction. Clearly, all three sizes must sum up to \(M\). They are also restricted to the range \([size_{min}, size_{max}]\), where \(0 < size_{min} < size_{max} < M\), in order to prevent the excessive domination of one fraction and the resulting exclusion of the others. The suggested values of the parameters used in the UpdateFractionSizes procedure are \(\delta = 10\,\% \times M\), \(size_{min} = \delta \) and \(size_{max} = M - \delta \).
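The rule can be sketched as follows; since Algorithm 2 is not reproduced here, this is one plausible rendering of the prose rather than a verbatim transcription. In particular, the disposal of the vacant slots is implemented as a literal reading of "proportionally to the differences"; an inverse weighting (more slots to the fraction closer to the best one) would be an equally plausible interpretation. Fitness is minimized, so the best representative has the lowest value.

```python
def update_fraction_sizes(sizes, rep_fitness, M, delta, size_min, size_max):
    """sizes, rep_fitness: dicts keyed by 'anticip', 'explore', 'exploit';
    rep_fitness[f] is the (minimized) fitness of fraction f's best member."""
    best, medium, worst = sorted(rep_fitness, key=rep_fitness.get)
    new = {best: min(sizes[best] + delta, size_max)}   # grow the best fraction
    # Dispose the M - size_best vacant slots between medium and worst
    # proportionally to their representatives' distance from the best one
    # (literal reading; see the caveat in the text above).
    d_med = rep_fitness[medium] - rep_fitness[best]
    d_wst = rep_fitness[worst] - rep_fitness[best]
    remaining = M - new[best]
    share = d_med / (d_med + d_wst) if d_med + d_wst > 0 else 0.5
    new[medium] = max(size_min, min(size_max, int(round(remaining * share))))
    new[worst] = remaining - new[medium]               # sizes sum up to M
    # A full implementation would also re-balance new[worst] into
    # [size_min, size_max] while preserving the sum M.
    return new
```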

3.4 mIDEA-ARIMA

A pseudo-code of the modified IDEA-ARIMA algorithm (abbreviated to mIDEA-ARIMA) is given in Algorithm 3. It differs from the original IDEA-ARIMA in a few places. First of all, it begins with a non-empty set \(S_1\) containing \(M\) randomly selected samples. Secondly, after the training period is over, it picks the best sample \(s_t^*\) out of \(S_t\) in each generation according to the anticipated fitness values \(\widetilde{F}^{(t+1)}\). Then, it prepares the anticipating fraction \(P_{t+1}^{anticip}\) concentrated around \(s_t^*\) and the exploring fraction \(P_{t+1}^{explore}\) uniformly distributed across the search space. Finally, during the next time step (provided that the landscape has changed since the last iteration), it reduces the set of samples \(S_t \cup P_t\) to the maximum number of \(M\) elements and also reduces the population \(P_t\) to \(size_{exploit}\) individuals. Later on, it composes the new population out of the three fractions \(P_t^{explore}\), \(P_t^{exploit}\), \(P_t^{anticip}\) and updates their respective sizes for the next generation.
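Put together, one run of mIDEA-ARIMA can be sketched as below, reusing `bound_samples`, `make_fractions`, `update_fraction_sizes` and `anticipate_landscape` from the earlier sketches, plus the assumed helpers `random_population`, `random_samples`, `change_detected`, `evaluate`, `truncate`, `best_per_fraction` and `idea`; again a schematic reconstruction from the prose, not a verbatim transcription of Algorithm 3.

```python
def midea_arima(F, M, N_gen, N_sub, N_train, delta, size_min, size_max, lb, ub):
    P = random_population(M)
    S = {tuple(s): [] for s in random_samples(M)}    # non-empty S_1
    sizes = dict(anticip=M // 3, explore=M // 3, exploit=M - 2 * (M // 3))
    anticip, explore = [], []
    for t in range(1, N_gen + 1):
        if change_detected(F, t, P):
            evaluate(P, F, t)
            for x in P:                              # S_t = S_t U P_t ...
                S.setdefault(tuple(x), []).append(F(tuple(x), t))
            S = bound_samples(S, M)                  # ... bounded (Sect. 3.1)
            if t > N_train:
                exploit = truncate(P, sizes['exploit'])
                P = list(anticip) + list(explore) + list(exploit)
                sizes = update_fraction_sizes(sizes, best_per_fraction(P), M,
                                              delta, size_min, size_max)
        P = idea(P, F, t, iters=N_sub)               # N_sub subiterations of IDEA
        if t >= N_train:
            F_next = anticipate_landscape(list(S), list(S.values()))
            s_star = min(S, key=F_next)              # best anticipated sample
            anticip, explore = make_fractions(s_star, sizes['anticip'],
                                              sizes['explore'], lb, ub)
    return P
```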

4 Experiments

The experiments were performed on the following benchmark problems.

Benchmarks G24 [2]. Minimize the function

  (a)

    G24_1, G24_6c

    $$\begin{aligned} F^{(t)}(x) = -\left[ \sin \left( k \pi t + \frac{\pi }{2} \right) \cdot x_1 + x_2 \right] , \end{aligned}$$
  (b)

    G24_2

    $$\begin{aligned} F^{(t)}(x) = -\left[ p_1(t) \cdot x_1 + p_2(t) \cdot x_2 \right] , \end{aligned}$$
    $$\begin{aligned} p_1(t) = \left\{ \begin{array}{ll} \sin \left( \frac{k \pi t}{2} + \frac{\pi }{2} \right) , &{} 2 \mid t \\ p_1(t-1), &{} 2 \not\mid t \end{array}\right. , \quad p_2(t) = \left\{ \begin{array}{ll} p_2(\max \{0, t-1\}), &{} 2 \mid t \\ \sin \left( \frac{k \pi (t-1)}{2} + \frac{\pi }{2} \right) , &{} 2 \not\mid t \end{array} \right. \end{aligned}$$
  (c)

    G24_8b

    $$\begin{aligned} F^{(t)}(x) = -3 \exp \left\{ - \left[ \left( p_1(t) - x_1 \right) ^2 + \left( p_2(t) - x_2 \right) ^2 \right] ^ \frac{1}{4} \right\} , \end{aligned}$$
    $$\begin{aligned} p_1(t) = 1.4706 + 0.8590 \cdot \cos (k \pi t), \quad p_2(t) = 3.4420 + 0.8590 \cdot \sin (k \pi t) \end{aligned}$$

subject to

  (a)

    G24_1, G24_2, G24_8b

    $$\begin{aligned} G_1(x)= & {} 2 x_1^4 - 8 x_1^3 + 8 x_1 ^ 2 - x_2 + 2 \ge 0, \\ G_2(x)= & {} 4 x_1^4 - 32 x_1^3 + 88 x_1^2 - 96 x_1 - x_2 + 36 \ge 0, \end{aligned}$$
  (b)

    G24_6c

    $$\begin{aligned} G_1(x)= & {} 2 x_1 + 3 x_2 - 9 \ge 0, \\ G_2(x)= & {} \left\{ \begin{array}{rl} -1 &{} \text{ if } \; x_1 \in [0, 1] \cup [2, 3] \\ 1 &{} \text{ otherwise } \end{array} \right. \ge 0, \end{aligned}$$

where \(x = (x_1, x_2) \in [0, 3] \times [0, 4]\), \(t \in \mathbb {N}_+\) and \(0 \le k \le 2\).
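For concreteness, the G24_1 instance can be coded directly from the formulas above; a minimal sketch (the default severity k = 0.1 is just one of the variants used later):

```python
import numpy as np

def g24_1_objective(x, t, k=0.1):
    """Dynamic objective of G24_1: F^(t)(x) = -[sin(k*pi*t + pi/2)*x1 + x2]."""
    x1, x2 = x
    return -(np.sin(k * np.pi * t + np.pi / 2) * x1 + x2)

def g24_feasible(x):
    """Constraints G_1, G_2 >= 0 shared by G24_1, G24_2 and G24_8b."""
    x1, x2 = x
    g1 = 2 * x1**4 - 8 * x1**3 + 8 * x1**2 - x2 + 2
    g2 = 4 * x1**4 - 32 * x1**3 + 88 * x1**2 - 96 * x1 - x2 + 36
    return g1 >= 0 and g2 >= 0
```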

Benchmark mFDA1 [13]. Minimize the function

$$\begin{aligned} F^{(t)}(x) = 1 - \sqrt{\frac{x_1}{1 + \sum _{i = 2}^{n} \left( x_i - \sin \left( \frac{\pi t}{4} \right) \right) ^2 }} \end{aligned}$$

subject to

$$\begin{aligned} G_j(x) = \frac{3[x_2 - \frac{1}{2}(\alpha _j + \beta _j)]^2}{2 (\alpha _j - \beta _j)^2} - x_1 + \frac{1}{4} \ge 0, \end{aligned}$$
$$\begin{aligned} \alpha _j = \sin \left( \frac{\pi (j+1)}{4}\right) , \quad \beta _j = \sin \left( \frac{\pi j}{4}\right) , \quad j \in \{ 1, 2, 3, 4 \}. \end{aligned}$$

where \(x = (x_1, x_2) \in [0, 1] \times [-1, 1]\) and \(t \in \mathbb {N}_+\).

Each of the above benchmarks was run in three severity variants \(k \in \{ 0.1, 0.25, 0.5 \}\) and four frequency variants expressed as the number of subiterations between consecutive environmental changes, \(N_{sub} \in \{1, 2, 5, 10\}\).

The compared algorithms were split into three groups.

  1.

    IDEA with:

    • re-initialization of a population each time a change of the landscape is detected (further referred to as IDEA reset),

    • introduction of a fixed-sized exploring fraction (IDEA explore),

    • introduction of an exploring fraction of the size adapted online according to the UpdateFractionSizes procedure yet without the anticipating fraction (IDEA adapt).

  2.

    IDEA-ARIMA with:

    • the set of samples bounded to \(M\) (IDEA-ARIMA M),

    • the set of samples bounded to \(2M\) (IDEA-ARIMA 2M),

    • the unbounded set of samples (IDEA-ARIMA \(\infty \)).

  3.

    mIDEA-ARIMA with:

    • non-empty anticipation fraction and empty exploring fraction (mIDEA-ARIMA anticip),

    • non-empty anticipation fraction and non-empty exploring fraction (mIDEA-ARIMA anticip/explore),

    • non-empty anticipation fraction and non-empty exploring fraction of the sizes adapted online according to the UpdateFractionSizes procedure (mIDEA-ARIMA adapt).

Table 1. Offline performances averaged over 50 independent runs with \(k = 0.1\).

Tables 1, 2 and 3 summarize the offline performances obtained for all the analyzed benchmark functions with the severity regulator \(k\) set to 0.1, 0.25 and 0.5, respectively. The results are averaged over 50 independent runs, each of which lasted for \(N_{gen} = 100\) generations. In the cases with fixed-size fractions, the optimal sizes are given in brackets, e.g. (\(0.7\)) means \(size_{explore} = 0.7 \times M\), while (\(0.1/0.6\)) stands for \(size_{anticip} = 0.1 \times M\) and \(size_{explore} = 0.6 \times M\).
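Offline performance can be read here in its usual sense for dynamic optimization: the best-so-far fitness, reset at every landscape change, averaged over all generations. A sketch under that assumption, with hypothetical input structures:

```python
def offline_performance(best_fitness_per_gen, change_points):
    """best_fitness_per_gen[g]: best (minimized) fitness in generation g;
    change_points: set of generation indices where the landscape changed."""
    total, best_so_far = 0.0, float('inf')
    for g, f in enumerate(best_fitness_per_gen):
        if g in change_points:
            best_so_far = float('inf')   # forget pre-change solutions
        best_so_far = min(best_so_far, f)
        total += best_so_far
    return total / len(best_fitness_per_gen)
```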

Table 2. Offline performances averaged over 50 independent runs with \(k = 0.25\).
Table 3. Offline performances averaged over 50 independent runs with \(k = 0.5\).

It is clearly seen that mIDEA-ARIMA anticip/explore outperformed the other algorithms in nearly all the cases. In particular, it gave better results than IDEA-ARIMA \(\infty \) even though the latter required more evaluations and memory. It also turned out that even the simplest modification, namely bounding \(S\), resulted in fairly good offline performances. A comparison with IDEA-ARIMA 2M revealed that doubling the maximum size of \(S\) gave satisfactory results only in the cases with greater \(N_{sub}\) values.

Figure 1 presents the results of 50 runs of those algorithms that do not require a prior estimation of proper fraction sizes. After each run the winning algorithm scored +3 points, the second best +2 points and the third best +1 point. It can be seen that mIDEA-ARIMA adapt performed best in many cases, especially in rapidly changing environments. It also has to be mentioned that in this comparison IDEA-ARIMA M again proved its surprising effectiveness.

Fig. 1. Results of 50 runs of algorithms not requiring a prior estimation of proper fraction sizes. Each winning algorithm scored +3 points, the second best +2 points and the third best +1 point.

5 Conclusions

In this paper a number of modifications of IDEA-ARIMA were proposed. The introduced mIDEA-ARIMA proved its potential in solving DCOPs, although it enlarged the space of input parameters of the EA. To alleviate that issue, an online auto-adaptation mechanism for the fraction sizes was suggested.

The experiments performed on popular benchmark problems revealed the superiority of mIDEA-ARIMA over the original IDEA-ARIMA in terms of offline performance, the number of evaluations and memory consumption.