
1 Introduction and Research Questions

1.1 Dynamic Problems and Evolutionary Dynamic Optimisation

Real-world applications are naturally dynamic. Changing customer demands, fluctuating internet bandwidth, revised policies, and a changing climate are some examples of real-world dynamic problems. To deal with the inherent time-dependence of the real world, finding effective ways to solve dynamic problems is very important. A dynamic problem that is solved online, as time goes by, is called a dynamic optimisation problem (DOP) [1]. Among the many approaches to solving DOPs, evolutionary algorithms (EAs) are a common choice. The field of applying EAs to solving DOPs is called evolutionary dynamic optimisation (EDO).

1.2 Detecting Changes in DOPs

In addition to the need to find the optimum as quickly as possible (as in static problems), in DOPs the solver also has to react to changes to track the changing optimum [2]. There are two approaches: the algorithm either reacts to changes implicitly through some form of self-adaptation, or it reacts to changes explicitly. This paper focuses on the second approach. For most EAs following this approach, reacting to changes requires knowing when a change occurs [2]. How to detect when a change occurs is therefore an important factor that needs to be taken into consideration when an algorithm is designed.

Regarding the knowledge of the moments of changes, there are two schools of thought. The first school of thought assumes that algorithms are well informed of changes, or that changes can be detected easily using just one or a few detectors [3-7]. This approach makes sense for the current continuous academic benchmark problems, where the whole search space changes at once.

However, in many real-world applications, especially in constrained problems, only a part of the search space changes and knowledge of environmental changes might not be accessible [1, 8, 9]. In such situations, using just a few detectors might not be sufficient because the detectors might not lie in the changing region of the search space [2]. The second school of thought treats change detection as an important part of the optimisation process rather than a task left to a few detectors. To incorporate change detection into algorithms, some research has tried to maintain enough diversity to cover the whole search space [10] or to distribute dedicated detectors across different search regions [11]. Some studies have tried to detect changes by finding statistical differences between the populations of two consecutive generations [9]. Others have detected changes by monitoring the previously found best solutions [12]. The main disadvantage of methods following this school of thought is the additional computational cost spent on detecting/adapting to changes across the whole search space. This causes them to generally perform worse than methods following the first school of thought on current benchmark problems.

This difference in performance between the two schools of thought raises an important research question: how can the efficiency of change detection be improved?

1.3 Tracking Multiple Peaks in DOPs

One of the most commonly used approaches in EDO is to cover multiple regions of the search space and separately monitor the movement of the optimum in each region. This way, multiple optima can be tracked at the same time, and if any of them becomes the global optimum after a change, it is likely to be found more quickly. A natural way to track multiple regions is to use multiple populations, one for each region. The multi-population approach is the most widely used approach for solving the standard benchmark problems in the field of EDO.

In a multi-population/multi-region approach, it is essential that the sub-populations/regions do not overlap, so that one area is not searched by two or more sub-populations and is not re-searched multiple times when there is no change. To avoid overlapping sub-populations/regions, existing methods either define each sub-population/region as a hypercube or sphere and prevent individuals from other sub-populations from entering that cube/sphere [13, 14], or use distance calculations to estimate the basins of attraction of peaks and use these basins as separate regions, one per sub-population [15].

These techniques, however, are computationally expensive due to the distance calculations (analysed in Sect. 3). Finding a more efficient method to separate tracking regions is hence an important research question.

This paper describes an attempt to answer the two questions above.

2 Avoiding Revisiting Explored Areas and Improving Change Detection

2.1 Distributing Detectors Effectively

After having explored a certain part of the search space, if an algorithm remembers the structure of the explored space, it might be able to use that knowledge to distribute detectors better, e.g. sending more detectors to rugged areas (with many optima) and fewer detectors to smooth areas (with few optima). In addition, if it can be assumed that changes in the basin of an optimum are likely to change the value and position of the optimum itself, each basin may need just one detector, placed right at the previously found optimum.

Placing detectors at the optima, however, can only detect changes that alter a basin's height/position. For other basin changes, it might be necessary to frequently send detectors to the explored basin to check for any newly appearing solutions. Such new solutions should only be accepted if they are shown to be promising. Otherwise, they should be discarded and the detectors sent to other areas. To implement this idea, it is essential to estimate the basin sizes. Estimating basin sizes also helps maintain just one sub-population per peak/basin. Although estimating the basin size is a common goal of multi-population approaches, existing methods may not be able to achieve it: their pre-determined fixed-size search areas may not correctly cover the exact basin.

The next subsection proposes a method to estimate basins of attraction.

2.2 Estimating Optima’s Basins of Attraction

As mentioned earlier, the problem with many existing methods for estimating optima's basins of attraction is that they are both computationally expensive and inaccurate. The procedure below (Algorithm 1) provides a simple and computationally cheap estimation by taking a number of consecutive samples along each dimensional axis until a sharp drop in fitness is found. Applying this procedure to all dimensional axes creates a hyper-rectangle that approximately covers the basin of attraction of a found optimum.

Algorithm 1. BasinEstimation.
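Since the pseudocode of Algorithm 1 is not reproduced here, the following Python sketch illustrates the sampling idea described above. It assumes maximisation (as in the Moving Peaks benchmark); the step size and the threshold defining a "slump" in fitness are illustrative assumptions, not values from the paper.

import numpy as np

def estimate_basin(f, optimum, bounds, step=0.05, drop_ratio=0.1):
    """Estimate the basin of attraction of a found optimum as a hyper-rectangle.

    f        -- fitness function (maximisation assumed)
    optimum  -- position of the found optimum
    bounds   -- list of (low, high) pairs, one per dimension
    step, drop_ratio -- illustrative sampling parameters (assumptions)
    """
    optimum = np.asarray(optimum, dtype=float)
    f_opt = f(optimum)
    lower, upper = optimum.copy(), optimum.copy()
    for d in range(len(optimum)):
        # Walk outwards from the optimum along axis d, in both directions,
        # taking consecutive samples until the fitness drops sharply.
        for direction, edge in ((+1, upper), (-1, lower)):
            x, prev_fit = optimum.copy(), f_opt
            while True:
                nxt = x[d] + direction * step
                if not (bounds[d][0] <= nxt <= bounds[d][1]):
                    break                      # reached the search-space boundary
                x[d] = nxt
                fit = f(x)
                if fit < prev_fit - drop_ratio * abs(prev_fit):
                    break                      # slump found: basin edge on this axis
                prev_fit = fit
            edge[d] = x[d]
    # The two corners of the hyper-rectangle approximating the basin.
    return lower, upper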

2.3 Separating Explored Areas from Unexplored Areas

To separate sub-populations/regions, many existing algorithms have to calculate, for every individual, its distances to all sub-populations and then assign each individual to its closest sub-population. This is a computationally expensive task, as mentioned in Sect. 1.3. Another downside is that each sub-population/region has to maintain its own regional information, which needs to be re-calculated at every generation.

In the previous subsection, an idea was proposed to estimate the basins of attraction of found optima. This basin estimation can be used as the basis for a new idea to separate the sub-regions/populations at low computational cost. The idea is to make use of a special data structure named the K-dimensional tree (KD-tree) [16]. The KD-tree is a special kind of binary tree specialised for partitioning a multi-dimensional space into hyper-rectangles. Each non-leaf node of the tree represents a cutting hyperplane perpendicular to one of the k dimensions. This cutting hyperplane divides the space into two parts, represented by the two subtrees of the node. Figure 1 shows how a KD-tree can be used to divide a two-dimensional space.

Fig. 1. These figures, reproduced from [17], show how a two-dimensional space is decomposed using a KD-tree. (a): the tree, and (b): the decomposed space.

This property inspired the authors to develop a modified version of the KD-tree to represent the areas covered by sub-regions/populations and to distinguish explored and unexplored areas (Fig. 1). The modified tree still splits the space in the same way as the original: at each step the space is split by a chosen hyperplane. However, the modified KD-tree has a major structural difference. In the original KD-tree, each node represents (i) a chosen dimensional axis that is perpendicular to the splitting hyperplane, and (ii) one point in the space that the splitting hyperplane must pass through. In contrast, in the modified version no point is stored in each node, although the nodes still represent the chosen dimensions and cutting positions that divide the space. In addition, each leaf of the modified tree represents a hyper-rectangle bounded by the cutting hyperplanes, rather than the point a cutting hyperplane passes through.

In this modified KD-tree, each estimated basin of a found optimum is represented as a hyper-rectangle in the tree. This hyper-rectangle also indicates the area covered by the corresponding sub-population. Algorithm 2 shows the process of using a modified KD-tree to separate regions in EDO:

Algorithm 2. TreeConstruction.

This tree construction procedure helps separate the regions covering different peaks automatically. In addition, it takes only \(O\left( \log M\right) \) (where \(M\) is the number of sub-regions/populations) for each individual to identify which sub-region/population it belongs to. The procedure also allows the tree to adaptively adjust its structure in response to changes. For example, if a new optimum appears, or an existing optimum has moved so that the current hyper-rectangle can no longer cover the optimum's basin, the size of the hyper-rectangle is adjusted accordingly. Another benefit is that, since only one KD-tree is needed to memorise all regions/populations in the space, sub-regions/populations no longer have to manage their own regional information.
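To make the data structure concrete, the sketch below shows one possible Python realisation of the modified KD-tree; it is an illustration of the structure described above, not the authors' implementation. Internal nodes store only a cutting dimension and cut position, while leaves store the hyper-rectangle they bound and whether that region is an explored basin.

class Region:
    """Node of the modified KD-tree: a leaf is a hyper-rectangle, an internal
    node is a cutting hyperplane (dimension + position, no stored point)."""

    def __init__(self, lower, upper, explored=False):
        self.lower, self.upper = list(lower), list(upper)
        self.explored = explored   # True if this leaf holds an estimated basin
        self.cut_dim = None        # set once the leaf is split
        self.cut_val = None
        self.left = None           # subtree with x[cut_dim] <  cut_val
        self.right = None          # subtree with x[cut_dim] >= cut_val

    def is_leaf(self):
        return self.cut_dim is None

def find_region(root, x):
    """Descend from the root to the leaf containing point x.
    With M regions the tree depth is O(log M), so the lookup is cheap."""
    node = root
    while not node.is_leaf():
        node = node.left if x[node.cut_dim] < node.cut_val else node.right
    return node

def split(leaf, cut_dim, cut_val):
    """Split a leaf with one cutting hyperplane into two child leaves."""
    left_upper = leaf.upper.copy()
    left_upper[cut_dim] = cut_val
    right_lower = leaf.lower.copy()
    right_lower[cut_dim] = cut_val
    leaf.cut_dim, leaf.cut_val = cut_dim, cut_val
    leaf.left = Region(leaf.lower, left_upper)
    leaf.right = Region(right_lower, leaf.upper)
    return leaf.left, leaf.right

Inserting an estimated basin then amounts to repeatedly splitting the enclosing leaf along the basin's boundaries and marking the innermost leaf as explored, while the surrounding leaves remain marked as unexplored.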

2.4 Local Search

EAs are considered relatively slow to converge. To speed up convergence, once a population starts to converge, a local search is applied to the best found solution to locate the optimum more quickly and accurately. A population is considered to be converging when the standard deviation of fitness values in the population becomes smaller than a threshold \(\beta \). We choose the Brent local search, first used for EA research in [18, 19]. This local search does not require any derivative information and hence can function as a black-box local search. The disadvantage is that it is generally much slower than local searches that require derivative information, such as conjugate gradient or quasi-Newton methods.
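As a rough illustration (not the authors' implementation), the convergence test and the derivative-free local search step could look as follows. Powell's method from SciPy, which is built on Brent line minimisations, is used here as a stand-in for the Brent local search of [18, 19], and the threshold value is an assumption.

import numpy as np
from scipy.optimize import minimize

def local_search_if_converging(population, fitnesses, objective, beta=1e-3):
    """Apply a derivative-free local search once the population starts to converge.

    Convergence is detected when the standard deviation of the fitness values
    drops below the threshold beta (maximisation assumed)."""
    if np.std(fitnesses) >= beta:
        return None                                   # still exploring, no local search
    best = np.asarray(population[int(np.argmax(fitnesses))])
    result = minimize(lambda x: -objective(x), best, method="Powell")
    return result.x                                   # refined optimum position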

2.5 Tracking the Optima Movements

Although some existing methods maintain a full sub-population around an optimum to track its potential movement, this might not be necessary. Within the basin of a found optimum, tracking should only be triggered if there is a change that alters the basin. Following this idea, we propose the following (a sketch is given after the list):

1. For changes that alter an existing optimum: simply re-evaluate the value of the optimum at every generation. If the values in two consecutive generations differ, a change has occurred and we track the moving optimum by applying the Brent local search to identify its new location.

2. For changes that lead to a new optimum without altering existing ones, re-evaluating existing optima does not work. To deal with this, we allow individuals to venture into any explored basin, but prevent them from converging to existing optima. To do so, for each found optimum we define a hypercube, centred at the optimum, with edge length \(0.8*l_{\min }\), where \(l_{\min }\) is the smallest edge of the hyper-rectangle covering the optimum's basin. Any individual within this hypercube whose value is worse than the optimum's value is randomly re-initialised to the unexplored areas.
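A minimal sketch of these two reactions is given below (maximisation assumed; sample_unexplored and local_search are hypothetical helpers, e.g. a sampler over unexplored KD-tree leaves and the Brent-based local search sketched earlier).

import numpy as np

def react_to_changes(optima, basin_min_edges, population, fitnesses,
                     objective, local_search, sample_unexplored):
    """optima: list of (position, stored_value); basin_min_edges: l_min per optimum."""
    # 1. Changes that alter an existing optimum: re-evaluate it every generation.
    for i, (pos, stored_value) in enumerate(optima):
        current_value = objective(pos)
        if current_value != stored_value:             # the landscape changed here
            new_pos = local_search(pos)               # track the moved optimum
            optima[i] = (new_pos, objective(new_pos))

    # 2. Allow individuals into explored basins, but re-initialise any individual
    #    that sits inside the 0.8 * l_min hypercube around a known optimum
    #    without being better than that optimum.
    for j, x in enumerate(population):
        for (pos, value), l_min in zip(optima, basin_min_edges):
            half_width = 0.5 * 0.8 * l_min            # half the hypercube edge length
            inside = np.all(np.abs(np.asarray(x) - np.asarray(pos)) <= half_width)
            if inside and fitnesses[j] < value:
                population[j] = sample_unexplored()   # send it to an unexplored area
                break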

2.6 The EA-KDTree Algorithm

We integrate all the above ideas into a simple Genetic Algorithm (GA). The new EA is called EA-KDTree. The algorithm works as follows. First, a KD-tree is created with one root node representing the whole search space. Then, whenever a new optimum is found, the algorithm estimates the optimum's basin using BasinEstimation() (Algorithm 1). The hyper-rectangle representing this estimated basin is added as a leaf to the KD-tree, and the space is divided accordingly. This basin is recorded in the tree as an explored area. In addition, its optimum is monitored for changes and the algorithm is prevented from re-converging to it. Pseudocode is given in Algorithm 3.

Algorithm 3. EA-KDTree.

3 Complexity Analysis

Many existing multi-population methods that track multiple peaks are computationally expensive because of their distance calculations. For example, in each generation, the methods in [13, 14] and similar studies require distance calculations with a complexity of \(O(MNn^{2})\), where \(M\) is the number of sub-populations, \(N\) is the number of individuals and \(n\) is the number of variables. The method in [15] requires at least \(O(mN^{2})\), where \(m\) is the number of samples needed to detect a basin of attraction. In comparison, the complexity of EA-KDTree is significantly lower: in each generation it requires only \(O\left( N\log M\right) \) to identify the correct search region for all individuals (in EA-KDTree, \(M\) is the number of regions monitored by the algorithm). If the KD-tree needs to be restructured, the cost is \(O\left( M\log M\right) \), which is not computationally expensive.
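For illustration only (the figures below are chosen to make the comparison concrete and are not taken from the paper): with \(M=10\) sub-populations, \(N=100\) individuals and \(n=5\) variables, the \(O(MNn^{2})\) distance-based methods perform on the order of \(10\times 100\times 5^{2}=25{,}000\) elementary operations per generation, whereas the \(O(N\log M)\) lookup of EA-KDTree needs only about \(100\times \log _{2}10\approx 332\).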

4 Experimental Results

4.1 Experimental Settings

For this experiment, we choose the classic MovPeaks benchmark problem [20]. This is arguably the most tested dynamic academic problem to date. MovPeaks has multiple peaks whose locations, widths, and heights can change over time. To facilitate cross-comparison among different algorithms, three standard scenarios were proposed, of which Scenario 2 is the most commonly used. For this reason, in this experiment the algorithms are tested on Scenario 2 (Table 1).

Parameter tuning was not done for EA-KDTree because the purpose is to provide a proof of principle. All parameters of the EA take the default values (Table 1) used in recent research in the field (see justifications in [8]).

Table 1. Parameter settings for EA-KDTree and MovPeaks

The chosen performance measure is the commonly used modified offline error [21].
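For reference, this measure is usually defined (see [21]) as the average, over all \(T\) fitness evaluations, of the error of the best solution found since the last change: \(E_{MO}=\frac{1}{T}\sum _{t=1}^{T}e_{best}(t)\), where \(e_{best}(t)\) is the difference between the value of the global optimum and the best fitness obtained since the last change up to evaluation \(t\); smaller values are better.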

4.2 Experimental Results - Comparing with the Current State-of-the-Art

EA-KDTree is compared with current state-of-the-art population-based methods that follow the aforementioned schools of thought in change detection, to judge the potential of the proposed ideas. The peer algorithms were chosen from two groups: Group 1 includes algorithms with complete or semi-complete change detection methods, and Group 2 includes algorithms with no change detection or with just one detector, as shown in Tables 2 and 3. EA-KDTree belongs to Group 1. Note that in Group 1, some algorithms offer a full change detection/adaptation mechanism (including EA-KDTree), while others rely only on re-evaluating the current best solution in each sub-population/region (Cellular DE, mQSO and Sa multi-swarm). The latter are expected to perform better than the former on MovPeaks, but might not be as robust in detecting changes in some real-world problems.

As seen in Tables 2 and 3, EA-KDTree has the best performance among all Group 1 algorithms (algorithms with (semi-)complete change detection). The results in the tables also indicate that, because they have to detect changes comprehensively, most algorithms in Group 1 perform worse than most in Group 2. EA-KDTree is, however, an exception: it is still better than most algorithms in Group 2, except CDE and CPSO. Overall, EA-KDTree is the second best EA and the third best meta-heuristic of all the algorithms compared. The few better methods are those without complete change detection. As previously discussed, these methods might become less effective in problems where changes occur in only a part of the search space. Note that here we do not consider methods that react to changes implicitly (e.g. [22]) or methods that are not population-based.

It is worth noting, however, that the performance of EA-KDTree has quite a large standard deviation. This suggests that the algorithm might not always be completely reliable. We hypothesise that this might be due to the Brent local search, which is stochastic and hence may need a large number of evaluations in certain situations, causing the larger standard deviation. This limitation, however, can easily be alleviated by using a more powerful local search.

Table 2. Methods with (semi)-complete change-detection (Group 1).

4.3 Experimental Results - Studying Algorithmic Components

In this section we investigate which algorithmic components help EA-KDTree achieve good performance, and why. We investigate whether the proposed ideas make it possible to (i) correctly approximate the basins of attraction, (ii) divide the space using the KD-tree, (iii) track the moving basins, and (iv) prevent the population from converging to an existing optimum again unless it has changed.

Table 3. Methods with no complete change-detection (Group 2).

Approximating the basins and dividing the space using the KD-tree: We investigate the ability of the algorithm to approximate the basins and divide the space by comparing a simple GA + KD-tree with a simple GA. The only difference between the two algorithms is the implementation of the KD-tree and, along with it, the procedure BasinEstimation (Algorithm 1). To compare them, we plot the positions of individuals over different generations (for both algorithms) and also plot the division of the space by the KD-tree (for GA+KD-tree) to see whether the proposed idea helps estimate the basins and divide the space correctly.

Figure 2 shows that after 11 generations GA+KDTree can find all optima, while the original GA is unable to do so after 50 generations. Furthermore, the simple GA converges to just one optimum and hence fails to track multiple optima simultaneously. Another interesting observation is that the hyper-rectangles produced by the KD-tree fit well with the optima's basins. This illustrates the clear advantage of estimating the basins and dividing the search space using the KD-tree. Figure 2c also shows that in the hyper-rectangle in the right half (the explored area) there are almost no individuals, because they have already been re-initialised to the unexplored area (the left half). This demonstrates that EA-KDTree is able to distinguish between explored and unexplored areas, as well as to prevent individuals from re-converging to an existing optimum.

Fig. 2. Top: Simple GA vs Simple GA+KDTree; Bottom: EA-KDTree adjusts its tree to track the moving optima's basins.

Using the KD-tree to track moving optima: We investigate whether TreeConstruction (Algorithm 2) helps EA-KDTree adaptively adjust its tree structure to track the moving optima by plotting the structure of the KD-tree against the search landscape at different moments when changes occur (Fig. 2).

Figure 2 shows that EA-KDTree has clearly adjusted the size of its hyper-rectangles to adapt to changes. At change 3, due to the radical level of change, the KD-tree even completely changes its structure to better cover the changing basins and optima. The figure confirms that the algorithm is able to resize/relocate its hyper-rectangles to better fit the changes in both the basin sizes and the locations of the optima. This ensures that moving optima are tracked successfully.

5 Conclusion and Future Work

This paper presented a new method to adaptively separate explored and unexplored areas of the search space. This method helps improve both the tracking of moving optima and change detection. The resulting algorithm performs competitively against the current state of the art, while offering lower computational complexity and better change detection, even when applied to a simple and not usually effective GA.

The paper makes the following contributions: (a) a novel use of the KD-tree to separate and track explored regions at low computational cost; (b) a simple method to correctly estimate the basins of attraction of optima; (c) a new competitive algorithm; and (d) detailed analyses providing more insight into the behaviour of the new algorithm.

There are a number of areas for future research. First, we will use a more powerful EA, for example DE or PSO, instead of the simple GA. Second, we plan to tune the parameters to obtain better results. Third, we will investigate replacing the current Brent local search with a different, more reliable local search.