Keywords

1 Introduction

The fireworks algorithm (FWA) [1], as a new member of swarm intelligence algorithms inspired by the explosion of real fireworks, has attracted much attentions in academia and industry. It simulates explosions repeatedly to implement local search points (sparks) around a specific point (firework) and evolves towards the optimal solution. Many improved versions of FWA have been proposed. The enhanced FWA (EFWA) [2] improves the corresponding operations of the original FWA and can achieve a better performance. Dynamic FWA (dynFWA) [3] uses a dynamic explosion amplitude for the current best firework to tune the search range more intelligently. An amplitude reduction strategy and local optima-based selection strategy [4] were also proposed to improve the performance of FWA obviously. Although many new ideas and mechanisms have been introduced to FWA to develop new variations, little attention was given to the generated sparks, which therefore offer a potential new direction for research.

Using gradient information has always been a very hot topic full of potential. Many practitioners have tried to build and use gradients to accelerate convergence. For example, [5] estimates the natural gradient for the exponential family based on regularized linear regression. In addition, [6] proposes an alternative way to compute search directions by exploiting neighborhood information. In this paper, we introduce a new type of firework, the synthetic firework. Using gradient information derived from the generated sparks, we can gain an understanding about the direction of local evolution on the fitness landscape. This local gradient information is then used to estimate a convergence point for the fireworks population.

The main objective of this paper is to use the estimated convergence point as an elite individual to accelerate FWA by substituting it for the worst firework individual in next generation if its fitness is better. The secondary one is to analyze the applicability of the proposed strategy, and introduce some topics which are open to discussion.

We introduce the framework of canonical FWA in Sect. 2.1 and a method for estimating the convergence point in Sect. 2.2. New types of fireworks are described in detail in Sect. 3. We evaluate them by comparing them with the original FWA using 20 benchmark functions of 3 different dimensions in Sect. 4. Finally, we discuss the experimental evaluations in Sect. 5 and conclude in Sect. 6.

Fig. 1.
figure 1

The search process of FWA. (a) fireworks are generated, (b) sparks are created around each firework, and mutation points are also generated, (c) new fireworks are created in the next generation using the sparks from (b). Steps (b) and (c) are iterated until the termination condition is satisfied.

2 Related Research

2.1 Fireworks Algorithm

Real fireworks are launched into the sky, and many sparks are generated around the fireworks. The explosion process of a firework can be viewed as a local search around a specific point. FWA simulates this explosion process iteratively to find the optimal solution. Figure 1 illustrates the process of FWA, which consists principally of three operations: explosion, mutation and selection [1].

Since there are some limitations in classic FWA and its performance is also not very prominent among all its subsequent variants, such as EFWA and dynFWA, we employ the more powerful EFWA [2] as our baseline algorithm and combine it with our proposed strategy. The EFWA introduces five major improvements into conventional FWA to improve its performance. For details on these improvements, refer to [2].

2.2 Method for Estimating the Convergence Point for a Population

The convergence point for the moving vectors between parent individuals and their offspring in the next EC search generation can be calculated mathematically [7, 8]. Let us begin by defining symbols. \(\varvec{a}_i\) and \(\varvec{c}_i\) in the Fig. 2 are the i-th parent individual and its offspring individual, respectively (\(\varvec{a}_{i}, \varvec{c}_{i} \in \mathbb {R}^{d}\)). The i-th moving vector is defined as the direction vector, \(\varvec{b}_i = \varvec{c}_i - \varvec{a}_i\). The unit direction vector of the \(\varvec{b}_i\) is given as \( \varvec{b}_{0i} = \varvec{b}_i / ||\varvec{b}_i||\), i.e. \(\varvec{b}_{0i}^T \varvec{b}_{0i}=1 \).

Fig. 2.
figure 2

The moving vector \(\varvec{b}_i\) (\(= \varvec{c}_i - \varvec{a}_i\)) is calculated from a parent individual \(\varvec{a}_i\) and its offspring \(\varvec{c}_i\) in the d-dimensional searching space. The \(\varvec{\star }\) mark is the convergence point for these moving vectors.

Let \(\varvec{x}\in \mathbb {R}^{d}\) be the point that is the nearest to the n extended directional line segments, \(\varvec{a}_i + t_{i}\varvec{b}_i\) (\(t_i \in R\)). The nearest, means that the total distance from \(\varvec{x}\) to the n extended directional line segments, \(\mathrm {J}(\varvec{x},\{t_{i}\})\) in Eq. (1), becomes the minimum. We may insert an orthogonality condition, Eq. (2), into Eq. (1) and thus remove \( t_{i}\).

$$\begin{aligned} \mathrm {J}(\varvec{x},\{t_{i}\}) = \sum _{i=1}^{n}\Vert \varvec{a}_{i}+t_{i}\varvec{b}_{i}-\varvec{x}\Vert ^2 \end{aligned}$$
(1)
$$\begin{aligned} \varvec{b}_{i}^{\mathrm {T}} (\varvec{a}_{i} + t_{i}\varvec{b}_{i} - \varvec{x}) =0 \quad \text {(orthogonal condition)} \end{aligned}$$
(2)

The \(\hat{\varvec{x}}\) that minimizes the total distance in the Eq. (1) is obtained by partially differentiating each element of \(\varvec{x}\) and setting them equal 0. Finally, the convergence point \(\hat{\varvec{x}}\) is given by Eq. (3), where \(\varvec{I}_{d}\) is the unit matrix.

$$\begin{aligned} \hat{\varvec{x}} = \left\{ \sum _{i=1}^{n}\left( \varvec{I}_{d}-\varvec{b}_{0i}\varvec{b}_{0i}^{\mathrm {T}}\right) \right\} ^{-1} \left\{ \sum _{i=1}^{n}\left( \varvec{I}_{d} -\varvec{b}_{0i}\varvec{b}_{0i}^{\mathrm {T}}\right) \varvec{a}_{i} \right\} \end{aligned}$$
(3)

3 Proposed Method

We introduce a new kind of firework, named the synthetic firework, to make full use of the many generated sparks, which are otherwise only involved in the selection operation and then destroyed. The synthetic fireworks and fireworks of the current generation form many moving vectors which can be used to estimate a convergence point that is expected to locate near the global optimum. The estimated point is regarded as an elite individual and replaces the worst individual from the next generation if its fitness is better.

The method for calculating the synthetic fireworks is as follows. Each firework and its generated sparks form a subgroup, and we can construct many vectors between the firework and its generated sparks. If the firework is worse than one of the generated sparks, this vector’s direction is considered to be promising. Otherwise, its antipode is used to calculate a synthetic firework. There are many methods to evaluate the potential of these directions. In this paper, we simply use the fitness difference between the endpoint and the start point of a vector to evaluate it. Thus, the larger the fitness difference is, the higher will be the weight of the vector. In order to not increase the number of fitness evaluations, we only calculate the antipode for a firework which is lacking a fitness evaluation if the antipodal direction is to be used. The fitness difference of the original vector is roughly used to evaluate the used antipodal direction. Finally, a synthetic firework can be roughly calculated by weighting those vectors with Eq. 4 in each firework group. Figure 3 illustrates how a synthetic firework is thus formed.

Fig. 3.
figure 3

A synthetic firework is generated from a firework and its generated sparks. The black five-pointed star and the red solid points represent the firework and its generated sparks, respectively. The presence of a red hollow circle means that the antipode has been used. The purple solid point is the synthetic firework obtained by weighting these vectors. (Color figure online)

$$\begin{aligned} v^i = \sum _{j=1}^{m} \frac{f(x^i)-f(s_{j}^i)}{\sum _{i=1}^n|| (f(x^i)-f(s_{j}^i))||}*(s_{j}^i-x^i) +x^i \end{aligned}$$
(4)

Where \(x^i\) and \(s_j^i\) represent the i-th firework and its j-th generated spark or antipodal point. \(v^i\) is the i-th synthetic firework of the i-th firework group; m is the number of generated sparks of the i-th firework; f() is a fitness function.

We can obtain new synthetic fireworks up to the population of the current firework generation. Since we do not increase the number of fitness evaluations and a new synthetic firework is expected to be better than the firework belonging to its subgroup, we will not evaluate the synthetic fireworks. A moving vector is calculated from the current firework to the newly generated synthetic firework in each subgroup, and the convergence point is estimated using these moving vectors with the estimation method described in the Sect. 2.2. Algorithm 1 outlines the flow of EFWA using our proposed strategy.

figure a

Note that our proposed strategy does not change the structure of the original FWA when it is combined with other fireworks algorithms. It simply uses the fireworks and the generated sparks to build local gradient information, then uses this to estimate a convergence point to accelerate convergence.

4 Experimental Evaluations

We use 20 benchmark functions from the CEC2013 benchmark test suite [9] in our evaluations, which is designed for real parameter single-objective optimization. Table 1 shows their types, characteristics, variable ranges, and optimum fitness values. These landscape characteristics include shifted, rotated, global on bounds, unimodal and multi-modal. We test them with 3 dimensional settings: \(D=\) 2, 10 and 30. We select EFWA [2] as our test baseline and combine it with our proposal for this experiment using parameters as described in Table 2, where the definition of the symbols can be found in the original literature [1, 2].

For fair evaluations, we evaluate convergence against the number of fitness calls rather than generations. We test each benchmark function with 30 trial runs in 3 different dimensional spaces. We apply the Wilcoxon signed-rank test on the fitness values at the stop condition, i.e. the maximum number of fitness calculations, and compare EFWA with (EFWA + our proposed method). Table 3 shows the result of these statistical tests.

Table 1. Benchmark function: Uni = unimodal, Multi = multimodal.
Table 2. Parameter setting of EFWA.
Table 3. Statistical test results of the Wilcoxon signed-rank test for average fitness values of 30 trial runs of the proposal (EFWA + our proposed method) and conventional method (EFWA) at the stop condition, \(MAX_{NFC}\). \(A \gg B\) and \(A > B\) mean that A is significant better than B with significant levels of 1% and 5%, respectively. \(A \approx B\) means that although A is better than B, there is no significant difference between them.

5 Discussions

Most fireworks algorithm variants mainly use their computational resources for generating sparks, but the information from these sparks is not fully used. In our experimental evaluations, the total number of generated sparks was 10 times of that of the fireworks. It is clearly productive to consider how these many sparks can be used efficiently. We introduced a new type of firework, called the synthetic firework, to explore local gradient information on the fitness landscape. Thanks to the use of multiple vectors in each subgroup, the synthetic firework also has a certain anti-noise property, as its calculation cancels noise from the directions of the various moving vectors. This can help to improve the precision of the estimated convergence point. In any case, the proposed method increases each generation’s fitness calculations by only one - so we can say that it is a low risk, high return strategy.

What potential still remains for our proposed firework, the synthetic firework? Although we have used only the fitness difference between the two endpoints of a moving vector to evaluate it, we think that not only these fitness differences but also their lengths should be considered to understand the local gradient information more accurately, yielding further improvements in the estimate. Additionally, there are many other ways to weight moving vectors and increase the precision of the estimated convergence point. As an example, the fitness value at the beginning point or the end point of a moving vector can be used to evaluate it, which means that the lower the distance from the optimal area, the higher the weight given. A precise way of obtaining reasonable weights for the vectors is also a potential discussion topic.

We would like to point out that the new type of firework introduced can be used to speed up convergence. In this paper, we used synthetic fireworks to estimate a convergence point without evaluating their fitness. They have the potential to act as a new guide for individuals, helping move them toward a preferable evolutionary direction rather than random exploration. The new synthetic fireworks can be introduced into a population to improve the diversity and reduce selection pressure. How to use them reasonably is also a potential discussion topic.

We also performed an extra experiment to investigate the fitness of synthetic fireworks. We compared the synthetic firework with the firework individual belonging to its same subgroup. The experimental results show that in the early stages, synthetic fireworks are better than fireworks individuals, while the probability of a better synthetic firework decreases as the convergence progresses. For optimization problems with different characteristics, it seems reasonable to use a different method for assigning weights when creating the synthetic fireworks. Perhaps different optimization stages could use different weighting methods to obtain better synthetic fireworks. Summarizing the relationship between weighting method and optimization problem is thus also a potential topic for study.

From the results of the statistical tests, we find that the proposed method is beneficial for unimodal optimization problems (\(f_1- f_5\)), while the performance on low-dimensional multimodal optimization problems is not obvious. This may be because the basic estimation method, which is clearly effective for unimodal optimization problems, is not always valid for multimodal problems where the moving vectors go toward different local optima. Further, the number of moving vectors is small (in this case, the number is 5), and even on some multimodal optimization problems, it is less than the number of peaks. Regardless, the proposed strategy does not show any deleterious effect. [10] confirmed the effectiveness of using an extra individual pool to preserve outstanding individuals from past generations and using this pool, instead of the current generation, to estimate convergence points. For the next stage, using past searching individuals to increase the number of moving vectors, and combining it with the clustering method may allow us to extend our proposal to multimodal optimization problems.

6 Conclusion

We propose a new kind of fireworks which uses the generated sparks to efficiently estimate a convergence point which can act as an elite individual to accelerate the fireworks algorithm. The controlled experiments confirm that the proposed strategy can significantly improve the performance of conventional EFWA, and the higher the dimension, the more obvious the effect.

In future work, we will further study the proposed synthetic fireworks and use them to beneficially guide the evolution of the population. Additionally, it is suggested that we can further improve the accuracy of the estimated point by using historical information to better understand the fitness landscape.