1 Introduction

Recently, swarm-based meta-heuristic algorithms have attracted a great deal of attention and become useful techniques for solving many real-world optimization problems [1–3]. The recently developed firefly algorithm (FA) is a swarm-based search algorithm that mimics the social behavior of fireflies [4], and it rests on two key components: the variation of light intensity and the formulation of attractiveness [4]. Similar to other heuristic algorithms, FA has several control parameters that must be tuned for good performance on different problems, for instance, the light absorption coefficient, the randomization control factor, and the population size. These parameters are either set to constants or fixed empirically in the traditional variants [4–7], which often makes the algorithm inefficient on problems with rugged landscapes [8]. Researchers have made numerous contributions to improving FA by altering its control parameters. For example, a variant of FA based on a Lévy-flight moving strategy was developed in Ref. [9] to enhance the search ability of the classical firefly algorithm. In Ref. [5], the authors analyzed the performance of FA on five standard benchmark functions and, accordingly, derived guidelines for parameter selection that guarantee the convergence of FA. For further details on the firefly algorithm, see Ref. [10].

Although numerous improved variants of FA have been developed, FA is still easily trapped in local regions when handling complex problems with numerous local optima [8]. The standard FA employs several parameters, and these parameters may lead to significantly different performances. Proper selection of these parameters can be a useful way to improve the search ability of FA. However, for different problems with distinct features, manually tuning the parameters is not an easy task.

In FA, the flash of a firefly acts as a signal to attract other fireflies; it is treated as an attractiveness determined by the firefly's brightness. Naturally, a firefly's brightness is what attracts other, less bright fireflies. Mathematically, this biological phenomenon is formulated as a bio-inspired search algorithm based on three rules [4, 8, 19]:

  • All fireflies are attracted by each other regardless of their sex;

  • A firefly's attractiveness is proportional to its brightness; that is, the less bright one will move toward the brighter one;

  • If no firefly is brighter than a particular firefly, it moves randomly in the search space.

For mathematical simplicity, the brightness perceived between any two fireflies decreases with their mutual distance, and the rate of this variation is governed by the light absorption coefficient \(\gamma \).

It is obvious that there are two main drawbacks in using a constant light absorption coefficient \(\gamma \) [11], which can be summarized as follows:

  1. The attractiveness of other fireflies becomes a constant when \(\gamma \) approaches 0. That is, a firefly can be seen by all the other ones. In this case, FA is the same as a classical PSO [17].

  2. If \(\gamma \rightarrow \infty \), the attractiveness will be equal to 0. The fireflies cannot take their bearings to move but fly about randomly. In this case, FA becomes a pure random search algorithm.

As can be seen, the parameter \(\gamma \) is crucially important in characterizing the variation of the attractiveness, and the speed of convergence also depends on \(\gamma \) [12]. In our previous study [8], we developed an adaptive variant of FA termed AdaFa, which features: (1) two adaptive coefficient strategies for altering the attractiveness and sharing information; and (2) five different designs for the randomization parameter. Together, these efficiently handle the selection of \(\gamma \) and the other control parameters to enhance the performance of FA-based algorithms. Although much progress has been achieved in AdaFa [8] as well as in other FA-based algorithms since 2008, significant efforts are still required to further improve the performance of FA:

  • Theoretical analysis for convergence trajectory;

  • Deriving the sufficient and necessary conditions for the selections of control coefficients;

  • Non-homogeneous update rules for enhancing search ability [8].

In this paper, we develop a new strategy for the control parameter in the randomization term and provide theoretical analyses for the standard FA and the improved AdaFa (termed NAdaFa), respectively. As an important contribution, we theoretically explain why the developed NAdaFa enhances the balance between exploitation (local search) and exploration (global search), and, building on the theoretical analysis, we suggest several promising boundary conditions for choosing the control parameters of both FA and NAdaFa.

The remainder of this paper is organized as follows. Section 2 presents descriptions of firefly algorithm and its variant—AdaFa. Section 3 presents convergence analyses of a single firefly’s behavior in both FA and NAdaFa. In Sect. 4, numerical studies and discussion are presented. Conclusions are given in Sect. 5.

2 Firefly Algorithms

The firefly algorithm (FA) is a swarm-based search method [4], in which fireflies are attracted by brighter ones regardless of their sex. Let \({\mathbf {x}}_i\) and \({\mathbf {x}}_j\) be the \(i\hbox {th}\) and the \(j\hbox {th}\) fireflies, respectively, and \(r_{ij}\) be the Euclidean distance between them. The attractiveness between any two fireflies, which decays with their distance, is defined as follows,

$$\begin{aligned} \beta (r)=\beta _0\mathrm{e}^{-\gamma r^2}, \end{aligned}$$
(1)

where \(\beta _0\) is the attractiveness at \(r=0\), and \(\gamma \) is the light absorption coefficient, which is set to 1.0 in FA.

The step by which the \(i\hbox {th}\) firefly moves toward another, more attractive (brighter) firefly j is determined by

$$\begin{aligned} \Delta {\mathbf {x}}_i^{t}=\beta \cdot \left( {\mathbf {x}}_j^{t}-{\mathbf {x}}_i^{t}\right) +\alpha ({\mathbf {r}}_c-{\mathbf {c}}), \end{aligned}$$
(2)

where \({\mathbf {r}}_c\) is a vector of random numbers drawn from the standard normal distribution N(0, 1), and \({\mathbf {c}}\) is a vector of 0.5 with the same size as \({\mathbf {r}}_c\). \(\alpha \) is a constant in the range (0, 1) in FA. Accordingly, the update law of the \(i\hbox {th}\) firefly is formulated as follows,

$$\begin{aligned} {\mathbf {x}}_i^{t+1}={\mathbf {x}}_i^{t}+\Delta {\mathbf {x}}_i^{t}. \end{aligned}$$
(3)
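As a concrete illustration, Eqs. (1)–(3) can be sketched in a few lines of Python; the function name and interface here are ours for illustration, not taken from any reference implementation:

```python
import numpy as np

def fa_step(x_i, x_j, beta0=1.0, gamma=1.0, alpha=0.25, rng=None):
    """One FA move of firefly i toward a brighter firefly j, Eqs. (1)-(3).

    Illustrative sketch: beta0, gamma, and alpha take the constant
    values quoted in the text (Sect. 4.1 uses alpha_0 = 0.25 for FA).
    """
    rng = np.random.default_rng() if rng is None else rng
    r = np.linalg.norm(x_j - x_i)               # Euclidean distance r_ij
    beta = beta0 * np.exp(-gamma * r ** 2)      # attractiveness, Eq. (1)
    r_c = rng.standard_normal(x_i.shape)        # random vector r_c drawn from N(0, 1)
    step = beta * (x_j - x_i) + alpha * (r_c - 0.5)  # Eq. (2), with c = 0.5
    return x_i + step                           # update law, Eq. (3)
```

With \(\gamma =0\) and \(\alpha =0\) the firefly jumps exactly onto its neighbor, matching the PSO-like limit of a vanishing absorption coefficient discussed in Sect. 1.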

During the search process, the traditional FA does not alter the control parameters; it uses constant parameters throughout. Moreover, the information or knowledge acquired by the fireflies during the search is not taken into account when selecting the parameters. Such a static design may be optimal for one problem but inefficient, or even unable to guarantee convergence, for another [8]. Proper selection of these parameters largely determines the solution quality and search efficiency of FA.

As an improved variant of FA, AdaFa [8] introduced new strategies for parameter selection. In AdaFa, two mechanisms, a distance-based light absorption coefficient and a gray factor, are developed to adaptively alter the attractiveness and to enhance the sharing of difference information, respectively. The constant light absorption coefficient \(\gamma \) in FA is replaced by an adaptive coefficient \(A_c\) for efficient search. The new attractiveness \(\tau \) is defined in Eq. (4).

$$\begin{aligned} \tau = D_r\cdot \mathrm{e}^{-A_c r^2}, \end{aligned}$$
(4)

where \(D_r\) and r are the distance ratio and the distance between a pair of fireflies, respectively.

Gray relational analysis [13] is used in Ref. [8] to measure the similarity of the fireflies. A gray factor \(\eta \), which depends on the information of the population distribution computed via the gray relational grade, is defined in Eq. (5) to satisfy the requirement of diversity.

$$\begin{aligned} \eta =\frac{(\eta _{\min }-\eta _{\max }) {\mathcal {F}}_i +\eta _{\max }{\mathcal {F}}_{\max }-\eta _{\min }{\mathcal {F}}_{\min }}{({\mathcal {F}}_{\max }-{\mathcal {F}}_{\min })}, \end{aligned}$$
(5)

where \(\eta _{\max }\) and \(\eta _{\min }\) are upper and lower boundaries, respectively, which ensure the population can converge in finite time. \({\mathcal {F}}_i\) is the gray relational coefficient between the \(i\hbox {th}\) firefly and the best one. Additionally, \({\mathcal {F}}_{\max }=\max \{{\mathcal {F}}_{i|_{i=1,2\ldots ,N}}\}\) and \({\mathcal {F}}_{\min }=\min \{{\mathcal {F}}_{i|_{i=1,2\ldots ,N}}\}\).
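Equation (5) can be sketched as a small helper (our own, purely illustrative; the default bounds \(\eta _{\min }=0.05\) and \(\eta _{\max }=0.8\) are the NAdaFa settings listed in Sect. 4.1). Note that the firefly most similar to the best one (largest \({\mathcal {F}}_i\)) receives \(\eta _{\min }\), while the least similar one receives \(\eta _{\max }\):

```python
import numpy as np

def gray_factor(F, eta_min=0.05, eta_max=0.8):
    """Gray factor eta of Eq. (5) for every firefly.

    F holds the gray relational coefficients F_i between each firefly
    and the best one; the F_i must not all be identical, otherwise the
    denominator F_max - F_min vanishes.
    """
    F = np.asarray(F, dtype=float)
    F_max, F_min = F.max(), F.min()
    return ((eta_min - eta_max) * F
            + eta_max * F_max - eta_min * F_min) / (F_max - F_min)
```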

In the search process, the differences among the fireflies are used to enhance their cooperation and search abilities in a given search space. However, if these differences are never decreased, the fireflies will not converge within a finite number of generations. To deal with this problem, we design a new strategy for controlling the randomization parameter \(\alpha \) in Eq. (2). It improves on the classical design in FA [19] and is also simpler than the five strategies for \(\alpha \) in AdaFa [8], since no parameter needs to be tuned. The new strategy for \(\alpha \) is defined as follows,

$$\begin{aligned} \alpha ^{t+1} = \alpha ^{t}\cdot 0.95 ^{\frac{t}{N_{G}}}. \end{aligned}$$
(6)
Fig. 1

Comparison of the randomization parameters \(\alpha \). The variations of \(\alpha \) are drawn from the equations for \(\alpha \) in Refs. [8, 19]

Figure 1 compares the strategies for \(\alpha \). In FA, \(\alpha \) decreases smoothly with the generation number (\(\alpha \) is defined as \(\alpha (t)=\alpha (t-1)\left( \frac{10^{-4}}{0.9}\right) ^{1/N_{G}}\) in Ref. [19]), while the five strategies in AdaFa (see Ref. [8] for details) exhibit similar behavior: they decrease slowly at the initial stage but later drop more sharply than \(\alpha \) in FA. As shown in Fig. 1, the design in Eq. (6) enlarges the range covered by \(\alpha \). Moreover, the developed strategy, which contributes to exploration in the initial search while supporting smooth exploitation in the later stage, provides NAdaFa with a promising search ability.
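For reference, both schedules compared in Fig. 1 can be reproduced with the short sketch below; the function names are ours, and the constants come directly from Eq. (6) and from the rule of Ref. [19] quoted above:

```python
def alpha_new(alpha0, N_G):
    """Eq. (6): alpha^{t+1} = alpha^t * 0.95**(t / N_G)."""
    alpha, out = alpha0, []
    for t in range(N_G):
        out.append(alpha)
        alpha *= 0.95 ** (t / N_G)
    return out

def alpha_fa(alpha0, N_G):
    """Classical schedule of Ref. [19]:
    alpha(t) = alpha(t-1) * (1e-4 / 0.9)**(1 / N_G)."""
    decay = (1e-4 / 0.9) ** (1.0 / N_G)
    alpha, out = alpha0, []
    for _ in range(N_G):
        out.append(alpha)
        alpha *= decay
    return out
```

Because the exponent \(t/N_G\) is zero at the first step, Eq. (6) keeps \(\alpha \) large early on (exploration) and only later decays it quickly (exploitation), which is the behavior described above.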

Another important issue in FA and most of its variants is that all fireflies obey the same search law and share similar information throughout the search process. Because of these identical search characteristics, the fireflies cannot always provide diverse and useful information for a promising search. As a non-homogeneous algorithm, AdaFa employs two update rules for efficient search: one learns from the global best (the global search given in Eq. (7a)), while the other depends on the neighborhood of each firefly (the local search shown in Eq. (7b)).

Eqs. (7a) and (7b): the global-search and local-search update rules of AdaFa (not reproduced here),

where \(N_{G}\) is the maximum number of generations and \({\mathbf {x}}_{r}\) is the randomization term defined as follows,

$$\begin{aligned} {\mathbf {x}}_{r}^{t+1}=\alpha ^{t+1}\cdot ({\mathbf {r}}_{0}-{\mathbf {c}})\cdot |{\mathbf {U}}_b-{\mathbf {L}}_b|, \end{aligned}$$
(8)

where \({\mathbf {r}}_0\) is a vector of random numbers generated from the standard normal distribution N(0, 1), and \({\mathbf {U}}_b\) and \({\mathbf {L}}_b\) are the upper and lower boundaries of the search space, respectively.
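Equation (8) translates directly into code (again an illustrative helper of our own naming):

```python
import numpy as np

def randomization_term(alpha_t1, U_b, L_b, rng=None):
    """Randomization term of Eq. (8): alpha^{t+1} * (r0 - c) * |U_b - L_b|,
    where r0 is drawn from N(0, 1) and c is 0.5 elementwise."""
    rng = np.random.default_rng() if rng is None else rng
    U_b = np.asarray(U_b, dtype=float)
    L_b = np.asarray(L_b, dtype=float)
    r0 = rng.standard_normal(U_b.shape)
    return alpha_t1 * (r0 - 0.5) * np.abs(U_b - L_b)
```

Scaling by \(|{\mathbf {U}}_b-{\mathbf {L}}_b|\) keeps the perturbation proportional to the width of the search box in every dimension.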

Algorithm 1: The NAdaFa algorithm (pseudocode not reproduced here)

Based on the framework of AdaFa [8], combined with the newly designed strategy for the randomization parameter \(\alpha \), we theoretically explain in the following sections why the proposed NAdaFa (described in Algorithm 1) can balance exploration and exploitation.

3 Convergence Analyses

This section provides theoretical analyses of a single firefly's behavior in one dimension for both FA and the developed NAdaFa. The analyses show that NAdaFa can keep a balance between the local best position (the averaged position of the attractive neighbors) and the global one according to the non-homogeneous update rules, while the standard FA can only converge to the local best position (the attractive neighbor).

Lemma 3.1

If \(\langle u_n \rangle \) is a sequence of independent identically distributed random variables drawn from the uniform distribution U on [0, 1], that is, \(\langle u_n \rangle \sim U(0,1)\) for all \(n>0\), then

$$\begin{aligned} \lim \limits _{n\rightarrow +\infty }\frac{1}{n}\sum \limits _{i=1}^{n}u_i=\int _0^1s \mathrm{d}s. \end{aligned}$$
(9)

Lemma 3.2

Let \(\langle \varrho _n \rangle \) be a sequence of constants or random variables on a bounded interval. If \(\langle u_n \rangle \) is a sequence of independent identically distributed random variables, that is \(\langle u_n \rangle \sim U(0,1)\) for all \(n>0\), then

$$\begin{aligned} \lim \limits _{n\rightarrow +\infty }\sum \limits _{i=1}^{n}\varrho _i(u_i-0.5)=0. \end{aligned}$$
(10)

Proof

Proof of Lemma 3.2 can be found in Supplementary Material. \(\square \)
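Both lemmas are easy to sanity-check numerically. The sketch below verifies Lemma 3.1, and checks the sum of Lemma 3.2 in its normalized (divided by n) form, which is our reading for a finite-sample check; the formal proofs are in the Supplementary Material:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
u = rng.uniform(0.0, 1.0, n)           # u_i ~ U(0, 1), i.i.d.

# Lemma 3.1: the sample mean of u_i tends to int_0^1 s ds = 0.5
mean_u = u.mean()

# Lemma 3.2 (normalized check): with rho_i drawn from a bounded interval,
# the average of rho_i * (u_i - 0.5) tends to 0
rho = rng.uniform(-2.0, 2.0, n)        # an arbitrary bounded sequence
normalized_sum = np.mean(rho * (u - 0.5))
```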

Theorem 3.1

(Squeeze theorem [14]) Let \(\ell \) be an interval containing the point o, and let f(x), g(x), and h(x) be functions defined on \(\ell \), except possibly at o itself. Suppose that for every point x in \(\ell \) with \(x \ne o\), we have that

$$\begin{aligned} g(x)\leqslant f(x) \leqslant h(x), \end{aligned}$$
(11)

and that

$$\begin{aligned} \lim \limits _{x\rightarrow o}g(x)= \lim \limits _{x\rightarrow o}h(x) =O, \end{aligned}$$
(12)

then

$$\begin{aligned} \lim \limits _{x\rightarrow o}f(x)=O. \end{aligned}$$
(13)
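A standard example of Theorem 3.1 (our choice, for illustration only) takes \(g(x)=-x^2\), \(f(x)=x^2\sin (1/x)\), and \(h(x)=x^2\) on an interval around \(o=0\); both bounds tend to \(O=0\), so the theorem forces \(f(x)\rightarrow 0\):

```python
import math

def f(x):
    """f(x) = x^2 * sin(1/x), squeezed between -x^2 and x^2 near 0."""
    return x * x * math.sin(1.0 / x)

# approach o = 0 from the right and watch f shrink with the bounds
xs = [10.0 ** (-k) for k in range(1, 8)]
vals = [f(x) for x in xs]
```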

Theorem 3.2

Let \(\langle \theta _n\rangle \) be a sequence of constants or random variables. The sufficient and necessary condition for the existence of the limits

$$\begin{aligned} \lim \limits _{n\rightarrow +\infty }\prod \limits _{j=1}^{n}\theta _j=0 \end{aligned}$$
(14)

and

$$\begin{aligned} \lim \limits _{n\rightarrow +\infty }\prod \limits _{j=1}^{n}(1-\theta _j)=0 \end{aligned}$$
(15)

is \(\theta _j \in (0,1)\) for all \(j\).

Proof

Proof of Theorem 3.2 can be found in Supplementary Material. \(\square \)
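For the special case of a constant sequence \(\theta _j=\theta \), both limits in Theorem 3.2 reduce to powers and are easy to check numerically; the sketch (ours) also shows that the boundary value \(\theta =0\) breaks the second limit:

```python
def partial_products(theta, n):
    """Partial products of Eqs. (14) and (15) for the constant sequence
    theta_j = theta: returns (theta**n, (1 - theta)**n)."""
    return theta ** n, (1.0 - theta) ** n

# interior value: both products vanish as n grows
p1, p2 = partial_products(0.3, 200)

# boundary value theta = 0: the second product stays at 1 forever
q1, q2 = partial_products(0.0, 200)
```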

Theorem 3.3

Let \(\langle \rho _n\rangle \) be a sequence of constants or random variables on a bounded interval and \(\langle \upsilon _n\rangle \) be a sequence of constants or random variables. If \(\langle u_n \rangle \) is a sequence of independent identically distributed random variables, that is \(\langle u_n\rangle \sim U(0,1)\) for all \(n>0\), the sufficient and necessary condition for

$$\begin{aligned} \lim \limits _{n\rightarrow +\infty }(1-\upsilon _n)^{-1}\sum \limits _{i=1}^{n}\rho _i(u_i-0.5) \prod \limits _{j=i}^{n}\left( 1-\upsilon _j\right) =0 \end{aligned}$$
(16)

is \(\upsilon _j \in (0,1)\) for all \(j\).

Proof

Proof of Theorem 3.3 can be found in Supplementary Material. \(\square \)

Theorem 3.4

The sufficient and necessary condition for the position sequence \(\langle X_n \rangle \) of a firefly in FA converging to the averaged position \(\bar{p}\) of its attractive neighbors is \(\beta _0\in (0,1)\).

Proof

Proof of Theorem 3.4 can be found in Supplementary Material. \(\square \)

Theorem 3.5

Given \(\forall \rho \in {\mathbb {R}}^+\) and \(\forall \sigma \in {\mathbb {R}}^+\), the sufficient and necessary condition for the position sequence \(\langle X_n \rangle \) of a firefly in NAdaFa converging to the averaged position \(\bar{p}\) of its attractive neighbors and the global best position g is \(\tau \in (0,1)\) and \(\eta \in (0,1)\).

Proof

Proof of Theorem 3.5 can be found in Supplementary Material. \(\square \)

Based on the convergence analyses, it can be concluded that the standard FA converges only to the local position (the attractive neighbor), if and only if \(\beta _0\in (0,1)\), while the developed NAdaFa can not only converge to local regions but also reach the global best position, provided that \(\tau \in (0,1)\) and \(\eta \in (0,1)\), where \(\eta \) can be enlarged as the total number of iterations increases. In the following sections, we show the significantly improved performance of the proposed NAdaFa on benchmark problems.

4 Experimental Study

In this section, we demonstrate the good performance of NAdaFa on twelve benchmark functions \(F_{1}\)–\(F_{12}\), summarized in Table S1 of the Supplementary Material (see [15, 16] for more details), and we compare NAdaFa with five other swarm-based algorithms: standard particle swarm optimization (SPSO) [17], adaptive particle swarm optimization (APSO) [16], gray particle swarm optimization (GPSO) [15], the intelligent firefly algorithm (IFA) [18], and FA [19].

4.1 Parameters Settings

Experiments were conducted to compare the different algorithms on the 12 benchmark functions in 30 dimensions. In these experiments, for the sake of fair comparison, the population size was set to 40, and the total number of fitness evaluations (FEs) was 40,000 for each compared algorithm. The parameters of NAdaFa were determined by a design of experiments (DOE) on six benchmark functions. The parameters of NAdaFa and the other five algorithms are listed as follows:

  • SPSO: \(\omega = \frac{1}{2\log (2)}\), \(c_1=c_2 = 0.5+\log (2)\).

  • APSO: \(\omega _{{\text {start}}}= 0.9\), \(c_1=c_2 = 2.0\).

  • GPSO: \(\xi =1\), \(\omega _{\min }=0.4\), \(\omega _{\max }=0.9\), \(c_{\min }=1.5\), \(c_{\max }=c_{{\text {final}}}= 2.5\).

  • FA: \(\alpha _0=0.25\), \(\beta _{0}=1.0\), \(\beta _{\min }=0.2\), \(\gamma =1.0\).

  • IFA: \(\phi = 0.05\), \(\alpha _0=0.5\), \(\beta _{0}=1.0\), \(\beta _{\min }=0.2\), \(\gamma =1.0\).

  • NAdaFa: \(\alpha _0=0.9\), \(\sigma =1.5\), \(\rho =2.5\), \(\eta _{\min }=0.05\), \(\eta _{\max }=0.8\).

4.2 Numerical Results

The experimental results on the twelve benchmark functions are presented in this section. Table 1 shows the mean values and standard deviations over 30 independent trials of each algorithm, and the mean values are also illustrated in Fig. 2 to compare the convergence of the algorithms.

Table 1 The results achieved by different algorithms on the benchmark functions

As illustrated in Table 1, the proposed NAdaFa achieved better results than the other five algorithms on ten benchmark functions (\(F_1\)–\(F_3\) and \(F_5\)–\(F_{11}\)), which can also be observed from Fig. 2. On all problems except \(F_{12}\), NAdaFa was superior to FA and IFA. Moreover, NAdaFa performed better than the three PSO-based algorithms on these problems. As the results on function \(F_3\) show, NAdaFa outperformed the other five methods at a statistically significant level. All the algorithms were trapped in local regions on function \(F_4\), and there was no significant difference among their results, as illustrated in Fig. 2d; they exhibited similar convergence characteristics.

Figure 2e shows that NAdaFa converged faster than FA in terms of average results, and it was the only algorithm to reach the target solution of problem \(F_5\). On function \(F_6\), NAdaFa obtained the best-quality solution among all compared algorithms; it was far better than FA, which, as shown in Fig. 2f, got lost in the search process. Since the landscape of function \(F_7\) is highly rugged, most of the compared methods were trapped in local regions. As shown, compared with the other algorithms, NAdaFa was able to find a better solution approaching the target (\(-418.9829\times 30=-1.2569E+04\)).

In Fig. 2h–k (\(F_8\)–\(F_{11}\)), NAdaFa ranked first on these four test functions, and its results were sharply better than those of SPSO, APSO, GPSO, FA, and IFA, which were all trapped on functions \(F_8\) and \(F_9\). For function \(F_{10}\), GPSO was better than SPSO and APSO, which converged prematurely, as well as FA and IFA. As shown in Fig. 2j, FA and IFA were slightly better than SPSO and APSO. It can be observed from Table 1 and Fig. 2k that the three PSO-based and two FA-based algorithms obtained similar means and standard deviations, but they were all inferior to NAdaFa. FA performed best on function \(F_{12}\), on which it was superior to the others, while the rest showed similar convergence characteristics, as illustrated in Fig. 2l.

Fig. 2

Comparison of different algorithms (mean values of best-so-far) on the benchmark functions in 30 dimensions: a \(F_1\), b \(F_2\), c \(F_3\), d \(F_4\), e \(F_5\), f \(F_6\), g \(F_7\), h \(F_8\), i \(F_9\), j \(F_{10}\), k \(F_{11}\), l \(F_{12}\).

Table 2 Average rankings of all compared algorithms on all benchmark functions

Generally, it is necessary to use nonparametric tests to analyze the experimental results. In this section, the Friedman, Aligned Friedman, and Quade tests [20] were used to validate the performances of all the algorithms, and the average rankings calculated from these tests are listed in Table 2. Each algorithm and its rankings are listed in ascending order (the lower the better). The statistics and the corresponding p values are shown at the bottom of the table. The table reveals that the performance of NAdaFa is as good as or significantly better than those of the other five algorithms. According to the Aligned Friedman test, NAdaFa is not significantly different from the other algorithms (p value 0.077), although this low p value still offers some evidence against the null hypothesis in the comparison of all the algorithms. The difference between the performance of NAdaFa and those of the other five algorithms is statistically significant at the 5 % significance level under the Friedman and Quade tests.
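For readers who want to reproduce this kind of analysis, the Friedman test and average rankings can be computed with SciPy as sketched below. The error matrix here is entirely invented for illustration and is not the data of Table 1:

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# Hypothetical results: rows = benchmark functions, columns = algorithms,
# with the last column playing the role of the best performer.
errors = np.array([
    [1e-3, 5e-4, 2e-4, 8e-4, 6e-4, 1e-5],
    [2e-2, 1e-2, 9e-3, 3e-2, 2e-2, 4e-4],
    [5e-1, 4e-1, 3e-1, 6e-1, 5e-1, 2e-2],
    [1e+0, 9e-1, 8e-1, 1e+0, 9e-1, 1e-1],
])

# Friedman test across the algorithms (one sample per benchmark function)
stat, p_value = friedmanchisquare(*errors.T)

# average ranks per algorithm, as reported in Table 2 (lower is better)
avg_ranks = rankdata(errors, axis=1).mean(axis=0)
```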

5 Conclusions

In this study, building on our AdaFa [8], we propose a more efficient strategy for an important control parameter, the randomization coefficient, which smoothly balances the search process. Moreover, we theoretically explain the superiority of the proposed NAdaFa over the standard FA in terms of convergence. Based on the investigation of the search mechanism of an individual firefly, it can be concluded that NAdaFa keeps a statistical balance between exploration and exploitation, since the update rules are selected at random according to a random number in (0, 1). Several promising conditions for guaranteeing the convergence of a single firefly in both FA and NAdaFa are also provided in this paper. In the numerical experiments, we compared the performance of the proposed NAdaFa with FA- and PSO-based algorithms; although the Aligned Friedman test provides only weak evidence (p value > 0.05) of the significance of NAdaFa, the statistical results demonstrate that it is significantly better than the compared algorithms at the 5 % significance level under the Friedman and Quade tests. Although NAdaFa exhibits good performance in the numerical experiments, handling the cooperation among the fireflies to further improve the performance of FA and NAdaFa remains a challenging problem. It is also important to design heterogeneous update rules for solving real-world problems, because such problems do not have homogeneous properties; these are directions for our future work. The source code of NAdaFa is available upon request at: http://godzilla.uchicago.edu/pages/ngaam/NAdaFa/.