1 Introduction

Conventional optimization approaches such as linear and nonlinear programming [29], which rely on an explicit optimization formulation, are not the best option for intelligent forecasting models because of their inherent limitations and the complexity of the resulting objective function [38]. Studies have also shown that metaheuristic (MH) algorithms (MHAs), which primarily include evolutionary and swarm intelligence (SI) based algorithms, have been applied over the past two decades to search for feasible solutions to these complicated predictive models [38].

Further, researchers have devoted themselves to improving the precision of models by modifying them and developing new approaches [3]. One possible way to improve precision is to integrate the model with robust optimization MHAs such as genetic programming (GP), genetic algorithm (GA), and ant colony optimization (ACO) to minimize deviation and increase the accuracy of the model [27, 34]. Evolutionary algorithms (EAs) mimic the evolutionary procedure in nature, where the global optimum is approached by generating new offspring that inherit properties from the parent population. The set of candidate solutions is improved gradually until the termination criteria are satisfied. As such, over the generations, the probability of achieving better results near the global optimum rises; however, obtaining a highly precise approximation of the global optimum is not guaranteed [32].

On the other hand, the common mechanism of artificial neural networks (ANNs) is modeled on the behavior of neurons in the human brain. This biological behavior was first modeled in a set of mathematical equations by McCulloch & Pitts [26] in 1943. Their paper was a pioneering work that opened a new era of computational intelligence [2]. Moreover, the radial basis function (RBF) neural network (RBFNN) was proposed by Duda & Hart [9] in 1973. It has a number of advantages over other types of NNs: excellent approximation capability, a simpler network structure, and algorithms with faster learning [28]. Further, Zhang & Liao [39] in 2014 examined the prediction performance of the RBFNN and a hybrid fuzzy clustering (HFC) algorithm; the HFC algorithm showed better performance than the former [39]. In addition, the representation ability of an RBFNN depends on the parameters of its nonlinear kernel functions. Meanwhile, few efforts have been made to hybridize several MH and population-based algorithms for training RBFNN, which leaves room to improve the fitting precision for function approximation. Accordingly, this study proposes a HAG algorithm for training RBFNN and carries out performance verification and comparison. The proposed HAG algorithm combines local and global search abilities for problem solving. The HAG algorithm is evaluated on two benchmark continuous test functions that are commonly adopted in the literature for comparing algorithm performance.

The remainder of this study is organized as follows. Section 2 presents the literature review related to this paper, while the proposed HAG algorithm is introduced and illustrated in Sect. 3. Section 4 discusses the experimental and assessment results. The study ends with the concluding remarks in Sect. 5.

2 Literature Review

A number of soft computing (SC) techniques, normally referred to as MHAs, have emerged in this research field; they imitate biological processes, the group behavior of agents, survival of the fittest, and so on to solve optimization problems [14]. Further, swarm-based algorithms feature information sharing among multiple individuals, collaboration, self-organization, and learning across generations to carry out efficient search procedures [37]. This section presents the general background of this study, including MH and population-based algorithms for RBFNN learning.

MHAs are inspired by nature, and ACO, one such instance, has been applied successfully to numerous optimization problems [10]. Savsani et al. [31] in 2014 used the ACO algorithm developed by Dorigo [8] in 1996, which simulates the collaboration mechanism of real-world ant colonies. Ants tend to seek the shortest route between their nest and food sources. Consider an optimization problem as a multi-layer topology, in which the number of nodes within a specific layer equals the number of discrete values of the corresponding design variable, and the number of layers equals the number of design variables. ACO has attracted much attention in the domain of discrete problems owing to its population-based search ability as well as its robustness and simplicity. ACO uses a heuristic technique to generate a good initial solution and decides an appropriate search direction according to experience. It is worth noting that, although this strategy often yields a good solution, it can leave ACO trapped in a local optimum as a trade-off [19].

EAs imitate the operators of natural evolution. GA, which is inspired by the Darwinian evolutionary principle, is the most widely used EA [11]. One of the earliest works to imitate this process in seeking the global extreme values of an optimization task was done in 1975 by Holland [15], who introduced his GA to simulate the evolutionary theory proposed by Darwin [12]. GA is a random search approach based on the Darwinian evolutionary principle that adopts procedures such as natural selection, reproduction, mutation, and crossover. First, an initial stochastic population is established; the solutions are then sorted by their objective function, and the top-ranked portion is passed to the next generation. Next, pairs of solutions are chosen using Roulette wheel (RW) selection and merged to create new offspring. The mutation step is then applied in establishing the new population. Lastly, the objective values of the new population are evaluated [30]. Further, gene selection helps to understand the state of a cell affected by a disease. In particular, gene selection is the process of selecting the most dominant genes that can validly predict the class to which a cell specimen belongs [30].

ACO is a probabilistic approach to solving computational tasks. It searches for the optimal solution along the paths of a graph, although it may fall into a local optimum that differs from the global one [21]. On the other hand, Holland (1975) developed the GA, which is a population-based, stochastic optimization approach. The model is built on nature-inspired evolutionary procedures such as natural selection, inheritance, crossover, and mutation [1]. Next, a hybrid algorithm of GA and ACO (i.e., the GA-ACO algorithm) was proposed by Luan et al. [25] in 2019 and adopted to solve a linear programming model for the supplier selection task. The GA-ACO algorithm combines the rapid initial convergence of GA with the effective, parallel feedback of ACO. In the GA-ACO algorithm, the solutions generated by GA are used to determine the initial pheromones for ACO [25].

3 Methodology

RBFNN is normally designed as a three-layer structure composed of input, hidden, and output layers [36], in which the RBF interpolation is formulated as [17]:

$$ u(x) = \sum\limits_{i = 1}^{N} {w_{i} \xi_{i} (\left\| {x - x_{i} } \right\|_{2} )} , $$
(1)

where \(w_{i}\) are the weight values, \(\xi_{i}\) is the RBF, and \(\left\| {x - x_{i} } \right\|_{2}\) denotes the Euclidean distance between the new point \(x\) and a sample point \(x_{i}\). The RBF \(\xi_{i}\) used in this work is the Gaussian basis function:

$$ \xi (r) = e^{{ - (\varepsilon r)^{2} }} , $$
(2)

where \(r\) is the Euclidean distance between a vector of RBFNN input layer and a center of RBFNN hidden layer, and \(\varepsilon\) is the width factor determining the size of the RBF [36].
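As a minimal sketch of Eqs. (1) and (2), the following Python snippet evaluates a Gaussian RBF interpolant at a point. The function names, the example centers, and the example weights are hypothetical and chosen only for illustration; they are not taken from the experiments in this paper.

```python
import math

def gaussian_rbf(r, eps):
    """Eq. (2): xi(r) = exp(-(eps * r)^2), with width factor eps."""
    return math.exp(-(eps * r) ** 2)

def rbf_output(x, centers, weights, eps):
    """Eq. (1): weighted sum of Gaussian RBFs of the distances to the centers."""
    return sum(
        w * gaussian_rbf(math.dist(x, c), eps)  # math.dist is the Euclidean distance
        for w, c in zip(weights, centers)
    )

# Illustrative values only (assumed, not from the paper).
centers = [(0.0, 0.0), (1.0, 1.0)]
weights = [2.0, -1.0]
u = rbf_output((0.0, 0.0), centers, weights, eps=1.0)
```

A larger `eps` makes each basis function narrower, so a sample point influences the output only in its close vicinity.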

Further, as one class of local kernel function (KF), the RBF has a parameter \(\varepsilon\) that determines the width of the KF. Only the sample points in the vicinity of the trial point influence the output of the function. In other words, the RBF has local characteristics and interpolation capability [35]. Thus, the nonlinear function adopted in the RBFNN hidden layer is the Gaussian basis function shown in Eq. (2). Additionally, a typical hidden node in an RBFNN is characterized by its center, a vector whose dimension equals the number of inputs to the node. The framework of the proposed HAG algorithm is illustrated in Fig. 1.

Fig. 1. The framework for the proposed HAG algorithm

3.1 The Detailed Description of the Proposed HAG Algorithm

This paper focuses on training and tuning the parameters of the RBFNN. The best set of parameter values found by the proposed HAG algorithm is adopted in the RBFNN to solve the function approximation problem. The purpose is to maximize a fitness function with respect to the parameters of the RBFNN (i.e., the hidden node center, width, and weight between the hidden and output layers). The inverse of the mean absolute error (MAE), i.e., \(MAE^{-1}\), is adopted as the fitness function. The fitness values for the HAG algorithm in the experiment are calculated by Eq. (3):

$$ Fitness = MAE^{ - 1} = N \cdot \left( {\sum\limits_{i = 1}^{N} {\left| {y_{i} - \hat{y}_{i} } \right|} } \right)^{ - 1} , $$
(3)

where \(y_{i}\) is the actual output; \(\hat{y}_{i}\) is the predicted output of the learned RBFNN for the \(i^{th}\) testing pattern; and \(N\) is the number of patterns in the testing set. In this way, the RBFNN can be trained to approximate the two benchmark functions with a higher degree of precision. The procedure of the HAG algorithm is summarized as follows.

  1. (1)

    Initialization: Random initialization, corresponding to natural random selection, ensures diversity among units (i.e., ants in the ACO-based approach; chromosomes in the GA-based approach) and benefits the subsequent evolutionary procedure. An initial population with a number of units is produced, and the initialization proceeds as follows.

    1. (a)

      Each unit in the initial population is the set of neuron positions (i.e., \(c_{i,j}^{t}\)) and widths (i.e., \(d_{i}^{t}\)) of the RBFNN, defined in matrix form. The results are adopted as the number of neurons in the RBFNN.

    2. (b)

      The weights \(w_{i}\) are obtained by solving the linear system [17]:

$$ {\rm A}w = u, $$
(4)

      where \({\rm A} = A_{ij} = \xi_{i} (\left\| {x - x_{i} } \right\|_{2} )\) and \(u = u(x_{i} )\) are the values of the investigated function at the sample points. The chosen RBFs generate a positive-definite matrix \({\rm A}\), which guarantees a unique solution to Eq. (4) [17].

    3. (c)

      The fitness value of each unit matrix in the population is calculated through Eq. (3) (i.e., \(MAE^{-1}\)).

  2. (2)

    ACO-based approach [8, 31]:

    Let the ant nest include K ants. At the start of the optimization procedure, all paths are initialized with an equal quantity of pheromone. In each generation, ants begin at the nest node, traverse the layers from the first to the last, and finish at the target node [31]. Following Eq. (5), each ant selects only one node in every layer [8]:

$$ P_{ij}^{k} = \begin{cases} \dfrac{\tau_{ij}^{\alpha}}{\sum\limits_{l \in K_{i}^{k}} \tau_{il}^{\alpha}} & \text{if } j \in K_{i}^{k}, \\ 0 & \text{if } j \notin K_{i}^{k}, \end{cases} $$
(5)

where \(P_{ij}^{k}\) indicates the probability of selecting \(j\) as the next node for ant \(k\) located at node \(i\), \(\tau_{ij}\) is the pheromone trail, \(\alpha\) is the pheromone sensitivity, and \(K_{i}^{k}\) is the set of nodes that ant \(k\) may select when located at node \(i\).

When its route is finished, the ant deposits pheromone on the route according to the local trail updating rule given by Eq. (6):

$$ \tau_{ij} = \tau_{ij} + \Delta \tau^{k} , $$
(6)

where \(\Delta \tau^{k}\) is the pheromone deposited by the \(k\)th ant on the route it has traversed.

When all the ants have completed their routes, the pheromones on the global best route are revised using the global trail updating rule given by Eq. (7):

$$ \tau_{ij} = (1 - \rho )\tau_{ij} + \sum\limits_{k = 1}^{K} {\Delta \tau_{ij}^{k} } , $$
(7)

where \(\rho\) is the pheromone evaporation rate, \(\Delta \tau_{ij}^{k}\) is the pheromone deposited by the best ant \(k\) on the route \(ij\), estimated as \(Q \cdot (fitness^{k} )^{ - 1}\), and \(Q\) is a constant [8]. Furthermore, in the population of the ACO-based approach, each ant arrives at a better solution by referring to itself and other ants, determines its direction of travel, and is therefore able to explore the global search space.
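The ACO rules in Eqs. (5) to (7) can be sketched in Python as follows. This is an illustrative reading rather than the exact implementation used in the experiments: the pheromone matrix layout `tau[i][j]`, the function names, and the choice to apply Eq. (7) only to the edges of the best route (as the text above describes) are assumptions.

```python
import random

def selection_probabilities(tau_row, allowed, alpha):
    """Eq. (5): probability of moving to each node j from the current node."""
    total = sum(tau_row[j] ** alpha for j in allowed)
    return [
        tau_row[j] ** alpha / total if j in allowed else 0.0
        for j in range(len(tau_row))
    ]

def choose_node(tau_row, allowed, alpha, rng):
    """Sample the next node according to the probabilities of Eq. (5)."""
    probs = selection_probabilities(tau_row, allowed, alpha)
    return rng.choices(range(len(tau_row)), weights=probs, k=1)[0]

def local_update(tau, route, delta):
    """Eq. (6): the ant adds delta to every edge (i, j) of its own route."""
    for i, j in route:
        tau[i][j] += delta

def global_update(tau, best_route, rho, q, fitness):
    """Eq. (7) on the best route: evaporate, then deposit Q / fitness."""
    deposit = q / fitness  # deposit as stated in the text: Q * fitness^-1
    for i, j in best_route:
        tau[i][j] = (1.0 - rho) * tau[i][j] + deposit
```

For example, with trails `[1.0, 1.0, 2.0]`, allowed nodes `{0, 2}`, and `alpha = 2`, the selection probabilities are 0.2, 0, and 0.8.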

  1. (3)

    Duplication: The population improved through the learning of the ACO-based approach [8, 20] is replicated and named the ACO population.

  2. (4)

    GA-based approach: The GA evolution, which applies crossover and mutation operators to the population learned by the ACO-based approach, produces what is called the [GA+ACO] subpopulation. The operators used in the GA-based approach are as follows.

    1. (a)

      Each row of the chosen pair \(C_{t}\) undergoes the crossover operator with probability Pc.

    2. (b)

      Through the mutation operator, values are replaced by randomly chosen values from the range of the search domain in each dimension, which maintains diversity and produces new solutions.

  3. (5)

    Reproduction: The [ACO+GA] and [GA+ACO] subpopulations are merged after the refined evolution. The same number of units as in the initial population is randomly chosen via proportional RW selection [13] for the subsequent evolution. Thus, by applying the ACO-based and GA-based approaches to conduct exploration and exploitation of the solution space, respectively, the algorithm is expected to obtain the optimal solution through their complementary properties.

    Additionally, owing to the local search feature of the GA-based approach, all units in the population, whatever their fitness values, have the chance to improve through the genetic operators and to enter the next iteration of the population. The GA-based approach is thus able to exploit potential solutions.

  4. (6)

    Termination: The HAG algorithm returns to step (2) until a specified number of iterations has been reached.
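The genetic operators in steps (4) and (5) can be sketched as below. The sketch assumes each unit is a flat list of real-valued parameters; the one-point cut, the per-gene mutation probability `pm`, and all function names are assumptions made for illustration, since the steps above do not fix these details.

```python
import random

def roulette_wheel(population, fitness, rng):
    """Proportional RW selection: pick one unit with probability ~ fitness."""
    return rng.choices(population, weights=fitness, k=1)[0]

def crossover(parent_a, parent_b, pc, rng):
    """One-point crossover applied with probability pc (cut point is assumed)."""
    if rng.random() >= pc or len(parent_a) < 2:
        return parent_a[:], parent_b[:]  # no crossover: return copies
    cut = rng.randrange(1, len(parent_a))
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b

def mutate(unit, pm, lower, upper, rng):
    """Replace each value, with probability pm, by a random value in the domain."""
    return [
        rng.uniform(lower, upper) if rng.random() < pm else gene
        for gene in unit
    ]
```

Because selection is proportional to fitness rather than elitist, units with low fitness keep a nonzero chance of entering the next iteration, which matches the diversity argument made in step (5).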

In summary, executing an evolution program via the ACO-based approach yields an improved population that is better than the initial population. Moreover, the global search feature of the ACO-based approach allows wider exploration of the domain in each dimension across experiments, so that the explored solution space can be expanded. On the other hand, as the HAG algorithm progresses, the members of the population evolve gradually. In this way, the HAG algorithm retains the essence of the GA-based approach, ensures genetic diversity in the refinement of future evolution, and progresses toward a new, improved population. Besides, as the GA-based approach within the HAG algorithm evaluates the fitness values of the parameter solutions of the units in the population, better solutions are obtained gradually. Accordingly, the solutions in the population improve progressively and converge toward the global optimal solution.
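Step (1)(b) obtains the output weights from the linear system in Eq. (4). The following sketch shows one way to do this with the Gaussian RBF of Eq. (2), using plain Gaussian elimination with partial pivoting. It assumes the interpolation matrix is built from the pairwise distances between sample points; the helper names and the sample data are hypothetical.

```python
import math

def gaussian_rbf(r, eps):
    """Eq. (2): Gaussian basis function with width factor eps."""
    return math.exp(-(eps * r) ** 2)

def solve_linear(a, b):
    """Solve a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - s) / m[r][r]
    return w

def fit_weights(points, values, eps):
    """Build A from pairwise distances (assumed layout) and solve Eq. (4)."""
    a = [[gaussian_rbf(math.dist(p, q), eps) for q in points] for p in points]
    return solve_linear(a, values)

# Illustrative sample data only (assumed, not from the paper).
points = [(0.0,), (1.0,), (2.0,)]
values = [0.0, 1.0, 4.0]
weights = fit_weights(points, values, eps=1.0)
```

With these weights, the interpolant of Eq. (1) reproduces the given values at the sample points up to rounding error.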

4 Experimental Results

This section focuses on training and tuning the parameters of the RBFNN for the function approximation problem. The objective is to maximize a fitness function with respect to the parameters of the RBFNN, that is, to find suitable parameter values within the domain set in the experiment. Through training, the proposed HAG algorithm gradually obtains a set of solutions for the parameter values.
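The fitness function of Eq. (3) can be sketched in a few lines of Python. The function name and the handling of a zero total error are assumptions; the paper does not state how a perfect fit is treated.

```python
def inverse_mae(actual, predicted):
    """Eq. (3): fitness = N / sum(|y_i - y_hat_i|), i.e. the inverse of the MAE."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total_error = sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted))
    if total_error == 0.0:
        return float("inf")  # assumption: a perfect fit gets unbounded fitness
    return len(actual) / total_error

# Three testing patterns with absolute errors 0.5, 0.0 and 1.0.
fitness = inverse_mae([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```

Here the MAE is 1.5 / 3 = 0.5, so the fitness is 2.0; a smaller error gives a larger fitness, which is why the algorithm maximizes this quantity.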

4.1 Benchmark Problems Experiment

Continuous test functions are well suited to assessing how closely the RBFNN approximates a nonlinear mapping relation. This paper applies two continuous test functions that are commonly used in the literature as comparative benchmarks for the evaluated algorithms. The experiment thus contains two benchmark problems, the Rosenbrock and Griewank [4] continuous test functions, which are listed in Table 1.

Table 1. Two benchmark continuous test functions [4] adopted in this experiment
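For reference, the two benchmark functions can be written in Python as below. The definitions follow the forms commonly given in the literature; the dimension and search domain used in this experiment are those of Table 1 and are not assumed here.

```python
import math

def rosenbrock(x):
    """Rosenbrock function; global minimum 0 at x = (1, ..., 1)."""
    return sum(
        100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
        for i in range(len(x) - 1)
    )

def griewank(x):
    """Griewank function; global minimum 0 at x = (0, ..., 0)."""
    quadratic = sum(xi ** 2 for xi in x) / 4000.0
    oscillation = math.prod(
        math.cos(xi / math.sqrt(i)) for i, xi in enumerate(x, start=1)
    )
    return quadratic - oscillation + 1.0
```

Rosenbrock has a narrow curved valley around its minimum, while Griewank has many regularly spaced local minima, so the two functions stress different aspects of an approximator.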

4.2 Parameter Setup

In the proposed HAG algorithm, four parameters that have a major impact on the calculation results (i.e., pheromone trail, pheromone decay rate, crossover probability, and mutation probability) are analyzed. This paper also referred to the related literature for the range of the parameter values. The parameters of the HAG algorithm are tuned using the Taguchi experimental design [33] instead of a trial-and-error procedure [38]. The maximum number of iterations is set to 1,000 as the termination condition in the experiment. Finally, the parameter settings evaluated for the proposed HAG algorithm are listed in Table 2.

Table 2. Parameter setup for the proposed HAG algorithm

4.3 Performance Assessment and Comparison

This section discusses how all algorithms learn the sets of parameter values (i.e., hidden node center, width, and weight) for the RBFNN that are generated by the population during the experiment. A total of 1000 randomly generated data points are divided into three parts to train the RBFNN: a training set (65%), a testing set (25%), and a validation set (10%) [24], with which the learning status can be assessed and the parameter settings tuned. This study uses the algorithms to find the best set of parameter values for the RBFNN. First, a 65% training set is drawn at random without repetition from the 1000 generated data points and input to the RBFNN for training. In the same way, a 25% testing set is drawn at random without repetition to examine the parameter solution of each unit in the population and to calculate its fitness value. At this point, the RBFNN has used 90% of the dataset in the learning stage. After 1000 iterations of the evolution process, the best set of parameter values for the RBFNN is obtained. Finally, a validation set (10% of the dataset) is drawn at random without repetition to verify how well the parameter solution approximates the two benchmark problems, and the RMSE values are recorded to assess the learning status of the RBFNN. Once the data preparation steps above are complete, all algorithms are ready to execute. The learning and validation stages were carried out 50 times before the average RMSE (i.e., RMSEavg) values were calculated. The RMSEavg and standard deviation (SD) values for all algorithms are listed in Table 3.
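The data partition and the summary statistics described above can be sketched as follows. The function names are hypothetical, and the use of the sample standard deviation is an assumption, since the paper does not say whether the sample or population form is reported.

```python
import math
import random
import statistics

def split_indices(n, rng, train=0.65, test=0.25):
    """Shuffle 0..n-1 and cut into training, testing and validation parts."""
    idx = list(range(n))
    rng.shuffle(idx)  # random order, so no index is repeated across parts
    n_train = round(n * train)
    n_test = round(n * test)
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]

def rmse(actual, predicted):
    """Root mean squared error over one validation set."""
    squared = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

def summarize(rmse_values):
    """Average RMSE and sample SD over repeated runs (sample SD is assumed)."""
    return statistics.mean(rmse_values), statistics.stdev(rmse_values)

train_idx, test_idx, val_idx = split_indices(1000, random.Random(0))
```

In the protocol above, `summarize` would receive the 50 validation RMSE values, one per run, for each algorithm and benchmark function.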

Table 3. Result comparison among relevant algorithms adopted in this experiment

In Table 3, the results demonstrate that the HAG algorithm obtains the smallest values with stable performance throughout the training process of the experiment. Thus, the RBFNN obtains an individual parameter solution from the evolutionary learning process in the population that achieves the best function approximation. When the training of the RBFNN by the HAG algorithm is complete, the unit with the best set of parameter values in the learning stage determines the RBFNN setting. Additionally, the HAG algorithm shows robust learning on the two benchmark problems and remarkable approximation results.

5 Conclusions

This paper proposed the HAG algorithm, which incorporates ACO-based and GA-based approaches and provides the settings of the RBFNN parameters. The experimental results show that the ACO and GA algorithms can be integrated into a hybrid algorithm that achieves the most precise learning performance among all algorithms in this study. Additionally, the assessment results for the two benchmark continuous test function experiments show that the proposed HAG algorithm outperformed the other algorithms in the precision of function approximation.