Introduction

Inspired by cognitive science, the extreme learning machine (ELM) was proposed to overcome challenging issues (such as slow convergence and the local minima problem) faced by conventional gradient-based approaches for training a single hidden-layer feedforward neural network (SLFN) [1]. Since its inception, the ELM has attracted increasing attention from researchers all over the world. One of the salient features of the ELM is that the input weights and hidden biases are generated randomly instead of being tuned iteratively. The output weights are then computed from the prefixed input weights and hidden biases using the Moore-Penrose (MP) generalized inverse only once [2, 3]. Another characteristic of the ELM is that almost any nonlinear piecewise continuous random hidden nodes can be used in it [4, 5]. Compared with the conventional gradient-based approaches, the ELM has a significantly lower computational time and presents better generalization performance. Due to its efficiency and universal approximation capability, the ELM has been applied to various problems in different fields, such as classification [6-8], recognition [9-12], imbalanced learning [13, 14], steganalysis [15], and so on. However, due to the randomness, the input weights and hidden biases generated by the ELM may be non-optimal, which may hinder the minimization of the cost function. Since the output weights highly depend on the input weights and hidden biases, such non-optimal input weights and hidden biases may have a negative impact on the final results. Moreover, the ELM may require more hidden neurons than the conventional gradient-based algorithms in many cases [16-18].

To address the above drawbacks of the ELM and find a proper set of input weights and hidden biases, Huang et al. proposed a hybrid of the differential evolution algorithm and the ELM called the evolutionary ELM (E-ELM) [19]. Besides the E-ELM, several other bio-inspired methods and global search techniques have been combined with the ELM during the past several years. Cao et al. proposed an improved learning algorithm named the self-adaptive evolutionary ELM (SaE-ELM) and evaluated its performance on regression and classification problems [20]. Xu and Shu introduced a hybrid particle swarm optimization (PSO) and ELM algorithm for a prediction problem [21]. Saraswathi et al. presented a PSO-driven ELM, combined with the integer-coded genetic algorithm (ICGA), to solve gene selection and cancer classification [22]. Silva et al. combined the ELM with group search optimization (GSO) and evaluated the influence of four different ways of handling members that fly out of the search space bounds [23]. More contemporary soft computing techniques, such as ant colony optimization [24-26], the artificial bee colony algorithm [27-29], the firefly algorithm [30, 31], binary-coded PSO [32], and multi-subswarm PSO [33], are also promising candidates for combination with the ELM.

Recently, inspired by the biological characteristics and living habits shown in the dolphins’ predatory process, Wu et al. presented a novel evolutionary algorithm named the dolphin swarm algorithm (DSA) to solve optimization problems [34]. By simulating the dolphins’ predatory process, the DSA exploits the behavior rules of dolphins and the interactions between dolphins to produce changes and achieve certain goals at the group level. With the help of effective strategies, the DSA has shown many promising features, such as fast periodic convergence, freedom from local minima, and no specific demand on the cost function [34]. In comparison with classical evolutionary algorithms, the DSA has better global search ability, better stability, and higher convergence speed, as demonstrated on benchmark functions with different properties [35]. Therefore, compared with conventional evolutionary algorithms, the DSA is a preferable method to address the ELM’s drawbacks caused by non-optimal input weights and hidden biases.

In this paper, we introduce a new hybrid method named the dolphin swarm extreme learning machine (DS-ELM) for the SLFN. In the proposed algorithm, several sets of input weights and hidden biases are used. Each set is encoded into one vector, namely a dolphin. These dolphins are evaluated by the root mean squared error (RMSE) [19-23] and updated by the four pivotal phases of the DSA. After the four pivotal phases, the set of input weights and hidden biases represented by the best dolphin is selected. Then, the output weights are derived from the chosen set. To evaluate the effectiveness of our method, we compare the proposed algorithm with the standard ELM and three state-of-the-art methods (PSO-ELM, E-ELM, and SaE-ELM) on 13 benchmark datasets (the newest or the most popular datasets) obtained from the University of California Irvine (UCI) Machine Learning Repository [36]. The experimental results show that the DS-ELM achieves better generalization performance than all the compared algorithms.

The remainder of this paper is organized as follows. “Preliminaries” presents the standard ELM and the DSA. The DS-ELM is then proposed in “Dolphin Swarm Extreme Learning Machine (DS-ELM).” “Experiments” reports the experiments, in which the DS-ELM is compared with the standard ELM, PSO-ELM, E-ELM, and SaE-ELM on six regression benchmark datasets and seven classification benchmark datasets obtained from the UCI Machine Learning Repository [36]. A summary of this paper is given in the last section.

Preliminaries

Extreme Learning Machine (ELM)

The ELM was originally proposed for the SLFN by Huang et al. [1]. It was then extended to the generalized SLFN, where the hidden layer need not be neuron-like [5]. The main concept of the ELM is that the input weights and hidden biases are generated randomly instead of being tuned iteratively. Moreover, the output weights are computed by the MP generalized inverse only once [2, 3]. Therefore, the ELM has a significantly lower computational time compared with the conventional gradient-based approaches. Besides, ELM theories have shown that almost any nonlinear piecewise continuous random hidden nodes can be used in the ELM [4, 5], and the resultant networks have universal approximation capabilities [37-39].

The output function of the ELM for generalized SLFNs with L hidden nodes is

$$ {f}_L(x)=\sum_{i=1}^L{\beta}_i{g}_i(x)=\sum_{i=1}^L{\beta}_i G\left({a}_i,{b}_i, x\right), x\in {R}^d,{\beta}_i\in {R}^m $$
(1)

where a i and b i are the input weight and hidden bias of the ith hidden node, respectively; β i denotes the weight linking the ith hidden node to the output neurons; and g i (x) denotes the output function G(a i , b i , x) of the ith hidden node. The output function G(a, b, x) is a nonlinear piecewise continuous function satisfying the ELM universal approximation capability theorems [4, 5]. For additive nodes with activation function g(x), g i (x) is defined as

$$ {g}_i(x)= G\left({a}_i,{b}_i, x\right)= g\left({a}_i\cdot x+{b}_i\right),{a}_i\in {R}^d,{b}_i\in R $$
(2)

For radial basis function (RBF) nodes with activation function g(x), g i (x) is defined as

$$ {g}_i(x)= G\left({a}_i,{b}_i, x\right)= g\left({b}_i\left\Vert x-{a}_i\right\Vert \right),{a}_i\in {R}^d,{b}_i\in {R}^{+} $$
(3)
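To make Eqs. (2) and (3) concrete, the following minimal Python sketch (not part of the paper's Matlab implementation) evaluates both node types with a sigmoid activation g; the function names and the choice of sigmoid are illustrative assumptions.

```python
import numpy as np

def g(z):
    """Sigmoid activation, one common choice for g(.)."""
    return 1.0 / (1.0 + np.exp(-z))

def additive_node(a_i, b_i, x):
    """Eq. (2): g_i(x) = g(a_i . x + b_i), with a_i in R^d and b_i in R."""
    return g(np.dot(a_i, x) + b_i)

def rbf_node(a_i, b_i, x):
    """Eq. (3): g_i(x) = g(b_i * ||x - a_i||), with a_i in R^d and b_i > 0."""
    return g(b_i * np.linalg.norm(x - a_i))
```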

Suppose we are training the generalized SLFN with L hidden neurons to learn N arbitrary samples (x i , t i ). Since the input weights and hidden biases are generated randomly, the nonlinear system can be converted to a linear model:

$$ H\beta = T $$
(4)

where

$$ H={\left[\begin{array}{ccc} G\left({a}_1,{b}_1,{x}_1\right) & \cdots & G\left({a}_L,{b}_L,{x}_1\right)\\ \vdots & \ddots & \vdots \\ G\left({a}_1,{b}_1,{x}_N\right) & \cdots & G\left({a}_L,{b}_L,{x}_N\right)\end{array}\right]}_{N\times L} $$
(5)
$$ \beta ={\left[\begin{array}{c}{\beta}_1^T\\ \vdots \\ {\beta}_L^T\end{array}\right]}_{L\times m}\quad \mathrm{and}\quad T={\left[\begin{array}{c}{t}_1^T\\ \vdots \\ {t}_N^T\end{array}\right]}_{N\times m} $$
(6)

H is the hidden-layer output matrix; β = [β 1, β 2, … , β L ]T is the output weight matrix; and T = [t 1, t 2, … , t N ]T is the matrix of desired outputs. Since the input weights and hidden biases are generated randomly, the matrix H can be computed directly from the input samples. Training the generalized SLFN is thus converted into finding the least-squares (LS) solution to the given linear model. The minimum norm LS solution to the linear model Eq. (4) is

$$ \widehat{\beta}={H}^{\dagger } T $$
(7)

where H † is the MP generalized inverse of the matrix H. By utilizing the MP generalized inverse, the ELM is able to obtain good generalization performance [2, 3].

In summary, the ELM algorithm is presented in algorithm 1.

[Algorithm 1: The extreme learning machine (ELM)]
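As a rough illustration of Algorithm 1 (not the authors' implementation), the following Python sketch builds H from Eq. (5) with sigmoid additive nodes and solves Eq. (7) with the MP generalized inverse; the names, shapes, and activation choice are assumptions.

```python
import numpy as np

def elm_train(X, T, L, seed=0):
    """X: N x d inputs, T: N x m targets, L: number of hidden nodes."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    A = rng.uniform(-1.0, 1.0, size=(L, d))    # random input weights a_i
    b = rng.uniform(-1.0, 1.0, size=L)         # random hidden biases b_i
    H = 1.0 / (1.0 + np.exp(-(X @ A.T + b)))   # hidden-layer output matrix, Eq. (5)
    beta = np.linalg.pinv(H) @ T               # minimum-norm LS solution, Eq. (7)
    return A, b, beta

def elm_predict(X, A, b, beta):
    """Eq. (1): f_L(x) = sum_i beta_i g(a_i . x + b_i)."""
    H = 1.0 / (1.0 + np.exp(-(X @ A.T + b)))
    return H @ beta
```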

Dolphin Swarm Algorithm (DSA)

The DSA was introduced by Wu et al. as an efficient global search method for varied optimization problems [34]. It is mainly implemented by simulating the biological characteristics and living habits shown in the dolphins’ actual predatory process, such as echolocation, information exchange, cooperation, and division of labor. By simulating this predatory process, the DSA can make use of the behavior rules of dolphins and the interactions between dolphins to produce changes at the group level and achieve certain goals. With the help of effective strategies, the DSA has shown many promising features, such as fast periodic convergence, freedom from local minima, and no specific demand on the cost function [34]. Compared with conventional evolutionary algorithms, the DSA has better global search ability, better stability, and higher convergence speed, as demonstrated on benchmark functions with different properties [35].

The dolphins’ actual predatory process consists of three stages. In the first stage, each dolphin independently uses sounds to search for nearby prey and evaluates the surrounding environment by the echoes. In the second stage, the dolphins exchange their information: dolphins that find large prey call other dolphins for help, and dolphins that receive this information move toward the prey and surround it along with the others. In the last stage, the prey is surrounded by the dolphins, which then take turns feeding on it, and the predatory process is accomplished [34].

In the DSA, a swarm of N dolphins is kept. In the D-dimensional search space of the optimization problem, the dolphins are defined as Dol i  = [x 1, x 2, … , x D ]T (i = 1, 2, … , N), where x j (j = 1, 2, ⋯ , D) is the component in the jth dimension to be optimized. For each Dol i , there are two corresponding variables L i (i = 1, 2, ⋯ , N) and K i (i = 1, 2, ⋯ , N), where L i represents the best solution Dol i finds in a single search and K i represents the best solution Dol i has found by itself or received from others. The fitness function, denoted fitness(X), is the basis for judging whether one position is better than another. Three kinds of distances are used: the distance between Dol i and Dol j , named DD i , j ; the distance between Dol i and K i , named DK i ; and the distance between L i and K i , named DKL i . Moreover, the information exchange process is maintained by an N × N transmission time matrix TS, where the term TS i , j represents the remaining time of the sound spreading from Dol j to Dol i .
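The following sketch shows one way these data structures might be initialized in Python; the variable names, the uniform initialization, and the initial value of TS are assumptions made for illustration.

```python
import numpy as np

def init_dsa(N, D, lower, upper, fitness, search_range, speed, seed=0):
    """Initialize N dolphins in a D-dimensional search space."""
    rng = np.random.default_rng(seed)
    Dol = rng.uniform(lower, upper, size=(N, D))   # positions Dol_i
    K = Dol.copy()                                 # K_i: best solution known to Dol_i
    L_best = Dol.copy()                            # L_i: best solution found in one search
    fit_K = np.array([fitness(k) for k in K])      # cached fitness of each K_i
    # transmission time matrix TS; one plausible initial value is ceil(range / speed)
    TS = np.full((N, N), np.ceil(search_range / speed))
    return Dol, K, L_best, fit_K, TS
```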

In general, the DSA can be divided into six phases, and in each phase the dolphins perform distinct tasks. Apart from the initialization phase and the termination phase, the four pivotal phases of the DSA, namely the search phase, call phase, reception phase, and predation phase, are expounded below for a better understanding.

Search Phase

In the search phase, each dolphin searches its nearby area by making sounds V i  = [v 1, v 2, ⋯ , v D ]T (i = 1, 2, ⋯ , M) toward M random directions, where v j (j = 1, 2, ⋯ , D) is the component in each dimension. The sounds satisfy ‖V i ‖ = speed (i = 1, 2, ⋯ , M), where speed is a constant. Within the maximum search time T, the sound V j that Dol i makes at time t will search a new solution X ijt :

$$ {X}_{i jt}={\mathrm{Dol}}_i+{V}_j\times t $$
(8)

For the new solution X ijt that Dol i gets, calculate its fitness E ijt :

$$ {E}_{ijt}=\mathrm{fitness}\left({X}_{ijt}\right) $$
(9)

If

$$ {E}_{iab}=\underset{j=1,\dots, M; t=1,\dots, T}{ \min }{E}_{ijt}=\underset{j=1,\dots, M; t=1,\dots, T}{ \min}\mathrm{fitness}\left({X}_{ijt}\right) $$
(10)

then L i is determined as

$$ {L}_i={X}_{i ab} $$
(11)

If

$$ \mathrm{fitness}\left({L}_i\right)<\mathrm{fitness}\left({K}_i\right) $$
(12)

then replace K i by L i . Otherwise K i does not change.
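A minimal sketch of the search phase for a single dolphin, following Eqs. (8)-(12); using normalized Gaussian vectors for the M random sound directions is an assumption.

```python
import numpy as np

def search_phase(Dol_i, K_i, fitness, M, T, speed, rng):
    """Return the best solution L_i found by Dol_i and its (possibly updated) K_i."""
    L_i, best_fit = None, np.inf
    for _ in range(M):
        V = rng.standard_normal(Dol_i.shape)
        V *= speed / np.linalg.norm(V)            # ||V_j|| = speed
        for t in range(1, T + 1):
            X_ijt = Dol_i + V * t                 # Eq. (8)
            E_ijt = fitness(X_ijt)                # Eq. (9)
            if E_ijt < best_fit:                  # Eqs. (10)-(11)
                L_i, best_fit = X_ijt, E_ijt
    if best_fit < fitness(K_i):                   # Eq. (12)
        K_i = L_i
    return L_i, K_i
```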

Call Phase

In the call phase, each dolphin will make sounds to inform other dolphins of its result in the search phase. And the transmission time matrix TS needs to be updated as follows:

For K i , K j , and TS i , j , if

$$ \mathrm{fitness}\left({K}_i\right)>\mathrm{fitness}\left({K}_j\right) $$
(13)

and

$$ {\mathrm{TS}}_{i, j}>\frac{DD_{i, j}}{\mathrm{speed}} $$
(14)

then update TS i , j as

$$ {\mathrm{TS}}_{i, j}=\frac{DD_{i, j}}{\mathrm{speed}} $$
(15)
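A sketch of the call phase update of TS (Eqs. (13)-(15)); DD is assumed to be a precomputed N × N matrix of pairwise distances and fit_K the cached fitness of each K i.

```python
import numpy as np

def call_phase(fit_K, DD, TS, speed):
    """Shorten TS[i, j] whenever dolphin j holds a better K than dolphin i."""
    N = len(fit_K)
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            if fit_K[i] > fit_K[j]:              # Eq. (13): K_j is better (smaller fitness)
                t_need = DD[i, j] / speed
                if TS[i, j] > t_need:            # Eq. (14)
                    TS[i, j] = t_need            # Eq. (15)
    return TS
```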

Reception Phase

When the DSA enters the reception phase, one is first subtracted from every term TS i , j to indicate that the sounds have spread for one unit of time. Then, all the terms TS i , j are checked: if

$$ {\mathrm{TS}}_{i, j}=0 $$
(16)

then the sound spreading from Dol j to Dol i has been received. Compare K i with K j : if Eq. (13) is satisfied, replace K i by K j ; otherwise, K i does not change. Then reset TS i , j as

$$ {\mathrm{TS}}_{i, j}=\left\lceil \frac{\mathrm{range}}{\mathrm{speed}}\right\rceil $$
(17)

where range is the length of the search space.
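A sketch of the reception phase (Eqs. (16)-(17)); TS is assumed to be a float array, and the comparison TS[i, j] <= 0 is used in place of exact equality with zero.

```python
import numpy as np

def reception_phase(K, fit_K, TS, search_range, speed):
    """Deliver sounds whose remaining time has elapsed and reset their TS entries."""
    N = TS.shape[0]
    TS -= 1.0                                     # sounds spread one unit of time
    for i in range(N):
        for j in range(N):
            if i != j and TS[i, j] <= 0:
                if fit_K[i] > fit_K[j]:           # Eq. (13): received K_j is better
                    K[i], fit_K[i] = K[j].copy(), fit_K[j]
                TS[i, j] = np.ceil(search_range / speed)   # Eq. (17)
    return K, fit_K, TS
```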

Predation Phase

In the predation phase, each dolphin needs to obtain a new position according to its known information, and two cases need to be distinguished.

  • For DK i , if

$$ {DK}_i\le T\times \mathrm{speed} $$
(18)

we can get Dol i ’s new position newDol i :

$$ {\mathrm{newDol}}_i={K}_i+ w\times \left({\mathrm{Dol}}_i-{K}_i\right)\times \left(1-\frac{2}{e}\right) $$
(19)

where e is a constant which is greater than 2 and w is an arbitrary unit vector.

  • For DK i , if

$$ {DK}_i > T\times \mathrm{speed} $$
(20)

we can get Dol i ’s new position newDol i :

$$ {\mathrm{newDol}}_i={K}_i+ w\times \left(1-\frac{{DK}_i\times \frac{1}{\mathrm{fitness}\left({K}_i\right)}+\left({DK}_i-{DKL}_i\right)\times \frac{1}{\mathrm{fitness}\left({K}_i\right)}}{e\times {DK}_i\times \frac{1}{\mathrm{fitness}\left({K}_i\right)}}\right)\times {DK}_i $$
(21)

After all the Dol i get their new position newDol i , compare newDol i with K i in fitness. If

$$ \mathrm{fitness}\left({\mathrm{newDol}}_i\right)<\mathrm{fitness}\left({K}_i\right) $$
(22)

then replace K i by newDol i . Otherwise K i does not change.
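A sketch of the predation phase for one dolphin (Eqs. (18)-(22)). Since w is a unit vector, the term w × (Dol i − K i) in Eq. (19) is interpreted here as w scaled by the distance DK i; this interpretation, like the variable names, is an assumption.

```python
import numpy as np

def predation_phase(Dol_i, K_i, L_i, fitness, T, speed, e, rng):
    """Compute newDol_i and possibly update K_i."""
    w = rng.standard_normal(Dol_i.shape)
    w /= np.linalg.norm(w)                        # arbitrary unit vector
    DK = np.linalg.norm(Dol_i - K_i)              # DK_i
    DKL = np.linalg.norm(L_i - K_i)               # DKL_i
    fK = fitness(K_i)
    if DK <= T * speed:                           # Eq. (18)
        new_Dol = K_i + w * DK * (1.0 - 2.0 / e)              # Eq. (19)
    else:                                         # Eq. (20)
        num = DK / fK + (DK - DKL) / fK
        den = e * DK / fK
        new_Dol = K_i + w * (1.0 - num / den) * DK            # Eq. (21)
    if fitness(new_Dol) < fK:                     # Eq. (22)
        K_i = new_Dol
    return new_Dol, K_i
```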

If the end condition is satisfied, DSA enters the termination phase. Otherwise, DSA enters the search phase again.

Although there are many parameters used in DSA, only a few of them are user-specified, including the number of dolphins N, the number of sounds M, the speed of sounds speed, maximum search time T, and the constant e used in the predation phase.

In summary, the DSA is presented in algorithm 2.

[Algorithm 2: The dolphin swarm algorithm (DSA)]

Dolphin Swarm Extreme Learning Machine (DS-ELM)

The standard ELM generates the input weights and hidden biases randomly. However, some of the generated sets of input weights and hidden biases are non-optimal, which may hinder the minimization of the cost function and have a negative impact on the final results. There are two feasible ways to avoid such non-optimal input weights and hidden biases. One is to simply increase the number of hidden nodes, which, however, may weaken the performance of the ELM. The other is to find a proper set of input weights and hidden biases that achieves favorable performance with fewer hidden nodes and a more compact network. In this section, we present our new hybrid approach, the DS-ELM, which improves the performance of the ELM in the latter way by taking advantage of the DSA’s strong global search ability.

In the DS-ELM, a swarm of N dolphins is kept. Each dolphin represents one set of input weights and hidden biases, namely Dol i  = [w 11, w 12,  … , w 1K , w 21, w 22,  … , w M1, w M2,  … , w MK , b 1, b 2,  … , b K ]T , (i = 1, 2,  … , N). All the input weights and hidden biases are randomly initialized within the range of [−1, 1]. For each dolphin, the corresponding output weights β are computed by Eq. (7).

Then, we utilize the RMSE as the fitness function [19-23]:

$$ \mathrm{fitness}\left({\mathrm{Dol}}_i\right)=\sqrt{\frac{\sum_{j=1}^M{\left\Vert {\sum}_{k=1}^K{\beta}_k\, g\left({w}_k\cdot {x}_j+{b}_k\right)-{t}_j\right\Vert}_2^2}{M}} $$
(23)

where T = [t 1, t 2, … , t M ]T is the desired output, M is the number of training samples, K is the number of hidden nodes, and w k and b k are the input weights and hidden bias of the kth hidden node.
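A sketch of the fitness evaluation in Eq. (23) for one dolphin: the flat vector is decoded into input weights and hidden biases, the output weights are solved with Eq. (7), and the RMSE on the training samples is returned. The flat layout, the sigmoid activation, and the variable names are assumptions.

```python
import numpy as np

def dolphin_fitness(dol, X, T, K):
    """dol: flat vector of K*d input weights followed by K hidden biases."""
    M, d = X.shape                                 # M training samples of dimension d
    W = dol[:K * d].reshape(K, d)                  # input weights, one row per hidden node
    b = dol[K * d:K * d + K]                       # hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))       # hidden-layer output matrix, M x K
    beta = np.linalg.pinv(H) @ T                   # output weights, Eq. (7)
    err = H @ beta - T                             # network output minus desired output
    return np.sqrt(np.sum(err ** 2) / M)           # RMSE, Eq. (23)
```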

With N dolphins and the fitness function, we are able to adopt the DSA to our DS-ELM. After the search phase, the call phase, the reception phase, and the predation phase described in “Dolphin Swarm Algorithm (DSA),” an optimal set of input weights and hidden biases will be obtained eventually.

In summary, the DS-ELM is presented in algorithm 3.

[Algorithm 3: The dolphin swarm extreme learning machine (DS-ELM)]

Experiments

In this section, the DS-ELM is compared with the standard ELM and three state-of-the-art methods, namely the PSO-ELM, E-ELM, and SaE-ELM. The experiments are composed of two parts. In the first part, the five algorithms are tested on six regression benchmark datasets, and seven classification benchmark datasets are employed in the second part. These 13 benchmark datasets are obtained from the UCI Machine Learning Repository [36]. All the simulations are conducted in the Matlab 9.1 environment (the latest version in 2016) on a machine with an Intel Core i7 2 GHz CPU and 8 GB of RAM.

In our experiments, each dataset is divided into two parts, namely the training set and the testing set. We adopt a tenfold cross-validation procedure on the training set and use the validation folds to determine the suitable parameters of the five algorithms. The number of hidden nodes is gradually increased over a preset grid {5, 10, 15, … , 200}, and the value with the lowest validation error is selected. Moreover, the number of individuals (dolphins in the DS-ELM, particles in the PSO-ELM, and population members in the E-ELM and SaE-ELM) is 5, and the number of times the ELM is invoked is equally set to 1000 for the DS-ELM, PSO-ELM, E-ELM, and SaE-ELM. The other parameters, suggested by [20, 21, 34], are shown in Table 1.
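For illustration, the hidden-node selection described above might be implemented as the following grid search with tenfold cross-validation; train_and_score is a hypothetical helper that trains one model and returns its validation error, and all other names are assumptions.

```python
import numpy as np

def select_hidden_nodes(X, T, train_and_score, folds=10, seed=0):
    """Pick the number of hidden nodes from {5, 10, ..., 200} by cross-validation."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(X.shape[0]), folds)
    best_L, best_err = None, np.inf
    for L in range(5, 201, 5):
        errs = []
        for k in range(folds):
            val = parts[k]
            tr = np.concatenate([parts[j] for j in range(folds) if j != k])
            errs.append(train_and_score(X[tr], T[tr], X[val], T[val], L))
        if np.mean(errs) < best_err:
            best_L, best_err = L, float(np.mean(errs))
    return best_L
```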

Table 1 Parameters used in the DS-ELM, PSO-ELM, and E-ELM

Regression

In this subsection, the five algorithms are compared on six regression benchmark datasets from the UCI Machine Learning Repository [36]: Servo, Breast Cancer Wisconsin (Cancer), Housing, Airfoil Self-Noise (Airfoil), Wine Quality (Wine), and Air Quality (Air). The Air dataset is one of the newest regression datasets in the UCI Machine Learning Repository [36], and the other five are among the most popular ones. The specifications of the six datasets are shown in Table 2.

Table 2 Specifications of the six regression benchmark datasets

All the attributes have been normalized to the range of [−1, 1], and the desired outputs have been normalized to the range of [0, 1]. To obtain a fair comparison and reduce the influence of accidental factors, all the experimental results are obtained by averaging 20 independent experiments. For each experiment, the training set and the testing set are randomly selected from the whole dataset, and the two sets have no overlap with each other to avoid overfitting. Moreover, the two sets are kept the same for each trial of the five algorithms. The results are shown in Table 3, and the best results are shown in italic font.

Table 3 Results of the five algorithms on the six regression benchmark datasets

In terms of the training time, it is easy to see that the ELM is the fastest, since the other four algorithms invoke it multiple times. Besides, there is not much difference in training time among the DS-ELM, PSO-ELM, E-ELM, and SaE-ELM, because invoking the ELM is the time-consuming step and the number of ELM invocations is the same for these four algorithms. Nevertheless, the DS-ELM is slightly faster than the PSO-ELM, E-ELM, and SaE-ELM.

As for the testing accuracy, the DS-ELM, PSO-ELM, E-ELM, and SaE-ELM obtain better results with fewer hidden nodes than the ELM, which means that they can achieve better generalization performance with more compact networks. Compared with the PSO-ELM, E-ELM, and SaE-ELM, the DS-ELM has smaller means and standard deviations. Wilcoxon’s signed-rank tests at the 95% confidence level show that the DS-ELM is significantly different from the ELM, PSO-ELM, E-ELM, and SaE-ELM, which indicates that the DS-ELM is superior to the other four methods on these six regression benchmark datasets.

Classification

In this subsection, the performances of the five algorithms are evaluated on seven classification benchmark datasets from the UCI Machine Learning Repository [36]: Iris, Heart, Pima Indians Diabetes (Diabetes), Image Segmentation (Image), Landsat Satellite (Satellite), Occupancy Detection (Occupancy), and Shuttle. The Occupancy dataset is one of the newest classification datasets in the UCI Machine Learning Repository [36], and the other six are among the most popular ones. The specifications of the seven datasets are listed in Table 4.

Table 4 Specifications of the seven classification benchmark datasets

All the attributes have been normalized to the range of [−1, 1]. As in the experiments in “Regression,” all the experimental results are obtained by averaging 20 independent experiments. For each experiment, the whole dataset is divided into two parts with no overlap, namely the training set and the testing set, and the two sets are kept the same for each trial of the five algorithms. The results are shown in Table 5, and the best results are emphasized with italic font.

Table 5 Results of the five algorithms on the seven classification benchmark datasets

Regarding the training time, the same conclusion can be drawn: the ELM runs fastest, and the other four methods spend nearly the same amount of time training the networks, which is theoretically reasonable as analyzed in “Regression.”

As for the testing accuracy, the DS-ELM achieves the highest mean testing accuracy and the lowest standard deviation on all the classification datasets. In addition, Wilcoxon’s signed-rank tests at the 95% confidence level show that the DS-ELM is significantly different from the ELM, PSO-ELM, E-ELM, and SaE-ELM, which indicates that the DS-ELM outperforms the other four approaches on these seven classification benchmark datasets.

Conclusion

In this paper, we first introduced the standard ELM and the DSA. We then proposed a new hybrid approach named the dolphin swarm extreme learning machine for the SLFN. In our algorithm, the DSA is used to optimize the input weights and hidden biases of the ELM, and the ELM is employed to calculate the output weights. In the experiments, our method was compared with the standard ELM and three state-of-the-art methods, namely the PSO-ELM, E-ELM, and SaE-ELM, on six regression benchmark datasets and seven classification benchmark datasets obtained from the UCI Machine Learning Repository. The experimental results show that our method achieves better generalization performance with more compact networks than the standard ELM. Moreover, our algorithm achieves better testing results (smaller RMSE on regression and higher accuracy on classification) than the compared methods. According to Wilcoxon’s signed-rank tests, the dolphin swarm extreme learning machine is superior to the other four methods on both the six regression datasets and the seven classification datasets. However, the DS-ELM is time consuming compared with the standard ELM, and it does not reduce the number of hidden nodes compared with the state-of-the-art methods. Future work will concentrate on reducing the training time and making the network more compact.