1 Introduction

A particle swarm optimization (PSO) algorithm is a kind of Swarm Intelligence (SI) technique, initially proposed in 1995 (Eberhart and Kennedy 1995; Kennedy and Eberhart 1995). Inspired by the flocking of birds, the core concept of PSO is to find the optima or sub-optima of an objective function through co-operation and information sharing among particles. Since PSO is efficient, simple and robust, it has been widely used in multi-objective optimization (Leong and Yen 2008), artificial neural network training (Grimaldi et al. 2004), image segmentation (Chander et al. 2011), vehicle routing problems (Khouadjia et al. 2012), job-shop scheduling problems (Ge et al. 2008), optimal power flow (Kang et al. 2012a, b) and other fields. However, as with other evolutionary algorithms, balancing exploration and exploitation remains a key unresolved problem for PSO. Early in the search, all particles move quickly and gather around the best positions found so far. At the late stage, the whole swarm becomes dense near a few points, and particles cannot escape from these local extreme areas. If this shortcoming can be overcome, PSO will be more powerful and thus applicable to more problems.

Most researchers have focused on the communication topology of a swarm or the behavior of individuals separately, and have obtained some impressive results (Banks et al. 2007). These results show that a swarm benefits little from a static single-population structure. On one hand, a dynamic topology is a good choice: a simple and efficient dynamic neighborhood topology improves information sharing and diversity in a swarm. On the other hand, the behavior of individuals determines their search ability. Researchers have proposed effective adaptive strategies, such as life time (Lanzarini et al. 2006), multiple-stage learning (Brits et al. 2002) and quantum behavior (Sun et al. 2011; Li et al. 2012), to improve PSO. However, the population topology is closely associated with individual behavior, since the intelligence of individuals comes from learning from their environment. Hence, the swarm topology and individual behavior should be considered together.

This work proposes a new approach based on topological structure and self-adaptive control of individuals, called adaptive PSO based on clustering (APSO-C). The proposed algorithm uses a clustering method to divide the swarm into several subpopulations dynamically. These subpopulations share information from the solution space through a ring topology. Compared with the gbest and lbest topologies (Kennedy and Eberhart 1995; Kennedy 1997), this shortens the information transmission distance and avoids excessive concentration of particles. A method is then used to evaluate the performance of each cluster; based on the evaluation results, each individual adjusts its inertia weight adaptively to balance local and global search. These operations are carried out automatically after the individuals' positions are updated, so each subpopulation can obtain different members at each generation, which has a positive effect on the diversity and activity of the swarm. Clustering, adaptation and the new learning mechanism together provide a quite different way to improve PSO. Tests are performed to verify that APSO-C reduces the occurrence of premature convergence and enhances the search ability of the swarm.

The rest of the paper is organized as follows. Section 2 reviews PSO and its developments. Section 3 presents APSO-C, including the dynamic population decomposition and adaptation strategies. Section 4 experimentally compares APSO-C with various existing PSO algorithms from the literature on benchmark functions. Finally, conclusions are drawn in Sect. 5.

2 Background and related work

2.1 Particle swarm optimization

In the classic PSO, a group of particles is placed in a multi-dimensional space, where they fly freely in search of the best position. Each individual can be treated as a point in the search space, characterized by its position and velocity. Suppose that the search space is D-dimensional; then the position of the i-th particle can be represented by a D-dimensional vector \(\mathbf{x}_i=(x_{i1},x_{i2},\ldots ,x_{iD})\), and the velocity by another vector \(\mathbf{v}_{i}=(v_{i1},v_{i2},\ldots ,v_{iD})\). The best position previously visited by the i-th particle is denoted as \(\mathbf{pbest}_i=(p_{i1},p_{i2},\ldots ,p_{iD})\). Define \(\mathbf{gbest}=(g_{1},g_{2},\ldots ,g_{D})\) as the best global position that the swarm has found so far. The velocity and position update equations are given as follows:

$$\begin{aligned} v_{id}^{t+1}&= v^t_{id}+c_1r_1\left( p_{id}^t-x^t_{id}\right) +c_2r_2\left( g_{d}^t-x^t_{id}\right) \end{aligned}$$
(1)
$$\begin{aligned} x^{t+1}_{id}&= x^{t}_{id}+v^{t+1}_{id} \end{aligned}$$
(2)

where \(\forall \,i\in \mathbb {N}_N, \mathbb {N}_N=\{1,2,\ldots ,N\}\), and N is the size of the swarm; \(d\) is the index of the co-ordinate being updated; \(c_1\) and \(c_2\) are positive constants, called acceleration constants; \(r_1\) and \(r_2\) are random numbers drawn from the uniform distribution over [0, 1].
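
For reference, the canonical update of Eqs. 1 and 2 can be sketched in Python as follows (a minimal illustration only; we assume \(r_1\) and \(r_2\) are drawn independently per dimension, a common convention that the text above does not fix):

```python
import numpy as np

def pso_step(x, v, pbest, gbest, c1=1.5, c2=1.5):
    """One canonical PSO iteration, per Eqs. (1)-(2).

    x, v, pbest: (N, D) arrays; gbest: (D,) array.
    c1 = c2 = 1.5 mirrors the PSO_g/PSO_l settings in Sect. 4.1.
    """
    r1 = np.random.rand(*x.shape)  # fresh U[0, 1] draws per dimension
    r2 = np.random.rand(*x.shape)
    v = v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)  # Eq. (1)
    x = x + v                                              # Eq. (2)
    return x, v
```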

2.2 Existing particle swarm optimization

Since PSO was first proposed, many improved versions have been presented in the literature. We classify them along two lines, population topology and individual behavior control, as briefly reviewed next.

2.2.1 Population structure and topology

A population structure is the foundation of a swarm, and different structures may drive the swarm to behave differently. In early research, a single-population structure (Kennedy 1999; Eberhart and Kennedy 1995) was common, implying that all particles share the same microenvironment and the same control. In this model particles interact with each other effectively and the group converges rapidly. Under it, Kennedy (1997) and Shi and Eberhart (1998a) analyze the effect of parameters on PSO performance. Based on the evaluation of swarm distribution and particle fitness, an evaluation method for the swarm state is proposed in Zhan et al. (2009). It defines four population evolutionary states: exploration, exploitation, convergence and jumping; automatic control of parameters and an elitist learning strategy are performed at runtime to improve search efficiency and convergence speed. Through the analysis of particle behavior during the search, dynamic population size strategies have also been used. In a variable population size PSO (Lanzarini et al. 2006), life time and neighborhood are considered to allow the size of the population to vary. A variant (Montes de Oca et al. 2011a, b) adopts incremental social learning to realize a time-varying population size, with particles introduced into the swarm gradually. Based on a variable neighborhood search strategy and a path relinking strategy, Marinakis and Marinaki (2013) propose a PSO model with an expanding neighborhood: it starts from a small neighborhood and increases the neighborhood size at each iteration. Chen and Zhao (2009) propose an adaptive variable population size in which, like a living body, each particle has a life time, and individuals are periodically added or removed in the form of a ladder function.

Inspired by the structure of biological societies, researchers have proposed varying populations and multiple subpopulations to improve the performance of PSO. Starting from the notion that in swarm intelligence a group is an environment for individuals to learn in, Jie et al. (2008) propose a knowledge-based co-operative PSO. In it, a learning mechanism based on a billboard is established, and three different states, growing, near mature and mature, are adopted to characterize sub-populations. Communities commonly exist in a society: Kennedy (2000), Li (2004), and Passaro and Starita (2006) use a clustering operation to divide the society into several subpopulations, and Madeiro et al. (2009) apply an adaptive density-based clustering algorithm to create neighborhoods for particles. Competition is the main evolutionary force by which biological populations avoid extinction. Brits et al. (2002) combine niching technology and PSO to develop a niching particle swarm optimizer: in each niche, a particle searches its local area based on its personal experience only, and subpopulations are merged when they intersect. This model simulates the evolutionary process of a biocommunity. Similarly, a PSO with a simple yet effective niching algorithm is developed to improve performance (Li 2010). GAPSO (Liu et al. 2012) is based on a greedy algorithm and a niche structure; a ring topology is employed in each niche instead of a fully connected topology. Recently, a Gaussian classifier-based evolutionary strategy (Dong and Zhou 2014) has been proposed to solve multimodal optimization problems under a multi-subpopulation structure in evolutionary algorithms (EAs). It employs Gaussian models to capture the landscape shapes of objective functions and a zoom factor to accelerate the search, and it can also be applied in PSOs to guide subpopulations.

A topology describes the pattern of information interaction among individuals: particles co-operate or compete with others through information exchange to evolve the whole swarm. Several studies (Passaro and Starita 2006; Mendes et al. 2003; Kennedy and Eberhart 2002) have shown that the topology strongly affects PSO performance. The gbest and lbest topologies are the classic examples of a static topology. In gbest, the population is fully connected and the best individual in the entire population influences the target particle, while in lbest each particle is connected only to several other members in its neighborhood. Kennedy (1999) adopts small-world techniques to analyze several topology structures, including circles, wheels, stars and random graphs. From combined tests of topologies and benchmark functions, the conclusion is drawn that the best choice of topology depends on the characteristics of the problem. Later, Kennedy and Eberhart (2002) study a group of classic topologies: gbest, lbest, pyramid, von Neumann and four clusters. The results show that lbest keeps a high success rate but converges slowly. These studies focus on static topologies. However, experiments show that a dynamic, unstable topology favors information interaction among individuals, and PSO versions with dynamic topology adjustment have thus been reported. Based on a dynamic neighborhood, a new learning model is proposed in Nasir et al. (2012); it uses a learning strategy in which the historical best information of all other particles is used to update a particle's velocity. To generate the population topology adaptively, a scale-free network model is used as a self-organizing construction mechanism in Zhang and Yi (2011). The effects of synchronicity in communications and of neighborhood size have also been studied (Rada-Vilela et al. 2013), suggesting that a random asynchronous model may be better than purely synchronous or asynchronous ones.

2.2.2 Individual behavior control

Similar to other real or virtual swarms, particles in PSO improve their knowledge and ability by learning from the environment. But a high convergence speed makes the diversity decrease quickly: when particles crowd into the same area, they express similar characteristics and show little difference, so they can no longer absorb new knowledge from their neighborhood. Their behaviors become restricted and evolution stagnates. The long-standing question is how to control and adjust their behavior to keep the swarm evolving.

The PSO core is the particle update formula, which contains three components (Kennedy 1997). The first is the effect of the present velocity; the second is the cognition model, learning from the particle's own memory; and the last is the social model, capturing the influence of the swarm or neighborhood. Without the first component, the swarm would contract towards the global best individual within the initial area, like a local search. With the first component only, the particles would keep flying until the boundary is reached, resembling a global search. Based on the original PSO, Shi and Eberhart (1998a) introduce a modified PSO with inertia weight \(\omega \) to balance exploration and exploitation:

$$\begin{aligned} v_{id}^{t+1}=\omega v^t_{id}+c_1r_1\left( p_{id}^t-x^t_{id}\right) +c_2r_2\left( g_{d}^t-x^t_{id}\right) \end{aligned}$$
(3)

The experiments show that a larger value of \(\omega \) drives the swarm to explore new areas, dispersing the swarm, while a smaller \(\omega \) makes the swarm exploit local areas, which risks premature convergence. To adjust \(\omega \) so as to balance global and local search, Shi and Eberhart (1998b) provide a dynamic strategy that reduces \(\omega \) linearly from 0.9 to 0.4 over the course of the run. But the PSO search is a complex nonlinear process, and this linear method fails to track it well. Shi and Eberhart (2001) therefore propose a fuzzy method that, based on the present best solution and \(\omega \), computes the change rate of \(\omega \) with a defined fuzzy system to determine \(\omega \) in the next iteration.
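
As a side note, the linear schedule of Shi and Eberhart (1998b) is essentially a one-liner (a sketch; the function name is ours):

```python
def linear_inertia(t, t_max, w_start=0.9, w_end=0.4):
    """Inertia weight decreased linearly from w_start to w_end
    over t_max iterations, as in Shi and Eberhart (1998b)."""
    return w_start - (w_start - w_end) * t / t_max
```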

Constriction is also a kind of method for controlling the behavior of particles. Clerc and Kennedy (2002) adopt a constriction factor \(\chi \) to modify the velocity as:

$$\begin{aligned}&v_{id}^{t+1}=\chi \left[ v^t_{id}+c_1r_1\left( p_{id}^t-x^t_{id}\right) +c_2r_2\left( g_{d}^t-x^t_{id}\right) \right] \end{aligned}$$
(4)
$$\begin{aligned}&\chi =\frac{2}{|2-\varphi -\sqrt{\varphi ^2-4\varphi }|} \end{aligned}$$
(5)

For the choice between \(\omega \) and \(\chi \), Eberhart and Shi (2000) compare the two models and conclude that they are algebraically equivalent.
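
As a worked example, the widely used value \(\chi \approx 0.729\) (adopted for FIPS in Sect. 4.1) follows from Eq. 5 with \(\varphi =4.1\):

$$\begin{aligned} \chi =\frac{2}{|2-4.1-\sqrt{4.1^2-4\times 4.1}|}=\frac{2}{2.1+\sqrt{0.41}}\approx 0.729 \end{aligned}$$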

In traditional PSO, each particle's behavior is affected by its own previous success, the best previous success among its neighbors, its current position and its previous velocity; the information from the remaining neighbors is not considered. Building on Clerc and Kennedy (2002), researchers (Mendes et al. 2004; Kennedy and Mendes 2006) propose the fully informed particle swarm (FIPS). In this model, a particle is affected by all its neighbors, unlike in the original model, where each particle is affected only by its own experience and the best success among its neighbors. The velocity update equation is as follows:

$$\begin{aligned} v_{id}^{t+1}=\chi \left[ v_{id}^t+\sum _{n=1}^{K_i}\frac{U\left( 0,\varphi \right) \left( p_{nd}^t-x_{id}^t \right) }{K_i} \right] \end{aligned}$$
(6)
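
In code, the fully informed update for a single particle might look as follows (a sketch; we assume one independent \(U(0,\varphi )\) draw per neighbor, with \(K_i\) the neighborhood size and \(\mathbf{p}_n\) the best position found by neighbor n):

```python
import numpy as np

def fips_velocity(v_i, x_i, p_nbrs, chi=0.729, phi=4.1):
    """Fully informed velocity update, a sketch of Eq. (6).

    v_i, x_i: (D,) arrays for one particle.
    p_nbrs: (K_i, D) array of the neighbors' best positions.
    """
    K_i = p_nbrs.shape[0]
    u = np.random.uniform(0.0, phi, size=(K_i, 1))  # one draw per neighbor
    social = (u * (p_nbrs - x_i)).sum(axis=0) / K_i
    return chi * (v_i + social)
```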

To overcome premature convergence and stagnation and to enrich PSO with better search capacity in a multidimensional search space, a method combining a constriction factor and an inertia weight is proposed (Mandal et al. 2011).

Presently, an active area of SI research is to combine desirable properties from different approaches to mitigate their weaknesses. Kang et al. (2012a) incorporate PSO into a group search optimizer to provide an improved algorithm, in which the optimal-information leading mechanism and computation mode of PSO are kept and historical individual optimal information is introduced. In a chaos-embedded PSO (Alatas et al. 2009), sequences generated from a chaotic system are used to make the random choices in PSO, improving global convergence and preventing premature termination. Similarly, ant colony optimization (ACO) is an SI algorithm that simulates the co-operative food-searching behavior of ants. Holden and Freitas (2005) provide a hybrid of PSO and ACO and apply it successfully to hierarchical classification problems. A genetic algorithm (GA) is an evolutionary algorithm that uses mutation, crossover and selection operations; it performs well but converges slowly. A modified Broyden-Fletcher-Goldfarb-Shanno method has been integrated into PSO to improve a particle's local search behavior (Li et al. 2011). Cultural algorithms have frequently been used to vary the parameters of an individual solution; Daneshyari and Yen (2011) introduce a cultural framework to adapt the personalized flight parameters of mutated particles. Considerable further work on hybridizing PSO with other algorithms is summarized in Banks et al. (2007).

3 Adaptive PSO based on clustering

By considering both swarm communication topology and individual behavior control, we propose an adaptive PSO method with dynamic division of population, based on the model of PSO with inertia weight. It focuses on two major aspects:

1. How can a simple dynamic subpopulation structure be formed? A community structure is a remarkable characteristic of social biology: in a human society, active individuals join different organizations to gain different knowledge, whether absorbed passively or participating actively. This movement not only exposes the individual to new areas of development but also brings different culture and information to the organizations. In this work, we incorporate clustering into the swarm evolutionary process: the swarm is split according to the properties of individuals to form multiple subpopulations. This dynamic multi-subpopulation structure improves the grouping behavior of the swarm and reduces the number of premature-convergence incidents. Based on the dynamic structure, a new velocity updating model is provided that adjusts the way an individual learns, so that the effects of the best individuals in the dynamic subpopulations are taken into account.

2. How can more intelligence be brought to individuals in dynamic subpopulations? Within a dynamic subpopulation structure, heterogeneous sub-swarms with different sizes and abilities will form. The particles in different clusters have different needs, e.g., some need to explore new areas while others should exploit local areas. The issue is how to improve the search ability of the swarm on top of such heterogeneous multi-swarms. We propose a method that adaptively adjusts parameters based on a cluster evaluation that assesses the state of each subpopulation after clustering, thereby adjusting the behavior of particles.

We call the proposed algorithm APSO-C. Its population decomposition and evaluation strategies are described next.

3.1 Dynamic population decomposition based on a clustering strategy (DPDC)

Similar to a biological society, the action of a single particle is mainly affected by the best individuals in its neighborhood. Prior studies (Kennedy and Eberhart 2002; Mendes et al. 2003) point out that the swarm topology substantially affects an individual's behavior. Following the maxim "birds of a feather flock together", we adopt a clustering method to dynamically divide the swarm into several subpopulations of individuals with similar properties. Different from methods that exchange individuals among subpopulations at regular intervals or keep subpopulations at a fixed size, this recombination of individuals operates continuously within the optimization process, executed after each swarm move. Each division produces several heterogeneous subpopulations that differ in the number and characteristics of their individuals. In this model, the migration of individuals among subpopulations also enhances information exchange.

The K-means algorithm is an effective clustering method whose core is to minimize the sum of distances between each particle and its cluster center. Some researchers (Kennedy 2000; Passaro and Starita 2006) have applied it to analyze the swarm; here we modify the standard clustering operation and propose a novel updating equation for the particles. We select K members as the centers (\(\mathbf{p}_1, \mathbf{p}_2,\ldots ,\mathbf{p}_K\)) and define the distance between two particles as \(\mathrm{dis}(\mathbf{x}_i, \mathbf{x}_j)=\sum _{d=1}^{D}{(x_{id}-x_{jd})^2}\). The objective can then be expressed as \(\mathrm{Min}\, F=\sum _{k=1}^{K}\sum _{\mathbf{x}_{i}\in C_{k}}\mathrm{dis}(\mathbf{x}_i, \mathbf{p}_k)\), where \(C_{k}\) is the k-th cluster. The operation can be described as follows:

Step 1: Select K particles (p \(_1\), p \(_2\),..., p \(_K\)) randomly as the centers of clusters.

Step 2: Calculate the distance between each remaining individual and the centers. According to the minimum-distance rule, assign each individual to its nearest cluster. Record the value of \(F=\sum _{k=1}^{K}\sum _{\mathbf{x}_{i}\in C_{k}}\mathrm{dis}(\mathbf{x}_i, \mathbf{p}_k)\).

Step 3: For each cluster, choose the average value of the particle positions as its new center.

Step 4: If the termination condition is met (e.g., the maximum number of iterations is reached), output the result; otherwise return to Step 2.
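
A minimal sketch of Steps 1-4 in Python (array shapes and names are our convention; the small cap on rounds anticipates the progressive clustering mode described next, under which the clustering need not converge at every generation):

```python
import numpy as np

def dpdc_cluster(X, K=5, max_rounds=4, rng=np.random):
    """Cluster the swarm positions X (an (N, D) array) into K groups.

    Returns labels (N,) assigning each particle to a cluster, and the
    (K, D) cluster centers.
    """
    N = X.shape[0]
    centers = X[rng.choice(N, K, replace=False)].copy()        # Step 1
    labels = np.zeros(N, dtype=int)
    for _ in range(max_rounds):                                # Step 4
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)                              # Step 2
        for k in range(K):                                     # Step 3
            members = X[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return labels, centers
```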

A swarm actually exhibits an aggregative trend: the population tends to cover the area between \(\mathbf{p}_i^t\) and \(\mathbf{g}^t\). Considering this self-aggregative behavior, we modify the clustering objective. The operation is designed as a progressive clustering mode embedded in the swarm search process, which does not need to reach a steady clustering state at each step. We call it dynamic population decomposition based on a clustering strategy (DPDC).

In this process, once the clusters are formed, the best individual of each cluster is selected as its cluster-best solution. To relate the clusters, a ring structure is used to share and transmit information among them: each cluster is connected to the two neighboring clusters in the cluster array, and within each cluster the particles communicate through a gbest-like structure. Denote cluster j's best solution found so far by \(\mathbf{Cb}_j\), and the best one in particle i's neighborhood by \(\mathbf{nbest}_i=({nb}_{i1}, nb_{i2},\ldots , nb_{iD})\), where particle i belongs to cluster j. We then compare \(\mathbf{Cb}_{j-1}\), \(\mathbf{Cb}_j\) and \(\mathbf{Cb}_{j+1}\) and choose the best of the three as particle i's \(\mathbf{nbest}_i\). Both \(\mathbf{g}\) and \(\mathbf{nbest}_i\) have an important effect on how a particle learns from its environment; hence we rebuild its velocity updating mode.

$$\begin{aligned} v_{id}^{t+1}&= \omega _i^t v^t_{id}+c_1r_1\left( p_{id}^t-x^t_{id}\right) \nonumber \\&\quad +\frac{1}{2} c_2r_2\left[ \left( g_{d}^t-x^t_{id}\right) +\left( {nb}_{id}^t-x_{id}^t\right) \right] \end{aligned}$$
(7)
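
The neighborhood-best selection over the cluster ring and the update of Eq. 7 might be implemented as follows (a sketch; we assume the ring indices wrap around at the ends of the cluster array, which the text does not state explicitly):

```python
import numpy as np

def nbest_per_cluster(cb, cb_fit):
    """For each cluster j, pick the best of Cb_{j-1}, Cb_j, Cb_{j+1}.

    cb: (K, D) cluster-best positions; cb_fit: (K,) their fitness
    values (minimization). Returns a (K, D) array of nbest positions.
    """
    K = cb.shape[0]
    nbest = np.empty_like(cb)
    for j in range(K):
        ring = [(j - 1) % K, j, (j + 1) % K]   # assumed wrap-around
        nbest[j] = cb[min(ring, key=lambda k: cb_fit[k])]
    return nbest

def velocity_eq7(v, x, pbest, gbest, nbest, w, c1=2.0, c2=2.0):
    """Eq. (7): the social term averages the pulls of gbest and nbest.

    w: (N,) per-particle inertia weights; nbest: (N, D), each row the
    nbest of the particle's own cluster.
    """
    r1, r2 = np.random.rand(*x.shape), np.random.rand(*x.shape)
    return (w[:, None] * v + c1 * r1 * (pbest - x)
            + 0.5 * c2 * r2 * ((gbest - x) + (nbest - x)))
```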

After this process, a new swarm structure is dynamically built under a ring topology among clusters, and an individual updates its position based on the information from its previous best, its neighborhood best and the global best found by the swarm. This strategy thus differs from existing ones (e.g., Kennedy 2000; Li 2010), which are mainly based on the classic topologies and update mode. Next, a novel adaptation strategy is presented.

3.2 Adaptation based on cluster evaluation strategy (ACE)

Each cluster formed in the above way contains a different number of particles. Owing to the particles' different abilities, clusters also behave differently at the search level; for example, the average distance from the particles to the best particle may differ across clusters. We can therefore classify clusters by these differences and use them to improve the search ability of particles. To describe ACE clearly, we first give the following definitions.

Definition 1

The number of particles in cluster j is denoted by \(|C_j|\). Define \({f_{ij}}\) as particle i's fitness in cluster j; the average fitness of cluster j is \(a_j=\sum _{i=1}^{|C_j|}{f_{ij}}/|C_j|\). Define \(f_i\) as particle i's fitness; the average fitness of the swarm is \(A=\sum _{i=1}^{N}{f_i}/N\), where \(N\) is the population size.

Definition 2

The maximum and minimum inertia weights of particles are denoted \(\omega _\mathrm{max}\) and \(\omega _\mathrm{min}\). At generation t, the inertia weight of particle i is \(\omega _i^{t}\); it is adjusted to \(\omega _i^{t+1}\) based on \(\omega _\mathrm{max}\), \(\omega _\mathrm{min}\) and \(\omega _i^{t}\) according to the particle's state.

Since each individual occupies a different position during the search, it shows different characteristics and ability, and fitness can be used to express the difference. Particles at different stages need to strengthen different abilities: particles in an inferior position need to escape from a poor area and speed up to explore other areas, while particles in an advantageous position need to exploit their area deeply to find a better solution. APSO-C improves the ability of particles based on the clustering operation: particles in each cluster have different strengths and weaknesses, so the average fitness differs across clusters. This difference can be used to evaluate the clusters and to adjust \(\omega \) for each individual.

Assume that the objective is to minimize the fitness. From the comparison, two different states are listed as follows:

Case 1: \(a_j\ge A\): the subpopulation is in an inferior position, far from the best one. Hence the search steps should be made larger to improve global search ability and avoid being trapped in local areas; we do so by increasing the inertia weight \(\omega \); and

Case 2: \(a_j<A\): the subpopulation is located in a better area. Its particles need to reduce their step length to improve local search ability and avoid missing nearby optimal solutions; thus we decrease \(\omega \).

According to the above analysis, each individual realizes the self-adaptive adjustment of \(\omega _i^t\) by Eqs. 8 and 9, respectively:

$$\begin{aligned} \omega _i^{t+1}&= \omega _i^t+\max \left\{ \left| \omega _\mathrm{max}-\omega _i^t\right| ,\left| \omega _\mathrm{min}-\omega _i^t\right| \right\} ,\nonumber \\&\quad a_j\ge A\end{aligned}$$
(8)
$$\begin{aligned} \omega _i^{t+1}&= \omega _i^t-\min \left\{ \left| \omega _\mathrm{max}-\omega _i^t\right| ,\left| \omega _\mathrm{min}-\omega _i^t\right| \right\} ,\nonumber \\&\quad a_j< A \end{aligned}$$
(9)

where particle i belongs to cluster j.
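
In code, the adaptation reads as follows (a sketch that applies Eqs. 8 and 9 literally; note that Eq. 8 as written can push \(\omega \) above \(\omega _\mathrm{max}\), and whether \(\omega \) is subsequently clipped to \([\omega _\mathrm{min}, \omega _\mathrm{max}]\) is not specified in the text):

```python
import numpy as np

def ace_update_w(w, labels, fitness, w_max=0.95, w_min=0.4):
    """Per-particle inertia adaptation, Eqs. (8)-(9).

    w: (N,) current inertia weights; labels: (N,) cluster indices;
    fitness: (N,) current fitness values (minimization assumed).
    """
    A = fitness.mean()                       # swarm average (Def. 1)
    w_new = w.copy()
    for j in np.unique(labels):
        m = labels == j                      # members of cluster j
        a_j = fitness[m].mean()              # cluster average
        d_max = np.abs(w_max - w[m])
        d_min = np.abs(w_min - w[m])
        if a_j >= A:   # inferior cluster: lengthen steps, Eq. (8)
            w_new[m] = w[m] + np.maximum(d_max, d_min)
        else:          # superior cluster: shorten steps, Eq. (9)
            w_new[m] = w[m] - np.minimum(d_max, d_min)
    return w_new
```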

3.3 Step of APSO-C

Based on DPDC and ACE, the procedure of APSO-C has two main parts. The first is swarm clustering, in which a K-means method clusters the swarm into several subpopulations and information interaction among the subpopulations is performed over a ring-topology structure. The second is parameter adjustment, which aims to adjust the search behavior of particles according to the states of the subpopulations.

APSO-C adjusts the structure of a swarm and the behavior of a particle in every generation, as detailed in the following steps.

Step 1: Randomly set x \(_i\) and v \(_i\), \(i \in \mathbb {N}_N\) (where N is the size of the swarm). Calculate \(f_i\) and initialize \(\omega _\mathrm{max}\), \(\omega _\mathrm{min}\) and \(\omega _i\) for each individual;

Step 2: Randomly select K particles as the cluster centers. Apply the clustering method to divide the swarm and obtain subpopulations C \(_j\), \(\forall \,j \in \mathbb {N}_K \);

Step 3: Calculate the fitness of each particle and update the best position pbest \(_i\) found so far by each particle i. For each cluster j, set the best of its members' pbest \(_i\), \(i \in \mathbb {N}_{|C_j|}\), as the cluster's current best position Cb \(_j\);

Step 4: Choose the best position in the neighborhoods as nbest \(_i\) for each particle under the ring topology;

Step 5: Select the best position of the swarm found so far as gbest;

Step 6: If the termination condition is met, output the result; otherwise, use Eqs. 8 and 9 to adjust \(\omega _i\), update v \(_i\) and x \(_i\) for each particle by Eqs. 7 and 2, and go to Step 2.
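
The steps above can be wired together from the sketches in Sects. 2.1, 3.1 and 3.2 (a high-level sketch under our assumptions: the search bounds, the velocity initialization and the non-empty-cluster guard are not part of the paper's pseudocode):

```python
import numpy as np

def apso_c(f, D, N=30, K=5, T=1000, w0=0.65, bounds=(-100.0, 100.0)):
    """High-level APSO-C loop (Steps 1-6), minimizing f over [lo, hi]^D."""
    lo, hi = bounds
    X = np.random.uniform(lo, hi, (N, D))                     # Step 1
    V = np.random.uniform(-(hi - lo), hi - lo, (N, D)) * 0.1  # assumed init
    w = np.full(N, w0)
    pbest = X.copy()
    pfit = np.apply_along_axis(f, 1, X)
    for t in range(T):                                        # Step 6 loop
        labels, _ = dpdc_cluster(X, K)                        # Step 2
        fit = np.apply_along_axis(f, 1, X)                    # Step 3
        better = fit < pfit
        pbest[better], pfit[better] = X[better], fit[better]
        # cluster bests; assumes no cluster is empty (guard omitted)
        members = [np.flatnonzero(labels == j) for j in range(K)]
        cb = np.array([pbest[m[pfit[m].argmin()]] for m in members])
        cb_fit = np.array([pfit[m].min() for m in members])
        nb = nbest_per_cluster(cb, cb_fit)[labels]            # Step 4
        g = pbest[pfit.argmin()]                              # Step 5
        w = ace_update_w(w, labels, fit)                      # Eqs. (8)-(9)
        V = velocity_eq7(V, X, pbest, g, nb, w)               # Eq. (7)
        X = X + V                                             # Eq. (2)
    return pbest[pfit.argmin()], pfit.min()
```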

4 Experiments and discussion

4.1 Experiment and benchmark functions

To evaluate APSO-C, this work designs three experiments: a sensitivity analysis of the initial \(\omega \) in APSO-C, a comparison under the same initial values, and a comparison under the same iteration count. We choose fourteen benchmark functions from Liang et al. (2006), Hsieh et al. (2009) and Suganthan et al. (2005). Note that \(f_1\) and \(f_2\) are unimodal, while \(f_3\) (when the dimension is greater than three) and \(f_4\) are multimodal; these are the four basic functions. Because some functions are separable and can be solved by using D one-dimensional searches (Hsieh et al. 2009), where D is the dimension of the problem, rotation, shifting and hybridization are introduced to overcome these known disadvantages; the resulting functions are more complex than the original ones. The properties of these functions are listed in Table 1.

(1)

    Sphere function:

    $$\begin{aligned} {f}_{1}=\sum _{i=1}^{D}{x_i^2} \end{aligned}$$
(2)

    Quadric function:

    $$\begin{aligned} {f}_{2}=\sum _{i=1}^{D}\left( \sum _{j=1}^{i}{x_j}\right) ^2 \end{aligned}$$
(3)

    Rosenbrock's function:

    $$\begin{aligned} {f}_{3}=\sum _{i=1}^{D-1}\left( 100\left( x_{i}^2-x_{i+1}\right) ^2+\left( x_i-1\right) ^2 \right) \end{aligned}$$
(4)

    Rastrigin function:

    $$\begin{aligned} {f}_{4}=\sum _{i=1}^D(x_i^2-10\cos (2\pi x_i)+10) \end{aligned}$$
(5)

    Shifted sphere function:

    $$\begin{aligned}&{f}_{5}=\sum _{i=1}^{D}{z}_{i}^{2}+f_\mathrm{bias} \\&{\mathrm{where}~ {z}= {x}-{o}, {o}=[o_1,o_2,\ldots ,o_D]} \end{aligned}$$
(6)

    Shifted Schwefel's problem 1.2:

    $$\begin{aligned}&{f}_{6}=\sum _{i=1}^{D}\left( \sum _{j=1}^{i}{z_{j}}\right) ^2+f_\mathrm{bias}\\&{\mathrm{where}~ {z}= {x}-{o}, {o}=[o_1,o_2,\ldots ,o_D]} \end{aligned}$$
(7)

    Shifted rotated high conditioned elliptic function:

    $$\begin{aligned}&{f}_{7}=\sum _{i=1}^{D}(10^6)^{\frac{i-1}{D-1}}{z}^2_i+f_\mathrm{bias}\\&{\mathrm{where}~ {z}= ({x}-{o})*\mathbf {M}, {o}=[o_1,o_2,\ldots ,o_D]} \end{aligned}$$
(8)

    Shifted Schwefel's problem 1.2 with noise in fitness:

    $$\begin{aligned}&f_{8}=\left( \sum _{i=1}^{D}\left( \sum _{j=1}^{i}{z_j}\right) ^2\right) *(1+0.4|N(0,1)|)+f_\mathrm{bias}\\&\quad {\mathrm{where}~ {z}= {x}-{o}, {o}=[o_1,o_2,\ldots ,o_D]} \end{aligned}$$
(9)

    Shifted Rosenbrock's function:

    $$\begin{aligned}&{f}_{9}=\sum _{i=1}^{D-1}\left( 100(z_{i}^2-z_{i+1})^2+(z_{i}-1)^2 \right) +f_\mathrm{bias} \\&{\mathrm{where}~ {z}= {x}-{o}, {o}=[o_1,o_2,\ldots ,o_D]} \end{aligned}$$
(10)

    Shifted rotated Ackley’s with global optimum on bounds:

    $$\begin{aligned}&{f}_{10}=-20\,\mathrm{exp}\left( -0.2\sqrt{\frac{1}{D}\sum _{i=1}^{D}{z}_{i}^{2}} \right) \\&\quad -\mathrm{exp}\left( \frac{1}{D}\sum _{i=1}^{D}\mathrm{cos}(2\pi {z}_{i}) \right) +20+e+f_\mathrm{bias}\\&{\mathrm{where}~ {z}= ({x}-{o})*\mathbf {M}, {o}=[o_1,o_2,\ldots ,o_D]} \end{aligned}$$
    Table 1 Global optimum, search range, initialization range of the test functions
(11)

    Shifted Rastrigin’s function:

    $$\begin{aligned}&{f}_{11}=\sum _{i=1}^{D}{\left( z_i^2-10\,\mathrm{cos}(2\pi z_i)+10\right) }+f_\mathrm{bias}\\&{\mathrm{where}~ {z}=({x}-{o}), {o}=[o_1,o_2,\ldots ,o_D]} \end{aligned}$$
(12)

    Shifted rotated Rastrigin’s function:

    $$\begin{aligned}&f_{12}=\sum _{i=1}^D\left( z_i^2-10\,\mathrm{cos}(2\pi z_i)+10\right) +f_\mathrm{bias}\\&{\mathrm{where}~ {z}= ({x}-{o})*\mathbf {M}, {o}=[o_1,o_2,\ldots ,o_D]} \end{aligned}$$
(13)

    Expanded extended Griewank’s + Rosenbrock’s:

    $$\begin{aligned}&{f}_{13}=f_s(f_{3}(z_1,z_2))+f_s(f_{3}(z_2,z_3))+\cdots \\&\quad +f_s(f_{3}(z_{D-1},z_{D}))+f_\mathrm{bias}\\&{\mathrm{where}~ f_s= \sum _{i=1}^D{\frac{{x_i^2}}{4000}}-\prod _{i=1}^D{\mathrm{cos}\left( \frac{x_i}{\sqrt{i}}\right) }+1}\\&{z}={x}-{o}+1, {o}=[o_1,o_2,\ldots ,o_D] \end{aligned}$$
(14)

    Expanded rotated extended Scaffer's F6:

    $$\begin{aligned}&{f}_{14}= F(z_1,z_2)+F(z_2,z_3)\\&\qquad \quad +\cdots + F(z_{D-1},z_D)+F(z_D,z_1)\\&{\mathrm{where}~ {z}=({x}-{o})*\mathbf {M}, {o}=[o_1,o_2,\ldots ,o_D]} \\&F(x,y)=0.5+\frac{\mathrm{sin}^2\left( \sqrt{x^2+y^2}\right) -0.5}{\left( 1+0.001(x^2+y^2)\right) ^2} \end{aligned}$$
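
For illustration, the basic functions and the shift construction can be expressed compactly (a sketch: the shift vector o and the bias value below are placeholders, not the CEC2005 data, which are defined per function in Suganthan et al. (2005)):

```python
import numpy as np

def sphere(x):                       # f1
    return float(np.sum(x ** 2))

def rastrigin(x):                    # f4
    return float(np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x) + 10))

def shifted(base, o, f_bias=0.0):
    """Build a shifted variant: the optimum moves to o, value f_bias there."""
    o = np.asarray(o)
    return lambda x: base(np.asarray(x) - o) + f_bias

# e.g. a 10-D shifted sphere in the spirit of f5 (o chosen arbitrarily)
f5_like = shifted(sphere, o=np.full(10, 2.5), f_bias=-450.0)
```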

Seven peer algorithms, FIPS (Mendes et al. 2004), PSO\(\_g\) and PSO\(\_l\) (Kennedy 1999), biogeography-based optimization (BBO, Simon 2008), the firefly algorithm (FA, Yang 2010), the artificial bee colony (ABC, Karaboga and Basturk 2007) and comprehensive learning PSO (CLPSO, Liang et al. 2006), are compared with APSO-C. In CLPSO, an individual updates its position based on a probability in each dimension; a linear inertia weight is used with \(\omega _\mathrm{max}=0.95\), \(\omega _\mathrm{min}=0.4\), \(c=1.5\), and refreshing gap \(m=7\). FIPS uses all neighbors to influence the velocity of particles; the constriction coefficient is \(\chi =0.729\) and the acceleration coefficient is \(\varphi =4.1\). PSO\(\_g\) and PSO\(\_l\) are the canonical PSO with the gbest and the lbest (ring) topology, respectively; both use linear control of the inertia weight with \(\omega _\mathrm{max}=0.95\), \(\omega _\mathrm{min}=0.4\), and \(c_1=c_2=1.5\). In APSO-C, \(c_1=c_2=2\), \(\omega _\mathrm{max}=0.95\), \(\omega _\mathrm{min}=0.4\), the initial \(\omega =0.65\) (except in Experiment 1) and \(K=5\). The parameters in BBO, FA and ABC are set as suggested in their sources. The population size is 30 in every algorithm. All algorithms are executed on the same machine with an Intel Core Duo CPU at 2.10 GHz, 2 GB memory and Windows 7.

4.2 Experiment 1: sensitivity analysis of APSO-C

In APSO-C, \(\omega \) is a vital parameter that affects search performance; we adjust it adaptively to reduce the dependence on its initial value. Here we test initial values of \(\omega \) from 0.55 to 0.95 at an interval of 0.05. Following the analysis of \(\omega \) in swarm search (Shi and Eberhart 1998a), \(\omega _\mathrm{min}\) is 0.4 and \(\omega _\mathrm{max}\) is 0.95. The maximum iteration counts are the same as before. The results are shown in Tables 2 and 3. In both the 10-D and 30-D tests, APSO-C obtains good solutions on the unimodal problems for every initial \(\omega \) under the same number of iterations; judging by the mean and SD, the solutions are consistently accurate. For most multimodal problems the algorithm also performs well and attains the best values. Different initial values influence the results only slightly.

Table 2 Comparison of APSO-C with different initialized \(\omega \) of 10-D function tests
Table 3 Comparison of APSO-C with different initialized \( \omega \) of 30-D function tests

From this analysis we can see that the initial value of \(\omega \) has little effect on the performance of APSO-C: the adaptive operation reduces the dependence on the initial \(\omega \), which can therefore be set to a random value between \(\omega _\mathrm{min}\) and \(\omega _\mathrm{max}\).

4.3 Experiment 2: comparison under the same initial values

In this experiment, APSO-C is compared with the other PSOs and swarm intelligence optimization algorithms on their convergence performance. All algorithms are executed with the same initial values for each benchmark function. The maximum iteration count is set to 1,000 for the 10-D problems and 3,000 for the 30-D problems. Each algorithm is executed 20 times, and the curves are obtained from the average performance.

Figures 1 and 2 show the results of the selected algorithms on the 10-D and 30-D problems. From the comparison it can be seen that APSO-C offers better performance on both unimodal and multimodal functions. The unimodal benchmarks \(f_1\) and \(f_2\) change only mildly over their ranges, so most tested algorithms search slowly, but APSO-C converges fast and obtains the highest-precision solutions.

Fig. 1 Median convergence characteristics of the 10-dimensional problems

Fig. 2 Median convergence characteristics of the 30-dimensional problems

For multimodal problems, no algorithm has an obvious advantage over the others. For example, BBO is slightly better than APSO-C on \(f_{10}\) and \(f_{12}\) but inferior on \(f_{5}\) and \(f_{9}\), and their performances on the other problems are similar. The results suggest that relative performance depends on the characteristics of the problems, while APSO-C holds a minor advantage on average: however the multimodal functions differ, APSO-C obtains performance that is better than, or close to, the best found by the other algorithms, and it is more stable. This implies that the proposed algorithm has good search ability and fast convergence. With its dynamic multi-swarm structure and adaptation, APSO-C eases the conflict between convergence speed and premature convergence, and balances global and local search. For unimodal and multimodal problems alike, APSO-C obtains highly satisfactory solutions.

4.4 Experiment 3: comparison with the same iteration count

In this experiment, APSO-C and the peer algorithms are compared with random initial values for each function. All algorithms are again executed 20 times independently. As before, the maximum iteration counts for the 10-D and 30-D problems are 1,000 and 3,000, respectively. The statistics include the minimum (min), maximum (max), mean and standard deviation (SD). The results are given in Tables 4 and 5.

Table 4 Comparison results of min, mean, max and SD in 10 dimensions
Table 5 Comparison results of min, mean, max and SD in 30 dimensions

It is clear from the results that APSO-C obtains much better solutions than the other algorithms on \(f_1\), \(f_2\) and \(f_3\) under the same number of iterations. However, though APSO-C meets the accuracy requirements, it is inferior to the others on \(f_4\) in 30 dimensions. For \(f_5\)-\(f_{14}\), APSO-C performs well and obtains the global best solutions for some of them. Through dynamic clustering and adaptive adjustment, APSO-C improves local search, global search and robustness; the ability of particles is strengthened by jointly considering the best-solution attractors of the swarm and the subpopulations. Solution efficiency and accuracy are increased for most benchmark problems.

5 Conclusion

This paper presents an adaptive PSO method (APSO-C) based on swarm topology and individual behavior control. Two strategies are proposed: dynamic population decomposition based on a clustering strategy, and adaptation based on a cluster evaluation strategy. In the first, the population is divided via a K-means method into several clusters containing different numbers of particles with different abilities, and the clusters exchange information over a ring topology. In the second, through cluster-level evaluation of the search status and per-individual parameter adaptation, particles adjust their parameters to balance local and global search.

Results from the experiments show that, on the same benchmark functions, APSO-C obtains better results than several typical PSO algorithms from the literature and meets expectations in terms of convergence speed, solution accuracy and parameter sensitivity.

In the future work, we will focus on the following issues:

1. To evaluate the search ability of an individual particle, we compare its fitness with the average fitness of its dynamic cluster and of the swarm. An important open question is whether there are other effective evaluation methods;

2. The effects of the best position in the neighborhood and of that in the swarm are treated identically for an individual in this work; their different effects will be investigated in the future;

3. The clustering process is essential for obtaining subpopulations and is also a key factor in algorithm efficiency. Our tests indicate that in APSO-C the swarm reaches a good subpopulation division after executing the K-means step two to four times, but how to choose the optimal number of executions and the number of clusters remains open;

4. Because the swarm is biased towards a single objective in the proposed algorithm, multiple global optima may confuse the individuals. Thus, the "multi-optima issue" (Dong and Zhou 2014) needs more study in our future research. Furthermore, the application of the proposed method to industrial optimization problems (Fang et al. 2012; Kang et al. 2013; Wu and Zhou 2007; Xing et al. 2012; Xiong et al. 2009; Yu et al. 2009) should be pursued.