1 Introduction

The travelling salesman problem (TSP) is a classical combinatorial optimization problem. It can be described as follows: a travelling salesman must visit every city in a country exactly once and finally return to the starting point. The tour obtained by the salesman is required to be the shortest, the so-called minimum Hamiltonian circuit. Many methods have been applied to the TSP, such as the genetic algorithm [1], particle swarm optimization [2], the grey wolf optimization algorithm [3] and the ant colony algorithm [4, 5], and current research shows that the ant colony algorithm can solve the TSP particularly well.

Ant colony optimization (ACO) is a classical swarm intelligence algorithm proposed by the Italian scholar M. Dorigo, who was inspired by the foraging behaviour of ants in nature [4]. Its main idea is that ants use a pheromone updating mechanism to travel efficiently back and forth between food sources and the nest. After the ant system algorithm was proposed, it attracted much attention and prompted extensive discussion on how to improve it. In order to improve the solution accuracy of the ant system algorithm, Dorigo [5] also put forward the ant colony system algorithm (ACS). In ACS, only the globally best ant is allowed to deposit pheromone in each iteration, while the other ants diminish the pheromone level on the tours they visit according to a local pheromone updating rule. This mechanism strengthens the positive feedback effect of the optimal information and speeds up the convergence of the algorithm; however, it also makes the algorithm prone to falling into local optima. In order to overcome this problem, Stützle et al. [6] proposed the Max–Min ant system algorithm (MMAS). MMAS restricts the accumulation and evaporation of pheromone by limiting it to a fixed interval, which avoids stagnation to a certain extent and thus improves population diversity. However, the algorithm converges with difficulty when the solutions are widely dispersed.

Although these improvements to the traditional ant colony algorithm have achieved some success, the main problem, how to balance convergence speed against diversity, has not been solved well. Therefore, many scholars have tried to solve it from the perspective of their own research fields. Sangeetha et al. [7] used a pheromone enhancement mechanism to increase the pheromone concentration on better paths, which reduced useless search and saved time. Ye et al. [8] introduced a negative feedback pheromone strategy to guide the colony towards unexplored space and prevent too many ants from selecting the same area, which expands the search space and enhances the diversity of the algorithm. Ning et al. [9] put forward a pheromone update mechanism based on the current optimal path; the method increases the pheromone value on the paths that differ between the best-so-far path and the current optimal path, which speeds up the convergence of the algorithm. Tseng et al. [10] divided the ant colony into two groups, and the cooperation between the two kinds of ants improved the accuracy of the solution. Besides, parameter setting is another conundrum for ACO. To address it, Mahi et al. [2] applied the particle swarm optimization algorithm to optimize the parameters of ACO, which improved the stability of the algorithm. Olivas et al. [11] introduced a fuzzy control system to select appropriate parameters for the ACO algorithm and enhanced the accuracy of the solution. Tuani et al. [12] proposed a novel adaptive parameter adjustment mechanism to improve the adaptability of the algorithm. In addition, other improved ACO algorithms have been widely used in various fields, such as robot path planning [13], network routing [14], image detection [15], vehicle scheduling [16] and data mining [17].

However, due to the limitations of a single population, these improvements often weaken one characteristic of the colony to strengthen another; for example, expanding the search area increases the search time, and accelerating convergence diminishes solution accuracy. In order to further balance convergence speed and diversity, multi-population approaches have gradually attracted scholars' attention. Gambardella [18] proposed the concept of the multi-ant-colony algorithm for the first time, adopting two ACS colonies to solve vehicle scheduling problems with time windows. Chu et al. [19] proposed seven interaction strategies to control the communication among homogeneous colonies. Twomey et al. [20] analyzed homogeneous multi-ant colonies with different communication policies and proposed a migrant integration strategy for the interaction. Cooperation among homogeneous populations only amplifies a single feature, since they share the same characteristics, whereas heterogeneous populations can take full advantage of each other. Dong et al. [21] combined the ant colony algorithm with the genetic algorithm in a novel way to solve TSPs successfully. Zhang et al. [22] used two heterogeneous ant colonies to diversify the solutions of the algorithm by exchanging pheromone information. Wang et al. [23] applied a multi-ant algorithm with local search to the vehicle routing problem, enhancing solution accuracy by comparing and exchanging the global optimal solution of each colony.

According to the above references, multi-colony algorithms can balance the convergence speed and search ability of ACO better than single-colony algorithms. However, the interaction mechanism among sub-colonies is relatively simple, which limits the adaptability of multi-colony algorithms. To deal with these issues, cross-disciplinary methods such as game theory and information theory have been applied to improve the performance of multi-colony algorithms. Yang et al. [24] introduced game theory to control the coordination among heterogeneous populations and improved the stability of the algorithm. Li et al. [25] applied information entropy to adapt the communication among populations more precisely. In this paper, we focus on balancing convergence speed against the diversity of the algorithm. Building on the above theories, a multi-ant-colony algorithm based on a pheromone fusion mechanism governed by a cooperative game (CGMACO) is proposed to solve large-scale TSP instances. The main contributions and innovations of this research are as follows.

Firstly, a pheromone fusion mechanism that regulates the pheromone distribution of each sub-colony is introduced to realize effective information exchange among multiple populations. It fuses the pheromone matrices of the other subpopulations while retaining the original population information, which improves the efficiency of communication and thus enhances the diversity of the algorithm.

Secondly, a cooperative game model is proposed to help each population select appropriate communication partners by finding the Pareto optimal combination. If the Pareto optimal combination belongs to the cooperative union, the profit distribution strategy is applied; otherwise, the pheromone smoothing mechanism is triggered. In the profit distribution strategy, the profits are distributed reasonably among the members by adding pheromone on the public paths shared by the populations, accelerating the convergence of the algorithm. In the pheromone smoothing mechanism, the pheromone matrix is reinitialized to help the algorithm jump out of local optima effectively.

Finally, information entropy is introduced to control the communication frequency, which we call the adaptive communication strategy. In this strategy, information entropy is used to evaluate the diversity of each population, and the communication frequency among populations is controlled by monitoring their entropy states, improving the adaptability of the algorithm.

In addition, the contents of this paper are organized as follows: Sect. 2 briefly introduces ACS, MMAS and information entropy. Section 3 reports the working principle of CGMACO, including the adaptive communication strategy, the pheromone fusion mechanism and the cooperative game model. Section 4 analyses the performance of the proposed algorithm under different strategies and compares CGMACO with traditional ant colony algorithms and other intelligent algorithms. Section 5 summarizes this research and outlines future work.

2 Related Work

2.1 Ant Colony System

The ant colony system (ACS) was proposed by the Italian scholar M. Dorigo in 1996 [5]. It introduced a special state transition rule called the pseudo-random proportional rule. This rule lets each ant choose its path in roulette-wheel mode with probability 1 − q0, where q0 is a parameter in [0, 1]. The state transition formula is as follows.

$${P}_{ij}^{k}=\left\{\begin{array}{ll}\frac{\tau_{ij}^{\alpha }\cdot \eta_{ij}^{\beta }}{{\sum }_{l\in \mathrm{allowed}_{k}}\tau_{il}^{\alpha }\cdot \eta_{il}^{\beta }} & \text{if } j\in \mathrm{allowed}_{k}\\ 0 & \text{else}\end{array}\right.$$
(1)

where \({\tau }_{ij}\) represents the pheromone value from city i to city j (edge (i, j)); \({\eta }_{ij}=1/{d}_{ij}\) denotes the heuristic information on edge (i, j), where \({d}_{ij}\) is the cost of edge (i, j); \(\mathrm{allowed}_{k}\) stores the set of cities that ant k has not yet visited; and \(\alpha\) and \(\beta\) are two weight parameters determining the relative influence of the pheromone value and the heuristic information. In addition, \(S=\mathrm{argmax}_{l\in \mathrm{allowed}_{k}}({\tau }_{il}\cdot {\eta }_{il}^{\beta })\) is the alternative transition rule, by which an ant positioned at city i moves deterministically to the best city when the random number q is lower than q0.
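As an illustrative sketch (not the authors' implementation), the pseudo-random proportional rule can be written in Python; the function name, dict-based matrices and parameter defaults are assumptions:

```python
import random

def acs_next_city(i, allowed, tau, eta, alpha=1.0, beta=2.0, q0=0.9):
    """Pseudo-random proportional rule of ACS (Eq. 1).

    tau and eta are dicts keyed by city pairs (i, j); allowed is the
    set of unvisited cities. Parameter defaults are illustrative.
    """
    if random.random() < q0:
        # Exploitation: S = argmax over allowed of tau_il * eta_il^beta
        return max(allowed, key=lambda j: tau[(i, j)] * eta[(i, j)] ** beta)
    # Exploration: roulette-wheel selection with probabilities P^k_ij
    weights = {j: tau[(i, j)] ** alpha * eta[(i, j)] ** beta for j in allowed}
    total = sum(weights.values())
    r, acc = random.random() * total, 0.0
    for j, w in weights.items():
        acc += w
        if acc >= r:
            return j
    return j  # guard against floating-point round-off
```

With q0 close to 1 the rule is almost greedy; lowering q0 shifts the balance towards exploration.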

After completing a transition from city i to city j, each ant applies a local pheromone update rule to decrease the attraction of edge (i, j). The formula is as follows:

$${\tau }_{ij}\leftarrow (1-\xi )\cdot {\tau }_{ij}+\xi \cdot {\tau }_{0}$$
(2)

where \(0<\xi <1\) is the local pheromone evaporation rate; \({\tau }_{0}=1/(n\cdot {l}_{n})\) is the initial pheromone level, where n is the number of cities and \({l}_{n}\) is the length of a tour created by the nearest-neighbour heuristic.

When all ants have completed tour construction, only the globally best tour is allowed to receive additional pheromone, which is called the global pheromone updating rule. The formula is written as:

$${\tau }_{ij}\leftarrow \left(1-\rho \right)\cdot {\tau }_{ij}+\rho \cdot \Delta {{\tau }^{\text{bs}}}_{ij}$$
(3)

where \(0<\rho <1\) is the global pheromone evaporation rate; \(\Delta {\tau }_{ij}^{\text{bs}}=1/{L}_{gb}\) is the amount of pheromone added, and \({L}_{gb}\) is the length of the best tour.
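The two updating rules (Eqs. 2 and 3) can be sketched as follows; the dict-based pheromone matrix and the parameter values are illustrative assumptions:

```python
def acs_local_update(tau, edge, xi=0.1, tau0=0.01):
    # Eq. (2): after an ant crosses an edge, its pheromone shrinks toward tau0
    tau[edge] = (1 - xi) * tau[edge] + xi * tau0

def acs_global_update(tau, best_tour, L_gb, rho=0.1):
    # Eq. (3): only the edges of the globally best tour receive pheromone
    deposit = 1.0 / L_gb  # delta tau^bs
    for edge in zip(best_tour, best_tour[1:] + best_tour[:1]):
        tau[edge] = (1 - rho) * tau[edge] + rho * deposit
```

The local rule is applied per edge during construction; the global rule runs once per iteration on the best tour only.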

2.2 Max–Min Ant System

In order to address the premature stagnation of the traditional ant colony algorithm, Stützle proposed the Max–Min ant system (MMAS) [6]. In MMAS, the pheromone is updated by alternating between the iteration-best tour and the best-so-far tour in the early stage of a run, and the updating rule is the same as formula (3). Moreover, the pheromone trail of each path is limited to a specified range [\({\tau }_{\mathrm{min}}\), \({\tau }_{\mathrm{max}}\)]: if \({\tau }_{ij}<{\tau }_{\mathrm{min}}\), then \({\tau }_{ij}={\tau }_{\mathrm{min}}\); if \({\tau }_{ij}>{\tau }_{\mathrm{max}}\), then \({\tau }_{ij}={\tau }_{\mathrm{max}}\). MMAS also reinitializes the pheromone matrix to avoid stagnation. The maximum and minimum pheromone values are set as follows:

$$\tau_{\max } = (1/\rho ) \cdot (1/L_{gb} )$$
(4)
$$\tau_{\min } = \tau_{\max } /(2n)$$
(5)

where n is the number of cities and \({L}_{gb}\) is the length of the iteration-best tour.
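A minimal sketch of the MMAS trail limits (Eqs. 4–5) and the clamping step; the value of rho is an illustrative assumption:

```python
def mmas_bounds(L_gb, n, rho=0.02):
    # Eqs. (4)-(5): limits derived from the iteration-best tour length
    tau_max = (1.0 / rho) * (1.0 / L_gb)
    tau_min = tau_max / (2.0 * n)
    return tau_min, tau_max

def clamp(tau_ij, tau_min, tau_max):
    # MMAS keeps every trail inside [tau_min, tau_max]
    return min(max(tau_ij, tau_min), tau_max)
```

The bounds shrink as better tours are found, so the admissible pheromone interval tracks solution quality.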

2.3 Information Entropy

Entropy was originally used in physics to measure the disorder of a thermodynamic system. Later, the American scholar Shannon introduced it into information theory and put forward the concept of information entropy. By now, information entropy has not only advanced greatly in theory [26,27,28] but has also achieved good results in many practical applications [29,30,31], which shows its effectiveness as a measure of a discrete system. The formula is as follows:

$$E\left(P\right)=-{\sum }_{x\in X}P\left(x\right)\mathrm{log}\,P\left(x\right)$$
(6)

where X is the solution set of the problem, P(x) is the probability of x, and \({\sum }_{x\in X}P(x)=1\).

3 Proposed Algorithm

In this research, a heterogeneous multi-colony ant algorithm based on cooperative game theory is proposed to balance the convergence and diversity of the algorithm. Regarding each subpopulation as an independent agent, and under the premise of individual rationality, cooperative evolution is realized through a game decision mechanism among the subpopulations; Fig. 1 shows the interactive model. This part is organized as follows: Sect. 3.1 introduces the self-adaptive communication strategy based on information entropy. Section 3.2 describes the pheromone fusion strategy in detail. Section 3.3 is dedicated to applying cooperative game theory to the multi-ACO algorithm. Section 3.4 gives the algorithm description.

Fig. 1 Dynamic interactive game model

3.1 Self-adaptive Communication Strategy

For a multi-colony algorithm, an appropriate method is needed to control the interaction frequency of the subpopulations. As the algorithm runs, the distribution of the pheromone matrices becomes more and more unbalanced, so interaction among the populations is needed to increase diversity. As described in Sect. 2.3, information entropy is applied to measure the diversity of the algorithm so as to control the communication frequency more accurately; the formulas are as follows:

$${P}_{i}(t)={n}_{i}/M$$
(7)
$$E\left({P}_{t}\right)=-{\sum\nolimits }_{i=1}^{m}{P}_{i}(t)\,\mathrm{log}\,{P}_{i}(t)$$
(8)

where \({P}_{i}(t)\) is the proportion of the M ants that selected the i-th tour when the colony generates m distinct paths in this iteration (\({n}_{i}\) ants selected tour i), and \(E({P}_{t})\) is the information entropy of the population in the t-th iteration: the greater the differences among the population's tours, the larger the information entropy, and vice versa. In other words, the higher the information entropy, the better the diversity of the population. Therefore, by comparing the information entropy of each subpopulation with a set threshold, the communication frequency among populations can be controlled more accurately, and the adaptability of the interaction is improved.
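A sketch of Eqs. (7)–(8): count how often each distinct tour appears among the M ants and compute the entropy. Representing a tour as a hashable tuple of cities is our assumption for illustration:

```python
import math
from collections import Counter

def population_entropy(tours):
    """Information entropy of one iteration (Eqs. 7-8).

    tours holds the tour built by each of the M ants as a hashable
    tuple of cities; P_i = n_i / M for each of the m distinct tours.
    """
    M = len(tours)
    counts = Counter(tours)
    return -sum((n / M) * math.log(n / M) for n in counts.values())
```

The entropy is 0 when every ant follows the same tour and is maximal when all tours are distinct.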

3.2 Pheromone Fusion Strategy

In this part, the pheromone fusion strategy is proposed to increase the diversity of a population when it falls into a local optimum. If \(E({p}_{i})<{E}^{*}(P)\), where \({E}^{*}(P)\) is the threshold parameter, the diversity of this population is considered poor. Under this circumstance, the population needs to communicate with the other populations. In this paper, the communication takes the form of pheromone fusion, realized through weighting coefficients derived from information entropy. The formula is as follows:

$$P{h}_{i}\leftarrow (1-{\sum }_{j\ne i}^{n-1}{w}_{j})\cdot P{h}_{i}+{\sum }_{j\ne i}^{n-1}{w}_{j}\cdot P{h}_{j}$$
(9)

where \(P{h}_{i}\) is the pheromone matrix of population i, \(P{h}_{j}\) is the pheromone matrix of population j, and \({w}_{j}\) is the pheromone contribution of population j to population i, given by:

$${w}_{j}=\frac{E({p}_{j})}{{\sum }_{j= \text{1} }^{n}E({p}_{j})}$$
(10)

where \(E({p}_{j})\) is the information entropy of population j.
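The fusion step (Eqs. 9–10) might be sketched as follows, with pheromone matrices stored as edge dicts (a representation we assume for illustration):

```python
def fuse_pheromone(Ph, entropies, i):
    """Pheromone fusion for population i (Eqs. 9-10).

    Ph is a list of pheromone matrices (dicts: edge -> value) and
    entropies[j] = E(p_j). Weights w_j are entropy proportions over
    all populations, as in Eq. (10).
    """
    total_E = sum(entropies)
    w = [entropies[j] / total_E for j in range(len(Ph))]
    w_other = sum(w[j] for j in range(len(Ph)) if j != i)
    fused = {}
    for edge in Ph[i]:
        mix = sum(w[j] * Ph[j][edge] for j in range(len(Ph)) if j != i)
        fused[edge] = (1 - w_other) * Ph[i][edge] + mix
    return fused
```

Because the weights sum to 1, population i keeps a share of its own matrix proportional to its own entropy, so the original information is retained while foreign trails are mixed in.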

3.3 Cooperative Game Model

3.3.1 Cooperative Game in Multi-ACO

In order to further improve the performance of the pheromone fusion mechanism proposed above and make full use of the heterogeneous populations, we build a cooperative game model based on cooperative game theory. In the model, each subpopulation selects one of two decisions, cooperation (C) or defection (D), when it receives a signal that communication is needed. In this paper, the cooperation and defection rules are given by (11):

$${w}_{j}=\left\{\begin{array}{c}{w}_{j}\, if\, {p}_{j}\, is\, cooperation\\ 0 \, \, \, if\, {p}_{j}\, is\, defection\end{array}\right.$$
(11)

Formula (11) illustrates that if population j chooses to cooperate (C), its weight coefficient is wj; if population j chooses to defect (D), its weight coefficient is 0. The different pheromone matrices obtained from these choices are shown in Table 1. There are thus three types of alliance structure: the full-union structure, in which all colonies cooperate (such as Ph11 in Table 1); the sub-union structure, in which not all colonies cooperate (such as Ph12 and Ph21 in Table 1); and the single-player structure, in which no colony cooperates (such as Ph22 in Table 1).

Table 1 The pheromone matrices under the cooperative game

In addition, a game consists of three basic parts: players, strategy sets and the payoff of each strategy. In this paper, the players are the subpopulations, and the pheromone matrices in Table 1 denote the strategy sets. Meanwhile, we define \({V}_{i}\), given by (12), as the payoff of strategy i, used to select the best pheromone matrix for the subpopulation with lower information entropy.

$${V}_{i}=f({p}_{i})\cdot {A}_{i}$$
(12)

where \(f({p}_{i})=\frac{E({p}_{i})}{E(p{)}_{gb}}\) and \({A}_{i}=\frac{{L}_{gb}}{{L}_{i}}\); \(E(p{)}_{gb}\) represents the global optimal information entropy over all strategies, \(E({p}_{i})\) denotes the information entropy of strategy i, \({L}_{gb}\) is the length of the global optimal path, and \({L}_{i}\) is the current optimal solution under strategy i. From Eq. (12), a larger \({A}_{i}\) denotes higher solution quality of the strategy, and a higher \(f({p}_{i})\) reflects better population diversity. Therefore, the higher the payoff \({V}_{i}\), the better the solution quality.
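A small sketch of the payoff evaluation (Eq. 12) and of picking the maximum-payoff strategy; the strategy labels and tuple layout are illustrative assumptions:

```python
def payoff(E_p, E_gb, L_i, L_gb):
    # Eq. (12): V_i = f(p_i) * A_i, with f = E(p_i)/E(p)_gb and A = L_gb/L_i
    return (E_p / E_gb) * (L_gb / L_i)

def pareto_strategy(strategies):
    """Return the label of the maximum-payoff strategy.

    strategies maps a label such as 'V11' to a tuple
    (E_p, E_gb, L_i, L_gb); the labels are illustrative.
    """
    return max(strategies, key=lambda s: payoff(*strategies[s]))
```

Both factors are at most 1, so a payoff near 1 indicates a strategy that is both diverse and short-tour.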

3.3.2 Profit Distribution Strategy

When the maximum payoff V belongs to the cooperative union, the profit distribution strategy is applied to distribute the payoff obtained by the union reasonably among the participants. In other words, if the profit of the cooperative union is higher than that of non-cooperation, the populations will choose cooperation under the premise of individual rationality. In this case, in order to maintain the stability of the cooperative alliance, it is necessary to distribute the profits reasonably among the members of the union. As is well known, a reasonable distribution mechanism encourages participants to join the union more actively and makes the alliance structure more stable, so that a higher collective profit can be obtained. In this paper, we define the profit increase after the cooperative game as the pheromone used to reward the common paths among participants; the formulas are given by (13)–(15), and a public path is shown in Fig. 2.

$$\Delta {\tau }_{\mathrm{profit}}=\frac{{V}_{\mathrm{new}}-{V}_{\mathrm{before}}}{{L}_{\mathrm{new}}}$$
(13)
$$\Delta {\tau }_{\mathrm{profit}}^{k}=\frac{E({p}_{k})}{{\sum }_{k\in K}E({p}_{k})}\cdot \Delta {\tau }_{\mathrm{profit}}$$
(14)
$$\tau_{{{\text{public}}}}^{k} = \left( {1 - \rho } \right) \cdot \tau + \rho \cdot \Delta \tau + \Delta \tau_{{{\text{profit}}}}^{k}$$
(15)
Fig. 2 Public path

where \(\Delta {\tau }_{\mathrm{profit}}\) is the profit increase of the colony after the cooperative game, i.e. the payoff of the union; \({V}_{\mathrm{before}}\) is the profit of the population without pheromone fusion; \({V}_{\mathrm{new}}\) is the maximum profit over all unions after pheromone fusion; \({L}_{\mathrm{new}}\) is the length of the new tour created by the cooperative game; \(\Delta {\tau }_{\mathrm{profit}}^{k}\) is the profit share of the k-th population; and \(E({p}_{k})\) is the information entropy of the k-th colony in the union.

From formulas (13) and (14), we can see that the public paths between the lower-entropy population and the other participating populations are rewarded with additional pheromone in the union, which makes cooperative profits higher than those of non-cooperation; this ensures the effectiveness of cooperation and enhances the collective rationality of the group. In addition, a path selected by many populations is also very likely to belong to the optimal tour [6]. Therefore, rewarding those paths with pheromone can reduce the useless search of the ant colony and accelerate the convergence of the algorithm. Formula (15) is the pheromone updating rule of population k on the public paths.
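The distribution rule (Eqs. 13–15) can be sketched as below; we assume the ordinary deposit in Eq. (15) is \(\Delta\tau = 1/L_{new}\), and the edge-dict representation is illustrative:

```python
def distribute_profit(tau, public_edges, entropies, V_new, V_before,
                      L_new, rho=0.1):
    """Reward the public paths of a union (Eqs. 13-15).

    tau[k] is the pheromone dict of member k; entropies[k] = E(p_k).
    """
    d_profit = (V_new - V_before) / L_new                  # Eq. (13)
    total_E = sum(entropies)
    for k, E_k in enumerate(entropies):
        d_k = (E_k / total_E) * d_profit                   # Eq. (14)
        for edge in public_edges:
            # Eq. (15): global update plus the member's profit share;
            # the deposit 1/L_new here is an assumption
            tau[k][edge] = (1 - rho) * tau[k][edge] + rho / L_new + d_k
    return tau
```

Members with higher entropy receive a larger share of the profit, mirroring Eq. (14).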

3.3.3 Pheromone Smoothing Mechanism

Moreover, a rare situation in which the payoff of non-cooperation is higher also needs to be considered. This means that the profit created by the pheromone fusion mechanism is less than that under the non-fusion mechanism. In this case, we first check whether the population has found a better solution. If so, its pheromone matrix is not changed, which indicates that the population still has potential; otherwise, the pheromone fusion mechanism is deemed invalid and the algorithm has stagnated. To deal with this issue, the pheromone smoothing mechanism (PSM) is introduced to help the population jump out of the local optimum; the formula is as follows:

$$\tau_{{{\text{ij}}}}^{i} = \frac{{\tau_{\min }^{i} + \tau_{{{\text{ij}}}}^{i} }}{2}$$
(16)

where \({\tau }_{\mathrm{min}}^{i}\) and \({\tau }_{\mathrm{ij}}^{i}\) are, respectively, the minimum pheromone concentration and the pheromone value on edge (i, j) in population i. If population i runs ACS, \({\tau }_{\mathrm{min}}^{i}\) equals \({\tau }_{0}\); otherwise, it equals \({\tau }_{\mathrm{min}}\) in MMAS. Formula (16) describes a novel pheromone reinitialization method: the larger the gap between the pheromone value on a path and the minimum pheromone trail, the more pheromone evaporates, while edges already at the minimum level are unaffected, so the pheromone matrix is reduced towards \({\tau }_{\mathrm{min}}\) by different degrees. Thus, the mechanism reduces the maximum pheromone trail without losing the information of suboptimal paths.
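Eq. (16) amounts to pulling every trail halfway toward the colony's minimum; a one-line sketch over an assumed edge-dict matrix:

```python
def smooth_pheromone(tau, tau_min):
    # Eq. (16): halve the gap between each trail and tau_min; edges
    # already at tau_min are left unchanged
    return {edge: (tau_min + t) / 2.0 for edge, t in tau.items()}
```

The relative ordering of edges is preserved, so suboptimal-path information survives the reset while the dominant trails lose most of their advantage.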

3.4 Algorithm Description

Informally, the algorithm proposed in this research can be described as follows. When a population's entropy drops below the threshold value, that population immediately launches the cooperative game model, and the other populations choose cooperation or defection under the pheromone fusion mechanism, using Eqs. (9)–(11), to create the strategy sets shown in Table 1. Then the population with the lower information entropy computes the payoff matrix shown in Table 2, based on the different pheromone matrices in Table 1.

Table 2 The payoff matrices of cooperative game

where V11 is the payoff under the full-union structure, V12 and V21 are the payoffs under the sub-union structure, and V22 is the payoff under the single-player structure. Thus, the Pareto optimum of this game is:

$${V}_{\mathrm{pareto}}=\mathrm{max}({V}_{11},{V}_{12},{V}_{21},{V}_{22})$$
(17)

Next, we determine which union is the Pareto optimum. If \({V}_{\mathrm{pareto}}={V}_{22}\), which means the pheromone fusion mechanism has failed, the pheromone smoothing mechanism is applied to help the algorithm escape the local optimum; otherwise, the profit distribution strategy is executed to speed up the convergence of the algorithm. Algorithm 1 and Fig. 3 show the framework and flowchart of the proposed algorithm, respectively.
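The decision cycle just described might be sketched at a high level as follows; the colony dictionaries and precomputed payoffs are hypothetical stand-ins for the components of Sects. 3.1–3.3:

```python
def cgmaco_step(colonies, E_threshold):
    """One decision cycle of CGMACO (control flow only).

    Each colony dict carries its entropy and the payoffs of its
    strategy sets (Table 2); both fields are assumed precomputed.
    """
    for colony in colonies:
        if colony['entropy'] >= E_threshold:
            continue  # diverse enough: no game is launched
        # Pareto optimum of the game (Eq. 17)
        best = max(colony['payoffs'], key=colony['payoffs'].get)
        if best == 'V22':
            colony['action'] = 'smooth'      # fusion failed: reinitialize (PSM)
        else:
            colony['action'] = 'distribute'  # reward the public paths
    return colonies
```

Only low-entropy colonies trigger the game; the chosen action then maps to the profit distribution strategy or the pheromone smoothing mechanism.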

Fig. 3 The flow chart of the CGMACO algorithm

4 Experiment and Simulation

In this part, Sect. 4.1 describes the parameter settings for ACS, MMAS and E(P)*, where E(P)* is the threshold value of the information entropy. Section 4.2 analyses the performance of the different strategies we have proposed. Section 4.3 compares the proposed algorithm with conventional ACO algorithms. Section 4.4 compares CGMACO experimentally with other improved ACO algorithms and swarm intelligence algorithms. The experimental platform is MATLAB R2019b in a Windows 10 environment on an Intel(R) Core(TM) i7-10700F CPU with 16 GB of RAM; the experiments are carried out on TSP instances of different scales, and each instance is run 20 times independently.

4.1 Parameters Setting

The first experiment in this part sets appropriate parameters for ACS and MMAS. In order to improve the performance of the proposed algorithm, we adopt orthogonal tests with four levels and sixteen parameter combinations to set appropriate parameters for each colony. The kroB100 instance is selected for the orthogonal experiment, and each parameter combination is executed 20 times independently to ensure the reliability of the experiment. Tables 3, 4 and 5 give the experimental results for ACS, and Tables 6, 7 and 8 those for MMAS.

Table 3 Experimental factors and levels of ACS
Table 4 Orthogonal test scheme and test results of ACS
Table 5 Analysis of test results of ACS
Table 6 Experimental factors and levels of MMAS
Table 7 Orthogonal test scheme and test results of MMAS
Table 8 Analysis of test results of MMAS

The second experiment in this part selects a suitable entropy threshold for CGMACO. The information entropy threshold E(P)* is another important parameter in this research. If E(P)* is too large, the subpopulations communicate too frequently, which makes the multi-population degenerate into a single population; if it is set too small, the insufficient interaction among sub-colonies also decreases the diversity of the algorithm. In this research, we select a suitable value of E(P)* through experiments on four TSP instances: kroA100, kroA200, lin318 and att532. Figure 4 illustrates the experimental data. As can be seen, the smallest fitness evaluation value is obtained with E(P)* = 4; therefore, we set E(P)* = 4 in the following experiments.

Fig. 4 Adjustment of the entropy threshold

From the above experimental results, the final parameter settings of the algorithm are shown in Table 9, where ρ denotes the global pheromone evaporation rate, ξ is the local pheromone evaporation rate, and M represents the number of ants.

Table 9 The parameter setting of the algorithm

4.2 Strategy Analysis

In this part, we analyse the effectiveness of the three strategies proposed above: the adaptive communication strategy based on information entropy, the pheromone fusion strategy and the pheromone smoothing mechanism. Strategy-1 (S-1) is the variant with the pheromone fusion strategy and the pheromone smoothing mechanism but without information entropy. Strategy-2 (S-2) retains the adaptive communication strategy and the pheromone smoothing mechanism but omits the pheromone fusion mechanism. Strategy-3 (S-3) has the adaptive communication strategy and the pheromone fusion strategy but no pheromone smoothing mechanism. In addition, so that the experimental algorithms run normally, S-1 uses a fixed-interval communication strategy (here every 200 iterations) and S-2 exchanges the optimal solution between colonies. In the experiment, the kroB100, kroA200 and fl417 TSP instances are selected and analysed in three respects: optimal solution error rate, worst solution and average solution. Each instance is run 20 times, with 2000 iterations per run. The experimental results are shown in Table 10 and Fig. 5.

Table 10 Experiment results with different strategies
Fig. 5 Results comparison in different strategies

As the results show, the performance of S-2, the variant without the pheromone fusion mechanism, is the worst, while CGMACO, with all strategies, is the best; S-1 and S-3 each have their own advantages on different instances. This is because the pheromone fusion mechanism, by regulating the pheromone distribution of each subpopulation, can take full advantage of the heterogeneous populations and improve the diversity of the algorithm. Without this mechanism, the communication efficiency among subpopulations is greatly reduced, as the experiments confirm, and the accuracy of the solutions drops. The variants without the information entropy strategy or the pheromone smoothing mechanism perform better than the variant without the pheromone fusion mechanism, but their solution quality is still lower than that of CGMACO.

4.3 Comparison with Traditional ACO Algorithm

In the first phase of the experiment, we compare CGMACO with traditional ACO algorithms. Table 11 compares the performance of the proposed algorithm with ACS and MMAS on 22 TSP instances. The evaluation criteria mainly include the best solution, the worst solution, the mean solution, the error rate and the standard deviation, where the error rate and standard deviation are given by:

$$\mathrm{error}=\frac{{L}_{\mathrm{ACO}}-{L}_{\mathrm{opt}}}{{L}_{\mathrm{opt}}}\times 100\%$$
(18)

where LACO represents the optimal solution found by each algorithm, and Lopt represents the known optimal solution of the test instance.

$$\mathrm{dev}=\sqrt{\frac{1}{N}{\sum }_{i=1}^{N}({L}_{i}-\overline{L}{)}^{2}}$$
(19)

where dev denotes the standard deviation, N represents the number of runs of the algorithm, Li is the solution obtained in the i-th run, and \(\overline{L}\) is the mean solution over the N runs.
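Both evaluation criteria (Eqs. 18–19) are straightforward to compute; a minimal sketch:

```python
import math

def error_rate(L_aco, L_opt):
    # Eq. (18): percentage gap between a tour and the known optimum
    return (L_aco - L_opt) / L_opt * 100.0

def std_dev(solutions):
    # Eq. (19): population standard deviation over the N runs
    mean = sum(solutions) / len(solutions)
    return math.sqrt(sum((L - mean) ** 2 for L in solutions) / len(solutions))
```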

As Table 11 shows, on small-scale instances with 51 to 200 cities, both CGMACO and the conventional ACOs achieve good error rates, but CGMACO has a lower average solution and standard deviation than ACS and MMAS thanks to the cooperation among multiple populations, which proves that CGMACO is more stable than the comparison algorithms. Moreover, the more flexible interaction mechanism based on the self-adaptive communication strategy makes information transmission among sub-colonies more adaptive, which greatly diversifies the solutions of the algorithm. On middle-scale instances such as tsp225, pr264 and a280, CGMACO attains the known optimal solution while the comparison algorithms do not. Although our algorithm does not obtain the optimal solution on lin318, its error rate stays within 1%, which is superior to the 1.81% and 2.42% obtained by ACS and MMAS, respectively. As the city scale increases from fl417 to rl1304, the error rate of the traditional ACOs climbs well above 1%, whereas the pheromone fusion mechanism proposed in this research regulates the pheromone distribution of each colony appropriately and makes the information exchange among subpopulations more effective, so the improved algorithm still keeps the error rate within 1%. Besides, the cooperative game model further promotes the efficiency of communication among sub-colonies, giving CGMACO a lower mean solution and standard deviation and strong stability even on large-scale instances.

Table 11 Performance comparison of CGMACO with ACS and MMAS

Figure 6 illustrates the clear improvement of the proposed algorithm in convergence speed and solution accuracy. With the population profit distribution strategy in CGMACO, high-quality solutions are selected in the early stage, which guides more ants to explore around the optimal solution and avoids much useless search, while ACS and MMAS must search the whole space to complete the evolution, which makes convergence difficult. As shown in Fig. 6, CGMACO converges faster than the conventional ant colony algorithms. In the later stage, single-population ACOs easily stagnate because of excessive pheromone accumulation; with the help of the pheromone smoothing mechanism, however, the pheromone matrix is re-initialized, which helps CGMACO jump out of local optima effectively. Taking the rl1304 instance as an example (Fig. 6f), this mechanism allows CGMACO to find new, better solutions at around iterations 1100 and 1600. In short, CGMACO converges faster than ACS and MMAS without losing solution quality.
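The pheromone smoothing step described above can be sketched as follows. The exact rule used in CGMACO is not reproduced in this excerpt; the proportional form $\tau \leftarrow \tau + \delta(\tau_{\max} - \tau)$ below is the smoothing rule known from the MMAS literature, where $\delta = 1$ amounts to the full re-initialization the text describes.

```python
import numpy as np

def smooth_pheromone(tau: np.ndarray, tau_max: float,
                     delta: float = 1.0) -> np.ndarray:
    """Pull every pheromone entry toward tau_max by a fraction delta.
    delta = 1.0 re-initializes the whole matrix to tau_max, erasing the
    accumulated bias that causes stagnation; smaller delta only flattens
    the distribution. The trigger condition (when stagnation is detected)
    is left to the caller."""
    return tau + delta * (tau_max - tau)

tau = np.array([[0.1, 0.9],
                [0.5, 0.2]])
print(smooth_pheromone(tau, tau_max=1.0, delta=1.0))  # all entries -> 1.0
print(smooth_pheromone(tau, tau_max=1.0, delta=0.5))  # halves each gap
```

A partial smoothing ($\delta < 1$) preserves some search history while restoring diversity; full re-initialization restarts exploration from a uniform matrix.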

Fig. 6
figure 6

Comparison of the convergence of different algorithms

Figure 7 shows the optimal tours found by CGMACO in the simulation experiments.

Fig. 7
figure 7

Optimal tour found by CGMACO

4.4 Comparison with Other Algorithms

In the second phase of the experiment, we compare CGMACO in detail with DBAL [32] and DSMO [33] in terms of error rate, average solution and standard deviation. From Table 12 and Fig. 8, we can conclude that the improved algorithm is strongly competitive with the comparison algorithms. The experiment uses instances with 51 to 1000 cities. On both small-scale and large-scale TSP instances, CGMACO outperforms DSMO on all three evaluation criteria, while CGMACO and DBAL each have advantages on different instances. For example, on lin318 the error rate, average solution and standard deviation of DBAL (0.1%, 42,268.1 and 128.24) are better than those of CGMACO (0.23%, 42,666.1 and 181.32); however, our proposed algorithm is superior to DBAL on pr1002. In general, under the synergy of the pheromone fusion mechanism and the cooperative game model, the efficiency of coordination among sub-colonies is greatly improved. As Table 12 shows, among the 13 TSP instances, CGMACO outperforms DBAL on 8, namely eil51, st70, eil76, eil101, kroA100, pr264, pr439 and pr1002, which demonstrates its excellent performance.

Table 12 Comparison of CGMACO with DBAL and DSMO
Fig. 8
figure 8

The average solution in each algorithm

In the third phase of the experiment, we compare CGMACO with other optimization algorithms. The comparison algorithms comprise single-colony ant algorithms, including HAACO [12], PACO-3opt [34], DEACO [35] and HMMA [36]; multi-colony ant algorithms, such as JCACO [22], NACO [24] and LDTACO [25]; and other swarm intelligence algorithms, including the hybrid ant colony particle swarm optimization algorithm PSO-ACO-3opt [2], the discrete bat algorithm DBAL [32], the discrete spider monkey optimization DSMO [33], the discrete water cycle algorithm DWCA [37], the artificial bee colony algorithm ABC [38], the discrete symbiotic organisms search algorithm DSOS [39] and the improved discrete bat algorithm IBA [40]. Tables 12 and 13 list the detailed experimental data, where "best" is the best solution obtained by each algorithm and "error" is the error rate defined by Eq. (18). A "-" in the tables denotes that the comparison algorithm did not test the instance.

Table 13 Comparison of the proposed algorithm with other algorithms on small-scale TSP instances

As Table 13 shows, our proposed algorithm finds the known optimal solutions on all small-scale TSP instances, outperforming comparison algorithms such as LDTACO, DSMO, DEACO and PACO-3opt. Moreover, as shown in Table 14, CGMACO is also superior to the recent algorithms. On the tsp225 and a280 instances, CGMACO still reaches the known optimum, while DSMO, NACO and JCACO do not. On the fl417, pr439 and p654 instances, the superiority of CGMACO is confirmed clearly, as its error rate is significantly lower than those of the comparison algorithms. These satisfactory experimental results are mainly attributed to the strong search ability brought by the cooperative game built on the pheromone fusion mechanism. Specifically, the pheromone fusion mechanism lets more useful information be explored, and the cooperative game model generates a beneficial pheromone distribution in the subpopulations. Together, these two methods make full use of the advantages of heterogeneous populations. In short, our proposed CGMACO is strongly competitive with state-of-the-art algorithms and obtains higher-quality solutions, especially on large-scale TSP instances.

Table 14 Comparison of the proposed algorithm with other algorithms on large-scale TSP instances

5 Conclusion

In this paper, we have proposed a novel ant colony algorithm, a multi-colony ACO based on a pheromone fusion mechanism with a cooperative game, to solve travelling salesman problems. The algorithm uses two ACS colonies and one MMAS colony. The two ACS subpopulations form a homogeneous population, which amplifies the convergence speed of ACS; the added MMAS subpopulation makes the overall population heterogeneous, which effectively enhances diversity. The complementary advantages of the multiple populations ensure the solution quality of the algorithm.

In addition, the pheromone fusion mechanism is applied to regulate the pheromone distribution of each subpopulation. It fuses the pheromone matrices of the other subpopulations while retaining the original population's information, which allows information to be exchanged among the populations more effectively. The experimental results show that the pheromone fusion mechanism is effective: it fully exploits the characteristics of each subpopulation and lets the heterogeneous sub-colonies complement one another.
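A minimal sketch of such a fusion step is shown below. The actual weights used in CGMACO are not given in this excerpt; the convex combination with a retention share `alpha` for the colony's own matrix and uniform weights over the partners is an illustrative assumption that matches the description of "retaining the original population information" while mixing in the others.

```python
import numpy as np

def fuse_pheromone(own: np.ndarray, partners: list[np.ndarray],
                   alpha: float = 0.7) -> np.ndarray:
    """Convex combination of pheromone matrices: keep a share alpha of
    the colony's own pheromone and distribute the remaining 1 - alpha
    evenly over the partner colonies' matrices. alpha and the uniform
    partner weights are illustrative, not the paper's exact values."""
    fused = alpha * own
    if partners:
        w = (1.0 - alpha) / len(partners)
        for p in partners:
            fused = fused + w * p
    return fused

own = np.full((3, 3), 1.0)       # e.g. an ACS colony's matrix
partner = np.full((3, 3), 0.5)   # e.g. the MMAS colony's matrix
print(fuse_pheromone(own, [partner], alpha=0.7))  # every entry 0.85
```

Because the result is a convex combination, fused pheromone values stay within the range spanned by the input matrices, which keeps the fusion compatible with MMAS-style bounds.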

The adaptive communication strategy and the cooperative game model are used to further control the pheromone fusion mechanism. The former, based on information entropy, makes the communication frequency among populations adaptive, and the latter helps each population select appropriate communication partners by evaluating the payoff of each coalition. The experiments on large-scale TSPs show that the improved algorithm raises solution accuracy without slowing population convergence, effectively balancing convergence speed and diversity.
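The entropy-based trigger can be sketched as follows. The excerpt does not give CGMACO's exact entropy definition or threshold schedule, so this sketch assumes the Shannon entropy of the distribution of tour lengths within a colony in one iteration, with communication triggered when diversity drops below a fixed threshold; both choices are illustrative.

```python
import math
from collections import Counter

def solution_entropy(tour_lengths: list[float]) -> float:
    """Shannon entropy (bits) of the empirical distribution of tour
    lengths in one iteration. High entropy means diverse solutions;
    entropy near zero means the colony has converged on one tour."""
    counts = Counter(tour_lengths)
    n = len(tour_lengths)
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return max(0.0, h)  # clamp -0.0 from the degenerate one-value case

def should_communicate(entropy: float, threshold: float) -> bool:
    """Trigger an inter-colony exchange when diversity falls below the
    threshold. CGMACO's actual threshold schedule is not reproduced."""
    return entropy < threshold

diverse = [430.0, 431.0, 432.0, 433.0]    # four distinct tour lengths
converged = [430.0, 430.0, 430.0, 430.0]  # colony stuck on one tour
print(solution_entropy(diverse), solution_entropy(converged))  # 2.0 0.0
```

Tying the communication frequency to entropy means exchanges happen only when a colony actually needs fresh information, instead of on a fixed schedule.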

In the future, more types of heterogeneous populations can be used in solution construction, and further pheromone fusion mechanisms can be designed to regulate the pheromone distribution among populations. Besides the information-entropy-based criterion used in this paper, methods based on statistics or machine learning could also be introduced to control the interaction frequency of the populations. Finally, the game mechanism proposed in this research also has practical value in applications of the ant colony algorithm.