1 Introduction

Given a connected, undirected and edge-weighted graph \(G(V, E, w)\), where V is the set of vertices of G, E is the set of edges of G, and each edge \(e \in E\) is associated with a weight \(w(e)\), the bounded diameter minimum spanning tree (BD-MST) problem consists of finding a spanning tree T of minimum weight subject to the constraint that the diameter of T (the maximum number of edges on the path connecting any two vertices in T) does not exceed D, where \(D \ge 2\) is a given positive integer.
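Formally, if \({\mathcal {T}}(G)\) denotes the set of all spanning trees of G, \(w(T) = \sum _{e \in E(T)} w(e)\) the total weight of a tree \(T \in {\mathcal {T}}(G)\) and \(diam(T)\) its diameter measured in edges, the BD-MST problem can be stated compactly as \(\min _{T \in {\mathcal {T}}(G)} w(T)\) subject to \(diam(T) \le D\); this is simply a symbolic restatement of the definition above.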

The BD-MST problem is proven to be \({\mathcal {N}}{\mathcal {P}}\)-hard for \(4 \le D < |V|-1\) (Garey and Johnson 1979) and finds many real-world applications. For example, linear light-wave networks (LLNs) involve multi-cast calls sent by each source to multiple destinations. To minimize interference in such a network, it is preferable to use a short (in terms of diameter) spanning tree for each transmission (Deo and Abdalla 2000). Bala et al. (1993) presented an algorithm to decompose an LLN into a set of edge-disjoint trees containing at least one spanning tree. However, instead of optimizing the diameter of the trees directly, this algorithm constructs trees of smaller diameter by finding trees whose maximum node degree does not exceed a parameter value. The lines of the network were assumed to be identical. If the lines have different bandwidths, then spanning trees should favor lines of higher bandwidth, as such lines are used frequently and carry heavy traffic. An algorithm that can solve the BD-MST problem would therefore be more helpful in determining an appropriate tree decomposition for such a network. For that purpose, an edge-weighted graph would be appropriate to model the LLN, where the weight of an edge would be set to 1/z, with z being the bandwidth of the corresponding line. Another application of the BD-MST problem arises in distributed mutual exclusion, where concurrent processes try to access a shared resource by executing a section of code called the critical section (CS). Wang and Lang (1994) proposed a token-based distributed algorithm that uses a logical spanning tree built on a network of processors, following Raymond's tree-based method (Raymond 1989). In this algorithm, k entries into the critical section are permitted, where k is the number of processes, and at most 2kD messages are needed for a vertex to enter the critical section, where D denotes the diameter of the tree. A small diameter therefore plays a significant role in the efficiency of the algorithm, while minimizing the weight of the spanning tree reduces the cost of the network. The BD-MST problem also finds an application in information retrieval, where large files are compressed using large data structures called bitmaps. Compressing these files helps them occupy less memory while still allowing reasonably fast access. Bookstein and Klein (1996) proposed a method that uses a spanning tree of a complete edge-weighted graph, generated on the bitmaps, to cluster and compress vectors along the paths from a selected node (say, the root of the tree) to all the leaves of the spanning tree. In this graph, vertices represent bitmaps and edge weights represent Hamming distances. Therefore, a spanning tree with a small diameter can provide high-speed retrieval. In addition, Anderson et al. (2014) used the bounded diameter minimum spanning tree as a method for selecting concatenations of pairwise registrations to obtain good results on the joint image alignment problem. An MRF-based optical flow refinement technique and a method for mesh identification were shown to improve the results in terms of active appearance model compactness, suitable for video-realistic synthesis.

The BD-MST problem has been studied extensively, and the literature contains a number of exact methods, problem-specific heuristic approaches and metaheuristic techniques. Exact methods (Achuthan et al. 1994; Gouveia and Magnanti 2003; Gouveia et al. 2011; Gruber and Raidl 2008) can address only relatively small problem instances (up to \(|V| = 100\)) due to the computational complexity of the problem. Among problem-specific heuristic approaches, Abdalla et al. (2000) proposed five heuristics: a first general-iterative-refinement algorithm (IR1), a second general-iterative-refinement algorithm (IR2), a composite iterative-refinement algorithm (CIR), one-time tree construction (OTTC), and a special-case approach for k = 4. For experimental purposes, the authors used graphs of different orders (\(50 \le |V| \le 3000\)) and densities and found that, among the four general algorithms, only OTTC was always successful in finding an approximate solution on complete graphs regardless of the value of the diameter bound k; however, the success rate of OTTC (using one source node) dropped quickly as the density of the graphs decreased, particularly for small values of the diameter bound such as k = 5. OTTC is a greedy heuristic that follows the idea of Prim's algorithm (Prim 1957): at each iteration, in order to add an unselected vertex to the tree, OTTC selects the closest (by weight) vertex whose addition to the partial tree respects the diameter constraint. The remaining approaches are described in detail in (Abdalla et al. 2000). Later, the randomized greedy heuristic (RGH) (Raidl and Julstrom 2003), a modified version of OTTC, was presented. RGH selects a single vertex as the tree center if D is even; otherwise, it selects two adjacent vertices as the two tree centers (D odd). Then, at each step, instead of selecting the closest (by weight) unselected vertex whose addition to the tree respects the diameter constraint, RGH selects an unselected vertex at random and joins it to the partial tree through the minimum-weight edge that respects the diameter constraint. Later, Julstrom (2009) proposed the center-based tree construction (CBTC) heuristic, an improved version of RGH. Like RGH, it is based on the idea of a tree center. In CBTC, at each iteration, in order to add an unselected vertex to the partial tree, the minimum-weight edge connecting an unselected vertex to one of the vertices in the partial tree is selected, provided that the addition of this edge respects the diameter constraint. In terms of performance, RGH is superior to CBTC on Euclidean instances, whereas CBTC is superior to RGH on non-Euclidean instances. Singh and Gupta (2007) further extended the ideas of RGH (Raidl and Julstrom 2003) and CBTC (Julstrom 2009) and presented improved versions called RGH-I and CBTC-I, respectively. In this improvement strategy, a search is performed for each vertex v (except the center vertex/vertices and their immediate child vertices) to check whether v can be connected, with a smaller edge weight, to a vertex whose depth is less than the depth of v. If the search is successful, the subtree rooted at vertex v is disconnected from its current parent and connected to the newly selected vertex.
Later, Saxena and Singh (2009) proposed further improved versions of RGH (Raidl and Julstrom 2003) and CBTC (Julstrom 2009), called RGH+HT and CBTC+HT, respectively. This improvement allows a subtree rooted at v to be joined to any vertex of the spanning tree, regardless of its depth, if and only if the cost of the resulting tree is further reduced while its diameter constraint is still respected. The authors improved the results of RGH-I and CBTC-I on Euclidean as well as non-Euclidean instances. Gruber and Raidl (2009) proposed hierarchical clustering-based methods for the BD-MST problem. The clustering heuristic builds diameter-constrained trees in three steps: finding a hierarchical clustering, minimizing the height of this clustering according to the diameter bound, and finally deriving a BD-MST from this height-restricted clustering. Various techniques are used within the individual phases, e.g. a GRASP-like strategy to refine cutting positions through the dendrogram, or dynamic programming to assign each cluster a good root node. The authors (Gruber and Raidl 2009) reported two variants of clustering heuristics, i.e. \(Cd^A\) and \(Cd^B\). The clustering heuristics were tested on 15 Euclidean instances with |V| = 1000 for various diameter bounds and, in general, obtain high solution quality in a few seconds, which is better than CBTC and RGH and even competitive with the ACO approach (Gruber et al. 2006). Steitz (2015) proposed two new problem-specific heuristic approaches, called node selection tree construction (NSTC) and savings tree construction (STC). Patvardhan et al. (2014) presented parallel versions of the CBTC (Julstrom 2009), RGH (Raidl and Julstrom 2003) and center-based least sum-of-costs (CBLSoC) (Patvardhan and Prakash 2009) problem-specific heuristics. Patvardhan et al. (2015) presented some fast and effective problem-specific heuristics for the BD-MST problem and tested them on Euclidean instances with up to \(|V| = 10000\). The authors of (Steitz 2015) and (Patvardhan et al. 2015) compared their problem-specific heuristics with the existing problem-specific heuristics on the Euclidean instances only.

In addition, the literature contains a number of metaheuristic approaches for the BD-MST problem. Raidl and Julstrom (2003) proposed an evolutionary algorithm based on edge-set encoding (RJ-ESEA). RJ-ESEA uses a crossover based on the RGH heuristic and four different mutations. Later, Julstrom (2003) presented an evolutionary algorithm using permutation encoding, in which the RGH heuristic is used to decode the encoded solution into a spanning tree. The author (Julstrom 2003) showed that this permutation-encoded evolutionary algorithm performs better than RJ-ESEA (Raidl and Julstrom 2003) in terms of solution quality, but is several times slower. Julstrom (2004) also proposed two generational evolutionary algorithms based on permutation-encoding and random-key techniques. These two approaches are comparable with the evolutionary algorithm using permutation encoding (Julstrom 2003). Singh and Gupta (2007) proposed a permutation-coded evolutionary algorithm (PEA-I). In PEA-I, the RGH-I heuristic is used to decode the encoded solution (an ordering of vertices) into a bounded-diameter spanning tree. Compared with the approaches proposed in (Raidl and Julstrom 2003; Julstrom 2003, 2004), PEA-I takes much less time to obtain better solution quality on each considered problem instance. Binh et al. (2008) proposed a hybrid genetic algorithm (HGA) that uses the idea of multiple populations. To generate the multiple populations, the authors used different well-known heuristics. The individuals from each population are then migrated to a final population called GA_final, where they merge and compete to breed. Overall, HGA is better than RJ-ESEA (Raidl and Julstrom 2003) and PEA-I (Singh and Gupta 2007) on Euclidean instances, and marginally better on non-Euclidean instances, in terms of solution quality. The authors (Binh et al. 2008) used the same termination criterion as RJ-ESEA and PEA-I, but did not discuss or report the computational time required to obtain the solution quality on each Euclidean and non-Euclidean instance. Since GA_final in HGA uses the same individual representation and operators as RJ-ESEA (Raidl and Julstrom 2003), one can infer that HGA obtains overall better results than PEA-I (Singh and Gupta 2007) at the cost of much higher computational time. Gruber and Raidl (2005) proposed a variable neighborhood search (VNS) approach with four neighborhood strategies (node-swap neighborhood, level-change neighborhood, arc-exchange neighborhood and center change-level neighborhood) for the BD-MST problem. These neighborhood strategies are used as local search methods within the VNS approach. Later, Gruber et al. (2006) proposed an evolutionary algorithm (EA) with a new solution encoding scheme called level encoding, as well as an ant colony optimization (ACO) algorithm; in both the EA and the ACO algorithm, the node-swap and arc-exchange neighborhoods are used as local search methods. In comparison with the VNS approach, the EA and ACO find better solution quality on almost all instances. Lucena et al. (2010) presented a hybrid semi-greedy heuristic (LRS) that combines the ideas of greedy randomized adaptive search procedures (GRASP) and iterated local search (ILS). Later, Torkestani (2012) presented a learning automata-based distributed algorithm (LABD) for this problem. LABD consists of several stages.
At each stage, the constructed bounded-diameter spanning tree is rewarded if its cost is the minimum cost seen so far; otherwise, the constructed tree is penalized. The author (Torkestani 2012) demonstrated the performance of LABD by comparing its results with those reported in (Gruber and Raidl 2005) and (Lucena et al. 2010). Vuppuluri and Patvardhan (2021) presented a serial memetic algorithm (2PMA) for the BD-MST problem. 2PMA is a two-phase memetic algorithm that uses two recombination operators, an order-based recombination operator and a neighborhood-based recombination operator, in order to direct the exploration of the search space into regions containing better solutions. The authors also presented a parallel memetic algorithm for the BD-MST problem. In addition, many of the heuristics for the BD-MST problem have also been recast for solving the related bi-objective BD-MST problem (Saha et al. 2010; Prakash et al. 2018; Saha and Kumar 2011; Prakash et al. 2020).

Table 1 summarizes the work done in the literature—exact methods, problem-specific heuristic approaches, metaheuristic techniques and parallel implementations—for the BD-MST problem.

Table 1 Summary of the work done in the literature for the BD-MST problem

In this paper, we present an artificial bee colony (ABC) algorithm for the BD-MST problem. The ABC algorithm is a swarm-based metaheuristic technique which emulates the intelligent foraging behavior of honey bees. It was introduced by Karaboga (2005) and has been applied successfully to various optimization problems (Karaboga et al. 2014; Pan et al. 2011; Singh 2009; Sundar and Singh 2010). The proposed ABC algorithm employs a permutation encoding. To exploit this encoding structure, two neighborhood strategies are applied that help the ABC algorithm converge faster towards high-quality solutions. The proposed approach has been compared with state-of-the-art approaches on a set of Euclidean and non-Euclidean benchmark instances for various diameter bounds. Computational results demonstrate the effectiveness of the proposed approach relative to the other existing approaches in the literature in terms of solution quality and computational time.

The remainder of this paper is organized as follows: Sect. 2 describes the proposed ABC algorithm for the BD-MST problem; Sect. 3 discusses the computational results of the ABC algorithm for the BD-MST problem; and finally, Sect. 4 presents some concluding remarks.

2 ABC algorithm for the BD-MST

Inspired by the foraging behaviour of honey bees, Karaboga (Karaboga 2005; Karaboga et al. 2014) modeled the ABC algorithm for solving optimization problems. The ABC algorithm consists of three categories of artificial bees, called employed bees, scout bees and onlooker bees, which exhibit a variety of intelligent behaviours in their respective phases, i.e. the employed bee phase, the scout bee phase and the onlooker bee phase. Each category of artificial bees behaves stochastically and interacts locally with the others, leading to the emergence of high-quality solutions for an optimization problem. A general introduction to the ABC algorithm and its applications can be found in (Karaboga and Basturk 2007; Karaboga et al. 2014; Singh 2009; Sundar and Singh 2010).

This section presents the proposed ABC algorithm for the BD-MST problem, which will hereafter be referred to as PABC_BD.

The description of each component of PABC_BD is as follows:

2.1 Solution representation

PABC_BD uses a permutation encoding to represent a solution \( E _i\), where \( E _i\) is encoded as a linear permutation of all vertices of V. Such an ordering of vertices in \( E _i\) helps in decoding \( E _i\) into a feasible bounded-diameter spanning tree (see Sect. 2.3). In this encoding, if D is even, then the first vertex in the ordering is considered the center of \( E _i\); if D is odd, then the first two vertices in the ordering are considered the two centers of \( E _i\). Note that such a permutation encoding is also used in (Julstrom 2003, 2004; Singh and Gupta 2007).
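The following C fragment sketches one possible in-memory layout of such a permutation-encoded solution; the type and field names (Solution, perm, n, fitness) are illustrative assumptions and are not taken from the authors' implementation.

/* Minimal sketch of a permutation-encoded solution as used in PABC_BD.
   All names are illustrative assumptions, not the authors' code.         */
typedef struct {
    int    *perm;     /* linear permutation of the |V| vertices           */
    int     n;        /* number of vertices, |V|                          */
    double  fitness;  /* total weight of the decoded spanning tree        */
} Solution;

/* For even D, the vertex perm[0] acts as the tree center; for odd D, the
   two adjacent vertices perm[0] and perm[1] act as the two tree centers. */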

2.2 Solution initialization

Each initial solution \( E _i\), encoded as a linear permutation of all vertices of V, is generated randomly by the following iterative process. Initially, \( E _i\) is an array of |V| empty positions. A copy (say S) of V is created. A vertex \(v_i\) is selected at random from S, inserted into the first vacant position of \( E _i\) and then deleted from S. In the same way, the next vertex \(v_j\), selected at random from S, is inserted into the next vacant position of \( E _i\) and then deleted from S. This procedure is repeated until S becomes empty.
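A minimal C sketch of this initialization, assuming vertices are numbered 0 to |V|-1, is given below; it reproduces the described procedure of drawing vertices uniformly from a shrinking copy S of V, and the function and variable names are hypothetical.

#include <stdlib.h>

/* Build a random permutation of the n vertices, exactly as described above:
   repeatedly draw a vertex uniformly at random from a shrinking copy S of V
   and place it in the next vacant position of the solution.                */
void init_random_permutation(int *perm, int n)
{
    int *S = malloc(n * sizeof *S);          /* working copy of V           */
    for (int v = 0; v < n; v++) S[v] = v;

    int remaining = n;
    for (int pos = 0; pos < n; pos++) {
        int k = rand() % remaining;          /* pick a vertex at random     */
        perm[pos] = S[k];                    /* next vacant position of E_i */
        S[k] = S[--remaining];               /* delete it from S            */
    }
    free(S);
}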

Once an initial solution \( E _i\) is generated, it is associated with a unique employed bee.

2.3 Solution decoding

We use a slightly modified version of the randomized greedy heuristic (RGH) (Raidl and Julstrom 2003) as a decoder to transform a solution \( E _i\) into a feasible bounded-diameter spanning tree. In this decoding, instead of picking a vertex randomly, the decoder picks vertices in the order of the linear arrangement in \( E _i\). After decoding, a feasible bounded-diameter spanning tree for \( E _i\) is obtained. A similar decoding is also applied in (Julstrom 2003).
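The sketch below illustrates, in C, how such an RGH-style decoder can enforce the diameter bound by limiting vertex depths relative to the center(s): vertices are attached in permutation order, each through the cheapest edge to an already-placed vertex whose depth is below the allowed radius. The function name, the dense weight matrix and the MAXV constant are illustrative assumptions rather than the authors' actual code.

#include <float.h>

#define MAXV 1024   /* illustrative upper bound on |V| */

/* Decode the permutation perm into a bounded-diameter spanning tree stored
   as parent[]/depth[] arrays and return its total weight.  w is the edge-
   weight matrix of the (complete) graph and D the diameter bound.          */
double decode(const int *perm, int n, int D,
              double w[MAXV][MAXV], int *parent, int *depth)
{
    int radius = D / 2;                      /* maximum allowed depth        */
    double cost = 0.0;
    int placed;

    /* centers: one vertex if D is even, two adjacent vertices if D is odd  */
    parent[perm[0]] = -1; depth[perm[0]] = 0; placed = 1;
    if (D % 2 == 1) {
        parent[perm[1]] = perm[0]; depth[perm[1]] = 0;
        cost += w[perm[0]][perm[1]]; placed = 2;
    }

    for (int i = placed; i < n; i++) {       /* attach vertices in perm order */
        int v = perm[i], best = -1;
        double bestw = DBL_MAX;
        for (int j = 0; j < i; j++) {        /* cheapest feasible attachment  */
            int u = perm[j];
            if (depth[u] < radius && w[u][v] < bestw) {
                bestw = w[u][v];
                best  = u;
            }
        }
        parent[v] = best;
        depth[v]  = depth[best] + 1;         /* depth stays within the radius */
        cost += bestw;
    }
    return cost;
}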

2.4 Fitness of a solution

Once the solution \( E _i\) is decoded into a bounded-diameter spanning tree, its fitness is computed as the sum of the weights of all edges of this tree.
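As a small illustration, assuming the decoded tree is stored as a parent array (with -1 marking the center) and a row-major weight matrix, the fitness computation reduces to the following sketch; names are hypothetical.

/* Fitness sketch: the sum of the weights of the tree edges.  parent[] comes
   from the decoder; w is a row-major |V| x |V| edge-weight matrix.          */
double tree_weight(int n, const int *parent, const double *w)
{
    double sum = 0.0;
    for (int v = 0; v < n; v++)
        if (parent[v] >= 0)                  /* skip the (first) center      */
            sum += w[parent[v] * n + v];
    return sum;
}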

2.5 Selection of a solution

We use the binary tournament selection method, a probability-based selection method, in PABC_BD. Two distinct solutions are selected at random from the employed bee population. Between these two solutions, the better one (in terms of fitness) is selected with probability \(P_{bt}\); otherwise, the worse one is selected, with probability 1-\(P_{bt}\).

Algorithm 1 presents the pseudocode of the binary tournament selection method, i.e. the \( B TS( E _1, E _2, \dots , E _{ N E })\) function, which returns the index of the selected solution in the current employed bee population.

Algorithm 1 Pseudocode of the binary tournament selection method
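A compact C sketch of this selection rule is shown below (the problem is a minimization, so the solution with lower fitness is the better one); the function name and signature are illustrative assumptions.

#include <stdlib.h>

/* Binary tournament: draw two distinct solutions at random from the employed
   bee population and return the index of the better one with probability
   P_bt, of the other one otherwise.                                         */
int binary_tournament(const double *fitness, int NE, double P_bt)
{
    int a = rand() % NE, b;
    do { b = rand() % NE; } while (b == a);  /* two different solutions      */

    int better = (fitness[a] <= fitness[b]) ? a : b;
    int worse  = (better == a) ? b : a;
    double r   = (double)rand() / RAND_MAX;
    return (r < P_bt) ? better : worse;      /* index in the population      */
}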

2.6 Determination of a neighboring solution

Since a solution in PABC_BD is structured as a linear permutation of the vertices of V, PABC_BD uses two different neighborhood strategies, insertion on multiple positions (\( IMP \)) and swap on multiple positions (\( SMP \)), to exploit this structure effectively and converge faster towards high solution quality. One of these neighborhood strategies is applied to the current solution \( E _i\) in order to determine a neighboring solution (\( E ^{'}\)). \( IMP \) follows the principle of reusing solution components from another solution of the employed bee population (Sundar and Singh 2012): if a vertex is positioned correctly in a high-quality solution, then there is a high likelihood that this vertex holds the same position, or a position in its vicinity, in many other high-quality solutions. The role of \( SMP \) is to keep the current solution from being trapped in a local optimum and to help in exploring the search space. Note that \( IMP \) and \( SMP \) are applied in a mutually exclusive way: with probability \(P_{nbr}\), \( IMP \) is applied; otherwise, \( SMP \) is applied. \(P_{nbr}\) is a parameter to be determined experimentally. Each neighborhood strategy is described below:

To apply the insertion on multiple positions (\( IMP \)) neighborhood strategy to the current solution \( E _i\) in order to determine a new permutation of the vertices of V (a new neighboring solution \( E ^{'}\)), first a solution \( E ^{r}\), different from \( E _i\), is picked from the employed bee population with the help of the binary tournament selection method (see Sect. 2.5), and \( E ^{'}\) is initialized as an empty solution. A number (\(|V| \times P_{mpi}\)) of different positions are then picked at random from \( E ^{r}\), and the vertices assigned to these positions in \( E ^{r}\) are inserted into the same positions of \( E ^{'}\). Next, the vertices of \( E _i\) that are not yet in \( E ^{'}\) are inserted one by one, in the order in which they appear in \( E _i\). Note that \(P_{mpi}\) is a parameter to be determined experimentally. Also note that if, while picking the solution \( E ^{r}\) by binary tournament selection, \( E _i\) and \( E ^{r}\) turn out to represent the same solution, i.e. a collision happens (Singh 2009), then the employed bee population is facing a diversity issue. To handle this, if this situation occurs while determining a new neighboring solution in the employed bee phase, then the employed bee associated with this solution (food source) \( E _i\) abandons \( E _i\) and becomes a scout. This scout instantly becomes employed again by associating it with a newly generated random solution (see Sect. 2.2). However, if this situation occurs while determining a new neighboring solution for an onlooker bee in the onlooker bee phase, then this onlooker bee is associated with a dummy solution whose fitness value is set artificially higher than the fitness value of its associated employed bee solution, so that it cannot be selected in the next iteration. Note that we do not generate a random solution for such an onlooker bee: a newly generated random solution would have to compete against the actual solution as well as against the solutions of all other onlookers associated with that solution, and hence would most likely be lost immediately. This is similar to the concept of "collision" coined in (Singh 2009).

Fig. 1 An illustration for IMP

To explain \( IMP \), we use the example depicted in Fig. 1a–c. Figure 1a (in white) represents a solution \( E _i\), a linear permutation of vertices, i.e. \(\{1, 5, 7, 2, 4, 3, 6\}\). Figure 1b (in light grey) represents another solution \( E ^{r}\), a linear permutation of vertices, i.e. \(\{4, 7, 1, 2, 3, 6, 5\}\), picked from the employed bee population; note that \( E ^{r}\) is different from \( E _i\). Initially, \( E ^{'}\) is an empty solution. Two different positions are picked at random from \( E ^{r}\), and the vertices assigned to these positions in \( E ^{r}\) are inserted into the same positions of \( E ^{'}\); these positions are shown in light grey in \( E ^{'}\). The vertices of \( E _i\) that are not yet in \( E ^{'}\) are then inserted one by one, in the order in which they appear in \( E _i\). Finally, Fig. 1c depicts the new neighboring solution \( E ^{'}\), a linear permutation of vertices, i.e. \(\{1, 7, 5, 2, 4, 6, 3\}\).
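The following C sketch mirrors the IMP procedure just illustrated: a fraction \(P_{mpi}\) of positions is copied verbatim from the donor solution \( E ^{r}\), and the remaining vertices are filled in the order in which they appear in \( E _i\). Vertices are assumed to be numbered 0 to |V|-1, and all function and variable names are illustrative.

#include <stdlib.h>

/* IMP neighborhood: copy |V| * P_mpi randomly chosen positions from the donor
   Er into the same positions of Enew, then fill the remaining positions with
   the still-unused vertices of Ei, preserving their order in Ei.            */
void imp_neighbor(const int *Ei, const int *Er, int *Enew, int n, double P_mpi)
{
    int   m         = (int)(n * P_mpi);           /* number of copied positions */
    char *pos_fixed = calloc(n, sizeof(char));    /* positions taken from Er    */
    char *used      = calloc(n, sizeof(char));    /* vertices already placed    */

    for (int picked = 0; picked < m; ) {          /* pick m distinct positions  */
        int p = rand() % n;
        if (!pos_fixed[p]) {
            pos_fixed[p] = 1;
            Enew[p] = Er[p];
            used[Er[p]] = 1;
            picked++;
        }
    }
    int p = 0;                                    /* next vacant position       */
    for (int i = 0; i < n; i++) {
        if (used[Ei[i]]) continue;
        while (pos_fixed[p]) p++;
        Enew[p++] = Ei[i];
    }
    free(pos_fixed);
    free(used);
}

Running this sketch on the example of Fig. 1, with the second and sixth positions of \( E ^{r}\) selected, indeed yields \(\{1, 7, 5, 2, 4, 6, 3\}\).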

Fig. 2 An illustration for SMP

To apply the swap on multiple positions (\( SMP \)) neighborhood strategy to the current solution \( E _i\) in order to determine a new permutation of the vertices of V (a new neighboring solution \( E ^{'}\)), first a copy (say \( E ^{'}\)) of \( E _i\) is created. Then, a number (\(|V| \times P_{msw}\)) of swaps are performed; each swap exchanges the two vertices assigned to two different positions (x and y) selected uniformly at random from \( E ^{'}\). Here, \(P_{msw}\) is a parameter to be determined experimentally. Note that the distance between positions x and y must be greater than two.

To explain \( SMP \), we use the example depicted in Fig. 2a–b. Figure 2a depicts a solution \( E _i\), a linear permutation of vertices, i.e. \(\{4, 7, 1, 2, 3, 6, 5\}\). Figure 2b depicts the solution \( E ^{'}\), which starts as a copy of \( E _i\). Two positions (shown in light grey) whose distance is greater than two are selected in \( E ^{'}\); the vertices assigned to these two positions, 7 and 6, are swapped in \( E ^{'}\). The resulting solution after this swap is depicted in Fig. 2b.
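A corresponding C sketch of SMP is given below; it copies \( E _i\) and performs \(|V| \times P_{msw}\) random swaps, each between two positions that are more than two apart. Names are illustrative.

#include <stdlib.h>
#include <string.h>

/* SMP neighborhood: start from a copy of Ei and apply |V| * P_msw swaps of
   vertices at two random positions whose distance exceeds two.             */
void smp_neighbor(const int *Ei, int *Enew, int n, double P_msw)
{
    memcpy(Enew, Ei, n * sizeof(int));            /* E' is a copy of E_i     */

    int swaps = (int)(n * P_msw);
    for (int s = 0; s < swaps; s++) {
        int x, y;
        do {                                      /* distance must exceed 2  */
            x = rand() % n;
            y = rand() % n;
        } while (abs(x - y) <= 2);
        int tmp = Enew[x]; Enew[x] = Enew[y]; Enew[y] = tmp;
    }
}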

Algorithm 2 Pseudocode of PABC_BD

2.7 Other features

If a solution (say \( E _i\)) of the employed bee population does not improve over a certain number of iterations (say limit), then \( E _i\) is assumed to be trapped in a local optimum. To handle this situation, the associated employed bee abandons \( E _i\) and becomes a scout bee. This scout bee then generates a new random solution (see Sect. 2.2), which instantly replaces the abandoned solution, and the scout bee becomes an employed bee on this newly generated random solution. limit is a parameter that is to be determined experimentally.

Algorithm 2 presents the pseudocode of PABC_BD, where the binary tournament selection method is invoked through the \( B TS( E _1, E _2, \dots , E _{ N E })\) function, which returns a selected solution (see Algorithm 1 in Sect. 2.5), and the method for determining a solution in the neighborhood of a solution, say \( X \), is invoked through the \( D N bring\_ S ol( X )\) function, which returns a neighboring solution, say \( X' \). Note that \( D N bring\_ S ol( X )\) contains the two neighborhood strategies \( IMP \) and \( SMP \), which are applied in a mutually exclusive way. If a collision occurs in \( IMP \), then \( D N bring\_ S ol( X )\) returns \(\varPhi \) for \( X' \).
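To show how these components fit together, the following condensed C sketch outlines one possible structure of the PABC_BD main loop, wired to the helper sketches of the previous subsections; evaluate() stands for the decode-and-fitness step of Sects. 2.3–2.4. Collision handling and the onlooker dummy-solution rule appear only as comments, onlooker updates are applied immediately rather than at the end of the phase, and every name, signature and control-flow detail here is an illustrative assumption rather than a transcription of Algorithm 2.

#include <stdlib.h>
#include <string.h>

extern double evaluate(const int *perm, int n);              /* Sects. 2.3-2.4 */
extern int    binary_tournament(const double *fit, int NE, double P_bt);
extern void   init_random_permutation(int *perm, int n);
extern void   imp_neighbor(const int *Ei, const int *Er, int *Enew,
                           int n, double P_mpi);
extern void   smp_neighbor(const int *Ei, int *Enew, int n, double P_msw);

void pabc_bd(int n, int NE, int NO, int limit, int max_gen,
             double P_bt, double P_nbr, double P_mpi, double P_msw)
{
    int    **pop   = malloc(NE * sizeof *pop);   /* employed-bee solutions     */
    double  *fit   = malloc(NE * sizeof *fit);
    int     *stall = calloc(NE, sizeof *stall);  /* iterations w/o improvement */
    int     *cand  = malloc(n  * sizeof *cand);  /* scratch neighboring sol.   */

    for (int i = 0; i < NE; i++) {               /* initial population         */
        pop[i] = malloc(n * sizeof **pop);
        init_random_permutation(pop[i], n);
        fit[i] = evaluate(pop[i], n);
    }

    for (int gen = 0; gen < max_gen; gen++) {
        /* employed bee phase: one neighboring solution per food source */
        for (int i = 0; i < NE; i++) {
            if ((double)rand() / RAND_MAX < P_nbr) {
                int r = binary_tournament(fit, NE, P_bt);
                /* paper's rule: if pop[r] equals pop[i] (collision), abandon
                   pop[i] and re-initialize it as a scout; omitted here       */
                imp_neighbor(pop[i], pop[r], cand, n, P_mpi);
            } else {
                smp_neighbor(pop[i], cand, n, P_msw);
            }
            double f = evaluate(cand, n);
            if (f < fit[i]) {                    /* greedy replacement         */
                memcpy(pop[i], cand, n * sizeof *cand);
                fit[i] = f; stall[i] = 0;
            } else if (++stall[i] >= limit) {    /* scout bee of Sect. 2.7     */
                init_random_permutation(pop[i], n);
                fit[i] = evaluate(pop[i], n);
                stall[i] = 0;
            }
        }
        /* onlooker bee phase: NO neighboring solutions around selected sources */
        for (int k = 0; k < NO; k++) {
            int i = binary_tournament(fit, NE, P_bt);
            if ((double)rand() / RAND_MAX < P_nbr) {
                int r = binary_tournament(fit, NE, P_bt);
                /* paper's rule: on a collision the onlooker is given a dummy
                   (artificially worse) solution instead; omitted here        */
                imp_neighbor(pop[i], pop[r], cand, n, P_mpi);
            } else {
                smp_neighbor(pop[i], cand, n, P_msw);
            }
            double f = evaluate(cand, n);
            if (f < fit[i]) { memcpy(pop[i], cand, n * sizeof *cand); fit[i] = f; }
        }
    }
    /* cleanup of pop, fit, stall and cand omitted for brevity */
}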

3 Computational results

PABC_BD is implemented in C and executed on a Linux-based operating system on a machine with an Intel Core i5 processor (3.3 GHz \(\times \) 4) and 4 GB RAM. The following subsections describe the benchmark instances (Sect. 3.1); the parameter settings (Sect. 3.2); a comparison of PABC_BD with state-of-the-art approaches such as PEA-I (Singh and Gupta 2007), HGA (Binh et al. 2008) and 2PMA (Vuppuluri and Patvardhan 2021) (Sect. 3.3) and VNS (Gruber et al. 2006), EA (Gruber et al. 2006), ACO (Gruber et al. 2006), LRS (Lucena et al. 2010) and LABD (Torkestani 2012) (Sect. 3.4); a statistical analysis performed to detect significant differences between PABC_BD and the other approaches (PEA-I, VNS, EA, ACO, LRS and LABD) (Sect. 3.5); an analysis of the convergence behavior of PABC_BD (Sect. 3.6); a collective picture of the overall comparison (Sect. 3.7); and the performance of PABC_BD on 15 Euclidean instances of 1000 nodes with various diameter bounds (Sect. 3.8).

3.1 Benchmark instances

As in (Gruber et al. 2006; Lucena et al. 2010; Singh and Gupta 2007; Torkestani 2012), PABC_BD is tested on the same benchmark instances used in these works. The 45 benchmark instances, whose sizes vary from 50 to 1000 vertices, are classified into two data-sets as follows:

  • Euclidean data-set: This data-set consists of graphs whose sizes vary from 50 to 1000 vertices. There are five instances for each value of |V| = {50, 100, 250, 500, 1000}, leading to a total of 25 instances. These instances can be downloaded from Beasley's OR-Library, where they are listed as instances of the Euclidean Steiner tree problem. The Euclidean instances consist of |V| points chosen randomly in the unit square. The diameter bound D is set to 5, 10, 15, 20 and 25 for problems of size |V| = 50, 100, 250, 500 and 1000, respectively.

  • Non-Euclidean data-set: There are five instances for each value of |V| = {100, 250, 500, 1000}, leading to a total of 20 non-Euclidean instances. All of these instances are complete graphs with edge weights chosen randomly from the range [0.01, 0.99]. The diameter bound D is set to 10, 15, 20 and 25 for problems of size |V| = 100, 250, 500 and 1000, respectively.

3.2 Parameter tuning

In all our experiments with PABC_BD, we use the parameters \( N E \), \( N O \), limit, \(P_{bt}\), \(P_{nbr}\), \(P_{mpi}\) and \(P_{msw}\). For parameter tuning, we consider three Euclidean instances with |V| = {250, 500, 1000}, instance numbers {3, 4, 2} and diameter bounds D = {15, 20, 25}, respectively.

Table 2 Influence of parameter values on a set of graph instances

Table 2 reports the results for various values of each parameter. In Table 2, the column Parameter denotes the parameter used in PABC_BD; the column Value denotes the candidate values of that parameter; and each subsequent pair of columns reports the best value (Best) and the average solution quality (Avg) obtained over 50 runs. The best values are highlighted in bold. Note that graph instances are denoted by \(NV\_IN\), where NV is the number of vertices of the graph and IN is the instance number from the set of five instances for |V| = {250, 500, 1000}. Also note that, during parameter tuning, one candidate value of the considered parameter is tested at a time, while the values of the other parameters are kept fixed. Based on the performance of PABC_BD on the combinations of parameter values reported in Table 2, the best combination was chosen as \( N E \) = 50, \( N O \) = 150, limit = 50, \(P_{bt}\) = 0.85, \(P_{nbr}\) = 0.80, \(P_{mpi}\) = 0.04 and \(P_{msw}\) = 0.05.

3.3 Comparison of PABC_BD, HGA and 2PMA with PEA-I

This subsection compares PABC_BD with PEA-I (Singh and Gupta 2007), HGA (Binh et al. 2008) and 2PMA (Vuppuluri and Patvardhan 2021) on a set of standard benchmark instances. The authors of PEA-I, HGA and 2PMA terminated their algorithms when the best-so-far solution had not improved over 100,000 generations on a given instance; therefore, to allow a fair comparison of PABC_BD, HGA and 2PMA with PEA-I, we use an equivalent termination criterion for PABC_BD, i.e. PABC_BD is executed until the best-so-far solution does not improve over 500 generations. PEA-I is a steady-state genetic algorithm that generates a single solution in each generation, so 100,000 generations generate 100,000 child solutions. Similarly, PABC_BD generates \(\approx \) 200 neighboring solutions per generation, mainly in the employed bee phase (\( N E \) = 50) and the onlooker bee phase (\( N O \) = 150); hence, 500 generations generate \(\approx \) 100,000 neighboring solutions. As for PEA-I, HGA and 2PMA, PABC_BD is given 50 runs on each problem instance.

Table 3 Results of PEA-I (Singh and Gupta 2007), HGA (Binh et al. 2008), 2PMA (Vuppuluri and Patvardhan 2021) and PABC_BD on 25 Euclidean instances having 50, 100, 250, 500 and 1000 vertices
Table 4 Results of PEA-I (Singh and Gupta 2007), HGA (Binh et al. 2008), 2PMA (Vuppuluri and Patvardhan 2021) and PABC_BD on 20 non-Euclidean instances having 100, 250, 500 and 1000 vertices

Tables 3 and 4 report the results of PEA-I, HGA, 2PMA and PABC_BD on the 25 Euclidean instances and the 20 non-Euclidean instances, respectively. In both tables, the first three columns |V|, D and N under Instances denote the number of vertices of each Euclidean/non-Euclidean instance, the diameter bound and the instance number from each set of five instances; for each of PEA-I, HGA, 2PMA and PABC_BD, the next four columns Best, Avg, SD and ATET denote the best solution obtained in terms of fitness, the average solution quality, the standard deviation and the average total execution time, respectively. In Tables 3 and 4, the best value (Best) and the best average solution quality (Avg) among PEA-I, HGA, 2PMA and PABC_BD are highlighted in bold. Comparing PABC_BD with PEA-I on the 25 Euclidean instances in Table 3, PABC_BD, in terms of Best, is better on 14, equal on 9 and worse on 2, and, in terms of Avg, is better on 24 and equal on 1. Similarly, on the 20 non-Euclidean instances in Table 4, PABC_BD, in terms of Best, is better on 13, equal on 6 and worse on 1, and, in terms of Avg, is better on 18, equal on 1 and worse on 1, compared with PEA-I. Comparing PABC_BD with HGA on the 25 Euclidean instances, PABC_BD, in terms of Best, is better on 11, equal on 2 and worse on 12, and, in terms of Avg, is better on 14, equal on 4 and worse on 7. On the 20 non-Euclidean instances, PABC_BD, in terms of Best, is better on 6, equal on 4 and worse on 10, and, in terms of Avg, is better on 18, equal on 1 and worse on 1, compared with HGA. Comparing PABC_BD with 2PMA on the 25 Euclidean instances, PABC_BD, in terms of Best, is better on 13, equal on 5 and worse on 7, and, in terms of Avg, is better on 20, equal on 3 and worse on 2. On the 20 non-Euclidean instances, PABC_BD, in terms of Best, is better on 6, equal on 2 and worse on 12, and, in terms of Avg, is better on 8, equal on 1 and worse on 11, compared with 2PMA. Considering the overall results of PABC_BD and 2PMA on all 45 instances (25 Euclidean and 20 non-Euclidean) in Tables 3 and 4, PABC_BD is more robust than 2PMA, since, in terms of Avg, it is better on 28, equal on 4 and worse on 13.

Since the configuration of the computer system (Pentium 4, 2.4 GHz with 512 MB RAM) used for PEA-I (Singh and Gupta 2007) differs from that of the computer system (Intel Core i5, 3.3 GHz \(\times \) 4 with 4 GB RAM) used for PABC_BD, a scaling-factor mechanism based on the public CPU benchmark provided by PassMark Software (https://www.cpubenchmark.net/singleThread.html) is used to allow a fair comparison of computational times. According to this website, the CPU Mark of the computer system used for PABC_BD is 2114, whereas that of the computer system used for PEA-I is 561; a higher CPU Mark indicates higher performance. Hence, the computer system used for PABC_BD is \(\approx \) 3.8 times faster (scaling factor) than the one used for PEA-I (see Table 5). Moreover, the authors of PEA-I reported only the average time till best (ATTB) needed by PEA-I to reach the best solution, not its average total execution time (ATET). Even taking this difference between ATTB and ATET (see Tables 3 and 4) into account, we can safely say that PABC_BD converges much faster towards better solution quality than PEA-I.

Table 5 Scaling factors of various computer systems used in existing approaches with respect to the computer system (Intel Core i5 processor 3.3 GHz \(\times \) 4) used in PABC_BD
Table 6 Results of VNS (Gruber et al. 2006), EA (Gruber et al. 2006), ACO (Gruber et al. 2006), LRS (Lucena et al. 2010), LABD (Torkestani 2012) and PABC_BD on 20 Euclidean instances having 100, 250, 500 and 1000 vertices

As noted in Sect. 1, the authors of HGA (Binh et al. 2008) did not discuss or report the computational time needed to obtain the solution quality on each Euclidean and non-Euclidean instance (see the entries NR, i.e. Not Reported, in Tables 3 and 4). One can also infer that HGA obtains overall better results than PEA-I (Singh and Gupta 2007) at the cost of much higher computational time. Since our proposed PABC_BD is much faster than PEA-I in finding better solution quality, it follows that PABC_BD is even faster than HGA in this respect. It is clear from the results in Tables 3 and 4 that PABC_BD is comparable with HGA in terms of Best on the Euclidean instances, but better than HGA in terms of Avg on the Euclidean instances as well as in terms of Best and Avg on the non-Euclidean instances. Since the authors of 2PMA (Vuppuluri and Patvardhan 2021) did not report the computational time (ATTB or ATET) on each instance when comparing with PEA-I in their paper, we do not compare our proposed PABC_BD with 2PMA in terms of computational time.

3.4 Comparison of PABC_BD with VNS, EA, ACO, LRS and LABD

The performance of VNS (Gruber et al. 2006), EA (Gruber et al. 2006), ACO (Gruber et al. 2006) and LRS (Lucena et al. 2010) was tested only on a set of Euclidean benchmark instances. For a fair comparison, we also compare PABC_BD with these approaches on the Euclidean instances only. In addition, the authors of VNS, EA, ACO and LRS used different termination criteria. In (Gruber et al. 2006), the authors used CPU time limits that depend on the size of the instances: 2000, 3000 and 4000 seconds for instances with |V| = 100, 250 and 500, respectively, and 1000 seconds for instances with |V| = 1000; in addition, the VNS and ACO approaches were terminated after 1000 iterations without further improvement of the best-so-far solution. Similarly, the authors of LRS used CPU time limits of 200, 1500, 3000 and 4500 seconds as the termination criterion for instances with |V| = 100, 250, 500 and 1000, respectively. Besides these different termination criteria, the configurations of the computer systems used for VNS, EA and ACO (Pentium 4 with 2.8 GHz) and for LRS (Pentium 4 with 2.0 GHz) also differ from that of the computer system used for PABC_BD.

To relate the performance of the computer system used for PABC_BD to those used for the VNS, EA, ACO and LRS approaches, we again use scaling factors derived from the public CPU benchmark provided by PassMark Software. The computer system used for PABC_BD is \(\approx \) 3.4 times faster (scaling factor) than the one used for the VNS, EA and ACO approaches (Gruber et al. 2006), and \(\approx \) 4.6 times faster than the one used for LRS (Lucena et al. 2010) (see Table 5). Given these different termination criteria and computer system configurations, it is difficult to compare PABC_BD exactly with the VNS, EA, ACO and LRS approaches in terms of solution quality and computational time. To mitigate this difficulty, we use a termination criterion for PABC_BD that differs from the one used for the comparison with PEA-I (see Sect. 3.3): PABC_BD is executed until the best-so-far solution does not improve over 5000 generations. Like the VNS, EA and ACO approaches, PABC_BD is given 50 runs on each problem instance.

For LABD (Torkestani 2012), the author performed 100 runs on the same machine as used in (Lucena et al. 2010) (i.e. a Pentium 4 with 2.0 GHz). However, the results of LABD (Torkestani 2012) are reported only in terms of the best value and the average total execution time. To compare PABC_BD with LABD, we use the same termination criterion for PABC_BD as discussed in the previous paragraph.

Table 7 Results of statistical comparison for the best value (Best)

Table 6 reports the results of the VNS, EA, ACO, LRS, LABD and PABC_BD approaches on 20 Euclidean instances. In this table, the first three columns |V|, D and N under Instances denote the number of vertices of each Euclidean instance, the diameter bound and the instance number from each set of five instances. For the VNS, EA, ACO and PABC_BD approaches, the four columns Best, Avg, SD and ATET denote the best solution obtained in terms of fitness, the average solution quality, the standard deviation and the average total execution time, respectively; for LRS, the three columns Best, Avg and ATET denote the best solution, the average solution quality and the average total execution time; and for LABD, the two columns Best and ATET denote the best solution and the average total execution time.

Table 6 highlights the best value (Best) and the best average solution quality (Avg) among the VNS, EA, ACO, LRS, LABD and PABC_BD approaches in bold. Comparing PABC_BD with VNS on the 20 Euclidean instances, PABC_BD, in terms of Best, is better on 16 and equal on 4; in terms of Avg, PABC_BD is better than VNS on all instances. Compared with EA, PABC_BD, in terms of Best, is better on 13, equal on 6 and worse on 1; in terms of Avg, it is better on 19 and worse on 1. Similarly, compared with the ACO approach, PABC_BD, in terms of Best, is better on 12, equal on 4 and worse on 4; in terms of Avg, it is better on 17 and worse on 3. Compared with LRS, PABC_BD is better on all instances in terms of both Best and Avg. Finally, compared with LABD, PABC_BD, in terms of Best, is better on 15, equal on 3 and worse on 2.

In terms of computational time (ATET) in Table 6, even taking into account that the computer system used for PABC_BD is \(\approx \) 3.4 times faster (scaling factor) than the one used for the VNS, EA and ACO approaches and \(\approx \) 4.6 times faster than the one used for the LRS and LABD approaches, as discussed in the second paragraph of Sect. 3.4, we can safely say that PABC_BD converges much faster towards better solution quality than the VNS, EA, ACO, LRS and LABD approaches.

3.5 Statistical analysis

For the statistical analysis, we perform a non-parametric Wilcoxon signed-rank test (García et al. 2009) on each data-set in order to compare PABC_BD with the PEA-I, VNS, EA, ACO, LRS and LABD approaches in terms of the best value (Best) and the average solution quality (Avg). We used the Wilcoxon Signed-Rank Test Calculator available at http://www.socscistatistics.com/tests/signedranks/Default2.aspx. For this test, we first compute the difference between the results obtained by the two compared approaches on each instance of a data-set and then rank the differences according to their absolute values. For each data-set, \(R^+\) is the sum of the ranks on which the second approach outperforms the first, while \(R^-\) denotes the sum of the ranks for the opposite case. If \(min\{R^+, R^-\}\) is less than or equal to the critical value, the test detects a significant difference between the two compared approaches. The critical values are taken from the statistical table available at http://users.stat.ufl.edu/~athienit/Tables/tables. Tables 7 and 8 report the results of the Wilcoxon signed-rank test with a significance level of \(\alpha = 0.05\) for the best value (Best) and the average solution quality (Avg) over 50 runs, respectively.
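For illustration only, the following C sketch shows how the rank sums \(R^+\) and \(R^-\) underlying this test can be computed from the paired differences, with zero differences discarded and tied absolute differences receiving the average rank; it is not the online calculator used here, and all names are hypothetical.

#include <stdlib.h>
#include <math.h>

/* Sort the absolute differences and accumulate the rank sums R+ and R-;
   min(R+, R-) is then compared with the tabulated critical value.         */
static int cmp_abs(const void *a, const void *b)
{
    double x = fabs(*(const double *)a), y = fabs(*(const double *)b);
    return (x > y) - (x < y);
}

void wilcoxon_ranks(const double *d, int n, double *Rplus, double *Rminus)
{
    double *nz = malloc(n * sizeof *nz);
    int m = 0;
    for (int i = 0; i < n; i++)               /* discard zero differences    */
        if (d[i] != 0.0) nz[m++] = d[i];

    qsort(nz, m, sizeof *nz, cmp_abs);        /* rank by |d_i|               */

    *Rplus = *Rminus = 0.0;
    for (int i = 0; i < m; ) {
        int j = i;                            /* block of tied |d_i|         */
        while (j < m && fabs(nz[j]) == fabs(nz[i])) j++;
        double rank = (i + 1 + j) / 2.0;      /* average rank of the block   */
        for (int k = i; k < j; k++) {
            if (nz[k] > 0) *Rplus  += rank;
            else           *Rminus += rank;
        }
        i = j;
    }
    free(nz);
}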

Table 8 Results of statistical comparison for the average solutions quality (Avg)
Fig. 3 Convergence curve of PABC_BD for the average solution quality

In Tables 7 and 8, the column Data-set denotes the name of each data-set; the column Comparison denotes the two compared approaches; and the next five columns Sample Size, Critical Value, \(R^+\), \(R^-\) and Significant denote the sample size, the critical value, \(R^+\), \(R^-\) and whether there is a significant difference between the two compared approaches ("yes" if a significant difference exists, otherwise "no") for Best and Avg on each data-set. The results in Tables 7 and 8 confirm that PABC_BD performs significantly better than the other approaches, because in all comparisons the value of \(R^-\) (corresponding to the other approach) is lower than the value of \(R^+\) (corresponding to PABC_BD) as well as the critical value.

3.6 Analysis of convergence behavior of PABC_BD

This subsection analyzes the convergence behavior of PABC_BD in order to gain useful insight into the algorithm. For this, we consider four instances, i.e. \(100\_1\), \(250\_1\), \(500\_2\) and \(1000\_5\), picked randomly from the set of Euclidean instances. Figure 3a–d show the average solution quality over 50 runs versus the number of generations on the considered instances. In each figure, PABC_BD is executed over 5000 generations; the X-axis represents the number of generations, and the Y-axis represents the average solution quality obtained over 50 runs after the corresponding number of generations. One can observe from Fig. 3a–d that PABC_BD converges rapidly towards better solution quality as the number of generations increases.

3.7 Collective picture

Table 9 gives a collective picture of the overall comparison of PABC_BD with the PEA-I, VNS, EA, ACO, LRS and LABD approaches on the Euclidean and non-Euclidean instances in terms of the best value and the average solution quality obtained. From Table 9, one can observe the superiority of PABC_BD over the PEA-I, VNS, EA, ACO, LRS and LABD approaches in terms of the best value (Best) and the average solution quality (Avg).

Table 9 Overall comparison of PABC_BD with PEA-I (Singh and Gupta 2007), VNS (Gruber et al. 2006), EA (Gruber et al. 2006), ACO (Gruber et al. 2006), LRS (Lucena et al. 2010) and LABD (Torkestani 2012)
Table 10 Comparison of PABC_BD with \(Cd^A\), \(Cd^B\) (Gruber and Raidl 2009) and ACO (Gruber et al. 2006)

3.8 Performance of PABC_BD on 15 Euclidean instances of 1000 nodes with various diameter bounds

This subsection discusses the performance of PABC_BD on the 15 Euclidean Steiner tree instances of Beasley's OR-Library with 1000 nodes, for diameter bounds ranging from 4 to 25, for which results obtained by the two hierarchical clustering heuristics \(Cd^A\) and \(Cd^B\) and by the ACO approach are reported in (Gruber and Raidl 2009) (see the literature survey in Sect. 1). As mentioned in Gruber and Raidl (2009), being problem-specific heuristics, \(Cd^A\) and \(Cd^B\) generally take only a few seconds to find high-quality solutions, whereas the ACO approach taken from (Gruber et al. 2006) needs computation times of one hour and more. Table 10 reports the results of \(Cd^A\), \(Cd^B\), ACO and PABC_BD on these Euclidean instances, where the results of \(Cd^A\), \(Cd^B\) and the ACO approach are taken from (Gruber and Raidl 2009). In this table, the first column denotes the instances with 1000 nodes and the various diameter bounds; for \(Cd^A\), \(Cd^B\) and ACO, each pair of columns Avg and SD denotes the average solution quality and the standard deviation obtained over 30 runs; and the last three columns Avg, SD and ATET denote the average solution quality, the standard deviation and the average total execution time of PABC_BD. Table 10 shows that the performance of PABC_BD against \(Cd^A\), \(Cd^B\) and the ACO approach deteriorates as the diameter bound decreases. A possible reason for the poor performance of PABC_BD on smaller diameter bounds is the use of the RGH decoding: as mentioned in (Gruber and Raidl 2009), RGH, in comparison with the clustering heuristics, lacks the ability to find a good backbone. For more details, readers are referred to (Gruber and Raidl 2009).

4 Conclusion

In this paper, we presented an artificial bee colony algorithm (PABC_BD) for the bounded diameter minimum spanning tree (BD-MST) problem. The proposed approach employs a permutation encoding of solutions. To exploit this encoding structure, two neighborhood strategies, based on insertion on multiple positions and swap on multiple positions, are applied. PABC_BD has been tested on a set of benchmark instances for various diameter bounds. The computational results demonstrate the effectiveness and robustness of the proposed PABC_BD, particularly as the diameter bound increases. On these benchmark instances, PABC_BD finds better solution quality much faster than state-of-the-art metaheuristic approaches, except for small diameter bounds.

The performance of PABC_BD against the two hierarchical clustering heuristics on the 15 Euclidean instances of 1000 nodes suggests that, in the near future, we should focus our efforts on developing a better decoder that can be applied to decode a solution encoded as a linear permutation of all vertices of V when designing a metaheuristic technique for this problem.