1 Introduction

With the rapid development of technology, the demand for connecting devices through networks keeps increasing. Consequently, optimizing a network design so that it saves construction cost while still allowing devices to communicate quickly and simultaneously has attracted considerable interest from the research community. The common problem in fiber-optic network design is to guarantee connectivity and the speed of sending and receiving information between devices while optimizing the cost of network construction.

All of the above problems can be modeled as optimization problems on a fully connected graph. In particular, each device is considered a vertex of the graph, and each pair of devices is assigned a weight representing the cost of the connection between them. To ensure that there is always a connection between any two devices, designing the network as a spanning tree of the graph is a suitable solution. When the number of devices is large, many studies aggregate similar devices into clusters, which are then used to design a larger network system. In practice, numerous types of clustered network systems are designed with different goals in terms of optimizing communication costs. Those network systems are then modeled as clustered spanning trees to ensure inter-device connectivity and inter-cluster communication. In recent studies on problems with clustered tree structures, D’Emidio et al. [8] defined the Clustered Shortest-Path Tree Problem (CluSPT), and Chen-Wan Lin et al. [14] introduced the Minimum Inter-cluster Routing Cost Clustered Tree Problem (InterCluMRCT). The solution to each of these two problems is a spanning tree of a graph which satisfies the constraint that the sub-graph induced by each cluster is also a spanning tree of that cluster. Nevertheless, the goals of the two problems differ: while the CluSPT seeks a spanning tree that minimizes the total length of all paths from a given source vertex to the other vertices, the InterCluMRCT tries to minimize the total communication cost among vertices of different clusters. In this study, we aim to design an algorithm that simultaneously solves multiple network problems which can be transformed into clustered tree problems with different objective functions. The effectiveness of the algorithm is then examined on two examples of clustered tree problems, the CluSPT and the InterCluMRCT. Based on the similarities of their solution structures and constraints, we build a mechanism to solve these two problems simultaneously, with the broader purpose of solving various types of clustered network design problems.

Recently, many studies have proposed multitasking algorithms that find solutions to multiple different combinatorial optimization problems at the same time. In particular, the Multifactorial Evolutionary Algorithm (MFEA) [2] is emerging as one of the most effective variants of the Evolutionary Algorithm (EA). MFEA, proposed by A. Gupta et al. [11], can solve multiple independent optimization problems simultaneously using a single population of solutions in a unified search space. Since part of the genetic material of individuals in the unified search space is shared among different problems, information is constantly exchanged among problems inside MFEA through changes to the shared genetic material. Based on these advantages, this research applies the MFEA to simultaneously solve two problems, the CluSPT and the InterCluMRCT. We propose a method of combining these two problems with different dimensions into the same search space. This is feasible because the solution structures of the two problems share similar properties, so exchanging solution components assists the search for better solutions while exploring the solution space.

There are several ways to represent a solution as a spanning tree, such as edge-sets [20] and characteristic vectors [9]; among them, the Cayley Code [19, 24] has been investigated for a long time and has attracted considerable attention from the research community because of its three main advantages. The Cayley Code represents each tree with n labeled vertices as a string of \(n-2\) labels, such that each tree corresponds to a unique string. Firstly, the Cayley Code encodes a spanning tree more easily than other methods: instead of representing a chromosome as a list of edges, as in the edge-set method [20], the chromosome is encoded as a sequence of numbers. Secondly, it takes full advantage of existing evolutionary operators such as one-point crossover, two-point crossover, and swap mutation. Lastly, various effective algorithms for many problems have been built on the Cayley Code [19, 24].

With these outstanding advantages, the Cayley Code is chosen as the solution representation for the two above-mentioned problems in our research. We introduce a method to represent a solution of a clustered tree problem as a sequence of integers based on the Cayley code. The spanning tree of each cluster is encoded as a segment of a Cayley string, and the segments are arranged next to each other to form a solution for the clustered tree problem. A valid solution can be constructed by decoding all Cayley strings of the encoded solution and combining them. As a result, some operations on the solution become less complex, since performing operations on a string of numbers is simpler than on a graph. This is especially useful for EAs, because evolutionary operations such as crossover and mutation are applied repeatedly in every generation, so simplifying these operations reduces the computational time of the entire algorithm. This paper also investigates the effectiveness of different types of Cayley Codes to find the most appropriate representations for solving clustered problems.

Consequently, we propose an MFEA to solve clustered tree problems based on the Cayley Code (called CC-MFEA). The major contributions of this work are as follows:

  1. Propose an effective encoding mechanism for representing the solutions of many clustered tree problems as a unified representation in the Unified Search Space (USS), in which different types of Cayley Codes (the Prüfer Code, the Dandelion Code, and the Blob Code) can be applied for encoding trees.

  2. Propose an effective decoding method to obtain the solution corresponding to each specific task.

  3. Propose evolutionary operators for CC-MFEA.

  4. Analyze the effectiveness of the proposed algorithm on several different clustered tree problems.

  5. Analyze the effectiveness of the Cayley Code for solving the clustered tree problems.

The rest of this paper is organized as follows. Section 2 introduces related works and Sect. 3 provides foundation knowledge of the Cayley Code. Section 4 presents the notations and definitions used to formulate the problems. The proposed CC-MFEA for the clustered tree problems is elaborated in Sect. 5. Section 6 explains the setup of our experiments and reports the computed results. The paper concludes in Sect. 7 with discussions on future extensions of this research.

2 Related works

2.1 Multitasking in evolutionary algorithms

The Evolutionary Algorithm (EA) [1] is one of the most well-known algorithms for global optimization and is a part of evolutionary computation. The mechanisms of EA are inspired by biological evolution. Being able to solve complex problems, EA has been successfully applied in many fields such as engineering, art, economics, marketing, genetics, operations research, robotics and social sciences. Despite its various strengths, EA has several weaknesses, one of which is its limited ability to perform multiple tasks at a time. Therefore, EA is not well suited to some computational models such as cloud computing. To overcome this restriction, Abhishek Gupta et al. [11] proposed a new paradigm of EA, called the Multifactorial Evolutionary Algorithm, which can perform many tasks at a time on a single search space. R. Chandra et al. [5] applied the MFEA to train neural networks; in this model, the required tasks are neural network topologies with different numbers of hidden neurons. The authors use the encoding and decoding rules in [11] for a solution representation whose dimensionality is that of a single task (in this case, the number of hidden neurons). To evaluate the proposed algorithm, MFEA is compared with evolutionary single-task learning (ESTL) on the n-bit Parity Problem. The results showed that MFEA is better than ESTL in all cases.

In Da et al. [6], the authors used MFEA to solve a single-objective problem by transforming it into a multi-objective one. With this method, MFEA is expected to overcome local optima and exploit better areas of the search space. To demonstrate the competence of the proposed MFEA, the authors compared it with the classic single-objective evolutionary algorithm (SOEA) and the standard multi-objective evolutionary algorithm (MOEA). The symmetric Traveling Salesman Problem is selected as the test problem, with order crossover, random swap mutation and 2-opt local search. The results show that by overcoming local optima, the proposed algorithm obtains the optimal solutions on 4 out of the 5 runs, while SOEA gets trapped at a local optimum. The proposed algorithm performs at least as well as SOEA in 20 cases and better in 11 cases; in comparison with MOEA, it performs at least as well on 18 out of the 20 instances and better in 11 cases. Concerning convergence trends, the convergence speeds of the proposed algorithm and SOEA are relatively fast and equal to each other in the first 25 generations, but in the remaining generations the proposed algorithm continues to successfully explore latent regions of the search space, while SOEA gets stuck.

A. Gupta et al. [10] used the characteristics of evolutionary multitasking in bi-level optimization. The authors assume that lower-level optimization tasks correspond to neighboring upper-level population members, so the individuals in the upper-level population are clustered into groups. To compare the proposed algorithm with the basic Nested Bi-Level Evolutionary Algorithm (N-BLEA), ten-dimensional variants of the SMD benchmarks are selected. The representation is a random-key encoding scheme, and the selected crossover and mutation operators are the SBX operator and polynomial mutation, respectively. The experimental results show that N-BLEA is worse than the proposed algorithm with respect to the number of function evaluations at the lower level. The authors also select a composites manufacturing problem for evaluating the proposed algorithm; the results show that the convergence trend of the proposed algorithm is considerably hastened compared with that of N-BLEA.

In Gupta et al. [12], the authors further extended the concept of multitasking into a Multi-Objective Multifactorial Optimization (MO-MFO) paradigm, in which each constitutive task is a Multi-Objective Optimization (MOO) problem that contributes a distinct factor influencing the evolutionary search process. One of the main characteristics that makes evolutionary multitasking appealing for multi-objective problems is the implicit parallelism provided by a unified population, which may allow simultaneous convergence toward the Pareto front. Accordingly, the authors proposed the Multi-Objective Multi-Factorial Evolutionary Algorithm (MO-MFEA) and compared its performance with the classical NSGA-II, which deals with a single task at a time. To demonstrate the improvement in the convergence trends of the proposed algorithm, experiments on several synthetic benchmark functions were conducted with identical solution encoding, genetic operators, and parameter settings for the proposed MO-MFEA and the NSGA-II counterpart. The results showed that when a problem instance was solved separately by NSGA-II, the IGD progressed slowly and tended to get trapped in a local Pareto front. In contrast, the implicit genetic transfer among the multiple task instances being integrated provided a strong impetus to the evolutionary search process, which explains the accelerated convergence characteristics achieved by the MO-MFEA.

Y.S. Ong et al. [17] pointed out from a practical point of view that the potential applications of effective multitask optimization are truly remarkable: by taking advantage of the underlying commonalities between different optimization tasks, multitasking provides scope for improved performance in real-world problem solving. To prove this claim, the authors compared the performance of the proposed multitasking engine with the traditional methodology in 3 insightful computational studies on continuous benchmark functions, combinatorial optimization problems, and a more realistic optimization domain involving the path-planning of multiple unmanned aerial vehicles (multi-UAVs). In the first study, three continuous benchmark functions, namely the Sphere, Ackley, and Rastrigin functions, were combined pairwise to create the multitasking instances. To highlight the performance of multitasking, the proposed MFEA was compared to an elitist single-objective evolutionary algorithm (SOEA) with similar algorithmic specifications and parameter settings. Experimental results showed that the improvement in convergence characteristics achieved during multitasking is substantial in comparison to the SOEA. As a result of intersecting global optima between different functions, which leads to predominantly positive transfer of knowledge, the MFEA can exploit the evolving populations of both tasks simultaneously, efficiently overcome obstacles and converge faster, while the traditional SOEA often causes the population to get stuck at a local optimum. Interestingly, prior familiarity with the relationship between tasks is not a prerequisite for successful evolutionary multitasking, as the proposed MFEA can autonomously exploit latent complementarities even between apparently disparate optimization tasks, thereby successfully accelerating convergence trends. This claim was demonstrated in the second study on combinatorial optimization problems, in which the authors combined a QAP (Sko100a) and a JSP (la27) to be solved together. While the single-tasking method is found to get trapped in local optimum regions of the search space, the diversified search facilitated by multitasking substantially improved performance characteristics due to the constant transfer of genetic material. Finally, by tackling 2 different multi-UAV path-planning missions at once (a pair of UAVs flying through two narrow openings in a barrier, and four UAVs flying around a centrally located circular geo-fence without colliding with one another), the authors were able to improve the obtained results, showcasing the utility of multitasking in realistic domains.

2.2 Cayley code: a representation scheme for spanning tree in genetic algorithms

There are many research works on representations for trees; one of the earliest in the Cayley Code family is the Prüfer code, which Palmer et al. [18] introduced as a tree representation. In recent years, many research works have demonstrated the effect of the Cayley code in Genetic Algorithms (GA). For example, Julstrom [13] deployed the Blob Code as an individual representation in a GA; the results showed that the Blob Code has high locality and could outperform the Prüfer Code on a simple test problem. In 2005, Julstrom [13] also showed that the Blob Code was competitive with the edge-sets representation on the minimum routing cost spanning tree problem. Thompson [24] indicated that the Dandelion Code has even higher locality than the Blob Code and offers superior performance in a GA. Further details of the Cayley Code are presented in Sect. 3.

2.3 Clustered tree optimization problems and several solving approaches

Network optimization problems have many applications in real life [8, 14], and many of them have a clustered structure [8]. For that reason, many researchers have shown keen interest in such problems. One of the most interesting clustered problems is the Clustered Steiner Tree Problem (CluSteiner). Bang Ye Wu et al. [26] considered a version of the Steiner Minimum Tree whose vertices are allocated into clusters. First, the authors show that the Steiner ratios of the problem are at most 4 and at least 3. The authors also propose an approximation algorithm for solving CluSteiner: Hamiltonian paths are added to the local trees of all clusters, and then the edges belonging to the inter-cluster topology are removed. They demonstrate that the new algorithm achieves a \((2 + \rho )\)-approximation for CluSteiner in \(O(n\log n + f(n))\) time, and that when the local topologies are given, CluSteiner can be \((1 + \rho )\)-approximated in \(O(n \log n + f(n))\) time. Chen-Wan Lin and Bang Ye Wu [14] studied another variant of the clustered tree, the Minimum Routing Cost Clustered Tree Problem (CluMRCT). The new constraint in the CluMRCT is that the sub-graph induced by each cluster is a spanning tree. The authors proved that the CluMRCT with more than 2 clusters is NP-hard and proposed an approximation algorithm for solving it. The approximation algorithm finds solutions of the CluMRCT by creating a two-level star-like graph based on the R-Star graph, and the authors demonstrate that the routing cost of the R-Star graph is less than two times the cost of the optimal solution of the CluMRCT.

In Myung et al. [16], the authors studied a generalized version of the Minimum Spanning Tree Problem in which the vertices are divided into distinct groups and the goal is to find a minimum-cost tree containing exactly one vertex from each group. After proving that the Generalized Minimum Spanning Tree Problem (GMSTP) is NP-hard, the authors proposed two mathematical programming formulations and compared their linear programming relaxations when applied to GMSTP. The test problems were partitioned into two sets: a small set and a large set. The small instances were selected from generalized traveling salesman instances whose numbers of vertices ranged from 0 to 100, while the larger ones were randomly created by an algorithm. The experimental results showed that the gap between the upper and lower bounds grows when the number of groups and the number of vertices in each group increase. The authors of [8] proved that the CluSPT is NP-hard and provided an approximation algorithm for it (denoted as ALL). ALL performs well when the number of vertices in every cluster is small. Approximation approaches, including EAs, are used to solve the CluSPT when the problem has large dimensionality or at least one cluster has a great number of vertices. Recently, MFEA, a new variant of EA, has been applied to solve the CluSPT and improve the resulting solutions.

In Binh et al. [3], the authors proposed an MFEA (hereinafter E-MFEA) with new genetic operators for solving the CluSPT. The main idea of the novel genetic operators is to first construct the spanning tree for the smallest sub-graph and then construct the spanning trees for larger sub-graphs based on the spanning trees of the smaller ones. The paper also proposed a new individual encoding in the unified search space: the number of clusters of an individual equals the maximum number of clusters over all tasks, and the number of vertices of cluster i equals the maximum number of vertices of cluster i over all tasks. The experimental results point out the effectiveness of the novel MFEA. However, it also has disadvantages: to construct the spanning tree of a sub-graph, the edge set of that sub-graph must contain the edge set of the smaller sub-graph, and the individual encoding may produce no good solutions for some tasks in the initial population.

In Thanh et al. [23], the authors took advantage of the Cayley code to encode solutions of the CluSPT and proposed genetic operators based on it, borrowing ideas from genetic operators for binary and permutation representations. In that paper, an effective MFEA (hereinafter C-MFEA) for the CluSPT is proposed. The decoding method processes segments of genes by removing the genes which have invalid values. To represent the spanning tree of a graph by a Cayley code, the graph must have more than 2 vertices, so the C-MFEA can only be applied to CluSPT instances in which every cluster has more than 2 vertices. Another disadvantage is that the Cayley code applies only to complete graphs, so the C-MFEA is suitable only for complete graphs.

The authors of [4] proposed a new algorithm based on EA and Dijkstra's algorithm. The proposed algorithm decomposes the CluSPT problem into two sub-problems: the solution of the first sub-problem is found by an EA, while that of the second is found by Dijkstra's algorithm. The goal of the first sub-problem is to determine a spanning tree connecting the clusters, while that of the second is to determine the best spanning tree for each cluster. In some real-life applications, the communication cost between devices of different clusters may need to be considered more than that between devices of the same cluster. In Wu and Lin [25], the authors investigated the problem of finding a clustered spanning tree with minimum inter-cluster cost (InterCluMRCT) and denoted the variant of InterCluMRCT with k clusters as k-InterCluMRCT. After proving that the InterCluMRCT problem with 2 clusters is solvable in polynomial time, the authors showed that k-InterCluMRCT is NP-hard for any fixed \(k > 2\) and presented a 2-approximation algorithm for 3-InterCluMRCT.

Although there have been many studies on the use of Cayley encoding mechanisms, evaluating the effectiveness of the MFEA algorithm when applying the Cayley Code to solve the clustered tree problems is still a novel topic and attracts a lot of attention in the scientific community. As a result, this paper focuses on analyzing the efficiency of the MFEA algorithm when using the Cayley Code to solve the clustered tree problems.

3 Preliminaries of the Cayley code

The Cayley Code is one of the most classical representations of spanning trees. It offers a fast and simple coding of spanning trees whose decoding algorithm originates from a constructive proof of Cayley's formula, which states that the number of possible spanning trees of a complete graph on n nodes is \(n^{n-2}\). From this formula, the Cayley Code has been proposed as a tree coding scheme that represents each tree on n labelled vertices (where the underlying graph is complete) as a string of \(n-2\) vertex labels. The Cayley Code defines a bijective mapping between trees and strings, which ensures that each tree corresponds to a unique string and each string corresponds to a unique tree.
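As a quick sanity check of the formula (our own remark, not part of the original text): for \(n = 3\), Cayley's formula gives \(3^{3-2} = 3\) labeled trees, namely the three paths on the vertex set \(\{1, 2, 3\}\), each encoded by a string of length \(3-2 = 1\); for \(n = 4\), it gives \(4^{4-2} = 16\) labeled trees (12 labeled paths and 4 stars), each encoded by a string of length 2 over \(\{1, 2, 3, 4\}\).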

In the following subsections, this paper introduces 3 typical coding types in the Cayley Code families and demonstrates their corresponding encoding and decoding mechanisms.

3.1 Prüfer code

The Prüfer Code is one of the earliest examples of the Cayley Code; it was first used in 1918, when Prüfer described a mapping between a spanning tree on n nodes and a string of \(n-2\) node labels. Since then, this code has become well known and is widely considered a convenient way of representing trees, especially in genetic algorithms.

Given a tree T on n nodes, the algorithm to convert it into a Prüfer string is described as follows (a Java sketch is given after the list):

  1. Let i be the leaf (node of degree 1) with the smallest label in T and let j be the neighbour of i. Set the label of j as the rightmost digit of the Prüfer string.

  2. Remove node i and the edge (i, j) from T.

  3. Repeat steps (1) and (2) until only 2 nodes remain in T. The final Prüfer string is read from left to right.
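The following is a minimal Java sketch of the three encoding steps above. It is our own illustration rather than the authors' implementation; the class and method names, and the edge-list input format, are hypothetical.

```java
import java.util.*;

// A minimal sketch of the Prüfer encoding described above (not the authors' code).
public class PruferEncoder {

    // edges: undirected edges {u, v} of a tree on vertices labeled 1..n.
    public static int[] encode(int n, List<int[]> edges) {
        int[] degree = new int[n + 1];
        List<Set<Integer>> adj = new ArrayList<>();
        for (int i = 0; i <= n; i++) adj.add(new HashSet<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
            degree[e[0]]++;
            degree[e[1]]++;
        }
        // Keep the current leaves in a min-heap so the smallest label is taken first (step 1).
        PriorityQueue<Integer> leaves = new PriorityQueue<>();
        for (int v = 1; v <= n; v++) if (degree[v] == 1) leaves.add(v);

        int[] code = new int[n - 2];
        for (int k = 0; k < n - 2; k++) {          // step 3: repeat until only 2 nodes remain
            int i = leaves.poll();                 // smallest-labeled leaf
            int j = adj.get(i).iterator().next();  // its unique neighbour
            code[k] = j;                           // append j's label to the Prüfer string
            adj.get(j).remove(i);                  // step 2: remove node i and edge (i, j)
            degree[i] = 0;
            if (--degree[j] == 1) leaves.add(j);
        }
        return code;
    }

    public static void main(String[] args) {
        // The star with centre 1 on 4 vertices encodes to the string [1, 1].
        List<int[]> star = Arrays.asList(new int[]{1, 2}, new int[]{1, 3}, new int[]{1, 4});
        System.out.println(Arrays.toString(encode(4, star)));   // prints [1, 1]
    }
}
```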

It is also possible to convert a Prüfer string back into a unique tree via the following algorithm. Let \(a_{1}a_{2}\ldots a_{n-2}\) be a Prüfer string. In the corresponding spanning tree, each node's degree is one more than the number of times its label appears in the string. The edges of the tree are identified as follows (a Java sketch is given after the list):

  1. Scan the Prüfer string to identify each node's degree and initialize a variable t to 1.

  2. Find the node i of degree 1 with the smallest label; \((i, a_{t})\) is a spanning tree edge.

  3. Decrease the degrees of node i and node \(a_{t}\) by 1; increase t by 1.

  4. Repeat steps (2) and (3) until all nodes have degree 0, except for two nodes with degree 1, which form the spanning tree's last edge.
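Below is a minimal Java sketch of the decoding steps above, again our own illustration with hypothetical names rather than the authors' code.

```java
import java.util.*;

// A minimal sketch of the Prüfer decoding described above (not the authors' code).
public class PruferDecoder {

    // code: the Prüfer string a_1..a_{n-2}; returns the n - 1 edges of the decoded tree.
    public static List<int[]> decode(int n, int[] code) {
        int[] degree = new int[n + 1];
        Arrays.fill(degree, 1, n + 1, 1);          // each node's degree is 1 + its number of occurrences
        for (int a : code) degree[a]++;

        PriorityQueue<Integer> leaves = new PriorityQueue<>();
        for (int v = 1; v <= n; v++) if (degree[v] == 1) leaves.add(v);

        List<int[]> edges = new ArrayList<>();
        for (int t = 0; t < n - 2; t++) {
            int i = leaves.poll();                 // smallest-labeled node of degree 1 (step 2)
            edges.add(new int[]{i, code[t]});      // (i, a_t) is a spanning tree edge
            degree[i]--;                           // step 3: decrease both degrees
            if (--degree[code[t]] == 1) leaves.add(code[t]);
        }
        // Step 4: the two remaining nodes of degree 1 form the last edge.
        edges.add(new int[]{leaves.poll(), leaves.poll()});
        return edges;
    }

    public static void main(String[] args) {
        for (int[] e : decode(4, new int[]{1, 1}))   // decodes back to the star centred at vertex 1
            System.out.println(e[0] + " - " + e[1]);
    }
}
```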

Several researchers have pointed out the advantages of the Prüfer Code such as: quick and simple decoding, easy to generate a random Prüfer string and supportive in the application of conventional evolutionary operators like k-point crossover and position-by-position mutation. However, Prüfer Code actually does not support effective evolutionary search since this coding mechanism lacks the essential properties of locality (small changes in genotypes should result in small changes in the phenotypes) and heritability (solutions generated by evolutionary operators should combine characteristics of their parents).

3.2 Blob code

In 2001, another type of Cayley Code called the Blob Code was identified as a GA representation for spanning trees. The Blob mapping appears well adapted to evolutionary search: a GA using the Blob Code tends to considerably outperform one using the Prüfer Code on a simple test problem, which can be explained by the higher locality and heritability that the Blob Code exhibits under positional crossover and mutation.

In this research, we reuse the encoding and decoding algorithms for the Blob Code given in Julstrom [13]. These two algorithms, which have time complexity O(n), run much faster than the traditional algorithms for the Blob Code, which may require quadratic time in the worst case. For notational convenience, we define vertex 1 as the root vertex; vertex i is a descendant of vertex j if there is a directed path from i to j, and the successor of vertex i, succ(i), is the unique neighbour of i on the path from i to 1 in the tree. The two algorithms below convert a tree T into a corresponding Blob string \(a_{2}a_{3} \ldots a_{n-1}\) and vice versa.

Encoding algorithm for the Blob Code:

  1. Define succ(i) for each vertex \(i \in [2, n]\) and add the directed edge \(i\rightarrow succ(i)\) to a temporary tree \(T'\).

  2. Colour each vertex \(v \in [1, n]\) black if none of the descendants of v in the directed tree \(T'\) has a label exceeding v; colour the remaining vertices white.

  3. Label the black vertices as \(x_{1}< x_{2}< \ldots < x_{t}\), where \(t \in [2, n]\) is the total number of black vertices. Note that \(x_{1}=1\) and \(x_{t}=n\), since vertices 1 and n must be black.

  4. To construct the Blob string, set \(a_{i} = succ(i)\) for every white vertex \(i \in [2, n-1]\), and set \(a_{x_{i}} = succ(x_{i+1})\) for each \(i \in [2, t-1]\).

Decoding algorithm for the Blob Code:

  1. Define a digraph G with vertex set [1, n] and directed edge set \(\{(i \rightarrow a_{i}) : i \in [2, n - 1]\}\).

  2. Colour each vertex \(v \in [1, n]\) black if none of the descendants of v in the directed graph G has a label exceeding v; colour the remaining vertices white.

  3. Label the black vertices as \(x_{1}< x_{2}< \ldots < x_{t}\), where \(t \in [2, n]\) is the total number of black vertices. Note that \(x_{1}=1\) and \(x_{t}=n\), since vertices 1 and n must be black.

  4. To construct the tree T, create the edge \((i,a_{i})\) for each white vertex \(i \in [2, n-1]\), and create the edge \((x_{i},a_{x_{i-1}})\) for each \(i \in [3, t]\), with the final edge being \((x_{2}, 1)\).

However, EAs using the Blob Code representation still do not perform as well as EAs using other representations (such as edge-sets) on most problems whose search spaces consist of spanning trees.

3.3 Dandelion code

Recently, the Dandelion Code has been proposed as a strong alternative, offering efficient performance for representing trees in a GA. Researchers have shown that this coding scheme exhibits even higher locality than the Blob Code and could potentially outperform other current Cayley code representations.

Given a Dandelion string \(a_{2}a_{3}...a_{n-1}\), the algorithm to decode it into a tree T works as follows (a Java sketch is given after the list):

  1. Define a function \(f: [2, n - 1] \rightarrow [1, n]\) such that \(f(i) = a_{i}\) for each \(i \in [2, n-1]\).

  2. Determine the cycles associated with the function f and write each cycle so that its largest element is rightmost. Arrange the cycles from left to right in decreasing order of the largest element of each cycle.

  3. Create a list L containing all the elements of every cycle, in the order obtained after the arrangement in step 2.

  4. To construct the tree T from a set of n isolated vertices, build a path from 1 to n whose intermediate nodes are the nodes of the list L in left-to-right order. Then create the edge \((i, a_{i})\) for every \(i \in [2, n-1]\) not occurring in the list L.
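The following minimal Java sketch follows the four decoding steps above; it is our own illustration (not the authors' code), and it assumes the code value \(a_i\) is stored at index i of the input array.

```java
import java.util.*;

// A minimal sketch of the Dandelion decoding described above (not the authors' code).
public class DandelionDecoder {

    // a[i] holds the code value a_i for i = 2..n-1 (other positions unused); returns the tree edges.
    public static List<int[]> decode(int n, int[] a) {
        // Step 2: find the cycles of f(i) = a[i], i.e. vertices whose f-orbit returns to themselves.
        int[] state = new int[n + 1];              // 0 unvisited, 1 on current walk, 2 finished
        List<List<Integer>> cycles = new ArrayList<>();
        for (int start = 2; start <= n - 1; start++) {
            if (state[start] != 0) continue;
            List<Integer> walk = new ArrayList<>();
            int v = start;
            while (v >= 2 && v <= n - 1 && state[v] == 0) {
                state[v] = 1;
                walk.add(v);
                v = a[v];
            }
            if (v >= 2 && v <= n - 1 && state[v] == 1) {             // the walk closed a new cycle
                List<Integer> cycle = new ArrayList<>(walk.subList(walk.indexOf(v), walk.size()));
                int maxPos = cycle.indexOf(Collections.max(cycle));
                Collections.rotate(cycle, cycle.size() - 1 - maxPos); // largest element rightmost
                cycles.add(cycle);
            }
            for (int u : walk) state[u] = 2;
        }
        // Arrange cycles in decreasing order of their largest (last) element and flatten into L (step 3).
        cycles.sort((c1, c2) -> c2.get(c2.size() - 1) - c1.get(c1.size() - 1));
        List<Integer> L = new ArrayList<>();
        for (List<Integer> c : cycles) L.addAll(c);

        // Step 4: build the path 1 -> L -> n, then the edge (i, a_i) for every i not in L.
        List<int[]> edges = new ArrayList<>();
        Set<Integer> inL = new HashSet<>(L);
        int prev = 1;
        for (int v : L) { edges.add(new int[]{prev, v}); prev = v; }
        edges.add(new int[]{prev, n});
        for (int i = 2; i <= n - 1; i++)
            if (!inL.contains(i)) edges.add(new int[]{i, a[i]});
        return edges;
    }
}
```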

To encode a tree T into its corresponding Dandelion string, the encoding algorithm works simply by reversing the above procedure:

  1. Find the unique path from vertex 1 to vertex n in T, and let L be the ordered list of its intermediate nodes.

  2. Split the list L into cycles by ending a cycle at every element that is larger than all elements to its right in L; in this way the largest element of each cycle is rightmost and the cycle maxima decrease from left to right, exactly as produced by the decoding procedure.

  3. The Dandelion Code corresponding to the tree T is the unique string \(a_{2}a_{3}...a_{n-1}\) which satisfies: (a) for each node \(i \in [2, n-1]\) that appears in the list L, set \(a_{i} = f(i)\), where f is the function whose cycles are those obtained in step 2 (i.e. \(a_i\) is the element following i in its cycle, wrapping around at the cycle's end); (b) for each \(i \in [2, n-1]\) that does not occur in the list L, \(a_{i}\) is the unique neighbour of i on the path from vertex i to vertex n in the tree T.

Fig. 1 An example of applying 3 types of Cayley Code for representing a spanning tree

An example of applying 3 above-mentioned types of Cayley Code for representing a spanning tree is demonstrated in Fig. 1. On the left of this figure is a spanning tree on 8 nodes with its corresponding Prüfer string, Blob string and Dandelion string shown respectively on the right.

4 Problem formulation

In this paper, a graph is a simple, connected and undirected graph. For a graph \(G = (V, E, w)\), V and E are the vertex and the edge sets, respectively, and w is the non-negative edge length function. An edge between 2 vertices u and v is denoted by (u, v), and its weight is denoted by w(u, v).

For a vertex subset U, the sub-graph of G induced by U is denoted by G[U]. For a vertex set V, a collection \(C~=~\{C_i | 1 \le i \le h\}\) of subsets of V is a partition of V if the subsets are mutually disjoint and their union is exactly V. A path of G is simple if no vertex appears more than once on the path. This paper only considers simple paths.

For a given spanning tree T of \(G = (V, E, w)\) and \(u, v \in V\), let \(d_T(u, v)\) denote the length of the shortest path between u and v on T.

The local root of a cluster, denoted by \(L_r\), is the vertex of that cluster which connects to a vertex in another cluster.

The CluSPT [8] is defined as follows:

Input:

- A weighted undirected graph \(G = (V, E, w)\).

- Vertex set V is partitioned into h clusters \(\{C_1, C_2, \ldots , C_h\}\).

- A source vertex \(s \in V\).

Output:

- A spanning tree T of G.

- Sub-graph \(T[C_i] (i = 1,\ldots , h)\) is a connected graph.

Objective:

\(\displaystyle \sum _{v \in V} d_{T}(s,v) \rightarrow \) min

The InterCluMRCT [14] is defined as follows:

Input:

- A weighted undirected graph \(G = (V, E, w)\).

- Vertex set V is partitioned into h clusters \(\{C_1, C_2, \ldots , C_h\}\).

Output:

- A spanning tree T of G.

- Sub-graph \(T[C_i] (i = 1,\ldots , h)\) is a connected graph.

Objective:

\(\displaystyle \sum _{i=1}^{h} \sum _{j \ne i} d_{T}\left( C_{i}, C_{j}\right) \rightarrow \) min

where \(d_{T}\left( C_{i}, C_{j}\right) =\sum _{u \in C_{i}} \sum _{v \in C_{j}} d_{T}(u, v)\).

Fig. 2 An example of valid and invalid solutions for the CluSPT

Figure 2 illustrates valid and invalid solutions of the CluSPT and the InterCluMRCT. Figure 2(a) shows the input graph G with 6 clusters, 18 vertices and vertex 1 as the source vertex. Figure 2(b) presents a valid solution of the CluSPT and the InterCluMRCT. In Fig. 2(c), vertices 6 and 7 in cluster 2 are not connected, so this solution violates the second condition of the output of the CluSPT and InterCluMRCT problems.

5 Proposed algorithm

This section presents a novel approach using the MFEA to deal with the clustered tree problems. The algorithm includes a new individual encoding and new evolutionary operators: a new population initialization, a new crossover operator and a new mutation operator.

5.1 MFEA scheme to solve clustered tree problems

The Multifactorial Evolutionary Algorithm is used to find the solutions of k clustered tree problems \(G_i=(V_i, E_i, w_i, C_i)\), \(i = 1,\ldots, k\), where \(V_i\), \(E_i\), \(w_i\) and \(C_i\) are the set of vertices, the set of edges, the weight matrix and the set of clusters, respectively.

\(V_i\) is partitioned into \(m_i\) \((i=1,\ldots , k)\) clusters \(C_i=\{C^1_i, C^2_i,\ldots , C_i^{m_i}\}\).

For each clustered tree problem \(G_i\), we decompose the problem into two subproblems: the first aims to determine the spanning tree of each cluster (also called a local tree), while the second intends to construct a spanning tree connecting all of the clusters (also called the global tree). The Global Graph is the graph \(G'_i = (V'_i, E'_i)\) in which each cluster of \(G_i\) is represented by a vertex of \(V'_i\), and an edge connects two vertices of \(G'_i\) if there exists at least one edge of \(G_i\) connecting the corresponding clusters.
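As a small illustration of this construction, the sketch below (our own code, with hypothetical names) builds the adjacency matrix of the Global Graph from a cluster assignment and the edge list of the input graph.

```java
import java.util.*;

// Builds the Global Graph: one vertex per cluster, an edge between two clusters
// whenever at least one edge of the input graph connects them (our own sketch).
public class GlobalGraphBuilder {

    // m: number of clusters; clusterOf[v]: 0-based cluster index of vertex v;
    // edges: undirected edges {u, v} of the input graph.
    public static boolean[][] build(int m, int[] clusterOf, List<int[]> edges) {
        boolean[][] adj = new boolean[m][m];
        for (int[] e : edges) {
            int cu = clusterOf[e[0]], cv = clusterOf[e[1]];
            if (cu != cv) {                    // only inter-cluster edges contribute
                adj[cu][cv] = true;
                adj[cv][cu] = true;
            }
        }
        return adj;
    }
}
```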

To use CC-MFEA for solving many clustered tree problems simultaneously, a new individual encoding scheme based on the Cayley code and a two-level approach are proposed. Cayley codes are used to represent the local trees and the global tree, with the Cayley codes of the local trees placed before that of the global tree in an individual. We also provide a mechanism for decoding an individual into a solution for each clustered tree problem, as well as efficient evolutionary operators.

The pseudocode of our approach is presented in Algorithm 1. It should be noted that, to compute the fitness of an individual \(ind_j\) for a given task, both the local trees and the global tree must be constructed for the corresponding solution \(s_j\) of the clustered tree problem (lines 5 and 27 of Algorithm 1). The algorithm then computes the cost of the solution \(s_j\), and finally the fitness of the individual \(ind_j\) is derived from that cost (lines 6 and 28 of Algorithm 1).

Algorithm 1 (pseudocode of the proposed CC-MFEA)

5.2 Individual encoding in unified search space

Individuals in the unified search space are encoded through two stages: the first stage will rearrange clusters of input graphs to reduce resource consumption, while the second stage encodes the solutions of all tasks into a unified representation.

5.2.1 Encoding solution of input graphs

In order to combine many clustered tree problems into unified search space, we define an individual IND which is a graph \(G = (V, E, C)\) with properties as follows:

  • The number of clusters of IND is equal to the maximum number of clusters over all graphs \(G_i\), denoted by N.

  • The number of vertices in the tth cluster of IND is equal to the maximum number of vertices in the tth cluster over all tasks, denoted by \(n_t\).

The encoding procedure transforms an individual into a string of genes by using Cayley encoding algorithms to represent all of the local trees and the global tree. Moreover, the local trees and the global tree are combined via the set of vertices that are the roots of the local trees. The components of an individual are arranged as follows:

  • The spanning tree of the tth cluster is represented as the tth segment, denoted by \(s_t\) (if a task has fewer than t clusters, that task is ignored for this segment). The values of the genes in each segment are in the range from 1 to \(n_t\).

  • The spanning tree of the Global Graph is represented as a segment called the G-Seg.

  • The genes in the last segment (called the M-Seg) represent the set of local roots; the ith gene of this segment represents a vertex in the ith cluster.

  • All spanning trees are represented by the Cayley Code [19].

A chromosome is therefore divided into \(N+2\) segments, indexed from 1 to \(N+2\) for convenience of notation. Each of the first N segments is a Cayley code corresponding to a unique local tree, segment \(N+1\) is the Cayley code of the global tree, and segment \(N+2\) is the M-Seg, which contains the root vertices of all local trees (a sketch of the resulting segment layout is given below).
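To make the segment layout concrete, the hypothetical helper below (our own sketch, not the authors' code) computes the length of each of the \(N+2\) segments from the cluster sizes of the USS, assuming every cluster has at least 3 vertices so that its Cayley segment has \(n_t - 2\) genes.

```java
// Hypothetical helper computing the segment lengths of a chromosome in the USS.
public class ChromosomeLayout {

    // clusterSizes[t] = number of vertices of the (t+1)th cluster of the USS individual.
    public static int[] segmentLengths(int[] clusterSizes) {
        int N = clusterSizes.length;
        int[] len = new int[N + 2];
        for (int t = 0; t < N; t++) len[t] = clusterSizes[t] - 2;  // Cayley segments of the local trees
        len[N] = N - 2;                                            // G-Seg: Cayley code of the global tree
        len[N + 1] = N;                                            // M-Seg: one local root per cluster
        return len;
    }

    public static void main(String[] args) {
        // Cluster sizes as in the example of Fig. 3: 4, 5, 5 and 4 vertices.
        System.out.println(java.util.Arrays.toString(segmentLengths(new int[]{4, 5, 5, 4})));
        // prints [2, 3, 3, 2, 2, 4]: four local segments, a 2-gene G-Seg and a 4-gene M-Seg
    }
}
```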

An example of individual encoding in the new MFEA is depicted in Fig. 3. Assume that the vertices of the tasks are partitioned into clusters as in Fig. 3a. An individual in the USS is illustrated in Fig. 3b, and its representation via the Prüfer Code is depicted in Fig. 3c. This chromosome has 6 segments, in which the numbers of vertices in clusters 1, 2, 3 and 4 are 4, 5, 5 and 4 respectively; the G-Seg has 2 genes, and the M-Seg has 4 genes, which are 1, 1, 3 and 2.

Fig. 3 The representation of individual in unified search space for MFEA with two tasks

To find the labels of the vertices in all clusters for a task, the proposed CC-MFEA also builds a mapping function (called the M-function), which maps the vertices in every cluster of an individual in the unified search space to the vertices in the corresponding cluster of that task. Taking Fig. 3a, b as an example, the M-function maps the vertices in cluster 2 of the individual in the USS to the vertices in cluster 2 of task 1 as follows: 1 maps to 5, 2 maps to 6, 3 maps to 7, and 4 maps to 8. Vertex 5 in cluster 2 of Fig. 3b is not mapped to task 2 because cluster 2 of task 2 has only 4 vertices, whereas cluster 2 in Fig. 3b has 5 vertices. The mapping functions are used in the proposed decoding method.

5.2.2 Rearrange clusters on input graphs

When the solutions of all tasks are encoded as an individual in the USS, keeping the original order of clusters of the input graphs may waste resources. For example, consider the case shown in Fig. 4: Fig. 4a shows the vertices in the clusters of two input graphs. If the original cluster order is kept, the solution in the USS needs 6 vertices to encode cluster 1 and 5 vertices to encode cluster 2 (see Fig. 4b). However, if the clusters of the input graphs are sorted in increasing order of their numbers of vertices, the solution in the USS needs only 3 vertices for cluster 1 and 6 vertices for cluster 2 (see Fig. 4c). Clearly, this procedure reduces the resources needed to encode solutions.

Fig. 4 Illustration of a disadvantage of the encoding solution

Therefore, the clusters of each input graph are sorted in ascending (or descending) order of their numbers of vertices before the encoding procedure of Sect. 5.2.1 is applied; a minimal sketch of this preprocessing step is given below. For the sake of simplicity, this article assumes that the clusters of an input graph are sorted in increasing order of their numbers of vertices.
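A minimal sketch of this preprocessing step, under the assumption that each cluster is given as a list of vertex labels (our own code):

```java
import java.util.*;

// Sorts the clusters of an input graph by size in ascending order (our own sketch).
public class ClusterSorter {

    public static List<List<Integer>> sortBySizeAscending(List<List<Integer>> clusters) {
        List<List<Integer>> sorted = new ArrayList<>(clusters);
        sorted.sort(Comparator.comparingInt(List::size));   // smallest cluster first
        return sorted;
    }
}
```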

5.3 Individual initialization method

Different representations require correspondingly different methods to generate valid individuals. We introduce a method in which each segment corresponding to a Cayley string is generated randomly; the spanning tree of each cluster and of the Global Graph is then constructed from the random Cayley codes, which yields a valid solution for the clustered tree problem. The details of the individual initialization method (IIM) are presented in Algorithm 2.

Algorithm 2 (pseudocode of the IIM)

In Algorithm 2, the Generate_Random_Cayley_Code(l) method randomly generates a Cayley code whose length is equal to \(l-2\), and the Select_Random_Vertices_In_Clusters() method selects a random vertex in each cluster for creating the edges that connect the clusters (a sketch of these two helpers is given below).
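The sketch below gives one possible (hypothetical) implementation of these two helpers based only on their textual description; it is not taken from the authors' source code.

```java
import java.util.*;

// Possible implementations of the two helper methods named in Algorithm 2 (our own sketch).
public class InitializationHelpers {
    private static final Random RNG = new Random();

    // Returns a random Cayley code of length l - 2 with labels in [1, l].
    public static int[] generateRandomCayleyCode(int l) {
        int[] code = new int[l - 2];
        for (int k = 0; k < code.length; k++) code[k] = 1 + RNG.nextInt(l);
        return code;
    }

    // Picks one random vertex (the local root) from each cluster.
    public static int[] selectRandomVerticesInClusters(List<List<Integer>> clusters) {
        int[] roots = new int[clusters.size()];
        for (int c = 0; c < clusters.size(); c++) {
            List<Integer> cluster = clusters.get(c);
            roots[c] = cluster.get(RNG.nextInt(cluster.size()));
        }
        return roots;
    }
}
```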

5.4 Crossover operator

An advantage of the Cayley Code is that it can use many existing genetic operators, such as one-point crossover and PMX crossover [1]. In the proposed algorithm, the one-point crossover operator [1] is used: one point in the chromosome is selected, and the gene sequences after that point are swapped between the two parents to generate two offspring.

Fig. 5 An example for the crossover operator in CC-MFEA

Figure 5 illustrates the crossover operator in CC-MFEA, in which the red vertical line marks the crossover point. In our encoding method, the Cayley strings of the local trees and of the global tree are placed adjacent to each other, and the last segment stores the vertices that are used to construct the complete solution from the local and global trees. The one-point crossover preserves characteristics of the pair of parents, and its effect differs depending on the position of the selected point. If the crossover point lies in a local tree segment, that segment generates a local tree different from those of the parents. If the point lies in the G-Seg, the global tree is constructed differently from those of the parents. Finally, if the point lies in the last segment, it produces a new set of local-tree roots. Exchanging good partial genes of the parents through the crossover point can therefore produce high-quality solutions. In the crossover operation, the probability that the crossover point is located in a given cluster segment, the G-Seg or the M-Seg is proportional to the length of that segment. The search space of each segment corresponds to its length: a segment with a larger length has a larger search space and thus requires more generations to obtain optimal structures, whereas a shorter segment requires fewer generations to explore good structures. By making the probability of a crossover point being located in a segment proportional to its length, larger segments have greater opportunities to exchange good partial genes and to acquire more optimal structures. As a result, the crossover operation reduces the number of generations needed to obtain high-quality solutions.

Because of the properties of the encoding method, the one-point crossover always produces valid offspring and increases the diversity of the population. A minimal sketch of this operator is given below.
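The sketch below shows the one-point crossover on two gene strings of the same length in the USS (our own illustration, not the authors' code).

```java
import java.util.Random;

// One-point crossover on two equal-length gene strings (our own sketch).
public class OnePointCrossover {
    private static final Random RNG = new Random();

    public static int[][] apply(int[] p1, int[] p2) {
        int point = 1 + RNG.nextInt(p1.length - 1);   // crossover point in [1, length - 1]
        int[] c1 = p1.clone(), c2 = p2.clone();
        for (int k = point; k < p1.length; k++) {     // swap the gene sequences after the point
            c1[k] = p2[k];
            c2[k] = p1[k];
        }
        return new int[][]{c1, c2};
    }
}
```

Because both parents share the same USS layout and the gene values of each segment lie in the same range, the swapped tails always yield valid chromosomes.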

5.5 Mutation operator

The Swap-Change Mutation operator (SCM) performs two types of variation. The first type generates a new Cayley string by swapping two genes in the selected segment; the effect of this change on the spanning tree depends on the locality of the Cayley code being used. If a high-locality code is used to decode the Cayley string into a tree, swapping two genes produces an offspring that differs little from its parent; a low-locality code changes the spanning tree significantly. The second type of variation is executed on the M-Seg to change the root of a local tree, with the aim of increasing the diversity of individuals. The SCM is described in detail in Algorithm 3.

Algorithm 3 (pseudocode of the SCM)
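A minimal sketch of the two SCM variations described above, assuming the segment boundaries and M-Seg layout of Sect. 5.2.1 (our own code, not Algorithm 3 itself):

```java
import java.util.Random;

// Swap-Change Mutation: swap two genes in a selected Cayley segment and
// replace one local root in the M-Seg (our own sketch of the SCM).
public class SwapChangeMutation {
    private static final Random RNG = new Random();

    // segStart (inclusive) and segEnd (exclusive) bound the randomly selected local-tree segment;
    // mSegStart is the start index of the M-Seg; clusterIndex selects whose local root to change;
    // clusterSize is the number of vertices in that cluster.
    public static void mutate(int[] chromosome, int segStart, int segEnd,
                              int mSegStart, int clusterIndex, int clusterSize) {
        // Variation 1: swap two genes inside the selected Cayley segment.
        int i = segStart + RNG.nextInt(segEnd - segStart);
        int j = segStart + RNG.nextInt(segEnd - segStart);
        int tmp = chromosome[i];
        chromosome[i] = chromosome[j];
        chromosome[j] = tmp;

        // Variation 2: replace the local root of the chosen cluster with a random vertex label.
        chromosome[mSegStart + clusterIndex] = 1 + RNG.nextInt(clusterSize);
    }
}
```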

Figure 6 provides an example of the SCM. In this example, the 3rd cluster is randomly selected. Two genes in the third segment, whose values are 2 and 3 respectively (shown in red and bold), are swapped. The value of the 2nd gene in the M-Seg, which equals 1, is replaced by 3 (also shown in red and bold).

Fig. 6 An example for the new mutation operator in MFEA for solving the CluSPT problem

5.6 Decoding method

In this subsection, we present the association between an individual in the USS and an individual of task \(T_j\). The main reason for selecting the Cayley Code to represent a chromosome is that simple methods can be used for decoding between tasks.

5.6.1 Decoding method for spanning tree problem

Let \(D_j\) be the dimensionality of the jth task. The S-Decoding method works in two steps:

Algorithm 4 (S-Decoding method)
  • Step 1: Remove genes whose values are greater than \(D_j\) and store them in the corresponding order into a queue Q.

  • Step 2: If the number of remaining genes is at least \(D_j - 2\), select the first \(D_j - 2\) of them. Otherwise, while the number of remaining genes is less than \(D_j - 2\), repeat the following: let a be the element at the front of Q, append \(a \bmod D_j\) to the remaining genes, and remove a from the front of Q. A Java sketch of this procedure is given below.
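A minimal Java sketch of this two-step procedure (our own code; the segment contents in the example are illustrative, not the exact values of Fig. 7):

```java
import java.util.*;

// S-Decoding of one Cayley segment for a task of dimensionality dj (our own sketch).
public class SDecoding {

    public static int[] decodeSegment(int[] segment, int dj) {
        Deque<Integer> q = new ArrayDeque<>();
        List<Integer> remaining = new ArrayList<>();
        // Step 1: remove genes whose values exceed dj, keeping them (in order) in the queue Q.
        for (int g : segment) {
            if (g > dj) q.addLast(g); else remaining.add(g);
        }
        // Step 2: while fewer than dj - 2 genes remain, append (front of Q) mod dj;
        // the text implies Q always holds enough removed genes for this padding.
        while (remaining.size() < dj - 2) {
            int a = q.pollFirst();
            remaining.add(a % dj);
        }
        int[] result = new int[dj - 2];            // finally keep the first dj - 2 genes
        for (int k = 0; k < dj - 2; k++) result[k] = remaining.get(k);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(decodeSegment(new int[]{1, 7, 2, 3, 4, 5}, 6))); // [1, 2, 3, 4]
        System.out.println(Arrays.toString(decodeSegment(new int[]{1, 7, 8, 2, 7, 3}, 6))); // [1, 2, 3, 1]
    }
}
```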

Figure 7 presents the use of the S-Decoding method for decoding the Cayley representation of a spanning tree for two tasks. The dimensionalities of the two tasks are 6 and 8, respectively, and the individual in the unified search space is transformed for the task whose dimensionality is 6.

Fig. 7 The decoding method for representing spanning tree via Cayley code

Figure 7a depicts the case in which, after step 1, the number of remaining genes is not smaller than the required \(D_j - 2\) genes. In step 1, since the value of the second gene is 7, which is greater than 6, this gene is removed and stored in Q = \(\{7\}\). In step 2, the number of remaining genes is 5, which is greater than \(6-2=4\), so the first 4 genes are chosen to create an individual for the associated task.

Figure 7b illustrates the other case, in which the number of remaining genes is less than \(D_j - 2\). In step 1, the genes 7, 8 and 7 are removed from the segment and stored in the queue \(Q=\{7, 8, 7\}\). In step 2, the 3 remaining genes, stored in \(R = \{1, 2, 3\}\), are fewer than \(6-2=4\), so the missing gene is created by taking the first element of Q, which is 7, modulo the dimensionality of the task, which is 6; the result, \(7 \bmod 6 = 1\), is appended to the end of R.

5.6.2 Decoding method for Clustered Tree Problems

The new decoding method for the clustered tree problems (NDM) has two stages. The first stage finds the spanning trees of the clusters and of the Global Graph by using the S-Decoding method. The second stage operates on the last segment to find the edges that connect the clusters.

The details of the new decoding method are described in Algorithm 5.

Algorithm 5 (pseudocode of the NDM)

In Algorithm 5, the S-Decoding method is the method proposed in Sect. 5.6.1; lines 8 and 13 use the mapping functions to relabel the vertices in the clusters.

Fig. 8 An illustrative example of the proposed decoding method for CluSPT

Figure 8 illustrates the proposed decoding method. Figure 8a presents an individual in the USS, and Fig. 8b depicts its representation via the Prüfer code. Figure 8c illustrates the chromosome after performing lines 1 to 12 of Algorithm 5. Figure 8d represents the solution constructed from the chromosome in Fig. 8c, and Fig. 8e illustrates the complete solution obtained from Fig. 8d after applying the M-functions in Algorithm 5.

6 Computational results

6.1 Problem instances

To assess the performance of the proposed algorithm, CluSPT instances are selected from Binh et al. [4] and InterCluMRCT instances are selected from Clustered Traveling Salesman Problem (CluTSP) instances [15]. The main reason for selecting those instances is that they were generated with various algorithms to be suitable for clustered problems [15].

The experiments are performed on all small instances of Type 1, Type 5 and Type 6; the instances are selected randomly for running CC-MFEA. All tested instances are available via Thanh [22].

The Prüfer Code [18], the Blob Code (Julstrom [13]) and the Dandelion Code [19] are well-known examples of Cayley codes, so these codes are chosen for the experiments.

6.2 Experimental setup

We focus on the following criteria to assess the quality of the output of the algorithms:

  • Average (Avg): the average function value over all runs.

  • Best-found (BF): the best function value achieved over all runs.

Each problem instance was evaluated 30 times on an Intel Core i7-4790 computer with 16 GB RAM running Windows 8 64-bit. The source code was written in Java.

The simulation parameters include population size = 100, number of evaluations = 50000, probability of random mating = 0.5, mutation rate = 0.05 and number of tasks = 2.

6.3 Experimental results

In this section, we analyze the results obtained by the proposed algorithm using different types of Cayley codes. The performance of each Cayley Code is evaluated in both single-task and multi-task environments. In addition, the effectiveness of combining the 2 types of variation in the proposed SCM is also examined. All of the obtained results are then carefully investigated with statistical analysis methods such as the Wilcoxon signed-rank test to show the effectiveness of the proposed algorithm.

In the result tables of this section, italic red cells in the Prüfer Code column mean that, on those instances, CC-MFEA using the Prüfer Code outperforms CC-MFEA using the Dandelion Code. Similarly, italic red cells in the Dandelion Code and Blob Code columns mean that CC-MFEA using the Dandelion Code exceeds CC-MFEA using the Blob Code, and that CC-MFEA using the Blob Code surpasses CC-MFEA using the Prüfer Code, respectively. Bold values in an encoding technique's column mean that CC-MFEA using that encoding technique outperforms CC-MFEA using the two other encoding techniques.

6.3.1 The efficacy of the proposed variation of mutation operator

To evaluate the effectiveness of each variation of the proposed mutation, the Dandelion Code is chosen to conduct the following experiments: in the first experiment, we perform only the first type of variation, which swaps two genes in the selected segments, while the second experiment uses only the second type of variation, which changes an element in the M-Seg. The results obtained from these two experiments are then compared with those of the proposed SCM, which combines both variation types.

Table 1 Results obtained by variations of the mutation operator using Dandelion code on instances in Type 1

Table 1 presents the results obtained by performing the swap variation, the change variation and the proposed SCM. The experimental results show that the proposed SCM is better than the swap variation and the change variation by 8.7% and 8.6% on average, respectively.

To demonstrate the effectiveness of the proposed SCM operator, a Wilcoxon signed-rank test is conducted to check whether there is an improvement in the performance of the CC-MFEA between applying the swap variation or the change variation separately and combining them into the proposed SCM. In particular, 2 pairs of samples, SCM-Change and SCM-Swap, are examined to discover whether these changes of the mutation operator led overall to a statistically significant difference in the obtained results. Details of the statistical test are shown in Table 2.

Table 2 Wilcoxon signed-ranks test statistics between SCM-Swap and SCM-change to show the efficiency of proposed mutation operator

As can be seen from Table 2, for the CluSPT problem, the SCM mutation produces better results than using either change or swap mutation separately on 10 out of 11 test instances. A Wilcoxon signed-rank test showed that the combination of 2 variations of mutation operator inside the SCM provided a statistically significant improvement in the quality of obtained solutions with the Asymp. Sig. (2-tailed) p of 0.026 and 0.021 for the SCM-Change and the SCM-Swap, respectively. Similarly for the InterCluMRCT problem, a considerable advance in performance of the proposed CC-MFEA can be witnessed when applying the SCM operator instead of either change or swap with \(p=0.028\) and \(p=0.005\) respectively. In conclusion, the proposed SCM mutation operator does statistically improve the overall performance of the algorithm.

Explanation: The first type of variation changes the segments used for representing local spanning trees, and altering any two genes in those segments modifies the local spanning trees. In addition, changing the local-root vertices is also an important factor, since they play a major role in making connections between clusters, from which the global tree is constructed. Therefore, combining these two types of variation in the process of creating new individuals increases the diversity of the population, thereby helping the proposed algorithm to explore better solutions.

6.3.2 The effect of encoding techniques in single task

In this section, the performance of each Cayley Code is evaluated separately on the CluSPT and the InterCluMRCT problems.

A. Overall results of singletasking

Table 3 Comparison among 3 Cayley Codes to summarize the number of instances in which one encoding type outperforms the other two when performing in single task

Table 3 summarizes the comparison results among 3 encoding types when performing in single task. The “Num.Ins” column denotes the total number of instances, while the “Dandelion”, “Blob” and “Prüfer” columns represent the number of instances where the encoding type has better results than the other two.

A notable point in the results shown in this table is that the performance of the Dandelion Code is by far the best. In particular, the Dandelion Code outperforms the remaining codes on 2 out of 3 types of instances (Type 1 and Type 6). The result obtained by the Prüfer Code seems to be the worst, as its performance is poorer than that of the Blob and Dandelion Codes on all 3 types.

B. Detailed results of singletasking

For more details of the CluSPT, in Type 1, the Dandelion Code outperforms both Prüfer Code and Blob Code in 15 out of 24 test cases. In Type 6, the Dandelion Code exceeds Prüfer Code and Blob Code on 21 out of 34 and 20 out of 34 test instances respectively.

Unlike Type 1 and Type 6, in Type 5 the Blob Code is slightly superior to the 2 remaining encoding techniques: it exceeds the Prüfer Code on 9 out of 16 test cases and the Dandelion Code on half of them.

Tables 4, 5 and 6 show more details of the results obtained by the proposed algorithm on single tasks when solving the CluSPT on instances of Type 1, Type 5 and Type 6.

Table 4 CluSPT's results obtained by single task on instances in Type 1
Table 5 CluSPT's results obtained by single task on instances in Type 5
Table 6 CluSPT's results obtained by single task on instances in Type 6

In the InterCluMRCT, the Dandelion Code achieves the best performance, exceeding the two remaining encoding techniques on two-thirds of the total number of instances. In detail, the Dandelion Code outperforms the Blob Code on 14 out of 24 instances of Type 1 and on 22 out of 34 instances of Type 5. In comparison with the Prüfer Code, the Dandelion Code is better on 15 out of 24 instances of Type 1 and on 22 out of 34 instances of Type 6.

A remarkable point occurs in the results for Type 5, where the Prüfer Code exceeds both the Dandelion and Blob Codes on 10 out of 16 instances. In general, however, the Prüfer Code does not perform as well as either the Dandelion Code or the Blob Code, since its results are worse than theirs on two out of three instance types.

The details of the InterCluMRCT results obtained by single-tasking of the proposed algorithm with different Cayley codes are presented in Tables 7, 8 and 9.

Table 7 InterCluMRCT's results obtained by single task on instances in Type 1
Table 8 InterCluMRCT's results obtained by single task on instances in Type 5
Table 9 InterCluMRCT's results obtained by single task on instances in Type 6

C. Statistical analysis of 3 Cayley Codes in singletasking

In order to examine the effect of the different Cayley Codes on the results obtained in the singletasking environment, a Friedman test is first carried out to see whether there are differences in the performance of the algorithm among the 3 encoding types.

For the CluSPT problem, details of the statistical test are given in Table 10. The Friedman Test compares the mean ranks between the related groups and provides the test statistic (Chi-square value \(\chi ^2\)), the degrees of freedom (df) and the significance level (Asymp. Sig.) to indicate how the groups differ. The results of the Friedman Test show that there was a statistically significant difference in the CluSPT's results depending on which encoding mechanism was used, with \(\chi ^2(2) = 10.0811,\ p = 0.0065\).

Table 10 Statistical analysis on the effect of 3 Cayley Codes for CluSPT in singletasking

However, at this stage we only know that there are differences somewhere between the related groups; the Friedman Test does not pinpoint which groups in particular differ from each other. As a result, a post hoc test needs to be run. In this case, separate Wilcoxon signed-rank tests on the different combinations of related groups (Dandelion–Prufer, Blob–Dandelion and Prufer–Blob) are conducted. In addition, a Bonferroni adjustment must be applied to the Wilcoxon tests' results since we are making multiple comparisons: the initial significance level (0.05) is divided by the number of tests, which results in a new significance level of 0.05/3 = 0.017 in our case.
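For reference, this statistical procedure can be reproduced, for example, with SciPy; the arrays below are placeholders standing for the per-instance results of each encoding and are not the values reported in the tables.

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

# Placeholder data: one best objective value per instance for each encoding
# (random numbers only to make the sketch runnable; real values come from the tables).
rng = np.random.default_rng(0)
dandelion = rng.normal(100.0, 5.0, size=74)
blob      = dandelion + rng.normal(1.0, 2.0, size=74)
prufer    = dandelion + rng.normal(2.0, 2.0, size=74)

# Friedman test: is there any difference among the three related groups?
chi2, p = friedmanchisquare(dandelion, blob, prufer)
print(f"Friedman: chi2(2) = {chi2:.4f}, p = {p:.4f}")

# Post hoc: pairwise Wilcoxon signed-rank tests with a Bonferroni-adjusted
# significance level of 0.05 / 3 = 0.017.
alpha = 0.05 / 3
pairs = {"Dandelion-Prufer": (dandelion, prufer),
         "Blob-Dandelion":   (blob, dandelion),
         "Prufer-Blob":      (prufer, blob)}
for name, (a, b) in pairs.items():
    stat, p = wilcoxon(a, b)
    print(f"{name}: p = {p:.4f}, significant at 0.017: {p < alpha}")
```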

Table 10 also shows the output of the Wilcoxon signed-rank test on each of these combinations. We can see that at the \(p<\) 0.017 significance level, a statistically significant improvement can be obtained by applying the Dandelion Code instead of the other two for the CluSPT problem in singletasking, with \(p=0.0003\) for Dandelion–Prufer and \(p=0.0099\) for Blob–Dandelion. Therefore, the Dandelion Code is considered the best among the 3 encoding types for the evolutionary algorithm to solve the CluSPT problem.

Table 11 Statistical analysis on the effect of 3 Cayley Codes for InterCluMRCT in singletasking

Similarly, Table 11 shows the results of the statistical analysis among the 3 Cayley Codes for the InterCluMRCT problem. There was a statistically significant difference in the InterCluMRCT's results depending on which type of encoding was used, with \(\chi ^2(2) = 3.4324,\ p = 0.0179\). Post hoc analysis with Wilcoxon signed-rank tests was conducted with a Bonferroni correction applied, resulting in a significance level of p < 0.017. The median results achieved by the Prufer, Dandelion and Blob Codes were 1112182.75, 1111663.90 and 1130830.75, respectively. There were no significant differences between the Blob and Dandelion Codes (\(Z = -0.606,\ p = 0.5445\)) or between the Prufer and Blob Codes (\(Z = -0.773,\ p = 0.4395\)), despite an overall improvement of the Blob Code over the Prufer Code. However, there was a statistically significant reduction in performance of the Prufer Code compared to the Dandelion Code (\(Z = -1.872,\ p = 0.0161\)). As a consequence, it can be concluded that the Dandelion Code outperforms the remaining two encoding mechanisms when solving clustered tree problems.

D. Explanation of the obtained results

In Type 1 and Type 6, all of the instances are modified from the TSPLIB [21]. In detail, Type 1 uses the k-means algorithm to create clusters, after which a source vertex is randomly selected to create instances for the CluSPT, while Type 6 uses a grid clustering method in which all vertices are divided into \(m \times n\) rectangles of the same area. The instances in Type 1 and Type 6 therefore have the following properties: clusters are clearly separated and little intermixed, the vertices in a cluster are close to each other, and the distance between two vertices belonging to two different clusters is greater than the distance between two vertices belonging to the same one. Using a code with high locality (a small change in the genotype causes a small change in the phenotype), such as the Blob Code and the Dandelion Code, is effective for instances with these properties. In our encoding method, each cluster is encoded as a segment with one of the Cayley codes, and a small change in these segments affects the solution differently depending on the type of encoding used. For instance, in the CluSPT, the Dandelion Code has the highest locality among the Cayley codes, so a small change in a single cluster segment is similar to performing a local search towards the optimal structure of that cluster. As a result, the Dandelion Code is the most effective encoding for the CluSPT in Type 1 and Type 6.
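The two clustering procedures described above can be sketched as follows; the use of scikit-learn's KMeans and the grid-indexing details are illustrative assumptions, not the exact scripts used to derive the TSPLIB-based instances.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_clusters(coords, k):
    """Type-1 style clustering: group TSPLIB vertex coordinates into k clusters."""
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(coords)

def grid_clusters(coords, m, n):
    """Type-6 style clustering: split the bounding box into m x n equal rectangles
    and assign each vertex the index of the rectangle it falls into."""
    x, y = coords[:, 0], coords[:, 1]
    col = np.minimum((m * (x - x.min()) / (np.ptp(x) + 1e-9)).astype(int), m - 1)
    row = np.minimum((n * (y - y.min()) / (np.ptp(y) + 1e-9)).astype(int), n - 1)
    return row * m + col
```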

6.3.3 The effect of encoding techniques when CC-MFEA solves two different problems

To evaluate the effectiveness of each Cayley code's encoding technique for the CC-MFEA algorithm when solving clustered problems, the CluSPT and the InterCluMRCT problems are solved on different input graphs.

A. Overall results of multitasking

Table 12 Comparison among 3 Cayley Codes to summarize the number of instances in which one encoding type outperforms the other two when performing in multi-task

Table 12 presents a summary of the comparison results among the 3 encoding types when performing in multi-task. Similarly to Table 3, the “Num.Ins” column indicates the number of instances, while the “Dandelion”, “Blob” and “Prüfer” columns show the number of instances where the encoding type outperforms the two remaining codes.

Tables 13, 14 and 15 show more details of the results obtained by the 3 different types of Cayley codes on instances of Type 1, Type 5 and Type 6. In those tables, the pairs of instances to which CC-MFEA is applied are chosen randomly. For convenience of tracking, the paired instances of the CluSPT problem and the InterCluMRCT problem appear in the corresponding order of occurrence in the result tables of each problem. For example, instance 10berlin51 in the first row of the CluSPT result table is matched with instance 10Eil51 in the first row of the InterCluMRCT result table. Similarly, instance 10Eil76 is paired with instance 10KroA100, etc.

Table 13 Results of CluSPT and InterCluMRCT obtained by multitasking in Type 1
Table 14 Results of CluSPT and InterCluMRCT obtained by multitasking in Type 5
Table 15 Results of CluSPT and InterCluMRCT obtained by multitasking in Type 6

The results in those tables show that on the CluSPT problem, the Blob Code achieves the best results, while the performance of the Prüfer Code is the worst. CC-MFEA using the Blob Code exceeds both the Dandelion Code and the Prüfer Code on 7 out of 12 instances of Type 1, 5 out of 8 instances of Type 5 and 11 out of 18 instances of Type 6.

Unlike the CluSPT, for the InterCluMRCT problem the Dandelion Code achieves the worst performance, while the results obtained by the Prüfer Code are the best on instances of Type 6. However, the difference between the results obtained by the Prüfer Code and the Blob Code is not large, since the Prüfer Code outperforms the Blob Code on 9 out of 17 instances. On instances of Type 1 and Type 5, the Blob Code exceeds the two remaining ones.

Comparisons among the three different encoding techniques on instances of Type 1, Type 5 and Type 6 are shown in Fig. 9. In this figure (and similarly for the other figures in this subsection), the symbol “\(>>\)” denotes “outperform”, i.e. “Prüfer Code \(>>\) Dandelion Code” means that the results obtained by CC-MFEA using the Prüfer Code outperform those obtained using the Dandelion Code.

Fig. 9 Comparisons among three different encoding techniques on instances of each Type

B. The influence of the number of clusters

To analyze the relationship between the number of clusters and the results obtained by the different types of Cayley codes on both the CluSPT and the InterCluMRCT problems, we created scatter plots of the results obtained by those types according to the number of clusters and the problem instances, as shown in Figs. 10, 11 and 12.

Fig. 10 Comparison between Prüfer and Dandelion Code on the scatter of Instances and Number of clusters

Figure 10 illustrates the comparison between the results obtained by the Prüfer Code and the Dandelion Code. The triangular symbols indicate those instances where the Dandelion Code outperformed the Prüfer Code. As can be seen from Fig. 10b, since all of the symbols to the right of the vertical red line are triangles, in the InterCluMRCT the Dandelion Code produces better results than the Prüfer Code on all instances with more than 25 clusters. In Fig. 10a, most of the symbols to the right of the vertical red line are triangles; therefore, it can be inferred that in the CluSPT the Dandelion Code also outperforms the Prüfer Code on most instances with over 20 clusters.

Fig. 11 Comparison between Dandelion and Blob Code on the scatter of Instances and Number of clusters

Figure 11 shows the comparison between the Dandelion and the Blob Code. In the CluSPT (Fig. 11a), the Blob Code exceeds the Dandelion Code on most instances with fewer than 7 clusters (to the left of the vertical red line) or with between 12 and 25 clusters (between the vertical green and blue lines). In the InterCluMRCT (Fig. 11b), the Blob Code does not perform as well as the Dandelion Code on any instance with more than 25 clusters.

Fig. 12 Comparison between Blob and Prüfer Code on the scatter of Instances and Number of clusters

The comparison between the Blob and the Prüfer Code is depicted in Fig. 12. In the CluSPT (Fig. 12a), the Blob Code outperforms the Prüfer Code on most instances with over 12 clusters (to the right of the vertical red line). In the InterCluMRCT (Fig. 12b), the results obtained by the Blob Code are not as good as those of the Prüfer Code on most instances with between 9 and 25 clusters (between the two vertical lines in the figure).

C. The influence of the number of vertices

Figures 13, 14 and 15 depict scatter plots of the results obtained by those encoding types according to the average number of vertices in a cluster and the problem instances.

A notable point in Fig. 15b is that in the InterCluMRCT, Blob Code outperforms Prüfer Code on all instances with the average number of vertices under 2.5 or over 12.

Fig. 13 Comparison between Prüfer and Dandelion Code on the scatter of Instances and Number of vertices

In comparison with the Dandelion Code, the Blob Code exceeds it on all instances with an average number of vertices above 12.8 in the CluSPT (see Fig. 14a), while in the InterCluMRCT the Blob Code outperforms the Dandelion Code on most instances with an average number of vertices above 4.7 (see Fig. 14b).

Fig. 14 Comparison between Dandelion and Blob Code on the scatter of Instances and Number of vertices

In the comparison between the Blob Code and the Prüfer Code, in the CluSPT the Blob Code exceeds the Prüfer Code on all instances with an average number of vertices above 12.7 (see Fig. 15a). A remarkable point in the InterCluMRCT is that the performance of the Blob Code is better than that of the Prüfer Code on the instances with the smallest or the largest average number of vertices, i.e. those with an average of under 2.5 or above 12 vertices (see Fig. 15b).

Fig. 15 Comparison between Blob and Prüfer Code on the scatter of Instances and Number of vertices

6.3.4 Comparison of the performances of Single-tasking and Multi-tasking

A. Results obtained by Single-tasking and Multi-tasking

The summary of the comparison between the results obtained by single-task and multi-task is presented in Table 16, which reports the number of instances on which multi-task outperforms single-task. The results show that multi-task is better than single-task on most of the test cases for both the CluSPT and the InterCluMRCT problems. The reason behind this is that, due to the similarity of the solution structures of the CluSPT and the InterCluMRCT, when encoded in the USS these problems support each other in the process of finding optimal solutions. In particular, with the solution representation mechanism proposed in this paper, the larger task contains part of the solution of the smaller task. Therefore, genes that take suitable values within the dimensions common to both problems form part of a shared gene structure. These shared genes are reused among tasks, leading to better interaction between them. More specifically, when the shared gene structure contains many gene segments with good characteristics for both problems, the exchange of genetic material is effective, especially when the properties of the input data of the two tasks are similar.
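A minimal sketch of this sharing mechanism is given below, under the assumption that the unified genotype is a flat gene list whose length equals the dimensionality of the larger task and that each task decodes only the prefix it needs; the exact decoding used by CC-MFEA may differ in detail.

```python
def decode_for_task(unified_genome, task_dim):
    """Each task reads only its first task_dim genes of the unified genome,
    so the smaller task's genes form a shared prefix of the larger task's genes."""
    return unified_genome[:task_dim]

# Hypothetical example: a CluSPT instance needing 60 genes paired with an
# InterCluMRCT instance needing 48 genes.
unified = list(range(60))                        # placeholder genes
cluspt_genes = decode_for_task(unified, 60)      # uses all 60 genes
intermrct_genes = decode_for_task(unified, 48)   # its 48 genes are shared with CluSPT
```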

Table 16 Summary of the number of instances that multitasking outperforms singletasking on each Type

A notable point in the results shown in Table 16 is that in Type 5, single-task exceeds multi-task on only one instance (with the Blob Code). For instances of Type 6, the results are broadly similar to Type 5 in that multi-task outperforms single-task on most of the test cases. However, the gap between the results obtained by the two settings is not as large as in Type 5: in the InterCluMRCT, multi-task surpasses single-task on 11 out of 17 instances (for the Dandelion Code) and on 12 out of 17 instances (for the Blob Code and the Prüfer Code). In the CluSPT, multi-task also exceeds single-task with the Dandelion Code and the Blob Code on 9 out of 17 instances, but it does not perform as well as single-task with the Prüfer Code, where it is better on only 7 out of 17 instances. Last but not least, for instances of Type 1, although multi-task is still superior to single-task overall, the difference is not as significant as in Type 5 and Type 6. For the InterCluMRCT problem, multi-task achieves better results than single-task on 7 out of 12 instances for all 3 encodings. For the CluSPT problem, with a total of 12 instances, single-task and multi-task each win 6 cases with the Blob Code and the Prüfer Code, while with the Dandelion Code, single-task outperforms multi-task on 7 out of 12 instances.

In the InterCluMRCT, the results in all 3 Types are broadly similar in that multi-task exceeds single-task in most test cases; in Type 1, for instance, multi-task outperforms single-task on 7 out of 12 instances for all three encoding techniques. However, in the CluSPT, unlike Type 5 and Type 6, in Type 1 single-task using the Prüfer Code is slightly better than multi-task on 5 out of 12 instances. From the comparison between single task and CC-MFEA, we find that CC-MFEA may not always be beneficial, due to the possibility of negative knowledge transfer. Negative knowledge transfer occurs when the genotypes of the optimal solutions of the CluSPT and the InterCluMRCT are too different from each other. The solutions of the CluSPT and the InterCluMRCT are encoded into one individual in the shared representation, and in each generation each task adjusts and exchanges its valuable genes in that representation in order to converge toward its optimal solutions. In the particular case where the optimal solutions of the tasks have converged but their genotypes are too different from each other, incompatible partial solutions transferred among tasks can lead to bad results. As a consequence, the combination of unrelated problems in CC-MFEA only expands the search space without obtaining better solutions, and single task can obtain better solutions than multitasking in those cases. In this experiment, CluSPT and InterCluMRCT instances of the same type with similar numbers of clusters are selected to be solved simultaneously. However, inferring the underlying similarity between tasks is extremely challenging for combinatorial optimization problems such as the CluSPT and the InterCluMRCT.

B. Convergence trends

To analyze the convergence performance of single task and multi-task, a methodology based on the use of Page's trend test [7] is adopted. The results obtained by the proposed algorithms with the Dandelion Code on all instances of Type 1, Type 5 and Type 6 are selected for analyzing the convergence trend. The number of cut-points is fixed at 10, one after each 10% of the fitness function evaluations.
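The trend analysis can be reproduced along the following lines with SciPy's implementation of Page's test; the construction of the data matrix (per-instance differences between the ST and MT best fitness values at the 10 cut-points) reflects our reading of the methodology in [7], and the values used here are placeholders.

```python
import numpy as np
from scipy.stats import page_trend_test

# Placeholder data: best fitness recorded at 10 cut-points (every 10% of the
# evaluations) for 20 instances, for single-task (ST) and multi-task (MT).
rng = np.random.default_rng(1)
st = np.sort(rng.random((20, 10)), axis=1)[:, ::-1]   # decreasing fitness curves
mt = st - 0.1 * rng.random((20, 10))                  # MT assumed slightly better here

# MT-ST comparison: an increasing trend of (st - mt) across cut-points indicates
# that MT keeps pulling ahead, i.e. it converges faster.
res = page_trend_test(st - mt)
print(f"MT-ST: L = {res.statistic:.1f}, p = {res.pvalue:.4g}")

# Opposite comparison (ST-MT): the trend should not be increasing.
res = page_trend_test(mt - st)
print(f"ST-MT: L = {res.statistic:.1f}, p = {res.pvalue:.4g}")
```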

Table 17 presents the sum of ranks at cut-points while Table 18 shows the comparison of the convergence of the multi-task (MT) and single task (ST) on Types 1, 5 and 6.

Table 17 Sum of ranks for the experiments on Dandelion code
Table 18 Convergence results (p values) for the experiments on Dandelion code

Table 18 points out that the comparison MT–ST shows an increasing trend in the ranks, which is confirmed by a very low p value. Moreover, the opposite comparison, ST–MT, clearly shows that the ranks are not increasing, as indicated by a p value near 1.0. These results show that the MT algorithm converges faster than the ST algorithm.

We use the functions defined in Gupta et al. [11] for computing the normalized objectives and the averaged normalized objectives, and for analyzing the convergence trends of the proposed algorithms.
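Our reading of the normalization in [11] is a per-task min-max scaling of the best-so-far objective values over all data collected for that task, followed by averaging over the tasks; the sketch below is written under that assumption and is not the authors' original code.

```python
import numpy as np

def normalize(history, task_min, task_max):
    """Min-max scale one task's best-so-far objective history to [0, 1], where
    task_min/task_max are taken over all data collected for that task (MT and ST)."""
    return (np.asarray(history, dtype=float) - task_min) / (task_max - task_min + 1e-12)

def averaged_normalized(histories, task_mins, task_maxs):
    """Average the normalized objectives of the tasks, generation by generation,
    to obtain the averaged-normalized-objective curve used in the convergence plots."""
    return np.mean([normalize(h, lo, hi)
                    for h, lo, hi in zip(histories, task_mins, task_maxs)], axis=0)
```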

Fig. 16 Convergence trends of \({\tilde{f}}\) in multi-task and serial single task for instances 5eil51 and 5eil76 in Type 1; instances 5i45-18 and 5i60-21 in Type 5; instances 4berlin52-2x2 and 4eil51-2x2 in Type 6

Figure 16 depicts the convergence trends during the initial stages of the multi-task for instances 5eil51 (as the input of the CluSPT) and 5eil76 (as the input of the InterCluMRCT) in Type 1; instances 5i45-18 (as the input of the CluSPT) and 5i60-21 (as the input of the InterCluMRCT) in Type 5; and instances 4berlin52-2x2 (as the input of the CluSPT) and 4eil51-2x2 (as the input of the InterCluMRCT) in Type 6. This figure illustrates the main convergence trends observed when comparing single task and multi-task.

  • The first trend of convergence: multi-task converges faster than single task. This case is shown in Fig. 16 by the convergence curves of the Type 1 and Type 5 instances.

  • The second trend of convergence: single task converges faster than multi-task. This case is illustrated by the convergence curves of the Type 6 instances. Note that not all cases in which single task converges faster than multi-task resemble the Type 6 curves in Fig. 16: there, the convergence rate of single task is slower in the early generations, but in later generations the single-task curve converges faster than the multi-task one.

Fig. 17 Comparing convergence trends of \(\tilde{f_1}\) and \(\tilde{f_2}\) in multi-task and serial single task for instances 5eil51 and 5eil76 in Type 1; instances 5i45-18 and 5i60-21 in Type 5; instances 4berlin52-2x2 and 4eil51-2x2 in Type 6

Refer to Fig. 17 for a better understanding of the improved performance achieved as a consequence of multi-task. The figure depicts the convergence trends corresponding to each individual task.

One remark on Fig. 17 is the difference between the normalized objective values of single-task and multi-task at the starting time. The cause of this difference is that the initial individuals of a task, obtained by decoding from the unified search space, may differ from the initial solutions in single-task.

According to Fig. 17c, when multi-task is compared with single-task, the convergence speed on instance 4berlin52-2x2 is slightly slower. The explanation for this phenomenon is that evolutionary multitasking does not necessarily guarantee an improvement of performance for every task in MFEA [2, 27]; some tasks are positively impacted, while other tasks may be negatively impacted by the implicit genetic transfer that occurs during multitasking. In contrast, as shown in Fig. 17a, b, the convergence rate of each task in CC-MFEA is better than that of the corresponding task in single-task. Because multitasking and single-task use the same evolutionary operators, the improvement can be attributed entirely to the exploitation of multiple function landscapes via implicit genetic transfer, as afforded by the evolutionary multitasking paradigm.

7 Conclusion

This paper introduces an algorithm based on MFEA with the Cayley Code encoding mechanism to solve clustered tree problems. Evolutionary operators are proposed to exploit the advantages of the Cayley codes. Those operators are applied to find solutions to the clustered problems on two nested levels: the first level constructs the tree which spans all clusters, while the second level builds the spanning tree in each cluster. The performance of the proposed algorithm is analyzed on various types of problem instances.

This paper focuses on analyzing the effectiveness of the different encoding types among the Cayley Codes (including the Dandelion Code, the Prüfer Code and the Blob Code) when applying the CC-MFEA algorithm to solve 2 problems: the CluSPT and the InterCluMRCT. Experimental results show that the Dandelion Code has superior performance over the remaining encoding types, while the Prüfer Code has the lowest efficiency. The dependence of the encoding performance on factors such as the number of clusters and the average number of vertices in a cluster is also analyzed. The obtained results further indicate that selecting problems with similar data properties and solution structures helps the algorithm to find better solutions.

In the near future, we plan to continue research on MFEA in order to solve clustered tree problems with a variety of different representation methods, especially representations of arbitrary graphs.