1 Introduction

The generalized traveling salesman problem (GTSP) is defined as follows. We are given a weighted complete directed or undirected graph G and a partition \(V = V_1 \cup V_2 \cup \cdots \cup V_M\) of its vertices; the subsets \(V_i\) are called clusters. The objective is to find a minimum weight cycle containing exactly one vertex from each cluster. There are many publications on GTSP (see, e.g., the surveys Fischetti et al. (2002), Gutin (2003) and the references there) and the problem has many applications, see, e.g., Ben-Arieh et al. (2003) and Laporte et al. (1996). The problem is NP-hard, since the traveling salesman problem (TSP) is a special case of GTSP when \(|V_i| = 1\) for each i. GTSP is trickier than TSP in the following sense: it is an NP-hard problem to find a minimum weight collection of vertex-disjoint cycles such that each cluster has exactly one vertex in the collection (and the claim holds even when each cluster has just two vertices) (Gutin and Yeo 2003). Compare this with the well-known fact that a minimum weight collection of vertex-disjoint cycles in a weighted complete digraph can be found in polynomial time (Gutin and Punnen 2002).

We call GTSP and TSP symmetric if the complete graph G is undirected and asymmetric if G is directed. Often instead of the term weight we use the term length.

Various approaches to GTSP have been studied. There are exact algorithms such as branch-and-bound and branch-and-cut algorithms in Fischetti et al. (1997). While exact algorithms are very important, they are unreliable with respect to their running time, which can easily reach many hours or even days. For example, the well-known TSP solver Concorde can easily solve some TSP instances with several thousand cities, but it could not solve several asymmetric instances with 316 cities within the time limit of \(10^4\) s (in fact, it appears it would fail even if significantly more time was allowed) (Fischetti et al. 1997).

Several researchers use transformations from GTSP to TSP (Ben-Arieh et al. 2003) as there exists a large variety of exact and heuristic algorithms for the TSP, see, e.g., Gutin and Punnen (2002) and Lawler et al. (1985). However, while the known transformations normally allow one to produce optimal GTSP solutions from the obtained optimal TSP tours, none of the known transformations preserves suboptimal solutions. Moreover, conversions of near-optimal TSP tours may well result in infeasible GTSP solutions. Thus, the transformations do not allow us to quickly obtain approximate GTSP solutions, and there is a need for specific GTSP heuristics. Not every TSP heuristic can be extended to GTSP; for example, the so-called subtour patching heuristics often used for the Asymmetric TSP, see, e.g., Johnson et al. (2002), cannot be extended to GTSP due to the above-mentioned NP-hardness result from Gutin and Yeo (2003).

It appears that the only metaheuristic algorithms that can compete with Lin-Kernighan-based local search for TSP are memetic algorithms (Hart et al. 2004; Moscato 1999) that combine the powers of genetic and local search algorithms (Johnson and McGeoch 2002; Tsai et al. 2004). Thus, it is no coincidence that the latest studies in the area of GTSP explore the memetic algorithm approach (Silberholz and Golden 2007; Snyder and Daskin 2006; Tasgetiren et al. 2007).

The aim of this paper is to present a new memetic algorithm for GTSP with a powerful local search part. Unlike the previous heuristics which can be used for the symmetric GTSP only, our algorithm can be used for both symmetric and asymmetric GTSPs. The computational experiments show that our algorithm clearly outperforms all published memetic heuristics (Silberholz and Golden 2007; Snyder and Daskin 2006; Tasgetiren et al. 2007) with respect to both solution quality and running time.

2 The genetic algorithm

Our heuristic is a memetic algorithm, which combines the power of a genetic algorithm with that of local search (Hart et al. 2004; Krasnogor and Smith 2005). We start with a general scheme of our heuristic, which is similar to the general schemes of many memetic algorithms.

  1. Step 1

    Initialize. Construct the first generation of solutions. To produce a solution we use a semirandom construction heuristic (see Sect. 2.2).

  2. Step 2

    Improve. Use a local search procedure to replace each of the first generation solutions by the local optimum. Eliminate duplicate solutions.

  3. Step 3

    Produce next generation. Use reproduction, crossover, and mutation genetic operators to produce the non-optimized next generation. Each of the genetic operators selects parent solutions from the previous generation. The length of a solution is used as the evaluation function.

  4. Step 4

    Improve next generation. Use a local search procedure to replace each of the current generation solutions except the reproduced ones by the local optimum. Eliminate duplicate solutions.

  5. Step 5

    Evolve. Repeat Steps 3 and 4 until a termination condition is reached.
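The five steps above can be sketched as a short driver loop. This is only a minimal illustration of the general scheme, not the implementation used in the experiments: the names `memetic_search`, `local_search`, `reproduce`, `breed`, `length` and `should_stop` are hypothetical callables supplied by the caller, and duplicates are detected by plain tuple equality (the rotation-aware comparison of Sect. 2.1 is omitted for brevity).

```python
def memetic_search(first_generation, local_search, reproduce, breed,
                   length, should_stop):
    """Generic memetic loop following Steps 1-5 (all callables are assumptions).

    first_generation : list of solutions built by a construction heuristic
    local_search     : solution -> locally improved solution
    reproduce        : population -> solutions copied unchanged (Step 3)
    breed            : population -> non-optimized crossover/mutation offspring
    length           : solution -> objective value (smaller is better)
    should_stop      : best length of the latest generation -> bool
    """
    def dedupe(solutions):
        # Keep the first occurrence of every distinct solution.
        seen, unique = set(), []
        for s in solutions:
            key = tuple(s)
            if key not in seen:
                seen.add(key)
                unique.append(s)
        return unique

    # Steps 1-2: improve the first generation and drop duplicates.
    population = dedupe([local_search(s) for s in first_generation])
    best = min(population, key=length)

    # Step 5: repeat Steps 3-4 until the termination condition holds.
    while not should_stop(length(best)):
        kept = reproduce(population)                              # Step 3
        offspring = [local_search(s) for s in breed(population)]  # Step 4
        population = dedupe(kept + offspring)
        best = min(population, key=length)
    return best
```

Note that the reproduced solutions are not passed through `local_search` again, which mirrors Step 4.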

2.1 Coding

The Genetic Algorithm (GA) requires each solution to be coded in a chromosome, i.e., to be represented by a sequence of genes. Unlike Snyder and Daskin (2006) and Tasgetiren et al. (2007) we use a natural coding of the solutions as in Silberholz and Golden (2007). The coded solution is a sequence of numbers \((s_1 s_2 \ldots s_M)\) such that \(s_i\) is the vertex at the position i of the solution. For example (2 5 9 4) represents the cycle visiting vertex 2, then vertex 5, then vertex 9, then vertex 4, and then returning to vertex 2. Note that not every sequence corresponds to a feasible solution, as a feasible solution should contain exactly one vertex from each cluster, i.e., \(C(s_i) \ne C(s_j)\) for any \(i \ne j\), where C(v) is the cluster containing vertex v.

Note that, using natural coding, each solution can be represented by M different chromosomes: the sequence can be ‘rotated’, i.e., the first gene can be moved to the end of the chromosome or the last gene can be inserted before the first one and these operations will preserve the cycle. For example, chromosomes (2 5 9 4) and (5 9 4 2) represent the same solution. We need to take this into account when considering several solutions together, i.e., in precisely two cases: when we compare two solutions, and when we apply the crossover operator. In these cases we ‘normalise’ the chromosomes by rotating each of them such that the vertex \(v \in V_1\) (the vertex that represents the cluster 1) takes the first place in the chromosome. For example, if we had a chromosome (2 5 9 4) and the vertex 5 belongs to the cluster 1, we rotate the chromosome in the following way: (5 9 4 2).
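The feasibility condition and the normalisation can be illustrated with a small sketch. The function names and the dictionary `cluster_of` (mapping a vertex to its cluster number) are assumptions made for the illustration; in the usage example only the fact that vertex 5 lies in cluster 1 comes from the text, the other cluster numbers are made up.

```python
def is_feasible(chromosome, cluster_of, num_clusters):
    """True if the chromosome visits every cluster exactly once."""
    clusters = [cluster_of[v] for v in chromosome]
    return len(clusters) == num_clusters and len(set(clusters)) == num_clusters


def normalise(chromosome, cluster_of, first_cluster=1):
    """Rotate the chromosome so that the vertex of `first_cluster` comes first."""
    for pos, vertex in enumerate(chromosome):
        if cluster_of[vertex] == first_cluster:
            return chromosome[pos:] + chromosome[:pos]
    raise ValueError("the chromosome contains no vertex of the first cluster")


def same_solution(a, b, cluster_of):
    """Compare two chromosomes up to rotation (reflections count as distinct)."""
    return normalise(a, cluster_of) == normalise(b, cluster_of)


# Hypothetical cluster assignment: vertex 5 is in cluster 1 as in the text.
cluster_of = {2: 3, 5: 1, 9: 2, 4: 4}
print(normalise([2, 5, 9, 4], cluster_of))  # [5, 9, 4, 2]
```

Since only rotations are undone, a reflected chromosome such as (4 9 5 2) is reported as a different solution, which is the behaviour needed for asymmetric instances.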

In the case of the symmetric problem the chromosome can also be ‘reflected’ while preserving the solution. But our heuristic is designed for both symmetric and asymmetric instances and, thus, the chromosomes (1 5 9 4) and (4 9 5 1) are considered as the chromosomes corresponding to distinct solutions.

The main advantage of the natural coding is its efficiency in the local search. As the local search is the most time consuming part of our heuristic, the coding should be optimized for it.

2.2 First generation

We produce 2M solutions for the first generation, where M is the number of clusters. The solutions are generated by a semirandom construction heuristic. The semirandom construction heuristic generates a random cluster permutation and then finds the best vertex in each cluster when the order of clusters is given by the permutation.

It chooses the best vertex selection within the given cluster sequence using the Cluster Optimization Heuristic (see Sect. 3).

The advantages of the semirandom construction heuristic are that it is fast and its cycles have no regularity. The latter is important as each completely deterministic heuristic can cause solutions uniformity and as a result some solution branches can be lost.

2.3 Next generations

Each generation except the first one is based on the previous generation. To produce the next generation one uses genetic operators, which are algorithms that construct a solution or two from one or two so-called parent solutions. Parent solutions are chosen from the previous generation using some selection strategy. We perform r runs of reproduction, 8r runs of crossover, and 2r runs of the mutation operator. The value r is calculated as r = 0.2G + 0.05M + 10, where G is the number of generations produced before the current one. (Recall that M is the number of clusters.) As a result, we obtain at most 11r solutions in each generation but the first one (since we remove duplicate solutions from the population, the number of solutions in each generation can be smaller than 11r). From generation to generation, one can expect the number of local minima found by the algorithm to increase. This number can also be expected to grow when the number of clusters M grows. Thus, in the formula above r depends on both G and M. All the coefficients in the formulas of this section were obtained in computational experiments, where several other values of the coefficients were also tried. Note that slight variations in the selection of the coefficients do not significantly influence the results of the algorithm.
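As a small worked example of the formula for r, the sketch below computes the number of runs of each operator. The text does not say how a fractional r is rounded; rounding down is an assumption made here, and integer arithmetic is used to avoid floating-point surprises. The function name `operator_runs` is hypothetical.

```python
def operator_runs(generations_so_far, num_clusters):
    """Number of runs of each genetic operator for the next generation.

    r = 0.2*G + 0.05*M + 10, rounded down (the rounding rule is an assumption).
    """
    # 0.2*G + 0.05*M == (4*G + M) / 20, evaluated with integer division.
    r = (4 * generations_so_far + num_clusters) // 20 + 10
    return {"reproduction": r, "crossover": 8 * r, "mutation": 2 * r}


runs = operator_runs(generations_so_far=0, num_clusters=40)
print(runs)                # {'reproduction': 12, 'crossover': 96, 'mutation': 24}
print(sum(runs.values()))  # 132, i.e. at most 11r solutions before deduplication
```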

2.4 Reproduction

Reproduction is a process of simply copying solutions from the previous generation. Reproduction operator requires a selection strategy to select the solutions from the previous generation to be copied. In our algorithm we select r (see Sect. 2.3) shortest solutions from the previous generation to copy them to the current generation.

2.5 Crossover

A crossover operator is a genetic operator that combines two different solutions from the previous generation. We use a modification of the two-point crossover introduced by Silberholz and Golden (2007) as an extension of an Ordered Crossover (Davis 1985). Our crossover operator produces just one child solution \((r_1 r_2 \ldots r_M)\) from the parent solutions \((p_1 p_2 \ldots p_M)\) and \((q_1 q_2 \ldots q_M)\). First it selects a random position a and a random fragment length \(1 \le l < M\) and copies the fragment \([a, a + l)\) of the first parent to the beginning of the child solution: \(r_{i+1} = p_{i+a}\) for each \(i = 0, 1, \ldots, l - 1\) (here and below, indices exceeding M wrap around cyclically). To produce the rest of the child solution, we introduce a sequence q′ as follows: \(q'_i = q_{i+a+l-1}\), where \(i = 1, 2, \ldots, M\). Then for each i such that the cluster \(C(q'_i)\) is already visited by the child solution r, the vertex \(q'_i\) is removed from the sequence: \(q' = (q'_1 q'_2 \ldots q'_{i-1} q'_{i+1} \ldots)\). As a result l vertices are removed: \(|q'| = M - l\). Now the child solution r is extended by the sequence q′: \(r = (r_1 r_2 \ldots r_l q'_1 q'_2 \ldots q'_{M-l})\).

A feature of this crossover is that it preserves the vertex order of both parents.

Crossover example. Let the first parent be (1 2 3 4 5 6 7) and the second parent (3 2 5 7 6 1 4) (here we assume for clarity of explanation that every cluster contains exactly one vertex: \(V_i = \{i\}\)). First of all we rotate the parent solutions such that \(C(p_1) = C(q_1) = 1\): p = (1 2 3 4 5 6 7) (remains the same) and q = (1 4 3 2 5 7 6). Now we choose a random fragment in the parent solutions:

  • p = (1 2 | 3 4 | 5 6 7)

  • q = (1 4 | 3 2 | 5 7 6)

and copy this fragment from the first parent p to the child solution: r = (3 4). Next we produce the sequence q′ = (5 7 6 1 4 3 2) and remove vertices 3 and 4 from it as the corresponding clusters are already visited by r: q′ = (5 7 6 1 2). Finally, we extend the child solution r by q′:

  • r = (3 4 5 7 6 1 2).
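The crossover can be sketched in a few lines of code and checked against the example above. This is a minimal illustration under stated assumptions: the function name `crossover` and the dictionary `cluster_of` are hypothetical, the position a is 1-based as in the example, the parents are assumed to be normalised already, and a and l are passed explicitly rather than drawn at random.

```python
def crossover(p, q, cluster_of, a, l):
    """One-child crossover: fragment of p, then the remaining clusters in q's order.

    a : 1-based start of the fragment in the first parent
    l : fragment length, 1 <= l < M
    """
    m = len(p)
    # Copy the fragment [a, a + l) of the first parent (indices are cyclic).
    child = [p[(a - 1 + i) % m] for i in range(l)]
    visited = {cluster_of[v] for v in child}
    # Walk through q starting right after the fragment position and keep
    # only the vertices whose clusters are not yet visited by the child.
    for i in range(1, m + 1):
        v = q[(i + a + l - 2) % m]  # q'_i = q_{i+a+l-1}, converted to 0-based
        if cluster_of[v] not in visited:
            child.append(v)
            visited.add(cluster_of[v])
    return child


# The example from the text: every cluster holds one vertex, V_i = {i}.
cluster_of = {v: v for v in range(1, 8)}
p = [1, 2, 3, 4, 5, 6, 7]
q = [1, 4, 3, 2, 5, 7, 6]
print(crossover(p, q, cluster_of, a=3, l=2))  # [3, 4, 5, 7, 6, 1, 2]
```

In a full implementation a and l would be drawn at random, e.g. a uniformly from 1 to M and l uniformly from 1 to M − 1; the exact distributions are not specified in the text.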

The crossover operator requires some strategy to select two parent solutions from the previous generation. In our algorithm an elitist strategy is used; the parents are chosen randomly among the best 33% of all the solutions in the previous generation.

2.6 Mutation

A mutation operator modifies partially some solution from the previous generation. The modification should be stochastic and usually worsens the solution. The goal of the mutation is to increase the solution diversity in the generation.

Our mutation operator removes a random fragment of the solution and inserts it in some random position. The size of the fragment is selected between 0.05M and 0.3M. An elitist strategy is used in our algorithm; the parent is selected randomly among 75 % of all the solutions in the previous generation.

Mutation example. Let the parent solution be (1 2 3 4 5 6 7). Let the random fragment start at 2 and be of the length 3. The new fragment position is 3, for example. After removing the fragment we have (1 5 6 7). Now insert the fragment (2 3 4) at the position 3: (1 5 2 3 4 6 7).
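The mutation example can be reproduced with the following sketch. The function names are hypothetical, positions are 1-based as in the example, and the fragment is assumed not to wrap around the end of the chromosome; how the bounds 0.05M and 0.3M are rounded to integers is also an assumption.

```python
import math
import random


def mutate(solution, start, length, new_pos):
    """Cut `length` genes starting at `start` and reinsert them at `new_pos`.

    `start` is a 1-based position in the solution; `new_pos` is a 1-based
    position in the solution that remains after the fragment is removed.
    """
    fragment = solution[start - 1:start - 1 + length]
    rest = solution[:start - 1] + solution[start - 1 + length:]
    return rest[:new_pos - 1] + fragment + rest[new_pos - 1:]


def random_mutation(solution, rng=random):
    """Mutation with a random fragment of size between 0.05M and 0.3M."""
    m = len(solution)
    lo = max(1, math.ceil(0.05 * m))   # rounding of the bounds is an assumption
    hi = max(lo, math.floor(0.3 * m))
    length = rng.randint(lo, hi)
    start = rng.randint(1, m - length + 1)
    new_pos = rng.randint(1, m - length + 1)
    return mutate(solution, start, length, new_pos)


print(mutate([1, 2, 3, 4, 5, 6, 7], start=2, length=3, new_pos=3))
# [1, 5, 2, 3, 4, 6, 7]
```

Because the operator only moves vertices, a feasible parent always yields a feasible child.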

2.7 Termination condition

For the termination condition we use the concept of idle generations. We call a generation idle if the best solution in this generation has the same length as the length of the best solution in the previous generation. In other words, if the produced generation has not improved the solution, it is idle. The heuristic stops after some idle generations are produced sequentially.

In particular, we implemented the following new condition. Let I(l) be the number of sequential idle generations with the best solution of length l. Let \(I_{cur} = I(l_{cur})\), where \(l_{cur}\) is the current best solution length. Let \(I_{max} = \max_{l > l_{cur}} I(l)\). Then our heuristic stops if \(I_{cur} \ge \max(1.5\, I_{max},\ 0.05M + 5)\). This formula means that we are ready to wait for the next improvement 1.5 times more generations than we have ever waited previously. The term 0.05M + 5 is the lower bound on the number of generations we are ready to wait for an improvement. All the coefficients used in the formula were found empirically.
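One possible way to keep track of the idle generations is sketched below. The class name `IdleTermination` and its interface are assumptions made for the illustration; the parameter `min_offset` corresponds to the constant 5 in the formula.

```python
class IdleTermination:
    """Stop when I_cur >= max(1.5 * I_max, 0.05 * M + min_offset)."""

    def __init__(self, num_clusters, min_offset=5):
        self.floor = 0.05 * num_clusters + min_offset
        self.best = None   # best solution length seen so far (l_cur)
        self.i_cur = 0     # idle generations at the current best length
        self.i_max = 0     # longest idle streak at any previous best length

    def update(self, best_length):
        """Register the best length of a new generation; True means stop."""
        if self.best is None or best_length < self.best:
            # Improvement: the streak that has just ended may define I_max.
            self.i_max = max(self.i_max, self.i_cur)
            self.best = best_length
            self.i_cur = 0
        else:
            self.i_cur += 1
        return self.i_cur >= max(1.5 * self.i_max, self.floor)
```

For M = 40 the minimum waiting time is 0.05 · 40 + 5 = 7 idle generations; if an improvement once arrived after 8 idle generations, the heuristic subsequently waits for 1.5 · 8 = 12 idle generations before stopping.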

2.8 Asymmetric instances

Our algorithm is designed to process both symmetric and asymmetric instances equally; however, some parameters should take different values for these two types of instances to achieve high efficiency. In particular, for asymmetric instances we double the size of the first generation (4M instead of 2M, see Sect. 2.2) and increase the minimum number of idle generations by 5 (i.e., \(I_{cur} \ge \max(1.5\, I_{max},\ 0.05M + 10)\)). The local improvement procedure (see below) also has some differences for symmetric and asymmetric instances.

3 Local improvement part

We use a local improvement procedure for each solution added to the current generation. The local improvement procedure runs several local search heuristics sequentially. The following local search heuristics are used in our algorithm:

  • Swaps tries to swap every non-neighboring pair of vertices. The heuristic applies all the improvements found during one cycle of swaps.

  • k-Neighbor Swap tries different permutations of every solution subsequence \((s_1 s_2 \ldots s_k)\). In particular it tries all the non-trivial permutations which are not covered by any of i-Neighbor Swap, \(i = 2, 3, \ldots , k-1\) . For each permutation the best selection of the vertices within the considered cluster subsequence is calculated. The best permutation is accepted if it improves the solution. The heuristic applies all the improvements found during one cycle.

  • 2-opt tries to replace every non-adjacent pair of edges \(s_i s_{i+1}\) and \(s_j s_{j+1}\) in the solution by the edges \(s_i s_j\) and \(s_{i+1} s_{j+1}\) if the new edges are lighter, i.e., the sum of their weights is smaller than the sum of the weights of the old edges. The heuristic applies all the improvements found.

  • Direct 2-opt is a modification of the 2-opt heuristic. Direct 2-opt selects a number of the longest edges contained in the solution and then tries all the non-adjacent pairs of the selected edges. It replaces edges \(s_i s_{i+1}\) and \(s_j s_{j+1}\) with the edges \(s_i s_j\) and \(s_{i+1} s_{j+1}\) if the new edges are shorter, i.e., the sum of their weights is smaller than the sum of the weights of the old edges. The heuristic applies all the improvements found.

  • Inserts tries to remove a vertex from the solution and to insert it in the different position. The best vertex in the inserted cluster is selected after the insertion. The insertion is accepted if it improves the solution. The heuristic tries every combination of the old and the new positions except the neighboring positions and applies all the improvements found.

  • Cluster Optimization (CO) uses the shortest (s, t)-path algorithm for acyclic digraphs (see, e.g., Bang-Jensen and Gutin 2000) to find the best vertex for each cluster when the order of clusters is fixed. This heuristic was introduced by Fischetti et al. (1997) (see its detailed description also in Fischetti et al. 2002).

    The CO Heuristic uses the fact that the shortest (s, t)-path in an acyclic digraph can be found in polynomial time. Let the given solution be represented by chromosome \((s_1 s_2 \ldots s_M).\) The algorithm builds an acyclic digraph \(G_{CO} = (V_{CO}, E_{CO})\), where \(V_{CO} = V \cup C'(s_1)\) is the set of the GTSP instance vertices extended by a copy of the cluster \(C(s_1)\) and \(E_{CO}\) is the set of edges in the digraph \(G_{CO}\). (Recall that C(x) is the cluster containing the vertex x.) An edge \(xy \in E_{CO}\) if and only if \(C(x) = C(s_i)\) and \(C(y) = C(s_{i+1})\) for some \(i < M\) or if \(C(x) = C(s_M)\) and \(C(y) = C'(s_1)\). For each vertex \(s \in C(s_1)\) and its copy \(s' \in C'(s_1)\), the algorithm finds the shortest (s, s′)-path in \(G_{CO}\). The algorithm selects the shortest path \((s p_2 p_3 {\ldots}p_M s^{\prime})\) and returns the chromosome \((s p_2 p_3 \ldots p_M)\) which is the best vertex selection within the given cluster sequence.

    Note that the algorithm’s time complexity grows linearly with the size of the cluster \(C(s_1)\). Thus, before applying the CO algorithm we rotate the initial chromosome in such a way that \(|C(s_1)| = \min_{i \le M} |C(s_i)|\).
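The CO heuristic can be sketched as a layer-by-layer shortest path computation. The code below is a simplified illustration rather than the implementation of Fischetti et al. (1997): the names `cluster_optimisation`, `clusters` (a dictionary from cluster number to its vertices) and `weight` (a function returning the edge weight) are assumptions, and at least two clusters are assumed. The returned chromosome starts at the smallest cluster, i.e., it may be a rotation of the input order.

```python
def cluster_optimisation(cluster_order, clusters, weight):
    """Best vertex selection for a fixed cyclic order of clusters.

    Returns (length, chromosome); the chromosome starts at the smallest cluster.
    """
    m = len(cluster_order)
    # Rotate the order so that the smallest cluster comes first.
    start = min(range(m), key=lambda i: len(clusters[cluster_order[i]]))
    order = cluster_order[start:] + cluster_order[:start]

    best_len, best_tour = float("inf"), None
    for s in clusters[order[0]]:
        dist = {s: 0}        # shortest path lengths from s to the current layer
        pred_layers = []     # predecessor maps, one per processed layer
        for c in order[1:]:
            new_dist, pred = {}, {}
            for v in clusters[c]:
                u = min(dist, key=lambda x: dist[x] + weight(x, v))
                new_dist[v] = dist[u] + weight(u, v)
                pred[v] = u
            pred_layers.append(pred)
            dist = new_dist
        # Close the cycle: the last edge leads to the copy s' of s.
        last = min(dist, key=lambda x: dist[x] + weight(x, s))
        total = dist[last] + weight(last, s)
        if total < best_len:
            tour = [last]
            for pred in reversed(pred_layers):
                tour.append(pred[tour[-1]])
            tour.reverse()
            best_len, best_tour = total, tour
    return best_len, best_tour


# Toy instance: vertices are points on a line, weights are distances.
x = {0: 0, 1: 10, 2: 11, 3: 12, 4: 100}
clusters = {1: [0, 1], 2: [2], 3: [3, 4]}
print(cluster_optimisation([1, 2, 3], clusters, lambda u, v: abs(x[u] - x[v])))
# (4, [2, 3, 1])
```

The outer loop runs once per vertex of the first cluster, which is why rotating the smallest cluster to the front reduces the running time.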

For each local search algorithm with some cluster optimization embedded, i.e., for k-Neighbour Swap and Inserts, we use a speed-up heuristic. We calculate a lower bound \(l_{new}\) of the new solution length and compare it with the previous length \(l_{prev}\) before optimizing the vertices within the clusters. If \(l_{new} \ge l_{prev}\), the solution modification is rejected immediately. To calculate this lower bound, we assume that the unknown edges, i.e., the edges incident to the vertices that are yet to be optimized, have the length of the shortest edge between the corresponding clusters.

Some of these heuristics form a heuristic-vector \({\mathcal{H}}\) as follows:

| Symmetric instances | Asymmetric instances |
|---|---|
| Inserts | Swaps |
| Direct 2-opt for M/4 longest edges | Inserts |
| 2-opt | Direct 2-opt for M/4 longest edges |
| 2-Neighbour Swap | 2-opt |
| 3-Neighbour Swap | 2-Neighbour Swap |
| 4-Neighbour Swap | 3-Neighbour Swap |

The improvement procedure applies all the local search heuristics from \({\mathcal{H}}\) cyclically. Once some heuristic fails to improve the tour, it is excluded from \({\mathcal{H}}.\) If the 2-opt heuristic fails, we also exclude Direct 2-opt from \({\mathcal{H}}.\) Once \({\mathcal{H}}\) is empty, the CO heuristic is applied to the solution and the improvement procedure stops.
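The control flow of the improvement procedure can be sketched as follows. The function `improve` and its arguments are hypothetical: `heuristics` is an ordered list of (name, function) pairs with unique names, each function returns a (possibly unchanged) solution, `length` evaluates a solution and `cluster_optimisation` stands for the CO heuristic.

```python
def improve(solution, heuristics, cluster_optimisation, length):
    """Apply the heuristic-vector cyclically, dropping heuristics that fail."""
    excluded = set()
    while any(name not in excluded for name, _ in heuristics):
        for name, heuristic in heuristics:
            if name in excluded:
                continue
            candidate = heuristic(solution)
            if length(candidate) < length(solution):
                solution = candidate
            else:
                # The heuristic failed to improve the tour: exclude it.
                excluded.add(name)
                if name == "2-opt":
                    # A failure of 2-opt also rules out Direct 2-opt.
                    excluded.add("Direct 2-opt")
    # The vector is empty: finish with cluster optimization.
    return cluster_optimisation(solution)
```

The order of the pairs in `heuristics` would follow the heuristic-vector given above for the corresponding instance type.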

4 Results of computational experiments

We tested our heuristic using GTSP instances which were generated from some TSPLIB (Reinelt 1991) instances by applying the standard clustering procedure of Fischetti et al. (1997). Note that our heuristic is designed for medium and large instances and, thus, we selected all the instances with 40 to 217 clusters. Unlike Silberholz and Golden (2007), Snyder and Daskin (2006) and Tasgetiren et al. (2007), smaller instances are not considered.

All the information necessary for reproducing our experiments is available online at http://www.cs.rhul.ac.uk/Research/ToC/publications/Karapetyan:

  • All the instances considered in our experiments. For the purpose of simplicity and efficiency we use a uniform binary format for instances of all types.

  • The binary format definition.

  • Source codes of binary format reading and writing procedures.

  • Source codes of the clustering procedure (Fischetti et al. 1997) to convert TSP instances into GTSP instances.

  • Source codes of the TSPLIB files reading procedure.

  • Source codes of our memetic algorithm.

  • Source codes of our experimentation engine.

The tables below show the experiments results. We compare the following heuristics:

  • GK is the heuristic presented in this paper.

  • SG is the heuristic by Silberholz and Golden (2007).

  • SD is the heuristic by Snyder and Daskin (2006).

  • TSP is the heuristic by Tasgetiren et al. (2007).

The results for GK and SD were obtained in our own experiments. Other results are taken from the corresponding papers. Each test of GK and SD includes ten algorithm runs. The results for SG and TSP were produced after five runs.

To compare the running times of all the considered heuristics we need to convert the running times of SG and TSP obtained from the corresponding papers to the running times on our evaluation platform. Let us assume that the running time of some Java-implemented algorithm on the SG evaluation platform is \(t_{SG} = k_{SG} \cdot t_{GK}\), where \(k_{SG}\) is some constant and \(t_{GK}\) is the running time of the same algorithm implemented in C++ on our evaluation platform. Let us assume that the running time of some algorithm on the TSP evaluation platform is \(t_{TSP} = k_{TSP} \cdot t_{GK}\), where \(k_{TSP}\) is some constant and \(t_{GK}\) is the running time of the same algorithm on our evaluation platform.

The computer used for GK and SD evaluation has an AMD Athlon 64 X2 3.0 GHz processor. The computer used for SG has an Intel Pentium 4 3.0 GHz processor. The computer used for TSP has an Intel Centrino Duo 1.83 GHz processor. Heuristics GK, SD, and TSP are implemented in C++ (GK is implemented in C#, but the most time-critical fragments are implemented in C++). Heuristic SG is implemented in Java. A rough estimate of Java performance in combinatorial optimisation applications shows that a C++ implementation could be approximately two times faster than the Java implementation. As a result, the adjusting coefficient \(k_{SG} \approx 3\) and the adjusting coefficient \(k_{TSP} \approx 2\).

We are able to compare the results of SD heuristic tests gathered from different papers to check the \(k_{SG}\) and \(k_{TSP}\) values because SD has been evaluated on each of the platforms of interest (the heuristic was implemented in Java in Silberholz and Golden (2007) for the exact comparison to SG). The ratio between the SD running times from Silberholz and Golden (2007) and our own results varies significantly for different problems, but for some medium-sized problems the ratio is about 2.5–3. These results correlate well with the previous estimate. The suggested value \(k_{TSP} \approx 2\) is also confirmed by this method.

The headers of the tables in this section are as follows:

  • Name is the instance name. The prefix number is the number of clusters in the instance; the suffix number is the number of vertices.

  • Error (%) is the error, in per cent, of the average solution above the optimal value. The error is calculated as \(\frac{value - opt} {opt} \times 100 \% \) , where value is the obtained solution length and opt is the optimal solution length. The exact optimal solutions are known from Ben-Arieh et al. (2003) and from Fischetti et al. (1997) for 17 of the considered instances only. For the rest of the problems we use the best solutions ever obtained in our experiments instead.

  • Time (s) is the average running time for the considered heuristic in seconds. The running times for SG and for TSP are obtained from the corresponding papers thus these values should be adjusted using k SG and k TSP coefficients, respectively, before the comparison.

  • Quality impr. (%) is the improvement of the average solution quality of GK with respect to some other heuristic. The improvement is calculated as \(E_H - E_{GK}\), where \(E_H\) is the average error of the considered heuristic H and \(E_{GK}\) is the average error of our heuristic.

  • Time impr. (%) is the improvement of the GK average running time with respect to some other heuristic running time. The improvement is calculated as T H /T GK where T H is the average running time of the considered heuristic H and T GK is the average running time of our heuristic.

  • Opt. (%) is the number of tests, in per cent, in which the optimal solution was reached. The value is displayed for three heuristics only as we do not have it for SG .

  • Opt. is the best known solution length. The exact optimal solutions are known from Fischetti et al. (1997) and Ben-Arieh et al. (2003) for 17 of the considered instances only. For the rest of the problems we use the best solutions ever obtained in our experiments.

  • Value is the average solution length.

  • # gen. is the average number of generations produced by the heuristic.
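As a small worked example of the reported measures, the sketch below computes the error in per cent and a platform-adjusted time improvement. The function names are hypothetical; the adjustment divides the raw ratio \(T_H / T_{GK}\) by the platform coefficient k, which is our reading of how \(k_{SG}\) and \(k_{TSP}\) are meant to be applied.

```python
def error_percent(value, opt):
    """Error of a solution length above the optimal (or best known) length."""
    return (value - opt) / opt * 100


def time_improvement(t_h, t_gk, k=1.0):
    """Ratio T_H / T_GK, divided by the platform coefficient k (k = 1: same platform)."""
    return t_h / (k * t_gk)


# Solution lengths quoted in the text for the 217vm1084 instance.
print(round(error_percent(130994, 130704), 2))  # 0.22
print(round(error_percent(131845, 130704), 2))  # 0.87
# A raw ratio of 12 measured against the SG platform (k_SG = 3) becomes 4.
print(time_improvement(12.0, 1.0, k=3))         # 4.0
```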

The results of the experiments presented in Table 1 show that our heuristic (GK) has clearly outperformed all other heuristics with respect to solution quality. For each of the considered instances the average solution reached by our heuristic is never worse than the average solution reached by any other heuristic, and the percentage of the runs in which the optimal solution was reached is not less than for any other considered heuristic (note that we are not able to compare our heuristic with SG with respect to this value).

Table 1 Solvers quality comparison

The average values are calculated for four instance sets (IS). The Full IS includes all the instances considered in this paper, both symmetric and asymmetric. The Sym. IS includes all the symmetric instances considered in this paper. The SG IS includes all the instances considered in both this paper and Silberholz and Golden (2007). The TSP IS includes all the instances considered in both this paper and Tasgetiren et al. (2007).

One can see that the average quality of our GK heuristic is approximately 10 times better than that of the SG heuristic and approximately 30 times better than that of SD; for the TSP IS our heuristic reaches the optimal solution in each run and for each instance, in contrast to TSP, which has a 0.44% average error. The maximum error of GK is 0.27% while the maximum error of SG is 2.25% and the maximum error of SD is 3.84%.

The running times of the considered heuristics are presented in Table 2. The running time of GK is not worse than the running time of any other heuristic for every instance: the minimum time improvement with respect to SG is 6.6, which is greater than 3 (recall that 3 is the adjusting coefficient for the SG evaluation platform, see above), the time improvement with respect to SD is never less than 1.0 (recall that both heuristics were tested on the same platform), and the minimum time improvement with respect to TSP is 4.6, which is greater than 2 (recall that 2 is the adjusting coefficient for the TSP evaluation platform, see above). The average time improvement is ∼12 times for SG (or ∼4 times if we take into account the platform difference), ∼3 times for SD, and ∼11 times for TSP (or ∼5 times if we take into account the platform difference).

Table 2 Solvers running time comparison

The stability of GK is high, e.g., for the 89pcb442 instance it produces only exact solutions and the time standard deviation is 0.27 s for 100 runs. The minimum running time is 1.29 s, the maximum is 2.45 s, and the average is 1.88 s. For 100 runs of 217vm1084 the average running time is 65.32 s, the minimum is 44.30 s, the maximum is 99.54 s, and the standard deviation is 13.57 s. The average solution is 130994 (0.22% above the best known), the minimum is 130704 (exactly the best known), the maximum is 131845 (0.87% above best known), and the standard deviation is 331.

Some details on the GK experiments are presented in Table 3. The table includes the average number of generations produced by the heuristic. One can see that the number of generations produced by our heuristic is relatively small: SD and TSP limit the number of generations to 100 while they consider instances with M < 90 only; SG terminates the algorithm after 150 idle generations. Our heuristic does not require many generations because of the powerful local search procedure and large population sizes.

Table 3 GK experiments details

5 Conclusion

We have developed a new memetic algorithm for GTSP that dominates all known GTSP heuristics with respect to both solution quality and the running time. Unlike other memetic algorithms introduced in the literature, our heuristic is able to solve both symmetric and asymmetric instances of GTSP. The improvement is achieved due to the powerful local search, well-fitted genetic operators and new efficient termination condition.

Our local search (LS) procedure consists of several LS heuristics of different power and type. Due to their diversity, our algorithm is capable of successfully solving various instances. Our LS heuristics are either known variations of GTSP heuristics from the literature (2-opt, Inserts, Cluster Optimization) or new ones inspired by the appropriate TSP heuristics (Swaps, k-Neighbour Swap, Direct 2-opt). Note that our computational experiments demonstrated that the order in which the LS heuristics are used is important. Further research may find better LS algorithms, including more sophisticated ones based on, e.g., Tabu search or Simulated Annealing.

While crossover operator used in our algorithm is the same as in Silberholz and Golden (2007), the mutation operator is new. The termination condition is also new. The choices of the operators and the termination condition influence significantly the performance of the algorithm.