1 Introduction

The Traveling Salesman Problem (TSP) is a well-known \(\texttt {NP}\)-hard problem in network optimization (Applegate et al. 2006; Dantzig et al. 1954; Flood 1956; Lawler et al. 1985). It consists of determining a minimum cost Hamiltonian path, visiting all customers only once, and returning to the starting point. The TSP arises in many applied settings, such as drilling of printed circuit boards (Grotschel et al. 1991), the order-picking problem in depots (Ratliff and Rosenthal 1983), the school bus routing problem (Angel et al. 1972), the defender–attacker–defender problem (Lozano et al. 2017), and the maritime logistics (Malaguti et al. 2018).

The TSP and most of its variations are oriented by costs and, in the literature, we hardly see studies considering different demands. One variation that does consider different demands is the Multicommodity Traveling Salesman Problem (MTSP) (Sarubbi 2003; Sarubbi and Luna 2007; Sarubbi et al. 2007). In the MTSP, product types and quantities that pass through a route connecting two customers are considered in the total cost. In practice, customers with larger demands or more precious or high-risk products must be delivered with a higher priority. For example, sensitive materials may require special transport structure, perishable products must be refrigerated, both leading to higher transportation costs. The authors (Sarubbi 2003) consider variable costs for each product type in each route between two customers and propose a network flow model that is solved by a Lagrangian relaxation and a branch-and-cut method (Sarubbi 2008; Sarubbi et al. 2008).

Another variation of the TSP considers priority prizes for customers along the route. These prizes are collected by the salesman according to the order each customer is visited. In this variation, the quality of customer service and delivery priorities are considered because customer preferences in terms of delivery sequence order must be quantified and optimized. Pureza et al. (2018) modeled this problem as a special case of a quadratic assignment problem (Koopmans and Beckmann 1957) called the Travelling Salesman Problem with Priority Prizes (TSPPP) and they solved it by a Tabu Search metaheuristic (Glover 1986). Silva (2017); Silva et al. (2018) used the TSPPP and the Profitable Tour Problem (PTP) (Archetti et al. 2014; Balas 1987) to formulate the problem of elaborating touristic itineraries considering variables such as the profile of the visitors, the planning of the trip, visitors’ preferences and touristic attractions. The authors considered a case study in the City of Belém, in the State of Pará, Brazil, solving the problem by a Tabu Search metaheuristic.

Besides these situations, logistic companies face conflict between operational costs, customers with different categories of products, and customer satisfaction, which is directly related to delivery time. It then becomes a challenge to choose between minimizing travel costs and ensuring a degree of service quality for all customers. To obtain solutions that balance operating costs and customer satisfaction, we propose a new model that combines the TSPPP with MTSP.

In this paper, we look at the TSP from both the customer’s and the salesman’s points of view. We consider the objective of minimizing total costs, while satisfying customers’ preference, by maximizing the priority prizes. Our model is based on the assignment problem and a network flow problem. Commodities flowing in the network represent the products. We denote this problem as Multicommodity Traveling Salesman Problem with Priority Prizes (MTSPPP).

In this study, we used two versions of the Biased Random-Key Genetic Algorithm (BRKGA) (Gonçalves and Resende 2011) to solve medium and large instances of the MTSPPP. A local search based on Iterated Local Search (ILS) (Lourenço et al. 2003) and Variable Neighborhood Descent (VND) (Hansen et al. 2010) were also used to improve the solutions found by BRKGA.

We generated some instances for the MTSPPP based on classical instances of TSP and we performed computational tests with the model and the metaheuristics. The model was able to solve small instances and the proposed metaheuristics efficiently solved medium and large instances. To the best of our knowledge, the MTSPPP has not been addressed before in the literature.

This article is organized as follows. A mathematical model of MTSPPP is presented in Sect. 2. In Sect. 3, the BRKGA+CS and A-BRKGA+CS algorithms are briefly introduced and in Sect. 4, a framework of BRKGA+CS and A-BRKGA+CS are presented in detail to solve the MTSPPP. The computational results for the proposed model and methods are presented and analyzed in Sect. 5. Finally, in Sect. 6, some conclusions are presented.

2 The multicommodity traveling salesman problem with priority prizes

Let N be the number of customers to be visited including the depot. Consider a graph G(VA), where V is a set composed by N nodes, numbered 1 to N, and A is a set of arcs. For convenience, node 1 is the depot and the other nodes represent the customers to be visited. The set of arcs A in this graph represent the paths between customers. For simplicity, we consider that product l, a commodity, will be delivered to customer l. We also assume that each customer can order only one product type, i.e., the same node (customer) cannot order different products. Denote \(c_{ij}\) as the fixed cost in the arc (ij), \(q_l\) as the demand required by customer l, \(d_{ijl}\) as the variable cost to pass product l through arc (i,j), and \(p_{ki}\) the prize value collected when customer i is visited in the kth order. We assume, without loss of generality, that \(q_l > 0\) and integer for all l.

Let \(x_{ki}\) be a binary decision variable equal to 1, if customer i is visited in the kth order, and 0, otherwise. Similarly, let \(y_{ij}\) be a binary decision variable equal to 1, if customer j is visited immediately after customer i, and 0, otherwise. Let \(f_{ijl}\) be a real decision variable corresponding to the flow quantity of the product l that is transported from customer i to customer j. Observe that in an optimal solution, due to constraints (7), (8), and (9) of the model, the values of the \(f_{ijl}\) variables will be integer if \(q_l\) is integer for all l. So, a mathematical formulation for the problem MTSPPP is the following:

$$\begin{aligned}&\displaystyle \text{ Max } \text{ Z } \text{= } \displaystyle \sum _{k=2}^{N}\displaystyle \sum _{i=2}^{N}p_{ki}x_{ki} - \displaystyle \sum _{i=1}^{N}\displaystyle \sum _{{\mathop {i \ne j}\limits ^{j=1}}}^{N}\left( c_{ij}y_{ij} + \displaystyle \sum _{l=2}^{N}d_{ijl}f_{ijl}\right)&\end{aligned}$$
(1)
$$\begin{aligned}&\text{ subject } \text{ to: }&\nonumber \\&\displaystyle \sum _{i=1}^{N}x_{ki}=1, \quad k=1,2,\ldots ,N \end{aligned}$$
(2)
$$\begin{aligned}&\displaystyle \sum _{k=1}^{N}x_{ki}=1, \quad i=1,2,\ldots ,N \end{aligned}$$
(3)
$$\begin{aligned}&\displaystyle \sum _{j=1}^{N} y_{ij}=1, \quad i=1,2,\ldots ,N \end{aligned}$$
(4)
$$\begin{aligned}&\displaystyle \sum _{h=1}^{N} y_{hi}=1, \quad i=1,2,\ldots ,N \end{aligned}$$
(5)
$$\begin{aligned}&x_{(k-1)h} + x_{ki} - y_{hi} \le 1, \quad h\ne i=1,2,\ldots ,N, \ k=2,3,\ldots ,N \end{aligned}$$
(6)
$$\begin{aligned}&\displaystyle \sum _{j=2}^{N} f_{1jl} = q_{l}, \quad l=2,3,\ldots ,N \end{aligned}$$
(7)
$$\begin{aligned}&\displaystyle \sum _{{\mathop {i \ne l}\limits ^{i=1}}}^{N} f_{ill} = q_{l}, \quad l=2,3,\ldots ,N \end{aligned}$$
(8)
$$\begin{aligned}&\displaystyle \sum _{{\mathop {i \ne l, h \ne i}\limits ^{h=1}}}^{N} f_{hil} - \displaystyle \sum _{{\mathop {i \ne l, i \ne j}\limits ^{j=1}}}^{N}f_{ijl} = 0, \quad l,i=2,3,\ldots ,N \end{aligned}$$
(9)
$$\begin{aligned}&f_{ijl} \le q_{l} y_{ij}, \quad i\ne j=1,2,\ldots ,N, \ l=2,3,\ldots ,N \end{aligned}$$
(10)
$$\begin{aligned}&f_{ijl} \ge 0, \quad i,j=1,2,\ldots ,N, \ l=2,3,\ldots ,N, \end{aligned}$$
(11)
$$\begin{aligned}&x_{ki}, y_{ij} \in \{0,1\}, \quad k,i,j=1,2,\ldots ,N, i \ne j. \end{aligned}$$
(12)

The objective function (1) maximizes the priority prizes while it minimizes the fixed total costs and variable costs. Constraints (2) and (3) impose the constraint that each customer must be visited only once. Constraints (4) and (5) are assignment constraints. Constraints (6) links the variables x and y. It ensures that if customer h is visited in the \((k-1)\)th position, and customer i is visited in the kth position, then the arc (h,i) will be used.

Constraints (7) ensures that all products will leave the depot with their respective demands, and constraints (8) guarantees that all products will reach their destinations, taking into account the demand required. Constraints (9) enforces the conservation of flow at any node that is not the final destination for the products. Constraints (10) links the variables f and y. It ensures that no flow is allowed on an arc (i, j) unless it is used. Finally, constraints (11) and (12) define the domain of the decision variables. We set the first position in the route at 1, i.e., \(x_{11}=1\), since, for convenience, we set node 1 as the depot.

The subtour elimination constraints are implied by constraints (4), (5), (7)–(12). Node 1 is the supply node for all products l, \(l = 2,3,\ldots ,N\). Therefore, in a feasible solution, there must be a path from node 1 to node l, \(l = 2,3,\ldots ,N\), where the quantity \(q_l\) of product l has to flow to satisfy the demand of node l, \(l = 2,3,\ldots ,N\), respectively. At each node in this path, we must have the conservation of flow, imposed by constraints (9). According to (10) and (12), the corresponding \(y_{ij}\) of the arcs in these paths must have value 1. Assume there is a subtour in the solution. There are two possibilities: (i) node 1 does not belong to this subtour; (ii) node 1 belongs to this subtour. Case (i) cannot happen for there is no path from node 1 to any of the nodes belonging to the subtour. Therefore, the demands of these nodes cannot be satisfied. In case (ii), there is at least one node, say t, that does not belong to the subtour. Since there must be a path from node 1 to node t to satisfy the demand of this node, then we have two possibilities: this path from node 1 to node t contains only nodes not belonging to the subtour or, this path contains other nodes of the subtour before reaching node t. The first case cannot happen because of constraint (4) that imposes that there is only a single arc with flow leaving node 1. Hence, we cannot have two arcs with positive flows leaving node 1, one to supply the demand of node t and the other to supply the demands of the nodes belonging to the subtour. The second case also cannot happen because of constraint (4). Let us consider the node, say s, belonging to the subtour, where the flow to node t leaves the subtour. Again, we cannot have two arcs with positive flows leaving node s, one to supply the demand of node t and the other to supply the demands of the nodes belonging to the subtour.

Computational experiments were performed with this model, using CPLEX 12.6.3 and are presented in Sect. 5. The model was able to solve instances with at most 28 customers in 10800 seconds. For medium and large instances, the solver was not able to find a feasible solution within the time limit set.

3 BRKGA

The Biased Random-Key Genetic Algorithm (BRKGA) was proposed by Gonçalves and Resende (2011). The method consists of a specialization of the RKGA (Random-Key Genetic Algorithm) (Bean 1994), an algorithm that evolves a population of random keys, where each vector of n random keys represents a solution of a combinatorial optimization problem. A random key is a real number randomly generated in the continuous interval [0, 1). A vector of random keys is decoded into a feasible solution and an objective value (or fitness) through a deterministic algorithm called a decoder, which is dependent on the problem to be solved.

A population of p individuals is made up. At each generation, the population is divided into two sets, according to the fitness of each individual. The \(p_\mathrm{e}\) best fitness vectors are identified as an elite set and the rest of the population as a non-elite set. All elite vectors are copied, without change, to the next generation and \(p_\mathrm{m}\) mutants are introduced. A mutant is a vector of random keys generated in the same way as all individuals of the initial population. The \(p-p_\mathrm{e}-p_\mathrm{m}\) remaining individuals are generated by combining pairs of randomly chosen solutions an elite set and another non-elite set (Spears and Jong 1991).

In Fig. 1, we illustrate a flow diagram of the BRKGA. In this diagram, the gray rectangle represents the decoder function. This is the single component of a BRKGA that depends on the problem being solved.

Fig. 1
figure 1

Adapted from Chaves et al. (2016, 2018)

The BRKGA framework

3.1 Adaptive BRKGA (A-BRKGA)

The Adaptive Biased Random-Key Genetic Algorithm (A-BRKGA) is a recent metaheuristic proposed by Chaves et al. (2018). This metaheuristic tunes parameters on-line so that the processes of intensification and diversification are balanced. In Fig. 2, we illustrate the evolutionary process of the A-BRKGA.

Fig. 2
figure 2

Adapted from Chaves et al. (2016, 2018)

The Adaptive BRKGA framework

In A-BRKGA, a solution of a combinatorial optimization problem is also represented by a vector of n random-keys plus two extra random-keys in positions \(n+1\) and \(n+2\). They are used to compute the self-adaptive parameters.

If we denote \(p_{max}\) and \(p_{min}\) as the maximum and the minimum sizes of the population, respectively, the maximum number of generations (\(max_{gen}\)), the stop criterion, is given by the equation:

$$\begin{aligned} max_{gen}=\gamma ^{\left( \log _{\gamma }^{p_{min}}-p_{max}\right) }. \end{aligned}$$
(13)

Equation (13) is motivated by the expression used in Simulated Annealing (Ingber 1993) to calculate the maximum number of required iterations for a given temperature. The parameters \(p_{max}\) and \(p_{min}\) are set at 1000 and 100, respectively. These values are recommended in the literature (Gonçalves and Resende 2016; Prasetyo et al. 2015). The parameter \(\gamma \) is selected from the set {0.999, 0.998, 0.997} and depends on the available running time.

At each generation, the population is divided into two groups: the first group (RCL, restricted chromosome list) consists of the best individuals evaluated by fitness and the second group composed of the individuals of the population not in RCL. The RCL is composed of individuals whose fitness is less than a bound \(f_\mathrm{e}\) (minimization problem), given by a convex combination of the fitness values of the best (\(f_{min}\)) and the worst (\(f_{max}\)) individuals in the current population:

$$\begin{aligned} f_\mathrm{e} = \alpha f_{max} + (1-\alpha ) f_{min}, \end{aligned}$$
(14)

with \(\alpha \in [0,1]\). The number of individuals in the RCL is limited by \(p_\mathrm{e}=p * \kappa _e\), where \(\kappa _e\) is given by Eq. (15). In Eq. (15), \(g_k\) is the number of the current generation. According to the literature, the range of the number of elite individuals should vary between 10 and 25% of the population (Gonçalves and Resende 2016; Prasetyo et al. 2015). The number of elite individuals is lower in the initial generations when the average quality of the individuals is low and this number increases in the subsequent generations.

$$\begin{aligned} \kappa _e = 0.10 + \displaystyle \frac{g_k}{max_{gen}}*(0.25 - 0.10). \end{aligned}$$
(15)

The parameter \(\alpha \) in (14) is modified at each iteration according to Eq. (16). It starts with the value 0.3 and ends up with the value 0.1. \(f_\mathrm{e}\) decreases during the evolution of the population because of the tendency of flatness of the individuals fitness after a certain number of generations.

$$\begin{aligned} \alpha = 0.10 + \left( 1 - \displaystyle \frac{g_k}{max_{gen}}\right) *(0.30 - 0.10). \end{aligned}$$
(16)

The individuals in the RCL with the same fitness are perturbed by an adaptive parameter \(\beta \), that is the probability of randomly swapping two random key values, and that is updated according to Eq. (17), where \(x_{n+2}\) is the random-key in position \(n+2\) of the individual x in the current generation. This perturbation maintains the diversity of the population and avoids premature convergence.

$$\begin{aligned} \beta = 0.001 + x_{n+2}*(0.10 - 0.001). \end{aligned}$$
(17)

All RCL individuals in the population of generation k are copied to the population of generation \(k+1\). At each generation \(p_\mathrm{m}=p * \kappa _m\) individuals are generated as in the initial population and inserted to the population of generation \(k+1\), where \(\kappa _m\) is given by Eq. (18). These individuals are named mutants. Based on the literature, the number of mutants should range between 5 and 20% of the population (Gonçalves and Resende 2016; Prasetyo et al. 2015). As the population evolves, the number of mutants decreases aiming at a greater intensification in the search for the solution.

$$\begin{aligned} \kappa _m = 0.05 + \left( 1 - \displaystyle \frac{g_k}{max_{gen}}\right) *(0.20 - 0.05). \end{aligned}$$
(18)

To complete the population, \(p-|\mathrm{RCL}|-p_\mathrm{m}\) additional individuals are needed. These individuals are obtained by combining pairs of individuals of the current population, using the parameterized uniform crossover (Spears and Jong 1991). A number \(r_j \in [0,1]\) is generated for each individual gene. If \(r_j \le \rho _e\), then this individual gene inherits the allele of the RCL parent. Otherwise, it inherits the allele of the other parent. The parameter \(\rho _e\) is self-adaptive and is given by equation:

$$\begin{aligned} \rho _e = 0.65 + y_{n+1}*(0.80 - 0.65), \end{aligned}$$
(19)

where \(y_{n+1}\) is the random-key in position \(n+1\) of the individual y that is not in RCL. The probability of accepting an elite allele is always between 65 and 80% as recommended in the literature (Chaves et al. 2018).

All ranges of the parameters used in the A-BRKGA were based on studies in the literature and on empirical tests. They were defined aiming the development of an algorithm that easy reuse code and is efficient to solve different combinatorial optimization problems. In Sect. 3.2, we present a local search heuristic used with BRKGA and A-BRKGA.

3.2 Local search

Chaves et al. (2018) showed that the use of local search heuristics can improve the performance of BRKGA and A-BRKGA. However, to avoid a significant increase in running time these heuristics are applied only to regions considered promising by the algorithm. To find these promising regions, the Label Propagation (LP) method (Raghavan et al. 2007) is used to identify communities with a great number of similar individuals within the RCL. As RCL contains the best individuals evaluated by fitness, these communities are considered promising regions.

The similarity level, r, between two individuals x and y is calculated by the Pearson correlation coefficient (Rodgers and Nicewander 1988) and is given by Eq. (20), where x and y are the random-keys vectors, and \({\overline{x}}\) and \({\overline{y}}\) are the arithmetic means of x and y, respectively. A adaptive parameter \(\sigma \) is defined by Eq. (21) and used to represent the similarity degree during the search process. If \(r \ge \sigma \) then the individuals x e y are similar. The parameter \(\sigma \) starts in 0.3 and increases during the population evolution, because the diversity decreases throughout the evolution process.

$$\begin{aligned} r= & {} \displaystyle \frac{\sum _{i=1}^{n}(x_i - {\overline{x}})(y_i - {\overline{y}})}{\sqrt{ \sum _{i=1}^{n}(x_i - {\overline{x}})^2 \sum _{i=1}^{n}(y_i - {\overline{y}})^2 }}. \end{aligned}$$
(20)
$$\begin{aligned} \sigma= & {} 0.30 + \displaystyle \frac{g_k}{max_{gen}} *(0.70 - 0.30). \end{aligned}$$
(21)

Initially, we build a graph G(VA), where each individual of the RCL is represented by a node \(v \in V\). There is an edge \(a_{uv} \in A\) only if the individuals u and v have a similarity level greater than \(\sigma \). Then to each node we assign a different number, named label. For each node v of the graph, LP searches the label with the highest frequency among its neighbors. If ties occur one of them is chosen at random. The label of v is then replaced by the label of this neighbor with the highest frequency. The algorithm ends when the label of each node in the graph is the more common of its neighborhood. Finally, all nodes with the same label are grouped into a community (cluster). Figure 3 illustrates an example of the LP applied to a RCL with 9 individuals. Every edge in this graph is incident to nodes having similar individuals. Starting with graph of Fig. 3a, LP ends up with the graph of Fig. 3d. The way the labels of the nodes are updated is shown one at a time. At the end, two clusters with labels 2 and 6 are obtained.

Fig. 3
figure 3

Example of an iteration of the Label Propagation algorithm

The individual in a cluster with the best fitness value and that was not yet explored by the specific local search is chosen as the representative solution of this cluster. This individual is decoded into a solution of the problem and the local search is applied to this solution. The local optimal solution found does not return to the current population since this would require a re-decoding process. If this solution is better than the best-known solution so far, then the current best solution is updated. Since this solution is not returned to the current population, it does not interfere in the evolutionary process of the BRKGA or the A-BRKGA.

The BRKGA and A-BRKGA methods with this local search heuristics are named BRKGA+CS and A-BRKGA+CS, respectively (Chaves et al. 2016, 2018).

4 BRKGA+CS and A-BRKGA+CS for the MTSPPP

In this paper, we encode the solution of the MTSPPP as a vector S with n random-keys in the BRKGA+CS, and a vector of \(n+2\) random-keys in the A-BRKGA+CS, where n is the number of customers to be visited plus the depot, and the positions \(n+1\) and \(n+2\) are used to calculate the value of the parameters \(\rho _e\) and \(\beta \), respectively. The decoding of S in a solution \(S'\) of MTSPPP is accomplished by sorting the customers in ascending order of their corresponding random-key values. The first position in S is set at 1, representing the depot, whose salesman starts and ends his trip there. In Fig. 4, we show an example of the decoding process for the MTSPPP with six customers, where the gray rectangle represents the depot with its respective random-key value.

The fitness of Solution \(S'\) is calculated by the corresponding value of the objective function (1).

Fig. 4
figure 4

Example of an encoded solution

4.1 Local search for MTSPPP

The local search heuristic used is an Iterated Local Search (ILS) algorithm (Lourenço et al. 2003). ILS combines a local search phase with a perturbation phase. Specifically, the Variable Neighborhood Descent (VND) (Hansen et al. 2010) was used in the local search phase of ILS and Swap(1,1) was used in the perturbation phase.

Our VND used three neighborhood structures:

  1. 1.

    \(N^{(1)}\)—2-Opt: reverse the order of the visitation of customers between positions i and j, inclusive, in Solution \(S'\);

  2. 2.

    \(N^{(2)}\)—Shift(1): transfer customer i from its current position to position k, in Solution \(S'\);

  3. 3.

    \(N^{(3)}\)—Swap(1,1): change the visitation position of customer i with the visitation position of customer j in Solution \(S'\).

Examples of these neighborhoods are shown in Fig. 5. In (a) we present an example of a route with 5 customers plus the depot whose order is 1–2–3–4–5–6. In (b) we present the route 1–5–4–3–2–6 obtained by applying 2-Opt to the positions 2 and 5. In (c) we present the route 1–3–4–5–2–6 obtained by applying Shift(1) to customer 2 to position 5. In (d) we present the route 1–5–3–4–2–6 obtained by applying Swap(1,1) to customers 2 and 5.

Fig. 5
figure 5

Examples of the three neighborhood structures used in the MTSPPP. In a an example of a customer route. In b route obtained by 2-Opt. In c route obtained by Shift(1). In d route obtained by Swap(1,1)

Let \(S'\) be a decoded representative solution, i.e., a solution provided by the clustering process and decoded by the decoder function. For each of these solutions, we apply Algorithms 1 and 2, which are the ILS and VND procedures, respectively.

figure a
figure b

In Algorithm 1, the initial solution \((s_0)\) is a decoded solution (\(S'\)) in a promising region determined as described in subsection 3.2. VND is applied to obtain a better solution that will be the current solution, s, (line 2). We set to 3 the number of iterations of the algorithms (line 4) because of the available running time. In lines 5–10, we apply the local search phase; in line 7 we realize a perturbation in the current solution by Swap(1,1) before applying VND. If a better solution is found then the current solution is updated (lines 9 and 10). This is recursively applied until the maximum number of iterations is reached.

In Algorithm 2, the initial solution \(s_0\) is obtained by the ILS procedure (Algorithm 1, lines 2 and 8). The number of neighborhood structures \(n_r\) was set to 3, which are 2-opt, Shift(1) and Swap(1,1). A search is performed in all neighborhoods until the best solution for all structures is obtained (lines 5–11). In each neighborhood structure k (line 5) we find the best neighbor (line 6). Whenever a better solution is found, we return to the first neighborhood structure (line 9), ensuring that the final solution obtained is the best for all neighborhood structures. When a better solution is not found using the kth neighborhood structure (lines 10 and 11) then the structure changes to \(k+1\).

5 Computational results

The BRKGA+CS and A-BRKGA+CS were coded in C++ and the computational tests were carried out on an Intel Core i7 3.6 GHz processor with 16 GB of RAM in a Windows 10 environment. The solver ILOG CPLEX 12.6.3.0 (ILOG 2006) was used to obtain a solution using the model MTSPPP. The two versions of BRKGA were run using 20 different seeds.

The parameter tuning of BRKGA+CS is realized by the Relative Deviation Index (RDI) (Kim 1993). The RDI\(_i\) of a solution i with objective function value \(S_i\) is defined as

$$\begin{aligned} \mathrm{RDI}_i = \frac{S_\mathrm{B} - S_i}{S_\mathrm{B} - S_\mathrm{W}}, \end{aligned}$$
(22)

where \(S_\mathrm{B}\) is the best value obtained by the metaheuristics and \(S_\mathrm{W}\) is the worst value obtained by the metaheuristics. Different combinations of parameters were tested on a subset of instances and the configuration with the best RDI was chosen for the computational tests of the BRKGA+CS.

The parameter’s sets used in the tuning phase are shown in Table 1. The parameters found are given in Table 2 together with the range of the parameters used in A-BRKGA+CS. The settings of BRKGA+CS were kept constant for all instances whereas the parameters of A-BRKGA+CS varied within those intervals according to the theory presented in Sect. 3.1.

Table 1 Parameter values used in tuning process of BRKGA+CS
Table 2 Parameters used in the BRKGA+CS and the A-BRKGA+CS

The instances used in the tests were generated from the TSPLib library (2019) and were based on Sarubbi (2008). Each customer had an integer demand that ranged between 1 and \(\tau \) (maximum demand), in which \(\tau \) varied between 5 and 20 units. The depot, or initial node, always had a demand of 1 unit. The demands could be any positive real number. In this case, the value of the variables \(f_{ijl}\) in an optimal solution will be a real number. The product \(\gamma _l * c_{ij}\) represents the cost of passing the product l through arc (ij). The better the conditions for a arc (ij) are, the lower the value of the corresponding parameter. This parameter \(\gamma _l\) varied from 0.5 to 2%. Each customer also had a \(\eta _l\) value correspondent. This corresponds to the relative value of the product, and the more valuable the product is, the greater the value of the parameter. Parameters \(\eta _l\) were randomly generated between 0.5 and 1.5. The priority prizes were randomly generated between 0 and 100, considering only integer values (Silva 2017). All these parameters were generated using a uniform distribution. In A-BRKGA, we set \(\gamma \) at 0.999 because the running time implied by it was enough for the metaheuristic to find satisfactory solutions.

All instances satisfied the triangular inequality in relation to fixed costs. Each instance has its number of customers specified in its name. For example, instance \(\mathtt {burma14}\) has 13 customers to visit and the depot. In Table 3, all instances used in the computational tests are presented divided by the quantity of customers.

Table 3 The symmetric instances selected from the TSPLIB (TSPLib 2019), by instance sizes

We generated three sets of instances that differ themselves by a \(\theta \) parameter multiplied to the priority prizes. This parameter serves only to generate different instances. We denoted the set with \(\theta = {\overline{c}}\) by A, the set with \(\theta = \frac{{\overline{c}}}{4(N-1){\overline{p}}} \left( 2N+ \sum _{l=2}^{N} z_l\right) \) by B, and the set with \(\theta =1\) by C, where \({\overline{c}}= \frac{1}{N(N-1)}\sum _{i=1}^{N}\sum _{{\mathop {i \ne j}\limits ^{j=1}}}^{N} c_{ij}\), \({\overline{p}}= \frac{1}{(N-1)^2}\sum _{k=2}^{N}\sum _{i=2}^{N} p_{ki}\), and \(z_l = \gamma _l \eta _l q_l\).

Set A is closest to the assignment problem, since the priority prizes are much larger than the fixed costs. Set B is a balance between the assignment problem and the traveling salesman problem, and Set C is closest to the traveling salesman problem since the priority prizes are much smaller than the fixed costs.

In Tables 45 and 6, we present the results for the CPLEX, A-BRKGA+CS and BRKGA+CS for Sets A, B, and C, respectively. The entries in the tables are the instances; the solution obtained by the CPLEX (\(\hbox {sol}_C\)) within time limit of 10,800 s, the best solution (S*), the average solution (S) over 20 runs, the time to find the best solution (T*), the average running time (T) in seconds, the deviation (Dev) defined as \(100 \times (S - S^{*}) / S^{*}\), and the GAP provided by CPLEX defined as \((| upper bound - lower bound| / | {lower bound} |)\). The entries “-” in the column \(\hbox {sol}_C\) mean that the CPLEX was not able to find a feasible solution within the time limit set.

In Table 7, we present the results for the linear relaxation compared to the solutions of the proposed metaheuristics. The gaps were calculated using Eq. (23), where \(S_\mathrm{RL}\) is the solution of the linear relaxation, \(S_\mathrm{C}\) is the solution of the integer model from CPLEX, \(S_\mathrm{A}\) is the solution from A-BRKGA+CS, and \(S_\mathrm{B}\) is the solution from BRKGA+CS. In Time(s) column, the times used by CPLEX to solve the linear relaxation in instances of Sets A, B, and C are presented.

$$\begin{aligned} \mathrm{GAP}_\mathrm{RL}^i = \displaystyle \frac{|S_\mathrm{RL} - S_i|}{|S_\mathrm{RL}|}, \ i=\mathrm{A, B, C}. \end{aligned}$$
(23)

CPLEX was not able to find a feasible solution to all instances within the time limit set. For set A, CPLEX was able to solve 8 instances to optimality within the limit time set, and 6 instances with GAPs smaller than \(4\%\). But, it was unable to find a feasible solution in 14 of the 30 instances. For Sets B and C, CPLEX solved only 2 instances to optimality and was not able to find feasible solutions in 13 instances.

When the priority prizes were much larger than the costs (Set A), the assignment problem gains dominated the overall objective function and the model was able to solve more instances to optimality. The assignment problem has the integrality property, i.e., the solution of linear relaxation is the optimal solution of the problem, which may explain the smaller execution time for these instances compared to the instances in Sets B and C. As a consequence, the linear relaxation solution MTSPPP model approached the optimal solution. In Table 7, the values of \(\mathrm{GAP}_\mathrm{RL}^\mathrm{C}\) show the relative distance between the linear relaxation and the optimal solution (or better solution provided by CPLEX). Six instances had \(\mathrm{GAP}_\mathrm{RL}^\mathrm{C}\) smaller than \(4.5\%\), not counting instances \(\mathtt {eil51}\) and \(\mathtt {berlin52}\), where the solutions found within the time limit by CPLEX were not very good, or others where the CPLEX did not find a feasible solution,

In Sets B and C, the linear relaxation solution did not present good results, producing gaps around \(100\%\), and, in some cases, even larger gaps, as in the cases of instances \(\mathtt {gr48}\), \(\mathtt {berlin52}\), and \(\mathtt {brazil58}\), with gaps of \(3742.34\%\), \(1883.20\%\), and \(1157.06\%\), respectively, in Set C.

In Tables 45, and 6, we observe that the A-BRKGA+CS performed worse than the BRKGA+CS for the instances of Set A. The first metaheuristic was better in \(30\%\) of the instances, while the second one was better in \(43\%\). Both metaheuristics found the same solutions in \(26\%\) of the instances. In Set B, the A-BRKGA+CS was better in \(46\%\) of the instances, while the BRKGA+CS was better in \(23\%\). In Set C, the A-BRKGA+CS was better in \(33\%\) of the instances, while BRKGA+CS was better in \(20\%\). Observe that, when priority prize penalties decrease, the number of instances in which the metaheuristics return the same solution increases from \(26\%\) in A to \(30\%\) and \(46\%\) in B and C.

Table 4 Results from CPLEX, BRKGA+CS, and A-BRKGA+CS for Set A
Table 5 Results from CPLEX, BRKGA+CS, and A-BRKGA+CS for Set B
Table 6 Results from CPLEX, BRKGA+CS, and A-BRKGA+CS for Set C
Table 7 Results for the linear relaxation of Sets A, B, and C

To compare the two sets of solutions given by A-BRKGA+CS and BRKGA +CS for each set and analyze if there is a significant difference between them, the Wilcoxon signed-rank (WSR) test (Wilcoxon 1945) and Friedman test (Friedman 1940) were applied. The results of the tests are shown in Table 8 where Z is the sum of the flagged posts of the WSR test, Chi squared is the test statistic and df is the degrees of freedom of the Friedman test. We concluded that the A-BRKGA+CS and BRKGA+CS ranks were statistically equivalent (WSR test) and the distributions of the scores for the methods compared are equal (p value \(> 0.001\)—Friedman test) for all sets, at a 0.05 significance level.

From Tables 45, and 6, we observe that the proposed MTSPPP model found better solutions than the proposed metaheuristics for three instances of set A: \(\mathtt {dantzig42}\), \(\mathtt {gr48}\), and \(\mathtt {hk48}\). The gaps of the solutions obtained by A-BRKGA+CS, for these instances were \(0.76\%\), \(1.51\%\), and \(1.54\%\), respectively, and, for the solutions obtained by BRKGA+CS, they were \(0.77\%\), \(1.69\%\), and \(1.54\%\), respectively. The solutions obtained by the models have gaps \(0.52\%\), \(1.22\%\), and \(1.32\%\). On the other hand, the model found worse solutions for two instances, \(\mathtt {eil51}\) and \(\mathtt {berlin52}\), showing gaps of \(44.99\%\) and \(55.86\%\), respectively, while A-BRKGA+CS obtained solutions with gaps of \(1.97\%\) and \(2.69\%\), and BRKGA+CS obtained solutions with gaps of \(1.96\%\) and \(2.44\%\). For all the other instances in Sets B and C, the proposed model performed worse than the proposed heuristic methods.

Table 8 Result of the WSR test and Friedman test for Sets A, B, and C

6 Conclusion

In this paper, we presented a new variant of the Traveling Salesman Problem that considers variable costs and priority prizes, in addition to fixed costs. A mathematical model was proposed to represent this problem, using a multicommodity flow model. Two hybrid methods were proposed to find solutions to this problem. The first consisted of a Biased Random-Key Genetic Algorithm (BRKGA) with a Label Propagation (LP) community*** detection method. For the local search, an Iterated Local Search (ILS) combined with Variable Neighborhood Descent (VND) was proposed. The second method consisted of an Adaptive BRKGA with the same communities detection method and local search process.

For the computational experiments, 90 different instances were generated based on instances of the TSPLib library, TSP Test Data, and Sarubbi (2008). Three penalties to priority prizes were generated, resulting in three sets of instances: Set A consisted of penalizing the priority prizes for average fixed costs; Set B the penalty was using a balance between fixed costs and variable costs; Set C did not use any penalty.

The proposed model was able to provide good solutions for instances of Set A, but not for instances in Sets B and C. This happened because of the linear relaxation model. While in Set A the linear relaxation model was very good, approaching the optimal solution, in Sets B and C, it performed poorly. Because of penalties, the instances in Set A approached the assignment problem that has the integrality property, where the solution of linear relaxation is the optimum solution of the integer model.

Both the proposed heuristic methods performed well in both computational time and quality of solution of instances tested. In general, the A-BRKGA+CS method found better solutions (33) than the BRKGA+CS method (26), although it was not statistically different according to the Wilcoxon and Friedman tests performed. The main difference between the two methods is that the BRKGA+CS method needs to be calibrated, while the A-BRKGA+CS method is adaptive.

Future studies could be to develop other decoders in the proposed methods. For example, partial sequences of customers could be considered to build all the sequencing. Other local search methods also could be applied and methods to improve the upper bounds of the model could be developed. A re-decoding process can be studied to return the solution obtained by local search to the current population of BRKGA.