Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

6.1 Introduction

We consider in this chapter a combinatorial optimization problem, defined by a finite ground set E = {1, , n}, a set of feasible solutions F ⊆ 2E, and an objective function \(f: 2^{E} \rightarrow \mathbb{R}\). In its minimization version, we seek an optimal solution S F such that \(f(S^{{\ast}}) \leq f(S),\;\forall S \in F\). The ground set E, the cost function f, and the set of feasible solutions F are defined for each specific problem. For instance, in the case of the traveling salesman problem, the ground set E is that of all edges connecting the cities to be visited, f(S) is the sum of the costs of all edges in S, and F is formed by all edge subsets that determine a Hamiltonian cycle.

GRASP (Greedy Randomized Adaptive Search Procedure) [90, 91] is a multi-start or iterative metaheuristic, in which each iteration consists of two phases: construction and local search. The construction phase builds a solution using a greedy randomized adaptive algorithm. If this solution is not feasible, then it is necessary to apply a repair procedure to achieve feasibility or to make a new attempt to build a feasible solution. Once a feasible solution is obtained, its neighborhood is investigated until a local minimum is found during the local search phase. The best overall solution is kept as the result.

Extensive literature surveys on greedy randomized adaptive search procedures are presented in [98100, 212, 213, 224]. A first book on GRASP was published in 2016 by Resende and Ribeiro [215].

The pseudo-code in Fig. 6.1 illustrates the main blocks of a GRASP procedure for minimization, in which Max_Iterations iterations are performed and Seed is used as the initial seed for the pseudo-random number generator.

Fig. 6.1
figure 1

Pseudo-code of the GRASP metaheuristic

Figure 6.2 illustrates the construction phase with its pseudo-code. At each iteration of this phase, let the set of candidate elements be formed by all elements of the ground set E that can be incorporated into the partial solution being built, without impeding the construction of a feasible solution with the remaining ground set elements. The selection of the next element for incorporation is determined by the evaluation of all candidate elements according to a greedy evaluation function. This greedy function usually represents the incremental increase in the cost function due to the incorporation of this element into the solution under construction. The evaluation of the elements by this function leads to the creation of a restricted candidate list (RCL) formed by the best elements, i.e. those whose incorporation to the current partial solution results in the smallest incremental costs (this is the greedy aspect of the algorithm). The element to be incorporated into the partial solution is randomly selected from those in the RCL (this is the probabilistic aspect of the heuristic). Once the selected element is incorporated into the partial solution, the candidate list is updated and the incremental costs are reevaluated (this is the adaptive aspect of the heuristic). The above steps are repeated while there exists at least one candidate element. This strategy is similar to the semi-greedy heuristic proposed by Hart and Shogan [122], which is also a multi-start approach based on greedy randomized constructions, but without local search.

Fig. 6.2
figure 2

Pseudo-code of the construction phase

A randomized greedy construction procedure is not always able to produce a feasible solution. It may be necessary to apply a repair procedure to the solution to achieve feasibility. Examples of repair procedures can be found in [80, 81, 169, 180].

The solutions generated by a greedy randomized construction are not necessarily optimal, even with respect to simple neighborhoods. The local search phase usually improves the constructed solution. A local search algorithm works in an iterative fashion by successively replacing the current solution by a better solution in its neighborhood. It terminates when no better solution is found in the neighborhood. The pseudo-code of a basic local search algorithm starting from the solution Solution constructed in the first phase (and possibly made feasible by the repair heuristic) and using a neighborhood N is given in Fig. 6.3.

Fig. 6.3
figure 3

Pseudo-code of the local search phase

The speed and the effectiveness of a local search procedure depend on several aspects, such as the neighborhood structure, the neighborhood search technique, the strategy used for the evaluation of the cost function value at the neighbors, and the starting solution itself. The construction phase plays a very important role with respect to this last aspect, building high-quality starting solutions for the local search. Simple neighborhoods are typically used. The neighborhood search can be implemented using either a best-improving or a first-improving strategy. In the case of the best-improving strategy, all neighbors are investigated and the current solution is replaced by the best neighbor. In the case of a first-improving strategy, the current solution moves to the first neighbor whose cost function value is smaller than that of the current solution. In practice, we observed on many applications that quite often both strategies lead to the same final solution, but in smaller computation times when the first-improving strategy is used. We also observed that premature convergence to a bad local minimum is more likely to occur with a best-improving strategy.

6.2 Construction of the Restricted Candidate List

An especially appealing characteristic of GRASP is the ease with which it can be implemented. Few parameters need to be set and tuned. Therefore, development can focus on implementing appropriate data structures for efficient construction and local search algorithms. GRASP has two main parameters: one related to the stopping criterion and the other to the quality of the elements in the restricted candidate list.

The stopping criterion used in the pseudo-code described in Fig. 6.1 is determined by the number Max_Iterations of iterations. Although the probability of finding a new solution improving the incumbent (current best solution) decreases with the number of iterations, the quality of the incumbent does not worsen with the number of iterations. Since the computation time does not vary much from iteration to iteration, the total computation time is predictable and increases linearly with the number of iterations. Consequently, the larger the number of iterations, the larger will be the computation time and the better will be the solution found.

For the construction of the RCL used in the first phase we consider, without loss of generality, a minimization problem as the one formulated in Sect. 6.1. We denote by c(e) the incremental cost associated with the incorporation of element eE into the solution under construction. At any GRASP iteration, let c min and c max be, respectively, the smallest and the largest incremental costs.

The restricted candidate list RCL is made up of the elements eE with the best (i.e., the smallest) incremental costs c(e). This list can be limited either by the number of elements (cardinality-based) or by their quality (value-based). In the first case, it is made up of the p elements with the best incremental costs, where p is a parameter. In this chapter, the RCL is associated with a threshold parameter α ∈ [0, 1]. The restricted candidate list is formed by all elements eE which can be feasibly inserted into the partial solution under construction and whose quality is superior to the threshold value, i.e., c(e) ∈ [c min, c min + α(c maxc min)]. The case α = 0 corresponds to a pure greedy algorithm, while α = 1 is equivalent to a random construction. The pseudo-code in Fig. 6.4 is a refinement of the greedy randomized construction pseudo-code shown in Fig. 6.2. It shows that the parameter α controls the amounts of greediness and randomness in the algorithm.

Fig. 6.4
figure 4

Refined pseudo-code of the construction phase

GRASP construction can be viewed as a repetitive sampling technique. Each iteration produces a sample solution from an unknown distribution, whose mean and variance are functions of the restrictive nature of the RCL. For example, if the RCL is restricted to a single element, then the same solution will be produced at all iterations. The variance of the distribution will be zero and the mean will be equal to the value of the greedy solution. If the RCL is allowed to have more elements, then many different solutions will be produced, implying a larger variance. Since greediness plays a smaller role in this case, the average solution value should be worse than that of the greedy solution. However, the value of the best solution found outperforms the average value and can be near-optimal or even optimal. It is unlikely that GRASP will find an optimal solution if the average solution value is high, even if there is a large variance in the overall solution values. On the other hand, if there is little variance in the overall solution values, it is also unlikely that GRASP will find an optimal solution, even if the average solution is low. What often leads to good solutions are relatively low average solution values in the presence of a relatively large variance, such as is the case for α = 0. 2.

Another interesting observation is that the distances between the solutions obtained at each iteration and the best solution found increase as the construction phase moves from more greedy to more random. This causes the average time taken by the local search to increase. Very often, many GRASP solutions may be generated in the same amount of time required by the local search procedure to converge from a single random start. In these cases, the time saved by starting the local search from good initial solutions can be used to improve solution quality by performing more GRASP iterations.

These results are illustrated in Table 6.1 and Fig. 6.5, for an instance of the MAXSAT problem [219] where 1000 iterations were run. For each value of α ranging from 0 (purely random construction for maximization problems) to 1 (purely greedy construction for maximization problems), we give in Table 6.1 the average Hamming distance between each solution built during the construction phase and the corresponding local optimum obtained after local search, the average number of moves from the former to the latter, the local search time in seconds, and the total processing time in seconds. Figure 6.5 summarizes the values observed for the total processing time and the local search time. We notice that both time measures considerably decrease as α tends to 1, approaching the purely greedy choice. In particular, we observe that the average local search time taken by α = 0 (purely random) is approximately 2.5 times longer than the time taken by α = 0. 9 (almost greedy). In this example, two to three greedily constructed solutions can be investigated in the same time needed to apply local search to one single randomly constructed solution. The appropriate choice of the value of the RCL parameter α is clearly critical and relevant to achieve a good balance between computation time and solution quality.

Table 6.1 Average number of moves and local search time as a function of the RCL parameter α for a maximization problem

Prais and Ribeiro [201] show that using a single fixed value for the RCL parameter α very often hinders finding a high-quality solution, which could be found if another value is used. They propose an extension of the basic GRASP procedure, which they call Reactive GRASP, in which the parameter α is self-tuned and its value is periodically modified depending on the quality of the solutions obtained along the search. In particular, computational experiments on the problem of traffic assignment in communication satellites [202] show that Reactive GRASP finds better solutions than the basic algorithm for many test instances. These results motivated the study of the behavior of GRASP with different strategies for the variation of the value of the RCL parameter α:

Fig. 6.5
figure 5

Total CPU time and local search CPU time as a function of the RCL parameter α for a maximization problem (1000 repetitions for each value of α)

R: :

α self tuned with a Reactive GRASP procedure;

E: :

α randomly chosen from a uniform discrete probability distribution;

H: :

α randomly chosen from a decreasing non-uniform discrete probability distribution;

F: :

α fixed, close to the purely greedy choice value.

We summarize the results obtained by the experiments reported in [200, 201]. These four strategies are incorporated into the GRASP procedures developed for four different optimization problems: (P-1) matrix decomposition for traffic assignment in communication satellites [202]; (P-2) set covering [90]; (P-3) weighted MAX-SAT [219, 221]; and (P-4) graph planarization [210, 226]. Let

$$\displaystyle{ \varPsi =\{\alpha _{1},\ldots,\alpha _{m}\} }$$

be the set of possible values for the parameter α for the first three strategies. The strategy for choosing and self-tuning the value of α in the case of the Reactive GRASP procedure (R) is described later in Sect. 6.3. In the case of the strategy (E) based on using the discrete uniform distribution, all choice probabilities are equal to 1∕m. The third case corresponds to the a hybrid strategy (H), in which the authors considered p(α = 0. 1) = 0. 5, p(α = 0. 2) = 0. 25, p(α = 0. 3) = 0. 125, p(α = 0. 4) = 0. 03, p(α = 0. 5) = 0. 03, p(α = 0. 6) = 0. 03, p(α = 0. 7) = 0. 01, p(α = 0. 8) = 0. 01, p(α = 0. 9) = 0. 01, and p(α = 1. 0) = 0. 005. Finally, in the last strategy (F), the value of α is fixed as recommended in the original references of problems P-1 to P-4 cited above, where this parameter was tuned for each problem. A subset of the literature instances was considered for each class of test problems. The results reported in [201] are summarized in Table 6.2. For each problem, we first list the number of instances considered. Next, for each strategy, we give the number of times it found the best solution (hits), as well as the average CPU time (in seconds) on an IBM 9672 model R34. The number of iterations was fixed at 10,000.

Table 6.2 Computational results for different strategies for the variation of parameter α

Strategy (F) presented the shortest average computation times for three out the four problem types. It was also the one with the least variability in the constructed solutions and, in consequence, found the best solution the fewest times. The reactive strategy (R) is the one which most often found the best solutions, however, at the cost of computation times that are longer than those of some of the other strategies. The high number of hits observed for strategy (E) also illustrates the effectiveness of strategies based on the variation of the RCL parameter.

6.3 Alternative Construction Mechanisms

A possible shortcoming of the standard GRASP framework is the independence of its iterations, i.e., the fact that it does not learn from the search history or from solutions found in previous iterations. This is so because the basic algorithm discards information about any solution previously encountered that does not improve the incumbent. Information gathered from good solutions can be used to implement memory-based procedures to influence the construction phase, by modifying the selection probabilities associated with each element of the RCL or by enforcing specific choices. Another possible shortcoming of the greedy randomized construction is its complexity. At each step of the construction, each yet unselected candidate element has to be evaluated by the greedy function. In cases where the difference between the number of elements in the ground set and the number of elements that appear in a solution is large, this may not be very efficient.

In this section, we consider enhancements and alternative techniques for the construction phase of GRASP. They include random plus greedy, sampled greedy, Reactive GRASP, cost perturbations, bias functions, memory and learning, local search on partially constructed solutions, and Lagrangean GRASP heuristics.

6.3.1 Random Plus Greedy and Sampled Greedy Construction

In Sect. 6.2, we described the semi-greedy construction scheme used to build randomized greedy solutions that serve as starting points for local search. Two other randomized greedy approaches were proposed in [216], with smaller worst-case complexities than the semi-greedy algorithm.

Instead of combining greediness and randomness at each step of the construction procedure, the random plus greedy scheme applies randomness during the first p construction steps to produce a random partial solution. Next, the algorithm completes the solution with one or more pure greedy construction steps. The resulting solution is randomized greedy. One can control the balance between greediness and randomness in the construction by changing the value of the parameter p. Larger values of p are associated with solutions that are more random, while smaller values result in greedier solutions.

Similar to the random plus greedy procedure, the sampled greedy construction also combines randomness and greediness but in a different way. This procedure is also controlled by a parameter p. At each step of the construction process, the procedure builds a restricted candidate list by randomly sampling min{p, | C | } elements of the candidate set C. Each element of the RCL is evaluated by the greedy function. The element with the smallest greedy function value is added to the partial solution. This two-step process is repeated until there are no more candidate elements. The resulting solution is also randomized greedy. The balance between greediness and randomness can be controlled by changing the value of the parameter p, i.e. the number of candidate elements that are sampled. Small sample sizes lead to more random solutions, while large sample sizes lead to greedier solutions.

6.3.2 Reactive GRASP

The first strategy to incorporate a learning mechanism in the memoryless construction phase of the basic GRASP was the Reactive GRASP procedure introduced in Sect. 6.2. In this case, the value of the RCL parameter α is not fixed, but instead is randomly selected at each iteration from a discrete set of possible values. This selection is guided by the solution values found along the previous iterations. One way to accomplish this is to use the rule proposed in [202]. Let Ψ = { α 1, , α m} be a set of possible values for α. The probabilities associated with the choice of each value are all initially made equal to p i = 1∕m, for i = 1, , m. Furthermore, let z be the incumbent solution and let A i be the average value of all solutions found using α = α i, for i = 1, , m. The selection probabilities are periodically reevaluated by taking p i = q i j = 1 mq j, with q i = z A i for i = 1, , m. The value of q i will be larger for values of α = α i leading to the best solutions on average. Larger values of q i correspond to more suitable values for the parameter α. The probabilities associated with the more appropriate values will then increase when they are reevaluated.

The reactive approach leads to improvements over the basic GRASP in terms of robustness and solution quality, due to greater diversification and less reliance on parameter tuning. In addition to the applications in [200202], this approach has been used in power system transmission network planning [46], job shop scheduling [49], channel assignment in mobile phone networks [118], rural road network development [249], capacitated location [71], strip-packing [15], and a combined production-distribution problem [50].

6.3.3 Cost Perturbations

The idea of introducing some noise into the original costs is similar to that in the so-called “noising” method of Charon and Hudry [57, 58]. It adds more flexibility into the algorithm design and may be even more effective than the greedy randomized construction of the basic GRASP procedure in circumstances where the construction algorithms are not very sensitive to randomization. This is indeed the case for the shortest-path heuristic of Takahashi and Matsuyama [259], used as one of the main building blocks of the construction phase of the hybrid GRASP procedure proposed by Ribeiro et al. [233] for the Steiner problem in graphs. Another situation where cost perturbations can be very effective appears when no greedy algorithm is available for straightforward randomization. This happens to be the case of the hybrid GRASP developed by Canuto et al. [54] for the prize-collecting Steiner tree problem, which makes use of the primal-dual algorithm of Goemans and Williamson [117] to build initial solutions using perturbed costs.

In the case of the GRASP for the prize-collecting Steiner tree problem described in [54], a new solution is built at each iteration using node prizes updated by a perturbation function, based on the structure of the current solution. Two different prize perturbation schemes were used. In perturbation by eliminations, the primal-dual algorithm used in the construction phase is driven to build a new solution without some of the nodes that appeared in the solution constructed in the previous iteration. In perturbation by prize changes, some noise is introduced into the node prizes to change the objective function, similarly to what is proposed in [57, 58].

The cost perturbation methods used in the GRASP for the minimum Steiner tree problem described in [233] incorporate learning mechanisms associated with intensification and diversification strategies. Three distinct weight randomization methods were applied. At a given GRASP iteration, the modified weight of each edge is randomly selected from a uniform distribution over an interval which depends on the selected weight randomization method applied at that iteration. The different weight randomization methods use frequency information and may be used to enforce intensification and diversification strategies. The experimental results reported in [233] show that the strategy combining these three perturbation methods is more robust than any of them used in isolation, leading to the best overall results on a quite broad mix of test instances with different characteristics. The GRASP heuristic using this cost perturbation strategy is among the most effective heuristics currently available for the Steiner problem in graphs.

6.3.4 Bias Functions

In the construction procedure of the basic GRASP, the next element to be introduced in the solution is chosen at random from the candidates in the RCL. The elements of the RCL are assigned equal probabilities of being chosen. However, any probability distribution can be used to bias the selection toward some particular candidates. Another construction mechanism was proposed by Bresina [51], where a family of such probability distributions is introduced. They are based on the rank r(e) assigned to each candidate element eC, according to its greedy function value. Several bias functions were proposed, such as:

  • random bias: bias(r) = 1;

  • linear bias: bias(r) = 1∕r;

  • log bias: bias(r) = log−1(r + 1);

  • exponential bias: bias(r) = e r; and

  • polynomial bias of order n: bias(r) = r n.

Let r(e) denote the rank of element eC and let bias(r(e)) be one of the bias functions defined above. Once these values have been evaluated for all elements in the candidate set C, the probability π(e) of selecting element eC is

$$\displaystyle{ \pi (e) = \frac{\mathtt{bias}(r(e))} {\sum _{e'\in C}\mathtt{bias}(r(e'))}. }$$
(6.1)

The evaluation of these bias functions may be restricted to the elements of the RCL. Bresina’s selection procedure restricted to elements of the RCL was used in [49]. The standard GRASP uses a random bias function.

6.3.5 Intelligent Construction: Memory and Learning

Fleurent and Glover [106] observed that the basic GRASP does not use a long-term memory (information gathered in previous iterations) and proposed a long-term memory scheme to address this issue in multi-start heuristics. Long-term memory is one of the fundamentals on which tabu search relies.

Their scheme maintains a pool of elite solutions to be used in the construction phase. To become an elite solution, a solution must be either better than the best member of the pool, or better than its worst member and sufficiently different from the other solutions in the pool. For example, one can count identical solution vector components and set a threshold for rejection.

A strongly determined variable is one that cannot be changed without eroding the objective or changing significantly other variables. A consistent variable is one that receives a particular value in a large portion of the elite solution set. Let the intensity function I(e) be a measure of the strong determination and consistency features of a solution element eE. Then, I(e) becomes larger as e appears more often in the pool of elite solutions. The intensity function is used in the construction phase as follows. Recall that c(e) is the greedy function, i.e. the incremental cost associated with the incorporation of element eE into the solution under construction. Let K(e) = F(c(e), I(e)) be a function of the greedy and intensification functions. For example, K(e) = λc(e) + I(e). The intensification scheme biases selection from the RCL to those elements eE with a high value of K(e) by setting its selection probability to be p(e) = K(e)∕ s ∈ RCLK(s).

The function K(e) can vary with time by changing the value of λ. For example, λ may be set to a large value that is decreased when diversification is called for. Procedures for changing the value of λ are given by Fleurent and Glover [106] and Binato et al. [49].

6.3.6 POP in Construction

The Proximate Optimality Principle (POP) is based on the idea that “good solutions at one level are likely to be found ‘close to’ good solutions at an adjacent level” [115]. Fleurent and Glover [106] provided a GRASP interpretation of this principle. They suggested that imperfections introduced during steps of the GRASP construction can be “ironed-out” by applying local search during (and not only at the end of) the GRASP construction phase.

Because of efficiency considerations, a practical implementation of POP to GRASP consists in applying local search a few times during the construction phase, but not at every construction iteration. Local search was applied by Binato et al. [49] after 40% and 80% of the construction moves were performed, as well as at the end of the construction phase.

6.3.7 Lagrangean GRASP Heuristics

Lagrangean relaxation [45, 105] is a mathematical programming technique that can be used to provide lower bounds for minimization problems. Held and Karp [123, 124] were among the first to explore the use of the dual multipliers produced by Lagrangean relaxation to derive lower bounds, applying this idea in the context of the traveling salesman problem. Lagrangean heuristics further explore the use of different dual multipliers to generate feasible solutions. Beasley [43, 44] described a Lagrangean heuristic for set covering.

6.3.7.1 Lagrangean Relaxation and Subgradient Optimization

Lagrangean relaxation can be used to provide lower bounds for combinatorial optimization problems. However, the primal solutions produced by the algorithms used to solve the Lagrangean dual problem are not necessarily feasible. Lagrangean heuristics exploit dual multipliers to generate primal feasible solutions.

Given a mathematical programming problem \(\mathcal{P}\) formulated as

$$\displaystyle{ f^{{\ast}} =\min f(x) }$$
(6.2)
$$\displaystyle{ g_{i}(x) \leq 0,\quad i = 1,\ldots,m, }$$
(6.3)
$$\displaystyle{ x \in X, }$$
(6.4)

its Lagrangean relaxation is obtained by associating dual multipliers \(\lambda _{i} \in \mathbb{R}_{+}\) with each inequality (6.3), for i = 1, , m. This results in the following Lagrangean relaxation problem LRP(λ)

$$\displaystyle{ \min f'(x) = f(x) +\sum _{ i=1}^{m}\lambda _{ i} \cdot g_{i}(x) }$$
(6.5)
$$\displaystyle{ x \in X, }$$
(6.4)

whose optimal solution x(λ) gives a lower bound f′(x(λ)) to the optimal value of the original problem \(\mathcal{P}\) defined by (6.2)–(6.4). The best (dual) lower bound is given by the solution of the Lagrangean dual problem \(\mathcal{D}\)

$$\displaystyle{ f_{\mathcal{D}} = f'(x(\lambda ^{{\ast}})) =\max _{\lambda \in \mathbb{R}_{+}^{m}}f'(x(\lambda )). }$$
(6.6)

Subgradient optimization is used to solve the dual problem \(\mathcal{D}\) defined by (6.6). Subgradient algorithms start from any feasible set of dual multipliers, such as λ i = 0, for i = 1, , m, and iteratively generate updated multipliers.

At any iteration q, let λ q be the current vector of multipliers and let x(λ q) be an optimal solution to problem LRP(λ q), whose optimal value is f′(x(λ q)). Furthermore, let \(\bar{f}\) be a known upper bound to the optimal value of problem \(\mathcal{P}\). Additionally, let \(g^{q} \in \mathbb{R}^{m}\) be a subgradient of f′(x) at x = x(λ q), with g i q = g i(x(λ q)) for i = 1, , m. To update the Lagrangean multipliers, the algorithm makes use of a step size

$$\displaystyle{ d^{q} = \frac{\eta \cdot (\bar{f} - f'(x(\lambda ^{q})))} {\sum _{i=1}^{m}(g_{i}^{q})^{2}}, }$$
(6.7)

where η ∈ (0, 2]. Multipliers are then updated as

$$\displaystyle{ \lambda _{i}^{q+1} =\max \{ 0;\lambda _{ i}^{q} - d^{q} \cdot g_{ i}^{q}\},\quad i = 1,\ldots,m, }$$
(6.8)

and the subgradient algorithm proceeds to iteration q + 1.

6.3.7.2 A Template for Lagrangean Heuristics

We describe next a template for Lagrangean heuristics that make use of the dual multipliers λ q and of the optimal solution x(λ q) to each problem LRP(λ q) to build feasible solutions to the original problem \(\mathcal{P}\) defined by (6.2)–(6.4). In the following, we assume that the objective function and all constraints are linear functions, i.e. f(x) = i = 1 nc jx j and g i(x) = j = 1 nd ijx je i, for i = 1, , m.

Let \(\mathcal{H}\) be a primal heuristic that builds a feasible solution x to \(\mathcal{P}\), starting from the initial solution x 0 = x(λ q) at every iteration q of the subgradient algorithm. Heuristic \(\mathcal{H}\) is first applied using the original costs c j, i.e. using the cost function f(x). In any subsequent iteration q of the subgradient algorithm, \(\mathcal{H}\) uses either the Lagrangean reduced costs cj = c j i = 1 mλ i qd ij or the complementary costs \(\bar{c}_{j} = (1 - x_{j}(\lambda ^{q})) \cdot c_{j}\).

Let \(x^{\mathcal{H},\gamma }\) be the solution obtained by heuristic \(\mathcal{H}\), using a generic cost vector γ corresponding to either one of the above modified cost schemes or to the original cost vector. Its cost can be used to update the upper bound \(\bar{f}\) to the optimal value of the original problem. This upper bound can be further improved by local search and is used to adjust the step size defined by Eq. (6.7).

Figure 6.6 shows the pseudo-code of a Lagrangean heuristic. Lines 1–4 initialize the upper and lower bounds, the iteration counter, and the dual multipliers. The iterations of the subgradient algorithm are performed along the loop defined in lines 5–24. The reduced costs are computed in line 6 and the Lagrangean relaxation problem is solved in line 7. In the first iteration of the Lagrangean heuristic, the original cost vector is assigned to γ in line 9, while in subsequent iterations a modified cost vector is assigned to γ in line 11. Heuristic \(\mathcal{H}\) is applied in line 13 at the first iteration and after every H iterations thereafter (i.e., whenever the iteration counter q is a multiple of the input parameter H) to produce a feasible solution \(x^{\mathcal{H},\gamma }\) to problem \(\mathcal{P}\). If the cost of this solution is smaller than the current upper bound, then the best solution and its cost are updated in lines 14–18. If the lower bound f′(x(λ q)) is greater than the current lower bound \(f_{\mathcal{D}}\), then \(f_{\mathcal{D}}\) is updated in line 19. Line 20 computes a subgradient at x(λ q) and line 21 computes the step size. The dual multipliers are updated in line 22 and the iteration counter is incremented in line 23. The best solution found and its cost are returned in line 24.

The strategy proposed by Held et al. [125] is commonly used in the implementation of Lagrangean heuristics to update the dual multipliers from one iteration to the next. Beasley [44] reported as computationally useful the adjustment of components of the subgradients to zero whenever they do not effectively contribute to the update of the multipliers, i.e., arbitrarily setting g i q = 0 whenever g i q > 0 and λ i q = 0, for i = 1, , m.

Fig. 6.6
figure 6

Pseudo-code of a template for a Lagrangean heuristic

Different choices for the initial solution x 0, for the modified costs γ, and for the primal heuristic \(\mathcal{H}\) itself lead to different variants of the above algorithm. The integer parameter H defines the frequency in which \(\mathcal{H}\) is applied. The smaller the value of H, the greater the number of times \(\mathcal{H}\) is applied. Therefore, the computation time increases as the value of H decreases. In particular, one should set H = 1 if the primal heuristic \(\mathcal{H}\) is to be applied at every iteration.

6.3.7.3 Lagrangean GRASP

Pessoa et al. [195, 196] proposed the hybridization of GRASP and Lagrangean relaxation leading to the Lagrangean GRASP heuristic described below. Different choices for the primal heuristic \(\mathcal{H}\) in the template of the algorithm in Fig. 6.6 lead to distinct Lagrangean heuristics. We consider two variants: the first makes use of a greedy algorithm with local search, while in the second a GRASP with path-relinking (see Sect. 6.4) is used.

Greedy heuristic: This heuristic greedily repairs the solution x(λ q) produced in line 7 of the Lagrangean heuristic described in Fig. 6.6 to make it feasible for problem \(\mathcal{P}\). It makes use of the modified costs (c′ or \(\bar{c}\)). Local search can be applied to the resulting solution, using the original cost vector c. We refer to this approach as a greedy Lagrangean heuristic (GLH).

GRASP heuristic: Instead of simply performing one construction step followed by local search, as GLH does, this variant applies a GRASP heuristic to repair the solution x(λ q) produced in line 7 of the Lagrangean heuristic to make it feasible for problem \(\mathcal{P}\).

Although the GRASP heuristic produces better solutions than the greedy heuristic, the greedy heuristic is much faster. To appropriately address this trade-off, we adapt line 10 of Fig. 6.6 to use the GRASP heuristic with probability β and the greedy heuristic with probability 1 −β, where β is a parameter of the algorithm.

We note that this strategy involves three main parameters: the number H of iterations after which the basic heuristic is always applied, the number Q of iterations performed by the GRASP heuristic when it is chosen as the primal heuristic, and the probability β of choosing the GRASP heuristic as \(\mathcal{H}\). We shall refer to the Lagrangean heuristic that uses this hybrid strategy as LAGRASP(β, H, Q).

We next summarize computational results obtained for 135 instances of the set k-covering problem. These instances have up to 400 constraints and 4000 binary variables. The set k-covering, or set multi-covering, problem is an extension of the classical set covering problem, in which each element is required to be covered at least k times. The problem finds applications in the design of communication networks and in computational biology.

The first experiment with the GRASP Lagrangean heuristic established the relationship between running times and solution quality for different parameter settings. Parameter β, the probability of GRASP being applied as the heuristic \(\mathcal{H}\), was set to 0, 0.25, 0.50, 0.75, and 1. Parameter H, the number of iterations between successive calls to the heuristic \(\mathcal{H}\), was set to 1, 5, 10, and 50. Parameter Q, the number of iterations carried out by the GRASP heuristic, was set to 1, 5, 10, and 50. By combining some of these parameter values, 68 variants of the hybrid LAGRASP(β, H, Q) heuristic were created. Each variant was applied eight times to a subset of 21 instances, with different initial seeds being given to the random number generator.

The plot in Fig. 6.7 summarizes the results for all variants evaluated, displaying points whose coordinates are the values of the average deviation from the best known solution value and the total time in seconds for processing the eight runs on all instances, for each combination of parameter values. Eight variants of special interest are identified and labeled with the corresponding parameters β, H, and Q, in this order. These variants correspond to selected Pareto points in the plot in Fig. 6.7. Setting β = 0 and H = 1 corresponds to the greedy Lagrangean heuristic (GLH) or, equivalently, to LAGRASP(0,1,−), whose average deviation (in percentage) from the best value amounts to 0.12% in 4859.16 s of total running time. Table 6.3 shows the average deviation from the best known solution value and the total time for each of the eight selected variants.

Fig. 6.7
figure 7

Average deviation from the best value and total running time for 68 different variants of LAGRASP on a reduced set of 21 instances of the set k-covering problem: each point represents a unique combination of parameters β, H, and Q

Table 6.3 Summary of the numerical results obtained with the selected variants of the GRASP Lagrangean heuristic on a reduced set of 21 instances of the set k-covering problem

In another experiment, all 135 test instances were considered for the comparison of the above selected eight variants of LAGRASP. Table 6.4 summarizes the results obtained by the eight selected variants. It shows that LAGRASP(1,1,50) found the best solutions, with an average deviation from the best values amounting to 0.079%. It also found the best known solutions in 365 runs (each variant was run eight times on each instance), again with the best performance when the eight variants are evaluated side by side, although its running times are the largest. On the other hand, the smallest running times were observed for LAGRASP(0,50,−), which was over 3000 times faster than LAGRASP(1,1,50) but found the worst-quality solutions among the eight variants considered.

Table 6.4 Summary of the numerical results obtained with the selected variants of the GRASP Lagrangean heuristic on the full set of 135 instances of the set k-covering problem

Figure 6.8 illustrates the merit of the proposed approach for one of the test instances. We first observe that all variants reach the same lower bounds, as expected, since they depend exclusively on the common subgradient algorithm. However, as the lower bound appears to stabilize, the upper bound obtained by GLH (LAGRASP(0,1,−) also seems to freeze. On the other hand, the other variants continue to make improvements by discovering better upper bounds, since the randomized GRASP construction helps them to escape from locally optimal solutions and find new, improved upper bounds.

Fig. 6.8
figure 8

Evolution of lower and upper bounds over the iterations for different variants of LAGRASP. The number of iterations taken by each LAGRASP variant depends on the step-size, which in turn depends on the upper bounds produced by each heuristic

Finally, we provide a comparison between GRASP with backward path-relinking and the LAGRASP variants on all 135 test instances when the same time limits are used to stop all heuristics. Eight runs were performed for each heuristic and each instance, using different initial seeds for the random number generator. Each heuristic was run a total of (8 × 135 = ) 1080 times. The results in Table 6.5 show that all variants of LAGRASP outperformed GRASP with backward path-relinking and were able to find solutions whose costs are very close to or as good as the best known solution values, while GRASP with backward path-relinking found solutions whose costs are on average 4.05% larger than the best known solution values.

Table 6.5 Summary of results for the best variants of LAGRASP and GRASP

Figure 6.9 displays for one test instance the typical behavior of these heuristics. As opposed to the GRASP with path-relinking, the Lagrangean heuristics are able to escape from local optima for longer and keep on improving the solutions to obtain the best results.

We note that an important feature of Lagrangean heuristics is that they provide not only a feasible solution (which gives an upper bound, in the case of a minimization problem), but also a lower bound that may be used to give an estimate of the optimality gap that may be considered as a stopping criterion.

6.4 Path-Relinking

The LAGRASP heuristics presented in Sect. 6.3.7.3 made use of path-relinking. Path-relinking is another enhancement to the basic GRASP procedure, leading to significant improvements in both solution quality and running times. This technique was originally proposed by Glover [112] as an intensification strategy to explore trajectories connecting elite solutions obtained by tabu search or scatter search [113, 115, 116].

Fig. 6.9
figure 9

Evolution of solution costs with time for the best variants of LAGRASP and GRASP with backward path-relinking (GPRb)

We consider the undirected graph associated with the solution space G = (S, M), where the nodes in S correspond to feasible solutions and the edges in M correspond to moves in the neighborhood structure, i.e. (i, j) ∈ M if and only if iS, jS, jN(i) and iN(j), where N(s) denotes the neighborhood of a node sS. Path-relinking is usually carried out between two solutions: one is called the initial solution, while the other is the guiding solution. One or more paths in the solution space graph connecting these solutions are explored in the search for better solutions. Local search is applied to the best solution in each of these paths, since there is no guarantee that the best solution is locally optimal.

Let sS be a node on the path between an initial solution and a guiding solution gS. Not all solutions in the neighborhood N(s) are candidates to follow s on the path from s to g. We restrict the choice only to those solutions that are more similar to g than s. This is accomplished by selecting moves from s that introduce attributes contained in the guiding solution g. Therefore, path-relinking may be viewed as a strategy that seeks to incorporate attributes of high quality solutions (i.e. the guiding elite solutions), by favoring these attributes in the selected moves.

The use of path-relinking within a GRASP procedure, as an intensification strategy applied to each locally optimal solution, was first proposed by Laguna and Martí [147]. It was followed by several extensions, improvements, and successful applications  [8, 9, 22, 54, 104, 183, 206, 212, 216, 217, 227, 233, 249]. A survey of GRASP with path-relinking can be found in [213].

Enhancing GRASP with path-relinking almost always improves the performance of the heuristic. As an illustration, Fig. 6.10 shows time-to-target plots [7, 10, 228, 235, 236] for GRASP and GRASP with path-relinking implementations for four different applications. These time-to-target plots show the empirical cumulative probability distributions of the time-to-target random variable when using pure GRASP and GRASP with path-relinking, i.e., the time needed to find a solution at least as good as a prespecified target value. For all problems, the plots show that GRASP with path-relinking is able to find target solutions faster than GRASP.

GRASP with path-relinking makes use of an elite set to collect a diverse pool of high-quality solutions found during the search. This pool is limited in size, i.e. it can have at most Max_Elite solutions. Several schemes have been proposed for the implementation of path-relinking, which may be applied as:

  • an intensification strategy, between each local optimum obtained after the local search phase and one or more elite solutions;

  • a post-optimization step, between every pair of elite solutions;

  • an intensification strategy, periodically (after a fixed number of GRASP iterations since the last intensification phase) submitting the pool of elite solutions to an evolutionary process (see Sect. 6.4.7);

  • a post-optimization phase, submitting the pool of elite solutions to an evolutionary process; or

  • any other combination of the above schemes.

Fig. 6.10
figure 10

Time to target plots comparing running times of pure GRASP and GRASP with path-relinking on four instances of distinct problem types: three index assignment, maximum satisfiability, bandwidth packing, and quadratic assignment

The pool of elite solutions is initially empty. Each locally optimal solution obtained by local search and each solution resulting from path-relinking is considered as a candidate to be inserted into the pool. If the pool is not yet full, the candidate is simply added to the pool. Otherwise, if the candidate is better than the incumbent (best solution found so far), it replaces an element of the pool. In case the candidate is better than the worst element of the pool but not better than the best element, then it replaces some element of the pool if it is sufficiently different from every other solution currently in the pool. To balance the impact on pool quality and diversity, the element selected to be replaced is the one that is most similar to the entering solution among those elite solutions of quality no better than the entering solution [216].

Given a local optimum s 1 produced at the end of a GRASP iteration, we need to select at random a solution s 2 from the pool to apply path-relinking between s 1 and s 2. In principle, any pool solution could be selected. However, we may want to avoid pool solutions that are too similar to s 1, because relinking two solutions that are similar limits the scope of the path-relinking search. If the solutions are represented by 0–1 indicator vectors, we should favor pairs of solutions that are far from each other, based on their Hamming distance (i.e., the number of components that take on different values in each solution). A strategy introduced in Resende and Werneck [216] is to select a pool element s 2 at random with a probability proportional to the Hamming distance between the pool element and the local optimum s 1. Since the number of paths between two solutions grows exponentially with their Hamming distance, this strategy favors pool elements with a large number of paths connecting them to and from s 1.

After determining which solution (s 1 or s 2) will be designated the initial solution i and which will be the guiding solution g, the algorithm starts by computing the set Δ(i, g) of components in which i and g differ. This set corresponds to the moves which should be applied to i to reach g. Starting from the initial solution, the best move in Δ(i, g) still not performed is applied to the current solution, until the guiding solution is reached. By best move, we mean the one that results in the highest quality solution in the restricted neighborhood. The best solution found along this trajectory is submitted to local search and returned as the solution produced by the path-relinking algorithm.

Fig. 6.11
figure 11

Pseudo-code of a template of a GRASP with path-relinking for a minimization problem

The pseudo-code shown in Fig. 6.11 summarizes the steps of a GRASP with path-relinking for a minimization problem. The pseudo-code follows the structure of the basic GRASP algorithm in Fig. 6.1. Lines 1 and 2 initialize the pool of elite solutions and the best solution value, respectively. Path-relinking is performed in line 11 between the solution Solution obtained at the end of the local search phase (line 8) and a solution Solution′ randomly selected from the pool of elite solutions \(\mathcal{E}\) (line 10). Procedure PR(Solution, Solution′) could make use, for example, of any variant of a pure or combined path-relinking strategy. The best overall solution found Best_Solution is returned in line 19 after the stopping criterion is satisfied.

Several alternatives have been considered and combined in recent implementations of path-relinking. These include forward, backward, back and forward, mixed, truncated, greedy randomized adaptive, evolutionary, and external path-relinking. All these alternatives, which are described in the following, involve trade-offs between computation time and solution quality.

6.4.1 Forward Path-Relinking

In forward path-relinking, the GRASP local optimum is designated as the initial solution and the pool solution is made the guiding solution. This is the original scheme proposed by Laguna and Martí [147].

6.4.2 Backward Path-Relinking

In backward path-relinking, the pool solution is designated as the initial solution and the GRASP local optimum is made the guiding one. This scheme was originally proposed in Aiex et al. [9] and Resende and Ribeiro [212]. The main advantage of this approach over forward path-relinking comes from the fact that, in general, there are more high-quality solutions near pool elements than near GRASP local optima. Backward path-relinking explores more thoroughly the neighborhood around the pool solution, whereas forward path-relinking explores more the neighborhood around the GRASP local optimum. Experiments in [9, 212] have shown that backward path-relinking usually outperforms forward path-relinking.

6.4.3 Back and Forward Path-Relinking

Back and forward path-relinking combines forward and backward path-relinking. As shown in [9, 212], it finds solutions at least as good as forward path-relinking or backward path-relinking, but at the expense of taking about twice as long to run. The reason that back and forward path-relinking often finds solutions of better quality than simple backward or forward path-relinking stems from the fact that it thoroughly explores the neighborhoods of both solutions s 1 and s 2.

Fig. 6.12
figure 12

Time-to-target plots for pure GRASP and four variants of GRASP with path-relinking (forward, backward, back and forward, and mixed) on an instance of the 2-path network design problem

6.4.4 Mixed Path-Relinking

Mixed path-relinking shares the benefits of back and forward path-relinking, i.e. it thoroughly explores both neighborhoods, but does so in about the same time as forward or backward path-relinking alone. This is achieved by interchanging the roles of the initial and guiding solutions at each step of the path-relinking procedure. Therefore, two paths are generated, one starting at s 1 and the other at s 2. The paths evolve and eventually meet at some solution about half way between s 1 and s 2. The joined path relinks these two solutions. Mixed path-relinking was suggested by Glover [112] and was first implemented and tested by Ribeiro and Rosseti [227], where it was shown to outperform forward, backward, and back and forward path-relinking. Figure 6.12 shows a comparison of pure GRASP and four variants of path-relinking: forward, backward, back and forward, and mixed. The time-to-target plots show that GRASP with mixed path-relinking has the best running time profile among the variants compared.

6.4.5 Truncated Path-Relinking

Since good-quality solutions tend to be near other good-quality solutions, one would expect to find the best solutions with path-relinking near the initial or guiding solution. Indeed, Resende et al. [222] showed that this is the case for instances of the max-min diversity problem, as shown in Fig. 6.13. In that experiment, a back and forward path-relinking scheme was tested. The figure shows the average number of best solutions found by path-relinking taken over several instances and several applications of path-relinking. The 0–10% range in this figure corresponds to subpaths near the initial solutions for the forward path-relinking phase as well as the backward phase, while the 90–100% range are subpaths near the guiding solutions. As the figure indicates, exploring the subpaths near the extremities may produce solutions about as good as those found by exploring the entire path. There is a higher concentration of better solutions close to the initial solutions explored by path-relinking.

Truncated path-relinking can be applied to either forward, backward, backward and forward, or mixed path-relinking. Instead of exploring the entire path, truncated path-relinking only explores a fraction of the path and, consequently, takes a fraction of the time to run. Truncated path-relinking has been applied in [22, 222].

6.4.6 Greedy Randomized Adaptive Path-Relinking

In path-relinking, the best not yet performed move in set Δ(i, g) is applied to the current solution, until the guiding solution is reached. If ties are broken deterministically, this strategy will always produce the same path between the initial and guiding solutions. Since the number of paths connecting i and g is exponential in | Δ(i, g) |, exploring a single path can be somewhat limiting.

Greedy randomized adaptive path-relinking, introduced by Binato et al. [47], is a semi-greedy version of path-relinking. Instead of taking the best move in Δ(i, g) still not performed, a restricted candidate list of good moves still not performed is set up and a randomly selected move from the latter is applied. By applying this strategy several times between the initial and guiding solutions, several paths can be explored. Greedy randomized adaptive path-relinking has been applied in [22, 86, 222].

Fig. 6.13
figure 13

Average number of best solutions found at different depths of the path from the initial solution to the guiding solution on instances of the max-min diversity problem

6.4.7 Evolutionary Path-Relinking

GRASP with path-relinking maintains a pool of elite solutions. Applying path-relinking between pairs of pool solutions may result in an even better pool of solutions. Aiex et al. [9] applied path-relinking between all pairs of elite solutions as an intensification scheme to improve the quality of the pool and as a post-optimization step. The application of path-relinking was repeated until no further improvement was possible.

Resende and Werneck [216, 217] described an evolutionary path-relinking scheme applied to pairs of elite solutions and used as a post-optimization step. The pool resulting from the GRASP with path-relinking iterations is referred to as population P 0. At step k, all pairs of elite set solutions of population P k are relinked and the resulting solutions are made candidates for inclusion in population P k+1 of the next generation. The same rules for acceptance into the pool during GRASP with path-relinking are used for acceptance into P k+1. If the best solution in P k+1 is better than the best in P k, then k is incremented by one and the process is repeated. Resende et al. [222] describe another way to implement evolutionary path-relinking, where a single population is maintained. Each pair of elite solutions is relinked and the resulting solution is a candidate to enter the elite set. If accepted, it replaces an existing elite solution. The process is continued while there are still pairs of elite solutions that have not yet been relinked.

Fig. 6.14
figure 14

An internal path (red arcs, red nodes) from solution S to solution T and two external (blue arcs, blue nodes) paths, one emanating from solution S and the other from solution T. These paths are produced by internal and external path-relinking

Andrade and Resende [21] used this evolutionary scheme as an intensification process every 100 GRASP iterations. During the intensification phase, every solution in the pool is relinked with the two best ones. Since two elite solutions may be relinked more than once in different calls to the intensification process, greedy randomized adaptive path-relinking was used.

Resende et al. [222] showed that a variant of GRASP with evolutionary path-relinking outperformed several other heuristics using GRASP with path-relinking, simulated annealing, and tabu search for the max-min diversity problem.

6.4.8 External Path-Relinking and Diversification

So far in this section, we have considered variants of path-relinking in which a path in the search space graph connects two feasible solutions by progressively introducing in one of them (the initial solution) attributes of the other (the guiding solution). Since attributes common to both solutions are not changed and all solutions visited belong to a path between the two solutions, we may also refer to this type of path-relinking as internal path-relinking.

External path-relinking extends any path connecting two feasible solutions S and T beyond its extremities. To extend such a path beyond S, attributes not present in either S or T are introduced in S. Symmetrically, to extend it beyond T, attributes not present in either S or T are introduced in T. In its greedy variant, all moves are evaluated and the solution chosen to be next in the path is one with best cost or, in case they are all infeasible, the one with least infeasibility. In either direction, the procedure stops when all attributes that do not appear in either S or T have been tested for extending the path. Once both paths are complete, local search may be applied to the best solution in each of them. The best of the two local minima is returned as the solution produced by the external path-relinking procedure.

Figure 6.14 illustrates internal and external path-relinking. The path with red nodes and edges is the one resulting from internal path-relinking applied with S as the initial solution and T as the guiding solution. We observe that the orientation introduced by the arcs in this path is due only to the choice of the initial and guiding solutions. If the roles of solutions S and T were interchanged, it could have been computed and generated in the reverse direction. The same figure also illustrates two paths obtained by external path-relinking, one emanating from S and the other from T, both represented with blue nodes and edges. The orientations of the arcs in each of these paths indicate that they necessarily emanate from either solution S or T.

To conclude, we establish a parallel between internal and external path-relinking. Since internal path-relinking works by fixing all attributes common to the initial and guiding solutions and searches for paths between them satisfying this property, it is clearly an intensification strategy. Contrarily, external path-relinking progressively removes common attributes and replaces them by others that do not appear in either one of the initial or guiding solution. Therefore, it can be seen as a diversification strategy which produces solutions increasingly farther from both the initial and the guiding solutions. External path-relinking becomes therefore a tool for search diversification.

External path-relinking was introduced by Glover [114] and first applied by Duarte et al. [84] in a heuristic for differential dispersion minimization.

6.5 Restart Strategies

Figure 6.15 shows a typical iteration count distribution for a GRASP with path-relinking. Observe in this example that for most of the independent runs whose iteration counts make up the plot, the algorithm finds a target solution in relatively few iterations: about 25% of the runs take at most 101 iterations; about 50% take at most 192 iterations; and about 75% take at most 345. However, some runs take much longer: 10% take over 1000 iterations; 5% over 2000; and 2% over 9715 iterations. The longest run took 11,607 iterations to find a solution at least as good as the target. These long tails contribute to a large average iteration count as well as to a high standard deviation. This section proposes strategies to reduce the tail of the distribution, consequently reducing the average iteration count and its standard deviation.

Fig. 6.15
figure 15

Typical iteration count distribution of GRASP with path-relinking

Consider again the distribution in Fig. 6.15. The distribution shows that each run will take over 345 iterations with a probability of about 25%. Therefore, any time the algorithm is restarted, the probability that the new run will take over 345 iterations is also about 25%. By restarting the algorithm after 345 iterations, the new run will take more than 345 iterations with probability of also about 25%. Therefore, the probability that the algorithm will be still running after 345 + 345 = 690 iterations is the probability that it takes more than 345 iterations multiplied by the probability that it takes more than 690 iterations given that it took more than 345 iterations, i.e., about (1∕4) × (1∕4) = (1∕4)2. It follows by induction that the probability that the algorithm will still be running after k periods of 345 iterations is 1∕(4k). In this example, the probability that the algorithm will be running after 1725 iterations will be about 0.1%, i.e., much less than the 5% probability that the algorithm will take over 2000 iterations without restart.

A restart strategy is defined as an infinite sequence of time intervals τ 1, τ 2, τ 3,  which define epochs τ 1, τ 1 + τ 2, τ 1 + τ 2 + τ 3,  when the algorithm is restarted from scratch. It can be shown that the optimal restart strategy uses τ 1 = τ 2 = ⋯ = τ , where τ is some (unknown) constant. Strategies for speeding up stochastic local search algorithms using restarts were first proposed by Luby et al. [156], where they proved the existence of an optimal restart strategy. Restart strategies in metaheuristics have been addressed in [67, 139, 182, 187, 250]. Further work on restart strategies can be found in [251, 252].

Fig. 6.16
figure 16

Time-to-target plot for target solution value of 554 for a GRASP with path-linking with restart on the maximum cut instance G12 using different values of τ

Implementing the optimal strategy may be difficult in practice because it requires the constant value τ . Runtimes can vary greatly for different combinations of algorithm, instance, and solution quality sought. Since usually one has no prior information about the runtime distribution of the stochastic search algorithm for the optimization problem under consideration, one runs the risk of choosing a value of τ that is either too small or too large. On the one hand, a value that is too small can cause the restart variant of the algorithm to take much longer to converge than a no-restart variant. On the other hand, a value that is too large may never lead to a restart, causing the restart-variant of the algorithm to take as long to converge as the no-restart variant. Figure 6.16 illustrates the restart strategies with time-to-target plots for the maximum cut instance G12 [126] on an 800-node graph with edge density of 0.63% with target solution value 554 for τ = 6, 9, 12, 18, 24, 30, and 42 s. For each value of τ, 100 independent runs of a GRASP with path-relinking with restarts were performed. The variant with τ = corresponds to the heuristic without restart. The figure shows that, for some values of τ, the resulting heuristic outperformed its counterpart with no restart by a large margin.

In GRASP with path-relinking, the number of iterations between improvements of the incumbent solution tends to vary less than the runtimes for different combinations of instance and solution quality sought. If one takes this into account, a simple and effective restart strategy for GRASP with path-relinking is to keep track of the last iteration when the incumbent solution was improved and restart the GRASP with path-relinking if κ iterations have gone by without improvement. We shall call such a strategy restart(κ). A restart consists in saving the incumbent and emptying out the elite set.

Fig. 6.17
figure 17

Pseudo-code of a template of a GRASP with path-relinking with restarts for a minimization problem

The pseudo-code shown in Fig. 6.17 summarizes the steps of a GRASP with path-relinking using the restart(κ) strategy for a minimization problem. The algorithm keeps track of the current iteration (CurrentIter), as well as of the last iteration when an improving solution was found (LastImprov). If an improving solution is detected in line 16, then this solution and its cost are saved in lines 17 and 18, respectively, and the iteration of the last improvement is set to the current iteration in line 19. If, in line 21, it is determined that more than κ iterations have gone by since the last improvement of the incumbent, then a restart is triggered, emptying out the elite set in line 22 and resetting the iteration of the last improvement to the current iteration in line 23. If restart is not triggered, then in line 25 the current solution is tested for inclusion in the elite set and the set is updated if it is accepted. The best overall solution found Best_Solution is returned in line 28 after the stopping criterion is satisfied.

As an illustration of the use of the restart(κ) strategy within a GRASP with path-relinking, consider the maximum cut instance G12. For the values κ = 50, 100, 200, 300, 500, 1000, 2000, and 5000, the heuristic was run independently 100 times, and was stopped when a cut of weight 554 or higher was found. A strategy without restarts was also implemented. Figures 6.18 and 6.19, as well as Table 6.6, summarize these runs, showing the average time to target solution as a function of the value of κ and the time-to-target plots for different values of κ. These figures illustrate well the effect on running time of selecting a value of κ that is either too small (κ = 50, 100) or too large (κ = 2000, 5000). They further show that there is a wide range of κ values (κ = 200, 300, 500, 1000) that result in lower runtimes when compared to the strategy without restarts.

Fig. 6.18
figure 18

Average time to target solution for maximum cut instance G12 using different values of κ. All runs of all strategies have found a solution at least as good as the target value of 554

Figure 6.20 further illustrates the behavior of the restart(100), restart(500), and restart(1000) strategies for the previous example, when compared with the strategy without restarts on the same maximum cut instance G12. However, in this figure, for each strategy, we plot the number of iterations to the target solution value. It is interesting to note that, as expected, each strategy restart(κ) behaves exactly like the strategy without restarts for the κ first iterations, for κ = 100, 500, 1000. After this point, each trajectory deviates from that of the strategy without restarts. Among these strategies, restart(500) is the one with the best performance.

Fig. 6.19
figure 19

Time-to-target plots for maximum cut instance G12 using different values of κ. The figure also shows the time-to-target plot for the strategy without restarts. All runs of all strategies found a solution at least as good as the target value of 554

We make some final observations about these experiments. The effect of the restart strategies can be mainly observed in the column corresponding to the fourth quartile of Table 6.6. Entries in this quartile correspond to those in the heavy tails of the distributions. The restart strategies in general did not affect the other quartiles of the distributions, which is a desirable characteristic. Compared to the no-restart strategy, restart strategies restart(500) and restart(1000) were able to reduce the maximum number

Table 6.6 Summary of computational results on maximum cut instance G12 with four strategies

of iterations, as well as the average and the standard deviation. Strategy restart(100) did so, too, but not as much as restart(500) and restart(1000). Restart strategies restart(500) and restart(1000) were clearly the best strategies of those tested.

Fig. 6.20
figure 20

Comparison of the iterations-to-target plots for maximum cut instance G12 using strategies restart(100), restart(500), and restart(1000). The figure also shows the iterations-to-target plot for the strategy without restarts. All runs of all strategies found a solution at least as good as the target value of 554

The restart(κ) strategy for GRASP with path-relinking discussed in this section was originally proposed by Resende and Ribeiro [214]. Besides the experiments presented in this chapter for the maximum cut instance G12, that paper also considered five other instances of maximum cut, maximum weighted satisfiability, and bandwidth packing. Interian and Ribeiro [136] implemented restart strategies for GRASP with path-relinking for the Steiner traveling salesman problem.

6.6 Extensions

In this section, we comment on some extensions, implementation strategies, and hybridizations of GRASP.

The use of hash tables to avoid cycling in conjunction with tabu search was proposed by Woodruff and Zemel [266]. A similar approach was later explored by Ribeiro et al. [232] in their tabu search algorithm for query optimization in relational databases. In the context of GRASP implementations, hash tables were first used by Martins et al. [168] in their multi-neighborhood heuristic for the Steiner problem in graphs, to avoid the application of local search to solutions already visited in previous iterations.

Filtering strategies are used to speed up the iterations of GRASP, see e.g. [93, 168, 202]. With filtering, local search is not applied to all solutions obtained at the end of the construction phase, but only to some more promising unvisited solutions, defined by a threshold with respect to the incumbent.

Almost all randomization effort in the basic GRASP algorithm involves the construction phase. Local search stops at the first local optimum. On the other hand, strategies such as VNS (Variable Neighborhood Search), proposed by Hansen and Mladenović [121, 172], rely almost entirely on the randomization of the local search to escape from local optima. With respect to randomization, GRASP and variable neighborhood strategies can be considered complementary and potentially capable of leading to effective hybrid methods. A first attempt in this direction was made by Martins et al. [168] where the construction phase of a hybrid heuristic for the Steiner problem in graphs follows the greedy randomized strategy of GRASP, while the local search phase makes use of two different neighborhood structures, like the VND (variable neighborhood descent) procedure [121, 172]. That heuristic was later improved by Ribeiro et al. [233], where one of the key components of the new algorithm was another strategy for the exploration of different neighborhoods. Ribeiro and Souza [229] also combined GRASP with VND in a hybrid heuristic for the degree-constrained minimum spanning tree problem. Festa et al. [102] studied different variants and combinations of GRASP and VNS for the maximum cut problem, finding and improving the best known solutions for some open instances from the literature.

GRASP has also been used in conjunction with genetic algorithms. The greedy randomized strategy used in the construction phase of a GRASP heuristic is applied to generate the initial population for a genetic algorithm. As an example, consider the genetic algorithm of Ahuja et al. [4] for the quadratic assignment problem. It makes use of the GRASP heuristic proposed by Li et al. [150] to create the initial population of solutions. A similar approach was used by Armony et al. [31], with the initial population made up of both randomly generated solutions and those built by a GRASP heuristic.

The hybridization of GRASP with tabu search was first studied by Laguna and González-Velarde [146]. Delmaire et al. [71] considered two approaches. In the first, GRASP is applied as a powerful diversification strategy in the context of a tabu search procedure. The second approach is an implementation of the Reactive GRASP algorithm presented in Sect. 6.3.2, in which the local search phase is strengthened by tabu search. Results reported for the capacitated location problem show that the hybrid approaches perform better than the isolated methods previously used. Two two-stage heuristics are proposed in [1] for solving the multi-floor facility layout problem. GRASP/TS applies a GRASP to find the initial layout and tabu search to refine it.

Iterated Local Search (ILS) iteratively builds a sequence of solutions generated by the repeated application of local search and perturbation of the local optima found by local search [42]. Lourenço et al. [155] point out that ILS has been rediscovered many times and is also known as iterated descent [40, 41], large step Markov chains [165], iterated Lin-Kernighan [137], and chained local optimization [164]. A GRASP/ILS hybrid can be obtained by replacing the standard local search of GRASP by ILS. The GRASP construction produces a solution which is passed to the ILS procedure. Ribeiro and Urrutia [230] presented a hybrid GRASP with ILS for the mirrored traveling tournament problem, in which perturbations are achieved by randomly generating solutions in the game rotation ejection chain [110, 111] neighborhood.

6.7 Applications

The first application of GRASP was described in the literature in 1989 [90]. In that paper, GRASP was applied to difficult set covering problems Since then, GRASP has been applied to a wide range of problems. The main applications areas are summarized below with links to specific references:

The reader is referred to Festa and Resende [100] and the book by Resende and Ribeiro [215] for extended annotated bibliographies of GRASP applications.

6.8 Concluding Remarks

The results described in this chapter reflect successful applications of GRASP to a large number of classical combinatorial optimization problems, as well as to problems that arise in real-world situations in different areas of business, science, and technology.

We underscore the simplicity of implementation of GRASP, which makes use of simple building blocks (solution construction procedures and local search methods) that are often readily available. Contrary to what occurs with most other metaheuristics, such as tabu search or genetic algorithms, that make use of a large number of parameters in their implementations, the basic variant of GRASP requires the adjustment of a single parameter, i.e. the restricted candidate list (RCL) parameter α.

Recent developments, presented in this chapter, show that different extensions of the basic procedure allow further improvements in the solutions found by GRASP. Among these, we highlight reactive GRASP, which automates the adjustment of the restricted candidate list parameter; variable neighborhoods, which permit accelerated and intensified local search; path-relinking, which beyond allowing the implementation of intensification strategies based on the memory of elite solutions, opens the way for the development of very effective cooperative parallel strategies [6, 8, 9, 227]; and restart strategies to speedup the search.

These and other extensions make up a set of tools that can be added to simpler heuristics to find better-quality solutions. To illustrate the effect of additional extensions on solution quality, Fig. 6.21 shows some results obtained for the prize-collecting Steiner tree problem (PCSTP), as discussed by Canuto et al. in [54]. The figure shows results for 11 different levels of solution accuracy (varying from optimal to 10% from optimal) on 40 PCSTP instances. For each level of solution accuracy, the figure shows the number of instances for which each component found solutions within the accuracy level. The components are the primal-dual constructive algorithm (GW) of Goemans and Williamson [117], GW followed by local search (GW+LS), corresponding to the first GRASP iteration, 500 iterations of GRASP with path-relinking (GRASP+PR), and the complete algorithm, using variable neighborhood search as a post-optimization procedure (GRASP+PR+VNS). We observe that the number of optimal solutions found goes from six, using only the constructive algorithm, to a total of 36, using the complete algorithm described in [54]. The largest relative deviation with respect to the optimal value decreases from 36.4% in the first case, to only 1.1% for the complete algorithm. It is easy to notice the contribution made by each additional extension.

Fig. 6.21
figure 21

Performance of GW approximation algorithm, a single GRASP iteration (GW followed by local search), 500 iterations of GRASP with path-relinking, and 500 iterations of GRASP with path-relinking followed by VNS for series C prize-collecting Steiner tree problems

The structure of GRASP makes it very amenable to straightforward, efficient parallel implementations that benefit from the computer architecture. Parallel implementations of GRASP [6, 8, 9, 227] are quite robust and lead to linear speedups both in independent and cooperative strategies. Cooperative strategies are based on the collaboration between processors through path-relinking and a global pool of elite solutions. This allows the use of more processors to find better solutions in less computation time. Many parallel implementations of GRASP have been reported in the literature, see e.g. [166, 168, 178, 188, 189]. In many of these papers, a common observation was made: the speedups in the measured running times were proportional to the number of processors. This observation can be explained if the random variable time-to-target-solution-value is exponentially distributed. Aiex et al. [7] developed a graphical methodology based on runtime distributions to empirically show that the running times of GRASP heuristics fit exponential distributions, as summarized below.

Runtime distributions or time-to-target plots display on the ordinate axis the probability that an algorithm will find a solution at least as good as a given target value within a given running time, shown on the abscissa axis. They provide a very useful tool to characterize the running times of stochastic algorithms for combinatorial optimization problems and to compare different algorithms or strategies for solving a given problem. Time-to-target plots were first used by Feo et al. [93] and have been widely used as a tool for algorithm design and comparison. Runtime distributions have also been advocated by Hoos and Stützle [135] as a way to characterize the running times of stochastic local search algorithms for combinatorial optimization. In particular, they have been largely applied to evaluate and compare the efficiency of different strategies of sequential and parallel implementations of GRASP with (and without) path-relinking heuristics. Aiex et al. [7] used time-to-target plots to show experimentally that the running times of GRASP heuristics fit shifted (or two-parameter) exponential distributions, reporting computational results for 2400 runs of GRASP heuristics for each of five different problems: maximum stable set, quadratic assignment, graph planarization [210, 211, 226], maximum weighted satisfiability, and maximum covering. Aiex et al. [10] developed a Perl program to create time-to-target plots for measured times that are assumed to fit a shifted exponential distribution, following closely the work in [7]. Ribeiro et al. [235] developed a closed form result to compare two exponential algorithms and an iterative procedure to compare two algorithms following generic runtime distributions. This work was extended by Ribeiro et al. [236] and was also applied in the comparison of parallel heuristics. Ribeiro and Rosseti [228] developed a code to compare runtime distributions of randomized algorithms.

To conclude, this chapter provides the reader with the tools to build a basic GRASP to find optimal or near-optimal solutions to a combinatorial optimization problem. The chapter also provides the means to add more advanced features to this basic GRASP, like path-relinking and restart strategies, that enable better performance, both with respect to solution quality and solution run time. Left out of this chapter is the use of GRASP for solving continuous optimization problems. The interested reader is pointed to [128131, 215, 254] for an introduction to C-GRASP, or Continuous GRASP, as well as to some software and applications of C-GRASP.