1 Introduction

Transportation has a significant impact on today’s societies, particularly on economic growth and employment (Fink et al. 2019). It employs millions of people globally and is considered a major component of organizations’ costs (Baradaran et al. 2019; El Khoury et al. 2014; Comtois et al. 2013). Further, transportation depends heavily on oil resources and is a major source of CO2, CO, N2O, and NH3 emissions. According to a US EPA report, approximately 28% of national greenhouse-gas emissions in 2017 were generated by transportation. Optimizing transportation processes is therefore becoming a priority for transportation companies, since even small improvements can have a large impact on the environment and on organizations’ costs. Furthermore, companies today are concerned not only with costs but also with providing the best customer service to optimize fulfillment, logistics, and production, which in turn leads to stronger customer loyalty and thus better organizational performance.

In transportation, the Vehicle Routing Problem (VRP) deals with the transportation of goods between depots and customers, where a set of routes must be defined for a number of vehicles to travel from their depot(s) to customers (Côté et al. 2020). The traveling cost between the depot and each customer and between each pair of customers is given. A VRP solution must find a route for each vehicle, starting and ending at the depot, such that each customer is served by exactly one vehicle, the overall cost of the routes is minimized, and customer satisfaction (fulfilling their demands) is maximized, while taking into account a set of given constraints. Typically, the solution also has to take into consideration several other restrictions, such as the capacity of the vehicles, the working hours of the salespersons, and the priority of the desired customers. Further, there are several variants of the VRP that take into account different factors, such as the nature of the transported goods, the quality of the service required, and the characteristics of the customers and the vehicles. In Fig. 1 below, we show a typical VRP instance and one of its possible solutions:

Fig. 1 An instance of a VRP (left) and its solution (right)

The literature presents different algorithms that have been used to solve the VRP, such as Tabu Search (Du and He 2012; Jin et al. 2012), the Artificial Bee Colony algorithm (Szeto et al. 2011; Gomez and Salhi 2014), the Bee Mating Optimization algorithm (Marinaki et al. 2010), Ant Colony Optimization (Akpinar 2016), the Genetic Algorithm (GA) (Nazif and Lee 2012), Particle Swarm Optimization (PSO) (Kim and Son 2012; Chen et al. 2006), the Water Flow Alike algorithm (Zainudin et al. 2015), the membrane algorithm (Niu et al. 2015), the Cooperative Parallel metaheuristic (Jin et al. 2014) and the Clarke Wright algorithm (Clarke and Wright 1964; Shour et al. 2015). The Clarke Wright (CW) algorithm was developed in 1964 to solve the VRP. It is classified as a constructive method that handles a variable number of vehicles and works equally well for both directed and undirected problems. Further, a recent metaheuristic known as the cuckoo search (CS) was introduced by Yang and Deb in 2009 and has received much attention from researchers in various optimization areas. The CS has been applied to continuous optimization problems, where it has shown better performance than popular meta-heuristic algorithms such as the GA, Particle Swarm Optimization (PSO) and others (Ouaarab et al. 2014; Yang and Deb 2010; Yildiz 2013).

Recently, it has become very popular among researchers to use search methods for selecting heuristics to solve computational search problems (Burke et al. 2010). This optimization paradigm is called hyper-heuristics and is described as using “heuristics to choose heuristics”. The main difference between hyper-heuristics and meta-heuristics is that hyper-heuristics search a space of heuristics rather than a space of problem solutions. Thus, when applied to a specific problem, a hyper-heuristic aims to find a proper combination of easy-to-implement low-level heuristics that can produce an acceptable domain solution (Burke et al. 2013).
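To make the distinction concrete, the following minimal Python sketch illustrates the idea of a selection hyper-heuristic: the search operates over a set of low-level heuristics, and each chosen heuristic in turn modifies a domain solution. This is our own illustration, not code from the cited works; the names hyper_heuristic, low_level_heuristics, and evaluate are assumed placeholders, and the random selection and improvement-only acceptance are deliberately simplistic.

```python
import random

def hyper_heuristic(initial_solution, low_level_heuristics, evaluate, iterations=1000):
    """Minimal selection hyper-heuristic: repeatedly pick a low-level heuristic,
    apply it to the incumbent solution, and keep the result if it improves."""
    best = initial_solution
    best_cost = evaluate(best)
    for _ in range(iterations):
        heuristic = random.choice(low_level_heuristics)  # heuristic-selection step
        candidate = heuristic(best)                      # the heuristic modifies the domain solution
        cost = evaluate(candidate)
        if cost < best_cost:                             # simple move-acceptance rule
            best, best_cost = candidate, cost
    return best
```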

Motivated by the above literature, this paper proposes a modified version of the Clarke Wright algorithm and an enhanced cuckoo search-based hyper-heuristic that selects, at each step, the most suitable low-level heuristic to search for a solution to the VRP with prioritized customers (VRPC) in the problem’s search space. The choice of a hyper-heuristic based on the Cuckoo Search metaheuristic was motivated by the advantages of this metaheuristic: compared to other heuristics, it has fewer adjustable parameters to configure, and it has a greater potential to balance exploitation and exploration. Our proposed Clarke Wright algorithm extends the classical CW to handle prioritized customers. Both methods are tested on a set of eighteen randomly generated test cases that simulate actual VRP data with predefined vehicle capacities, route times, and customer priorities. The goal is to minimize the traveling distance of vehicles and reduce the time needed to move freight from the depot to prioritized customers. In addition, both methods are tested on real data from a distribution company operating in Lebanon. The cuckoo search-based hyper-heuristic outperformed the modified Clarke Wright algorithm.

The rest of this paper is organized as follows. Section 2 presents the literature review. Section 3 describes the VRP problem and its formulation. Section 4 presents a description of the classical and modified Clarke Wright algorithms. Section 5 presents the classical Cuckoo Search algorithm. The cuckoo search-based hyper-heuristic is presented in Sect. 6. Section 7 presents the empirical results. Finally, the conclusion is presented in Sect. 8.

2 Literature review

The Vehicle Routing Problem (VRP) is known to be an NP-hard problem; its computational complexity increases exponentially as the number of customers grows (Lenstra and Rinnooy Kan 1981). Researchers have approached the vehicle routing problem using various methods, of which exact methods and heuristic algorithms are the most popular. Although exact methods can obtain an optimal solution, they are not efficient enough, especially for large instances (Abu-Khzam et al. 2014; Captivo et al. 2003). Hence, the need to find good solutions quickly (not necessarily optimal ones) has led to the development of various heuristic algorithms (Cordeau et al. 2005) and approximate (meta-heuristic) algorithms (Haraty et al. 2018; Tarhini et al. 2016). Some well-structured heuristics can quickly attain feasible solutions for targeted problems. However, the feasible solutions found by heuristic algorithms are not always near-optimal, and thus these methods cannot guarantee solution quality (Tarhini et al. 2014).

In fact, previous works have shown that meta-heuristic algorithms can readily be applied to various VRPs to obtain near-optimal solutions within an acceptable computational time (Yang and Deb 2010; Yang et al. 2012; Khoury et al. 2019); thus, several meta-heuristic algorithms, including Particle Swarm Optimization (PSO) (Nazif and Lee 2012; Kim and Son 2012), Tabu Search (TS) (Ai and Kachitvichyanukul 2009; Chen et al. 2006), Simulated Annealing (SA) (Gounaris et al. 2014), Genetic Algorithms (GAs) (Zainudin et al. 2015; Jin et al. 2014), and Squeaky Wheel Optimization (SWO) (Zhen 2016), have been proposed to solve VRPs. Nevertheless, the literature does not contain any use of the Cuckoo Search (CS) algorithm to solve the vehicle routing problem with prioritized customers at either the heuristic or the hyper-heuristic level. An interesting work by Ouaarab et al. (2014) used the CS to solve the traveling salesperson problem (TSP), and the results show that the CS algorithm outperformed several other popular meta-heuristic algorithms. In addition to continuous optimization problems (Yang et al. 2012; Gandomi et al. 2013), the CS has achieved remarkable performance in constrained optimization problems (Yang and Deb 2013; Bulatović et al. 2013; Bhargava et al. 2013), web service composition selection (Chifu et al. 2012), neural network training (Vazquez 2011), bin packing (Layeb 2011) and manufacturing scheduling systems (Burnwal and Deb 2013).

On the other hand, the literature shows that only a few works have used hyper-heuristics to solve the VRP. Asta and Ozcan (2014) used a HyFlex framework-based hyper-heuristic approach to solve the VRP, while Garrido and Castro (2009) used an evolutionary hyper-heuristic approach. Further, Marshall et al. (2014) described a grammatical evolution-based hyper-heuristic for the capacitated VRP. To the best of the authors’ knowledge, the use of cuckoo search-based hyper-heuristics to solve the VRPC remains unexplored in the literature. Accordingly, this work develops a Cuckoo Search-based hyper-heuristic to solve a nonclassical VRP with realistic constraints, such as customer priority and constrained route times, and compares its results with those of the Clarke Wright algorithm in order to obtain better and more satisfactory solutions. Our work considers two types of heuristics: constructive heuristics and improvement heuristics. A constructive heuristic positions customers along a route and creates new routes as needed until all the customers have been assigned to a route. Improvement heuristics, in contrast, require an initial solution to start with and then modify the placement of the customers within the routes; their role is to reduce the distance required to visit all the customers.

3 Vehicle routing problem description

3.1 Preliminary

Due to fierce competition with their rivals, transportation and logistics companies noticed decreases in their profit margins when their trucks were not loaded to the needed capacity and routes were not optimally traversed. Accordingly, efficient and effective measures had to be taken at the operational level by optimally routing vehicles to customers. To illustrate the effect of such routing, the following example, shown in Fig. 2, presents two different permutations for establishing a route between the central depot 0 and 4 customers (represented as nodes), where the distance between customers is displayed on the edge between the nodes and the demand di is shown inside each node. Assume that the maximum capacity per vehicle is 40. The two permutations lead to two different costs. In the first permutation, the vehicle travels from depot 0 to customer 1 and returns to the depot, from which it travels to customer 2 and back again; the path is thus 0–1–0, 0–2–0. The demand on this trip is fulfilled at a cost of 12 + 12 + 11 + 11 = 46. In the second permutation, we traverse the path 0–1–2–0. The demand of this trip (10 + 15) is less than the vehicle capacity and is thus fulfilled; the cost of this trip is 12 + 8 + 11 = 31. This clearly shows that selecting the appropriate route reduces the cost of the trip between customers.
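The cost comparison in this example can be reproduced in a few lines of Python. The distances are those of Fig. 2 (depot–customer 1, depot–customer 2, and customer 1–customer 2 distances of 12, 11, and 8); route_cost is an illustrative helper of ours, not code from the paper.

```python
# Symmetric distance matrix for the example of Fig. 2 (node 0 is the depot).
dist = {(0, 1): 12, (0, 2): 11, (1, 2): 8}
dist.update({(j, i): d for (i, j), d in dist.items()})  # make it symmetric

def route_cost(route):
    """Total distance of a route given as a sequence of node ids."""
    return sum(dist[(route[k], route[k + 1])] for k in range(len(route) - 1))

# Permutation 1: serve each customer on a separate out-and-back trip.
print(route_cost([0, 1, 0]) + route_cost([0, 2, 0]))  # 24 + 22 = 46
# Permutation 2: serve both customers on a single route.
print(route_cost([0, 1, 2, 0]))                       # 12 + 8 + 11 = 31
```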

Fig. 2 VRP route establishment, the first permutation

3.2 Mathematical formulation

Given a set of customers \( {\mathbf{\mathcal{C}}} \) = {1,…,n} with priorities γi ∈ {1,…,n} and demands di ∈ \( {\mathbf{\mathcal{D}}} \) = {1,…,k} for a product, the customers must be served using a set of vehicles. The vehicles are situated at a central depot to which they must return after serving the customers. The cost of traveling between customer i and customer j is related to the distance traversed and is denoted by cij. Each vehicle has a given maximum capacity Q. In a VRP, we need to determine a routing schedule that minimizes the total cost of deliveries such that each route starts and ends at the depot, every customer belongs to exactly one route, and the vehicle capacity is not exceeded on any route. A route represents a sequence of customers assigned to a vehicle.

The VRP is known to be an NP-hard problem (Lenstra and Rinnooy Kan 1981), and we represent it using graphs. Let G(VG, EG) be a graph in which the following holds: vertex vi ∈ VG represents a customer to be visited, where |VG| = n; the customer’s demand di represents the number of products requested by customer vi; the edge e ∈ EG joins the two vertices vi and vj and represents the existence of a flow between customers vi and vj; and the cost of traversing this edge, cij, represents the distance between the two customers vi and vj. There exist m vehicles, where mi ∈ \( {\mathcal{M}} \) = {1,…,y}, with capacities Qj, where j = 1,…,q, that start and end at the central depot, located at vertex 0. The problem is to determine the cycles R1,…,Rn for the vehicles that start from vertex 0 and service all vertices such that the load of vehicle j does not exceed its capacity Qj and the total cost of the cycles is minimized.

To sum up, in a VRP, we need to determine a routing schedule that minimizes the total cost of deliveries such that the following constraints are met:

1. each route starts and ends at the depot,

2. every customer belongs to exactly one route,

3. the total demand on each route does not exceed the vehicle capacity Q,

4. the total duration of each route does not exceed a predefined limit T,

5. a vehicle can do more than one route, and

6. a preference priority is assigned to every customer such that no more than γ preferred customers may be visited on the same route.

Given a set of customers \( {\mathbf{\mathcal{C}}} \), a set of vehicles \( {\mathcal{M}} \), and the operating time at each customer \( t_{i}^{o} \), the following represents our proposed mathematical formulation:

$$ \hbox{min} \mathop \sum \limits_{m \in M} \mathop \sum \limits_{i} \mathop \sum \limits_{j} \left( {t_{ij} + t_{j}^{o} } \right)x_{ijm} $$

which is subject to

$$ \mathop \sum \limits_{i = 0}^{n} x_{ijm} = 1 \quad \forall j \in C,m \in M $$
(1)
$$ \mathop \sum \limits_{j = 0}^{n} x_{ijm} = 1\quad \forall i \in C, m \in M $$
(2)
$$ \mathop \sum \limits_{i}^{n} \mathop \sum \limits_{j, i \ne j}^{n} x_{ijm} q_{i} \le Q\quad \forall m \in M $$
(3)
$$ \mathop \sum \limits_{i}^{n} \mathop \sum \limits_{j, i \ne j}^{n} x_{ijm} P_{i} \le \gamma \quad \forall m \in M $$
(4)
$$ \mathop \sum \limits_{i}^{n} \mathop \sum \limits_{j, i \ne j}^{n} x_{ijm} \left( {t_{ij} + t_{i}^{o} } \right) \le T\quad \forall m \in M;\quad x_{ijm} \in \left\{ {0,1} \right\} $$
(5)

In the first two constraints, the binary variable \( x_{ijm} \) indicates whether vehicle m travels directly from customer i to customer j; together, these constraints ensure that exactly one edge enters and exactly one edge leaves each vertex associated with a customer. Constraint 3 ensures that the load carried by each vehicle m does not exceed the defined maximum capacity. Constraint 4 ensures that no tour includes more than \( \gamma \) prioritized customers. Constraint 5 ensures that the travel time of each vehicle (possibly over more than one tour) does not exceed the specified time limit.
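To make the formulation concrete, the sketch below checks a candidate routing plan against the per-route form of these constraints and returns its objective value (total travel plus operating time). It is our own illustration with assumed data structures: a travel-time matrix t, operating times t_op with t_op[0] = 0 for the depot, and demand/priority dictionaries keyed by customer id; the per-vehicle aggregation of constraint 5 over multiple tours is omitted for brevity.

```python
def evaluate_plan(routes, t, t_op, demand, priority, Q, T, gamma):
    """Validate a candidate plan (one customer list per tour, depot implicit at
    both ends) against the capacity, priority, and route-time constraints, and
    return the total travel-plus-operating time used as the objective."""
    visited = [c for route in routes for c in route]
    assert sorted(visited) == sorted(demand), "each customer must be on exactly one route"
    total = 0.0
    for route in routes:
        path = [0] + route + [0]                                    # start and end at the depot
        assert sum(demand[c] for c in route) <= Q, "capacity Q exceeded"
        assert sum(priority[c] for c in route) <= gamma, "too many prioritized customers"
        time = sum(t[i][j] + t_op[j] for i, j in zip(path, path[1:]))
        assert time <= T, "route-time limit T exceeded"
        total += time
    return total
```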

4 Clarke–Wright algorithm

4.1 The classical Clarke–Wright algorithm

The Clarke–Wright savings algorithm is one of the known heuristics that can be used to solve the VRP. It was developed in 1964 and is classified as a constructive method in which tours are built up by adding nodes to partial tours or combining subtours to meet the capacities and costs (Clarke and Wright 1964). It applies to problems for which the number of vehicles is not fixed (it is a decision variable), and it works equally well for both directed and undirected problems. When two routes (0,…,i,0) and (0,j,…,0) can feasibly be merged into a single route (0,…,i,j,…,0), a distance saving Sij = ci0 + c0j − cij is generated. A description of the classical Clarke Wright (CW) algorithm is given as follows.

Algorithm 1 The classical Clarke–Wright savings algorithm
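For reference, a compact Python sketch of the classical savings procedure follows. It is not the authors’ listing: it assumes a symmetric distance matrix with the depot at index 0, a single capacity Q, and the usual parallel merging of routes in decreasing order of savings.

```python
def clarke_wright(dist, demand, Q):
    """Classical Clarke-Wright savings heuristic (parallel merging).
    dist[i][j]: distance matrix with the depot at index 0;
    demand[c]: demand of customer c (c >= 1); Q: vehicle capacity."""
    customers = sorted(demand)
    routes = [[c] for c in customers]                 # one out-and-back route per customer
    loads = [demand[c] for c in customers]
    where = {c: k for k, c in enumerate(customers)}   # customer -> index of its current route
    # Savings s_ij = c_i0 + c_0j - c_ij, processed in decreasing order.
    savings = sorted(((dist[i][0] + dist[0][j] - dist[i][j], i, j)
                      for i in customers for j in customers if i != j), reverse=True)
    for s, i, j in savings:
        ri, rj = where[i], where[j]
        if ri == rj or s <= 0:
            continue
        # Merge (0,...,i,0) and (0,j,...,0) into (0,...,i,j,...,0) when i ends its
        # route, j starts its route, and the combined load fits in one vehicle.
        if routes[ri][-1] == i and routes[rj][0] == j and loads[ri] + loads[rj] <= Q:
            routes[ri].extend(routes[rj])
            loads[ri] += loads[rj]
            for c in routes[rj]:
                where[c] = ri
            routes[rj], loads[rj] = [], 0
    return [r for r in routes if r]
```

Calling clarke_wright(dist, demand, Q) returns a list of routes (customer sequences), each implicitly starting and ending at the depot.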

4.2 The modified Clarke–Wright algorithm

A modified version of the Clarke–Wright algorithm (CW) can be implemented by adding two new constraints to the algorithm. The first is the driver time, which should not exceed the upper bound Ut; the second is the customer preference priority, Pp, whereby a route should not contain more than n preferred customers. In the modified CW algorithm, customers are clustered by vehicle. First, compute the Euclidean distance matrix (di,j) according to the following equation:

$$ d_{i,j} = \sqrt {\left( {X_{i} - X_{j} } \right)^{2} + \left( {Y_{i} - Y_{j} } \right)^{2} } $$

where Xi, Yi and Xj, Yj are the geographical locations of customers i and j, respectively. Second, the savings value between customers i and j is calculated as follows:

$$ S_{i,j} = d_{0,j} - d_{i,j} $$

where \( d_{0,j} \) is the traveling distance between the depot and customer j, and di,j is the traveling distance between customers i and j. Bodin et al. (1983) later updated the Clarke–Wright formulation; with their update, the savings values collected in the savings list are calculated as follows:

$$ S_{i,j} = d_{0,j} + d_{j,0} - d_{i,j} $$

A description of the modified Clarke Wright (CW) algorithm is given as follows.

Algorithm 2 The modified Clarke–Wright algorithm
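The two additional constraints can be folded into the merge test of the savings loop. The sketch below is our interpretation of the description above (helper names such as merge_is_feasible, Ut, and max_priority are assumptions, and the exact checks in the authors’ implementation may differ): a merge is accepted only if the combined route respects the capacity, the driver-time upper bound Ut, and the limit on prioritized customers.

```python
def merge_is_feasible(route_i, route_j, travel_time, demand, priority, service,
                      Q, Ut, max_priority):
    """Feasibility test for merging (0,...,i,0) with (0,j,...,0) in the modified
    Clarke-Wright: vehicle capacity, driver-time upper bound Ut, and at most
    `max_priority` prioritized customers on the merged route."""
    merged = route_i + route_j
    if sum(demand[c] for c in merged) > Q:
        return False
    if sum(priority[c] for c in merged) > max_priority:
        return False
    path = [0] + merged + [0]
    duration = (sum(travel_time[a][b] for a, b in zip(path, path[1:]))
                + sum(service[c] for c in merged))
    return duration <= Ut
```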

5 A classical cuckoo search algorithm

Nature-inspired meta-heuristics have proven their adeptness and efficacy on a wide range of problems, which has contributed to the introduction of new nature-inspired meta-heuristic solutions over the years. All meta-heuristic algorithms share two important characteristics, which are intensification and diversification. The dominance of these algorithms comes from the fact that they imitate the best features of nature, especially those of biological systems that evolved from natural selection over millions of years.

The Cuckoo Search (CS) is one of the latest nature-inspired metaheuristic algorithms belonging to the swarm intelligence category. The results of several studies show that the CS performs better than other nature-inspired algorithms, such as PSO and the GA (Yang and Deb 2009, 2010), when solving continuous optimization problems. Yang and Deb (2009) introduced the CS based on the brood parasitism behavior of cuckoos. It is inspired by the aggressive reproduction behavior of cuckoo birds, which lay their eggs in communal nests and may remove others’ eggs to increase the hatching probability of their own. If a host bird discovers that the eggs are not its own, it will either throw away these strange eggs or simply desert its nest and build a new nest elsewhere. The cuckoo’s egg may be discovered by the host bird with a certain probability Ped ∈ [0,1].

The CS algorithm models this behavior. After each step, the worst solutions are discarded and new solutions are generated; this corresponds to the host birds identifying the worst nests, which are then abandoned and replaced by newly built ones. In each iteration, a cuckoo solution tries to replace a nest among the solution nests so as to obtain the best solution after each repetition. Thus, solution \( X_{i}^{t + 1} \) is generated from solution \( X_{i}^{t} \) of cuckoo i by performing a Lévy flight (a method for generating eggs) according to the equation below:

$$ X_{i}^{t + 1} = X_{i}^{t} + \alpha \oplus {\text{Lévy}}\left( {s,\lambda } \right) $$

where α > 0 is the step size, which is related to the scale of the problem of interest; in the majority of cases, the value α = 1 is used. The most important characteristic of Lévy flights is their intensive search around a current solution, while their occasional large steps reduce the probability of becoming trapped in local optima. A Lévy flight is modeled by a probability density function:

$$ {\text{Lévy}}\left( {s,\lambda } \right)\sim s^{ - \lambda } ,\quad \left( {1 < \lambda \le 3} \right) $$

This has an infinite variance with an infinite mean. Here, s is the step size drawn from a Lévy distribution. A detailed description of the CS can be found in the work done by Yang and Deb (2010).
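As an illustration of how such steps are commonly generated in practice (this is a generic sketch, not the authors’ code), Mantegna’s algorithm draws heavy-tailed step lengths; a stability index beta of about 1.5 is a typical choice, and its exact mapping to the λ above is an assumption of this sketch. The continuous update of the equation above then adds the scaled step to each coordinate of the current solution.

```python
import math
import random

def levy_step(beta=1.5):
    """Draw one heavy-tailed step length using Mantegna's algorithm."""
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = random.gauss(0, sigma_u)
    v = random.gauss(0, 1)
    return u / abs(v) ** (1 / beta)

def cuckoo_move(x, alpha=1.0, beta=1.5):
    """New candidate: x_i^(t+1) = x_i^t + alpha * (Levy step), per dimension."""
    return [xi + alpha * levy_step(beta) for xi in x]
```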

6 Proposed swarm intelligence cuckoo search based hyper-heuristic for the VRPC

The hyper-heuristic algorithm (Algorithm 3) that we propose in this paper uses the Cuckoo Search algorithm (Yang and Deb 2009) to combine low-level heuristics such that a domain-specific solution, soldomain (i.e., a VRPC solution), is guided towards the optimal or a near-optimal solution. In our approach, we use two sets of low-level heuristics. The first set is applied sequentially to improve the solution; this set is named llh-imp and shown in Table 1. The second set is applied according to a selection method based on the Lévy-flight concept; this set is named llh_levy and shown in Table 2.

Table 1 The set of considered low-level heuristics used for the improvement step
Table 2 The set of Lévy-flight low-level heuristics used for the perturbation step

6.1 Egg representation

In this work, we assume that a cuckoo lays a single egg in one nest; thus, an egg in a nest is a solution represented by one individual in the population, while the nest is the container of that new cuckoo egg and its abandonment involves its egg being replaced in the population by a new one. The cuckoo egg is defined as follows:

$$ {\text{cuckoo}}_{\text{egg}} \, = \,\left( {llh\_levy,sol_{domain} } \right) $$

where llh_levy represents a sequence of n low-level heuristics that are applied to an improved domain solution (VRPC), soldomain, in the order in which they appear in the sequence, and soldomain represents the domain solution (i.e., the VRPC solution).
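As a data-structure sketch mirroring this notation (type and field names are ours), an egg can be represented as a pair of a low-level-heuristic sequence and the domain solution it acts on:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class VRPCSolution:
    routes: List[List[int]]   # one customer sequence per vehicle tour
    cost: float               # objective value of this routing plan

# A low-level heuristic maps one VRPC solution to another.
LowLevelHeuristic = Callable[[VRPCSolution], VRPCSolution]

@dataclass
class CuckooEgg:
    llh_levy: List[LowLevelHeuristic]  # sequence of low-level heuristics, applied in order
    sol_domain: VRPCSolution           # the domain (VRPC) solution they act on
```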

6.2 Host nest initialization

In the initialization stage, an initial solution, soldomain, is built using a constructive heuristic, which constructs the first current solution from scratch given a set of predefined rules. In this work, the constructive algorithm is executed as follows (a sketch is given after the steps below).

a. Step 1 Choose the unserved customer who has the highest priority.

  b. Step 2 Construct the tour by adding the nearest feasible customer to the tour.

  c. Step 3 When no more customers can be added to the tour, create a new tour and repeat the procedure from the previous steps.
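A possible rendering of these steps in Python is given below. It is a sketch under assumptions of ours (only the capacity check is shown as the feasibility test; the route-time and priority limits of Sect. 3 would be checked in the same place), not the authors’ exact constructive algorithm.

```python
def construct_initial_solution(customers, dist, demand, priority, Q):
    """Constructive heuristic: seed each tour with the highest-priority unserved
    customer, then greedily append the nearest feasible customer; when nothing
    more fits, open a new tour."""
    unserved = set(customers)
    tours = []
    while unserved:
        seed = max(unserved, key=lambda c: priority[c])      # Step 1: highest-priority customer
        tour, load = [seed], demand[seed]
        unserved.remove(seed)
        while True:                                          # Step 2: nearest feasible customer
            candidates = [c for c in unserved if load + demand[c] <= Q]
            if not candidates:
                break                                        # Step 3: open a new tour
            nxt = min(candidates, key=lambda c: dist[tour[-1]][c])
            tour.append(nxt)
            load += demand[nxt]
            unserved.remove(nxt)
        tours.append(tour)
    return tours
```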

6.3 Local search (intensification and diversification) and Lévy flight

To reduce the probability of their eggs being discovered, some cuckoo species have evolved the ability to conduct a kind of surveillance on nests that are likely hosts (Payne and Sorensen 2005). This work uses improvement and perturbation heuristics to help the cuckoo’s eggs imitate the pattern and shape of the host nest’s eggs so that they have a good chance of surviving. These heuristics are performed iteratively in two separate steps until a stopping condition is satisfied. A new solution is accepted based on the simulated annealing acceptance criterion (the Metropolis criterion); in this way, the optimization process is less likely to get stuck in a local optimum.
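For completeness, the Metropolis acceptance rule mentioned above can be sketched as follows; the cooling schedule (how the temperature evolves) is not specified in this section, so the temperature is left as a parameter here.

```python
import math
import random

def metropolis_accept(candidate_cost, current_cost, temperature):
    """Simulated-annealing (Metropolis) acceptance: always accept an improvement,
    and accept a worse candidate with probability exp(-delta / temperature)."""
    delta = candidate_cost - current_cost
    if delta <= 0:
        return True
    return random.random() < math.exp(-delta / temperature)
```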

In the improvement step, a permutation of all low-level heuristics in the set llh-imp, shown in Table 1, is applied sequentially to the current solution. Each of the low-level heuristics llh-impi is applied repeatedly before applying llh-impi+1 as long as llh-impi is able to improve the solution.
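The improvement step can be sketched as the following loop (our illustration; llh_imp is the ordered set of Table 1 and evaluate an assumed objective function): each heuristic is reapplied while it keeps improving the incumbent, then the next heuristic in the permutation takes over.

```python
def improvement_step(solution, llh_imp, evaluate):
    """Apply each low-level improvement heuristic in sequence, repeating it as
    long as it improves the solution before moving to the next heuristic."""
    best_cost = evaluate(solution)
    for heuristic in llh_imp:
        while True:
            candidate = heuristic(solution)
            cost = evaluate(candidate)
            if cost < best_cost:
                solution, best_cost = candidate, cost
            else:
                break
    return solution
```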

In the perturbation step, we simulate the moves conducted by cuckoos in the search space via Lévy flights. The cuckoo chooses a direction and step size from its current nest and searches for the best nest within a restricted range of its current nest according to the value of the Lévy flight, which depends on two factors: the remaining time for a cuckoo to lay its egg (i.e., the algorithm’s remaining execution time) and the performance enhancement of the solution. This step guarantees a certain level of diversification by selecting one of the following three strategies according to Eq. 6.

  • Strategy 1: From Table 2, choose a random permutation heuristic llh_lévyi and apply it to the candidate solution.

  • Strategy 2: From Table 2, randomly choose a permutation of the perturbation heuristics and apply it as long as the candidate solution in hand is improved. Hence, the improving low-level-heuristic is applied repeatedly as long as it is able to improve the solution.

  • Strategy 3: From Table 2, choose the perturbation heuristic that is known to have the best reward.

The strategy selection depends on the variable Lévy described in Eq. 6:

$$ Levy\left( {k,t} \right) = \left\{ {\begin{array}{*{20}l} {1, } \hfill &\quad {if\, k = 2 \,and\, t < \frac{{T_{l} }}{3}} \hfill \\ {2,} \hfill &\quad {if\, k = 2\, and\, t > \frac{{T_{l} }}{3}} \hfill \\ {3,} \hfill &\quad { if\, k \ge 3} \hfill \\ \end{array} } \right. $$
(6)

where Tl is the algorithm’s remaining execution time. K is calculated as per the equation below:

$$ K \, = Levy\left( {k,t} \right)^{ - 1} + \, m $$
(7)

The value of m depends on whether the performance improvement level falls below 30%, lies between 30 and 60%, or exceeds 60%.

$$ m = \left\{ {\begin{array}{*{20}l} {1,} \hfill & \quad{if\, \alpha *L < L + 0.3*L} \hfill \\ {2,} \hfill &\quad {if\, L + 0.3*L < \alpha *L < L + 0.6*L} \hfill \\ {3,} \hfill &\quad {if\, \alpha *L > L + 0.6*L} \hfill \\ \end{array} } \right. $$
(8)

L indicates the performance level and is calculated as follows:

$$ L = \left| {\frac{{Objective_{{Function\left( {Current\;Solution} \right)}} - Objective_{{Function\left( {Best\;Solution} \right)}} }}{{Objective_{{Function\left( {Current\;Solution} \right)}} }}} \right|. $$
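The pieces of Eqs. 6–8 can be wired together as in the sketch below (our transcription; k is kept as a caller-supplied value because Eq. 7 defines it in terms of Lévy(k, t) itself, and values of k below 2 are not covered by Eq. 6, so they default to the k = 2 branch here).

```python
def performance_level(current_cost, best_cost):
    """L: relative gap between the current and the best objective value."""
    return abs((current_cost - best_cost) / current_cost)

def improvement_band(alpha, L):
    """m in Eq. 8, transcribed literally: band 1, 2 or 3 depending on where
    alpha*L falls relative to L + 0.3*L and L + 0.6*L."""
    if alpha * L < 1.3 * L:
        return 1
    if alpha * L < 1.6 * L:
        return 2
    return 3

def select_strategy(k, t, T_l):
    """Levy(k, t) in Eq. 6: which perturbation strategy (1, 2 or 3) to apply."""
    if k >= 3:
        return 3
    return 1 if t < T_l / 3 else 2
```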

6.4 Termination criterion

The termination criterion, or stopping condition, is the condition that ends the search. In this work, the hyper-heuristic stops searching when the number of consecutive non-improved solutions reaches a defined threshold (related to the number of customers) or the execution time reaches a certain limit ϖ. The termination criterion is defined in the following equation:

$$ T\left( {\text{ms}} \right) = \hbox{min} \left\{ {\hbox{max} \left\{ {\varpi ,\;{\text{nb of customers}} \times 1000} \right\},\;{\text{nb of consecutive non-improved solutions}}} \right\}. $$
(9)

7 Experimental results

In this section, the results of the Cuckoo Search-based hyper-heuristic are presented and compared with those of the modified Clarke Wright algorithm. The algorithms are tested on real and synthetic data. A set of eighteen synthetic test cases was generated based on the methodology adopted by Tarhini et al. (2016); each test case has a different number of customers and vehicles. Further, real data were collected from a distribution company operating in Lebanon, which distributes products to more than 200 customers in several industries across the country. The results for the synthetic data are presented first, followed by the results for the real data.

7.1 Synthetic data

Each of the eighteen synthetic test cases is defined by four main components: the distance matrix, the route-time matrix, the customer demands, and the customer priorities. Table 3 summarizes the results of the modified Clarke Wright (CW) algorithm and the Cuckoo Search-based hyper-heuristic (CsHh) applied to the eighteen test cases TC1-TC18. The second column of Table 3 (Cn) gives the number of customers to be visited and the third column (Vn) the number of vehicles used. The remaining columns show the objective function value, the traversed distance, and the execution time in milliseconds for both the CW algorithm and the CS hyper-heuristic. The CW algorithm and the CsHh were both implemented in VB.Net. The tests were carried out on a PC with an Intel Core i3 CPU at 1.2 GHz, 4 GB of RAM, and Windows 10.

Table 3 Comparison results of the modified Clarke Wright and the cuckoo search based hyper-heuristic

In Table 3, for each instance, several quantities are shown and compared: the objective function value (best-found cost), the distance of the best route, and the time taken to find this best route. Figure 3 illustrates the results shown in Table 3, which show that the CS hyper-heuristic outperformed the CW algorithm in terms of the objective function value and route distance on all eighteen test cases.

Fig. 3 A comparison of the Cuckoo Search-based Hyper-heuristic results with the Clarke Wright algorithm results

Moreover, it is worth noting from Table 3 that the magnitude of the performance improvement of the CS over the CW algorithm increases as the problem size grows; this is illustrated in Fig. 4. One explanation for these results is that, for small problem sizes, the CS hyper-heuristic covers a set of diverse solutions in the narrow solution space similar to that reached by the CW and thus gives similar results. For large problem sizes, however, the diversification mechanism of the CS enables the hyper-heuristic to explore a much wider range of candidate solutions and therefore to find better solutions than the CW.

Fig. 4 The magnitude of the CS performance enhancement over the CW

Further, although the CS algorithm outperformed the CW on large problem sizes thanks to its more diversified exploration of the solution space, its execution time (21.5 s) was much longer than that of the CW (1.67 s), as shown in Fig. 5; this is still acceptable, since such solutions are produced off-line. In Fig. 5, the x-axis shows the test cases, the left y-axis the execution time, and the right y-axis the performance enhancement of the CS over the CW. One reason the CS hyper-heuristic has a higher computational time than the CW is that the operations performed in any hyper-heuristic are more complex than those in the CW’s iterations. In the CS hyper-heuristic, all operations, from constructing the initial solutions through the local-search improvement method to the perturbation methods, take more time than the CW’s saving operations; the CW therefore scores very high on simplicity and speed. This becomes especially noticeable as the problem size grows, where creating a diversified set of solutions requires more computational effort, whereas for small problems the effort is close to that of the CW.

Fig. 5 Enhancement percentage of the CS algorithm over the CW across execution times

In addition, we tuned the cuckoo search hyper-heuristic by using three different acceptance criteria. The first acceptance criterion is Naive Acceptance, which accepts nonimproving solutions with a probability of 0.5. The second is All Moves, which accepts the candidate solution regardless of its objective function value. The third is the simulated annealing (SA) criterion described in Sect. 6.3. As shown in Fig. 6, the CS hyper-heuristic performs best with the SA acceptance criterion because it helps the search avoid getting stuck in a local optimum.

Fig. 6 Cuckoo search-based hyper-heuristic tuned on three acceptance criteria

In addition, further tuning was applied to the termination criterion to measure its effect on the execution time and thus on the solution quality. Two termination criteria were compared. The first is the combined criterion of Eq. 9, in which the search stops when either the execution time reaches the limit ϖ or the number of consecutive non-improved solutions reaches a predefined threshold (related to the number of customers). The second is a time-only criterion based solely on the number of customers:

$$ T\left( {\text{ms}} \right) = \hbox{max} \left\{ {\varpi ,\;{\text{nb of customers}} \times 1000} \right\}. $$
(10)

Figure 7 shows the results of the CsHh for both termination criteria. The first criterion (Eq. 9) provided more efficient solutions. This is because Eq. 10 depends only on the number of customers: as the number of customers grows, the time limit grows with it, and the search keeps running even after a good solution has already been found.

Fig. 7 Cuckoo search-based hyper-heuristic execution time based on two different termination criteria

Finally, to test the solution stability of our algorithm, we conducted 5 runs of the algorithm for each test case. Figure 8a and b respectively show the results of the 5 runs on the first test case (TC-1), with a small number of customers (8 customers), and the last test case (TC-18), with a large number of customers (200 customers). The x-axis represents the execution time and the y-axis the value of the objective function. Each dotted colored line represents one run of the algorithm, and the solid red line represents the average of the 5 runs for the same test case. The results clearly show that the algorithm’s execution is stable: the trajectories of the 5 runs are evenly distributed around their average with minimal deviation.

Fig. 8 a TC1: Deviation trajectory of the five runs from the average for the CsHh algorithm. b TC18: Deviation trajectory of the five runs from the average for the CsHh algorithm

7.2 Case study: applying the algorithms on real data (Fueled application)

In this section, we compare the results of the CsHh and the CW on real data collected from a distribution company operating in Lebanon (IBC 2019). The company distributes products to more than 200 customers in several industries across Lebanon. We applied our solution, within the same time frame, to a scenario with 12 customers distributed over an area of 60 km², shown in Fig. 9. The distances between customers were obtained via the Google Maps API. The constraints set by our client (the distribution company) require each tour to be completed within a maximum of 4 h using only three vehicles. In addition, three customers (Ghobeiry, Chiyah, and Mansourieh) have a higher priority than the others and need to be served within the first hour. Further, the service times are roughly the same for all customers and do not exceed 10 min; it is therefore assumed that the service time is included in the travel time to each customer.

Fig. 9 Customers’ locations over an area of 60 km²

The solution generated by the Clarke Wright algorithm is detailed in Table 4 and shown in Fig. 10. The total distance needed to serve the 12 customers is 101.5 km and the maximum time needed for the three tours is within 240 min.

Table 4 Solution generated by the Clarke Wright algorithm for the distribution company’s real data
Fig. 10 The solution generated by the Clarke Wright algorithm

The solution generated by the Cuckoo search hyper-heuristic is shown in Fig. 11 and detailed in Table 5. The total distance needed to serve the 12 customers is 96.5 km and the maximum time needed for the three tours is within 179 min.

Fig. 11 The solution generated by the cuckoo search-based hyper-heuristic

Table 5 Solution generated by the cuckoo search-based hyper heuristic for the distribution company’s real data

It is clear from these results that the CsHh obtains a better-quality solution than the CW in terms of route distance and route time. Nevertheless, the CW scores very high on simplicity and speed.

8 Conclusion

In this paper, we presented our approach to minimizing the traveling distance of vehicles when moving cargo to prioritized customers. The problem under study (the VRPC) is a combinatorial optimization problem that seeks the optimal set of vehicle routes for delivering products to prioritized customers in the shortest possible time. We presented a modified version of the Clarke Wright algorithm and a Cuckoo Search-based hyper-heuristic for the VRPC in order to compare the performance of these competing methods.

The Clarke–Wright (CW) savings algorithm is one of the popular heuristics known to solve the VRP efficiently; it is very quick and simple to implement. However, in contexts where vehicle routes span long distances to cover a large number of customers, it is worthwhile to explore other methods that further reduce the distance covered and the time needed. The Cuckoo Search is one of the latest nature-inspired metaheuristic algorithms belonging to the swarm intelligence category, and the proposed Cuckoo Search-based hyper-heuristic combines low-level heuristics so that a domain-specific solution is guided towards the optimal or a near-optimal solution. One contribution of this work is modifying the classical VRP mathematical model to include prioritized customers. Another contribution is enhancing the CW algorithm to solve the VRP with prioritized customers. A third contribution is proposing and testing a hyper-heuristic (CsHh) that has not previously been used for solving the VRPC and comparing it with the modified CW.

The results indicate that our proposed CsHh outperformed the modified CW algorithm. The focus in the CsHh was on the intensification and diversification mechanism and on an acceptance criterion that helps the search escape local optima. A constructive method was used to boost the quality of the initial population, and the quality of the solution space was then improved in every generation using a local search driven by Lévy flights; this process clearly affected the final results, since it helped yield better solutions than the CW. An additional contribution is the practicality of the proposed method. We applied this solution to a distribution company (IBC 2019) operating in Lebanon that distributes products to more than 200 clients in several industries. The distribution company believes that our CsHh solution offers an attractive alternative to its commercial solver, since it is more flexible in handling the company’s constraints regarding customer priority, capacity, and the number of trucks, and its computation time is reasonable given that it significantly reduced the time and cost of the distribution process.