1 Introduction

Vehicle Routing Problem (VRP) is a classical combinatorial optimization problem, which occurs in many real-world problems, i.e., transportation, logistics, and supply chain management, et al. [1, 2]. With customers having more requirements for logistics, VRP with different constraints are constantly appearing, which include time window, simultaneous pickup and delivery and so on. Up to now, variants of the VRP can be basically categorized into two classes. One is single objective VRP, and the other is multi-objective VRPs.

The single objective VRP aims to determine a set of vehicle routes originating and terminating at a central depot and visiting all clients such that the total travel distance is minimized [3]. According to the different constraints, variants of the single objective VRP has been developed different categories, that one of the most typical is VRP with time windows. Recently, governments and business organization have vigorously promoted green manufacturing and logistics. Therefore, there is an urgent need solution to energy-saving VRPs. In order to develop an energy-saving VRP, many researchers have proposed reverse logistics [4], which entail collecting end-of-life products from customers for either reuse or proper disposal. Reverse logistics have two characteristics in distribution systems: one is delivery, and the other is pickup. The VRP with pickup and delivery (PDP) can be divided into six categories [5]:

  • Delivery first, pickup-second, or the vehicle routing problem with backhauls (VRPB);

  • Simultaneous pickups and deliveries, or the vehicle routing problem with simultaneous pickup and delivery (VRPSPD): clients demanding both delivery and pickup operations have to be visited once, as first proposed by Min;

  • Mixed deliveries and pickups, or the vehicle routing problem with mixed pickup and delivery (VRPMPD);

  • Inter-related pickups and deliveries, or the dial-a-ride problem;

  • The heterogeneous vehicle routing problem with simultaneous pickups and deliveries;

  • The vehicle routing problem with simultaneous pickups and deliveries and time windows (VRPSPDTW): The customers sometimes request that their goods be delivered and picked up within pre-defined time windows.

Due to the constraints and problem structure of the variant of the VRP, the optimization of one objective may lead to the deterioration of other objectives; thus, many researchers have proposed multi-objective VRPs [6]. Because the decision maker’s preference is not known a priori, multi-objective formulation is necessary for providing a set of solutions that represent the tradeoffs among the objectives with the VRPSDPTW, rather than a single solution [7]. The feature of multi-objective formulation is to consider all objectives with the same importance and obtain a set of Pareto optimal solutions. So far, algorithms to solve the multi-objective vehicle routing problem have mainly concerned the multi-objective vehicle routing problem with time windows and rarely more than three objectives. Many classical heuristic and meta-heuristic algorithms have recently been proposed to solve the multi-objective vehicle routing problem with time windows. These methods are mainly divided into two categories. One is based on the Pareto ranking technique, and the other is based on the decomposition method.

A Pareto ranking technique to solve this problem has been developed by many researchers. For example, genetic algorithm using Pareto ranking was developed that simultaneously minimizes the number of routes, the total travel distance, and the delivery time [6, 8]. Iqbal et al. [9] proposed an artificial bee colony (ABC) algorithm combined with a two-step constrained local search for neighborhood selection in a multi-objective vehicle routing problem with soft time windows.

The other method is based on decomposition [11], which decomposes a multiobjective optimization problem into a number of scalar optimization subproblems and optimizes them simultaneously. Each subproblem is optimized by only using information from its several neighboring subproblems, which makes MOEA/D have lower computational complexity at each generation. It is a popular multiobjective optimization method, which is applied to solve discrete [12] and continuous multi-objective optimization problems [11]. Qi et al. [13] proposed a decomposition based memetic algorithm for the multi-objective vehicle routing problem with time windows. They proposed a novel selection operator and three local search methods in their algorithm. However, no researcher before 2016 studied the MO-VRPSDPTW with five objectives problem. In 2016, the MO-VRPSDPTW with five objectives was proposed by Wang et al. [7]. They considered the constraints and problem structure of VRPSDPTW, which the optimization of one objective may lead to the deterioration of other objectives, thus considering the VRPSDPTW as a multi-objective optimization problem (MOP) is a necessary research direction [7]. All objectives of MO-VRPSDPTW were considered with the same importance. This could facilitate decision making without a priori knowledge.

More recently, a new metaheuristic algorithm chemical reaction optimization (CRO) was proposed by Lam and Li [14], imitating the interactions of molecules in a container by balancing local exploitation and global exploration to reach internal stability. CRO has been successfully employed to solve many mono-objective discrete optimization problems [15]. Recently, many researchers have studied continuous CRO to solve multiple objectives, such as an orthogonal multi-objective chemical reaction optimization approach for brushless DC motor design proposed by Duan and Gan [16]. Bechikh et al. [17] are the first to use the indicator-based chemical reaction method to tackle MOP. Li et al. [18] developed a decomposition-based multi-objective chemical reaction optimization to solve continuous multi-objective optimization problems. Chemical reaction optimization has not been used to solve the discrete combinational optimization problem of more than three objectives. Therefore, developing the character of chemical reaction optimization to solve the discrete multi-objective optimization problem is of great significance and motivates this paper.

In this paper, we developed a decomposition-based multi-objective chemical reaction optimization for the multi-objective vehicle routing problem with simultaneous delivery and pickup and time windows (MO-VRPSDPTW). In the process of algorithm design, we decompose the multi-objective optimization problems (MOPs) into a group of single objective optimization problems and use \(\varepsilon \)-dominance to update the archive. In this work, a chemical reaction optimization (CRO) algorithm based on decomposition is used to solve discrete MOPs (D-MOCRO). In our algorithm, the framework of decomposition is used; the approach transforms the original multi-objective problem into smaller sub-problems and solves each problem separately. Each sub-problem is optimized by combining the information of several neighboring sub-problems. Each molecule represents a sub-problem that achieves a state of energy balance by a chemical reaction (the optimal solution set). To balance diversity and convergence, we design the operators of on-wall ineffective collision and inter-molecular ineffective collision to carry on the local search, and the operators of decomposition and synthesis enhance global convergence. The proposed approach is compared with two different algorithms on hypervolume performance measures.

The rest of this paper is organized as follows. Section 2 presents the problem formulation of the MO-VRPSDPTW. Section 3 presents the relate work. Section 4 proposes the D-MOCRO for the MO-VRPSDPTW. Section 5 presents the experimental results, the parameter settings for the investigation, and the performance of the proposed algorithm. Finally, our conclusions and further work are discussed in Sect. 6.

2 Problem

2.1 MO-VRPSDPTW

The VRPSDPTW is a complex combination optimization problem that belongs to the class of NP-hard problems [2]. The VRPSDPTW plays an important role in computing theory and in many real-life applications. It entails finding a minimum-cost set of routes to deliver demand for customers with specific delivery time windows using a fleet of identical vehicles with limited capacity.

The objectives and constraints of the VRPTW can be mathematically defined using notations for the route, depot, customer, and vehicles as a theoretical graphical problem [7]. There is a non-directed complete graph \(\hbox {G} = (V, E)\), where the vertices \(V =\{0, \ldots , N\}\) correspond to the depot and the customers, and the edges \(e\in E\{(i,j):i,j\in V\}\) to the links between them. The vertex 0 is called the depot, and the others are called customers. The customers are served using a fleet of vehicles with capacity C. Every customer has a demand of goods \(g_{i }>\) 0 and a time window \([b_{i}, e_{i}]\). The service constraint of the time window indicates that a vehicle should arrive at the location of customer i before the latest service time \(e_{i}\). A vehicle is allowed to arrive before the earliest service time \(b_{i}\), in which case it must wait until \(b_{i}\) and then begins to serve the customer. Each customeri has a service time \(s_{i}\), which is the actual time that the delivery takes once a vehicle arrives at the customer’s location. In addition to the customer, the depot also has a time window [0, \(e_{0}\)]. The demand of the depot is \(g_{0}\) = 0. The travel distance and travel time between two locations i and j are denoted by \(d_{ij }\) and \(t_{ij}\). The goal of VRPSDPTW is to design a set of M routes \(R = \{r_{1}, {\ldots }, r_{M}\}\) with the lowest cost. A feasible solution to the VRPSDPTW must satisfy that each vehicle departs and returns to the same depot and each customer is served by exactly one vehicle. Among R, \(r_1 =\langle {\hbox {c}( {\hbox {1},j} ),\ldots ,\hbox {c}( {N_j ,j} )} \rangle \) represents the sequence of \(N_j \) customers served in the jth route, where c(ij) the ith customers to be visited in this route indicates. \(c(0,j)=c(N_j +1,j)=0\) represent the depot. As shown in Fig. 1, there are \(M = 3\) routes \(R=\left\{ {r_1 , \ldots ,r_M } \right\} \). \(r_1 =\left\langle {0,c(2,1),c(3,1),c(7,1),c(8,1),0} \right\rangle \) represents the sequence of four customers (customers 2, 3, 7 and 8) served in route1. This process includes three routes that have four, three and two customers served, respectively.

Fig. 1
figure 1

Illustration of the molecular representation

With customers’ demand for service quality growing higher and higher, a multi-objective vehicle routing problem must be designed to meet the expected benefits of customer demand. Before describing the multi-objective vehicle routing problem, several terms are described.

The total travel distance of the jth route is defined as

$$\begin{aligned} \hbox {Dist}_j =\sum _{i=0}^{N_j } {d_{c(i,j)c(i+1,j)} } \end{aligned}$$
(1)

The arrival time of vehicle j at ith vertex is defined as follows. Among of them, the arrival time and the leaving time are represented by symbol \(a_c (i,j)\) and \(l_c (i,j)\) respectively.

$$\begin{aligned} a_c (i,j)=l_c (i-1,j)+t_{c(i-1,j)c(i,j)} \end{aligned}$$
(2)

where \(l_c (0,j)\), which represents that vehicle j departs from the depot at time 0. If a vehicle arrived at the customer earlier than the earliest service time of customer requirements, it must wait until the earliest service time. As shown in Fig. 1 TW1, it must wait 20 min to arrive. Thus, a waiting time is caused by arriving early. The waiting time of vehicle j at ith vertex can be described as

$$\begin{aligned} w_{c(i,j)} =\left\{ {\begin{array}{ll} 0,&{}\quad \hbox { if }a_{c(i,j)} \ge b_{c(i,j)} \\ b_{c(i,j)} -a_{c(i,j)} ,&{}\quad \hbox {otherwise} \\ \end{array}} \right. \end{aligned}$$
(3)

And the leaving time of ith customer of vehiclej is

$$\begin{aligned} l_c (i,j)=a_c (i,j)+w_c (i,j)+s_c (i,j) \end{aligned}$$
(4)

Thus, the total travel time of route \(r_j \) is

$$\begin{aligned} T_j =\sum _{i=0}^{N_j } {(t_{c(i,j)c(i+1,j)} +w_{c(i+1,j)} +s_{c(i+1,j)} )} \end{aligned}$$
(5)

where \(s_{c(i+1,j)} \) denotes the service time of (\(i+1\))th customer in jth route, and the total waiting time of this route is

$$\begin{aligned} W_j =\sum _{i=1}^{N_j } {w_{c(i,j)} } \end{aligned}$$
(6)

The delay time of vehicle j at ith vertex is

$$\begin{aligned} delay_{c(i,j)} =\left\{ {\begin{array}{ll} 0,&{}\quad \hbox { if }a_{c(i,j)} \le e_{c(i,j)} \\ ac(i,j)-ec(i,j),&{}\quad \hbox {otherwise} \\ \end{array}} \right. \nonumber \\ \end{aligned}$$
(7)

and the total delay time of this route is

$$\begin{aligned} \hbox {Delay}_j =\sum _{i=1}^{N_j } {delay_{c(i,j)} } \end{aligned}$$
(8)

Based on the above description, MO-VRPTW can be defined as follows.

$$\begin{aligned} \left\{ {\begin{array}{l} \min \ F(x)=(f_1 ,f_2 ,f_3 ,f_4 ,f_5 ) \\ f_1 =\left| R \right| =M \\ f_2 =\sum \limits _{j=1}^M {\hbox {Dist}_j } \\ f_3 =\mathop {\max }\limits _j \{T_j |j=1\ldots M\} \\ f_4 =\sum \limits _{j=1}^M {W_j } \\ f_5 =\sum \limits _{j=1}^M {\hbox {Delay}_j } \\ \end{array}} \right. \end{aligned}$$
(9)

\(f_{1}\) and \(f_{2}\) are transportation costs, where \(f_{1}\) indicates the number of vehicle, the objective of \(f_{1}\) is reducing the fixed cost of buying (or hiring) and repairing vehicles. \(f_{2}\) is the total travel distance, which indicates the variable costs in routing. \(f_{3 }\) denotes the makespan, i.e., travel time of the longest route (from/to depot). The objective of \(f_{4}\) improves work efficiency and avoids wasting time. The last one is \(f_{5}\), it can be considered as a service cost related to the satisfaction of customers.

The constraints of MO-VRPTW can be defined as follows.

$$\begin{aligned}&\sum _{i=1}^{N_j } {g_{c(i,j)} \le C} \quad \forall j=1,\ldots ,M. \end{aligned}$$
(10)
$$\begin{aligned}&delay_{c(i,j)} \le md \quad \forall i=1,\ldots ,N_j ,\quad \forall j=1,\ldots ,M. \end{aligned}$$
(11)
$$\begin{aligned}&a_{c(N_j +1,j)} \le e_{c(N_j +1,j)} \quad \forall j=1,\ldots ,M. \end{aligned}$$
(12)

Vehicle capacity constraint is represented by formula 10, which represents that the total demand of each route \(r_{j}\) should not exceed the vehicle capacity. The travel time constraint is represented by formula 11, which represents that the delay time should not exceed the maximum allowed delay time. The last constraint is the return time constraint that represents vehicles should return to the depot before the closing time. MO-VRPSDPTW test benchmarks that we can visit the homepage of J. Wang.

3 Relate work

3.1 Multiobjective optimization

Multiobjective optimization is different from the single objective optimization which is to find a set of Pareto solutions rather than a single solution. In multiobjective optimization, the goal is to find the best possible trade off in different objectives, in which one objective can be improved only at the expense of worsening another. To describe the concept of optimality for problem (13) the following definitions are provided [11]. A general multiobjective optimization problem (MOP) is defined as simultaneously optimizing (minimizing or maximizing)

(13)

where \(\Omega \subset {\mathbb {R}}^{n}\) is the decision space and \(x=(x_1 ,x_2 ,\ldots ,x_n )\in \Omega \) is a decision variable which represents a solution to the target MOP. \(F(x): \Omega \in {\mathbb {R}}^{m}\) denotes the m-dimensional objective vector of the solution x. The attainable objective set is defined as the set \(\left\{ {F(x)|x\in \Omega } \right\} \).

Definition 1

(Dominance Relation) In the minimization problem, a solution vector \(u=(u_1 ,u_2 ,\ldots ,u_k )\) is said to dominate another vector \(v=(v_1 ,v_2 ,\ldots ,v_k )\) (denoted by \(u\prec v\)) if u is partially less than v, i.e., \(\forall i\in \{1,\ldots ,k, \ u_i \le v_i\} \hbox { and }\exists \ i\in \{1,\ldots ,k: \ u_i <v_i \}\).

Definition 2

(\(\varepsilon \) Dominated Solution) There are two solution vectors \(u,v\in {\mathbb {R}}^{k}\) and \(\varepsilon >0\), if \(u \ \varepsilon \ \hbox {dominates} \ v\), it is said to be \(\varepsilon \) dominating, i.e., \(\forall i\in 1,\ldots ,k, (1-\varepsilon )\cdot u_i \le v_i .\)

Definition 3

(Pareto Optimum) If there does not exist any other vector \(\mathop x\limits ^\wedge =(\mathop {x_1 }\limits ^\wedge ,\mathop {x_2 }\limits ^\wedge ,\ldots ,\mathop {x_n }\limits ^\wedge )\in \Omega \) such that \(F( {\mathop x\limits ^\wedge } )\prec F(x)\) , a decision variable vector \(x=(x_1 ,x_2 ,\ldots ,x_n )\) from some universe \(\Omega \) that is said to be Pareto Optimum.

Definition 4

(Pareto Set) The Pareto Optimal set \(\dot{P} \) is defined as \(\dot{P} =\{x\in \Omega |x \ \hbox {is Pareto Optimum}\}\).

Definition 5

(Pareto Optimal Set) The Pareto Optimal front PF is defined as \(PF=\left\{ {F(x)\in {\mathbb {R}}^{k}|x\in \dot{P} }\right\} \)

With the above definitions, an MOP requires finding or approximating the set of Pareto optimal solutions and their corresponding objective vectors. The evolutionary algorithm for solving the MOP has three issues that need to be emphasized, i.e., fitness assignment, diversity preservation and elitism strategy. A single-objective optimization problem searches only for an optimal solution; however, the MOP needs to assign the fitness of a solution to multiple criteria. Diversity preservation is an important issue for MOEAs to obtain solutions that are uniformly distributed on the Pareto front for decision makers to change. An elitism strategy is implemented in MOEAs to keep the non-dominated solutions in the population during the search.

Recently, a chemical reaction optimization was proposed to solve many mono-objective optimization problems efficiently, which inspired by chemical reactions launched during collisions, inherits several features from other metaheuristics such as simulated annealing and particle swarm optimization. It has not solved the discrete multi-objective optimization problems so far. Then, we further develop its performance to optimize the discrete MO-VRPSDPTW. We develop a D-MOCRO to optimize the MO-VRPSDPTW, which divides the complex MO-VRPSDPTW into a number of sub-problems and solves them in a chemical reaction optimization collaborative manner. The energy conservation in the process of chemical reaction balances the local (exploitation) and the global (exploration). The chemical reaction optimization algorithm is described as follows.

3.2 Chemical reaction optimization

The chemical reaction optimization algorithm is used to solve many optimization problems involving single-objectives and multiple objectives in continuous and discrete combinatorial optimization problem.

The chemical reaction optimization algorithm includes a set of molecules put in a container with certain energy. Every molecular has a unique structure that includes kinetic energy, potential energy and the number of collision. Another important feature of this algorithm is the law of conservation. In the chemical reaction process, the PE orientates towards the minimum at the balanced state. In the reaction process, the potential energy goes towards the minimal state, similar to the objective function in optimization problems. PE is usually used as the fitness of the objective function.

The chemical reaction attributes of the manipulated molecules (M) in CRO include the following:

  • w is the structure of the molecule which is used to represent a potential solution.

  • PE (potential energy) is defined as the objective function value of the corresponding solution that is represented by w.

  • KE (kinetic energy) is a non-negative number quantifying the tolerance of the system to accept a worse solution than the existing one.

  • NumHit is the number of hits which is a record of the total number of collisions that a molecule has taken.

  • MinPE is the minimum potential energy which represents the minimum value of the function of the current state.

  • MinHit is the minimum hit number which represents the minimum number of hits when the molecule attains the value of the function of the current state.

  • Buffer is the total energy of the container.

The basic chemical reaction algorithm has four types of reaction operators, including on-wall ineffective collision, decomposition, inter-molecular ineffective collision and synthesis. These four operators are the same as the paper [14].

  1. (1)

    On-wall ineffective collision operator of CRO [14]

An on-wall ineffective collision reaction occurs when a molecule hits the wall of the container and then bounces back. After the on-wall collision, the structure of the molecules will change. If the current molecular structure is w, it will turn into another state \(w^{\prime }\). The reaction process is defined by Eq. (2).

$$\begin{aligned} w(i) \quad \rightarrow \ w(i)' \end{aligned}$$
(14)

where, \(w(i) ' = Neighbor (w(i))\).

As a rule of thumb, when a molecule hits a wall, a portion of its KE will be lost, the lost energy is stored in the central energy buffer, when the reaction has completed. Its KE is updated as follows:

$$\begin{aligned} KE_{w^{{\prime }}} \ = \ \left( {PE_w \ - \ PE_{w^{{\prime }}} \ + \ KE_{w^{{\prime }}} } \right) \ \times \ \alpha \end{aligned}$$
(15)

where \(\alpha \) is a random number that lies in between [KELossRate,1], where KELossRate is a parameter of CRO.

If the KE processed by the molecule is high, then there is a possibility that the PE could increase, depicting a worse solution. This change is desirable to make the algorithm escape from its local minimum. The KE drops as an effect of collision. Its level of tolerance of obtaining a worse solution is lowered.

  1. (2)

    Decomposition operator of CRO [14]

A decomposition operator means that a molecule w hits the wall of the container and then breaks into two or more molecules. Compared with the ineffective collision, the decomposition is more vigorous and the resultant molecule structures have greater differences from that of the original one. This operator can be considered as the situation when we finish the local search from w to \(w_{1}^{\prime }\) and \(w_{2}^{\prime }\). Due to the conservation of energy, w may sometimes not have enough energy to sustain its transformation into \(w_{1}^{\prime }\) and \(w_{2}^{\prime }\). A certain portion of energy in the buffer accumulated from on-wall ineffective collisions can be utilized to support the change. The specific implementation process of the decomposition operator is described in Sect. 4.2.

  1. (3)

    Inter-molecular ineffective collision operator of CRO [14]

An intermolecular ineffective collision occurs when two or more molecules collide with each other and then separate. The number of molecules involved in this collision remains unchanged after the collision. If more molecules are involved in the reaction, more energy is needed, and the structure of the molecule is also more flexible. In the original implementation in the simulation, we assume only two molecules, e.g., with molecular structures \(w_{1}\) and \(w_{2}\), involved. Similar to the on-wall ineffective collision, this collision is also not vigorous and the new molecular structures \(w_{1}^{\prime }\) and \(w_{2}^{\prime }\) are produced from their own neighborhoods separately.

  1. (4)

    Synthesis operator of CRO [14]

Synthesis does the opposite of decomposition. A synthesis refers to when two or more molecules collide and then combine to form one new single molecule. This process implies that the search regions are expanded, i.e., the diversification of solutions. The operator of synthesis uses crossover of a genetic algorithm. The specific implementation process of the synthesis operator is described in Sect. 4.2.

3.2.1 Energy handling [14]

Energy can be transformed from one type to another but all energy manipulations must follow the first law of thermodynamics, which states the energy can neither be created nor destroyed. A generalized form of the elementary reaction can be written as follows:

$$\begin{aligned} w_1 + \cdots +w_k \rightarrow w_1^{\prime } + \cdots + w_l^{\prime } \end{aligned}$$

where k and l are the numbers of molecules involved before and after the reaction, respectively. For k =1 and l = 2, the reaction can be said to be a decomposition reaction.

The corresponding energy equation can be written as follows:

$$\begin{aligned}&(PE_{w_1 } +\cdots +PE_{w_k } )+(KE_{w_1 } +\cdots +KE_{w_k } )\\&\qquad +\,buffer\\&\quad =(PE_{w_1^{\prime } } +\cdots +PE_{w_l^{\prime } } )+(KE_{w_1^{\prime } } +\cdots +KE_{w_l^{\prime } } )\\&\qquad +\,buffer^{{\prime }} \end{aligned}$$

In the equation above, the change in the total energy of the molecules before and after the reaction is represented by the left and the right hand sides of the equality respectively.

The general acceptance rule for the new solution is given as follows:

$$\begin{aligned} \sum _{i=1}^k {PE_{w_i } } +\sum _{i=1}^k {KE_{w_i } } -\sum _{i=1}^l {PE_{w_i } } \ge 0 \end{aligned}$$

3.2.2 The sketch of CRO algorithm

As explained above, the step-wise procedure for the implementation of CRO can be summarized as follows.

figure a

CRO has good searching ability and shows excellent operation in two important features of optimization metaheuristics, i.e., controlling the convergence and diversity. It can control the convergence by benefiting from the crossover of the GA. The diversity of chemical reaction uses the operators of on-wall ineffective collision. The conservation of energy of this algorithm can balance the diversity and convergence of solutions.

4 Proposed algorithms for MO-VRPSDPTW

4.1 Solution representation for MO-VRPSDPTW

In the proposed algorithm, the solution representation for the MO-VRPSDPTW is based on variable-length solution representation [7]. The solution of the proposed algorithm is denoted by the chemical molecular, which is shown in Fig. 1 (The molecular). The solution includes several routes, and each route has a sequence of customers to be served. It is worth noting that each route was designed with a vertex 0 as the beginning and ending of the route.

4.2 D-MOCRO for MO-VRPSDPTW

The decomposition method chosen to solve the MO-VRPSDPTW is defined by (9). MOEA/D has achieved good results in solving multi-objective optimization problems. In our algorithm, we use the MOEA/D framework. In [19], Ishibuchi et al. successfully used the decomposition method to solve the combinatorial optimization problem. He analyzed the performance of various methods of decomposition, such as weighted sum, weighted Tchebycheff and penalty-based boundary intersection function. It showed that the weighed sum has high diversification ability. Therefore, the weighted sum method is used to solve the MO-VRPSDPTW, in this paper. The decomposition method is does not directly solving the multiple objectives; rather, it uses decomposition to create a series of quantitative sub-problems, with a parallel chemical reaction optimization algorithm to solve these sub-problems. In this method, each sub-problem is a neighborhood. Each sub problem is optimized by the information of its neighborhood sub problem. In each generation of reaction, the group is made up of the best solution found by each sub-problem. D-MOCRO uses the framework of MOEA/D. It decomposes the target MO-VRPSDPTW into a number of scalar sub-problems using the weighted sum approach.

D-MOCRO uses a specially designed local search methods to improve the quality of the solution to each scalar sub-problems periodically. It works as follows:

  1. (1)

    Initialization Procedure

In the initialization of the MO-VRPSDPTW, a solution is randomly generated as follows. First, a customer should be selected and a route should be created. Another customer is randomly selected to insert after the first customer in the route. Other customers are inserted until another cannot to be inserted, at which point a new route should be created. This procedure is repeated until all customers are inserted, and then one molecular with MO-VRPSDPTW characteristics is built. The five objectives of the above molecular are computed according to Formula (9). This is followed by the initialization weight vector, the calculation of the Euclidean distance and the setting of the neighborhood. In addition, the \(\varepsilon \)-dominance archive is adopted in the proposed algorithm.

D-MOCRO decomposes the MO-VRPSDPTW into a scalar number of single objective sub-problems. The procedure of D-MOCRO, whose pseudo-code is given in Algorithm 6, includes three main points: the initialization of population (molecules) P, the generation of weight vectors and the assignment of neighborhood. To be specific, the initial population Pis randomly sampled from feasible space via a uniform distribution. A set of uniformly distributed weight vectors \(W = \{w^{1}, \ldots , w^{N}\}\) are generated using Das and Dennis’ method [20] where weight vectors are sampled from a unit simplex. Given H and M, a total number of \(N=\left( {\begin{array}{l} H+M-1 \\ \ M-1 \\ \end{array}} \right) \) uniformly distributed points, with a uniform spacing \(\delta =1/H\), where \(H>0\) is the number of divisions considered along each objective coordinate, which sampled on the simplex for any number of objectives, and mapping the points on the search space. An example is shown in Fig. 2 to illustrate Das and Dennis’ weight vector generation method.

The MO-VRPSDPTW can be simulated based on a vector that is represented by the molecular of CRO. The neighborhood search operator is applied in this algorithm, which is based on the vector used in the evolutionary algorithm. An old solution can be found by neighbor search. Each sub-problem (molecular) has its own population and sees computational effort at each generation. D-MOCRO follows the framework of MOEA/D. It decomposes the MO-VRPSDPTW defined by formula (9) into Q single objective optimization sub-problems. Each sub-problem has a niche, which maintains could keep diversity management in the objective space.

Fig. 2
figure 2

Structured weight vector \(\mathbf{w }= (w1, w2, w3)^{T}\) generation process with \(\delta = 0. 25\), i.e., \(H = 4\) in 3-D space. \(\left( {\begin{array}{c} 4+3-1 \\ \ 3-1 \\ \end{array}} \right) = 15\) weight vectors are sampled from a unit simplex

In the classical multi-objective decomposition method, the most popular approached are weighted sum, weighted Tchebycheff and boundary intersection. In this paper, we use the weighted sum approach [11]. The weighted sum method is described for m-objective problems using a weight vector \(\Lambda ^{i}=(\lambda _{i1}, \lambda _{i2}, \ldots ,\lambda _{im})\)

$$\begin{aligned} \min g^{ws}(x^{i}|\Lambda ^{i})= & {} \lambda _{i1} \cdot f_1 (x^{i})+\lambda _{i2} f_2 (x^{i})+\cdots \nonumber \\&+\,\lambda _{im} f_m (x^{i}) \end{aligned}$$
(16)

where m equals five in this paper \(\Lambda ^{i}=(\lambda _{i1} ,\lambda _{i2} ,\ldots ,\lambda _{im} )\) corresponds to a sub-problem \( i(i=1,2,{\ldots }, Q)\), \(\lambda _1 +\lambda _2 +\cdots +\lambda _m =1\), and \(x^{i} \) is a solution to be optimized.

Fig. 3
figure 3

On-wall ineffective collision operator of CRO

Many practical optimization problems have a Pareto front with an irregular shape. To approximate the Pareto optimal solutions of a multi-objective optimization problem, Zhang and Li [11] recently developed a novel multi-objective evolutionary algorithm based on decomposition (MOEA/D). It can work well if the curve shape of the Pareto-optimal front is friendly. However, it shows poor performance to the irregular shape of the Pareto-optimal front. For this problem, Liu et al. [21] proposed an improved MOEA/D algorithm (denoted as TMOEA/D) that utilizes a monotonic increasing function to transform each individual objective function into one so that the curve shape of the non-dominant solutions of the transformed multi-objective problem is close to the hyper-plane whose intercept of coordinate axes is equal to one in the original objective function space. We consider the MO-VRPSDPTW as a discrete optimization problem, so T-MOEA/D [21] was applied in this work.

  1. (2)

    Reproduction Procedure

The on-wall ineffective collision operator randomly selects a customer from a selected route and then re-inserts it into a good position. This operator is shown in Fig. 3, which the pseudo code described in Algorithm 2. This process includes two points, one is selecting a route, and the other is defining the good position to insert the customer. Selecting a route needs to rely on the longest travel time for \(f_{3}\) from formula 9. A good position includes different objectives: the position that has the lowest total distance, the lowest travel time among all routes, the lowest total waiting time and the lowest total delay time from \(f_{2}\), \(f_{3}\), \(f_{4}\) and \(f_{5}\) separately. In short, the best position is the position that achieves the lowest cost.

figure b

The decomposition operator of CRO has a high KE, which is active and similar to a global search. The specific operation is that it selects two routes and then randomly selects a number of customers from the selected routes. This operator is shown in Fig. 4, with the pseudo code described in Algorithm 3.

Fig. 4
figure 4

Decomposition operator of CRO

figure c

The inter-molecular ineffective collision operator of CRO will lead to changing the structure of the molecule. The inter-molecular ineffective collision operator of D-MOCRO exchanges a sequence of customers between routes while preserving the orientation of the sequences or routes. This operator is shown in Fig. 5, with the pseudo code described in Algorithm 4.

Fig. 5
figure 5

Inter-molecular ineffective collision operator of CRO

figure d

The synthesis operator of D-MOCRO generates new solutions described as follows. First, a random number of routes are selected from the first molecule, and copied into the offspring. Then, all the routes from the second molecule, that are not in conflict with customers already copied from the first parent are copied into the offspring.

figure e

Thus, the crossover operator makes offspring inherit routes from parents. Once inherited routes are chosen, they can be regarded as seed routes. The existing routes should include all customers. If one customer cannot be inserted properly, the customer will establish a new route. This operator shown in Fig. 6, with the pseudo code described in Algorithm 5.

Fig. 6
figure 6

Synthesis operator of CRO

Fig. 7
figure 7

Flow chart showing the working of D-MOCRO algorithm

For the optimization algorithm, a good balance between exploration and exploitation is necessary. Exploration refers to ability of the algorithm to “explore” or search different regions of feasible search space, whereas “exploitation” refers to the ability or all the individuals to converge to the near optimal solutions as quickly as possible. Too much stress on exploration will result in a purely random search, whereas too much stress on exploitation will result in a purely local search. In this paper, the chemical reaction optimization operator of inter-molecular ineffective collision and synthesis serve as vehicles of exploration, where the chemical reaction optimization operators of decomposition and on-wall ineffective collision serve as vehicles of exploitation (Fig. 7).

In this section, D-MOCRO uses the framework of MOEA/D to optimize the MO-VRPSDPTW. To maintain the diversity of solutions, D-MOCRO uses an external population to store non-dominated solutions, with multiple single objective sub-problems optimized in parallel. For each sub-problem, a chemical reaction optimization search is applied to improve the current solution. To maintain diversity in searching different parts of the POF, D-MOCRO applies an adaptive weight [10]. The main features of D-MOCRO include (1) using chemical reaction optimization to optimize the MO-VRPSDPTW, (2) using the adaptive weight vectors, (3) using \(\varepsilon \)-dominance to update the feasible solution and (4) using elite population to contain nondominant Pareto feasible solution sets by using an efficient non-dominated sort on sequential search strategy [22]. The procedure of D-MOCRO is given in Algorithm 6.

5 Experimental results and discussions

In this section, 45 real-world benchmark instances are used to test the performance of the proposed D-MOCRO. The MOMA [7] and MOLS [7] algorithms are compared with our proposed algorithm.

5.1 Benchmark functions

See Table 1.

5.2 Performance metrics

The performance of a MOEA is evaluated from two aspects. First, the obtained non-dominated set should be as close to the true Pareto front as possible to achieve good convergence. Second, the solutions in the obtained non-dominated set should be distributed as diversely and uniformly as possible. In this paper, the Hypervolume metric is used.

Hypervolume (\(I_{H})\) is used to indicate the area in the objective space that is dominated by at least one solution of the nondominated set. This metric was suggested by Zitzler et al. [23] to indicate the area in the objective space that is dominated by at least one solution of the nondominated set. In practice, \(I_{H }\) is calculated as follows:

$$\begin{aligned} I_H (A)=\mathop {\int {\ldots \int {1\cdot dz} } }\limits _{z\in \cup _{x\in A} HV(f(x),f^{*})} \end{aligned}$$
(17)

where \(HV(f(x),f^{*})=[f_1 (x),f_1^*]\times \ldots \times [f_m (x),f_m^*]\) is the Cartesian product of the closed intervals \([f_i (x),f_i^*],i=1,\ldots ,m\), the \(f^*=(f_1^*,\ldots ,f_m^*)\) is reference point and the set A is nondominated set. An example with two objectives is shown in Fig. 8, where the solution a, b, c and d are objective vectors. The area of shadow represents the hypervolume of nondominated set. From Fig. 8 shows that the large \(I_{H }\)represents the nondominated close towards the Pareto front. Therefore, the \(I_{H}\) metric can also measure the convergence and diversity of an algorithm.

figure f
Table 1 Details of 45 real-world instances and stopping time (in seconds) for each instance

5.3 Parameter settings

In our experimental study, the \(MAX\_FES =50{,}000\), the number of the subproblems \(N=495\), the neighborhood list size \(T=5\). Parameters of D-MOCRO are set as follows. The initKE equals to 10,000, the initBuffer equals to 100, the decThres equals to 800, the synThres equals to 15, the lossRate equals to 0.1, the collRate equals to 0.2. The initKE controls the molecular kinetic energy. The initBuffer controls the container total energy which plays a keeping energy conservation role. The decThres is a threshold value of carrying on decomposition operator. The synThres is a threshold value of carrying on synthetic operator. The collRate is a threshold value of carrying on whether the molecules collide with the container wall or collide with each other. In this paper, CRO with collRate is also used in balancing exploration and exploitation processes of the proposed CRO algorithm. Population size is set to \(N= 500\) for all the benchmarks. All the compared algorithms stop when the number of function evaluation reaches the maximum number. The maximum function evaluation number is set to 30,000 for the benchmarks. The MOLS [7] and MOMA follow the implementation of J. Wang [7].

Fig. 8
figure 8

Hypervolume of nondominated solutions

Table 2 The HV metric of comparative algorithms
Table 3 Box plots of the three algorithms on HV Metric of D-MOCRO, MOLS and MOMA

For all experiments, 20 independent runs are carried out on the same machine with a Celoron 3.40 GHz CPU, 4 GB memory, and windows 7 operating system, and conducted with the maximum number of function evaluations (MAX_FES) as the termination criterion. The proposed D-MOCRO and comparison algorithms were implemented in Microsoft Visual Studio 2013 (C++).

5.4 Experimental results and discussions

5.4.1 Comparison of the experimental results on real world instances

Recently, a many-objective VRPSDPTW was proposed. Most previous algorithms consider only the VRPSDPTW/VRPTW with two objectives. In this paper, we develop a multi-objective chemical reaction optimization to solve this problem. To balance diversity and convergence, we design local search operators and operators of global convergence. The results of the experiment are shown below.

Fig. 9
figure 9

Plots of nondominated solutions obtained by D-MOCRO (denoted as dot), MOLS (denoted as plus) and MOMA (denoted as asterisk) on the instance 50-0-1, respectively

The average values of the metrics over 30 independent runs of both algorithms on the test sets were calculated and are shown in Tables 2 and 3. Due to space limitations, the standard deviations of the metrics are not presented in the tables.

To test the performance of the D-MOCRO algorithm with other algorithms on different MO-VRPSDPTW benchmarks, we use the Wilcoxon signed-rank test [24] at a significance level of 0.05. The results of the benchmarks are summarized as \({\dagger }/{\S }/{\approx }\), which indicate whether the performance of D-MOCRO is better than, worse than or similar to others. From Tables 2, 3 and Fig. 9, several observations can be made.

  1. (1)

    For 45 real-world instances, D-MOCRO significantly outperforms MOLS and MOMA in most instances. Specifically, D-MOCRO outperforms MOLS in 23 instances and MOLS is outperformed by MOLS in 22 instances in terms of HV. According to the attribute of HV, the results in Table 2 show that D-MOCRO can obtain better convergence and diversity than MOLS and MOMA.

  2. (2)

    To demonstrate the convergence and diversity of the comparative algorithms, we plot the non-dominated solutions obtained by D-MOCRO (denoted as. in blue), MOLS and MOMA on the instance 50-0-1, on \(f_{1}-f_{2}\), \(f_{1}-f_{3}\) and \(f_{2}-f_{3}\) planes in Fig. 9. In Fig. 9, we can see that most objective vectors generated by D-MOCRO are better than those generated by MOMA and MOLS.

  3. (3)

    To further show the convergence and diversity properties of the three algorithms, the box-plots of three algorithms are shown in Table 3. The HV metric is used in this paper. From Table 3, we found instances of 50-0-3, 50-0-4, 50-1-1, 50-1-2, 50-1-3, 50-1-4, 50-2-0, 50-2-1, 150-0-0, 150-1-1, 150-1-3, 150-2-0, 150-2-1, 150-2-3, 150-2-4, 250-0-0, 250-0-1, 250-1-0, 250-1-4, 250-2-2, 250-2-3 and 250-2-4, where D-MOCRO was better than other two algorithms. D-MOCRO can search for more solutions using an efficient local search, especially in the instance 50-2-0.

5.4.2 Discussions

In this paper, a chemical reaction optimization algorithm that solves the MO-VRPSDPTW with five objectives has been proposed. The main idea is to use a decomposition-based chemical reaction optimization to solve the MO-VRPSDPTW. When the number of objectives is five, we use an archive to save solutions. We use an efficient non-dominated sort on sequential searches [22] to update the archive of D-MOCRO.

In D-MOCRO, a chemical reaction optimization algorithm is successful developed to solve discrete many-objective combinational optimization problem, for the first time. D-MOCRO has good convergence, benefitting from the operator of synthesis. The other operators are used to search locally for neighbors of molecules and can obtain a good diversity.

Finally, it can be concluded that D-MOCRO is an effective mechanism to solve real-world MO-VRPSDPTW. Extensive experiments have shown the effectiveness of the proposed algorithm. In order to further to study the performance of D-MOCRO, the proposed algorithm can also be extended to solve other multiobjective VRPs in reverse logistics. Although D-MOCRO was shown to be very promising for real-world MO-VRPSDPTW, however, it has its limitation. The source of the major limitation is the sensitivity to the parameter. In order to enhance its performance, adaptive control of parameter will be developed to make its performance more efficient.

6 Conclusion and future work

In this paper, a discrete multiobjective CRO has been developed to solve multiobjective vehicle routing problem with five objectives, for the first time. In our work, a D-MOCRO is proposed which is based on the method of decomposition. It converts the complex MO-VRPSDPTW into a number of single-objective subproblems and optimizes them in CRO simultaneously. CRO has four operators to search, which includes one molecular operation and two molecular operations. The one molecular operation includes an on-wall ineffective collision operator and decomposition operator, and the two molecular operations include a synthesis operator and inter-molecular ineffective collision operator. The on-wall ineffective collision operator and inter-molecular ineffective collision operator execute a local search. The decomposition operator and synthesis operator executes global search. The energy conservation in the process of chemical reaction balances the local (exploitation) and the global (exploration). In addition, D-MOCRO uses an archive to save non-dominate solutions. The archive updating sues the efficient non-dominated sort on sequential searches [22].

From the analysis and experiments, we observe that the decomposition-based multi-objective chemical reaction optimization enables D-MOCRO to utilize energy conservation balancing the local (exploitation) and global (exploration) to generate better quality solutions. Compared with the other algorithms on the 45 benchmark problems, we can conclude that D-MOCRO significantly outperforms MOLS and MOMA in most instances. Specifically, D-MOCRO outperforms MOLS in 23 instances and MOLS is outperformed by MOLS in 22 instances in terms of HV. According to the attribute of HV, the results of Table 1 show that D-MOCRO can obtain better convergence and diversity than MOLS and MOMA.

Further work includes research on the adaptive control of parameter to make the algorithm more efficient. Moreover, the algorithm can be applied to constrained, dynamic and noisy many-objectives and multi-objective optimization domain. It is expected that D-MOCRO will be used for many real-world optimization problems.