1 Introduction

The covering location problem (CLP) is a well-known problem in location science. There are two main variants: the set covering location problem (SCLP) and the maximal covering location problem (MCLP) (Zarandi et al. 2013). The MCLP seeks an optimal placement of a given fixed number of facilities on a network such that the total demand covered by the served population is maximized. A node (customer) is served by a facility if it falls within that facility's fixed service area, or coverage area (Atta et al. 2018). The MCLP is a resource-constrained problem: its objective is to satisfy as much customer demand as possible with limited resources or budget (Davari et al. 2011). It is also a constraint satisfaction problem that is useful for locating target areas to be served in different contexts.

The MCLP has many applications, such as siting police patrol stations, relocating ambulances, ensuring first aid provision during natural disasters and providing emergency services in rural as well as overpopulated areas (Zarandi et al. 2013). The dynamic MCLP has recently been applied to locating fire stations and allocating facilities during wars and natural disasters (Hajipour et al. 2022). The MCLP also has several applications in emergency and military services (Atta et al. 2018), and it is important in both the private and public sectors. Plant and warehouse placement and the placement of telecommunication antennas are examples of private-sector settings where the MCLP can be applied to establish an optimal placement structure. Similarly, the placement of schools, libraries, parks, bus stops, hospitals, fire stations and emergency units are common examples of public services where the MCLP is applicable for determining optimal solutions (Lorena and Pereira 2002). The MCLP can also improve solutions in the sharing economy; it has recently been deployed to distribute shared bikes in Chinese cities (Li et al. 2020). It is likewise applicable to designing networks with constraints.

There are several strategies to solve the maximal covering location problem. Since it is NP-hard, heuristic and metaheuristic approaches are suitable. Researchers have proposed many such algorithms for the MCLP, including the genetic algorithm (GA) (Atta et al. 2018; Zarandi et al. 2011), simulated annealing (SA) (Zarandi et al. 2013; Davari et al. 2011), greedy heuristics (GH) (Berman and Krass 2002; Máximo et al. 2017) and tabu search (TS) (Bagherinejad et al. 2018). In our study, we have chosen the chemical reaction optimization (CRO) algorithm, a nature-inspired metaheuristic, to deal with the MCLP. There are several reasons to select CRO over other metaheuristic approaches. Firstly, CRO can serve the purposes of both the GA and SA algorithms. GA is a Darwinian evolutionary algorithm that operates through the biologically inspired operators selection, crossover and mutation (Atta et al. 2018), where the mutation and crossover operators are similar to CRO's synthesis and decomposition. SA is another well-known stochastic metaheuristic; the energy conservation mechanism of CRO provides an effect analogous to SA's (Lam and Li 2012). So CRO can provide the dual effect of the GA and SA algorithms at the same time. Secondly, the best feature of CRO is its search capability: it can explore the solution space both locally and globally with its four operators (on-wall ineffective collision, intermolecular ineffective collision, decomposition and synthesis) (Lam and Li 2012), which makes the algorithm more efficient than many other metaheuristic approaches. Thirdly, CRO supports additional operators to address problem-specific issues. In our study of the MCLP, we observed that solutions sometimes become trapped in local maxima, so an additional repair operator is designed to resolve this problem.
Fourthly, the dynamic behavior of CRO can increase or decrease the number of solutions according to the needs of the current iteration to reach an equilibrium energy conservation state, so the number of solutions varies from iteration to iteration. Fifthly, in recent years CRO has successfully solved many optimization problems, such as determining the machining parameters of abrasive water jet cutting (Bhoi et al. 2022), designing reduced-congestion road networks (Salman and Alaswad 2022), mobile robot path planning with obstacle avoidance (Islam et al. 2021), the longest common subsequence problem for multiple strings (Islam et al. 2019), DNA motif discovery (Saha et al. 2020), the 0–1 knapsack problem (Truong et al. 2013), task scheduling and grid computing (Jin et al. 2011), the quadratic assignment problem (Xu et al. 2010), the max-flow problem (Barham et al. 2016), the shortest common supersequence problem (Saifullah and Islam 2016), RNA structure prediction (Kabir and Islam 2019), transportation scheduling in supply chain management (Mahmud et al. 2017), RNA secondary structure prediction (Islam et al. 2021), optimization of protein folding (Islam et al. 2020), flexible job-shop scheduling with maintenance activity (Li and Pan 2012), the distributed permutation flow-shop scheduling problem with makespan criterion (Bargaoui et al. 2017) and cloud job scheduling (Zain and Yousif 2020), often with better results than the other existing metaheuristic algorithms.

Different researchers have used different optimization techniques for different problems and sub-problems. For codon-mRNA prediction, the hybrid method DLSTM-DSN-WOA was introduced in Kadhuim and Janabi (2023), where the whale optimization algorithm (WOA) is used to build an optimal structure for a deep long short-term memory (LSTM) network. For model parameter estimation of the proton exchange membrane fuel cell (Syah et al. 2023), an improved teamwork optimizer is used. Another optimization problem, verifying chemical reactions (VCR), was solved using the whale optimization algorithm in Janabi et al. (2023). In this study, in contrast, we solve the maximal covering location problem (MCLP) using the chemical reaction optimization (CRO) algorithm.

In this article, we propose a CRO-based approach to solve the MCLP. The molecules of CRO are encoded by locating suitable facilities, and the potential energy is computed with respect to percentage of coverage and computational time. To measure the performance of the proposed method, we tested five real-world datasets as well as four randomly generated datasets consisting of 324 to 2500 nodes (covering small, medium and large instances). The results of the proposed method are compared with the GA-based method of Atta et al. (2018), which is the state-of-the-art method; its authors compared their method with two other methods (Davari et al. 2011; Lorena and Pereira 2002). The proposed method obtains better results in almost all test instances in terms of both percentage of coverage and computational time compared to Atta_GA (Atta et al. 2018). While dealing with the MCLP, we noticed that some solutions become trapped in local optima, so an additional repair operator (LM-GM) is designed to resolve this situation; all results are examined and reported with and without LM-GM to show the resulting improvements. The novelty and contributions of the proposed work are as follows:

  1. We have redesigned the four fundamental operators of the CRO algorithm and solved the MCLP; this approach has not been used before to solve the MCLP.

  2. A repair operator has been designed to overcome trapping in local optima; it produces the max point in the MCLP, which helps to obtain the best results in less computational time.

  3. We have obtained the best known results in almost all cases of the five real-world datasets. The proposed algorithm gives a substantial improvement in computational time.

  4. The proposed algorithm gives the best results in comparison with the state-of-the-art algorithm, with less computational time, on the four random datasets.

  5. A statistical test shows that the proposed algorithm is significant on both real-world and random instances.

So far, we have presented the basic ideas of the MCLP in Sect. 1. The problem statement and objective function are described in Sect. 2. Section 3 covers related work; Sect. 4 presents our proposed method for solving the MCLP using chemical reaction optimization; Sect. 5 describes the experimental results and analysis; and Sect. 6 concludes the work.

2 Problem statement and objective function

The MCLP deals with the problem of finding an optimal placement of a given fixed number of facilities on a network in such a way that the total demands of the attended population are maximized (Zarandi et al. 2011).

2.1 The hypothesis and limitations of the developed method

Some assumptions have been made for this problem. The MCLP works with a given number of customers (nodes) in a network. Each customer (node) has three basic attributes: a demand value (which must be non-negative), an x-coordinate and a y-coordinate. The two coordinate values (x, y) define the customer's location in the network. A fixed number of facilities is to be sited on the network. Each facility has a service area, called the coverage area, which is circular in shape and remains constant throughout processing. A facility can serve a customer (node) if the Euclidean distance between that node and the facility node is smaller than the constant service distance (coverage area). The target of the problem is therefore to locate an optimal placement of the facilities on the network so that the total demand of the served population is maximized. The objective function of the maximal covering location problem is given by the following equation:

$$\begin{aligned} f(D)= MAX \sum _{x\in {D}} d_{x}s_{x} \end{aligned}$$
(1)

Subject to the constraints:

$$\begin{aligned} s_{x} \le \sum _{y\in {E_{x}}} p_{y}, \quad x\in {D} \end{aligned}$$
(2)
$$\begin{aligned} \sum _{y\in {F}} p_{y} = f \end{aligned}$$
(3)
$$\begin{aligned} s_{x}\in {\{0,1\}}, \quad x\in {D} \end{aligned}$$
(4)
$$\begin{aligned} p_{y}\in {\{0,1\}}, \quad y\in {F} \end{aligned}$$
(5)

Equations (1)–(5) are taken from Atta et al. (2018). The objective function, Eq. (1), maximizes the total demand covered by the selected customers (population). Constraint (2) states that a customer is covered only when there exists at least one facility within the coverage area of that node. Constraint (3) requires the total number of facilities to be exactly f. Constraints (4) and (5) define the two binary decision variables of the MCLP. In this article, we do not consider variable service distances for different nodes.

2.2 Complexity of MCLP

The maximal covering location problem is NP-hard (Megiddo et al. 1983). No polynomial-time solution for the MCLP is known: even for a modest number of customers (nodes), a large graph or distance matrix must be maintained, the number of subsets of customers is huge, and the problem is restricted by several constraints. For example, if we consider 500 customers on which exactly 6 facilities are to be placed, then we need to maintain a \( 500\times 500 \) distance matrix. To determine the optimal solution from this large solution space, a large number of different combinations must be explored: there are \( \binom{500}{6} = \frac{{}^{500}P_{6}}{6!} \) candidate solutions in total, from which the single best one must be found.
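For concreteness, the size of this search space can be computed directly. A minimal sketch following the 500-customer example above:

```python
import math

# Number of candidate solutions when siting f = 6 facilities among
# m = 500 customers: all f-element subsets of the m nodes.
m, f = 500, 6
n_solutions = math.comb(m, f)   # = 500P6 / 6!, i.e. C(500, 6), roughly 2.1e13
```

Enumerating on the order of \(10^{13}\) placements is clearly impractical, which motivates the metaheuristic approach taken here.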

Table 1 Complexity of MCLP

Table 1 describes the complexity of the MCLP. The number of feasible solutions grows rapidly with the number of customers and facilities, and the distance matrix becomes huge as the number of customers increases. The MCLP must also respect several constraints. Hence, no deterministic polynomial-time solution is known for this problem, which is why metaheuristic approaches are applicable for determining efficient solutions. The objective function is essentially the fitness function of the problem definition. The notations shown in Table 2 are used to formulate the objective function and constraints of the maximal covering location problem.

Table 2 Symbolic notations of the objective function

Figure 1 depicts a sample MCLP instance: a graph of five customers on which exactly two facilities are to be sited. The five customers (nodes) are labeled C1, C2, C3, C4 and C5. The corresponding demand values, shown at the top of each node, are 50, 4, 33, 15 and 1, respectively. The coordinate values are given in parentheses; for customer C1 they are 409154 and 435528. The service distance and the number of facilities to be installed for this network are 100 and 2, respectively. This is a fully connected MCLP graph. To solve it, first create a distance matrix from the coordinate values using the Euclidean distance formula. Then generate all possible solutions (sequences) for five customers with 2 facilities; the total number of sequences is \( \binom{5}{2} = 10 \). Each sequence is evaluated over the solution space, and the highest of the ten resulting values is the required result. In this example, the maximum value is 83, generated by the sequence 10100: nodes C1 and C3 are selected, as this choice maximizes the demand of the attended nodes (hence C1 and C3 are colored in Fig. 1). The final percentage of coverage is the ratio of the generated result to the sum of all customer demand values; here the sum of all demand values is 103, so the percentage of coverage is \( \frac{83}{103}\times 100 = 80.58\% \).

Fig. 1

A sample example of MCLP
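The brute-force procedure just described can be sketched in a few lines. The coordinates and service distance below form an illustrative toy instance, not the values of Fig. 1 (whose full coordinate list is not reproduced here); only the five demand values are reused:

```python
from itertools import combinations
import math

# Hypothetical toy instance: five customers, two facilities to site.
coords  = [(0, 0), (1, 0), (10, 10), (11, 10), (50, 50)]
demands = [50, 4, 33, 15, 1]
radius  = 2.0   # constant service (coverage) distance
f       = 2     # number of facilities to install

def covered_demand(facilities):
    """Total demand of customers within `radius` of any sited facility."""
    total = 0
    for i, point in enumerate(coords):
        if any(math.dist(point, coords[j]) <= radius for j in facilities):
            total += demands[i]
    return total

# Exhaustive search over all C(m, f) facility subsets.
best = max(combinations(range(len(coords)), f), key=covered_demand)
best_value = covered_demand(best)
coverage_pct = 100 * best_value / sum(demands)
```

For this toy instance the best pair of sites covers a demand of 102 out of 103 (about 99.03% coverage); applied to the real Fig. 1 data, the same enumeration yields the 80.58% reported above.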

Figure 2, taken from Zarandi et al. (2011), visualizes a solution of the MCLP with 20 facilities. In the figure, the bold nodes are the facilities and each circle denotes the service radius of the corresponding facility. The smaller dots are the demand centers to be served.

3 Related works

Various metaheuristics approaches were proposed to solve the maximal covering location problem. Each approach has some efficiencies as well as pitfalls over other approaches. Some of the existing algorithms are described here.

3.1 Genetic algorithm

In 2018, Atta et al. proposed an approach to solve the maximal covering location problem (MCLP) using a genetic algorithm with local refinement (Atta et al. 2018). Since the MCLP is NP-hard, the authors solved it using the genetic algorithm (GA), a standard metaheuristic. They created the initial population by encoding chromosomes using a binary array representation. Each chromosome is a candidate solution that can be created randomly or following some preconceived notions, and a set of chromosomes makes up the initial population. The authors redesigned the mutation, crossover and selection operators, and also designed a local refinement operator with elitism to obtain the best results. In each iteration, all chromosomes pass through these operators in an attempt to extract a better solution than the existing ones; a new generation is created at the end of each iteration until the desired result is obtained. They used two types of datasets to measure the performance and efficiency of their proposed algorithm. SJC324, SJC402, SJC500, SJC708 and SJC818 are the five real-world datasets they used; these are networks of 324, 402, 500, 708 and 818 nodes, respectively. The datasets pmed32 and pmed39, networks of 700 and 900 nodes, respectively, were collected from the OR-Library. \( G \& R100 \), \( G \& R150 \), ZDS1800 and ZDS2500 are four random datasets they used to measure both small- and large-scale variants of the MCLP; these are networks of 100, 150, 1800 and 2500 nodes, respectively. They compared their proposed algorithm with other approaches in terms of percentage of coverage and computational time, and in most cases it gives quite good results on both measures.

Fig. 2

A sample solution to MCLP

Their proposed method is a nearly optimal and efficient approach that can solve both small- and large-scale versions of the MCLP. It was tested on networks of 100 to 2500 nodes and gives satisfactory results in almost all test instances, although the computational cost is relatively high for some instances. This algorithm is the state-of-the-art method for solving the MCLP.

3.2 Greedy randomized adaptive search procedure

Máximo et al. proposed an intelligent-guided adaptive search technique to solve the maximal covering location problem (MCLP) (Máximo et al. 2017). The authors named their proposed method intelligent-guided adaptive search (IGAS); it follows the process of the greedy randomized adaptive search procedure (GRASP). They built the construction phase of IGAS using an artificial neural network (ANN). The method is specialized for large instances (more than 3000 nodes). They used an unsupervised machine learning technique, growing neural gas (GNG), to monitor the best solutions developed in the current iteration; in this way the method learns about the improvement of solutions, which helps in the remaining iterations, so the algorithm tries to make decisions that steer it through promising branches. IGAS was tested in two different experiments and compared with the GRASP and CPLEX methods on data instances having at least 2700 nodes.

The principal advantage of this strategy is IGAS itself, a variant of GRASP that incorporates an adaptive memory so that it can learn from previous solutions. IGAS gives better results in almost all instances in both experiments. However, it is a complex algorithm because a hybridization process is applied, and the iteration independence of GRASP is violated here.

3.3 Simulated annealing

Zarandi et al. proposed an approach to solve the large-scale dynamic maximal covering location problem in 2013 (Zarandi et al. 2013). The paper proposes a long-term version of the MCLP named the dynamic MCLP (DMCLP) and solves it with the simulated annealing (SA) algorithm, a well-known metaheuristic. Designing the solution representation is the first step of SA. For the DMCLP they used a binary vector representation, where each bit denotes the status of a node: 1 indicates that a facility is sited at the node and 0 otherwise. The authors used the roulette wheel selection (RWS) technique for the initialization of their method, and the initial temperature, an important parameter of SA, was set using the method of Crama and Schyns (2003). Three types of cooling schedules (linear, exponential and hyperbolic) were used, and a neighborhood search structure (NSS) was employed to search for better solutions. The algorithm was tested on random datasets of 1800, 2000 and 2500 nodes with 100 facility sites, and compared with CPLEX to measure accuracy and computational time.

The great achievement of this method is its runtime: it takes less computational time than existing algorithms to solve large-scale MCLPs. However, it cannot produce optimal solutions in most instances, although the gap between the produced solution and the optimal solution is very small.

Table 3 Summary of previous works
Table 4 Summary of used datasets in previous works

3.4 Fuzzy simulation

In 2011, Davari et al. proposed a technique to solve the maximal covering location problem (MCLP) with fuzzy travel times (Davari et al. 2011). They solved the problem using a hybrid intelligent algorithm combining fuzzy simulation (FS) and simulated annealing (SA), and named the resulting method the fuzzy maximal covering location problem (FMCLP); the travel times between nodes are treated as fuzzy variables. SA is a local search-based metaheuristic that searches the solution space stochastically. At the beginning of FMCLP, the SA part generates the initial population with several solutions, represented as binary arrays over the candidate nodes. Along with the general information of the MCLP (the coordinate values and the demand value of each node), a travel time matrix is required. The method of Crama and Schyns (2003) was used to set the initial temperature, and a neighborhood search structure (NSS) was used to search for better solutions. The algorithm was tested on random datasets of 50, 100, 200, 500 and 900 nodes, and the results were compared with those of the exact solver LINGO.

The proposed method introduces fuzzy theory to the MCLP, and other metaheuristics have a chance to advance this fuzzy version further. However, in several instances the method cannot produce better results than the exact solver LINGO with respect to either solution quality or computational time.

3.5 Genetic algorithm

Zarandi et al. proposed an approach to solve the large-scale maximal covering location problem (MCLP) in 2011 (Zarandi et al. 2011). They aimed to solve the MCLP for larger numbers of nodes, hence the name large-scale MCLP; their method is applicable for up to 2500 nodes. They used the genetic algorithm (GA) with a binary vector representation for chromosome encoding. The authors redesigned the selection, crossover and mutation operators of GA, using the roulette wheel selection (RWS) technique to design the selection operator. Two random datasets of 1800 and 2500 nodes, respectively, were used to test the proposed algorithm, and the results were compared with the CPLEX method to assess performance.

The achievement of this method is that it took less computational time than the CPLEX method. However, it could not produce optimal solutions in several instances, although the gap between the produced solution and the optimal solution is relatively small.

3.6 Summary of related works

Table 3 summarizes the literature studied, listing the previously developed techniques, the types of datasets used, the evaluation measures, and the advantages and disadvantages of each technique. A * sign after a dataset indicates that it was randomly created. It is evident from Table 3 that most previous methods used random datasets for evaluation. A brief summary of the datasets used in previous works is given in Table 4.

4 Chemical reaction optimization for MCLP

Optimization is a general term used in many scientific fields and applications to obtain the best values in different contexts. The goal of optimization can be minimization, maximization, cost reduction, performance improvement, etc., depending on the problem definition, the objective function and many other factors (Al-Janabi and Alkaim 2022). The maximal covering location problem is also an optimization problem, where optimization means maximizing the total demand of the served population. This section includes several subsections: chemical reaction optimization, the basic structure of the proposed method, solution and population generation, operator design, and the flowchart of the proposed method.

4.1 Chemical reaction optimization

Lam and Li proposed chemical reaction optimization (CRO), a nature-inspired metaheuristic for optimization problems (Lam and Li 2012). The working principle of CRO follows two principles of thermodynamics. The gist of the first (the law of conservation of energy) is that energy cannot be created or destroyed, only converted from one form to another, so the total amount of energy always remains constant. This is expressed by the following equation:

$$\begin{aligned} \sum _{x=1}^{PopSize(tm)} \left( PE_{x}(tm)+KE_{x}(tm)\right) + buffer(tm)=S \end{aligned}$$
(6)

In Eq. (6), \( PE_{x}(tm) \) and \( KE_{x}(tm) \) denote the potential energy (PE) and kinetic energy (KE) of molecule x, respectively, at time tm. PopSize(tm) is the number of molecules, and buffer(tm) is the energy in the central buffer at time tm. The total constant energy is denoted by S; its constancy confirms that CRO obeys the energy conservation rule. The second principle of thermodynamics ensures the transformation of energy among the molecules (PE is converted to KE during the iteration stage). CRO follows these two rules to arrive at better solutions for optimization problems.

CRO is a population-based metaheuristic that performs three core stages: initialization, iteration and the final stage. In the initialization stage, CRO initializes several attributes (initial parameters) and creates the initial population. Each molecule has several attributes, including its molecular structure \( (\alpha ) \), potential energy (PE), kinetic energy (KE) and number of hits (NumHit). After initializing the parameters, CRO creates the initial population of size popsize. The second stage is the iteration stage, where in each iteration one of four elementary reaction operators is applied, selected by checking certain conditions. The four operators are the on-wall ineffective collision, intermolecular ineffective collision, synthesis and decomposition; there are no hard and fast rules for designing them. The on-wall ineffective collision and the intermolecular ineffective collision provide intensification (local search), while synthesis and decomposition provide diversification (global search). These four reaction operators give CRO a search capability that makes it competitive with other metaheuristics for optimization problems. In addition, CRO allows additional repair operators to be included in the algorithm if needed. Another interesting property of CRO is that the total number of molecules is not always the same during the iterations: CRO increases or decreases it according to the current needs of the problem. The third stage is the final stage: if any stopping criterion is met during the iterations or the iteration limit is exceeded, CRO enters the final stage and reports the outputs.
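The iteration-stage branching described above can be made concrete with a small sketch. The molecule attributes and the threshold names (MoleColl, \( \alpha \), \( \beta \)) follow Lam and Li (2012); the function only decides which elementary reaction fires, and the operator bodies themselves are omitted:

```python
import random
from dataclasses import dataclass

# Schematic sketch of CRO's per-iteration operator selection. The class
# and function names here are illustrative, not the paper's pseudo-code.

@dataclass
class Molecule:
    structure: list          # candidate solution (e.g. binary facility vector)
    pe: float = 0.0          # potential energy (objective value)
    ke: float = 100.0        # kinetic energy
    num_hit: int = 0         # collisions since creation
    min_hit: int = 0         # collision count at the best-so-far structure

def select_operator(pop, b, MoleColl, alpha, beta, rng=random):
    """Return which elementary reaction would fire for random draw b."""
    if b > MoleColl or len(pop) < 2:           # unimolecular branch
        m = rng.choice(pop)
        if m.num_hit - m.min_hit > alpha:      # stuck locally -> diversify
            return "decomposition"
        return "on-wall ineffective collision"
    m1, m2 = rng.sample(pop, 2)                # bimolecular branch
    if m1.ke <= beta and m2.ke <= beta:        # both low-energy -> merge
        return "synthesis"
    return "intermolecular ineffective collision"
```

In a full implementation, each branch would call the corresponding operator of Sect. 4.4 and apply CRO's energy-accounting rules before accepting the new molecules.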

Algorithm 1

4.2 Basic structure of proposed method

In this article, the MCLP is solved using the chemical reaction optimization algorithm; hence, the proposed method is named MCLP_CRO. The initial parameters of CRO (Table 5) are initialized with proper values, and then the initial population is created using a random function (Algorithm 1). The pseudo-code of the proposed MCLP_CRO method is shown in Algorithm 2. The iterations are performed by passing the initial parameters and initial population to the function MCLP_CRO. After termination, the algorithm reports its outputs, which are saved in two variables named result and time: result holds the percentage of coverage, and time the computational time of the proposed method.

Algorithm 2

4.3 Solution and population generation

CRO is a population-based metaheuristic. A unit of the population is called a molecule; it is also known as the manipulated agent because it is what is actually manipulated to retrieve a solution. All popsize molecules together form the population. Every molecule has several parameters; some of the initial parameters, with their algorithmic definitions and initial values, are shown in Table 5. How the initial parameters are tuned to these values is explained in Sect. 5.1.

Table 5 Parameters of CRO for MCLP with their initial values

The initial population is created randomly. Algorithm 1 shows the generation process of the initial population for the proposed method. Here, molecules_vector is the set of all molecules, counter counts the facilities assigned to a node, rand() generates a random integer, x is the index of a potential facility site, and d[1,2,3,...,m] is the initial satisfied-demand vector of the customers. There are m customers on which f facilities are to be installed, with f < m. Each molecule of the initial population is created by locating f random indices from the set of customers (nodes) {1, 2, 3, ..., m}; each index is generated randomly (a random integer mod m). By looping from 0 to popsize, all molecules, and thus the initial population, are created. The pseudo-code of our proposed MCLP_CRO approach is given in Algorithm 2; the notations of the MCLP and the initial parameters of CRO it uses are described in Tables 2 and 5, respectively. Here, r represents a molecule and b is a random fractional number in the range [0, 1]. The proposed MCLP_CRO algorithm uses the functions Decomposition(), OnWallIneffectiveCollision(), Synthesis() and IntermolecularCollision() as the basic CRO operators and LM-GM() as the repair operator. In every iteration, either a unimolecular or a bimolecular reaction is activated depending on the threshold parameter MoleColl and the random fraction b; one of the reaction functions operates in each iteration and tries to produce a good solution. After the iteration stage, LM-GM() is applied to all molecules to free any that are trapped. Section 4.4 describes the working principles of the reaction functions.

4.3.1 Solution representation

A solution for the MCLP is a set of f potential locations that need to be selected from the set of m customers. Let the values of m and f be 10 and 3, respectively. Both m and f are represented by one-dimensional arrays whose values are initially all zero. Let the randomly selected indices of the potential facility sites be 2, 6 and 8. The indexed facility array f for this sample example is shown in Table 6.

Table 6 Indexed facility (f) array

A binary vector is used for solution representation: the selected indices of the f array (Table 6) are set to 1 and the remaining indices to 0. The resulting sample solution representation is shown in Table 7.

Table 7 Sample solution representation
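Algorithm 1 and the binary representation above can be sketched as follows. Here `random.sample` stands in for Algorithm 1's rand()-with-counter loop (it guarantees f distinct sites), and the function names are illustrative:

```python
import random

# Minimal sketch of initial-population generation: each molecule sites
# f facilities at distinct random customer indices, encoded as a binary
# vector of length m (1 = facility sited, 0 = not sited).

def random_molecule(m, f, rng=random):
    solution = [0] * m
    for idx in rng.sample(range(m), f):   # f distinct facility sites
        solution[idx] = 1
    return solution

def initial_population(popsize, m, f, rng=random):
    return [random_molecule(m, f, rng) for _ in range(popsize)]
```

Every generated molecule therefore satisfies constraint (3) of Sect. 2 by construction, which spares the iteration stage from repairing infeasible starting points.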

4.4 Operator design

We redesigned the four fundamental operators and added a repair operator to retrieve solutions with better results. The operators used in the proposed method are described in the following subsections.

4.4.1 On-wall ineffective collision

If the condition (\( NumHit_{r} \)−\( MinHit_{r} \)) \( > \alpha \) is not satisfied (Algorithm 2), this reaction operator is applied. When a molecule collides with a wall of the container, its internal structure changes: molecule r produces a new molecule \( r' \):

$$\begin{aligned} r\rightarrow {r{'}} \end{aligned}$$

Let the values of m and f be 10 and 3; that is, 3 facilities are to be installed to serve 10 customers. The mechanism of the on-wall ineffective collision is very simple.

Fig. 3
figure 3

On-wall ineffective collision

Randomly select one or more indices of molecule r and change them to create a new molecule \( r{'} \), keeping it if it gives a better fitness value. In Fig. 3, the 5th and 7th indices of molecule r are changed to form the new molecule \( r{'} \), as the new molecule yields a solution with a better objective function value.
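A minimal sketch of this operator, assuming the fitness function (covered demand) is supplied by the caller; the move shown relocates a single facility, though the description above allows changing one or more indices:

```python
import random

def on_wall_ineffective_collision(r, fitness, rng):
    """Sketch of the on-wall operator: move one randomly chosen facility
    to a random empty site; keep the new molecule r' only if it improves
    the objective. `fitness` is a user-supplied function (an assumption)."""
    r_new = r[:]
    open_sites = [i for i, v in enumerate(r_new) if v == 1]
    empty_sites = [i for i, v in enumerate(r_new) if v == 0]
    i, j = rng.choice(open_sites), rng.choice(empty_sites)
    r_new[i], r_new[j] = 0, 1          # change the selected indices
    return r_new if fitness(r_new) > fitness(r) else r
```

The number of selected sites is preserved, so the facility-count constraint still holds after the move.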

Fig. 4
figure 4

Decomposition

Fig. 5
figure 5

Intermolecular ineffective collision

Fig. 6
figure 6

Synthesis

4.4.2 Decomposition

If (\( NumHit_{r} \)-\( MinHit_{r} \))\( > \alpha \) satisfies (Algorithm 2), then this reaction operator is operated. In this elementary reaction, two new molecules are created from one molecule. Two newly generated molecules bring diversity in their structures from the old molecule. Let molecule r produces two new molecules \( r_{1} \) and \( r_{2} \):

$$\begin{aligned} r\rightarrow {r_{1}+r_{2}} \end{aligned}$$

According to our example, we have 10 customers and 3 facilities to be installed. The mechanism is quite simple. First, divide molecule r into two partitions using a random divider function. Then, the first and second partitions of molecule r are copied to the beginning indices of molecules \( r_{1} \) and \( r_{2} \), respectively. The rest of the indices of molecules \( r_{1} \) and \( r_{2} \) are generated randomly using a random function generator while maintaining the constraints of the problem definition. In Fig. 4, the initial three indices of molecule r are copied to form the new molecule \( r_{1} \), and the remaining seven indices of molecule r are copied to the initial indices of molecule \( r_{2} \). The \( 4^{th} \) to \( 10^{th} \) indices of molecule \( r_{1} \) are then generated randomly, again respecting the constraints of the problem definition. The same mechanism is carried out for molecule \( r_{2} \).
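The partition-and-refill step can be sketched as follows; the repair that restores exactly f selected sites in each child is an assumption, since the text only says the constraints are maintained:

```python
import random

def decomposition(r, f, rng):
    """Sketch: split molecule r at a random divider; r1 keeps the first
    partition at its start, r2 keeps the second partition at its start,
    and the remaining indices are filled randomly so that each child
    again selects exactly f facility sites."""
    m = len(r)
    cut = rng.randrange(1, m)                  # random divider
    r1 = r[:cut] + [0] * (m - cut)
    r2 = r[cut:] + [0] * cut
    for child in (r1, r2):
        missing = f - sum(child)
        empty = [i for i, v in enumerate(child) if v == 0]
        for i in rng.sample(empty, missing):   # restore the facility count
            child[i] = 1
    return r1, r2
```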

4.4.3 Intermolecular ineffective collision

If (\( KE_{r1} > \beta \) and \( KE_{r2} > \beta \)) holds (Algorithm 2), this reaction operator is triggered. Here, two molecules collide with each other to form two new molecules. Let molecules \( r_{1} \) and \( r_{2} \) collide and produce two new molecules \( {r_{1}}{'} \) and \( {r_{2}}{'} \):

$$\begin{aligned} r_{1}+r_{2}\rightarrow {{r_{1}}{'}+{r_{2}}{'}} \end{aligned}$$

This is very similar to the on-wall ineffective collision, except that two molecules are involved. Molecule \( {r_{1}}{'} \) is produced from molecule \( r_{1} \), and molecule \( {r_{2}}{'} \) from \( r_{2} \). The mechanism is the same as in the on-wall ineffective collision: several indices are changed to form each new molecule. In Fig. 5, the \( 6^{th} \) and \( 8^{th} \) indices of molecule \( r_{1} \) are changed to form the new molecule \( {r_{1}}{'} \), and the rest of the indices of molecule \( r_{1} \) are copied to the same indices of molecule \( {r_{1}}{'} \). The same operations are carried out to create molecule \( {r_{2}}{'} \).
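Because the mechanism mirrors the on-wall collision, a sketch only needs to apply the same perturb-and-keep-if-better step to both molecules independently; as before, the fitness function is a caller-supplied assumption:

```python
import random

def intermolecular_ineffective_collision(r1, r2, fitness, rng):
    """Sketch: both molecules independently undergo the same
    perturb-and-keep-if-better step as the on-wall operator."""
    def perturb(r):
        new = r[:]
        i = rng.choice([k for k, v in enumerate(new) if v == 1])
        j = rng.choice([k for k, v in enumerate(new) if v == 0])
        new[i], new[j] = 0, 1          # relocate one facility
        return new if fitness(new) > fitness(r) else r
    return perturb(r1), perturb(r2)
```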

4.4.4 Synthesis

When (\( KE_{r1} \le \beta \) and \( KE_{r2} \le \beta \)) holds (Algorithm 2), this reaction operator is activated. Synthesis consolidates two molecules into a new molecule; it is the reverse of decomposition. Let \( r_{1} \) and \( r_{2} \) be two molecules. After the collision, molecule r is created:

$$\begin{aligned} r_{1}+r_{2}\rightarrow {r} \end{aligned}$$

Synthesis performs diversification to explore the global solution space in search of the optimal solution. A one-point crossover mechanism is used: the first few indices of the new molecule r are copied from molecule \( r_{1} \) and the rest from molecule \( r_{2} \). In Fig. 6, the first four indices are copied from molecule \( r_{1} \) and the remaining six from molecule \( r_{2} \) to form the new molecule r, which carries diversity.
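A sketch of the one-point crossover, with a repair step (an assumption, since the text does not detail it) so that the child again selects exactly f sites:

```python
import random

def synthesis(r1, r2, f, rng):
    """Sketch: one-point crossover merging r1 and r2 into one molecule r,
    then repair so that r again contains exactly f ones."""
    m = len(r1)
    cut = rng.randrange(1, m)
    r = r1[:cut] + r2[cut:]
    ones = [i for i, v in enumerate(r) if v == 1]
    if len(ones) > f:                          # too many facilities: drop extras
        for i in rng.sample(ones, len(ones) - f):
            r[i] = 0
    elif len(ones) < f:                        # too few: open random empty sites
        empty = [i for i, v in enumerate(r) if v == 0]
        for i in rng.sample(empty, f - len(ones)):
            r[i] = 1
    return r
```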

4.4.5 Repair operator LM-GM

In some cases, the MCLP_CRO algorithm gets trapped in a local maximum, so the optimal result cannot be obtained. To overcome this, we designed a new repair operator named LM-GM, which converts a local maximum into the global maximum. Pseudo-code of the LM-GM operator is shown in Algorithm 3. The operator first determines the deviation indices of each molecule (lines 10–16 in Algorithm 3) and then performs demand maximization (lines 17–22 in Algorithm 3) using those deviation indices. The working principle of LM-GM is depicted in Fig. 7. The current solution sometimes gets trapped in a local maximum because the basic reaction operators are designed using several random functions; to resolve this, the LM-GM operator performs an iterative max-deviation-index search. In Fig. 7, the max deviation indices of the 1st and 9th elements of the original solution are mapped to the 6th and 10th indices to form a new solution. The 4th index remains unchanged in the new solution because it already holds the max deviation index. The new solution attains the maximum point of the MCLP.
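The exact deviation-index computation appears only in Algorithm 3, which is not reproduced here; the following is therefore a hypothetical sketch of the underlying idea, greedily relocating each open facility to whichever site most increases the covered demand until no move helps (the demand/distance data structures are illustrative):

```python
def lm_gm(solution, demand, dist, S):
    """Hypothetical sketch of the LM-GM repair idea: relocate open sites
    greedily toward higher covered demand. The paper's deviation-index
    computation (Algorithm 3) may differ in detail."""
    def covered(sol):
        open_sites = [j for j, v in enumerate(sol) if v == 1]
        return sum(d for i, d in enumerate(demand)
                   if any(dist[i][j] <= S for j in open_sites))
    improved = True
    while improved:
        improved = False
        for i in [j for j, v in enumerate(solution) if v == 1]:
            if solution[i] == 0:               # already moved in this pass
                continue
            base = covered(solution)
            best_gain, best_j = 0, None
            for j in [k for k, v in enumerate(solution) if v == 0]:
                trial = solution[:]
                trial[i], trial[j] = 0, 1      # tentative relocation
                gain = covered(trial) - base
                if gain > best_gain:
                    best_gain, best_j = gain, j
            if best_j is not None:
                solution[i], solution[best_j] = 0, 1
                improved = True
    return solution
```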

figure c
Fig. 7
figure 7

Repair operator LM-GM

4.5 Flowchart of the proposed method

Fig. 8
figure 8

Flowchart of the proposed MCLP_CRO method

Figure 8 depicts the flowchart of the proposed method. The initial population is created using a random function generator; the pseudo-code of this process is given in Algorithm 1 and described in Sect. 4.3. Before the iterations start, all the molecules of the initial population are sorted according to their potential energies. In the iteration stage, the algorithm first checks whether the reaction is intermolecular or unimolecular. If an intermolecular reaction occurs (\( b > MoleColl \)), then either synthesis (see Sect. 4.4.4) or an intermolecular ineffective collision (see Sect. 4.4.3) is performed, whichever is appropriate: if \( KE \le \beta \) (the current kinetic energy does not exceed the threshold \( \beta \)), the global search strategy synthesis is executed; otherwise the local search technique intermolecular ineffective collision keeps the current state of the molecules maximized and balanced. When \( b < MoleColl \), decomposition (see Sect. 4.4.2) or an on-wall ineffective collision (see Sect. 4.4.1) is performed in the same manner. In Fig. 8, NH and MH refer to the total number of hits and the number of hits with minstruct, respectively. When \( NH - MH > \alpha \), the algorithm activates the global search strategy decomposition; otherwise it performs the local search method on-wall ineffective collision to improve the molecules locally. After each iteration, the algorithm checks whether a maximum point has been found. If so, the stopping criteria are verified; if they are met, the best maximum point is returned and the algorithm terminates. Otherwise, the algorithm again checks whether the reaction is intermolecular or unimolecular, repeating until a maximum point is found or the iteration limit is reached.
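The reaction-selection branching can be summarized in a short sketch, following the operator conditions stated in Sects. 4.4.1–4.4.4 (the function name and string labels are illustrative):

```python
def select_reaction(b, mole_coll, ke, beta, nh, mh, alpha):
    """Sketch of the reaction selection in the MCLP_CRO flowchart,
    following the trigger conditions of Sects. 4.4.1-4.4.4."""
    if b > mole_coll:                          # bimolecular reaction
        if ke <= beta:
            return "synthesis"                 # global search
        return "intermolecular collision"      # local search
    if nh - mh > alpha:                        # unimolecular reaction
        return "decomposition"                 # global search
    return "on-wall collision"                 # local search
```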

5 Experimental results and comparisons

Data are among the world's most valuable resources and underpin many fields of computer science; expert and digital systems cannot be built without datasets. Data can be of various types and can be gathered by recording, searching or observation. We used nine datasets (five real-world datasets and four randomly created datasets) to measure the performance of the proposed MCLP_CRO algorithm. The five real-world datasets are named SJC_324, SJC_402, SJC_500, SJC_708 and SJC_818, where 324, 402, 500, 708 and 818 are the total numbers of nodes in the datasets, respectively. The four random datasets are named B_700, B_900, ZDS_1800 and ZDS_2500, where 700, 900, 1800 and 2500 are the numbers of nodes. These datasets were used to evaluate recent algorithms for solving the maximal covering location problem (Atta et al. 2018; Lorena and Pereira 2002). We implemented the proposed MCLP_CRO algorithm in C++ on a machine with the following specifications: Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz 2.71 GHz, 8.00 GB RAM (7.90 GB usable), 64-bit operating system, x64-based processor, Windows 10 Pro.

5.1 Parameter tuning

It is important to tune the initial attributes of CRO to obtain better results. We tuned several necessary attributes of CRO such as \( \alpha \), \( \beta \), kinetic energy (KE), and the number of runs. Six instances were used for the tuning: two large, two medium and two small, as shown in Table 8.

Table 8 Selected instances for tuning parameter

For \( \alpha \) tuning, we varied \( \alpha \) from 1 to 8. In Fig. 9, the values of \( \alpha \) are plotted along the X axis and the potential energy of each instance along the Y axis. When \( \alpha \) is 5, we obtained the highest potential energy in most cases; the highest potential energies for the six selected instances (Table 8) were 5461, 12152, 15984, 19707, 24192 and 29168 (Fig. 9). Therefore, we used 5 as the initial value of \( \alpha \) for the proposed MCLP_CRO method.

Fig. 9
figure 9

Alpha tuning

Fig. 10
figure 10

Beta tuning

Fig. 11
figure 11

KE tuning

Fig. 12
figure 12

Tuning of runs

For \( \beta \) tuning, we varied \( \beta \) from 12000 to 18000. In Fig. 10, the values of \( \beta \) are plotted along the X axis and the potential energy of each instance along the Y axis. When \( \beta \) is 15000, we obtained the highest potential energy in most cases; the highest potential energies for the six selected instances (Table 8) were 5461, 12152, 15984, 19707, 24192 and 29168 (Fig. 10). So, we used 15000 as the initial value of \( \beta \) for the proposed MCLP_CRO algorithm.

For the tuning of kinetic energy (KE), we considered values of KE from 400 to 1600. In Fig. 11, the values of KE are plotted along the X axis and the potential energy of each instance along the Y axis. When KE is 1000, we obtained the highest potential energy in most cases; the highest potential energies for the six selected instances (Table 8) were 5461, 12152, 15984, 19707, 24192 and 29168 (Fig. 11). So, we selected 1000 as the initial value of KE for the proposed MCLP_CRO approach.

To observe the significance of the number of runs, we executed the instances 5, 10, 15, 20, 25 and 30 times. When the number of runs exceeds 15, the potential energy remains the same in most cases, as no new values are found. In Fig. 12, the number of runs is plotted along the X axis and the potential energy of each instance along the Y axis. Figure 12 indicates the effect of the number of runs on the potential energy.

Table 9 Results of SJC_324 dataset between MCLP_CRO and Atta_GA (Atta et al. 2018)
Table 10 Results of SJC_402 dataset between MCLP_CRO and Atta_GA (Atta et al. 2018)
Table 11 Results of SJC_500 dataset between MCLP_CRO and Atta_GA (Atta et al. 2018)
Table 12 Results of SJC_708 dataset between MCLP_CRO and Atta_GA (Atta et al. 2018)
Table 13 Results of SJC_818 dataset between MCLP_CRO and Atta_GA (Atta et al. 2018)
Table 14 Results of B_700 dataset between MCLP_CRO and Atta_GA (Atta et al. 2018)
Table 15 Results of B_900 dataset between MCLP_CRO and Atta_GA (Atta et al. 2018)
Table 16 Results of ZDS_1800 dataset between MCLP_CRO and Atta_GA (Atta et al. 2018)
Table 17 Results of ZDS_2500 dataset between MCLP_CRO and Atta_GA (Atta et al. 2018)
Table 18 Comparison in average time improvement for the datasets

5.2 Results analysis and discussion

We initialized the CRO attributes as \(PopSize = 10\), \(KELossRate = 0.2\), \(MoleColl = 0.4\), \(InitialKE = 1000\), \(\alpha = 5\), \(\beta = 15000\) and \(iteration = 150\). The input parameters are recorded as n, p and S, where n denotes the number of customers (nodes) in the network, p is the number of facilities to be installed and S is the constant service distance (service radius) within which a facility can serve a customer. These three symbols (n, p and S) are used in Tables 8 to 17. The % of cov and Time(s) represent the percentage of coverage and the computational time in seconds, respectively; these two symbols are used in Tables 9 to 13. For a fair comparison, we also implemented Atta_GA (Atta et al. 2018) in C++ on the same machine with the same specifications as for MCLP_CRO. The proposed algorithm is compared only with Atta_GA because that algorithm achieved the best results on all the instances and is the state-of-the-art method. The bold entries in Tables 9, 10, 11, 12, 13, 15, 16, 17 denote the best results achieved by the proposed method.
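Given the symbols just defined, the % of cov reported in the tables is the covered demand as a fraction of the total demand: a customer counts as covered if some open facility lies within the service radius S. A sketch (the demand vector and distance matrix are illustrative data structures):

```python
def coverage_percentage(solution, demand, dist, S):
    """Percentage of total demand covered by the open facilities.
    dist[i][j] is the distance from customer i to candidate site j."""
    open_sites = [j for j, v in enumerate(solution) if v == 1]
    covered = sum(demand[i] for i in range(len(demand))
                  if any(dist[i][j] <= S for j in open_sites))
    return 100.0 * covered / sum(demand)
```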

The five real-world datasets were collected from “L.A.N.Lorena-instancias - INPE” (http://www.lac.inpe.br/~lorena/instancias.html). These five datasets were tested with three different values of S: 800, 1200 and 1600. There are 10 instances in the SJC_324 dataset. Increasing the number of facilities by one unit each time, we measured the percentage of coverage and the computational time. The best results of the proposed MCLP_CRO method are highlighted in bold. We obtained the same results as Atta_GA (Atta et al. 2018) with respect to percentage of coverage (Table 9), while the computational time for each of the 10 instances is less than that of Atta_GA. Table 10 shows the comparison for the SJC_402 dataset between MCLP_CRO and Atta_GA (Atta et al. 2018). We obtained the same results as Atta_GA with respect to percentage of coverage (Table 10), while the computational time for each of the 11 instances is less than that of Atta_GA.

The results for the SJC_500 dataset are shown in Table 11. SJC_500 is a dataset of 500 customers containing 15 instances. For 13 of the 15 instances, we obtained the same percentage of coverage as Atta_GA (Atta et al. 2018) in less computational time. For 2 instances ([\(\textit{n}=500\), \( \textit{p}=4\), \(\textit{s}=800\)] and [\(\textit{n}=500\), \( \textit{p}=5\), \(\textit{s}=800\)]), we obtained slightly worse but very close results with respect to percentage of coverage. However, the computational time for all the instances is less than that of Atta_GA (Atta et al. 2018).

The comparison results for the SJC_708 and SJC_818 datasets between MCLP_CRO and Atta_GA (Atta et al. 2018) are given in Tables 12 and 13, respectively. SJC_708 and SJC_818 contain 21 and 26 instances, respectively. For 20 of the 21 instances (SJC_708 dataset), we obtained the same percentage of coverage as Atta_GA (Atta et al. 2018) in less computational time. For one instance (\(n=708\), \( p=7\), \(s=800\)), we obtained a slightly worse but very close percentage of coverage. However, the computational time for each of the 21 instances (SJC_708 dataset) is less than that of Atta_GA (Atta et al. 2018). For 22 of the 26 instances (SJC_818 dataset), the same percentage of coverage was obtained as Atta_GA (Atta et al. 2018), while the computational time for each of the 26 instances is less than that of Atta_GA.

The five real-world datasets (SJC_324, SJC_402, SJC_500, SJC_708, SJC_818) were used by Atta et al. (2018) and Lorena and Pereira (2002) for solving the MCLP. Lorena and Pereira (2002) used a Lagrangean/surrogate heuristic and, to the best of our knowledge, theirs is the first method that used these real-world datasets (GIS-referenced instances of the city of São José dos Campos, Brazil) for the MCLP. Their method performed very well with respect to percentage of coverage on all the test instances. Atta et al. (2018) later used these five real-world datasets to solve the MCLP using a genetic algorithm with local refinement. They compared their results with the method of Lorena and Pereira (2002) and showed that their method is better on almost all the test instances with respect to both percentage of coverage and computational time. In this paper, we also use these five real-world datasets and compare our proposed method with that of Atta et al. (2018). The first five rows of Tables 18 and 19 show the comparison between our proposed method and Atta_GA (Atta et al. 2018) with respect to computational time and percentage of coverage, respectively. The average computational time for the SJC_324 dataset is 7.025 s in Lorena and Pereira (2002) and 4.32 s in Atta et al. (2018), whereas our proposed method takes 1.32 s, which is more efficient than the previous two methods.

We created four random datasets to measure the performance of the proposed method (MCLP_CRO) on large-scale instances of the MCLP. These four datasets were also used by Atta_GA (Atta et al. 2018) and are named B_700, B_900, ZDS_1800 and ZDS_2500, containing 700, 900, 1800 and 2500 customers (nodes), respectively. For these datasets, Avg Cov % and Max Cov % denote the average and maximum percentage of coverage, while Avg Time(s) and Max Time(s) denote the average and maximum computational time in seconds. The two datasets pmed32 (B_700) and pmed39 (B_900) were collected from the “OR-Library” (http://people.brunel.ac.uk/~mastjjb/jeb/info.html). These two datasets are for the p-median problem and therefore do not contain customer demand values, so the demand values were generated by a separate program from a normal distribution with mean 80 and standard deviation 15, as done for Atta_GA (Atta et al. 2018). Tables 14 and 15 show the results for the B_700 and B_900 datasets between MCLP_CRO and Atta_GA (Atta et al. 2018). The B_700 dataset is tested with three different values of S: 13, 15 and 20. The B_900 dataset is tested with three different values of S: 10, 13 and 16. The numbers of facilities to be installed for both datasets (B_700 and B_900) are 20, 24 and 28, giving nine instances per dataset. For all the instances of the B_700 dataset, we obtained better results than Atta_GA (Atta et al. 2018) with respect to percentage of coverage (both average and maximum) and computational time (both average and maximum). For eight of the nine instances of the B_900 dataset, we obtained a better average percentage of coverage than Atta_GA (Atta et al. 2018); with respect to maximum percentage of coverage, we obtained better results for seven of the nine instances. However, the computational time (both average and maximum) for each of the nine instances is less than that of Atta_GA for the B_900 dataset.

The two random datasets (B_700, B_900) were examined by Atta et al. (2018) and Lorena and Pereira (2002) for solving the MCLP. Lorena and Pereira (2002) tested their method on these two datasets with customer demand values generated by a separate program from a normal distribution with mean 80 and standard deviation 15. Lorena and Pereira (2002) made the first attempt to use this type of random dataset for the MCLP, using a Lagrangean/surrogate heuristic; their method performs well with respect to maximum percentage of coverage on these datasets. Later, a metaheuristic-based approach was proposed by Atta et al. (2018), which outperforms the previous method in almost all dimensions. In this paper, we also use these two random datasets and compare our proposed method with that of Atta et al. (2018). The sixth and seventh rows of Tables 18 and 19 show the comparison between our proposed method and Atta_GA (Atta et al. 2018) with respect to average computational time and percentage of coverage, respectively. The average computational time for the B_700 dataset is 334.69 s in Lorena and Pereira (2002) and 43.84 s in Atta et al. (2018), whereas our proposed method takes 18.59 s, which is more efficient.

The distance graphs of the ZDS_1800 and ZDS_2500 datasets were created randomly from a uniform distribution in the range [0, 30]. The demand values were also created randomly, from a uniform distribution in the range [0, 100]. Two separate programs were used to create these two random datasets. Tables 16 and 17 show the comparison results for the ZDS_1800 and ZDS_2500 datasets between MCLP_CRO and Atta_GA (Atta et al. 2018). These two datasets are tested with three different values of S: 3.5, 3.75 and 4. The numbers of facilities to be installed are 15, 20 and 25, giving nine instances per dataset. For eight of the nine instances of the ZDS_1800 dataset, we obtained better results than Atta_GA (Atta et al. 2018) with respect to percentage of coverage (both average and maximum), and the computational time (both average and maximum) for each of the nine instances is less than that of Atta_GA. For five of the nine instances of the ZDS_2500 dataset, we obtained a better average percentage of coverage than Atta_GA (Atta et al. 2018); with respect to maximum percentage of coverage, we obtained better results for eight of the nine instances. However, the computational time (both average and maximum) for each of the nine instances is less than that of Atta_GA for the ZDS_2500 dataset.

The two random datasets (ZDS_1800, ZDS_2500) were used by Atta et al. (2018) and Zarandi et al. (2011) for solving the MCLP. Zarandi et al. (2011) tested their method on these two datasets, whose distance graphs were created randomly from a uniform distribution in the range [0, 30] and whose demand values were created randomly from a uniform distribution in the range [0, 100]. Zarandi et al. (2011) made the first attempt to use this type of random dataset for the MCLP, using the well-known genetic algorithm metaheuristic; the great achievement of their method is its runtime. Later, another metaheuristic-based approach was proposed by Atta et al. (2018), which outperforms the previous method in almost all dimensions by incorporating a local refinement strategy. In this paper, we also use these two random datasets and compare our proposed method with that of Atta et al. (2018). The eighth and ninth rows of Tables 18 and 19 show the comparison between our proposed method and Atta_GA (Atta et al. 2018) with respect to average computational time and percentage of coverage, respectively. The average computational time for the ZDS_1800 dataset is 170.33 s in Zarandi et al. (2011) and 54.45 s in Atta et al. (2018), whereas our proposed method takes 31.82 s, which is more efficient than the previous two methods.

5.3 Results summary with details of the datasets

The proposed MCLP_CRO method produces solutions in less computational time for both small and large datasets. Table 18 shows the comparison of average time improvement for all the test datasets, where \( Avg\_T \) denotes the average computational time of a dataset. Our method improves the average computational time by about 81.44% for the SJC_402 dataset, the highest among all nine datasets, whereas the lowest improvement is 41.56%, for the ZDS_1800 dataset. For small-scale datasets, the average time improvement is higher (more than 50%); for large-scale datasets, it is lower (around 42%).

Table 19 shows the significance of the average percentage of coverage of the proposed method, where \( Avg\_E \) represents the average gap in percentage of coverage from the best-known percentage of coverage. For the small-scale datasets (SJC_324, SJC_402 and B_700), the gap is zero. For the large-scale datasets, the proposed method achieves average coverage gaps of 0.07, 0.07, 0.07, 0.16 and 0.05 for SJC_708, SJC_818, B_900, ZDS_1800 and ZDS_2500, respectively, which are also very small. Only the SJC_500 dataset has an average coverage gap of 0.49, the worst among the datasets, while still achieving a 65.79% improvement in average computational time. Overall, the proposed MCLP_CRO method gives the best percentage of coverage in 91.60% of the instances, while the remaining 8.40% of the instances produce results with an average error of 0.10. Nevertheless, the proposed method performs very well with respect to computational time on all the test instances of all the datasets (see Table 18), and the improvement in time is significant; the lowest improvement is 41.56% compared to the state-of-the-art method.

Table 19 Results of summary in average percentage of coverage for the datasets

5.4 Statistical test (Wilcoxon signed-rank test)

The Wilcoxon signed-rank test, a nonparametric statistical test carried out between two paired sets of values to determine whether there is a statistically significant relationship between them, is used here for the significance test. We ran the test between MCLP_CRO and Atta_GA (Atta et al. 2018) to observe the statistical significance, using an open-source online statistical calculator (https://www.socscistatistics.com/tests/signedranks/default2.aspx). We performed the test for both real-world and random instances, and the results are shown in Tables 20 and 21, respectively. The test environment is as follows: the hypothesis is 2-tailed and the level of significance is 0.05.
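For reference, the W statistic reported below can be reproduced with a short pure-Python routine (a sketch of the standard procedure; the paper used the online calculator):

```python
def wilcoxon_w(x, y):
    """Wilcoxon signed-rank W statistic for paired samples: zero
    differences are dropped, tied |d| values get average ranks, and
    W = min(W+, W-). Returns (W, effective sample size)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:                       # assign average ranks to tied |d|
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1          # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(w_plus, w_minus), n
```

For small N (here, N = 18 for the real-world instances) the W value is compared against a critical-value table; for larger N a normal approximation yields the Z-value.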

Table 20 Statistical test for real-world instances between MCLP_CRO and Atta_GA (Atta et al. 2018)
Table 21 Statistical test for random instances between MCLP_CRO and Atta_GA (Atta et al. 2018)

The test results show that MCLP_CRO is significant at \( \textit{P}<0.05 \) for both real-world and random instances. For the real-world instances (Table 20), the value of W is 26.5, and the critical value of W at N = 18 (\( P<0.05 \)) is 40. The Z-value cannot be used to evaluate the hypothesis for the real-world instances because N = 18, which is too few samples to approximate a normal distribution; the test requires at least 20 samples for that. So, the W-value is used to evaluate the hypothesis for the real-world instances.

The result for the random instances (Table 21) shows that the value of W is 18.5. Here the distribution is approximately normal, so the Z-value can be used to evaluate the hypothesis for the random instances.

6 Conclusion

The maximal covering location problem (MCLP) deals with finding an optimal placement of a given fixed number of facilities on a network such that the total demand of the attended population is maximized. MCLP is a well-known constraint satisfaction problem discussed in location science. In this article, a chemical reaction optimization (CRO)-based method is proposed to solve the MCLP; hence, the proposed method is named MCLP_CRO. The goal of this paper is to solve the MCLP using CRO, achieving the best percentage of coverage in the least computational time on both benchmark and random datasets, thereby proposing an efficient way of solving this well-known covering location problem, and to assess the statistical significance of the proposed method against existing methods. Designing the four fundamental operators of CRO for this problem was a challenging task. We performed parameter tuning to initialize the CRO parameters. Sometimes the solutions generated by the method get trapped in local maxima; to overcome this, we designed a repair operator (LM-GM) that converts a local maximum into the global maximum and obtains the best results. The proposed algorithm was tested on both small- and large-scale instances, including benchmark (5 datasets) and random (4 datasets) ones. The proposed MCLP_CRO method gives the best percentage of coverage in 91.60% of the instances, and for the remaining 8.40% of the instances it produces results with an average error of 0.10%, which is very close to the optimal value. Moreover, the proposed method performs very well with respect to computational time on all (100%) of the test instances of all the datasets, and the improvement in computational time is significant: the lowest improvement is 41.56%, for the ZDS_1800 dataset, while the highest is 81.44%, for the SJC_402 dataset.
For small-scale datasets, the average time improvement is higher (more than 50%); for large-scale datasets, it is lower (around 42%). In a nutshell, the proposed method produces solutions in less computational time than the state-of-the-art algorithm (Atta_GA) for both small and large datasets. A statistical test (the Wilcoxon signed-rank test) was performed on the results with a 2-tailed hypothesis and a 0.05 level of significance; it shows that the proposed method is significant for both real-world and random instances. The W-value (26.5) is used to evaluate the hypothesis for the real-world instances, and the Z-value (\(-\)4.941) for the random instances. So, the proposed method is efficient from the perspective of computational time. Nevertheless, it has the drawback that it cannot produce the optimal percentage of coverage for some test instances, although the difference is very small, only 0.10%. In the future, the traditional MCLP could be extended to dynamic large-scale MCLP to find dynamic solutions under different resource constraints. A fuzzy CRO version of the MCLP could also be developed, and multiple metaheuristics could be combined to solve the MCLP for further improvement. Another avenue for future research is to assess the performance of CRO on other variants of the covering location problem.