1 Introduction

The maximal covering location problem (MCLP) aims to locate a fixed number of facilities to cover, within a given coverage distance, as many demands as possible. MCLP is a very important model to provide social and economic services (such as the placement of health centers, fire stations, emergency warning sirens, chain stores, etc.) in order to maximize certain benefits (see, e.g., [1] and [2]).

The classical MCLP, presented in [3], seeks to maximize the number of covered demands within some desirable maximum coverage distance \(S\). However, public services such as emergency medical services (EMS) should be available to all citizens. In fact, even if a patient is located in an area that no ambulance can reach within a reasonable time, the patient must still be served by an existing ambulance base. Hence, the “Maximal Covering with Mandatory Closeness Constraints (MCMCC)” model was proposed in [3] to provide some form of fairness for the demands not served within the desirable coverage distance \(S\). Besides the objective of maximizing the amount of demand served within the distance \(S\), this model ensures that each demand point will find a facility within a less desirable coverage distance \(T\) (\(T>S\)).

Although MCLP is widely studied in the literature, there are only a few works on this problem with different coverage distances, even though this feature is very important and highly applicable in many real situations. For example, different emergency vehicles with different coverage distances can be deployed in an emergency service system. More precisely, in the heavily congested streets of a big city such as Tehran, a motorcycle is faster (as well as cheaper) than a van, and it can traverse a longer distance than a van within a standard response time (e.g., 8 min); hence, the type of vehicles at an emergency base specifies its coverage distance. MCMCC can be regarded as the first MCLP with different coverage distances. As an extension of MCMCC, Gendreau et al. [4] proposed the so-called double standard model (DSM) in which, similar to MCMCC, two coverage distances \(S\) and \(T\) (\(S<T\)) are considered for a single type of facility and every demand point must be covered by a facility within \(T\), while the objective is to maximize the demands that are covered at least twice within \(S\). A dynamic version of DSM was developed in [5]. Some other models, such as [6] and [7], incorporated the concept of backup coverage. These models try (usually as a second objective) to increase the number of backup facilities that cover a demand point.

Another model which considers different coverage distances is presented in [8]. However, this model does not consider the concept of backup facilities and just presents a multi-level extension of MCLP which tries to cover each demand by a facility that is as close as possible. In addition to the above-mentioned articles, Jabalameli et al. [9] supposed that the coverage distance of each facility can be controlled by the decision-maker, and they considered the coverage distance as a function of the amount of money invested in the facility.

In the present work, we propose a variant of MCLP entitled double-type double-standard model (DtDsM) to deal with the location-allocation problem of two types of facilities: normal and backup facilities. Similar to [9], the two types of facilities (can) have different costs and different coverage distances. However, in DtDsM, it is not necessary for a facility with longer coverage distance to be more expensive.

The concept of a backup facility in DtDsM is different from before. Unlike the above-mentioned models, here, we do not have only a single type of facility, and we do not aim to cover the demand points by as many facilities as possible. In fact, by backup, we do not mean multiple coverage, but the provision of primary services to the demand points within a standard time by a backup facility (if necessary).

Similar to MCMCC and consequently DSM, the proposed model DtDsM ensures that a lower-quality service will be available at each demand point. However, the former models translate this as availability of a facility within a less desirable distance, while the latter translates it as availability of (at least) a lower-quality facility (i.e., a type II or backup facility which offers only some primary services) within a standard time. In other words, each demand point must lie within the coverage distance of a backup facility if it does not lie within the coverage distance of a normal facility. This is another difference between DtDsM and the model of [9], which does not care about an excessively long distance between a demand point and the facility that serves it, but only about the relation between the cost of a facility and its coverage distance.

The main objective of DtDsM is to maximize the amount of demands which can receive a full service (from a type 1 or normal facility) within the considered standard time (similar to the classical MCLP). However, as a second objective, it tries to minimize the total distance between the demand points that do not receive a full service on time and the normal facilities closest to them.

DtDsM can be useful in determining the location of facilities in public emergency services such as emergency medical services (EMSs), rescue and firefighting services (RFFSs), etc. Consider two important types of EMS vehicles: van-based ambulances as the normal facilities and motorcycle-based ambulances (motorlances) as the backup facilities. Then, if DtDsM is applied to locate EMS vehicles, it is ensured that each demand will receive at least the primary medical services from a motorlance (backup facility) within a standard waiting time (e.g., 8 min, due to its smaller size, motion abilities, road infrastructure, etc.) before a van ambulance reaches the demand, provides full emergency services, and takes the injured to a hospital. This fast response greatly reduces the number of casualties in severe accidents before the arrival of the van ambulances with more complete medical equipment. It is worth knowing, as an example, that in the last few years, 200 motorlances have been purchased and added to Tehran’s emergency fleet, and DtDsM could be a good suggestion for determining their locations (see [10] for a survey on optimization models in emergency services).

We now consider the importance of application of an efficient solution algorithm for dealing with the model. Due to the fact that in real-life applications the number of demand points and the facilities could be very large, the solution procedure of an MCLP may be a time-consuming task. On the other hand, the demand for facilities may fluctuate throughout a planning horizon such as a day, depending on the time of day.

A scenario-based model was presented in [11] to dynamically allocate the demands to the facilities in order to minimize the expected total costs under a finite set of scenarios. Also, a multi-period model was proposed in [12] which considers different time intervals (within a time horizon) in which significant changes in the demand pattern occur. Their model aims to determine the minimum number of facilities and their locations for each time interval while meeting the minimum expected coverage requirement at each interval. Another multi-period location and relocation model was presented in [13].

However, apart from predictable changes in the parameters (e.g., the amount of demand at the points during different time intervals), there are some irregular events (such as exhibitions, tournaments, ceremonies, etc.) that may daily or even hourly affect the amount of demand and the distribution of traffic in the network and consequently change the travel times and coverage regions of the facilities (i.e., \({I}_{i}\), \({c}_{ij}^{v}\), and \({c}_{ij}^{m}\) in DtDsM are subject to temporary changes). Therefore, an unpredictable need to relocate facilities during a time horizon is inevitable.

Actually, the parameters of the facility location problems are very sensitive to the above-mentioned irregular events and can widely vary from one case to another. Hence, developing fast solution algorithms to reset and resolve large-scale facility location problems in the case of irregular changes is very important. As one can see in Sect. 4 (Numerical Results), the time required to directly solve such problems by commonly used solvers (e.g., CPLEX) seriously increases as the problem size increases. Therefore, application of more efficient algorithms such as decomposition techniques is inevitable.

The Benders decomposition (BD) method is one of the efficient decomposition methods, based on a row generation strategy, for solving complicated mixed-integer problems, and it has been successfully applied to location problems under certainty ([14,15,16]) and under uncertainty ([17, 18]). In this paper, we propose an accelerated BD algorithm to efficiently solve DtDsM. For a thorough review of the method, the reader is referred to [19], while the most recent developments of the Benders algorithm are described in [20].

To summarize, the contribution of this paper is twofold: presenting an extension of the maximal covering location model which can handle some practical situations for which the previous models cannot be directly applied and an accelerated version of the well-known Benders decomposition algorithm for solving the proposed model.

The rest of the paper is organized as follows: mathematical formulation of DtDsM is presented in Sect. 2. In Sect. 3, an accelerated Benders decomposition algorithm is proposed to efficiently solve DtDsM. Numerical results are reported in Sect. 4 and finally, Sect. 5 briefly concludes the paper.

2 Mathematical Model

In this section, we propose a 0-1 integer linear programming problem to mathematically express DtDsM.

Parameters of the model:

\(I:\) the set of demand points;

\(J:\) the set of potential facility points;

\({d}_{ij}:\) the travel distance from potential facility point \(j\) to demand point \(i\);

\({I}_{i}:\) the importance level of demand point \(i\);

\({p}_{v}:\) the cost of locating a normal facility;

\({p}_{m}:\) the cost of locating a backup facility;

\({c}_{ij}^{v}:\) equals 1 if demand point \(i\) is within the coverage distance of a normal facility located at point \(j\), and 0 otherwise;

\({c}_{ij}^{m}:\) equals 1 if demand point \(i\) is within the coverage distance of a backup facility located at point \(j\), and 0 otherwise;

\(w:\) an appropriate weight;

\(B:\) the total available budget.

Variables of the model:

\({x}_{j}^{v}:\) a binary variable which equals 1 iff (if and only if) a normal facility is located at potential point \(j\)

\({x}_{j}^{m}:\) a binary variable which equals 1 iff a backup facility is located at potential point \(j\)

\({y}_{ij}^{v}:\) a binary variable which equals 1 iff the demand point \(i\) is allocated to a normal facility located at point \(j\)

\({y}_{ij}^{m}:\) a binary variable which equals 1 iff the demand point \(i\) is allocated to a backup facility located at point \(j\)
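For readers who prefer code, the parameters above can be gathered in a small container; the following Python sketch is purely illustrative (all names, such as DtDsMData, are ours and not from the paper):

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class DtDsMData:
    """Illustrative container for the DtDsM parameters (names are ours)."""
    I: List[int]                          # demand points
    J: List[int]                          # potential facility points
    d: Dict[Tuple[int, int], float]       # d[i, j]: travel distance from site j to demand i
    imp: Dict[int, float]                 # imp[i]: importance level I_i
    cv: Dict[Tuple[int, int], int]        # cv[i, j] = 1 if i is covered by a normal facility at j
    cm: Dict[Tuple[int, int], int]        # cm[i, j] = 1 if i is covered by a backup facility at j
    pv: float                             # cost of locating a normal facility
    pm: float                             # cost of locating a backup facility
    B: float                              # total available budget
    w: float                              # weight of the secondary objective
```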

Now, using the above notation, the proposed model (DtDsM) can be presented as follows:

$$\left({\varvec{P}}\right): \mathbf{M}\mathbf{a}\mathbf{x}\mathbf{i}\mathbf{m}\mathbf{i}\mathbf{z}\mathbf{e} \ \Sigma_{i\in I}\Sigma_{j\in J}{I}_{i}{c}_{ij}^{v}{y}_{ij}^{v}-w\Sigma_{i\in I}\Sigma_{j\in J}{I}_{i}\left(1-{c}_{ij}^{v}\right){d}_{ij}{y}_{ij}^{v}$$
(1)

subject to

$$\Sigma_{j\in J}{y}_{ij}^{v}=1 \qquad i\in I$$
(2)
$${y}_{ij}^{v}-{x}_{j}^{v}\le 0 \qquad i\in I, \ j\in J$$
(3)
$${y}_{ij}^{m}-{x}_{j}^{m}\le 0 \qquad i\in I, \ j\in J$$
(4)
$$\Sigma_{j\in J}{c}_{ij}^{v}{y}_{ij}^{v}+\Sigma_{j\in J}{c}_{ij}^{m}{y}_{ij}^{m}=1 \qquad i\in I$$
(5)
$$\Sigma_{j\in J}{p}_{v}{x}_{j}^{v}+\Sigma_{j\in J}{p}_{m}{x}_{j}^{m}\le B$$
(6)
$${y}_{ij}^{v},{y}_{ij}^{m}\in \left\{0, 1\right\} \qquad i\in I, \ j\in J$$
(7)
$${x}_{j}^{v},{x}_{j}^{m}\in \left\{\mathrm{0,1}\right\} \qquad j\in J$$
(8)

The above model is a bi-objective model which maximizes the amount of demand served, within a standard time, by normal facilities while simultaneously trying to minimize the total distance between the remaining demands and the normal facilities to which they are allocated. By appropriately choosing the weight \(w\), the first part can be treated as the primary objective, and DtDsM becomes a preemptive priority bi-objective programming problem (a suitable value for \(w\) is proposed in Sect. 4.1). Furthermore, the second part of (1) can be replaced by a latency cost function [21].

Constraints (2) ensure that each demand point is (in any case) allocated to a normal facility. Constraints (3) and (4) ensure that if some demand points are allocated to a facility located at point \(j\), then the facility must be open (i.e., \({x}_{j}^{v}=1\) or \({x}_{j}^{m}=1\)). Constraints (5) support, by a backup facility, the demand points which are not within the coverage distance of any normal facility. Similar to [9], DtDsM imposes no restriction on the total number of opened facilities. However, the available budget for opening the facilities is limited by a predefined value \(B\), as expressed by constraint (6). Finally, constraints (7) and (8) declare the type of the decision variables.
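As an illustration only, the model (1)–(8) can be written with the open-source PuLP modeler (the experiments in Sect. 4 use CPLEX; the toy data below are placeholders chosen just to make the script run):

```python
import pulp

# --- toy placeholder data (two demand points, three candidate sites) ---
I = [0, 1]                     # demand points
J = [0, 1, 2]                  # candidate facility sites
imp = {0: 3, 1: 1}             # importance levels I_i
d = {(i, j): abs(i - j) + 1 for i in I for j in J}      # travel distances d_ij
cv = {(i, j): int(d[i, j] <= 1) for i in I for j in J}  # normal-facility coverage c^v_ij
cm = {(i, j): int(d[i, j] <= 2) for i in I for j in J}  # backup-facility coverage c^m_ij
pv, pm, B, w = 5.0, 2.0, 9.0, 1e-3

prob = pulp.LpProblem("DtDsM", pulp.LpMaximize)
xv = pulp.LpVariable.dicts("xv", J, cat="Binary")
xm = pulp.LpVariable.dicts("xm", J, cat="Binary")
yv = pulp.LpVariable.dicts("yv", [(i, j) for i in I for j in J], cat="Binary")
ym = pulp.LpVariable.dicts("ym", [(i, j) for i in I for j in J], cat="Binary")

# objective (1): covered importance minus the weighted distance penalty
prob += pulp.lpSum(imp[i] * cv[i, j] * yv[i, j] for i in I for j in J) \
        - w * pulp.lpSum(imp[i] * (1 - cv[i, j]) * d[i, j] * yv[i, j]
                         for i in I for j in J)

for i in I:
    prob += pulp.lpSum(yv[i, j] for j in J) == 1                      # (2)
    prob += pulp.lpSum(cv[i, j] * yv[i, j] + cm[i, j] * ym[i, j]
                       for j in J) == 1                               # (5)
    for j in J:
        prob += yv[i, j] <= xv[j]                                     # (3)
        prob += ym[i, j] <= xm[j]                                     # (4)
prob += pulp.lpSum(pv * xv[j] + pm * xm[j] for j in J) <= B           # (6)

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[prob.status], pulp.value(prob.objective))
```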

3 An Accelerated Benders Decomposition Algorithm

In the following, we adjust the problem (\(\mathrm{P}\)) so that the Benders decomposition algorithm can be applied to solve it by iteratively solving finitely many easier subproblems. More precisely, considering \({x}_{j}^{v}\) and \({x}_{j}^{m}\) (\(j\in J\)) as the complicating variables, we temporarily fix them at some binary values \({\stackrel{-}{x}}_{j}^{v}\) and \({\stackrel{-}{x}}_{j}^{m}\) satisfying the following conditions:

$$\Sigma_{j \in J}{p}_{v}{x}_{j}^{v}+\Sigma_{j\in J}{p}_{m}{x}_{j}^{m}\le B$$
(9)
$$\Sigma_{j\in J}{x}_{j}^{v}\ge 1$$
(10)

and we solve the remaining subproblem. Constraint (10) is a valid inequality implied by (2) and (3). Then, we check certain termination criteria. If none of them is satisfied, some Benders cuts are generated to further restrict the set of feasible selections of \({x}_{j}^{v}\) and \({x}_{j}^{m}\), new feasible values for these complicating variables are selected, and this process is repeated.

3.1 Primal Subproblem

By temporarily fixing the complicating variables \({x}_{j}^{v}\) and \({x}_{j}^{m}\) at \({\stackrel{-}{x}}_{j}^{v}\) and \({\stackrel{-}{x}}_{j}^{m}\), respectively, the relevant so-called primal subproblem is

$$\left(\mathbf{P}\mathbf{S}\mathbf{P}\right): \mathbf{M}\mathbf{a}\mathbf{x}\mathbf{i}\mathbf{m}\mathbf{i}\mathbf{z}\mathbf{e} \ \Sigma_{i\in I}\Sigma_{j\in J}{I}_{i}{c}_{ij}^{v}{y}_{ij}^{v}-w\Sigma_{i\in I}\Sigma_{j\in J}{I}_{i}\left(1-{c}_{ij}^{v}\right){d}_{ij}{y}_{ij}^{v}$$
(11)

subject to

$$\Sigma_{j\in J}{y}_{ij}^{v}=1 \qquad i\in I$$
(12)
$${y}_{ij}^{v}\le {\stackrel{-}{x}}_{j}^{v} \qquad i\in I, \ j\in J$$
(13)
$${y}_{ij}^{m}\le {\stackrel{-}{x}}_{j}^{m} \qquad i\in I, j\in J$$
(14)
$$\Sigma_{j\in J}{c}_{ij}^{v}{y}_{ij}^{v}+\Sigma_{j\in J}{c}_{ij}^{m}{y}_{ij}^{m}=1 \qquad i\in I$$
(15)
$${y}_{ij}^{v},{y}_{ij}^{m}\in \left\{0, 1\right\} \qquad i\in I, j\in J$$
(16)

To apply the classical Benders decomposition method, we need the dual of the subproblem \((\mathrm{PSP})\) with no duality gap. However, \((\mathrm{PSP})\) has binary variables, so the standard duality theory of linear programming cannot be directly applied (for previous works on Benders decomposition with integer subproblems, refer to [22] and the related works cited therein). Furthermore, the subproblem must be solved in each iteration of the Benders decomposition algorithm, and hence, its solution time is critical in the solution process of the original problem (\(\mathrm{P}\)). To tackle these difficulties, we first note that the subproblem \((\mathrm{PSP})\) is separable on \(i\), so we can solve it more easily and quickly by decomposing it into \(|I|\) independent subproblems which can be solved in parallel. We therefore consider the subproblem corresponding to a demand point \(i\) (\(i\in I\)):

$$\left(\mathbf{P}\mathbf{S}{\mathbf{P}}_{\mathbf{i}}\right): \mathbf{M}\mathbf{a}\mathbf{x}\mathbf{i}\mathbf{m}\mathbf{i}\mathbf{z}\mathbf{e} \ \Sigma_{j\in J}{I}_{i}{c}_{ij}^{v}{y}_{ij}^{v}-w\Sigma_{j\in J}{I}_{i}\left(1-{c}_{ij}^{v}\right){d}_{ij}{y}_{ij}^{v}$$
(17)

subject to

$$\Sigma_{j\in J}{y}_{ij}^{v}=1$$
(18)
$${y}_{ij}^{v}\le {\stackrel{-}{x}}_{j}^{v} \qquad j\in J$$
(19)
$${y}_{ij}^{m}\le {\stackrel{-}{x}}_{j}^{m} \qquad j\in J$$
(20)
$$\Sigma_{j\in J}{c}_{ij}^{m}{y}_{ij}^{m}+\Sigma_{j\in J}{c}_{ij}^{v}{y}_{ij}^{v}=1$$
(21)
$${y}_{ij}^{v},{y}_{ij}^{m}\in \left\{\mathrm{0,1}\right\} \qquad j\in J$$
(22)

Regarding (19) and (20), and since \({\stackrel{-}{x}}_{j}^{v}\) and \({\stackrel{-}{x}}_{j}^{m}\) are binary for all \(j\in J\), the upper bounds on the variables are already implied, and we can replace constraints (22) by the following nonnegative-integrality constraints:

$${y}_{ij}^{v},{y}_{ij}^{m}\ge 0 \ \mathrm{and \ integer} \qquad j\in J$$
(23)

Now, we note that the coefficient matrix of (\({\mathrm{PSP}}_{\mathrm{i}}\)) has the consecutive ones property for rows (after a suitable reordering of the columns), and hence, it is a totally unimodular matrix (TUM). Also, one can easily see that appending an additional unit vector \({e}_{k}\) to a TUM yields a TUM, and hence, the coefficient matrix of the standard form of (\({\mathrm{PSP}}_{\mathrm{i}}\)) is also a TUM. But a (standard) linear programming problem with a totally unimodular coefficient matrix has at least one integer optimal solution for any objective vector and any integer right-hand side vector (see [23] and [24] for more details).
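The total-unimodularity claim can be spot-checked numerically on a toy instance; the sketch below (illustrative only, with an arbitrary coverage pattern of our choosing) builds the coefficient matrix of \(({\mathrm{PSP}}_{\mathrm{i}})\) and verifies that randomly sampled square submatrices have determinants in \(\{-1,0,1\}\):

```python
import random
import numpy as np

# toy coverage pattern for one demand point i over |J| = 4 sites
cv = [1, 1, 0, 0]   # c^v_ij
cm = [0, 1, 1, 1]   # c^m_ij
nJ = len(cv)

# columns: y^v_{i1..i|J|} followed by y^m_{i1..i|J|}
rows = []
rows.append([1] * nJ + [0] * nJ)                     # (18): sum_j y^v_ij = 1
for j in range(nJ):                                  # (19): y^v_ij <= xbar^v_j
    rows.append([int(k == j) for k in range(nJ)] + [0] * nJ)
for j in range(nJ):                                  # (20): y^m_ij <= xbar^m_j
    rows.append([0] * nJ + [int(k == j) for k in range(nJ)])
rows.append(cv + cm)                                 # (21): sum_j c^v y^v + c^m y^m = 1
A = np.array(rows)

random.seed(0)
for _ in range(2000):
    k = random.randint(1, min(A.shape))
    ri = random.sample(range(A.shape[0]), k)
    ci = random.sample(range(A.shape[1]), k)
    det = round(np.linalg.det(A[np.ix_(ri, ci)]))
    assert det in (-1, 0, 1), "not totally unimodular"
print("all sampled square submatrices have determinant in {-1, 0, 1}")
```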

Therefore, we can relax the integrality restrictions (23) to obtain the following equivalent relaxation of \(({\mathrm{PSP}}_{\mathrm{i}})\) which is, in fact, a (continuous) linear programming problem:

$$\left(\mathbf{R}\mathbf{P}\mathbf{S}{\mathbf{P}}_{\mathbf{i}}\right): \mathbf{M}\mathbf{a}\mathbf{x}\mathbf{i}\mathbf{m}\mathbf{i}\mathbf{z}\mathbf{e} \ (17)$$
(24)

subject to

$$(18) - (21)$$
(25)
$${y}_{ij}^{v},{y}_{ij}^{m}\ge 0 \qquad j\in J$$
(26)

According to the above discussion, we have actually shown that we can study the relaxed problem \(\left({\mathrm{RPSP}}_{\mathrm{i}}\right)\) instead of the 0-1 integer problem \(\left({\mathrm{PSP}}_{\mathrm{i}}\right)\). In other words, instead of the integer subproblem \(\left(\mathrm{PSP}\right)\), we can work with much smaller and easier (continuous) linear programming problems which have nice duality properties.

3.2 Dual Subproblem

We associate the dual variable \({\lambda }_{i1}\) with constraint (18), \({\beta }_{ij}\) (\(j\in J\)) with constraints (19), \({\gamma }_{ij}\) (\(j\in J\)) with constraints (20), and \({\lambda }_{i2}\) with constraint (21), and obtain the dual of \(\left({\mathrm{RPSP}}_{\mathrm{i}}\right)\) corresponding to the demand site \(i\):

$$\left(\mathbf{D}\mathbf{S}{\mathbf{P}}_{\mathbf{i}}\right): \mathbf{M}\mathbf{i}\mathbf{n}\mathbf{i}\mathbf{m}\mathbf{i}\mathbf{z}\mathbf{e} \ {\lambda }_{i1}+{\lambda }_{i2}+\Sigma_{j\in J}{\stackrel{-}{x}}_{j}^{v}{\beta }_{ij}+\Sigma_{j\in J}{\stackrel{-}{x}}_{j}^{m}{\gamma }_{ij}$$
(27)

Subject to

$${\lambda }_{i1}+{c}_{ij}^{v}{\lambda }_{i2}+{\beta }_{ij}\ge {I}_{i}{c}_{ij}^{v}-w{I}_{i}\left(1-{c}_{ij}^{v}\right){d}_{ij} \qquad j\in J$$
(28)
$${c}_{ij}^{m}{\lambda }_{i2}+{\gamma }_{ij}\ge 0 \qquad j\in J$$
(29)
$$\beta_{ij}, \gamma_{ij} \geq 0 \qquad j \in J$$
(30)

Then, the following problem can be considered as the dual of \(\left(\mathrm{PSP}\right)\) when \({x}_{j}^{v}\) and \({x}_{j}^{m}\) are fixed at \({\stackrel{-}{x}}_{j}^{v}\) and \({\stackrel{-}{x}}_{j}^{m}\), respectively:

$$\left(\mathbf{D}\mathbf{S}\mathbf{P}\right): \mathbf{M}\mathbf{i}\mathbf{n}\mathbf{i}\mathbf{m}\mathbf{i}\mathbf{z}\mathbf{e} \ \Sigma_{i\in I}{\lambda }_{i1}+\Sigma_{i\in \mathrm{I}}{\lambda }_{i2}+\Sigma_{i\in I}\Sigma_{j\in J}{\stackrel{-}{x}}_{j}^{v}{\beta }_{ij}+\Sigma_{i\in I}\Sigma_{j\in J}{\stackrel{-}{x}}_{j}^{m}{\gamma }_{ij}$$
(31)

Subject to

$${\lambda }_{i1}+{c}_{ij}^{v}{\lambda }_{i2}+{\beta }_{ij}\ge {I}_{i}{c}_{ij}^{v}-w{I}_{i}\left(1-{c}_{ij}^{v}\right){d}_{ij} \qquad i\in I, \ j\in J$$
(32)
$${c}_{ij}^{m}{\lambda }_{i2}+{\gamma }_{ij}\ge 0 \qquad i\in I, \ j\in J$$
(33)
$${\beta }_{ij}, {\gamma }_{ij}\ge 0 \qquad i\in I, \ j\in J$$
(34)
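As a sanity check (not part of the paper's algorithm), one can verify numerically on a toy instance that \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) is indeed the linear programming dual of \(\left({\mathrm{RPSP}}_{\mathrm{i}}\right)\), for example with scipy.optimize.linprog; all data below are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

# toy data for one demand point i (all values illustrative)
cv = np.array([1, 0, 0])          # c^v_ij
cm = np.array([1, 1, 0])          # c^m_ij
d = np.array([1.0, 2.0, 3.0])     # d_ij
Ii, w = 2.0, 0.01                 # importance level I_i and weight w
xv = np.array([0, 1, 0])          # fixed xbar^v_j
xm = np.array([1, 0, 0])          # fixed xbar^m_j
n = len(cv)
q = Ii * cv - w * Ii * (1 - cv) * d    # objective coefficients of y^v in (17)

# primal (RPSP_i): maximize q.y^v subject to (18)-(21), y >= 0
c_p = -np.concatenate([q, np.zeros(n)])                         # linprog minimizes
A_ub_p = np.vstack([np.hstack([np.eye(n), np.zeros((n, n))]),   # (19)
                    np.hstack([np.zeros((n, n)), np.eye(n)])])  # (20)
b_ub_p = np.concatenate([xv, xm])
A_eq_p = np.vstack([np.concatenate([np.ones(n), np.zeros(n)]),  # (18)
                    np.concatenate([cv, cm])])                  # (21)
b_eq_p = np.array([1.0, 1.0])
res_p = linprog(c_p, A_ub=A_ub_p, b_ub=b_ub_p, A_eq=A_eq_p, b_eq=b_eq_p,
                bounds=[(0, None)] * (2 * n), method="highs")

# dual (DSP_i): variables z = (lambda_i1, lambda_i2, beta_1..n, gamma_1..n)
c_d = np.concatenate([[1.0, 1.0], xv, xm])                      # objective (27)
A_ub_d, b_ub_d = [], []
for j in range(n):                                              # (28) written as <=
    row = np.zeros(2 + 2 * n)
    row[0], row[1], row[2 + j] = -1.0, -cv[j], -1.0
    A_ub_d.append(row); b_ub_d.append(-q[j])
for j in range(n):                                              # (29) written as <=
    row = np.zeros(2 + 2 * n)
    row[1], row[2 + n + j] = -cm[j], -1.0
    A_ub_d.append(row); b_ub_d.append(0.0)
bounds_d = [(None, None), (None, None)] + [(0, None)] * (2 * n)
res_d = linprog(c_d, A_ub=np.array(A_ub_d), b_ub=np.array(b_ub_d),
                bounds=bounds_d, method="highs")

print("primal optimum:", -res_p.fun)
print("dual optimum:  ", res_d.fun)
```

Strong duality makes the two printed optima coincide; for this particular \(\bar{x}\), both equal \(-w{I}_{i}{d}_{i\stackrel{-}{j}}=-0.04\).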

3.3 Master Problem and the Benders Iterations

Let \({S}_{i}^{P}\) and \({S}_{i}^{D}\) be the set of all extreme points and the set of all extreme directions of the feasible region of \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\), respectively, while \({S}^{P}\) and \({S}^{D}\) are the set of all extreme points and the set of all extreme directions of the feasible region of \(\left(\mathrm{DSP}\right)\), respectively. Then, the original problem \(\left(\mathrm{P}\right)\) introduced by (1)–(8), can be represented as the following so-called Master Problem:

$$\left(\mathbf{M}\mathbf{P}\right): \mathbf{M}\mathbf{a}\mathbf{x}\mathbf{i}\mathbf{m}\mathbf{i}\mathbf{z}\mathbf{e} \ \theta$$
(35)

Subject to

$$\Sigma_{j\in J}{p}_{v}{x}_{j}^{v}+\Sigma_{j\in J}{p}_{m}{x}_{j}^{m}\le B$$
(36)
$$\Sigma_{j\in J}{x}_{j}^{v}\ge 1$$
(37)
$$\Sigma_{i\in I}\Sigma_{j\in J}{\stackrel{-}{\beta }}_{ij}{x}_{j}^{v}+\Sigma_{i\in I}\Sigma_{j\in J}{\stackrel{-}{\gamma }}_{ij}{x}_{j}^{m}\ge -\Sigma_{i\in I}\left({\stackrel{-}{\lambda }}_{i1}+{\stackrel{-}{\lambda }}_{i2}\right), {\left({\stackrel{-}{{\varvec{\lambda}}}}_{1},{\stackrel{-}{{\varvec{\lambda}}}}_{2},\stackrel{-}{{\varvec{\beta}}},\stackrel{-}{{\varvec{\gamma}}}\right)}^{T}\in {S}^{D}$$
(38)
$$\Sigma_{i\in I}\Sigma_{j\in J}{\stackrel{-}{\beta }}_{ij}{x}_{j}^{v}+\Sigma_{i\in I}\Sigma_{j\in J}{\stackrel{-}{\gamma }}_{ij}{x}_{j}^{m}-\theta \ge -\Sigma_{i\in I}\left({\stackrel{-}{\lambda }}_{i1}+{\stackrel{-}{\lambda }}_{i2}\right), {\left({\stackrel{-}{{\varvec{\lambda}}}}_{1},{\stackrel{-}{{\varvec{\lambda}}}}_{2},\stackrel{-}{{\varvec{\beta}}},\stackrel{-}{{\varvec{\gamma}}}\right)}^{T}\in {S}^{P}$$
(39)
$${x}_{j}^{v},{x}_{j}^{m}\in \left\{\mathrm{0,1}\right\} \qquad j\in J$$
(40)

where

$${{\varvec{\beta}}}_{i}=\left({\beta }_{i1},\dots ,{\beta }_{i\left|J\right|}\right), \ {\varvec{\beta}}=\left({{\varvec{\beta}}}_{1},\dots ,{{\varvec{\beta}}}_{\left|I\right|}\right)$$
$${{\varvec{\gamma}}}_{i}=\left({\gamma }_{i1},\dots ,{\gamma }_{i\left|J\right|}\right), \ {\varvec{\gamma}}=\left({{\varvec{\gamma}}}_{1},\dots ,{{\varvec{\gamma}}}_{\left|I\right|}\right)$$
$${{\varvec{\lambda}}}_{1}=\left({\lambda }_{11},\dots ,{\lambda }_{\left|I\right|1}\right), \ {{\varvec{\lambda}}}_{2}=\left({\lambda }_{12},\dots ,{\lambda }_{\left|I\right|2}\right)$$

The master problem involves only the complicating variables \({x}_{j}^{v}\) and \({x}_{j}^{m}\) and an extra auxiliary variable \(\theta\). Constraints (38) and (39) are referred to as Benders feasibility cuts and optimality cuts, respectively. Feasibility cuts ensure that the selected values for \({x}_{j}^{v}\) and \({x}_{j}^{m}\) do not result in an unbounded dual subproblem \(\left(\mathrm{DSP}\right)\) and consequently an infeasible primal subproblem \(\left(\mathrm{PSP}\right)\). On the other hand, optimality cuts ensure that \(\theta\) takes the minimum value of \(\left(\mathrm{DSP}\right)\), or equivalently the maximum value of \(\left(\mathrm{PSP}\right)\), for each feasible choice of \({x}_{j}^{v}\) and \({x}_{j}^{m}\).

However, not all of the feasibility and optimality cuts are known beforehand; they are iteratively generated and appended to the relaxed master problem \((\mathrm{RMP})\) when necessary. \((\mathrm{RMP})\) is a relaxed version of the master problem defined by (35)–(37), (40), and some (but not all) of the feasibility cuts (38) and the optimality cuts (39). Hence, \((\mathrm{RMP})\) provides an upper bound on the optimal value of the original problem. On the other hand, the optimal value of the primal subproblem \(\left(\mathrm{PSP}\right)\), or equivalently the optimal value of the dual subproblem \(\left(\mathrm{DSP}\right)\), for any feasible values of \({x}_{j}^{v}\) and \({x}_{j}^{m}\) provides a lower bound on the optimal value of the original problem. The Benders iterations continue until the difference between the best upper bound and the best lower bound is less than a predefined tolerance \(\epsilon\).

Note that a vector \({\left({{\varvec{\lambda}}}_{1},{{\varvec{\lambda}}}_{2},{\varvec{\beta}},{\varvec{\gamma}}\right)}^{T}\) is an extreme feasible point of \(\left(\mathrm{DSP}\right)\) if and only if \({\left({\lambda }_{i1},{\lambda }_{i2},{{\varvec{\beta}}}_{i},{{\varvec{\gamma}}}_{i}\right)}^{T}\) is an extreme feasible point of \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) for all \(i\in I\). Furthermore, a nonzero vector \({\left({{\varvec{\lambda}}}_{1},{{\varvec{\lambda}}}_{2},{\varvec{\beta}},{\varvec{\gamma}}\right)}^{T}\) is an extreme recession direction of \(\left(\mathrm{DSP}\right)\) if and only if \({({\lambda }_{{i}^{^{\prime}}1},{\lambda }_{{i}^{^{\prime}}2},{{\varvec{\beta}}}_{{i}^{^{\prime}}},{{\varvec{\gamma}}}_{{i}^{^{\prime}}} )}^{T}\) is an extreme recession direction of \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) for some \({i}^{^{\prime}}\in I\) and

$${\left({\lambda }_{i1},{\lambda }_{i2},{{\varvec{\beta}}}_{i},{{\varvec{\gamma}}}_{i}\right)}^{T}=0, \forall i\ne {i}^{^{\prime}}.$$

Therefore, if for some fixed values \({\stackrel{-}{x}}_{j}^{v}\) and \({\stackrel{-}{x}}_{j}^{m}\) there exists some \({I}^{^{\prime}}\subseteq I\) so that subproblems \(\left({\mathrm{PSP}}_{\mathrm{i}}\right)\) are infeasible (and consequently, \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) are unbounded) for each \(i\in {I}^{^{\prime}}\), then we can append to the master problem any arbitrary number of the following feasibility cuts:

$${\stackrel{-}{\lambda }}_{i1}+{\stackrel{-}{\lambda }}_{i2}+\Sigma_{j\in J}{\stackrel{-}{\beta }}_{ij}{x}_{j}^{v}+\Sigma_{j\in J}{\stackrel{-}{\gamma }}_{ij}{x}_{j}^{m}\ge 0 \qquad i\in {I}^{^{\prime}}$$
(41)

where \({\left({\stackrel{-}{\lambda }}_{i1},{\stackrel{-}{\lambda }}_{i2},{\stackrel{-}{{\varvec{\beta}}}}_{i},{\stackrel{-}{{\varvec{\gamma}}}}_{i}\right)}^{T}\) is an extreme recession direction of \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) for each \(i\in {I}^{^{\prime}}\). The set of directions of \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) can be precisely characterized by the set of all nonzero vectors \({\left({\lambda }_{i1},{\lambda }_{i2},{{\varvec{\beta}}}_{i},{{\varvec{\gamma}}}_{i}\right)}^{T}\) satisfying the following constraints (see Sect. 2.4 of [25]):

$${\lambda }_{i1}+{c}_{ij}^{v}{\lambda }_{i2}+{\beta }_{ij}\ge 0 \qquad j\in J$$
(42)
$${c}_{ij}^{m}{\lambda }_{i2}+{\gamma }_{ij}\ge 0 \qquad j\in J$$
(43)
$${\beta }_{ij}, {\gamma }_{ij}\ge 0 \qquad j\in J .$$
(44)

3.4 Solution of \(\left({\mathrm{PSP}}_{\mathrm{i}}\right)\) and \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\)

In this subsection, we show that the primal subproblems (\({\mathrm{RPSP}}_{\mathrm{i}}\)) and dual subproblems \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) are quite easy to handle for each \(i\in I\). In fact, one of the following cases may occur for this problem:

Case 1 There exists some \(\stackrel{-}{j}\in J\) such that \({x}_{\stackrel{-}{j}}^{v}={c}_{i\stackrel{-}{j}}^{v}=1\).

In this case, the demand point \(i\) can be optimally allocated to the normal facility located at \(\stackrel{-}{j}\) since \(i\) is within its coverage distance. The optimal value is \({I}_{i}\) and

$$\begin{cases}{\lambda }_{i1}=0, \quad {\lambda }_{i2}={I}_{i}, \\ {\beta }_{ij}=0 \ \mathrm{and} \ {\gamma }_{ij}=0 \qquad \forall j\in J\end{cases}$$
(45)

presents an optimal solution of the dual subproblem \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) in this case. But, if \(\stackrel{-}{j}\) is the unique index with the property that \({x}_{\stackrel{-}{j}}^{v}={c}_{i\stackrel{-}{j}}^{v}=1\), then another (alternative) optimal solution of the dual subproblem \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) can be obtained as follows:

$$\begin{aligned}\begin{cases}{\lambda }_{i1} =0, &\quad { \lambda }_{i2}=0, \\ {\gamma }_{ij} =0 &\quad \forall j\in J, \\ {\beta }_{ij} ={I}_{i} &\quad \forall j\in {J}_{i}^{v} \ \mathrm{and} \ {\beta }_{ij}=0 \quad \forall j\in J\setminus {J}_{i}^{v} \end{cases}\end{aligned}$$
(46)

where \({J}_{i}^{v}:=\left\{j\in J:{c}_{ij}^{v}=1\right\}\).

Proposition 1 If \(\stackrel{-}{j}\) is the unique index with the property that \({x}_{\stackrel{-}{j}}^{v}={c}_{i\stackrel{-}{j}}^{v}=1\), then the optimal solution defined by (46) is an extreme optimal solution of \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\).

Proof By the well-known Shapiro’s variable transformation, the unrestricted (free) variables \({\lambda }_{i1}\) and \({\lambda }_{i2}\) can be replaced by \({\lambda }_{i1}^{+}-{\lambda }_{i1}^{-}\) and \({\lambda }_{i2}^{+}-{\lambda }_{i2}^{-}\), respectively, where \({\lambda }_{i1}^{+}\), \({\lambda }_{i1}^{-}\), \({\lambda }_{i2}^{+}\), and \({\lambda }_{i2}^{-}\) are non-negative variables. Then, \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) is a linear programming (LP) problem with \(2\left|J\right|+4\) non-negative variables. In fact, \({\lambda }_{i1}=0\) is equivalent to \({\lambda }_{i1}^{+}={\lambda }_{i1}^{-}=0\), and \({\lambda }_{i2}=0\) is equivalent to \({\lambda }_{i2}^{+}={\lambda }_{i2}^{-}=0\). Now, we see that the following constraints provide \(2\left|J\right|+4\) linearly independent constraints of \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) binding at (46):

$${\lambda }_{i1}^{+},{\lambda }_{i1}^{-},{\lambda }_{i2}^{+},{\lambda }_{i2}^{-}\ge 0,$$
$${\gamma }_{ij}\ge 0 \qquad j\in J,$$
$${\beta }_{ij}\ge 0 \qquad j\in J\setminus {J}_{i}^{v},$$
$${\lambda }_{i1}+{c}_{ij}^{v}{\lambda }_{i2}+{\beta }_{ij}\ge {I}_{i}{c}_{ij}^{v}-w{I}_{i}\left(1-{c}_{ij}^{v}\right){d}_{ij} \qquad \mathrm{for} \ j\in {J}_{i}^{v}.$$

This completes the proof according to the simple characterization theorem for extreme points of polyhedral sets.

Remark In our solution algorithm, we prefer (due to preliminary numerical results) to use (46) as an optimal solution of \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) if possible. However, we use (45) in the case that (46) cannot be used (i.e. whenever \(\stackrel{-}{j}\) is not the unique index with the property that \({x}_{ij}^{v}={c}_{ij}^{v}=1\)). By the following proposition, we show that (45) is also an extreme point.

Proposition 2 The optimal solution defined by (45) is an extreme optimal solution of \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\).

Proof Using the same notation as in the proof of Proposition 1, \({\lambda }_{i1}=0\) is equivalent to \({\lambda }_{i1}^{+}={\lambda }_{i1}^{-}=0\), and \({\lambda }_{i2}={I}_{i}\) is equivalent to \({\lambda }_{i2}^{+}={I}_{i}\) and \({\lambda }_{i2}^{-}=0\). Now, we see that the following constraints provide \(2\left|J\right|+4\) linearly independent constraints of \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) binding at (45):

$${\lambda }_{i1}^{+},{\lambda }_{i1}^{-},{\lambda }_{i2}^{-}\ge 0,$$
$${\gamma }_{ij}\ge 0 \qquad j\in J,$$
$${\beta }_{ij}\ge 0 \qquad j\in J,$$
$${\lambda }_{i1}+{c}_{ij}^{v}{\lambda }_{i2}+{\beta }_{ij}\ge {I}_{i}{c}_{ij}^{v}-w{I}_{i}\left(1-{c}_{ij}^{v}\right){d}_{ij} \qquad \mathrm{for} \ j=\stackrel{-}{j}.$$

This completes the proof according to the simple characterization theorem for extreme points of polyhedral sets.

Case 2 For each \(j\in J\), \({c}_{ij}^{v}=1\) implies \({x}_{j}^{v}=0\).

In this case, the demand point \(i\) is not within the coverage distance of any of normal facilities. This case can itself be divided into the following two subcases:

Subcase 2.1 For each \(j\in J\), \({c}_{ij}^{v}=1\) implies \({x}_{j}^{v}=0\) and \({c}_{ij}^{m}=1\) implies \({x}_{j}^{m}=0\).

This case occurs when the demand point \(i\) is within the coverage distance of neither any normal facility nor any backup facility. Hence, in this case, \(\left({\mathrm{RPSP}}_{\mathrm{i}}\right)\) is infeasible, which is obvious due to (19)–(21). On the other hand, the dual subproblem \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) is unbounded in this case and clearly

$$\begin{cases}{\lambda }_{i1}=0, \quad {\lambda }_{i2}=-1, \\ {\beta }_{ij}=1 \quad \forall j\in {J}_{i}^{v}, \quad {\beta }_{ij}=0 \quad \forall j\in J\backslash {J}_{i}^{v},\\ {\gamma }_{ij}=1 \quad \forall j\in {J}_{i}^{m}, \quad {\gamma }_{ij}=0 \quad \forall j\in J\backslash {J}_{i}^{m}\end{cases}$$
(47)

is a recession direction for \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\), where:

$${J}_{i}^{v}=\left\{j\in J:{c}_{ij}^{v}=1\right\} \ \mathrm{and} \ {J}_{i}^{m}=\{j\in J:{c}_{ij}^{m}=1\}$$
(48)

Proposition 3 The direction defined by (47) is an extreme recession direction for \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\).

Proof  It is sufficient to show that if there exist two directions \({\left({\lambda^{\prime}}_{i1},{\lambda^{\prime}}_{i2},{{\varvec{\beta}}}_{i}^{^{\prime}},{{\varvec{\gamma}}}_{i}^{^{\prime}}\right)}^{T}\) and \({\left({{\lambda }^{^{\prime\prime} }}_{i1},{{\lambda }^{^{\prime\prime} }}_{i2},{{\varvec{\beta}}}_{i}^{^{\prime\prime} },{{\varvec{\gamma}}}_{i}^{^{\prime\prime} }\right)}^{T}\) of the feasible set of \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) such that

$${\left({\lambda }_{i1},{\lambda }_{i2},{{\varvec{\beta}}}_{i},{{\varvec{\gamma}}}_{i}\right)}^{T}={\left({\lambda }_{i1}^{^{\prime}},{\lambda }_{i2}^{^{\prime}},{{\varvec{\beta}}}_{i}^{\boldsymbol{^{\prime}}},{{\varvec{\gamma}}}_{i}^{\boldsymbol{^{\prime}}}\right)}^{T}+{\left({\lambda }_{i1}^{^{\prime\prime} },{\lambda }_{i2}^{^{\prime\prime} },{{\varvec{\beta}}}_{i}^{^{\prime\prime} },{{\varvec{\gamma}}}_{i}^{^{\prime\prime} }\right)}^{T},$$

then these two directions are both positive multiples of \({\left({\lambda }_{i1},{\lambda }_{i2},{{\varvec{\beta}}}_{i},{{\varvec{\gamma}}}_{i}\right)}^{T}\). First, note that for each \(j\in J\), if \({c}_{ij}^{m}=0\), then \({\gamma }_{ij}=0\) and since \({\gamma }_{ij}^{^{\prime}},{\gamma }_{ij}^{{^{\prime}}{^{\prime}}}\ge 0\), we have \({\gamma }_{ij}^{{^{\prime}}}={\gamma }_{ij}^{{^{\prime}}{^{\prime}}}=0\). Also, if \({c}_{ij}^{m}=1\) then \({\gamma }_{ij}=1\) and according to (43), we have:

$$\left\{\begin{array}{c}{\lambda }_{i2}^{^{\prime}}+{\gamma }_{ij}^{^{\prime}}\ge 0 \Rightarrow {\lambda }_{i2}^{^{\prime}}\ge -{\gamma }_{ij}^{^{\prime}}\\ {\lambda }_{i2}^{^{\prime\prime} }+{\gamma }_{ij}^{^{\prime\prime} }\ge 0 \Rightarrow {\lambda }_{i2}^{^{\prime\prime} }\ge -{\gamma }_{ij}^{^{\prime\prime} }\end{array}\right. \mathop\Rightarrow^{{{{-1={\lambda }_{i2}=\lambda }_{i2}^{^{\prime}}+{\lambda }_{i2}^{^{\prime\prime} }\ge -\left({\gamma }_{ij}^{^{\prime}}+{\gamma }_{ij}^{^{\prime\prime} }\right)=-{\gamma }_{ij}=-1} }} \left\{\begin{array}{c} {\lambda }_{i2}^{^{\prime}}=-{\gamma }_{ij}^{^{\prime}}\\ {\lambda }_{i2}^{^{\prime\prime} }=-{\gamma }_{ij}^{^{\prime\prime} }\end{array}\right.$$
$$\mathop\Rightarrow^{{{{\lambda }_{i2}=-1 \& {\gamma }_{ij}=1} }} \frac{{\lambda }_{i2}^{^{\prime}}}{{\lambda }_{i2}}=\frac{{\gamma }_{ij}^{^{\prime}}}{{\gamma }_{ij}}, \frac{{\lambda }_{i2}^{^{\prime\prime} }}{{\lambda }_{i2}}=\frac{{\gamma }_{ij}^{^{\prime\prime} }}{{\gamma }_{ij}}.$$
(49)

On the other hand, we know that \({J}_{i}^{v}\ne J\) because otherwise, according to the assumptions of Subcase 2.1, we would have \({x}_{j}^{v}=0\) for all \(j\in J\), which contradicts the valid inequality (37). Therefore, there exists some \(\stackrel{-}{j}\in J\setminus {J}_{i}^{v}\) for which \({\beta }_{i\stackrel{-}{j}}=0\). Hence, we have:

$$\begin{aligned}{\beta }_{i\stackrel{-}{j}}=0\mathop\Rightarrow^{{{\beta }_{i\stackrel{-}{j}}={\beta }_{i\stackrel{-}{j}}^{^{\prime}}+{\beta }_{i\stackrel{-}{j}}^{^{\prime\prime} } \& {\beta }_{i\stackrel{-}{j}}^{^{\prime}},{\beta }_{i\stackrel{-}{j}}^{^{\prime\prime} }\ge 0}}{\beta }_{i\stackrel{-}{j}}^{^{\prime}}={\beta }_{i\stackrel{-}{j}}^{^{\prime\prime} }=0\mathop\Rightarrow^{{{{c}_{i\stackrel{-}{j}}^{v}=0 \& \left({\lambda }_{i1}+{c}_{ij}^{v}{\lambda }_{i2}+{\beta }_{ij}\ge 0 j\in J\right) }}}{\lambda }_{i1}^{^{\prime}},{\lambda }_{i1}^{^{\prime\prime} }\ge 0 \\\mathop\Rightarrow^{{0={\lambda }_{i1}={\lambda }_{i1}^{^{\prime}}+{\lambda }_{i1}^{^{\prime\prime} }}}{\lambda }_{i1}^{^{\prime}}={\lambda }_{i1}^{^{\prime\prime} }=0.\end{aligned}$$
(50)

Now, for each \(j\in J\), there exist two cases: \({c}_{ij}^{v}=0\) or \({c}_{ij}^{v}=1\). If \({c}_{ij}^{v}=0\), then \({\beta }_{ij}=0\) and since \({\beta }_{ij}={\beta }_{ij}^{{^{\prime}}}+{\beta }_{ij}^{^{\prime\prime} }\) and \({\beta }_{ij}^{{^{\prime}}},{\beta }_{ij}^{^{\prime\prime} }\ge 0\), we have \({\beta }_{ij}^{{^{\prime}}}={\beta }_{ij}^{^{\prime\prime} }=0\). Furthermore, if \({c}_{ij}^{v}=1\) then:

$$\begin{aligned}{\beta }_{ij}= & 1\mathop\Rightarrow^{{(42)\&(50)}} \begin{cases}{\lambda }_{i2}^{^{\prime}}\ge -{\beta }_{ij}^{^{\prime}}\\ {\lambda }_{i2}^{^{\prime\prime} }\ge -{\beta }_{ij}^{^{\prime\prime} }\end{cases} \mathop\Rightarrow^{{-1={\lambda }_{i2}={\lambda }_{i2}^{^{\prime}}+{\lambda }_{i2}^{^{\prime\prime} }\ge -\left({\beta }_{ij}^{^{\prime}}+{\beta }_{ij}^{^{\prime\prime} }\right)=-{\beta }_{ij}=-1}} \begin{cases}{\lambda }_{i2}^{^{\prime}}=-{\beta }_{ij}^{^{\prime}}\\ {\lambda }_{i2}^{^{\prime\prime} }=-{\beta }_{ij}^{^{\prime\prime} }\end{cases} \\ & \mathop\Rightarrow^{{{\lambda }_{i2}=-1 \ \& \ {\beta }_{ij}=1}} \frac{{\lambda }_{i2}^{^{\prime}}}{{\lambda }_{i2}}=\frac{{\beta }_{ij}^{^{\prime}}}{{\beta }_{ij}}, \frac{{\lambda }_{i2}^{^{\prime\prime} }}{{\lambda }_{i2}}=\frac{{\beta }_{ij}^{^{\prime\prime} }}{{\beta }_{ij}}.\end{aligned}$$
(51)


To summarize, we have shown that \({\left({\lambda {^{\prime}}}_{i1},{\lambda {^{\prime}}}_{i2},{{\varvec{\beta}}}_{i}^{{^{\prime}}},{{\varvec{\gamma}}}_{i}^{{^{\prime}}}\right)}^{T}\) and \({\left({\lambda }_{i1}^{^{\prime\prime} },{\lambda }_{i2}^{^{\prime\prime} },{{\varvec{\beta}}}_{i}^{^{\prime\prime} },{{\varvec{\gamma}}}_{i}^{^{\prime\prime} }\right)}^{T}\) are positive multiples of \({\left({\lambda }_{i1},{\lambda }_{i2},{{\varvec{\beta}}}_{i},{{\varvec{\gamma}}}_{i}\right)}^{T}\), i.e., they are not directions distinct from \({\left({\lambda }_{i1},{\lambda }_{i2},{{\varvec{\beta}}}_{i},{{\varvec{\gamma}}}_{i}\right)}^{T}\), and this completes the proof.

Subcase 2.2 For each \(j\in J\), \({c}_{ij}^{v}=1\) implies \({x}_{j}^{v}=0\), but there exists some \(j\in J\) such that \({x}_{j}^{m}={c}_{ij}^{m}=1\).

In this case, the demand point \(i\) is optimally allocated to the normal facility located at point \(\stackrel{-}{j}\in \underset{{\{j\in J: x}_{j}^{v}=1\}}{\mathrm{arg min}}{d}_{ij}\) and also to an arbitrary backup facility located at some point \(j\) with \({x}_{j}^{m}={c}_{ij}^{m}=1\). The optimal value of \(\left({\mathrm{PSP}}_{\mathrm{i}}\right)\) in this case is \(-w{I}_{i}{d}_{i\stackrel{-}{j}}\). On the other hand,

$$\begin{cases}{\lambda }_{i1}=-w{I}_{i}{d}_{i\stackrel{-}{j}}, \quad {\lambda }_{i2}=0, \quad {\gamma }_{ij}=0 \quad \forall j\in J, \\ {\beta }_{ij}={I}_{i}+w{I}_{i}{d}_{i\stackrel{-}{j}} \quad \forall j\in {J}_{i}^{v}, \quad { \beta }_{ij}= \mathrm{max} \ (0,w{I}_{i}{d}_{i\stackrel{-}{j}}-w{I}_{i}{d}_{ij}) \quad \forall j\in J\setminus {J}_{i}^{v}\end{cases}$$
(52)

is an optimal solution of the dual subproblem \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\).

Proposition 4 The optimal solution defined by (52) is an extreme optimal solution of \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\).

Proof Consider the constraints of the problem \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\). Using the same notation as in the proof of Proposition 1, \({\lambda }_{i1}=-w{I}_{i}{d}_{i\stackrel{-}{j}}\) is equivalent to \({\lambda }_{i1}^{+}=0\) and \({\lambda }_{i1}^{-}=w{I}_{i}{d}_{i\stackrel{-}{j}}\), and \({\lambda }_{i2}=0\) is equivalent to \({\lambda }_{i2}^{+}={\lambda }_{i2}^{-}=0\). Hence, \({\lambda }_{i1}^{+},{\lambda }_{i2}^{+},{\lambda }_{i2}^{-}\ge 0\) are three constraints binding at (52). These binding constraints and the constraints \({\gamma }_{ij}\ge 0\) for all \(j\in J\) provide \(\left|J\right|+3\) constraints binding at (52). On the other hand, for each \(j\in J\), either the constraint \({\beta }_{ij}\ge 0\) or the corresponding constraint in (28) is binding at (52), which gives \(|J|\) more binding constraints. Furthermore, the constraint \({\beta }_{ij}\ge 0\) and the constraint (28) are both binding at (52) for \(j=\stackrel{-}{j}\) (where \(\stackrel{-}{j}\in \underset{{\{j\in J: x}_{j}^{v}=1\}}{\mathrm{arg min}}{d}_{ij}\)), giving one more binding constraint. All the above-mentioned constraints together provide \(2\left|J\right|+4\) linearly independent constraints of \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) binding at (52). This completes the proof according to the simple characterization theorem for extreme points of polyhedral sets.
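The case analysis of this subsection translates directly into a short routine; the sketch below (our own naming, e.g., solve_dspi, written in Python rather than the authors' C++ code) returns an extreme optimal point of \(\left({\mathrm{DSP}}_{\mathrm{i}}\right)\) following (45), (46), or (52), or an extreme recession direction (47) in Subcase 2.1:

```python
def solve_dspi(i, J, cv, cm, d, imp, w, xv, xm):
    """Closed-form extreme solution of (DSP_i) for fixed xbar (illustrative sketch).

    cv, cm : dicts with cv[i, j], cm[i, j] in {0, 1}
    d, imp : distances d[i, j] and importance levels I_i
    xv, xm : dicts with the fixed binary values xbar^v_j, xbar^m_j
    Returns ("point", lam1, lam2, beta, gamma) for an extreme optimal point,
    or ("ray", ...) for an extreme recession direction (Subcase 2.1).
    """
    Jv = [j for j in J if cv[i, j] == 1]          # J_i^v as in (48)
    Jm = [j for j in J if cm[i, j] == 1]          # J_i^m as in (48)
    covered_v = [j for j in Jv if xv[j] == 1]     # open normal facilities covering i
    covered_m = [j for j in Jm if xm[j] == 1]     # open backup facilities covering i

    beta = {j: 0.0 for j in J}
    gamma = {j: 0.0 for j in J}

    if covered_v:                                  # Case 1
        if len(covered_v) == 1:                    # unique jbar: use (46)
            for j in Jv:
                beta[j] = imp[i]
            return "point", 0.0, 0.0, beta, gamma
        return "point", 0.0, imp[i], beta, gamma   # otherwise use (45)

    if not covered_m:                              # Subcase 2.1: PSP_i infeasible
        for j in Jv:
            beta[j] = 1.0                          # recession direction (47)
        for j in Jm:
            gamma[j] = 1.0
        return "ray", 0.0, -1.0, beta, gamma

    # Subcase 2.2: closest open normal facility jbar, dual solution (52);
    # at least one normal facility is open thanks to the valid inequality (10)/(37)
    jbar = min((j for j in J if xv[j] == 1), key=lambda j: d[i, j])
    lam1 = -w * imp[i] * d[i, jbar]
    for j in J:
        if j in Jv:
            beta[j] = imp[i] + w * imp[i] * d[i, jbar]
        else:
            beta[j] = max(0.0, w * imp[i] * d[i, jbar] - w * imp[i] * d[i, j])
    return "point", lam1, 0.0, beta, gamma
```

Looping this routine over all \(i\in I\) yields the whole dual solution of \(\left(\mathrm{DSP}\right)\) without ever calling an LP solver for the subproblems.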

3.5 Valid Inequalities for the Master Problem

According to the constraints (5), the following inequalities are valid for the original problem (P), and we can add them to the master problem:

$$\Sigma_{j\in {J}_{i}^{v}}{x}_{j}^{v}+\Sigma_{j\in {J}_{i}^{m}}{x}_{j}^{m}\ge 1 \qquad i\in I$$
(53)

where \({J}_{i}^{v}\) and \({J}_{i}^{m}\) are defined by (48). These inequalities prevent infeasibility in the Benders’ primal subproblems and equivalently prevent unboundedness in the Benders’ dual subproblems. Thus, feasibility cuts are no longer generated.

3.6 Steps of the Algorithm and its Finite Convergence

After the above analysis of the different cases of the subproblems \(({\mathrm{PSP}}_{\mathrm{i}})\) and \(({\mathrm{DSP}}_{\mathrm{i}})\), we propose the algorithm Alg. 1 for solving the original problem (\(\mathrm{P}\)).

Finite convergence of the proposed algorithm (Alg. 1) can be deduced in two ways. First, at each iteration of Alg. 1, an optimality Benders cut of type (39) is generated and appended to the relaxed master problem (RMP). Since \({S}^{P}\) has a finite number of elements, Alg. 1 terminates in a finite number of iterations for any given \(\epsilon\). Alternatively, at each iteration of Alg. 1, at least one master choice is excluded from the set of feasible master choices, which has only a finite number of elements; hence, finite convergence of Alg. 1 again follows.

Alg. 1. The Proposed Benders Decomposition Algorithm (PBDA) for solving Problem (P)

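A compact sketch of the Benders loop (our reconstruction of the steps described in Sects. 3.1–3.5, using the open-source PuLP modeler instead of CPLEX and reusing the hypothetical solve_dspi helper sketched in Sect. 3.4) could look as follows:

```python
import pulp
# assumes the solve_dspi helper sketched in Sect. 3.4 is available, e.g.:
# from dtdsm_subproblem import solve_dspi   # hypothetical module name

def pbda(I, J, cv, cm, d, imp, w, pv, pm, B, eps=1e-6, max_iter=500):
    """Sketch of the accelerated Benders loop (our reconstruction, not the paper's code)."""
    # relaxed master problem (35)-(37), (40) plus the valid inequalities (53)
    rmp = pulp.LpProblem("RMP", pulp.LpMaximize)
    xv = pulp.LpVariable.dicts("xv", J, cat="Binary")
    xm = pulp.LpVariable.dicts("xm", J, cat="Binary")
    theta = pulp.LpVariable("theta", upBound=sum(imp[i] for i in I))  # (1) <= sum_i I_i
    rmp += theta                                                      # objective (35)
    rmp += pulp.lpSum(pv * xv[j] + pm * xm[j] for j in J) <= B        # (36)
    rmp += pulp.lpSum(xv[j] for j in J) >= 1                          # (37)
    for i in I:                                                       # (53)
        rmp += pulp.lpSum(xv[j] for j in J if cv[i, j] == 1) \
             + pulp.lpSum(xm[j] for j in J if cm[i, j] == 1) >= 1

    best_lb, best_x = -float("inf"), None
    for _ in range(max_iter):
        rmp.solve(pulp.PULP_CBC_CMD(msg=False))
        ub = theta.value()                                            # upper bound
        xbar_v = {j: round(xv[j].value()) for j in J}
        xbar_m = {j: round(xm[j].value()) for j in J}

        duals, lb = [], 0.0
        for i in I:
            kind, l1, l2, beta, gamma = solve_dspi(i, J, cv, cm, d, imp, w,
                                                   xbar_v, xbar_m)
            # with cuts (53) in the master, Subcase 2.1 (kind == "ray") cannot occur
            duals.append((l1, l2, beta, gamma))
            lb += l1 + l2 + sum(beta[j] * xbar_v[j] + gamma[j] * xbar_m[j]
                                for j in J)
        if lb > best_lb:
            best_lb, best_x = lb, (xbar_v, xbar_m)
        if ub - best_lb <= eps:
            break
        # optimality cut (39) built from the extreme dual points just computed
        rmp += theta <= sum(l1 + l2 for l1, l2, _, _ in duals) \
             + pulp.lpSum(beta[j] * xv[j] + gamma[j] * xm[j]
                          for _, _, beta, gamma in duals for j in J)
    return best_lb, best_x
```

The bound \(\theta \le \Sigma_{i\in I}{I}_{i}\) is valid because the first term of (1) is at most \(\Sigma_{i\in I}{I}_{i}\) while the second term is nonpositive; it merely keeps the first relaxed master problem bounded before any optimality cut has been added.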

4 Numerical Results

In order to test the efficiency of the proposed Benders decomposition algorithm (PBDA), we compared it with both the classical Benders decomposition algorithm (CBDA) and the state-of-the-art commercial solver CPLEX on a dataset for the model (\(\mathrm{P}\)) presented in Sect. 2.

4.1 Dataset

In order to better evaluate the whole approach, we generated a suitable dataset for the comparison of the three algorithms. This dataset consists of 5 major groups of nodes, each including a different number of nodes (100, 225, 400, 625, and 900 nodes) to simulate the morphology of imaginary cities. For each group, the morphology is similar to the one depicted in Fig. 1. Each square corresponds to a potential point for locating the ambulances (vans and motorlances), and the distance \({d}_{ij}\) between two squares is given by Eq. (54), also known as the “Manhattan distance” [26]:

Fig. 1 Example city of 100 nodes

Fig. 2 Solution (225 nodes, case 1, example 1). 5 vans and 5 motorcycles

Fig. 3 Solution (225 nodes, case 2, example 1). 3 vans and 6 motorcycles

Fig. 4 Solution (225 nodes, case 3, example 1). 2 vans and 7 motorcycles

$${d}_{ij}=\left|{x}_{i}-{x}_{j}\right|+\left|{y}_{i}-{y}_{j}\right|$$
(54)

In every group of nodes, 10 different examples are created, each with different values for the parameters \({I}_{i}\) and \(w\). For each of the 10 examples, the parameter \({I}_{i}\) is randomly generated and takes integer values in the interval \([1,\sqrt{\mathrm{NoOfNodes}}]\), and the weight is taken as:

$$w=\frac{1}{\sum_{i\in I}[{I}_{i}*\underset{j\in J}{\mathit{max}}\left\{\left(1-{c}_{ij}^{v}\right){d}_{ij}\right\}]}.$$
(55)

By using Eq. (55), which contains the maximum values of the distance \({d}_{ij}\) in the denominator, the weight \(w\) takes very small values so that the first term of the objective function (1) has first priority and the second term (multiplied by \(w\)) has second priority, as stated in Sect. 2.

Furthermore, in all groups of examples, the available budget (\(B\)), the cost of locating a backup facility (\({p}_{m}\)), and the motorcycle coverage distance (MCD) are shown in Table 1.

We created the dataset with two goals in mind. First, it offers the opportunity to estimate the efficiency of the proposed algorithm over a wide range of problem sizes. This is why we have chosen to solve problems ranging from a small one with 100 nodes to a relatively large one with 900 nodes.

The second goal is to perform a sensitivity analysis on certain parameters to see how they change the solution and thus to validate the described model. For this reason, we varied the ratio between the cost of locating a backup facility (motorlance) and the cost of locating a normal facility (van). Thus, keeping the backup facility cost fixed for all examples, we considered three cases in which the normal facility cost takes different values. Moreover, since the available budget is fixed for all examples, the different normal facility costs led to the need to consider proportionally different van coverage distances (VCD) in order to avoid infeasibility of the problem. It makes sense that a more expensive normal facility can cover a larger region. Table 2 depicts the values of these parameters in each of the three cases.

Table 1 Data in all examples and cases

Obviously, the coverage distance of a motorlance is generally larger than that of a van due to the former's faster arrival at the site of emergency. In the notation of the model, the parameter VCD is translated into \({c}_{ij}^{v}\) by setting \({c}_{ij}^{v}=1\) if \({d}_{ij}\le VCD\) and \({c}_{ij}^{v}=0\) otherwise. The parameter MCD is translated into \({c}_{ij}^{m}\) by setting \({c}_{ij}^{m}=1\) if \({d}_{ij}\le MCD\) and \({c}_{ij}^{m}=0\) otherwise. Finally, the cost of a motorlance is lower than that of a van in all cases.
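The dataset construction described above can be sketched as follows (an illustrative re-implementation, not the authors' generator; build_instance is our name):

```python
import math
import random

def build_instance(side, vcd, mcd, seed=0):
    """Grid instance with side*side nodes (e.g., side=10 gives the 100-node group)."""
    random.seed(seed)
    nodes = [(x, y) for x in range(side) for y in range(side)]
    I = J = list(range(len(nodes)))        # every node is both a demand point and a candidate site

    def manhattan(i, j):                   # Eq. (54)
        (xi, yi), (xj, yj) = nodes[i], nodes[j]
        return abs(xi - xj) + abs(yi - yj)

    d = {(i, j): manhattan(i, j) for i in I for j in J}
    cv = {(i, j): int(d[i, j] <= vcd) for i in I for j in J}   # van coverage (VCD)
    cm = {(i, j): int(d[i, j] <= mcd) for i in I for j in J}   # motorlance coverage (MCD)

    # importance levels: random integers in [1, sqrt(number of nodes)]
    imp = {i: random.randint(1, math.isqrt(len(nodes))) for i in I}

    # weight w of Eq. (55): small enough to give the coverage term lexicographic priority
    denom = sum(imp[i] * max((1 - cv[i, j]) * d[i, j] for j in J) for i in I)
    w = 1.0 / denom if denom > 0 else 0.0
    return I, J, d, imp, cv, cm, w
```

For instance, build_instance(10, vcd, mcd) would produce a 100-node instance with the VCD and MCD values of Tables 1 and 2.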

The above leads to a total of 150 solved examples (5 groups of nodes \(\times\) 3 cases \(\times\) 10 examples). The algorithms were implemented in C++ and solved using CPLEX 12.10 in Visual Studio 2019 on a desktop computer with an Intel(R) Core(TM) i7-4790 CPU at 3.6 GHz, 16 GB RAM, 64-bit.

4.2 Results

The computational results are discussed in two parts. The first is the comparison of the three algorithms over the different problem sizes. The second evaluates the previously mentioned sensitivity analysis on the three cases of examples.

4.2.1 Comparison of the Algorithms

In Table 3, the comparative results for the examples in the groups of 100, 225, 400, 625, and 900 nodes are shown for CPLEX, the classical Benders decomposition algorithm (CBDA), and the proposed Benders decomposition algorithm (PBDA). It has to be mentioned that the maximum CPU time for all instances was set to 2 h (7200 s).

Table 2 Applied parameters for the different cases

Apart from the first two columns, which identify the instances, the next three columns show the number of examples solved to optimality in each case by the three algorithms. The next three columns show the average CPU time spent to solve these examples to optimality. These are followed by the average optimality gaps of all examples (whether solved to optimality, solved only to feasibility, or left without a feasible solution) for each case. Note that if an algorithm was unable to find any feasible solution within 2 h, a gap of 100% was recorded for it. Hence, the average gap of 100% for 900 nodes means that CPLEX failed to find even one feasible solution in all of the 10 examples. Finally, the last three columns show the percentage reduction in CPU time that one algorithm achieves over another. For example, in the final column “PBDA vs CBDA”, the negative percentages show how much faster the proposed algorithm is than the classical one. To compare the CPU times of two algorithms, such as CBDA and CPLEX, the following formula was used:

$$\left(\frac{\left(\frac{10}{n\left(CBDA\right)}\right)\times \ average \ cpu \ time \ of \ CBDA \ for \ examples \ solved \ to \ optimality}{\left(\frac{10}{n\left(CPLEX\right)}\right)\times \ average\ cpu \ time \ of \ CPLEX \ for \ examples \ solved \ to \ optimality}-1\right)\times 100\%$$
(56)

where

$$n\left(\mathrm{CBDA}\right)=\mathrm{number} \ \mathrm{of} \ \mathrm{examples} \ \mathrm{solved} \ \mathrm{to} \ \mathrm{optimality} \ \mathrm{by} \ \mathrm{CBDA}$$
$$n\left(\mathrm{CPLEX}\right)=\mathrm{number} \ \mathrm{of} \ \mathrm{examples} \ \mathrm{solved} \ \mathrm{to} \ \mathrm{optimality} \ \mathrm{by} \ \mathrm{CPLEX}$$
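For clarity, Eq. (56) can be expressed as a small helper (illustrative; cpu_time_reduction is our name):

```python
def cpu_time_reduction(avg_time_a, n_a, avg_time_b, n_b, total=10):
    """Percentage difference of Eq. (56) between algorithms A and B.

    avg_time_* : average CPU time over the examples solved to optimality
    n_*        : number of examples (out of `total`) solved to optimality
    A negative value means algorithm A is faster than algorithm B.
    """
    scaled_a = (total / n_a) * avg_time_a
    scaled_b = (total / n_b) * avg_time_b
    return (scaled_a / scaled_b - 1.0) * 100.0

# example: CBDA vs CPLEX as in (56)
# print(cpu_time_reduction(avg_cbda, n_cbda, avg_cplex, n_cplex))
```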

Similar formulas were used for comparing PBDA vs CPLEX and PBDA vs CBDA.

When comparing both CBDA and PBDA with CPLEX, one can notice that the Benders-based algorithms are much more efficient in all examples. In particular, for small to medium instances (100 to 625 nodes), they solve all examples to optimality while being on average 94% faster than CPLEX. Note that CPLEX returns suboptimal solutions for 225 to 625 nodes within the maximum time of 2 h. As can be seen, in the cases with 900 nodes, CPLEX could not find even one feasible solution in any of the 10 examples. This is the reason why its average gap cannot be compared with those of the Benders-based algorithms.

Moreover, the comparison between the proposed algorithm and the classical one is quite interesting, as it is obvious that PBDA outperforms CBDA. Concerning the average optimality gap, the two algorithms manage to find optimal solutions for almost all instances, while the number of their iterations is exactly the same. The latter fact confirms that the proposed solution method of Sect. 3.4 leads to exactly the same optimality cuts as the classical Benders decomposition algorithm. Focusing on the CPU time, the proposed algorithm is on average 30% faster than the classical one. The good performance of PBDA lies in the fact that there is no need to model and solve the dual subproblem, which is quite computationally expensive. Instead, its solution can be computed in closed form as presented in Sect. 3.4, which is computationally very fast, as it can be easily coded using simple loops and conditional statements. The computational benefit of PBDA seems to shrink for larger instances such as 900 nodes. However, this happens because of the 2-h time limit, which both algorithms reach for some examples, while PBDA still finds better solutions with lower optimality gaps than CBDA.

4.2.2 Sensitivity Analysis

As previously stated, we examined three cases in which the cost of locating a normal facility and its coverage distance take different values. For every case, 10 examples were solved, and Table 4 reports the average computational times for these examples in all node groups. As shown, case 1 needs more time to be solved, which can be explained by the fact that the coverage distance of the normal facilities is very small, and thus it is harder to find a solution that satisfies the master constraints. In the same direction, case 3 is the easiest one due to the large coverage distance of the normal facilities.

Table 3 Comparative results for the dataset (average for 10 examples in each case)

Table 5 indicates the average number of vans and motorcycles deployed. As one can see, within the same group of nodes (e.g., 225 nodes), the average number of vans decreases as their cost increases. The coverage gap created by the use of a small number of expensive vans is filled by increasing the number of motorlances. This outcome was expected, and its confirmation supports the validity of the model and the proposed solution approach.

Table 4 Average comparative results (for each case in all node groups)

However, in the larger examples (e.g., 625 and 900 nodes), one can notice a slight difference in the above conclusion. While the average number of vans decreases as their cost gets higher, the average number of motorlances remains the same (large enough) for all three cases. This happens because we have considered a conservative policy for the available budget (Table 1). The relatively small budget for large instances leads to a shortage of vans even in case 1 (where they are cheaper), thus making it necessary to deploy a large number of motorlances even in this case.

Table 5 Average number of vans and motorcycles used in model’s solution (average for 10 examples in each case)

Moreover, it is interesting to illustrate the different solutions obtained in each case and the coverage of the normal facilities. In Figs. 2, 3 and 4, the solutions of example 1 of the 225-node group are depicted for all three cases. Each square corresponds to a node of the network. Table 6 offers an explanation of the different shades used in the figures. It should be noted that the number inside every square corresponds to the facility (either normal or backup) that is the first service provider to that square.

Table 6 Memo for following figures

As the cost and the coverage distance of the normal facilities (vans) increase (from case 1 to case 3), their number decreases, while their distributions tend to resemble one another, since the most important nodes (which the normal facilities are motivated to cover) are the same in all cases.

5 Conclusion

A double-type double-standard model (DtDsM) was proposed for the maximal covering location problem (MCLP). It is useful in a wide range of applications, such as public emergency services, in which it is important for each demand point to receive at least a primary service (from a backup facility) within a predefined time before receiving the full service from a normal facility that cannot serve the demand within that predefined time.

Since the parameters of the problem can vary widely from one case to another, relocation of facilities during a time horizon is inevitable, and developing fast solution algorithms to reset and re-solve real facility location problems is very important. An efficient Benders-based decomposition algorithm was proposed to solve the model; it is much faster than the commercial solver CPLEX for solving DtDsM.

It is worth noting that if we assume a standard response time of 5 min and an average motorlance speed of 60 km/h, then the coverage distance of a motorlance is approximately 5 km, and the assumption MCD = 5 can be interpreted so that the distance between two adjacent points is 1 km. Therefore, a \(10\times 10\) network (with 100 nodes) can be considered for the location problem in a region with an area of \(9\times 9=81\) square kilometers. Similarly, one can see that a \(30\times 30\) network (with 900 nodes) can be considered for the location problem in a region with an area of \(29\times 29=841\) square kilometers. This means that our largest test problem can be considered as a simulation of the problem of locating the emergency vehicles in a large city like Tehran (with an area of approximately 730 \({\mathrm{km}}^{2}\)).

Finally, it is worth noting that although the proposed DtDsM is in fact an uncapacitated maximal covering location problem (i.e., no limit is considered on the amount of demand allocated to a single facility), it is still useful and applicable in real emergency service planning if the probability that two demand points (allocated to the same facility) are the scene of accidents at the same time is assumed to be very low.

Developing an efficient solution algorithm for a capacitated version of DtDsM can be considered in future studies.