1 Introduction

Many facility location problems involve strategic decisions that must hold for a considerable amount of time, during which uncontrolled changes may occur in the conditions underlying the problem. For example, we may observe an unexpected disruption in the network due to some failure, or we may realize that the values of some parameters (e.g., demand levels) vary in an unpredictable manner. In such cases it may be desirable to account for uncertainty in advance and thus make decisions that can somehow anticipate it. This can be accomplished by embedding uncertainty in mathematical models developed for supporting decision making processes.

The review papers by Louveaux (1993) and Snyder (2006) show that much work has been done within the context of facility location under uncertainty. The different sources of uncertainty we may face in these problems have led to the development of different research branches. One of them consists of so-called problems with congestion. In this case, the customers’ requests for service have a probabilistic behavior. If a facility is busy when a new request arrives then we say that “congestion” occurs. This is the topic covered by Chap. 17. Another important research direction regards unexpected disruptions in the network structures, e.g., in the facilities or in the transportation channels. This is a topic addressed in detail in Chap. 22. In the current chapter, we focus on a third perspective: we consider the aspects emerging from uncertainty associated with the parameters of a facility location problem such as the demand levels or transportation costs. We show how uncertainty can be embedded in optimization models aiming at supporting a decision making process. For illustrative purposes, we work with several well-known problems. We focus on a discrete setting, i.e., we assume that there is a finite set of candidate locations for the facilities. This is motivated by the practical relevance that this setting has gained over time, which stems from many successful applications of facility location theory to areas such as logistics, transportation and routing (see Chap. 1).

In the following sections we assume that the reader is familiar with the basic concepts of robust and stochastic optimization. Important references in these fields include Birge and Louveaux (2011) and Shapiro et al. (2009) for stochastic programming; Kouvelis and Yu (1997) and Ben-Tal et al. (2009) for robust optimization.

The remainder of this chapter is organized as follows. In the next section, we discuss general aspects related to uncertainty. In Sect. 8.3, we address robust facility location problems. In Sect. 8.4, we focus on stochastic programming models. Section 8.5 is devoted to chance-constrained problems. In Sect. 8.6 we discuss some challenges and give suggestions for further reading. The chapter ends with an overview of the contents presented.

2 Uncertainty Issues

Basic information underlying a facility location problem often includes demand levels, travel time, cost for supplying the customers, location of the customers, presence or absence of the customers, and price for the commodities. Uncertainty may occur in one or several of these parameters.

One crucial aspect when dealing with uncertainty regards its representation. First, uncertain parameters may be discrete or continuous. Second, if probabilistic information is available, the uncertain parameters can be represented through random variables and thus they are jointly represented by a random vector. In this case, using the well-known characterization proposed by Rosenhead et al. (1972), we say that we are making a decision under risk and we can resort to stochastic programming models and methods for dealing with the problem. If this is not the case, we are making a decision under uncertainty and a robustness measure is usually considered for evaluating the performance of the system. It is important to note that the existence of a probabilistic description for the uncertainty does not prevent the use of robustness measures, as will be detailed in the next section.

We call “scenario” a complete realization of all the uncertain parameters. This notion is independent of whether or not probabilistic information is available. Nevertheless, if uncertain parameters can be represented by random variables, a probability can often be associated with each scenario. Depending on the problem, we may have a finite or an infinite number of scenarios. As will be discussed later, this impacts the models and techniques that can be used.

One important feature that influences the optimization model to be considered for a specific problem regards the attitude of the decision maker towards risk. Two attitudes are usually considered: risk neutral and risk averse. In the first case, the decision maker does not take risk into account when making a decision and a linear function is a correct representation of the utility associated with the decision maker. When a probability can be associated with each scenario, a risk neutral decision maker looks for a decision that minimizes the expected cost (or maximizes the expected return or utility). A risk averse decision maker can be associated with a concave utility function (when utility is measured on the vertical axis and monetary value is measured on the horizontal axis). In this case, the decision maker wants to avoid unnecessary risk and the expected value of the future assets is no longer an appropriate objective. Such a decision maker may look, for instance, for the solution minimizing the maximum cost across all scenarios.

Finally, in some classes of problems, there is another aspect that influences the mathematical model to be considered: the identification of the ex ante and ex post decisions. In the first case, we have the decisions that must be implemented before uncertainty is revealed—also called the here-and-now decisions; in the second case, we have the decisions to be implemented after uncertainty is disclosed. The latter set of decisions is often used as a reaction to the values observed for the uncertain parameters. In a facility location problem, the location of the facilities is often an ex ante decision. This is a consequence of the strategic nature of such decisions in many problems, which imposes their implementation before uncertainty is revealed. Regarding the allocation or distribution decisions, it will depend on the specific problem being studied whether they are ex ante or ex post decisions. In the following sections we refer to both situations.

3 Robust Facility Location Problems

We start by assuming that uncertainty is appropriately captured by a finite set of scenarios. As mentioned above, each scenario fully determines the value of all the uncertain parameters. If no probabilistic information is available, one possibility for measuring the performance of a system is to use a robustness measure. In this case, two classical objectives are often considered: minmax cost and minmax regret.

For illustrative purposes, we consider the well-known p-median problem. In this problem, we have a set of demand nodes J, each of which is to be served by one out of p new facilities to be located. The potential locations for the facilities coincide with the locations of the demand nodes. In its discrete version, the problem can be formulated mathematically as follows:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{i \in J} \sum_{j \in J} d_j a_{ij} x_{ij} {} \end{aligned} $$
(8.1)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad \sum_{i \in J} x_{ij} = 1, \quad j \in J {} \end{aligned} $$
(8.2)
$$\displaystyle \begin{aligned} & \quad x_{ij} \leq x_{ii}, \quad i \in J,\: j \in J {} \end{aligned} $$
(8.3)
$$\displaystyle \begin{aligned} & \quad \sum_{i \in J} x_{ii} = p {} \end{aligned} $$
(8.4)
$$\displaystyle \begin{aligned} & \quad x_{ij} \in \{0,1\}, \quad i \in J,\: j \in J. {} \end{aligned} $$
(8.5)

In this formulation, aij represents the distance or travel time between demand nodes i and j (i, j ∈ J) and dj is the demand or weight of node j (j ∈ J); xij is a binary variable equal to 1 if node j ∈ J is allocated to node i ∈ J and 0 otherwise; xii = 1 indicates that a facility is located at i. The objective is to minimize the total weighted distance or travel time.

In a p-median problem, uncertainty can occur in the demands (or weights) or in the distances (or travel times). Denote by Ω the finite set of scenarios and by ω ∈ Ω one particular scenario (that fully specifies all the uncertain parameters). Suppose that the location of the facilities is an ex ante decision and the allocation of the customers to the operating facilities is an ex post decision. In order to capture uncertainty, we need to consider binary location variables yi indicating whether a facility is located at i ∈ J, and scenario-indexed binary allocation variables xijω indicating whether demand node j ∈ J is allocated to facility i ∈ J in scenario ω ∈ Ω. The minmax p-median problem can be formulated as follows:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad v {} \end{aligned} $$
(8.6)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad \sum_{i \in J} \sum_{j \in J} d_{j\omega} a_{ij\omega} x_{ij\omega} \leq v, \quad \omega \in \varOmega {} \end{aligned} $$
(8.7)
$$\displaystyle \begin{aligned} & \quad \sum_{i \in J} x_{ij\omega} = 1, \quad j \in J,\: \omega \in \varOmega {} \end{aligned} $$
(8.8)
$$\displaystyle \begin{aligned} & \quad x_{ij\omega} \leq y_i, \quad i \in J,\: j \in J,\: \omega \in \varOmega {} \end{aligned} $$
(8.9)
$$\displaystyle \begin{aligned} & \quad \sum_{i \in J} y_i = p {} \end{aligned} $$
(8.10)
$$\displaystyle \begin{aligned} & \quad x_{ij\omega} \in \{0,1\}, \quad i \in J,\: j \in J,\: \omega \in \varOmega {} \end{aligned} $$
(8.11)
$$\displaystyle \begin{aligned} & \quad y_i \in \{0,1\}, \quad i \in J. {} \end{aligned} $$
(8.12)

In this model, djω represents the demand of node j ∈ J under scenario ω ∈ Ω, and aijω represents the travel time between nodes i ∈ J and j ∈ J under scenario ω ∈ Ω. The minmax objective arises from the combination of (8.6) and (8.7).
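For very small instances, model (8.6)–(8.12) can be checked by plain enumeration. The sketch below is only an illustration, not a practical solution method: it assumes non-negative demands, so that for fixed locations the ex post allocation in each scenario simply sends every node to its nearest open facility. The function names and the three-node data set are hypothetical.

```python
from itertools import combinations

def scenario_cost(sites, demand, dist):
    """Weighted distance in one scenario; each node uses its nearest open site."""
    return sum(d_j * min(dist[i][j] for i in sites) for j, d_j in enumerate(demand))

def minmax_p_median(p, demands, dists):
    """Return (worst-case cost, sites) minimizing the maximum cost over scenarios."""
    n = len(demands[0])
    best = None
    for sites in combinations(range(n), p):  # every way of opening p facilities
        worst = max(scenario_cost(sites, d, a) for d, a in zip(demands, dists))
        if best is None or worst < best[0]:
            best = (worst, sites)
    return best

# Hypothetical data: three nodes on a line, two demand scenarios, same distances.
dist = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
demands = [[10, 1, 1], [10, 10, 100]]
dists = [dist, dist]
result = minmax_p_median(1, demands, dists)
print(result)
```

The enumeration grows combinatorially with |J| and p, so it is only meant to make the role of the scenario-indexed allocation variables concrete.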

The solution provided by the previous model tends to be overly conservative. It reflects a complete aversion of the decision maker towards risk. In fact, by planning for the worst case scenario (the maximum weighted distance occurring across all scenarios), the decision maker may be planning for a scenario which turns out to be very unlikely. A better compromise can be achieved by considering the minmax regret criterion. In this case, the decision maker chooses the decision that minimizes the maximum regret across all scenarios. The corresponding model is obtained by replacing (8.7) with

$$\displaystyle \begin{aligned} \sum_{i \in J} \sum_{j \in J} d_{j\omega} a_{ij\omega} x_{ij\omega} - v_\omega^* \leq v, \quad \omega \in \varOmega, {} \end{aligned} $$
(8.13)

where \(v_\omega ^*\) is the optimal value of problem (8.1)–(8.5) solved for scenario ω ∈ Ω. Serra and Marianov (1998) consider the above minmax regret model after scaling the demands. In particular, for each scenario, they divide each demand by the total demand under that scenario. The authors also note a very relevant aspect: when the optimal objective function value differs significantly across the different scenarios, the relative regret is a more appropriate robustness measure (see also Kouvelis and Yu 1997). In this case, (8.13) should be replaced with

$$\displaystyle \begin{aligned} \frac{\sum_{i \in J} \sum_{j \in J} d_{j\omega} a_{ij\omega} x_{ij\omega} - v_\omega^*}{v_\omega^*} \leq v, \quad \omega \in \varOmega. {} \end{aligned} $$
(8.14)

Serra and Marianov (1998) developed a heuristic for this problem.
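The difference between (8.13) and (8.14) can be seen on a toy instance. The following sketch enumerates all location decisions and is not the heuristic of Serra and Marianov (1998). It assumes positive scenario optima \(v_\omega ^*\) (needed for the relative regret) and nearest-facility ex post allocation; all names and data are hypothetical.

```python
from itertools import combinations

def cost(sites, demand, dist):
    """Weighted distance in one scenario with nearest-facility allocation."""
    return sum(d * min(dist[i][j] for i in sites) for j, d in enumerate(demand))

def minmax_regret(p, demands, dists, relative=False):
    """Return (max regret, sites) for the minmax (relative) regret criterion."""
    n = len(demands[0])
    scenarios = list(zip(demands, dists))
    candidates = list(combinations(range(n), p))
    # v_star[w]: optimal deterministic p-median value of scenario w
    v_star = [min(cost(s, d, a) for s in candidates) for d, a in scenarios]

    def worst_regret(sites):
        regrets = []
        for (d, a), v in zip(scenarios, v_star):
            r = cost(sites, d, a) - v             # absolute regret, as in (8.13)
            regrets.append(r / v if relative else r)  # relative regret, as in (8.14)
        return max(regrets)

    return min((worst_regret(s), s) for s in candidates)

# Hypothetical data: scenario optima differ by an order of magnitude (4 vs. 50).
dist = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
demands = [[10, 1, 1], [10, 10, 100]]
absolute = minmax_regret(1, demands, [dist, dist])
relative = minmax_regret(1, demands, [dist, dist], relative=True)
print(absolute, relative)
```

On this instance the absolute criterion selects site index 2, whereas the relative criterion selects site index 1, because the high-demand scenario no longer dominates once regrets are scaled by \(v_\omega ^*\).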

A different problem is studied by Serra et al. (1996). They consider a firm that wishes to locate p facilities in a competitive environment. The goal is to maximize the minimum market captured in a region where competitors are already operating. The criterion considered corresponds to the “maximization” version of the minmax “cost” criterion discussed above. Uncertainty is assumed for the demand and for the location of the competitors. Again, a heuristic is proposed for tackling the problem.

If the allocation of customers to facilities is also an ex ante decision, the models above can be easily adapted. In this case, the scenario index should be removed from the allocation variables, i.e., the allocation variables become those introduced in model (8.1)–(8.5). Furthermore, the location variables yi are no longer necessary, since the variables xii (i ∈ J) can play their role.

The above models work with a finite set of scenarios. In practice, however, this is not always a correct representation for the uncertainty. In many situations, an uncertain parameter can lie in some infinite set. A popular way of capturing such uncertainty in these cases is via intervals. In the general context of robust optimization, two types of uncertainty sets are often considered: box and ellipsoidal uncertainty sets (see Ben-Tal et al. 2009, for further details). In the first case, uncertainty is defined by a set of linear constraints; in the second case, quadratic expressions involving the uncertain parameters are used. We illustrate both cases considering the uncapacitated facility location problem (UFLP), whose well-known mathematical formulation is the following:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{i \in I} f_i y_i + \sum_{i \in I} \sum_{j \in J} c_{ij} d_j x_{ij} {} \end{aligned} $$
(8.15)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad \sum_{i \in I} x_{ij} = 1, \quad j \in J {} \end{aligned} $$
(8.16)
$$\displaystyle \begin{aligned} & \quad x_{ij} \leq y_i, \quad i \in I,\: j \in J {} \end{aligned} $$
(8.17)
$$\displaystyle \begin{aligned} & \quad y_i \in \{0,1\},\quad i \in I {} \end{aligned} $$
(8.18)
$$\displaystyle \begin{aligned} & \quad x_{ij} \geq 0, \quad i \in I,\: j \in J. {} \end{aligned} $$
(8.19)

In this model, I denotes the set of potential locations for the facilities, J is the set of customers, fi represents the setup cost for facility i ∈ I, cij corresponds to the unit cost for supplying the demand of customer j ∈ J from facility i ∈ I and dj is the demand of customer j ∈ J. The binary variable yi indicates whether a facility is installed at i ∈ I, and the continuous variable xij represents the fraction of the demand of customer j ∈ J that is supplied from facility i ∈ I.

We consider now a common source of uncertainty in a facility location problem: the demand. Under box uncertainty, each demand level, dj (j ∈ J), lies in an interval \(\mathscr {U}^B_j=[\overline {d}_j -\epsilon \varDelta _j,\overline {d}_j + \epsilon \varDelta _j]\), 0 ≤ 𝜖 ≤ 1. The parameter 𝜖 measures the uncertainty “magnitude”; \(\overline {d}_j\) denotes a reference value for the demand of customer j ∈ J, and is commonly referred to as the nominal value for the unknown parameter; Δj is a scaling factor.

A particular case of box uncertainty arises when \(\varDelta _j=\overline {d}_j\) (j ∈ J), which leads to the intervals \(\mathscr {U}^B_j=[\overline {d}_j (1-\epsilon ) , \overline {d}_j (1 + \epsilon )]\), j ∈ J. Denote \(\mathscr {U}^B = \mathscr {U}^B_1 \times \dots \times \mathscr {U}^B_{|J|}\) and d the vector of demands, d = (d1, …, d|J|). We can write

$$\displaystyle \begin{aligned}\mathscr{U}^B=\{ d \in \mathbb{R}^{|J|} \mid -1 \leq \frac{d_j - \overline{d}_j}{\epsilon \overline{d}_j} \leq 1, \: \forall j \in J \},\end{aligned}$$

i.e., the multi-dimensional unit box is given by the absolute normalized deviations (Baron et al. 2011). We can now formulate the so-called robust counterpart of model (8.15)–(8.19). To do so, we start by considering an auxiliary variable v, which allows us to rewrite the objective function of the problem as

$$\displaystyle \begin{aligned} \mbox{Minimize} \quad v. {} \end{aligned} $$
(8.20)

The following constraint must now be included in the model:

$$\displaystyle \begin{aligned} \sum_{i \in I} f_i y_i + \sum_{i \in I} \sum_{j \in J} c_{ij} d_j x_{ij} \leq v. {} \end{aligned} $$
(8.21)

By considering an augmented constraint for (8.21), namely

$$\displaystyle \begin{aligned} \sum_{i \in I} f_i y_i + \max_{ d \in \mathscr{U}^B } \, \left\{ \sum_{i \in I} \sum_{j \in J} c_{ij} d_j x_{ij} \right\} \leq v, {} \end{aligned} $$
(8.22)

the robust counterpart of (8.21) becomes

$$\displaystyle \begin{aligned} \sum_{i \in I} f_i y_i + \sum_{i \in I} \sum_{j \in J} c_{ij} \left[ \overline{d}_j (1+\epsilon) \right] x_{ij} \leq v. {} \end{aligned} $$
(8.23)

The robust counterpart of (8.15)–(8.19) consists of minimizing (8.20) subject to (8.16)–(8.19), and (8.23).
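The step from (8.22) to (8.23) can be made explicit. The short derivation below is a sketch under the additional assumption, not stated above, that the unit costs satisfy cij ≥ 0. Together with xij ≥ 0 from (8.19), every demand dj then has a non-negative coefficient, so the inner maximization separates by customer and is attained at the upper end of each interval \(\mathscr {U}^B_j\):

$$\displaystyle \begin{aligned} \max_{d \in \mathscr{U}^B} \sum_{i \in I} \sum_{j \in J} c_{ij} d_j x_{ij} = \sum_{j \in J} \left( \sum_{i \in I} c_{ij} x_{ij} \right) \max_{d_j \in \mathscr{U}^B_j} d_j = \sum_{i \in I} \sum_{j \in J} c_{ij} \left[ \overline{d}_j (1+\epsilon) \right] x_{ij}. \end{aligned} $$

In other words, under box uncertainty the robust counterpart is the nominal UFLP with every demand inflated by the factor 1 + 𝜖.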

A drawback of box uncertainty is that it allows all the uncertain parameters to take their worst-case values simultaneously. This is often not realistic.

Nikoofal and Sadjadi (2010) avoid the overly conservative solutions often arising from box uncertainty by imposing a maximum total scaled variation for the uncertain parameters. The authors consider a p-median problem with interval uncertainty associated with the distances (or travel times). In particular, for each pair (i, j), i, j ∈ J, they assume that aij can take any value within a previously defined interval \([ \underline {a}_{ij},\overline {a}_{ij}]\). Additionally, the choices for the values aij are restricted by the constraint

$$\displaystyle \begin{aligned}\sum_{i,j \in J,\: i < j} (a_{ij}-\underline{a}_{ij})/(\overline{a}_{ij}-\underline{a}_{ij}) \, \leq L,\end{aligned}$$

where L denotes a maximum level imposed for the total scaled variation. This type of constraint avoids the situation in which all or several parameters take their extreme values simultaneously.

Another alternative for overcoming the above-mentioned drawback when using box uncertainty is to consider ellipsoidal sets. Baron et al. (2011) apply this idea to a facility location problem with a time-varying (uncertain) demand. The location of the facilities and their operating capacity are ex ante decisions and should hold for the entire planning horizon, during which the demands must be satisfied. The goal is to maximize the overall profit. We illustrate the process using the UFLP. Ellipsoidal uncertainty can be embedded in a model by defining the following uncertainty set

$$\displaystyle \begin{aligned}\mathscr{U}^E = \{ d \in \mathbb{R}^{|J|} \mid \sum_{j \in J} \left[ \frac{d_j-\overline{d}_j}{\epsilon \overline{d}_j} \right]^2 \leq L^2 \} = \left \{ d \in \mathbb{R}^{|J|} \mid (d - \overline{d})^T \varLambda^{-1} (d - \overline{d}) \leq L^2 \right \},\end{aligned}$$

with d being the demand vector already presented, L being a parameter and Λ being a |J| × |J| diagonal matrix whose j-th diagonal entry is \(\sigma _j^2\), with \(\sigma _j = \epsilon \overline {d}_j\). Since Λ is a positive definite matrix, the set \(\mathscr {U}^E\) defines an ellipsoid. As pointed out by Baron et al. (2011), the set induced by L = 1 is the largest ellipsoid contained in \(\mathscr {U}^B\) while the set induced by \(L=\sqrt {|J|}\) is the smallest ellipsoid containing \(\mathscr {U}^B\).

Under ellipsoidal uncertainty the augmented constraint for (8.21) is similar to (8.22) but with \(\mathscr {U}^B\) replaced by \(\mathscr {U}^E\). Denote \(V_j = \sum _{i \in I} c_{ij} x_{ij}\) and \(V = (V_1, \dots , V_{|J|})\). The augmented constraint can be written as \(\sum _{i \in I} f_i y_i + \max _{ d \in \mathscr {U}^E } \, V^\prime d \leq v\).

The problem that consists of finding a value \(d \in \mathscr {U}^E\) maximizing \(V^\prime d\) can be easily solved by standard optimization techniques. The optimal value is \(V^\prime \overline {d} + L \sqrt {V^\prime \varLambda V} \). This leads to the following robust counterpart of (8.21):

$$\displaystyle \begin{aligned} \sum_{i \in I} f_i y_i + \sum_{j \in J} \overline{d}_j V_j + L \sqrt{\sum_{j \in J} \sigma_j^2 V_j^2} \leq v. {} \end{aligned} $$
(8.24)

The non-linearity in the above expression is typically handled by introducing a new variable, \(W = \sqrt {\sum _{j \in J} \sigma _j^2 V_j^2}\), which allows casting the problem as a conic programming problem (see Baron et al. (2011) and the references therein for further details).
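The closed form used in (8.24) can be verified numerically. The sketch below assumes V ≠ 0 and takes Λ to be the diagonal matrix with entries \(\sigma _j^2\); it evaluates the worst-case value together with a maximizer \(d^* = \overline {d} + L \varLambda V / \sqrt {V^\prime \varLambda V}\), which follows from the Cauchy–Schwarz inequality. The function names and the two-customer data are hypothetical.

```python
import math

def worst_case_value(V, d_bar, sigma, L):
    """Closed form of max V'd over the ellipsoid: V'd_bar + L*sqrt(sum sigma_j^2 V_j^2)."""
    nominal = sum(v * d for v, d in zip(V, d_bar))
    return nominal + L * math.sqrt(sum((s * v) ** 2 for s, v in zip(sigma, V)))

def worst_case_demand(V, d_bar, sigma, L):
    """A maximizer d* = d_bar + L * Lambda V / sqrt(V' Lambda V); requires V != 0."""
    norm = math.sqrt(sum((s * v) ** 2 for s, v in zip(sigma, V)))
    return [d + L * s * s * v / norm for d, s, v in zip(d_bar, sigma, V)]

# Hypothetical data: two customers, nominal demand 10 each, epsilon = 0.1, L = 1.
V = [3.0, 4.0]                  # V_j = sum_i c_ij x_ij for some fixed allocation
d_bar = [10.0, 10.0]
eps, L = 0.1, 1.0
sigma = [eps * d for d in d_bar]

value = worst_case_value(V, d_bar, sigma, L)
d_star = worst_case_demand(V, d_bar, sigma, L)
attained = sum(v * d for v, d in zip(V, d_star))
print(value, d_star, attained)
```

The maximizer lies on the boundary of \(\mathscr {U}^E\), and evaluating \(V^\prime d^*\) reproduces the closed-form value up to rounding.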

In all problems discussed above, no probabilities were associated with the scenarios. However, in some situations, a probability πω can be associated with scenario ω ∈ Ω. A well-known robustness measure in this case is the expected cost, which is equivalent to the expected regret (Snyder 2006), since the two differ only by the constant \(\sum _{\omega \in \varOmega } \pi _\omega v_\omega ^*\). Current et al. (1997) study a facility location problem consisting of locating a set of p facilities here-and-now, together with the possibility of locating an extra set of facilities (whose cardinality is endogenously determined) during a planning horizon previously defined. The authors compare the solutions obtained using the minmax regret and the expected regret criteria.

When probabilities can be associated with the scenarios, an alternative robustness measure proposed by Snyder and Daskin (2006) is “α-robustness”. The idea is to look for a solution minimizing the expected cost/distance but such that the relative regret in each scenario is less than or equal to a parameter α. In the case of the p-median problem, assuming ex ante location decisions and ex post allocation of customers to the operating facilities, we obtain the following model:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{\omega \in \varOmega} \sum_{i \in J} \sum_{j \in J} \pi_\omega d_{j\omega} a_{ij\omega} x_{ij\omega} {} \end{aligned} $$
(8.25)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad \mbox{(8.8)--(8.12)} \\ & \quad \sum_{i \in J} \sum_{j \in J} d_{j\omega} a_{ij\omega} x_{ij\omega} \leq (1+\alpha) v_\omega^*, \quad \omega \in \varOmega. {} \end{aligned} $$
(8.26)

As pointed out by Snyder and Daskin (2006), this model generalizes the well-known models proposed by Weaver and Church (1983) and Mirchandani et al. (1985). Snyder and Daskin (2006) also apply these ideas to the UFLP. They analyze the complexity of both problems (the α-robustness p-median problem and the α-robustness UFLP) and develop Lagrangean relaxation based procedures in order to compute lower and upper bounds for the problems. The final gaps are closed using branch-and-bound procedures.
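The effect of constraint (8.26) can be illustrated by enumeration on a toy instance. The sketch below is not the Lagrangean procedure of Snyder and Daskin (2006); it merely filters the location decisions by the relative-regret bound and then minimizes the expected cost (8.25). It assumes nearest-facility ex post allocation, and all names and data are hypothetical.

```python
from itertools import combinations

def cost(sites, demand, dist):
    """Weighted distance in one scenario with nearest-facility allocation."""
    return sum(d * min(dist[i][j] for i in sites) for j, d in enumerate(demand))

def alpha_robust_p_median(p, probs, demands, dists, alpha):
    """Return (expected cost, sites) or None if no decision satisfies (8.26)."""
    n = len(demands[0])
    scenarios = list(zip(demands, dists))
    candidates = list(combinations(range(n), p))
    v_star = [min(cost(s, d, a) for s in candidates) for d, a in scenarios]
    best = None
    for s in candidates:
        costs = [cost(s, d, a) for d, a in scenarios]
        # Constraint (8.26): cost in every scenario within (1 + alpha) of its optimum.
        if any(c > (1 + alpha) * v for c, v in zip(costs, v_star)):
            continue
        expected = sum(pi * c for pi, c in zip(probs, costs))
        if best is None or expected < best[0]:
            best = (expected, s)
    return best

# Hypothetical data: three nodes on a line, two equally likely scenarios.
dist = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
demands = [[10, 1, 1], [10, 10, 100]]
probs = [0.5, 0.5]
loose = alpha_robust_p_median(1, probs, demands, [dist, dist], alpha=8)
tight = alpha_robust_p_median(1, probs, demands, [dist, dist], alpha=4)
none = alpha_robust_p_median(1, probs, demands, [dist, dist], alpha=1)
print(loose, tight, none)
```

Tightening α trades expected cost for a bound on the regret in every scenario, and a sufficiently small α may leave no feasible location decision at all.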

All the robustness measures discussed and illustrated above involve all scenarios. When the number of scenarios is too high, the large-scale models obtained may become intractable. In this case, restricting the scenario set may be unavoidable. This was done by Daskin et al. (1997) who introduced the α-reliable minmax regret p-median problem. The authors seek to minimize the maximum regret over a subset of scenarios. This subset is referred to as the reliability set. It is built from the original set in such a way that the total probability associated with its scenarios is at least some pre-specified value α. As pointed out by Baron et al. (2011), this idea has a purpose similar to the use of ellipsoidal uncertainty: the exclusion of low-probability (typically extreme) scenarios. An extension of the above robustness measure was proposed by Chen et al. (2006), who introduced the α-reliable mean-excess regret. This measure weights the maximum regret over the reliability set and the conditional expectation of the regret over the scenarios not included in the reliability set.

A different robustness concept was introduced by Carrizosa and Nickel (2003) within the context of continuous facility location, although the concept can be extended to network or discrete problems. In that paper, nominal values are assumed to have been estimated for the (uncertain) weights of a set of nodes. A maximum value is preset for the weighted distance between a single facility to be located and the demand nodes. The robustness of a location is then defined as the minimum deviation of the vector of weights with respect to the nominal vector that renders that location infeasible. The goal of the problem is to find the most robust location. This yields a non-linear fractional model that the authors tackle by existing methods and by ad hoc procedures they propose in the paper.

One final aspect worth mentioning in this section regards the relevance of using a model like the ones described above, instead of a “simplified” deterministic model. When probabilities can be associated with the scenarios, we can measure this relevance by using the expected value of perfect information (EVPI). This is a value indicating how much the decision maker would be willing to pay to obtain perfect information. For an expected cost minimization problem, the EVPI is obtained by subtracting the weighted sum of the optimal values for all scenarios (using the probabilities as weights) from the minimum expected cost. The reader can refer to Kouvelis and Yu (1997) for further details.
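As a small numerical illustration of the EVPI for a cost minimization problem, the sketch below works on a table of costs indexed by decision and scenario. The table, the probabilities and the function name are hypothetical; in a location problem each row would correspond to one location decision evaluated under every scenario.

```python
def evpi(probs, cost_table):
    """EVPI = minimum expected cost minus the probability-weighted scenario optima.

    cost_table[s][w] is the cost of decision s under scenario w.
    """
    # Best single decision when the scenario is not known in advance.
    here_and_now = min(sum(p * c for p, c in zip(probs, row)) for row in cost_table)
    # Expected cost if the best decision could be chosen after observing the scenario.
    wait_and_see = sum(p * min(row[w] for row in cost_table)
                       for w, p in enumerate(probs))
    return here_and_now - wait_and_see

# Hypothetical data: three decisions, two equally likely scenarios.
probs = [0.5, 0.5]
cost_table = [[4, 310], [12, 210], [32, 50]]
value = evpi(probs, cost_table)
print(value)
```

The result is never negative, since choosing a decision per scenario can only lower the expected cost relative to committing to a single decision.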

4 Stochastic Facility Location Problems

A facility location problem under uncertainty can often be cast within a stochastic programming modeling framework if we know the joint probability distribution of the underlying random vector. In this case, we say that we are dealing with a stochastic facility location problem.

We start by considering the UFLP (8.15)–(8.19). In practice, several parameters in this model may be uncertain. This is the case of the distribution costs and of the demands. Let us assume that uncertainty can be measured probabilistically. In particular, denote by Ξ the random vector containing all the stochastic parameters (e.g., \(\varXi =\left ( (c_{ij})_{i \in I, \, j \in J},(d_{j})_{j \in J} \right )\)). Furthermore, suppose that we know the joint probability distribution of Ξ. Assuming ex ante location decisions, the model to be adopted will depend on the ex post decisions, namely on the moment in time at which the allocation or distribution decisions are to be implemented. If we have ex post allocation decisions, the following stochastic uncapacitated facility location problem with recourse can be considered:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{i \in I} f_i y_i + {\mathrm Q}(y) {} \end{aligned} $$
(8.27)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad \sum_{i \in I} y_i \geq 1 {} \end{aligned} $$
(8.28)
$$\displaystyle \begin{aligned} & \quad y_i \in \{0,1\},\quad i \in I, {} \end{aligned} $$
(8.29)

with \({\mathrm Q}(y)=\mathbb {E}_{\varXi }\left [ Q(y,\xi ) \right ]\), and Q(y, ξ) denoting the optimal value of the following problem:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{i \in I} \sum_{j \in J} c_{ij} d_j x_{ij} {} \end{aligned} $$
(8.30)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad \sum_{i \in I} x_{ij} = 1, \quad j \in J {} \end{aligned} $$
(8.31)
$$\displaystyle \begin{aligned} & \quad x_{ij} \leq y_i, \quad i \in I,\: j \in J {} \end{aligned} $$
(8.32)
$$\displaystyle \begin{aligned} & \quad x_{ij} \geq 0, \quad i \in I,\: j \in J. {} \end{aligned} $$
(8.33)

Model (8.30)–(8.33) is defined for every realization ξ of Ξ, i.e., for every realization of costs and demands. Accordingly, the allocation decisions xij (i ∈ I, j ∈ J), which do not appear in the first-stage problem, can change according to the different observations of the random vector. For this reason, they are referred to as recourse decisions. The variables yi associated with the location of the facilities correspond to ex ante (first-stage) decisions and hence must hold for all possible realizations of the random variables. The expectation defining the recourse function Q(y) implicitly conveys a neutral attitude of the decision maker toward risk. Later in this section, we discuss another possible attitude and the corresponding consequences from a modeling point of view. Finally, due to the presence of constraint (8.28), we are dealing with a problem that has relatively complete recourse, i.e., for every first-stage feasible solution yi (i ∈ I) there is at least one second-stage feasible solution xij (i ∈ I, j ∈ J) for every possible realization of the random quantities.

If we have a finite set of scenarios, say Ω, we can go further with the above model since we can consider scenario-indexed parameters and variables. Denote by cijω the unit cost for supplying customer j ∈ J from facility i ∈ I under scenario ω ∈ Ω, and let djω be the demand of customer j ∈ J under scenario ω ∈ Ω. If xijω is the fraction of the demand of customer j ∈ J satisfied from facility i ∈ I under scenario ω ∈ Ω, then we can consider the following extensive form of the deterministic equivalent:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{i \in I} f_i y_i + \sum_{\omega \in \varOmega} \pi_\omega \left( \sum_{i \in I} \sum_{j \in J} c_{ij\omega} d_{j\omega} x_{ij\omega} \right) {} \end{aligned} $$
(8.34)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad \mbox{(8.28), (8.29)} \\ & \quad \sum_{i \in I} x_{ij\omega} = 1, \quad j \in J,\: \omega \in \varOmega {} \end{aligned} $$
(8.35)
$$\displaystyle \begin{aligned} & \quad x_{ij\omega} \leq y_i,\quad i \in I, \: j \in J, \: \omega \in \varOmega {} \end{aligned} $$
(8.36)
$$\displaystyle \begin{aligned} & \quad x_{ij\omega} \geq 0, \quad i \in I, \: j \in J, \: \omega \in \varOmega. {} \end{aligned} $$
(8.37)

In the above model, the non-anticipativity principle is implicitly considered: each first-stage decision variable has the same value for all scenarios.
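For a handful of candidate sites, the extensive form (8.34)–(8.37) can be solved by enumerating the first-stage decisions. The sketch below assumes non-negative costs and demands, in which case the second-stage problem for a fixed y is solved by assigning each customer entirely to its cheapest open facility in each scenario. The function name and data are hypothetical.

```python
from itertools import combinations

def solve_stochastic_uflp(f, probs, c, d):
    """Return (expected total cost, open facilities) for the extensive form.

    f[i]: setup cost; probs[w]: scenario probability;
    c[w][i][j]: unit supply cost; d[w][j]: demand.
    """
    m = len(f)
    best = None
    for k in range(1, m + 1):                    # (8.28): at least one facility
        for open_set in combinations(range(m), k):
            setup = sum(f[i] for i in open_set)
            # Recourse: the same y is used in every scenario (non-anticipativity),
            # while the allocation is re-optimized scenario by scenario.
            recourse = sum(
                pi * sum(d_w[j] * min(c_w[i][j] for i in open_set)
                         for j in range(len(d_w)))
                for pi, c_w, d_w in zip(probs, c, d))
            total = setup + recourse
            if best is None or total < best[0]:
                best = (total, open_set)
    return best

# Hypothetical data: two sites, two customers, low- and high-demand scenarios.
f = [10, 10]
unit_cost = [[1, 4], [4, 1]]
c = [unit_cost, unit_cost]
d = [[2, 2], [10, 10]]
probs = [0.5, 0.5]
solution = solve_stochastic_uflp(f, probs, c, d)
print(solution)
```

In this example one facility would be enough if only the low-demand scenario could occur, but the expected-cost solution opens both.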

So far, facilities are assumed to be uncapacitated. When this is not the case, several adjustments are required. Denote by qi the capacity of a facility established at i ∈ I. A model for the capacitated stochastic facility location problem is obtained if we replace (8.32) with

$$\displaystyle \begin{aligned} \sum_{j \in J} d_j x_{ij} \leq q_i y_i, \quad i \in I. {} \end{aligned} $$
(8.38)

With the inclusion of these constraints, it may happen that for some first-stage feasible solution, no feasible completion exists in the second stage for one or several realizations of the random vector, i.e., the problem no longer has relatively complete recourse. This feasibility issue adds an extra difficulty to this stochastic programming problem. Infeasibility in the second stage is often an indication of an undesirable first-stage solution. A natural way of dealing with this issue is to penalize the non-satisfied demand, which makes sense from a practical point of view. In fact, such a penalty may correspond, for example, to a lost opportunity cost or to outsourcing. Denote by ψj the demand of customer j ∈ J which is not supplied from the open facilities and denote by μj the corresponding unit penalty cost. Note that ψj is also a random variable since it depends on the occurring realization of the random vector Ξ. We can still consider the first stage problem (8.27)–(8.29). However, the second stage problem must be rewritten as follows:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{i \in I} \sum_{j \in J} c_{ij} d_j x_{ij} + \sum_{j \in J} \mu_j \psi_j {} \end{aligned} $$
(8.39)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad \mbox{(8.33), (8.38)} \\ & \quad \sum_{i \in I} x_{ij} + \frac{\psi_j}{d_j} = 1, \quad j \in J {} \end{aligned} $$
(8.40)
$$\displaystyle \begin{aligned} & \quad \psi_j \geq 0, \quad j \in J. {} \end{aligned} $$
(8.41)

Again, if a finite set of scenarios exists, we can consider scenario-indexed recourse variables and parameters, and we can write the deterministic equivalent in its extensive form.
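The loss of relatively complete recourse in the capacitated case, before the shortage variables ψj are introduced, can be checked directly. Because the allocation variables are continuous fractions and any open facility may serve any customer, constraints (8.31), (8.33) and (8.38) admit a solution for positive demands exactly when the total capacity of the open facilities covers the total demand. The sketch below applies this test scenario by scenario; the function names and data are hypothetical.

```python
def has_feasible_recourse(y, q, d):
    """True if demands d can be fully served by the open facilities (y[i] == 1)."""
    open_capacity = sum(q_i for q_i, y_i in zip(q, y) if y_i)
    return open_capacity >= sum(d)

def first_stage_is_safe(y, q, demand_scenarios):
    """True if the second stage is feasible in every demand scenario."""
    return all(has_feasible_recourse(y, q, d) for d in demand_scenarios)

# Hypothetical data: two sites with capacity 8 each, two demand scenarios.
q = [8, 8]
demand_scenarios = [[3, 4], [6, 5]]
one_open = first_stage_is_safe([1, 0], q, demand_scenarios)
both_open = first_stage_is_safe([1, 1], q, demand_scenarios)
print(one_open, both_open)
```

With a single open facility the second scenario has no feasible completion, which is precisely the situation that the penalized shortage ψj in (8.39)–(8.41) is meant to absorb.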

In the capacitated model just described, capacities are exogenous. Louveaux (1986) considers a stochastic facility location problem with endogenous capacities. In particular, capacities must be set in advance before uncertainty is disclosed, i.e., they correspond to ex ante decisions. A unit cost gi is assumed for the capacity to be installed at location i ∈ I. Additionally, the author considers the existence of variable production costs at the facilities as well as revenues associated with demand satisfaction. Denote by rj the unit revenue obtained from customer j ∈ J. Additionally, assume that cij (i ∈ I, j ∈ J) includes the production costs. A new decision variable zi (i ∈ I) must be introduced, representing the capacity to be installed at location i ∈ I. With the inclusion of revenues, it is no longer necessary to consider constraint (8.28). Furthermore, it may not be rewarding to satisfy all the demand; the trade-off between revenues and costs will determine the best service level for each customer. The capacitated model formulated above can be easily adapted to the new conditions, leading to the model proposed by Louveaux (1986):

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{i \in I} f_i y_i + \sum_{i \in I} g_i z_i + {\mathrm Q}(y,z) {} \end{aligned} $$
(8.42)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad \mbox{(8.29)} \\ & \quad z_i \geq 0, \quad i \in I, {}\end{aligned} $$
(8.43)

with \({\mathrm Q}(y,z)=\mathbb {E}_{\varXi }\left [ Q(y,z,\xi ) \right ]\), and Q(y, z, ξ) denoting the optimal value of the following problem:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{i \in I} \sum_{j \in J} \left( c_{ij} - r_j \right) d_j x_{ij} {} \end{aligned} $$
(8.44)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad \sum_{i \in I} x_{ij} \leq 1, \quad j \in J {} \end{aligned} $$
(8.45)
$$\displaystyle \begin{aligned} & \quad \mbox{(8.32), (8.33)} \\ & \quad \sum_{j \in J} d_j x_{ij} \leq z_i,\quad i \in I. {}\end{aligned} $$
(8.46)

Considering the above problem, Louveaux and Peeters (1992) assume that stochasticity is captured by a finite number of scenarios and propose a dual-based procedure for tackling the extensive form of the deterministic equivalent.

A different type of model emerges when the distribution decisions (represented by x-variables) become first-stage decisions. In this case, penalties are paid in the second stage for surplus or shortage inventory. In addition to the notation already presented, we denote by ϕj the inventory surplus at customer j ∈ J and by λj the corresponding unit cost. Assuming deterministic distribution costs (they are now associated with an ex ante decision), we can formulate the stochastic facility location problem as follows:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{i \in I} f_i y_i + \sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij} + {\mathrm Q}(x) {} \\ \mbox{subject to} & \quad \mbox{(8.29), (8.32), (8.33)}, \end{aligned} $$
(8.47)

with \({\mathrm Q}(x)=\mathbb {E}_{\varXi }\left [ Q(x,\xi ) \right ]\), and Q(x, ξ) denoting the optimal value of the following problem:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{j \in J} \lambda_j \phi_j + \sum_{j \in J} \mu_j \psi_j {} \end{aligned} $$
(8.48)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad \psi_j - \phi_j = d_j \left( 1- \sum_{i \in I} x_{ij} \right), \qquad j \in J {} \end{aligned} $$
(8.49)
$$\displaystyle \begin{aligned} & \quad \psi_j, \phi_j \geq 0, \qquad j \in J. {} \end{aligned} $$
(8.50)
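
For a fixed first-stage allocation x and a finite set of scenarios, the recourse problem (8.48)–(8.50) separates by customer and can be evaluated in closed form, provided that the unit costs λj and μj are nonnegative (an assumption made here): the imbalance dj(1 − ∑i∈I xij) is charged at μj when it is positive and at λj when it is negative. The following Python sketch illustrates this observation; the function name, the scenario format and the toy data are hypothetical.

```python
def expected_recourse(x, scenarios, lam, mu):
    """Expected value of (8.48)-(8.50) for a fixed first-stage allocation x.

    x[i][j]   : fraction of customer j's demand planned from facility i
    scenarios : list of (probability, demand vector) pairs
    lam, mu   : unit surplus and shortage costs (assumed nonnegative)
    """
    n_customers = len(lam)
    # Planned coverage of each customer, summed over the facilities.
    coverage = [sum(row[j] for row in x) for j in range(n_customers)]
    total = 0.0
    for prob, demand in scenarios:
        cost = 0.0
        for j in range(n_customers):
            imbalance = demand[j] * (1.0 - coverage[j])  # psi_j - phi_j
            shortage = max(imbalance, 0.0)               # psi_j
            surplus = max(-imbalance, 0.0)               # phi_j
            cost += mu[j] * shortage + lam[j] * surplus
        total += prob * cost
    return total


# Toy data: two facilities, two customers, two equiprobable scenarios.
x = [[0.5, 1.0],
     [0.25, 0.25]]
scenarios = [(0.5, [8.0, 4.0]), (0.5, [16.0, 8.0])]
value = expected_recourse(x, scenarios, lam=[1.0, 2.0], mu=[4.0, 6.0])
```

In this toy instance the first customer is under-covered and the second over-covered, so both penalty terms contribute to the expected recourse cost.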

Capacities can be easily included in the above model leading to the so-called stochastic transportation-location problem which has been investigated by several authors (e.g., França and Luna 1982 and Holmberg and Tuy 1999).

So far in this section, we have assumed that the allocation and distribution decisions are made simultaneously (the latter determining the former), either after or before uncertainty is disclosed. Nevertheless, in some problems these decisions are made separately. Let us assume that the allocation of the customers to the facilities is a here-and-now decision but the exact quantities to ship from the facilities to the customers are to be decided after uncertainty is revealed. This situation is motivated, for instance, by logistics applications, when a contract has to be previously signed, determining a priori the distribution channels but leaving the shipping quantities dependent on the observed values of the stochastic parameters. The same type of situation occurs in service-providing companies that need to segment the customers a priori by allocating each customer to a server or facility. In this case, we need to explicitly consider allocation decision variables. In particular, we use the binary variable wij equal to 1 if and only if customer j ∈ J is allocated to facility i ∈ I. The single-allocation version of the problem was introduced by Laporte et al. (1994), who proposed the following optimization model:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{i \in I} f_i y_i + \sum_{i \in I} \sum_{j \in J} b_{ij} w_{ij} + {\mathrm Q}(w) {} \end{aligned} $$
(8.51)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad w_{ij} \leq y_i, \quad i \in I,\: j \in J {} \end{aligned} $$
(8.52)
$$\displaystyle \begin{aligned} & \quad \sum_{i \in I} w_{ij} \leq 1, \quad j \in J {} \end{aligned} $$
(8.53)
$$\displaystyle \begin{aligned} & \quad y_i, w_{ij} \in \{0,1\}, \quad i \in I,\: j \in J, {} \end{aligned} $$
(8.54)

with \({\mathrm Q}(w)=\mathbb {E}_{\varXi }\left [ Q(w,\xi ) \right ]\), and Q(w, ξ) denoting the optimal value of the following problem:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{i \in I} \sum_{j \in J} \left( c_{ij} - r_j \right) d_j x_{ij} {} \end{aligned} $$
(8.55)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad x_{ij} \leq w_{ij}, \quad i \in I,\: j \in J {} \end{aligned} $$
(8.56)
$$\displaystyle \begin{aligned} & \quad \sum_{j \in J} d_j x_{ij} \leq q_i, \quad i \in I {} \end{aligned} $$
(8.57)
$$\displaystyle \begin{aligned} & \quad x_{ij} \geq 0, \quad i \in I,\: j \in J. {} \end{aligned} $$
(8.58)

In the above model, bij is a fixed cost for allocating customer j ∈ J to facility i ∈ I. The remaining notation was introduced earlier. Note that in this problem, facilities are capacitated. Moreover, a service level of 100% is not imposed: a customer may be left unserved by the system (constraints (8.53)). Laporte et al. (1994) consider a finite set of scenarios for capturing the stochasticity and solve the extensive form of the deterministic equivalent using the integer L-shaped method previously proposed by Laporte and Louveaux (1993).
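
Because each customer is allocated to at most one facility (constraints (8.53)), the second-stage problem (8.55)–(8.58) separates by facility once w and a scenario are fixed, and each subproblem is a continuous knapsack problem that a greedy rule solves. The Python sketch below illustrates this observation for a single facility; it is not the integer L-shaped method itself, and the function name and the toy data are hypothetical.

```python
def second_stage_value(assigned, margin, demand, capacity):
    """Optimal value of (8.55)-(8.58) for one facility i and one scenario.

    assigned : indices j with w_ij = 1 for this facility
    margin   : margin[j] = r_j - c_ij, net unit profit of serving customer j
    demand   : demand[j] = d_j in the scenario considered
    capacity : q_i
    Returns (objective value, {j: x_ij}).
    """
    x = {j: 0.0 for j in assigned}
    remaining = capacity
    value = 0.0
    # Serve the most profitable customers first; stop at unprofitable ones.
    for j in sorted(assigned, key=lambda j: margin[j], reverse=True):
        if margin[j] <= 0 or remaining <= 0:
            break
        if demand[j] <= 0:
            continue
        shipped = min(demand[j], remaining)   # d_j * x_ij
        x[j] = shipped / demand[j]
        value -= margin[j] * shipped          # objective is cost minus revenue
        remaining -= shipped
    return value, x


# Toy data: three customers allocated to a facility with capacity 10.
value, x = second_stage_value(
    assigned=[0, 1, 2],
    margin=[5.0, 3.0, -1.0],
    demand=[6.0, 8.0, 4.0],
    capacity=10.0,
)
```

Here the first customer is fully served, the second receives the remaining capacity, and the third is not served because its margin is negative.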

In line with the idea of allocating the customers before uncertainty is disclosed, Albareda-Sambola et al. (2011) consider Bernoulli demands, which represent a possible request for some service. This is an example of a problem in which the presence or absence of customers is itself a source of uncertainty. The problem, which we revisit next, shows that deriving a compact model for the deterministic equivalent problem is not always as straightforward (or even possible) as it might seem at first glance from the contents presented so far in this section.

In the problem studied by Albareda-Sambola et al. (2011), there is a limited capacity for the facilities in terms of the number of customers that can be served. In particular, for each facility i ∈ I, there is a maximum number qi of customers who can be served from the facility. Due to the uncertainty in the demand, it makes sense to allocate a priori to some facility more customers than the service capacity. However, depending on how uncertainty is revealed, it may turn out that a facility has a number of requests for service above its capacity. In this case, outsourcing is considered and the corresponding cost is paid. An important assumption in many logistics systems that the authors also consider is that, for each facility i ∈ I, there should be a minimum number ℓi of customers allocated to it to justify its establishment. The problem can be conceptually formulated as follows:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{i \in I} f_i y_i + \mathbb{E}_{\varXi} \left[ \mbox{Service cost} + \mbox{Outsourcing cost} \right] {} \end{aligned} $$
(8.59)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad \sum_{i \in I} x_{ij} = 1, \quad j \in J {} \end{aligned} $$
(8.60)
$$\displaystyle \begin{aligned} & \quad x_{ij} \leq y_i, \quad i \in I,\: j \in J {} \end{aligned} $$
(8.61)
$$\displaystyle \begin{aligned} & \quad \ell_i y_i \leq \sum_{j \in J} x_{ij}, \quad i \in I {} \end{aligned} $$
(8.62)
$$\displaystyle \begin{aligned} & \quad y_i, x_{ij} \in \{0,1\}, \quad i \in I,\: j \in J. {} \end{aligned} $$
(8.63)

Denote by ξj the demand of customer j ∈ J, which is assumed to be a random variable following a Bernoulli distribution with parameter pj. For each first-stage solution, denote by zi the number of customers assigned to facility i ∈ I (i.e., zi = ∑j∈J xij) and denote by ηi the random variable representing the number of customers who request the service (referred to as demand customers) among those assigned to facility i ∈ I (i.e., ηi = ∑j∈J ξj xij). Note that the probability distribution of ηi is quite involved since it depends on the actual values of xij (j ∈ J). Denote by \(\mathbb {P}_x(\eta _i=s)\) the probability that ηi is equal to s (s = 0, …, zi).

Albareda-Sambola et al. (2011) investigate two possible outsourcing actions. We focus on the so-called customer outsourcing. In this case, when the number of customers allocated to some facility i ∈ I requesting the service (demand customers) exceeds qi, ηi − qi customers have to be served directly from an external source. A FIFO policy is assumed for deciding which customers to serve from the facility and which ones to outsource. The cost for supplying each outsourced customer is denoted by gi and depends on the facility to which the customer was originally assigned. Denote by \(\mathbb {P}_i(s)\) the conditional probability of serving a demand customer assigned to facility i ∈ I given that the total number of demand customers assigned to the facility is s (i.e., ηi = s). For s ≥ 1, we have \(\mathbb {P}_i(s) = (1/s) \times \min \{q_i,s\}\).

The recourse function can be written as the sum of the expected service cost plus the expected outsourcing cost. These terms can be computed as follows:

$$\displaystyle \begin{aligned} \mathbb{E}_\xi (\mbox{service cost}) &= \sum_{i \in I} \sum_{s=0}^{z_i} \mathbb{P}_x(\eta_i=s) \times \mathbb{E}(\mbox{Service cost} | \eta_i=s) \\ & = \sum_{i \in I} \sum_{s=0}^{z_i} \left[ \mathbb{P}_x(\eta_i=s) \sum_{j \in J} \mathbb{P}(\xi_j=1 | \eta_i=s) \mathbb{P}_i(s) c_{ij} x_{ij} \right], {} \end{aligned} $$
(8.64)
$$\displaystyle \begin{aligned} \mathbb{E}_\xi (\mbox{Outsourcing cost}) &= \sum_{i \in I} \sum_{s=0}^{z_i} \mathbb{P}_x(\eta_i=s) \times \mathbb{E}_\xi (\mbox{Outsourcing cost}|\eta_i=s) \\ & = \sum_{i \in I} g_i \left( \sum_{s=q_i+1}^{z_i} \mathbb{P}_x(\eta_i=s) (s-q_i) \right). {} \end{aligned} $$
(8.65)

A close look at the above expressions reveals that even for tiny instances of the problem they are not tractable. In fact, the number of scenarios is huge even for a small number of customers because a scenario is defined not only by the set of customers requesting the service but also by the order in which the requests arrive. Nevertheless, for the homogeneous case, i.e., pj = p, j ∈ J, it is possible to go further and derive a compact formulation for the deterministic equivalent, as we show next.

When all the customers have the same probability of requesting the service, then ηi follows a binomial distribution with parameters zi and p. Thus, \(\mathbb {P}_x(\eta _i=s)= \binom {z_i}{s}p^s(1-p)^{z_i-s}\), s = 0, …, zi. We denote by ζtps the probability that a binomial random variable with parameters t and p takes the value s. In the homogeneous case, writing t = zi for the number of customers assigned to facility i, it is straightforward to show that \(\mathbb {P}(\xi _j=1 | \eta _i=s)=s/t\) for every customer j assigned to facility i, and consequently \(\mathbb {P}(\xi _j=1 | \eta _i=s) \times \mathbb {P}_i(s)=\min \{q_i,s\}/t\), which depends on x only through t. Accordingly, the expected service cost (8.64) can be written as

$$\displaystyle \begin{aligned}\sum_{i \in I} \sum_{j \in J} \left( c_{ij} x_{ij} \sum_{s=0}^{z_i} \zeta_{z_ips} \frac{\min\{q_i,s\}}{z_i} \right).\end{aligned}$$

A deterministic equivalent can now be obtained by discretizing the location and allocation variables accounting for the number of customers allocated to a facility. In particular, define \(y_i^t\) as a binary variable equal to 1 if a facility is located at i ∈ I and t customers in total are allocated to it (t = ℓi, …, |J|) and 0 otherwise. Also define \(x_{ij}^t\) as a binary variable equal to 1 if and only if customer j ∈ J is allocated to facility i ∈ I which has t customers allocated to it (t = ℓi, …, |J|). Using the new variables, we can formulate a deterministic equivalent problem:

$$\displaystyle \begin{aligned} \mbox{Minimize} \:\: & \quad \sum_{i \in I} \sum_{t=\ell_i}^{|J|} y_i^t g_i \left[ \sum_{s=q_i+1}^{t} \zeta_{tps}(s-q_i) \right] \\ & \qquad + \sum_{i \in I} \sum_{j \in J} \left( c_{ij} \sum_{t=\ell_i}^{|J|} x_{ij}^t \left[ \sum_{s=0}^t \zeta_{tps} \frac{\min\{q_i,s\}}{t} \right] \right) {} \end{aligned} $$
(8.66)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad \sum_{i \in I} \sum_{t=\ell_i}^{|J|} x_{ij}^t = 1, \quad j \in J {} \end{aligned} $$
(8.67)
$$\displaystyle \begin{aligned} & \quad \sum_{j \in J} x_{ij}^t = t y_i^t, \quad i \in I,\: t=\ell_i,\ldots,|J| {} \end{aligned} $$
(8.68)
$$\displaystyle \begin{aligned} & \quad \sum_{t=\ell_i}^{|J|} y_i^t \leq 1, \quad i \in I {} \end{aligned} $$
(8.69)
$$\displaystyle \begin{aligned} & \quad y_i^t \in \{0,1\}, \quad i \in I,\: t=\ell_i,\ldots,|J| {} \end{aligned} $$
(8.70)
$$\displaystyle \begin{aligned} & \quad x_{ij}^t \in \{0,1\}, \quad i \in I,\: j \in J,\: t=\ell_i,\ldots,|J|. {} \end{aligned} $$
(8.71)
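
The objective coefficients of (8.66) depend only on t, p and qi, so they can be computed before the model is built. The Python sketch below shows one way of doing so with the standard library; the function names are hypothetical, and t ≥ 1 is assumed.

```python
from math import comb


def binom_pmf(t, p, s):
    """zeta_{tps}: probability that a Binomial(t, p) variable equals s."""
    return comb(t, s) * p**s * (1.0 - p)**(t - s)


def expected_outsourced(t, p, q):
    """Bracketed factor of g_i y_i^t in (8.66): sum_{s>q} zeta_{tps} (s - q)."""
    return sum(binom_pmf(t, p, s) * (s - q) for s in range(q + 1, t + 1))


def service_fraction(t, p, q):
    """Bracketed factor of c_ij x_ij^t in (8.66): sum_s zeta_{tps} min{q, s} / t."""
    return sum(binom_pmf(t, p, s) * min(q, s) for s in range(t + 1)) / t


# Toy values: t = 2 customers allocated, p = 0.5, capacity q = 1.
outsourced = expected_outsourced(2, 0.5, 1)
fraction = service_fraction(2, 0.5, 1)
```

A convenient sanity check is that served plus outsourced customers add up to the expected number of demand customers, i.e., t × `service_fraction` + `expected_outsourced` = t p.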

Albareda-Sambola et al. (2011) show that using a general solver, instances of the problem with a realistic size can be solved within an acceptable CPU time using this model. The authors also explore the advantages of the homogeneous case for the alternative outsourcing action they consider. This work would be later extended in two different ways. Bieniek (2015) showed that tractable expressions can be obtained for the recourse functions when other probability distributions are considered (not necessarily discrete) as long as the assumption of homogeneity among customers is kept. Albareda-Sambola et al. (2017) proposed a heuristic algorithm for tackling the general problem (heterogeneous demand probabilities). The procedure consists of two phases. First, a GRASP algorithm is used for building two pools of solutions—one based upon quality and another upon diversity. Second, a path relinking procedure is devised for connecting solutions from both pools hoping that better feasible solutions can be found during the process.

In all of the above models, the recourse function is the expected value of the second-stage problem. As mentioned before, this conveys a neutral attitude of the decision maker towards risk. Location decisions are often strategic and involve significant investments. Accordingly, a risk-averse attitude cannot be disregarded. One way of capturing such an attitude consists of applying a Markowitz-type objective in which the recourse function is expanded to account for variance. Taking, as an example, model (8.27)–(8.33), this consists of defining

$$\displaystyle \begin{aligned} {\mathrm Q}(y) = \mathbb{E}_{\varXi}\left[ Q(y,\xi) \right] - \lambda \mbox{Var}_{\varXi} \left[ Q(y,\xi) \right]. {} \end{aligned} $$
(8.72)

Such a modeling framework in facility location is far from new (see Jucker and Carlson 1976). Nevertheless, this type of model has a clear disadvantage: it often results in a non-linear large-scale mixed-integer model. Different possibilities for overcoming this difficulty are discussed by Louveaux (1993).

Stochastic programming approaches for discrete facility location problems have attracted much attention in recent years. Some papers not mentioned so far include those by Ravi and Sinha (2004), Lin (2009), Wang et al. (2011), Kiya and Davoudpour (2012), and Álvarez-Miranda et al. (2015).

Hybridizing stochastic programming with robust optimization has also been considered in the context of facility location. Alumur et al. (2012) explored this possibility by using a robustness measure embedded within a stochastic programming modeling framework. The authors apply the idea to a hub location problem. Uncertainty is associated with two sets of parameters. In both cases, it is captured by a finite set of scenarios. For one set of parameters, probabilistic information is assumed to be known, which is not the case for the other set. The authors propose a so-called robust-stochastic model: for each scenario associated with the parameters that have no probabilistic information associated with them, a stochastic program is formulated, capturing the uncertainty associated with the other set of parameters (those for which probabilistic information exists). A minmax regret formulation is then proposed for the overall problem.

Another work combining the flavor of two-stage stochastic programming with robust optimization is due to Marques and Dias (2018), who study a multi-period facility location problem. Uncertainty is associated with fixed and assignment costs as well as with the customers that exist in each period. The authors seek the minimization of the total expected cost but impose a constraint on the maximum regret allowed in each scenario.

In the context of logistics systems, with particular emphasis on logistics network design, we can also observe increasing attention being paid to stochastic facility location problems (see Chap. 16 for further details). We can refer, among others, to Aghezzaf (2005), Listeş and Dekker (2005), Mo and Harrison (2005), Romauch and Hartl (2005), Pan and Nagi (2010), Fonseca et al. (2010), and Nickel et al. (2012).

One work worth pointing out is that of Hinojosa et al. (2014), who studied a stochastic facility location problem with location decisions made at an operational level, i.e., location decisions are ex post decisions. The multi-product problem considered in that paper arises in the context of logistics systems. As in some of the above problems, the available distribution channels correspond to a decision made before demand is known and result from some contract or option. Furthermore, due to the limited capacity at the facilities, the distribution channels contracted in advance may turn out to be insufficient for covering the demand that occurs. In this case, a penalty is incurred (corresponding, e.g., to a “last minute” and thus more expensive contract, to an outsourcing action, or simply to an opportunity loss cost). The location decisions correspond to the “activation” of existing equipment or facilities from which the commodities will be shipped to the customers. Accordingly, this becomes a decision that can be made only after demand is revealed. The authors formulate the extensive form of the deterministic equivalent and solve it for instances with a realistic size using a general solver. The single-commodity version of this problem was later investigated by Fernández et al. (2019) from the perspective of a risk-averse decision maker. In particular, the conditional value at risk is to be minimized.

As in the preceding section, when using a stochastic programming model, it is important to evaluate its relevance compared to a simpler deterministic one. Although no robust measure exists for assessing such relevance, two measures are often used to provide an indication of it: the EVPI and the value of the stochastic solution (VSS). The EVPI is computed as described in Sect. 8.3. To obtain it, we have to solve the distributional problem (i.e., to find the optimal value of the single-scenario problem for every scenario). In many cases this is cumbersome, namely when the number of scenarios is large or even infinite. The VSS emerges as an alternative and can be obtained in two steps: (1) the expected value problem is solved, i.e., the deterministic problem obtained when the random variables are replaced by their expectations; (2) the solution obtained in (1) is evaluated in the stochastic problem, and the difference between the resulting expected cost and the optimal value of the stochastic problem is computed. This difference gives the VSS (the reader should refer to Birge and Louveaux 2011 for further details).
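
Both measures can be illustrated on a deliberately tiny instance solved by enumeration. In the Python sketch below, the model (facilities with a fixed cost and a capacity, a single uncertain total demand, and a unit penalty for unmet demand) and all data are hypothetical and serve only to show how the quantities involved are computed for a minimization problem.

```python
from itertools import product

# Hypothetical toy data: fixed costs, capacities, unit shortage penalty.
FIXED = [10.0, 14.0]
CAP = [8.0, 16.0]
PENALTY = 3.0
SCENARIOS = [(0.5, 2.0), (0.5, 14.0)]   # (probability, total demand)


def cost(y, demand):
    """Fixed cost of the open facilities plus the penalty for unmet demand."""
    fixed = sum(f for f, is_open in zip(FIXED, y) if is_open)
    capacity = sum(q for q, is_open in zip(CAP, y) if is_open)
    return fixed + PENALTY * max(demand - capacity, 0.0)


def expected_cost(y):
    return sum(p * cost(y, d) for p, d in SCENARIOS)


ALL_Y = list(product([0, 1], repeat=len(FIXED)))

# Stochastic problem: minimize the expected cost over first-stage decisions.
rp = min(expected_cost(y) for y in ALL_Y)

# Wait-and-see value: solve every single-scenario problem separately.
ws = sum(p * min(cost(y, d) for y in ALL_Y) for p, d in SCENARIOS)

# Expected value problem: replace the demand by its mean, then evaluate
# the resulting solution against all scenarios.
mean_demand = sum(p * d for p, d in SCENARIOS)
y_ev = min(ALL_Y, key=lambda y: cost(y, mean_demand))
eev = expected_cost(y_ev)

evpi = rp - ws   # expected value of perfect information
vss = eev - rp   # value of the stochastic solution
```

In this instance the expected value problem opens only the small facility, which performs poorly in the high-demand scenario; this is what makes the VSS strictly positive.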

5 Chance-Constrained Facility Location Problems

One important class of optimization problems under uncertainty includes chance-constrained problems. The idea is that one or several constraints of the problem are not required to always hold. Instead, the decision maker is satisfied if they hold with some given probability. This type of constraint may be relevant when dealing with reliability issues.

In the particular case of a facility location problem, if demand is uncertain but the decision maker still wants to plan for satisfying all the demand whatever it may turn out to be, the resulting solution may call for an operational capacity far above the demand level that is eventually observed. In such a situation, one alternative is to plan for ensuring a certain service level, i.e., ensuring that with some pre-specified probability, the overall demand does not exceed the capacity of the operating facilities.

In order to exemplify this paradigm, we consider the classical single-source capacitated facility location problem. Assume that fixed costs are associated with the location of the facilities and also with the allocation of customers to the facilities. Additionally, assume that facility i ∈ I has capacity qi, and that demands dj (j ∈ J) are stochastic. We can formulate a capacitated facility location problem with a service level constraint as follows:

$$\displaystyle \begin{aligned} \mbox{Minimize} & \quad \sum_{i \in I} f_i y_i + \sum_{i \in I} \sum_{j \in J} c_{ij} x_{ij} {} \end{aligned} $$
(8.73)
$$\displaystyle \begin{aligned} \mbox{subject to} & \quad \mbox{(8.16)--(8.18)} \\ & \quad \mathbb{P} \left[ \sum_{j \in J} d_j x_{ij} \leq q_i y_i \right] \geq \alpha_i, \quad i \in I {} \end{aligned} $$
(8.74)
$$\displaystyle \begin{aligned} & \quad x_{ij} \in \{0,1\}, \quad i \in I,\: j \in J. \end{aligned} $$
(8.75)

For every i ∈ I, the corresponding chance constraint requires qiyi to be at least the αi-quantile of the distribution of the demand assigned to facility i. In other words, the constraint stipulates that the probability of observing a demand assigned to the facility not exceeding the capacity of the facility is at least αi. Typically, high values are assumed for αi (e.g., 0.90 or 0.95).

One desirable feature of such a model is the possibility of finding a deterministic equivalent formulation, i.e., replacing the probabilistic constraints by deterministic (equivalent) ones. Unfortunately, this is not always straightforward. One successful example for the problem we are considering is due to Lin (2009). The author assumes independent demands following a Poisson or a Gaussian distribution. For illustrative purposes, we detail the former case.

If the demands dj are independent and follow a Poisson distribution P(λj), j ∈ J, then the total demand assigned to facility i ∈ I, i.e., ∑j∈J dj xij, follows a Poisson distribution P(μi) with μi = ∑j∈J λj xij. Accordingly, (8.74) becomes equivalent to

$$\displaystyle \begin{aligned} \sum_{\ell=0}^{q_i y_i} e^{-\mu_i} \frac{\mu_i^\ell}{\ell!} \geq \alpha_i, \quad i \in I {} \end{aligned} $$
(8.76)

which, in turn, has a deterministic equivalent of the form

$$\displaystyle \begin{aligned} \sum_{j \in J} \lambda_j x_{ij} \leq \nu_i y_i, \quad i \in I. {} \end{aligned} $$
(8.77)

In this model, \(\nu _i = \mathbb {E} \left [ \varUpsilon \right ]\), where Υ is a random variable following a Poisson distribution whose expectation is the largest value ensuring that \(\mathbb {P} (\varUpsilon \leq q_i) \geq \alpha _i\). As detailed by Lin (2009), the value νi can easily be obtained by a search method in which the mean of Υ is changed until P(Υ ≤ qi) is approximately equal to αi (i ∈ I). After replacing the probabilistic constraints (8.74) with (8.77), the resulting problem becomes a single-source capacitated facility location problem which can be tackled by any appropriate method (see Chap. 4). Lin (2009) also explores the possibility of having independent demands following a Gaussian distribution. In this case, the deterministic equivalent of the probabilistic constraints yields a non-convex feasible region. The author proposes a relaxation for the problem, which is used as part of a heuristic.
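
Since P(Υ ≤ qi) decreases as the mean of Υ increases, the value νi can be located by bisection. The Python sketch below is one possible realization of such a search, not necessarily the one used by Lin (2009); the function names and the tolerance are assumptions, and qi is taken to be a nonnegative integer.

```python
from math import exp


def poisson_cdf(q, mean):
    """P(Y <= q) for Y ~ Poisson(mean), with q a nonnegative integer."""
    term = exp(-mean)          # probability of the value 0
    total = term
    for ell in range(1, q + 1):
        term *= mean / ell     # e^{-mean} mean^ell / ell!
        total += term
    return total


def largest_mean(q, alpha, tol=1e-9):
    """Largest mean nu (up to tol) with P(Poisson(nu) <= q) >= alpha.

    Relies on the cdf being decreasing in the mean; assumes 0 < alpha < 1.
    """
    lo, hi = 0.0, 1.0
    while poisson_cdf(q, hi) >= alpha:   # expand until hi is infeasible
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poisson_cdf(q, mid) >= alpha:
            lo = mid                     # mid still satisfies the constraint
        else:
            hi = mid
    return lo


# Toy values: capacity q_i = 10 and service level alpha_i = 0.95.
nu = largest_mean(q=10, alpha=0.95)
```

The returned value can then be used as the right-hand side coefficient νi in (8.77).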

A well-known facility location problem with chance constraints is the covering-location problem proposed by ReVelle and Hogan (1989). The authors assume that a server may be busy when a customer requests to be served. Let us denote by π the probability that this occurs. In a discrete covering-location problem, we have a set of potential locations for the facilities (see Chap. 5). A customer is said to be covered if a facility is established within a maximum distance or travel time specified in advance. Accordingly, for each customer, we can find the subset of potential locations for the facilities which cover the customer. The goal is to cover all the demand while minimizing the number of facilities installed. The “classical” covering constraints are

$$\displaystyle \begin{aligned} \sum_{i \in I_j} y_i \geq 1, \quad j \in J, {} \end{aligned} $$
(8.78)

where Ij denotes the set of locations covering customer j ∈ J. The probabilistic version of these constraints is the following:

$$\displaystyle \begin{aligned} \mathbb{P} \left[ \mbox{At least one location is available for serving customer}\ j \right] \geq \alpha, \quad j \in J. {} \end{aligned} $$
(8.79)

These constraints have as a deterministic equivalent,

$$\displaystyle \begin{aligned} \sum_{i \in I_j} y_i \geq \beta, \quad j \in J, {} \end{aligned} $$
(8.80)

with \(\beta = \lceil \ln (1-\alpha ) / \ln \pi \rceil \). In fact, the probability that no location among those covering customer j ∈ J is available to serve the customer immediately is given by \(\pi ^{\sum _{i \in I_j} y_i}\). Therefore, the probability that at least one location among those covering customer j ∈ J can serve it immediately is given by \(1-\pi ^{\sum _{i \in I_j} y_i}\) which, together with (8.79) leads to the deterministic equivalent just presented.
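
As a quick numerical check of the expression for β, the following Python sketch computes it for hypothetical values of π and α (with 0 < π < 1 and 0 < α < 1) and keeps the result for verification against the underlying probability; the function name is illustrative only.

```python
from math import ceil, log


def min_cover_count(pi, alpha):
    """Smallest integer beta with 1 - pi**beta >= alpha (0 < pi, alpha < 1)."""
    return ceil(log(1.0 - alpha) / log(pi))


# Toy values: each server is busy with probability 0.3; target level 0.95.
beta = min_cover_count(pi=0.3, alpha=0.95)
```

With these values, β facilities covering a customer are enough to reach the required availability, whereas β − 1 are not.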

For other applications of facility location problems with chance constraints we refer the reader to Kınay et al. (2018, 2019) as well as to the references therein.

6 Challenges and Further Readings

Despite all the work we can find focusing on facility location problems under uncertainty, many challenges still exist. In this section, we provide the reader with some notes on relevant issues not discussed in the previous pages, and we give suggestions for further readings.

6.1 Multi-Stage Stochastic Programming Models

In most of the stochastic facility location problems discussed above, a single moment in time for uncertainty to be disclosed was assumed. In many situations, this is not the case. Instead, we may observe uncertainty being progressively revealed in a succession of points in time. When this is the case, the two-stage stochastic programming modeling framework discussed in Sect. 8.4 is no longer appropriate, and a multi-stage setting is required. Nickel et al. (2012) address one such case by considering a multi-period facility location problem with service level and investment decisions. The demand as well as the rates of return for the investments are uncertain. Uncertainty is captured via a scenario tree. In addition to minimizing the overall cost, the problem seeks to minimize the downside risk. The deterministic equivalent problem is formulated in its extensive form and solved using a general solver. Other works addressing multi-stage stochastic facility location problems include that of Hernández et al. (2012), who consider a multi-period problem with stochastic demands. The problem consists of (1) determining the locations and dimensions of a preset number of new jails in Chile; (2) deciding when and where to expand the existing capacity. The goal is to minimize the total expected costs of the system. A large-scale model is obtained and solved approximately using a heuristic that combines branch-and-fix coordination (Alonso-Ayuso et al. 2003) and branch-and-bound. Albareda-Sambola et al. (2013) propose a so-called fix-and-relax coordination approximation procedure for tackling a multi-period facility location problem with uncertainty in the costs and in the customers’ requests for service. This work was complemented by Escudero et al. (2018), who developed two matheuristics for the problem. One is based upon cluster Lagrangean decomposition (Escudero et al. 2016) whereas the other is based upon a so-called sequential partial linear relaxation, a scheme that optimizes a decreasing stage-based relaxation of the integrality constraints of the variables for obtaining tighter lower bounds to the original problem.

Taking the previous works into account, one might think that a stochastic multi-period facility location problem necessarily leads to a multi-stage stochastic programming problem. However, this is not true. In some cases, the strategic multi-period decisions can be seen as first-stage decisions in a two-stage stochastic programming modeling framework. For instance, we may decide here-and-now how the location of the facilities will occur during the entire planning horizon. In the second stage problem, the operational decisions will be made, which can adapt to the different realizations of the uncertainty. Works exploring this possibility in the context of facility location include those by Ahmed and Garcia (2004), Aghezzaf (2005), Correia et al. (2018), and Marques and Dias (2018).

6.2 Algorithms

Most facility location problems under uncertainty are NP-hard since they generalize well-known NP-hard problems. In particular, this is true for the discrete problems that have been discussed in this chapter. In these cases, either the size of an instance to be solved is such that the resulting model is manageable by a general solver, or one must resort to techniques from combinatorial optimization and integer programming, such as heuristics and relaxation-based procedures.

Regarding robust facility location problems, the minmax structure often considered makes them harder to solve than the corresponding minisum deterministic problems. The reader can refer to Snyder (2006) for a deeper discussion of this issue. That paper presents a sketch of the procedure typically followed for tackling minmax regret problems. Although some general procedures have been proposed for such problems (e.g., Mausser and Laguna 1998, for minmax regret linear problems with interval uncertainty) in most cases, tailored procedures, exact or approximate, must be developed to efficiently tackle the problems. Analytic results and polynomial time algorithms have also been proposed but only for problems with an underlying structure, such as a network.

As far as stochastic discrete facility location problems are concerned, again, they are often difficult to solve to optimality. Even when the number of scenarios is finite and a compact model can be derived for the extensive form of the deterministic equivalent, realistic instances often induce a large-scale mixed-integer linear programming problem not manageable by a general solver. In this case, specific algorithms, exact or heuristic, have to be developed for tackling the problems. Laporte et al. (1994) make use of the integer L-shaped method proposed by Laporte and Louveaux (1993) for solving a two-stage stochastic facility location problem with first-stage binary variables. Alonso-Ayuso et al. (2003) introduce the so-called branch-and-fix coordination scheme for tackling a problem in the context of logistics systems. The proposed technique can be used for solving general two-stage stochastic programming problems with binary first-stage variables and both binary and continuous variables in the second stage.

A general procedure for multi-stage stochastic mixed-integer linear programming problems was introduced by Escudero et al. (2009, 2010). In those papers, the branch-and-fix coordination scheme proposed by Alonso-Ayuso et al. (2003) was extended to solve multi-stage problems with integer variables. As mentioned above, Hernández et al. (2012) embed such an approach within a heuristic procedure.

When exact algorithms fail to solve the problems, we must resort to approximate procedures. One particular difficulty in stochastic programming arises when the number of scenarios is too large or even infinite. In this case, one possibility is to use a sampling scheme. Sample average approximation (SAA) was introduced by Kleywegt et al. (2001) and it is one such example which has become quite popular. Applications of this procedure to stochastic facility location were proposed by Kiya and Davoudpour (2012), Romauch and Hartl (2005) and Santoso et al. (2005). Sampling schemes have also been proposed for general chance-constrained problems by Luedtke and Ahmed (2008) and Pagnoncelli et al. (2009). The application to facility location problems is a research direction worth exploring.
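
The idea behind SAA can be sketched in a few lines of Python: the expectation is replaced by an average over a random sample, the sampled problem is solved, and the candidate solutions obtained from several replications are compared on a larger independent sample. The toy model (fixed costs, capacities, a unit shortage penalty), the uniform demand distribution and all parameter values below are hypothetical, and the sampled problems are solved by plain enumeration.

```python
import random
from itertools import product

# Hypothetical toy data: fixed costs, capacities, unit shortage penalty.
FIXED = [10.0, 14.0]
CAP = [8.0, 16.0]
PENALTY = 3.0


def cost(y, demand):
    """Fixed cost of the open facilities plus the penalty for unmet demand."""
    fixed = sum(f for f, is_open in zip(FIXED, y) if is_open)
    capacity = sum(q for q, is_open in zip(CAP, y) if is_open)
    return fixed + PENALTY * max(demand - capacity, 0.0)


def sample_demand(rng):
    """Assumed demand distribution: uniform on [0, 20]."""
    return rng.uniform(0.0, 20.0)


def saa(sample_size, replications, rng):
    """Solve several sampled problems; return the candidate solutions found."""
    all_y = list(product([0, 1], repeat=len(FIXED)))
    candidates = []
    for _ in range(replications):
        sample = [sample_demand(rng) for _ in range(sample_size)]

        def sample_average(y):
            # The sample average replaces the true expected cost.
            return sum(cost(y, d) for d in sample) / sample_size

        candidates.append(min(all_y, key=sample_average))
    return candidates


def evaluate(y, rng, n=20000):
    """Estimate the expected cost of y on a large independent sample."""
    return sum(cost(y, sample_demand(rng)) for _ in range(n)) / n


rng = random.Random(0)
candidates = set(saa(sample_size=50, replications=10, rng=rng))
best_y = min(candidates, key=lambda y: evaluate(y, rng))
```

In realistic applications the enumeration step would be replaced by an exact or heuristic method for the sampled mixed-integer problem.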

Armas et al. (2017) apply a so-called simheuristic to the stochastic UFLP in which the transportation costs are uncertain. The algorithm integrates simulation with a metaheuristic: the authors combine an iterated local search with Monte Carlo simulation (MCS). This type of procedure may be quite promising for tackling more complex stochastic facility location problems.
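
The general pattern of such a procedure can be sketched as follows. This is an illustrative toy, not the authors' algorithm: a local search with random perturbations is guided by the expected transportation costs, and each accepted solution is then re-evaluated by Monte Carlo simulation. All names, the uniform perturbation of the costs (parameter `spread`) and the assumption that customers are assigned after the costs are observed are hypothetical choices made for the example.

```python
import random


def deterministic_cost(open_set, fixed_cost, mean_cost):
    """Cost of a set of open facilities when every transport cost equals its mean."""
    n_cust = len(mean_cost[0])
    service = sum(min(mean_cost[i][j] for i in open_set) for j in range(n_cust))
    return sum(fixed_cost[i] for i in open_set) + service


def simulated_cost(open_set, fixed_cost, mean_cost, rng, n_runs, spread):
    """Monte Carlo estimate of the expected cost.

    Assumption: each cost is its mean times a uniform factor in
    [1 - spread, 1 + spread], and customers are assigned after the
    realization is observed.
    """
    n_cust = len(mean_cost[0])
    total = 0.0
    for _ in range(n_runs):
        total += sum(
            min(mean_cost[i][j] * rng.uniform(1 - spread, 1 + spread) for i in open_set)
            for j in range(n_cust)
        )
    return sum(fixed_cost[i] for i in open_set) + total / n_runs


def local_search(open_set, fixed_cost, mean_cost):
    """Open or close single facilities while the deterministic cost improves."""
    best, best_cost = open_set, deterministic_cost(open_set, fixed_cost, mean_cost)
    improved = True
    while improved:
        improved = False
        for i in range(len(fixed_cost)):
            neighbour = best ^ {i}  # flip the status of facility i
            if neighbour:  # at least one facility must stay open
                cost = deterministic_cost(neighbour, fixed_cost, mean_cost)
                if cost < best_cost:
                    best, best_cost, improved = neighbour, cost, True
    return best


def simheuristic(fixed_cost, mean_cost, n_iter=200, n_runs=100, spread=0.3, seed=0):
    rng = random.Random(seed)
    n_fac = len(fixed_cost)
    current = frozenset(range(n_fac))  # start with every facility open
    current_det = deterministic_cost(current, fixed_cost, mean_cost)
    best = current
    best_sim = simulated_cost(current, fixed_cost, mean_cost, rng, n_runs, spread)
    for _ in range(n_iter):
        # Perturbation: flip one random facility, then repair by local search.
        candidate = current ^ {rng.randrange(n_fac)}
        if not candidate:
            continue
        candidate = local_search(candidate, fixed_cost, mean_cost)
        cand_det = deterministic_cost(candidate, fixed_cost, mean_cost)
        if cand_det <= current_det:
            current, current_det = candidate, cand_det
            # Simulation is spent only on solutions the metaheuristic accepts.
            sim = simulated_cost(candidate, fixed_cost, mean_cost, rng, n_runs, spread)
            if sim < best_sim:
                best, best_sim = candidate, sim
    return best, best_sim
```

The design choice worth noting is the division of labour: the cheap deterministic evaluation drives the search, while the more expensive simulation is reserved for ranking the promising solutions.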

Other algorithms for stochastic programming problems include the generation of cutting planes introduced by Guan et al. (2009) for multi-stage problems, and the dual decomposition-based algorithms developed by Carøe and Schultz (1999) and Escudero et al. (2012). To the best of our knowledge, the first type of algorithm has never been applied to stochastic facility location. However, there are several papers proposing dual decomposition-based algorithms for problems that include location decisions, namely those by Schütz et al. (2008, 2009). The latter work combines dual decomposition with SAA. In this type of method, the non-anticipativity constraints are explicitly considered in the model and dualized, which allows a scenario-decoupling for the relaxed problem.
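
For a two-stage problem, the scenario decoupling can be written down explicitly. The notation below is assumed for illustration only and is not taken from the cited papers: there are $S$ scenarios with probabilities $p_s$, each scenario $s$ has its own copy $x_s$ of the first-stage variables and second-stage variables $y_s$, $X_s$ denotes the feasible set of scenario $s$ (including integrality requirements), and the matrices $H_s$ encode the non-anticipativity condition $x_1 = \dots = x_S$.

```latex
% Split-variable formulation with explicit non-anticipativity constraints
\[
\min \Big\{ \sum_{s=1}^{S} p_s \big( c^\top x_s + q_s^\top y_s \big) \;:\;
(x_s, y_s) \in X_s,\ s = 1, \dots, S, \quad \sum_{s=1}^{S} H_s x_s = 0 \Big\}
\]
% Dualizing the coupling constraints with multipliers \lambda
% yields one independent subproblem per scenario
\[
D(\lambda) = \sum_{s=1}^{S} \min_{(x_s, y_s) \in X_s}
\Big\{ p_s \big( c^\top x_s + q_s^\top y_s \big) + \lambda^\top H_s x_s \Big\}
\]
```

For every $\lambda$, the value $D(\lambda)$ is a lower bound on the optimal value of the original problem, and the best such bound is obtained by maximizing $D(\lambda)$ over $\lambda$. Because of the integrality requirements a duality gap may remain, which is why these bounds are typically embedded in a branch-and-bound or heuristic framework.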

6.3 Scenario Generation

In this chapter it has often been assumed that uncertainty can be represented by a set of scenarios. In particular, it has been assumed that each scenario fully determines all the uncertain parameters. In practice, defining the scenarios is itself a relevant problem.

In some situations, scenarios are associated with driving forces (e.g., the political conditions in a specific region, economic trends or some technological developments) which, in turn, influence the input of the model that supports the decision making process. In this case, it is up to the decision maker to understand these driving forces and the way they influence the input of the model. This understanding leads to a complete definition of the scenarios. In some cases, experts may be consulted about plausible scenarios as well as their occurrence probabilities. This may call for the use of subjective probabilities obtained by eliciting probability distributions (O’Hagan 1998; Casement and Kahle 2017; Oakley 2017).

In other situations, namely in the context of stochastic programming, scenario generation may be important either to instantiate large deterministic equivalent models or to restrict the set of scenarios in a sampling scheme used within a solution procedure. The reader should refer to Dupačová et al. (2003), Høyland and Wallace (2001), Di Domenica et al. (2007) and the references therein for further details.

In the case of facility location problems, a short discussion on scenario generation is presented by Kouvelis and Yu (1997) who consider a network with uncertain node weights. Assuming a small set of possible values for the demand of each node, one possibility is to take as a scenario each element of the Cartesian product of the sets for all nodes. Nevertheless, this is strongly discouraged since the number of scenarios easily leads to intractable models. Instead, the authors highlight that in many location problems the driving forces mentioned above are the key element inducing uncertainty and thus should be identified and taken into account. Typically, these forces induce a high correlation between different parameters. If a small number of such factors is identified, the number of scenarios associated with them should be manageable.
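
The contrast between the two ways of building scenarios can be illustrated with a small sketch. The data, the function names and the single multiplicative driving factor are assumptions made for the example; they are not taken from Kouvelis and Yu (1997).

```python
import itertools


def full_factorial_scenarios(demand_values):
    """One scenario per element of the Cartesian product of the nodes' demand sets."""
    return list(itertools.product(*demand_values))


def factor_driven_scenarios(base_demand, factor_levels):
    """One scenario per level of a single driving factor.

    Assumption: the factor scales all demands jointly, so the demands of
    different nodes are perfectly correlated across scenarios.
    """
    return [tuple(level * d for d in base_demand) for level in factor_levels]


# Four nodes, each with three possible demand values: 3**4 = 81 scenarios.
demand_values = [(80, 100, 120)] * 4
full = full_factorial_scenarios(demand_values)

# The same four nodes driven by one factor with three levels: 3 scenarios.
factor = factor_driven_scenarios([100, 100, 100, 100], (0.8, 1.0, 1.2))

print(len(full), len(factor))
# With 30 nodes the Cartesian product would have 3**30 elements,
# roughly 2 x 10**14, which is far too many to enumerate.
```

The number of factor-driven scenarios depends only on the number of factors and their levels, not on the number of nodes, which is what keeps the resulting model manageable.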

6.4 Other Notes

One important research topic in facility location under uncertainty concerns location-inventory problems. These are problems in which location decisions are combined with inventory management, so uncertainty can hardly be disregarded in a realistic modeling framework. This type of problem, which was introduced by Daskin et al. (2002) and extended by Snyder et al. (2007), is of great relevance in complex systems such as those arising in logistics. The reader should refer to Chap. 16 for further details.

Another area with great potential is stochastic location-routing. One such problem was solved by Albareda-Sambola et al. (2007). This is a complex and challenging topic.

Finally, this chapter could not come to an end without a brief reference to continuous and network facility location problems under uncertainty. We did not focus on this type of problem, although some significant work has been done and much progress has been achieved. The reader can refer to Snyder (2006) for a review of the fundamental literature addressing these problems. Some recent works on network facility location under uncertainty include those by Conde (2007), Berman and Drezner (2008), Berman and Wang (2010), Sonmez and Lim (2012), Lim and Sonmez (2013), López-de-los-Mozos et al. (2013), Lu (2013), and Lu and Sheu (2013). Recent references on continuous problems include Blanquero et al. (2011) and Drezner et al. (2012).

7 Conclusions

We have covered several essential aspects related to discrete facility location under uncertainty. Despite the extensive work reported, the existing literature can still be considered scarce in comparison with the literature devoted to deterministic models. However, the relevance of facility location in areas where uncertainty is often unavoidable, such as logistics, routing and transportation, has led to an increased interest in the topic addressed in this chapter. In order to better support many decision making processes, it is important to embed uncertainty in the optimization models and, by doing so, to obtain solutions which can anticipate it. This remains a challenging and promising research field.