Abstract
We formalize the concept of Pareto Adaptive Robust Optimality (PARO) for linear two-stage Adaptive Robust Optimization (ARO) problems with fixed continuous recourse. A worst-case optimal solution pair of here-and-now decisions and wait-and-see decisions is PARO if it cannot be Pareto dominated by another solution, i.e., there does not exist another worst-case optimal pair that performs at least as well in all scenarios in the uncertainty set and strictly better in at least one scenario. We argue that, unlike PARO, extant solution approaches—including those that adopt Pareto Robust Optimality from static robust optimization—could fail in ARO and yield solutions that can be Pareto dominated, which could lead to inefficiencies and suboptimal performance in practice. We prove the existence of PARO solutions and present approaches for finding and approximating such solutions. Among others, we present a constraint & column generation method that produces a PARO solution for the considered two-stage ARO problems by iteratively improving upon a worst-case optimal solution. We present numerical results for a facility location problem that demonstrate the practical value of PARO solutions. Our analysis of PARO relies on an application of Fourier–Motzkin Elimination as a proof technique. We demonstrate how this technique can be valuable in the analysis of ARO problems beyond PARO; in particular, we employ it to devise more concise and more insightful proofs of known results on (worst-case) optimality of decision rule structures.
1 Introduction
Robust Optimization (RO) is a widespread methodology for modeling decision-making problems under uncertainty that seeks to optimize worst-case performance [7, 11, 12]. In practice, RO problems usually admit multiple worst-case optimal solutions, the performance of which may differ substantially under non-worst-case uncertainty scenarios. Consequently, the choice of an optimal solution often has material impact on performance under real-world implementations. This important consideration, which was first brought forth by Iancu and Trichakis [16], has been successfully tackled for static, single-stage (linear) RO problems. For the increasingly popular and broad class of dynamic, multi-stage Adaptive Robust Optimization (ARO) problems [2], however, there is no successful approach for choosing an optimal solution, and the purpose of this paper is to bridge this gap.
In particular, for static RO problems, Iancu and Trichakis [16] proposed choosing so-called Pareto Robustly Optimal (PRO) solutions. PRO solutions unarguably dominate non-PRO solutions because, by definition, the former guarantee that there exists no other worst-case optimal solution that performs at least as well for all scenarios in the uncertainty set while performing strictly better for at least one scenario. The PRO concept has been applied to several static robust optimization settings, such as distributionally robust mechanism design [18].
Going beyond static RO problems, it is well understood that the choice of an optimal solution remains crucial for the broader class of multi-stage ARO problems. Similar to RO solutions, by following a worst-case philosophy and not considering performance across the entire spectrum of possible scenarios, ARO optimal solutions could lead to substantial performance losses. For example, see the work by De Ruiter et al. [9], who numerically demonstrate existence of multiple worst-case optimal solutions for the classical multi-stage inventory-production model that was considered in Ben-Tal et al. [2], and find them to differ considerably from each other in their non-worst-case performance.
For ARO problems, however, a solution approach that unarguably chooses “good solutions,” similar to PRO solutions for static RO problems, has proved elusive thus far. Extant approaches have all attempted to simply apply the concept of PRO to ARO problems. Specifically, they advocate restricting attention to adaptive variables that depend affinely on the uncertain parameters, commonly referred to as Linear Decision Rules (LDRs). Restricting to LDRs reduces the problem to static RO, and enables the search for associated PRO solutions [8, 9, 16]. As we shall show, however, this indirect application of the PRO concept fails to produce solutions that cannot be dominated.
Recently, the concept of Pareto Adaptive Robustly Optimal (PARO) solutions was introduced for a specific ARO application [26], but no general treatment of the topic was included. In this paper, we formalize and study the concept of PARO solutions for linear ARO problems. Similar to PRO solutions for static RO problems, PARO solutions yield worst-case optimal performance and are not dominated by any other such solutions in non-worst-case scenarios. In other words, PARO solutions unarguably dominate non-PARO solutions, leading to improved performance in non-worst-case scenarios, while maintaining worst-case optimality. From a practical standpoint, this means that implementing PARO solutions can only yield performance benefits, without any associated drawbacks.
To introduce the PARO concept and highlight its practical importance, we provide an illustrative toy example. The example also serves two additional important purposes. First, it enables us to show in a simple setting how PARO solutions can dominate PRO solutions, as remarked above. Second, the example motivates the need for new analysis techniques for studying PARO.
Example 1
In treatment planning for radiation therapy, the goal is to deliver a curative amount of dose to the target volume (tumor tissue), while minimizing the dose to healthy tissues. Consider a simplified case with two target subvolumes. For subvolume \(i\in \{1,2\}\), the required dose level \(d_i\) depends on the radiation sensitivity of the tissue, which is unknown. Assume that, prior to treatment, the doses lie in
Mid-treatment, the required doses are ascertained via biomarker measurements.
Treatment doses are administered in two stages. The dose administered in the first stage, denoted by x, needs to be decided prior to treatment. The dose administered in the second stage, denoted by y, can be decided after the required doses have been ascertained, i.e., it can be adapted to uncertainty revelation. Both treatment doses are delivered homogeneously over both volumes in each stage. Dose in each stage is limited to the interval [20, 40]. The total dose administered is \(x+y\), and the healthy tissue receives a fraction \(\delta >0\) of it. The Stage-1 dose x, and a decision rule \(y(\cdot )\) for the adaptive Stage-2 dose can be chosen by solving:
Problem (1) is an ARO problem with constraintwise uncertainty, for which static decision rules are worst-case optimal [2]. Plugging in \(y(d_1,d_2) = {\hat{y}}\) and solving the resulting static RO model yields a worst-case optimal objective value of \(60\delta \), achieved by all \((x,{\hat{y}})\) such that \(x+{\hat{y}}=60\). For any such solution, the objective value remains \(60\delta \) in not only the worst-case scenario but in all scenarios. Hence, all these solutions are PRO, according to the definition of Iancu and Trichakis [16]. Consequently, the Stage-1 decisions that are PRO lie in the set:
Consider now the decision rule \(y^{*}(d_1,d_2) = \max \{20,d_1-x,d_2-x\}\), which is feasible for all feasible x. Furthermore, this rule minimizes the objective for any fixed x, \(d_1\) and \(d_2\). Plugging this in gives
For given \((d_1,d_2)\) the objective value is at least \(\delta \max \{d_1,d_2\}\), and this is achieved by all \(x \le 30\). Thus, it should be preferable to implement one of these solutions for the Stage-1 decision. In fact, these solutions, which we call PARO, cannot be dominated by other solutions. Notably, the set of PARO solutions
is a strict subset of \(X^{\text {PRO}}\). This implies that PARO solutions could dominate PRO solutions that are non-PARO. To exemplify, compare the following three solutions: (i) PARO solution \(x^{*} = 25\) with optimal decision rule \(y^{*}(d_1,d_2)\), (ii) PRO (non-PARO) solution \({\hat{x}} = 35\) with optimal decision rule \(y^{*}(d_1,d_2)\), (iii) PRO solution \({\hat{x}} = 35\) with static decision rule \({\hat{y}}=25\).
Table 1 shows the performance for three scenarios. In the worst-case scenario (60, 60) all solutions perform equally. In scenario (50, 55), solution (iii) is outperformed by the other two solutions; in scenario (50, 50), both solutions (ii) and (iii) are outperformed by PARO solution (i). There is no scenario where \({\hat{x}}\) yields a strictly better objective value than \(x^{*}\), irrespective of the decision rule used. Thus, the PRO solution \({\hat{x}}\) is dominated by the PARO solution \(x^{*}\).
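The comparison in Table 1 can be reproduced with a few lines of code (a sketch: the common factor \(\delta \) is divided out, so the values below are total doses \(x+y\); the function name `y_star` is ours):

```python
def y_star(x, d1, d2):
    # Pareto optimal decision rule from Example 1: y*(d1, d2) = max{20, d1 - x, d2 - x}
    return max(20, d1 - x, d2 - x)

# Total dose x + y for the three candidate solutions (delta divided out).
solutions = {
    "(i)   PARO x=25, rule y*":     lambda d1, d2: 25 + y_star(25, d1, d2),
    "(ii)  PRO  x=35, rule y*":     lambda d1, d2: 35 + y_star(35, d1, d2),
    "(iii) PRO  x=35, static y=25": lambda d1, d2: 35 + 25,
}

for name, total in solutions.items():
    print(name, [total(d1, d2) for d1, d2 in [(60, 60), (50, 55), (50, 50)]])
# (i)   -> [60, 55, 50]
# (ii)  -> [60, 55, 55]
# (iii) -> [60, 60, 60]
```

The printed rows confirm the pattern described above: all solutions coincide in the worst case (60, 60), while (i) is never worse and strictly better in scenario (50, 50).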
Besides showing that PRO solutions can be dominated in ARO problems, Example 1 also provides intuition into how this happens. In particular, what unlocks extra performance in ARO problems is the use of decision rules that are not merely worst-case optimal, but rather “Pareto optimal,” i.e., they optimize performance over non-worst-case scenarios as well. Note, however, that although linear decision rules may suffice for worst-case optimality under special circumstances, for Pareto optimality non-linear rules more often appear necessary, as the example illustrates.
The application of non-linear decision rules to study PARO solutions invalidates the techniques used in the analysis of Pareto efficiency in RO in the extant literature, which is solely focused on linear formulations. In other words, analysis of Pareto efficiency in ARO calls for a new line of attack, which brings us to another contribution we make. Specifically, to study PARO solutions, we rely heavily on Fourier–Motzkin Elimination (FME) as a proof technique. Through the lens of FME we consider optimality of decision rule structures, which then enables us to study PARO. As a byproduct, we illustrate how this proof technique can be applied in ARO more generally, by providing more general and more insightful proofs of known results (not related to Pareto efficiency).
1.1 Findings and contributions
Before we begin our analysis, we summarize the findings and the contributions of this paper. The treatment presented is restricted to two-stage ARO models that are linear in both decision variables and uncertain parameters.
-
1.
Concept of PARO Solutions In the context of linear ARO problems, we formalize the concept of Pareto Adaptive Robustly Optimal (PARO) solutions. PARO solutions have the property that no other solution and associated adaptive decision rule exist that dominate them, i.e., perform at least as well under every scenario and strictly better under at least one scenario. As Example 1 above has already shown, in the context of ARO problems, PARO solutions can dominate solutions produced by other Pareto optimality concepts already proposed in the literature [16]. In practice, PARO solutions can only yield performance benefits compared with non-PARO solutions, as the latter lead to efficiency losses.
-
2.
Properties of PARO Solutions We derive several properties of PARO solutions. The main results are that, for linear ARO problems with fixed and continuous recourse, affine dependence on uncertain parameters and a compact feasible region, there exists a first-stage PARO solution, and there exists a piecewise linear (PWL) decision rule that is PARO. To arrive at these results, our analysis relies on FME.
-
3.
Finding PARO Solutions and their Practical Value We present several approaches to find and/or approximate PARO solutions in practice, amongst others using techniques based on FME. We also conduct a numerical study for a facility location example. The results reveal that (approximate) PARO solutions can yield substantially better performance in non-worst-case scenarios than worst-case optimal and PRO solutions, thus demonstrating the practical value of the proposed methodology.
-
4.
FME as a Proof Technique for PARO. Zhen et al. [30] introduce FME as both a solution and proof technique for ARO. We apply and extend the latter idea, and use FME to prove worst-case and Pareto optimality of various decision rule structures. We extend and/or generalize known results in ARO, not related to Pareto optimality, and provide more insightful proofs; for example, one that uses FME to establish the results by Bertsimas and Goyal [5] and Zhen et al. [30] on optimality of LDRs under simplex uncertainty sets.
Finally, to better position our contributions vis-à-vis the extant literature, we note that, as mentioned earlier, PARO solutions were previously discussed in [26]. That work considers a specific three-variable non-linear ARO problem arising in radiation therapy planning; its approach for finding PARO solutions relies heavily on the specific formulation. No general treatment of PARO was included, although it does demonstrate that PARO may also be relevant for non-linear ARO problems. In the current paper, we formalize the PARO concept, derive properties such as existence of PARO solutions, and discuss constructive approaches towards finding them. With regard to FME, Zhen et al. [30] were the first to recognize its applicability to linear ARO problems, owing to its ability to eliminate adaptive variables. They use FME as both a solution and proof technique; for the latter, the main obstacle is the exponential increase in the number of constraints after variable elimination. In the current paper, we apply and extend the ideas of Zhen et al. [30], and use FME as a proof technique. Through the lens of FME we first consider (worst-case) optimality of decision rule structures, and provide more general and more insightful proofs of known results. Subsequently, we investigate Pareto optimality using FME and present numerical results that demonstrate the value of PARO solutions.
1.2 Notation and organization
Boldface characters represent matrices and vectors. All vectors are column vectors and the vector \(\varvec{a}_i\) is the i-th row of matrix \(\varvec{A}\). The space of all measurable functions from \({\mathbb {R}}^n\) to \({\mathbb {R}}^m\) that are bounded on compact sets is denoted by \({\mathcal {R}}^{n,m}\). The vectors \(\varvec{e}_i\), \(\varvec{1}\) and \(\varvec{0}\) are the standard unit basis vector, the vector of all-ones and the vector of all-zeros, respectively. Matrix \(\varvec{I}\) is the identity matrix. The relative interior of a set S is denoted by \(\text {ri}(S)\); its set of extreme points is denoted by \(\text {ext}(S)\).
The manuscript is organized as follows. First, Sect. 2 introduces the ARO setting and the notion of PARO. Section 3 investigates the existence of PARO solutions using FME, and Sect. 4 presents some practical approaches for the construction of PARO solutions. In Sect. 6 we present the results of our numerical experiments, and Sect. 7 concludes the manuscript. Appendix A uses FME to establish (worst-case) optimality of decision rule structures.
2 Pareto optimality in (adaptive) robust optimization
We first generalize the definition of PRO of Iancu and Trichakis [16] to non-linear static RO problems. The reason for this is that there turns out to be a relation between Pareto efficiency for non-linear static RO problems and linear ARO problems. Let \(\varvec{x} \in {\mathcal {X}} \subseteq {\mathbb {R}}^{n_x}\) denote the decision variables and let \(\varvec{z} \in U \subseteq {\mathbb {R}}^L\) denote the uncertain parameters. Let \(f: {\mathbb {R}}^{n_x} \times {\mathbb {R}}^L \mapsto {\mathbb {R}}\) and consider the static RO problem
Let \({\mathcal {X}}^{\text {RO}}\) denote the set of robustly optimal (i.e., worst-case optimal) solutions. A robustly optimal solution \(\varvec{x}\) is PRO if there does not exist another robustly optimal solution \(\varvec{{\bar{x}}}\) that performs at least as well as \(\varvec{x}\) for all scenarios in the uncertainty set, while performing strictly better for at least one scenario. If such a solution \(\varvec{{\bar{x}}}\) does exist, it is said to dominate \(\varvec{x}\); in practice, \(\varvec{{\bar{x}}}\) will always be preferred over \(\varvec{x}\). If all uncertainty in the objective is moved to the constraints using an epigraph formulation, Pareto robust optimality may also be defined in terms of slack variables [16, Section 4.1], but we do not use that definition here. We use the following formal definition:
Definition 1
(Pareto Robustly Optimal) A solution \(\varvec{x} \in {\mathcal {X}}^{\text {RO}}\) is PRO to (2) if there does not exist another \(\varvec{{\bar{x}}} \in {\mathcal {X}}^{\text {RO}}\) such that
We aim to extend the concept of PRO to ARO problems. In particular, we consider the following adaptive linear optimization problem:
where \(\varvec{z}\in U \subseteq {\mathbb {R}}^{L}\) is an uncertain parameter, with U a compact, convex uncertainty set with nonempty relative interior. Variables \(\varvec{x} \in {\mathbb {R}}^{n_x}\) are the Stage-1 (here-and-now) decisions. Usually we will assume \(\varvec{x}\) to be continuous variables, but we emphasize that all results in the paper also hold if (part of) \(\varvec{x}\) is restricted to be integer-valued. Variables \(\varvec{y} \in {\mathcal {R}}^{L,n_y}\) are also continuous and capture the Stage-2 (wait-and-see) decisions, i.e., they are functions of \(\varvec{z}\). The matrix \(\varvec{B} \in {\mathbb {R}}^{m \times n_y}\) and vector \(\varvec{d} \in {\mathbb {R}}^{n_y}\) are assumed to be constant (fixed recourse), and \(\varvec{A}(\varvec{z})\), \(\varvec{r}(\varvec{z})\) and \(\varvec{c}(\varvec{z})\) depend affinely on \(\varvec{z}\):
with \(\varvec{A}^0,\dotsc ,\varvec{A}^{L} \in {\mathbb {R}}^{m \times n_x}\), \(\varvec{r}^0,\dotsc ,\varvec{r}^{L} \in {\mathbb {R}}^{m}\) and \(\varvec{c}^0,\dotsc ,\varvec{c}^{L} \in {\mathbb {R}}^{n_x}\). Uncertainty in the objective (3a) can be moved to the constraint using an epigraph formulation. Nevertheless, it is explicitly stated in the objective to facilitate a convenient definition of PARO. Let \(\text {OPT}\) denote the optimal (worst-case) objective value of (3). We continue by stating several assumptions and definitions regarding adaptive robust feasibility and optimality.
Definition 2
(Adaptive Robustly Feasible) A pair \((\varvec{x},\varvec{y}(\cdot ))\) is Adaptive Robustly Feasible (ARF) to (3) if \(\varvec{A}(\varvec{z})\varvec{x} + \varvec{B}\varvec{y}(\varvec{z}) \le \varvec{r}(\varvec{z}),~~\forall \varvec{z} \in U\).
Sometimes it is useful to consider properties of the first- and second-stage decisions separately. Therefore, we also define adaptive robust feasibility for Stage-1 and Stage-2 decisions individually.
Definition 3
(Adaptive Robustly Feasible \(\varvec{x}\) and/or \(\varvec{y}(\cdot )\))
-
(i)
A Stage-1 decision \(\varvec{x}\) is ARF to (3) if there exists a \(\varvec{y}(\cdot )\) such that \((\varvec{x}, \varvec{y}(\cdot ))\) is ARF to (3).
-
(ii)
A Stage-2 decision \(\varvec{y}(\cdot )\) is ARF to (3) if there exists an \(\varvec{x}\) such that \((\varvec{x}, \varvec{y}(\cdot ))\) is ARF to (3).
The set of all ARF solutions \(\varvec{x}\) is given by
We assume set \({\mathcal {X}}\) is nonempty, i.e., there exists an \(\varvec{x}\) that is ARF, and we assume (3) has a finite optimal objective value, i.e., OPT is a finite number. After feasibility, the natural next step is to formally define optimality.
Definition 4
(Adaptive Robustly Optimal) A pair \((\varvec{x},\varvec{y}(\cdot ))\) is adaptive robustly optimal (ARO) to (3) if it is ARF and \(\varvec{c}(\varvec{z})^{\top } \varvec{x} + \varvec{d}^{\top } \varvec{y}(\varvec{z}) \le \text {OPT},~~\forall \varvec{z} \in U.\)
We also define adaptive robust optimality for Stage-1 and Stage-2 decisions individually.
Definition 5
(Adaptive Robustly Optimal \(\varvec{x}\) and/or \(\varvec{y}(\cdot )\))
-
(i)
A Stage-1 decision \(\varvec{x}\) is ARO to (3) if there exists a \(\varvec{y}(\cdot )\) such that \((\varvec{x}, \varvec{y}(\cdot ))\) is ARO to (3).
-
(ii)
A Stage-2 decision \(\varvec{y}(\cdot )\) is ARO to (3) if there exists an \(\varvec{x}\) such that \((\varvec{x}, \varvec{y}(\cdot ))\) is ARO to (3).
We are now in position to define Pareto Adaptive Robust Optimality for two-stage ARO problems.
Definition 6
(Pareto Adaptive Robustly Optimal) A pair \((\varvec{x},\varvec{y}(\cdot ))\) is Pareto Adaptive Robustly Optimal (PARO) to (3) if it is ARO and there does not exist a pair \((\varvec{{\bar{x}}},\varvec{{\bar{y}}}(\cdot ))\) that is ARO and
As before, the definitions can be extended to Stage-1 and Stage-2 decisions individually.
Definition 7
(Pareto Adaptive Robustly Optimal \(\varvec{x}\) and/or \(\varvec{y}(\cdot )\))
-
(i)
A Stage-1 decision \(\varvec{x}\) is PARO to (3) if there exists a \(\varvec{y}(\cdot )\) such that \((\varvec{x}, \varvec{y}(\cdot ))\) is PARO to (3).
-
(ii)
A Stage-2 decision \(\varvec{y}(\cdot )\) is PARO to (3) if there exists an \(\varvec{x}\) such that \((\varvec{x}, \varvec{y}(\cdot ))\) is PARO to (3).
Our main interest is in Definition 7(i). The reason for this is that the here-and-now decision \(\varvec{x}\) is usually the only one that the decision maker has to commit to. In contrast, instead of using decision rule \(\varvec{y}(\cdot )\), one can often resort to re-solving the optimization problem for the second stage once the value of the uncertain parameter has been revealed. This is known as the folding horizon approach, and it is applicable as long as there is time to re-solve between observing \(\varvec{z}\) and having to implement \(\varvec{y}(\varvec{z})\). There is no such alternative for \(\varvec{x}\), however, and different decisions in Stage 1 may lead to different adaptation possibilities in Stage 2.
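The folding horizon idea can be sketched on the toy instance of Example 1 (assumptions: we reuse that instance, SciPy's `linprog` serves as a generic LP solver, and the function name `stage2` is ours):

```python
from scipy.optimize import linprog

def stage2(x, d1, d2):
    # Re-solve the Stage-2 problem of Example 1 after observing (d1, d2):
    #   min y   s.t.   x + y >= d1,   x + y >= d2,   20 <= y <= 40
    res = linprog(c=[1.0],
                  A_ub=[[-1.0], [-1.0]],      # -y <= x - d_i  <=>  x + y >= d_i
                  b_ub=[x - d1, x - d2],
                  bounds=[(20.0, 40.0)])
    return res.x[0]

# Committing x = 25 here-and-now and observing (d1, d2) = (50, 55) mid-treatment:
print(stage2(25, 50, 55))   # 30.0, matching y*(d1, d2) = max{20, d1 - x, d2 - x}
```

In this instance, re-solving recovers exactly the optimal decision rule \(y^{*}\); the point of the discussion above is that no such re-solving option exists for the Stage-1 decision \(x\) itself.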
Lastly, Pareto optimality can also be investigated for Stage-2 decisions if the Stage-1 decision \(\varvec{x}\) is fixed.
Definition 8
(Pareto Adaptive Robustly Optimal extension \(\varvec{y}(\cdot )\)) A Stage-2 decision \(\varvec{y}(\cdot )\) is a PARO extension to \(\varvec{x}^{*}\) for (3) if \((\varvec{x}^{*},\varvec{y}(\cdot ))\) is ARO to (3) and there does not exist another \(\varvec{{\bar{y}}}(\cdot )\) such that \((\varvec{x}^{*},\varvec{{\bar{y}}}(\cdot ))\) is ARO to (3) and
Figure 1 illustrates the PARO concept for a single uncertain parameter, and illustrates that care must be exercised when drawing conclusions related to PARO. In the example, \((\varvec{x}^{*}, \varvec{y}^{*}(\varvec{z}))\) is possibly PARO, but this cannot be concluded from the figure. Also, Stage-1 solution \(\hat{\varvec{x}}\) might still be PARO, but that cannot be concluded from the figure either. The reason for the latter is that there may be yet another decision rule \(\varvec{{\tilde{y}}}(\varvec{z})\) so that \((\hat{\varvec{x}}, \varvec{{\tilde{y}}}(\varvec{z}))\) is not dominated by \((\varvec{x}^{*}, \varvec{y}^{*}(\varvec{z}))\).
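The caution urged above can be made concrete with a sampling-based dominance check (a sketch: a finite scenario grid can refute a dominance claim, but can never certify PARO over a continuous uncertainty set; the helper `dominates` is our own construction):

```python
def dominates(obj_a, obj_b, scenarios, tol=1e-9):
    # Solution a dominates b (minimization): at least as good in every sampled
    # scenario, and strictly better in at least one.
    pairs = [(obj_a(z), obj_b(z)) for z in scenarios]
    return (all(a <= b + tol for a, b in pairs)
            and any(a < b - tol for a, b in pairs))

# Example 1 again: PARO solution (i) versus static PRO solution (iii).
grid = [(d1, d2) for d1 in range(40, 61, 5) for d2 in range(40, 61, 5)]
sol_i = lambda z: 25 + max(20, z[0] - 25, z[1] - 25)    # x = 25 with rule y*
sol_iii = lambda z: 60                                  # x = 35 with static y = 25
print(dominates(sol_i, sol_iii, grid))   # True
print(dominates(sol_iii, sol_i, grid))   # False
```

A `True` outcome only shows dominance on the sample; concluding PARO requires an argument over the entire uncertainty set, which is what the remainder of the paper develops.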
We conclude this section by mentioning three ways in which the PARO concept can be generalized or relaxed, although we do not pursue these further. First, Bertsimas et al. [8] define Pareto optimal adaptive solutions for general (non-linear) two-stage ARO problems, which for linear problems is equivalent to our definition of PARO. They subsequently define Pareto optimal affine adaptive solutions, which is equivalent to the definition of PRO after using LDRs, and focus on finding the latter type of solutions. In their numerical studies, Iancu and Trichakis [16] also consider two-stage problems and find PRO solutions after plugging in LDRs.
Second, we can relax only the requirement that the solution is ARO. For example, LDRs often do not guarantee an ARO solution but do exhibit good practical performance [20]. Suppose these yield a worst-case objective value p (> OPT). Then we can define p-PARO solutions as those solutions \((\varvec{x},\varvec{y}(\cdot ))\) that yield an objective value of at most p in each scenario, and are not dominated by another solution \((\varvec{{\bar{x}}},\varvec{{\bar{y}}}(\cdot ))\) that yields an objective value of at most p in each scenario.
Third, PARO may also be defined in terms of slack variables, analogous to the extension of PRO to constraint slacks in Iancu and Trichakis [16, Section 4.1]. There, a slack value vector is introduced that quantifies the relative importance of slack in each constraint. This scalarization allows computing the total slack value of a solution in any scenario, and PRO (and likewise PARO) can then be defined on this total slack value instead of the objective value. This may be useful in applications where ARO is mainly used to maintain feasibility, such as immunizing against uncertain renewable energy source output [17] and adjusting to disturbances in railway timetabling [24].
3 Properties of PARO solutions
In this section, we prove existence of PARO solutions for two-stage ARO problems of form (3). First, we use FME to prove that a PARO Stage-1 (here-and-now) solution is equivalent to a PRO solution of a piecewise linear (PWL) convex static RO problem, and use that to prove the existence of a PARO Stage-1 solution. Subsequently, we prove that there exists a PWL decision rule that is PARO to (3).
3.1 Existence of a PARO Stage-1 solution
We prove existence of PARO Stage-1 solutions in three steps. First, we prove that (3) is equivalent to a static RO problem with a convex PWL objective. Subsequently, we prove that a PRO solution to this static RO problem is equivalent to a PARO solution to (3). Lastly, we prove that PRO solutions to such problems always exist.
The proof that (3) is equivalent to a static RO problem with a convex PWL objective makes use of FME [10, 22], an algorithm for solving systems of linear inequalities by successively eliminating variables. We refer to Bertsimas and Tsitsiklis [6] for an introduction to FME in linear optimization. Its usefulness in ARO is due to the fact that it can eliminate adaptive variables, as proposed by Zhen et al. [30]. FME leads to an exponential increase in the number of constraints; Zhen et al. [30] introduce a redundant constraint identification scheme that helps to remove redundant constraints, although the number of constraints remains exponential. Zhen et al. [30] also propose using FME to eliminate only part of the variables and applying LDRs to the remaining adaptive variables. In addition, they use FME to prove (worst-case) optimality of PWL decision rules. Furthermore, they consider optimal decision rules for the adaptive variables in the dual problem: they prove (worst-case) optimality of LDRs in case of simplex uncertainty and of (two-)piecewise linear decision rules in case of box uncertainty. Zhen and den Hertog [29] use a combination of FME and ARO techniques to compute the maximum volume inscribed ellipsoid of a polytopic projection. The following example illustrates the use of FME to eliminate an adaptive variable.
Example 2
We use FME to eliminate adaptive variable y from (1) in Example 1. We move the uncertain objective to the constraints using an epigraph variable \(t\in {\mathbb {R}}\), and rewrite the constraints to obtain:
For fixed \((d_1,d_2)\), Constraints (4c)-(4f) specify lower and/or upper bounds on y. By combining each pair of lower and upper bounds on y into a new constraint, we find the following problem in terms of (x, t):
where we have removed the trivial new constraint \(20 \le 40\). Any solution (x, t) sets the following bounds on y:
and any decision rule satisfying these inequalities is ARO to (1). Thus, two-stage problem (1) has been reduced to static linear RO problem (5). Auxiliary variable t can be eliminated, transforming (5) into an RO problem with a PWL objective.
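A single elimination step of the kind used above can be sketched generically (assumptions: the system is stored as dense rows of \(\varvec{M}\varvec{v} \le \varvec{b}\), the last variable is eliminated, and no redundant-constraint identification is attempted):

```python
from itertools import product

def fme_step(M, b):
    # One Fourier-Motzkin step on M v <= b: eliminate the last variable by
    # pairing every upper bound on it with every lower bound.
    pos = [(r, c) for r, c in zip(M, b) if r[-1] > 0]     # upper bounds
    neg = [(r, c) for r, c in zip(M, b) if r[-1] < 0]     # lower bounds
    keep = [(r[:-1], c) for r, c in zip(M, b) if r[-1] == 0]
    for (rp, cp), (rn, cn) in product(pos, neg):
        ap, an = rp[-1], -rn[-1]                          # normalizing factors
        row = [u / ap + v / an for u, v in zip(rp[:-1], rn[:-1])]
        keep.append((row, cp / ap + cn / an))
    return keep

# Eliminate y from: x + y <= 4,  -y <= 0,  x - y <= 1.
M = [[1.0, 1.0], [0.0, -1.0], [1.0, -1.0]]
b = [4.0, 0.0, 1.0]
print(fme_step(M, b))   # [([1.0], 4.0), ([2.0], 5.0)], i.e. x <= 4 and 2x <= 5
```

The pairwise combination is exactly why the constraint count grows exponentially over repeated eliminations, as discussed above.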
We focus on applying FME as a proof technique. If FME is performed on Stage-1 feasible region \({\mathcal {X}}\) until all adaptive variables are eliminated, the feasible region can be written as
for some matrix \(\varvec{G}(\varvec{z})\) and vector \(\varvec{f}(\varvec{z})\) depending affinely on \(\varvec{z}\). Zhen et al. [30] show that \({\mathcal {X}} = {\mathcal {X}}_{\text {FME}}\). The following lemma proves that (3) is equivalent to a static RO problem with a convex PWL objective.
Lemma 1
If \((\varvec{x}^{*},\varvec{y}^{*}(\cdot ))\) is ARO to (3), \(\varvec{y}^{*}(\cdot )\) satisfies
and \(\varvec{x}^{*}\) is optimal to
with
and linear functions
and sets \(C^{-}\), \(C^{+}\), functions \(\varphi (\cdot )\) and coefficients \(\alpha \) and \(\beta \) defined as in Lemma B.1. Conversely, if \(\varvec{x}^{*}\) is optimal to (7), there exists a \(\varvec{y}^{*}(\cdot )\) such that \((\varvec{x}^{*},\varvec{y}^{*}(\cdot ))\) is ARO to (3), and any such \(\varvec{y}^{*}(\cdot )\) satisfies (6).
Proof
See Appendix B.2. \(\square \)
The result of Lemma 1 is also illustrated in Example 2: if auxiliary variable t is eliminated there, the resulting problem has a convex PWL objective. If the number of adaptive variables in (3) is small enough that full FME can be performed (on the order of 20 adaptive variables [30]), one can solve (7) via an epigraph formulation in order to obtain an \(\varvec{x}\) that is ARO to (3).
Lemma 2
A solution \(\varvec{x}^{*}\) is PARO to (3) if and only if it is PRO to
where each element (S, T) of set M is a pair of sets of original constraints of (3) and each function \(h_{S,T}(\varvec{x},\varvec{z})\) is bilinear in \(\varvec{x}\) and \(\varvec{z}\).
Proof
See Appendix B.3. \(\square \)
Thus, existence of a PARO solution to (3) is now reduced to existence of a PRO solution to a static RO problem with a convex PWL objective in both \(\varvec{x}\) and \(\varvec{z}\). For problems without adaptive variables in the objective the following result immediately follows.
Corollary 1
If \(\varvec{d}=\varvec{0}\), a solution \(\varvec{x}^{*}\) is PARO to (3) if and only if it is PRO to
Proof
This directly follows from plugging in \(\varvec{d}=\varvec{0}\) in the proof of Lemma 2. \(\square \)
We can now prove one of our main results: existence of a PARO \(\varvec{x}\) for any ARO problem of form (3) with compact feasible region. Our proof uses Lemma 2 and essentially proves existence of a PRO solution to (8).
Theorem 1
If \({\mathcal {X}}\) is compact, there exists a PARO \(\varvec{x}\) to (3).
Proof
See Appendix B.4. \(\square \)
Note that the theorem also holds if \({\mathcal {X}}\) restricts (some elements of) \(\varvec{x}\) to be integer-valued. The boundedness assumption on \({\mathcal {X}}\) cannot be relaxed, because in that case a PRO solution to (8) need not exist. For example, consider the static RO problem \(\max _{x\ge 0}\min _{z \in [0,1]} xz\). The worst-case scenario is \(z=0\), and any \(x\ge 0\) is worst-case optimal. In any other scenario \(z >0\), higher x is better. Any x is dominated by \(x+\epsilon \) with \(\epsilon >0\), and there is no PRO solution.
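The counterexample can be verified numerically (a sketch over a scenario grid; for this maximization problem, dominance means performing at least as well in every scenario and strictly better in at least one):

```python
def objective(x, z):
    # max_{x >= 0} min_{z in [0, 1]} x * z: the worst case is z = 0 for every x.
    return x * z

zs = [i / 10 for i in range(11)]            # grid over the uncertainty set [0, 1]
for x in [0.0, 1.0, 5.0]:
    at_least_as_good = all(objective(x + 1, z) >= objective(x, z) for z in zs)
    strictly_better = any(objective(x + 1, z) > objective(x, z) for z in zs)
    print(x, at_least_as_good and strictly_better)   # True: x + 1 dominates x
```

Since every x is dominated by x + 1 (and in general by \(x+\epsilon \) for any \(\epsilon > 0\)), no PRO solution exists on the unbounded feasible region.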
3.2 Existence of a PARO piecewise linear decision rule
Now that existence of a PARO \(\varvec{x}\) is established, we investigate the structure of decision rule \(\varvec{y}(\cdot )\). We illustrate via an example that for any ARF \(\varvec{x}\) there exists a PWL PARO extension \(\varvec{y}(\cdot )\).
Example 3
Consider the following ARO problem:
Using FME, we first eliminate \(y_1(z)\) and subsequently eliminate \(y_2(z)\):
From the bounds on \(y_2\), it can be concluded that the unique ARF (and hence ARO) Stage-1 solution is \(x^{*} = \frac{1}{2}\). This leads to the following bounds on \(y_1(z)\) and \(y_2(z)\):
Variables \(y_1(z)\) and \(y_2(z)\) have not been eliminated in the objective. Therefore, any decision rule satisfying (11) is ARF to (9) but need not be ARO or PARO.
Variable \(y_1(z)\) does not appear in the bounds of \(y_2(z)\), so we can consider its individual contribution to the objective value. The objective coefficient of \(y_1(z)\) is negative, so for any z (including the worst-case) the best possible contribution of \(y_1(z)\) to the objective value is achieved if we set \(y_1(z)\) equal to its upper bound. Therefore, for the given \(x^{*}\), we have the following PWL PARO extension as a function of \(y_2(z)\):
Now that \(y_1(z)\) is eliminated in the objective, it remains to find the optimal decision rule for \(y_2(z)\). Variable \(y_2(z)\) now appears directly in the objective (9a) and through its occurrence in the decision rule \(y_1^{*}(z)\). For fixed z, the optimal \(y_2(z)\) is determined by solving
One can easily see that the objective is increasing in \(y_2(z)\), so for any z the best possible contribution of \(y_2(z)\) to the objective value is achieved if we set \(y_2(z)\) equal to its lower bound. Thus, for the given \(x^{*}\), we have the following PWL PARO extension:
Note that plugging in a PWL argument in a PWL function retains the piecewise linear structure. Therefore, we also obtain the following PWL PARO extension for \(y_1^{*}(z)\):
Note that we did not move adaptive variables in the objective to the constraints using an epigraph variable, as was done in Example 2. Using an epigraph variable for the objective ensures that each decision rule satisfying the bounds is worst-case optimal, but prevents comparing performance in other scenarios. Computationally, of course, the epigraph formulation has the major advantage that the problem remains a linear program. \(\square \)
Bemporad et al. [1] show worst-case optimality of PWL decision rules for right-hand side polyhedral uncertainty, i.e., ARO PWL decision rules in our terminology. Zhen et al. [30, Theorem 3] generalize this to problems of form (3) under particular assumptions on the uncertainty set. These decision rules are general PWL in \(\varvec{z}\) for all variables \(y_j\), \(j\ne l\), where \(y_l\) is the last variable eliminated in the FME procedure; the decision rule for \(y_l\) is convex or concave PWL. These results solely consider the performance of PWL decision rules in the worst case. Example 3 illustrates that for any ARF \(\varvec{x}\) there exists a PWL PARO extension \(\varvec{y}(\cdot )\). The lemma below formalizes this claim.
Lemma 3
For any \(\varvec{x}\) that is ARF to (3) there exists a PARO extension \(\varvec{y}(\varvec{z})\) that is PWL in \(\varvec{z}\).
We present two proofs to Lemma 3; one via FME using the idea of Example 3, and one via basic solutions in linear optimization.
Proof of Lemma 3 via FME
See Appendix B.5. \(\square \)
Proof of Lemma 3 via linear optimization
See Appendix B.6. \(\square \)
In both proofs the constructed decision rule is in fact optimal for all scenarios in the uncertainty set; for fixed \(\varvec{x}\), such scenario-wise optimality is necessary for a PARO extension. The following theorem establishes the existence of PARO PWL decision rules.
Theorem 2
If \({\mathcal {X}}\) is compact, there exists a PARO \(\varvec{y}(\cdot )\) for (3) such that \(\varvec{y}(\varvec{z})\) is PWL in \(\varvec{z}\).
Proof
According to Theorem 1 there exists a PARO \(\varvec{x}\), and according to Lemma 3 there exists a PARO extension \(\varvec{y}(\cdot )\) for this \(\varvec{x}\) that is PWL in \(\varvec{z}\). It immediately follows that \(\varvec{y}(\cdot )\) is PARO. \(\square \)
4 Constructing PARO solutions—special cases
The methods used in the existence proofs of Sect. 3 are not computationally tractable and provide little guidance for finding PARO solutions in practice. In this section, we discuss several practical methods for finding PARO solutions for special cases of problem (3). Several results assume that \(\varvec{d} = \varvec{0}\), i.e., that Stage-2 variables do not appear in the objective. This can be a limiting assumption, although it is satisfied in applications where ARO is used to maintain robust feasibility [17, 24]. In the next section, we present a general methodology that does not rely on this assumption.
First, we consider special cases of problem (3) where particular decision rule structures guarantee PARO solutions in case \(\varvec{d} = \varvec{0}\). Subsequently, we show how for fixed \(\varvec{x}\) we can check whether \(\varvec{y}(\cdot )\) is a PARO extension. After that, we consider an application of the finite subset approach of Hadjiyiannis et al. [15]. Lastly, we present a method for finding a PARO solution if \(\varvec{d} = \varvec{0}\) and a convex hull description of the uncertainty set is available.
4.1 Known worst-case optimal decision rules
In Appendix A, using an FME lens, we analyze various decision rule structures, considering special uncertainty set forms, including so-called constraintwise, hybrid and block forms. This enables us to prove that an ARF decision rule with a particular structure exists for every ARF \(\varvec{x}\), instead of solely proving it is optimal for an ARO \(\varvec{x}\). For example, for ARO problems with hybrid uncertainty, for any ARF Stage-1 decision there exists an ARF decision that depends only on the non-constraintwise uncertain parameter. It turns out that, in case there are no adaptive variables in the objective, PRO solutions to the static problem obtained after plugging in that decision rule structure are PARO solutions to the original ARO problem. To formalize this, let \(\varvec{y}(\varvec{z}) = f_{\varvec{w}}(\varvec{z})\) be a decision rule with known form f (e.g., linear or quadratic) and finite number of parameters \(\varvec{w} \in {\mathbb {R}}^p\), such that \(f_{\varvec{w}}(\varvec{z}) \in {\mathcal {R}}^{L,n_y}\) for any \(\varvec{w}\).
Theorem 3
Let P denote an ARO problem of form (3) with \(\varvec{d} = \varvec{0}\) and where for any ARF \(\varvec{x}\) there exists an ARF decision rule of form \(\varvec{y}^{*}(\varvec{z}) = f_{\varvec{w}}(\varvec{z})\) for some \(\varvec{w}\). Then any \(\varvec{x}^{*}\) that is PRO to the static robust optimization problem obtained after plugging in decision rule structure \(f_{\varvec{w}}(\varvec{z})\) is PARO to P.
Proof
See Appendix B.7. \(\square \)
Due to Lemmas A.1 to A.3, the following result immediately follows for hybrid, block and simplex uncertainty.
Corollary 2
(i) Let \(P_{\text {hybrid}}\) denote an ARO problem of form (3) with \(\varvec{d}=\varvec{0}\) and hybrid uncertainty. Let Q denote the static robust optimization problem obtained from \(P_{\text {hybrid}}\) by plugging in a decision rule structure that depends only on the non-constraintwise parameter. Any \(\varvec{x}^{*}\) that is PRO to Q is PARO to \(P_{\text {hybrid}}\).
(ii) Let \(P_{\text {block}}\) denote an ARO problem of form (3) with \(\varvec{d}=\varvec{0}\) and block uncertainty. Let Q denote the static robust optimization problem obtained from \(P_{\text {block}}\) by plugging in a decision rule structure where adaptive variable \(\varvec{y}_{(v)}^{*}(\cdot )\) depends only on \(\varvec{z}_{(v)}\) for all \(v=1,\dotsc ,V\). Then any \(\varvec{x}^{*}\) that is PRO to Q is PARO to \(P_{\text {block}}\).
(iii) Let \(P_{\text {simplex}}\) denote an ARO problem of form (3) with \(\varvec{d}=\varvec{0}\) and a simplex uncertainty set, i.e., \(U = \text {Conv}(\varvec{z}^1,\dotsc ,\varvec{z}^{L+1})\), with \(\varvec{z}^{j} \in {\mathbb {R}}^{L}\) such that \(\varvec{z}^1,\dotsc ,\varvec{z}^{L+1}\) are affinely independent. Let Q denote the static robust optimization problem obtained from \(P_{\text {simplex}}\) by plugging in an LDR structure. Then any \(\varvec{x}^{*}\) that is PRO to Q is PARO to \(P_{\text {simplex}}\).
Proof
See Appendix B.8. \(\square \)
The case with constraintwise uncertainty is a special case of Corollary 2(i). The case with one uncertain parameter is a special case of Corollary 2(iii). Note that, unlike for worst-case optimization in Appendix A, it is necessary that \(\varvec{d}=\varvec{0}\), because our definition of PRO involves the term \(\varvec{d}\). If \(\varvec{d} \ne \varvec{0}\), the results above do not hold. This is also illustrated in Example 1 in Sect. 1.
The results of Corollary 2 can be combined. For example, for problems with both simplex uncertainty and hybrid uncertainty, Corollary 2(i) and Corollary 2(iii) together imply that one needs to consider only decision rules that are affine in the non-constraintwise parameter, if there are no adaptive variables in the objective. Simplex uncertainty sets arise in a variety of applications and can be used to approximate other uncertainty sets [4].
4.2 Check whether a decision rule is a PARO extension
If the Stage-1 decision \(\varvec{x}\) is fixed, one can verify whether the decision rule \(\varvec{y}\) is a PARO extension (Definition 8) as follows.
Lemma 4
Let \((\varvec{x}^{*}, \varvec{y}^{*}(\cdot ))\) be an ARO solution to (3). Consider the problem
If the optimal objective value is zero, \(\varvec{y}^{*}(\cdot )\) is a PARO extension of \(\varvec{x}^{*}\). If the optimal objective value is positive, then \(\varvec{y}^{*}(\cdot )\) is not a PARO extension of \(\varvec{x}^{*}\) and the suboptimality of \(\varvec{y}^{*}(\cdot )\) is bounded by the optimal objective value.
Proof
See Appendix B.9. \(\square \)
If the optimal value to (12) is positive and \((\varvec{{\bar{z}}},\varvec{{\bar{y}}})\) denotes an optimal solution to (12), then \(\varvec{{\bar{z}}}\) is a scenario where the suboptimality bound is attained, and \(\varvec{{\bar{y}}}\) is an optimal decision for this scenario. Also, note that if the optimal value of (12) equals zero, the pair \((\varvec{x}^{*}, \varvec{y}^{*}(\cdot ))\) need not be PARO; there may be a different pair \((\varvec{{\hat{x}}}, \varvec{{\hat{y}}}(\cdot ))\) that dominates the current pair.
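The flavor of check (12) can be illustrated on a toy instance (entirely hypothetical, with a discretized uncertainty set): for a fixed Stage-1 decision, compare the decision rule's objective against the best attainable objective per scenario, and take the largest gap.

```python
# Sketch of the Lemma 4 check on a hypothetical toy: Stage-2 problem
# min_y { y : y >= z } over U = [0, 1], with the fixed constant decision rule
# y*(z) = 1. The rule is worst-case optimal (z = 1), but problem (12) exposes
# its suboptimality in non-worst-case scenarios.
zs = [i / 10 for i in range(11)]   # discretized uncertainty set

def rule(z):
    return 1.0                     # candidate decision rule y*(z)

def best_recourse(z):
    return z                       # optimal Stage-2 decision for scenario z

# Discretized analogue of (12): max over scenarios of the objective gap.
gap = max(rule(z) - best_recourse(z) for z in zs)
# gap = 1.0 > 0, attained at z = 0: y*(.) is not a PARO extension, and its
# suboptimality is bounded by 1.0.
```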
4.3 Unique ARO solution on finite subset of scenarios is PARO
The finite subset approach of Hadjiyiannis et al. [15] can be used in a PARO setting as well. If the lower bound problem has a unique optimal solution and this solution is feasible to the original problem, it is a PARO solution to the original problem. This is formalized in Lemma 5.
Lemma 5
Let \(S = \{\varvec{z}^1,\dotsc ,\varvec{z}^N\}\) denote a finite set of scenarios, \(S \subseteq U\). Let \(\varvec{x}^{*}\) be the unique ARO Stage-1 solution for which there exist \(\varvec{y}^{1*},\dotsc ,\varvec{y}^{N*}\) such that \((\varvec{x}^{*},\varvec{y}^{1*},\dotsc ,\varvec{y}^{N*})\) is an optimal solution to
Then \(\varvec{x}^{*}\) is PARO to (3).
Proof
Let \((\varvec{{\bar{x}}}, \varvec{{\bar{y}}}(\cdot ))\) be ARO to (3) with \(\varvec{{\bar{x}}}\) unequal to \(\varvec{x}^{*}\). Then the solution \((\varvec{{\bar{x}}},\varvec{{\bar{y}}}(\varvec{z}^1),\dotsc ,\varvec{{\bar{y}}}(\varvec{z}^N))\) is feasible to (13). Because \(\varvec{x}^{*}\) is the unique ARO Stage-1 solution that can be extended to an optimal solution of (13), it holds that
That is, for each \(\varvec{{\bar{x}}}\) that is ARO to (3) and unequal to \(\varvec{x}^{*}\) there is at least one scenario \(\varvec{z}^i\) in U for which \(\varvec{x}^{*}\) outperforms \(\varvec{{\bar{x}}}\). This implies that \(\varvec{x}^{*}\) is PARO to (3). \(\square \)
Objective (13a) optimizes for the worst-case scenario \(\varvec{z}^i\) in set S, and (13b) ensures that \(\varvec{x}^{*}\) is feasible for all scenarios in S. It should be noted that requiring \(\varvec{x}^{*}\) to be both ARO to (3) and a unique optimal solution to (13) is quite restrictive.
4.4 Convex hull description of scenario set
Next, consider the case where the uncertainty set is given by the convex hull of a finite set of points, i.e., \(U = \text {Conv}(\varvec{z}^1,\dotsc ,\varvec{z}^N)\). Additionally assume that there are no adaptive variables in the objective. Then (3) is equivalent to
Analogously to Iancu and Trichakis [16], after finding an ARO solution we can perform an additional step: re-optimizing over the set of ARO solutions for a scenario in the relative interior (denoted by \(\text {ri}(\cdot )\)) of the convex hull of our finite set of scenarios. This produces a PARO Stage-1 solution to (3).
Lemma 6
Let \(\varvec{d}=\varvec{0}\). Let \(U = \text {Conv}(\varvec{z}^1,\dotsc ,\varvec{z}^N)\), \(\varvec{{\bar{z}}} \in \text {ri}(U)\) and let \((\varvec{x}^{*},\varvec{y}^{1*},\dotsc ,\varvec{y}^{N*})\) denote an optimal solution to
where OPT denotes the optimal objective value of (14). Then \(\varvec{x}^{*}\) is PARO to (3).
Proof
See Appendix B.10. \(\square \)
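The two-step procedure of Lemma 6 can be sketched on a toy with two vertices (all data hypothetical); since \(\varvec{d}=\varvec{0}\), the cost at an interior scenario is modeled here, for illustration only, as the matching convex combination of the vertex costs.

```python
# Lemma 6 sketch on hypothetical data: U = Conv(z1, z2), two candidate
# Stage-1 decisions, objective values listed per vertex (lower is better).
costs = {"xA": [10.0, 10.0], "xB": [10.0, 6.0]}   # per-vertex objective values

# Step 1: worst-case optimization over the vertices, as in (14).
worst = {x: max(v) for x, v in costs.items()}
OPT = min(worst.values())                          # worst-case optimal value
aro = [x for x in costs if worst[x] == OPT]        # both xA and xB are ARO

# Step 2: among ARO solutions, optimize for an interior scenario
# z_bar = 0.5*z1 + 0.5*z2 in ri(U); this tie-break yields a PARO solution.
lam = 0.5
interior = {x: lam * costs[x][0] + (1 - lam) * costs[x][1] for x in aro}
x_paro = min(interior, key=interior.get)           # picks xB
```

The point of the second step is exactly the tie-break: both solutions are worst-case optimal, but only one of them cannot be Pareto dominated.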
5 Constructing PARO solutions—general methodology
Adaptive robust optimization problems of form (3) are in general NP-hard [14], and finding ARO solutions is still the focus of ongoing research [27]. Thus, finding a general methodology that, given an ARO solution to (3), produces a PARO solution is not an easy task either. In this section we present a general methodology for finding PARO solutions to ARO problems of form (3), which involves solving a sequence of bilinear optimization problems; by solving these heuristically, the method can be used to find ‘approximate’ PARO solutions in practice. We first describe the methodology, and then discuss how to solve the bilinear optimization problems.
5.1 Constraint & column generation procedure for finding a PARO solution
The starting point is the constraint-and-column generation (C&CG) method of [28] for finding ARO solutions; we first briefly describe this method. Let \(M \subseteq {U}\) denote a finite set of scenarios, e.g., solely the nominal scenario. Consider the following LP:
The optimal Stage-1 solution \(\varvec{{\hat{x}}}\) to \(P_1(M)\) is ARO for the set M, but not necessarily ARO (or ARF) for the true uncertainty set U. Define the subproblem
and let \(q(\varvec{{\hat{x}}})\) denote the optimal objective value. Either \(q(\varvec{{\hat{x}}})\) is finite with optimal solution \(\varvec{{\hat{z}}}\), or a scenario \(\varvec{{\hat{z}}}\) is identified where the inner minimization problem is infeasible; in that case, set \(q(\varvec{{\hat{x}}})\) to \(+\infty \) by convention. In either case, the solution \(\varvec{{\hat{z}}}\) is a scenario in U where the Stage-1 solution \(\varvec{{\hat{x}}}\) to \(P_1(M)\) performs worst. If the optimal objective value of \(Q(\varvec{{\hat{x}}})\) is higher than the optimal objective value of \(P_1(M)\), then M did not contain the worst-case scenario, and \(\varvec{{\hat{x}}}\) is not ARO. That is, \(\varvec{{\hat{z}}}\) provides either a feasibility or (worst-case) optimality cut. The scenario \(\varvec{{\hat{z}}}\) is added to M, and the procedure is repeated until both objective values are sufficiently close. Algorithm 1 provides the pseudocode; see Zeng and Zhao [28] for more details.
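The loop just described can be sketched as follows; the toy instance, with closed-form master and subproblem, is entirely hypothetical.

```python
# Minimal runnable C&CG sketch on a hypothetical toy instance:
# min_x x  s.t.  x >= z for all z in U = [0, 1], with x in [0, 10].
# The master solves over a finite scenario set M; the subproblem returns the
# worst-case scenario for the current x (here z = 1 in closed form).
def master(M):
    x = max(M)           # smallest x that is feasible for every z in M
    return x, x          # (Stage-1 solution, objective value mu)

def subproblem(x):
    z_hat = 1.0          # worst-case scenario for any x in this toy
    q = max(z_hat, x)    # true worst-case objective of x
    return z_hat, q

M, mu = [0.0], -float("inf")
while True:
    x_hat, mu = master(M)
    z_hat, q = subproblem(x_hat)
    if q <= mu + 1e-9:   # scenario set already contains the worst case
        break
    M.append(z_hat)      # constraint-and-column generation step

# The loop terminates at the worst-case optimal Stage-1 decision x = 1.
```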
The result of termination of Algorithm 1 after k iterations is an ARO solution \(\varvec{x}^{\text {ARO}}:= \varvec{x}^k\); additionally let \(M^{\text {ARO}}:= M^k\) denote the resulting set of scenarios and let \(OPT:= \mu ^k\) denote the worst-case optimal objective value.
A PARO solution can be found by appending another C&CG procedure to Algorithm 1. In each iteration k of the second C&CG procedure, we find a candidate Stage-1 solution \(\varvec{x}^c\) that satisfies three conditions:
Condition 1
(i) \(\varvec{x}^c\) is feasible for all scenarios \(\varvec{z}^l \in M^k\).
(ii) \(\varvec{x}^c\) results in an objective value of at most \(v(\varvec{x}^k,\varvec{z}^l)\) for all scenarios \(\varvec{z}^l \in M^k\), with
$$\begin{aligned} v(\varvec{x}^k,\varvec{z}^l) = {\left\{ \begin{array}{ll} \text {OPT} &{} \forall \varvec{z}^l \in M^{\text {ARO}} \\ \varvec{c}(\varvec{z}^l)^{\top }\varvec{x}^k + \min _{\varvec{y}} \{\varvec{d}^{\top } \varvec{y} | \varvec{A}(\varvec{z}^l)\varvec{x}^k + \varvec{B}\varvec{y} \le \varvec{r}(\varvec{z}^l) \} &{} \forall \varvec{z}^l \in M^k \backslash M^{\text {ARO}}, \end{array}\right. } \end{aligned}$$
(18)
i.e., \(\varvec{x}^c\) is feasible on the original scenario set \(M^{\text {ARO}}\), and performs at least as well as \(\varvec{x}^k\) on any scenario subsequently added to \(M^k\).
(iii) \(\varvec{x}^c\) results in a strictly lower objective value than the current Stage-1 solution \(\varvec{x}^k\) on a new (to be determined) scenario \(\varvec{z}^c\).
This is achieved by solving the following optimization problem:
Constraints (19c) and (19d) enforce Conditions 1(i) and 1(ii), respectively. The objective aims to find the scenario \(\varvec{z}^c\) where the candidate solution \(\varvec{x}^c\) most outperforms \(\varvec{x}^k\). The inner maximization problem finds the optimal recourse decision \(\varvec{y}^k\) for the current solution \(\varvec{x}^k\) at \(\varvec{z}^c\). Constraint (19b) ensures that \(\varvec{x}^c\) is feasible for the new scenario \(\varvec{z}^c\). Note that setting \(\varvec{x}^c = \varvec{x}^k\) is feasible and yields objective value 0, so \(P_2(\varvec{x}^k,M^k)\) is always feasible and has a nonpositive optimal objective value.
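A brute-force, discretized analogue of (19) can make these conditions concrete. All data below are hypothetical, and the sign is flipped relative to (19): we maximize the improvement of \(\varvec{x}^c\) over \(\varvec{x}^k\), so a strictly positive gain corresponds to a strictly negative optimal value of (19).

```python
# Brute-force analogue of problem (19) on a discretized toy: objective
# obj(x, z) = max(x, z), lower is better, U = [0, 1]. Every x in [0, 1] is
# worst-case optimal (value 1 at z = 1), but smaller x is weakly better in
# every scenario, so a dominating candidate exists.
U = [i / 10 for i in range(11)]         # discretized uncertainty set
M = [1.0]                               # scenarios accumulated so far

def obj(x, z):
    return max(x, z)

x_k = 0.5                               # current worst-case optimal solution
best_gain, x_c_best, z_c_best = 0.0, x_k, None
for x_c in [i / 10 for i in range(11)]:
    if any(obj(x_c, z) > obj(x_k, z) + 1e-9 for z in M):
        continue                        # violates Condition 1(ii) on M
    for z_c in U:                       # Condition 1(iii): strict gain somewhere
        gain = obj(x_k, z_c) - obj(x_c, z_c)
        if gain > best_gain + 1e-9:
            best_gain, x_c_best, z_c_best = gain, x_c, z_c

# x_c = 0 improves on x_k = 0.5 by 0.5 in scenario z = 0, at no worst-case cost.
```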
If the optimal objective value of (19) is strictly negative, then candidate solution \(\varvec{x}^c\) satisfies Conditions 1(i) to 1(iii), i.e., it dominates \(\varvec{x}^k\) on set \(M^k\). However, because \(M^k\) is only a subset of U, a candidate solution \(\varvec{x}^c\) is not necessarily ARO (or even ARF). Thus, we again solve problem \(Q(\varvec{x}^c)\); let \(q(\varvec{x}^c)\) denote its objective value and \(\varvec{z}^q\) the optimizer. We distinguish two cases:
- \(q(\varvec{x}^c) \le \text {OPT}\). Candidate solution \(\varvec{x}^c\) is an ARO solution that dominates \(\varvec{x}^k\) on set \(M^k \cup \{\varvec{z}^c\}\). For the next iteration of the C&CG procedure we update the current best solution: \(\varvec{x}^{k+1} = \varvec{x}^c\). The new set of scenarios is \(M^{k+1} = M^k \cup \{\varvec{z}^c\}\). Further improvements over \(\varvec{x}^c\) may be possible, so it is not necessarily PARO. In preparation for the next iteration, compute \(v(\varvec{x}^{k+1},\varvec{z}^l)\) for all \(\varvec{z}^l \in M^{k+1}\).
- \(q(\varvec{x}^c) > \text {OPT}\). Candidate solution \(\varvec{x}^c\) is not ARO, so we set \(\varvec{x}^{k+1} = \varvec{x}^k\). For the next iteration we set \(M^{k+1} = M^k \cup \{\varvec{z}^q\}\), which renders solution \(\varvec{x}^c\) infeasible in the next iteration. In preparation for the next iteration, compute \(v(\varvec{x}^{k+1},\varvec{z}^c)\).
The above procedure is repeated until the optimal objective value of (19) equals zero (recall that it is always nonpositive). Algorithm 2 describes the resulting C&CG algorithm and Lemma 7 proves that it yields a PARO Stage-1 solution.
Note that multiple PARO solutions may exist. Depending on the input solution \(\varvec{x}^{\text {ARO}}\), Algorithm 2 may potentially generate a different PARO solution.
Lemma 7
A solution \(\varvec{x}^{\text {PARO}}\) obtained from Algorithm 2 is PARO to (3).
Proof
See Appendix B.11. \(\square \)
In the special case that the vertices of U can be explicitly enumerated, the vertex set can be used to initialize \(M^k\). In this case, any candidate solution \(\varvec{x}^c\) is feasible on all vertices of U, so it is guaranteed to be ARO. Consequently, there is no need to solve subproblem \(Q(\varvec{x}^c)\) in each iteration, and Algorithm 2 can be simplified. Note that this special case is the same as the case discussed in Sect. 4.4, but with \(\varvec{d} \ne \varvec{0}\).
5.2 Hints for solving the bilinear optimization problems
Algorithm 2 requires solving both problems \(P_2\) and Q in each iteration; Q is also solved in each iteration of Algorithm 1. Unfortunately, both are intractable in general, because in both problems the feasible region of the inner optimization problem depends on the variables of the outer optimization problem. For Q, dualizing the inner minimization problem results in the following bilinear problem:
Similarly, for \(P_2\) dualizing the inner maximization problem also results in a bilinear optimization problem:
For both (20) and (21), we propose to use a simple alternating direction heuristic, also known as mountain climbing, which is guaranteed to converge to a local optimum [19].
We describe this approach in more detail for (21). For some initial \(\varvec{z}^c\) one can determine the optimal \(\{\varvec{x}^c,\varvec{y}^c,\varvec{y}^1,\dotsc ,\varvec{y}^{|M^k|}\}\) by solving an LP. Subsequently, we alternate between optimizing for \(\varvec{z}^c\) and \(\{\varvec{\lambda },\varvec{x}^c,\varvec{y}^c,\varvec{y}^1,\dotsc ,\varvec{y}^{|M^k|}\}\) while keeping the other set of variables at their current value. For either set of variables, the problem is an LP. This is continued until two consecutive LP problems yield the same objective value.
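On a toy bilinear problem (all data hypothetical), the alternating procedure can be sketched as follows; each step fixes one block of variables and solves the resulting (here one-dimensional) LP at a box endpoint.

```python
# Mountain-climbing sketch for a hypothetical toy bilinear problem:
# maximize f(x, z) = x*z + 0.3*x over the boxes x in [-1, 1], z in [-1, 1].
def argmax_box(coef, lo=-1.0, hi=1.0):
    return hi if coef >= 0 else lo       # linear objective -> pick an endpoint

def mountain_climb(z0, tol=1e-9):
    z, prev = z0, None
    while True:
        x = argmax_box(z + 0.3)          # optimize x with z fixed (an LP)
        z = argmax_box(x)                # optimize z with x fixed (an LP)
        val = x * z + 0.3 * x
        if prev is not None and abs(val - prev) <= tol:
            return x, z, val             # two consecutive LPs agree: stop
        prev = val

# The attained local optimum depends on the starting point, as noted below.
x1, z1, v1 = mountain_climb(0.5)    # reaches the global optimum, value 1.3
x2, z2, v2 = mountain_climb(-0.5)   # stalls in a local optimum, value 0.7
```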
The solution quality of the mountain climbing procedure depends on the starting value for \(\varvec{z}^c\). One option is to simply pick the nominal scenario, if it is defined. A better starting solution can be obtained by plugging in an LDR for \(\varvec{y}^k\) in \(P_2\), i.e., by solving
additionally subject to (19b)–(19e). This is a static linear robust optimization problem.
This mountain climbing procedure can also be used to find a local optimum to (20). A starting solution for \(\varvec{z}\) can be found by plugging in an LDR for \(\varvec{y}\) in (17), and subsequently one can alternate between solving for \(\varvec{z}\) and \(\varvec{\lambda }\) in (20).
By using a heuristic approach to solving (19), it is possible that at a certain iteration k in Algorithm 2 we obtain an estimate \({\hat{p}}_k \ge 0\) while the true \(p_k<0\), so that the algorithm terminates without finding a PARO solution. Nevertheless, even solutions that are not proven to be PARO can improve upon the original ARO solution. Similarly, if (20) is not solved to optimality, we might falsely conclude that a solution is ARO. In the numerical experiments this risk is reduced by using multiple starting points for \(\varvec{z}^c\).
There are many different approaches to (approximately) solving problems (20) and (21), or equivalently, problems Q and \(P_2\); we conclude this section by providing some examples and references. In the case of RHS uncertainty, the following approach gives an exact solution to (19). The inner minimization problem is an LP for which we can write down the optimality conditions. Subsequently, the complementary slackness conditions can be linearized using big-M constraints and auxiliary binary variables. This results in an exact mixed binary convex reformulation (mixed binary linear if U is polyhedral). This reformulation was previously used in the C&CG method of Zeng and Zhao [28] for ARO problems with a polyhedral uncertainty set, and to solve bilinear optimization problems with a disjoint uncertainty set [32]. One can verify that in the case of RHS uncertainty, (21) indeed has disjoint polyhedral feasible regions for \(\lambda \) and \(\{\varvec{z}^c,\varvec{x}^c,\varvec{y}^c,\varvec{y}^1,\dotsc ,\varvec{y}^{|M^k|}\}\).
Another possible approach is the Reformulation-Perspectification Technique proposed by Zhen et al. [31]. An advantage of this approach is that it yields both an upper and a lower bound on the optimal objective value, and it can easily be embedded in a Branch & Bound framework to find a globally optimal solution.
For alternative approaches to solving bilinear optimization problems, we refer to Konno [19], Nahapetyan [23] and Zhen et al. [32]. Some bilinear formulations can be solved directly by Gurobi 9.0 [13].
6 Numerical experiments
To demonstrate the value of PARO solutions in practice, we focus on an example problem in which (adaptive) robust optimization has been successfully applied: a facility location problem.
6.1 Problem description
Consider a strategic decision-making problem where a number of facilities are to be opened, in order to satisfy the demand of a number of customers. The goal is to choose the locations for opening a facility such that the cost for opening the facilities plus the transportation cost for satisfying demand is minimized. We consider this problem in a two-stage setting with uncertain demand. Thus, facility opening decisions need to be made in Stage 1, before Stage-2 demand is known.
Suppose there are n locations where a facility can be opened, and m demand locations. Let \(\varvec{x} \in \{0,1\}^n\) be a binary Stage-1 decision variable denoting the facility opening decisions. Opening facility i costs \(f_i\) and yields capacity \(s_i\), \(i=1,\dotsc ,n\). Let \(\varvec{y} \in {\mathcal {R}}^{m, m\times n}\) be the Stage-2 decision variable denoting the transport from facility i to demand location j, with associated cost \(c_{ij}\), \(i=1,\dotsc ,n\), \(j=1,\dotsc ,m\). Let \(z_j\) denote the uncertain demand in location j. The two-stage facility location model reads
with uncertainty set
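As an illustration, the Stage-2 transportation subproblem of model (23) can be solved per scenario with an off-the-shelf LP solver. Below is a minimal sketch with scipy on hypothetical toy data (2 facilities, 2 demand locations); the variable layout and data are our own assumptions, not the implementation used in the experiments.

```python
# Second-stage transport LP for model (23): given opening decisions x and a
# demand scenario z, ship y_ij >= 0 to meet demand at minimum cost, while
# respecting the opened capacities s_i * x_i. Toy data are hypothetical.
import numpy as np
from scipy.optimize import linprog

n, m = 2, 2                               # facilities, demand locations
s = np.array([15.0, 15.0])                # capacities s_i
C = np.array([[2.0, 12.0], [12.0, 2.0]])  # transport costs c_ij

def stage2_cost(x, z):
    # variables y_ij flattened row-wise: y[i*m + j]
    A_ub, b_ub = [], []
    for i in range(n):                    # capacity: sum_j y_ij <= s_i * x_i
        row = np.zeros(n * m)
        row[i * m:(i + 1) * m] = 1.0
        A_ub.append(row)
        b_ub.append(s[i] * x[i])
    A_eq = np.zeros((m, n * m))           # demand: sum_i y_ij = z_j
    for j in range(m):
        A_eq[j, j::m] = 1.0
    res = linprog(C.ravel(), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=A_eq, b_eq=z, bounds=(0, None))
    return res.fun if res.success else float("inf")

# With both facilities open, each location is served by its cheap facility:
cost = stage2_cost(x=[1, 1], z=np.array([10.0, 10.0]))   # 2*10 + 2*10 = 40
```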
6.2 Setup
Model (23) can be written in form (3) with right-hand side uncertainty. In order to find an (approximate) PARO solution \(\varvec{x}_{\text {PARO}}\), Algorithm 2 is used, including C&CG Algorithm 1 to find an ARO solution. In order to improve the likelihood of finding an ARO solution, problem (20) is solved (with the mountain climbing approach) using multiple starting points for \(\varvec{z}\): the LDR-based starting point and 5 randomly sampled starting points.
For comparison purposes, we also compute a PRO solution to (3) using the methodology of Iancu and Trichakis [16, Theorem 1], which we repeat for convenience. Specifically, we plug in LDR \(\varvec{y}(\varvec{z}) = \varvec{w}+\varvec{W}\varvec{z}\), and obtain optimal solution \((\varvec{x}_1^{*},\varvec{w}_1^{*},\varvec{W}_1^{*})\). Subsequently, we optimize for the nominal scenario \(\varvec{{\bar{z}}}\) whilst ensuring that performance in other scenarios does not deteriorate, and feasibility is maintained:
Constraint (24d) states that the new solution equals the original solution (variables with subscript 1) plus an adaptation (variables with subscript 2). Constraint (24b) ensures that the adaptation does not deteriorate performance in any scenario, and the objective is to optimize performance in scenario \(\varvec{{\bar{z}}}\). According to Iancu and Trichakis [16, Theorem 1], the optimal solution for \((\varvec{x},\varvec{w},\varvec{W})\) is PRO to (3). Let \(\varvec{x}_{\text {PRO}}\) denote the optimal solution for \(\varvec{x}\).
We compare the performance of the Stage-1 (here-and-now) solutions \(\varvec{x}_{\text {PARO}}\), \(\varvec{x}_{\text {ARO}}\) and \(\varvec{x}_{\text {PRO}}\). For solutions \(\varvec{x}_{\text {PARO}}\) and \(\varvec{x}_{\text {ARO}}\) we use the optimal recourse decision. For \(\varvec{x}_{\text {PRO}}\) we report the results for two decision rules: (i) the optimal recourse decision, (ii) the LDR. We refer to the four objective values as PARO, ARO, PRO and PRO(LDR). We compute the relative improvement (in \(\%\)) of PARO over the other three objective values for two different cases:
- Nominal: Relative improvement in the nominal scenario \(\varvec{{\bar{z}}}\).
- Maximum: Relative improvement in the scenario with the maximum performance difference between \(\varvec{x}_{\text {ARO}}\) and \(\varvec{x}_{\text {PARO}}\). This scenario, which we denote by \(\varvec{z}^{*}\), is found by solving (19) with fixed \(\varvec{x}^k = \varvec{x}_{\text {ARO}}\) and \(\varvec{x}^c = \varvec{x}_{\text {PARO}}\).
All optimization problems are solved using Gurobi 9.0 [13] with the dual simplex algorithm selected. The influence of different solvers could also be investigated, but this is beyond the scope of this paper. All computations are performed on an Intel Core i7-8565U PC with 16GB RAM, using all 8 threads.
During our numerical studies we found examples where Algorithm 2 was not able to improve upon the initial Stage-1 solution \(\varvec{x}_{\text {ARO}}\). This can occur if the initial \(\varvec{x}_{\text {ARO}}\) happens to be PARO, or if the ARO Stage-1 solution is unique; after all, not every ARO instance has multiple worst-case optimal Stage-1 solutions. The latter has been reported before in the literature: De Ruiter et al. [9] show that the multi-stage production-inventory model of Ben-Tal et al. [2] has unique here-and-now decisions in almost all time periods if LDRs are used. In that example, the reported multiplicity of solutions is mainly due to non-PRO decision rule coefficients. We find that multiplicity of Stage-1 solutions appears in particular when the problem data are integer.
6.3 Data
We solve the facility location problem for various combinations of demand locations m, potential facility locations n and maximum total demand \({\varGamma }\). For (m, n), we consider the pairs (10, 10), (10, 20), (20, 20), (20, 40), (50, 50), (50, 75), (100, 100) and (100, 150). For each pair, we consider a low, medium and high value of \({\varGamma }\), set at \(8.5m\), \(10m\) and \(11.5m\), respectively.
For each combination, we generate 100 random instances. Facility capacity \(s_i\) is set at 15 for each i. Other parameters are independently drawn from a discrete uniform distribution. Construction costs \(\varvec{f} \in {\mathbb {R}}^n\) are drawn between 4 and 22. Entries of transportation cost matrix \(\varvec{C} \in {\mathbb {R}}^{n\times m}\) are drawn between 2 and 12.
We set lower and upper bounds \(l_j=8\) and \(u_j=12\) for each demand location \(j=1,\dotsc ,m\). The nominal demand scenario is \({\bar{z}}_j = \frac{1}{2}(8 + {\varGamma }/m)\) for all j. Note that \(\varvec{{\bar{z}}} \in \text {ri}(U)\).
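The instance generation described above can be reproduced as follows (Python sketch; the seed and RNG choice are our own assumptions):

```python
import numpy as np

# Random instance generator following the setup in Sect. 6.3; the seed is a
# hypothetical choice for reproducibility, not taken from the paper.
def make_instance(m, n, Gamma, seed=0):
    rng = np.random.default_rng(seed)
    f = rng.integers(4, 23, size=n)         # construction costs in {4,...,22}
    C = rng.integers(2, 13, size=(n, m))    # transport costs in {2,...,12}
    s = np.full(n, 15.0)                    # capacity 15 for each facility
    l, u = np.full(m, 8.0), np.full(m, 12.0)
    z_bar = np.full(m, 0.5 * (8 + Gamma / m))   # nominal demand scenario
    return f, C, s, l, u, z_bar

# Medium Gamma = 10*m for the (10, 10) configuration gives nominal demand 9.
f, C, s, l, u, z_bar = make_instance(m=10, n=10, Gamma=100)
```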
The models with LDRs cannot be solved for the instances with \(m \ge 50\), due to memory limitations. Thus, for these instances the comparison to the PRO and PRO(LDR) objective values is not possible. For the other instances, the computation time for the LDR models is limited to 30 minutes.
6.4 Results
For the configurations with \(m=100\), none of the instances resulted in different Stage-1 solutions \(\varvec{x}_{\text {PARO}}\) and \(\varvec{x}_{\text {ARO}}\). An ARO solution \(\varvec{x}_{\text {ARO}}\) that cannot be improved is a PARO solution itself (up to suboptimality of the C &CG procedures), which may also be a useful result.
For the other configurations, Table 2 reports the number of instances with different Stage-1 solutions \(\varvec{x}_{\text {PARO}}\) and \(\varvec{x}_{\text {ARO}}\), along with the median required number of iterations and median computation time for obtaining these solutions.
The \(\ell _1\)-norm difference between Stage-1 solutions \(\varvec{x}_{\text {PARO}}\) and \(\varvec{x}_{\text {ARO}}\) represents the number of different facilities that are opened (out of n potential locations). For example, an \(\ell _1\) norm of 2 indicates that one solution opened facility i and another solution opened facility j, or one solution opened both facilities i and j and the other solution opened neither.
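The interpretation of the \(\ell _1\)-norm can be made concrete with a two-line sketch (the opening vectors are hypothetical):

```python
# \ell_1 difference between two binary opening vectors: the solutions below
# disagree at two locations (one opens the second facility, the other the
# third), so the norm equals 2.
x_paro = [1, 0, 1, 0, 1]
x_aro  = [1, 1, 0, 0, 1]
l1_diff = sum(abs(a - b) for a, b in zip(x_paro, x_aro))
```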
There is no clear trend as the instance size increases, either in the number of instances with different Stage-1 solutions or in the magnitude of the differences (\(\Vert \cdot \Vert _1\)). The median number of iterations also remains mostly constant, for both \(\varvec{x}_{\text {ARO}}\) and \(\varvec{x}_{\text {PARO}}\), and there is no clear effect of the \({\varGamma }\)-parameter on the iteration count. Over all instances in all configurations, the maximum number of iterations is 10 for \(\varvec{x}_{\text {ARO}}\) and 4 for \(\varvec{x}_{\text {PARO}}\). Thus, the increase in computation time of the ARO and PARO C&CG procedures for larger instances is predominantly due to the increased size of the optimization problems in each iteration.
The LDR model (for obtaining the PRO and PRO(LDR) objective values) does not necessarily find the worst-case optimal solution because (i) it is restricted to linear recourse and thus an approximation of the true ARO model, and (ii) the 30-minute time limit may be restrictive (this led to an optimality gap of at most \(0.79\%\) over all instances). Thus, the LDR model may overestimate the worst-case cost. At the same time, ARO does not necessarily find the worst-case scenario and may thus underestimate the true worst-case cost. Table 3 reports the worst-case differences. Particularly for instances with \((m,n) = (20,40)\), differences in worst-case performance are observed.
Figure 2 shows boxplots of the relative objective value improvement of PARO over ARO and (where applicable) over PRO and PRO(LDR) for the instances with different Stage-1 solutions \(\varvec{x}^{\text {PARO}}\) and \(\varvec{x}^{\text {ARO}}\). Figure 2a shows the improvement for nominal scenario \(\varvec{{\bar{z}}}\) and Fig. 2b shows the improvement for maximum difference scenario \(\varvec{z^{*}}\).
As expected, the magnitude of the differences is larger for maximum difference scenario \(\varvec{z}^{*}\) than for the nominal scenario \(\varvec{{\bar{z}}}\). The maximum relative improvement is substantial in most cases, but the median relative improvement is only minor. The relative improvement of PARO over the other solutions decreases as the instance size increases; this is best visible for the relative improvement of PARO over ARO (blue). For the instances with \(m=50\), even the maximum relative improvement is minor. The reason is that the magnitude of the (absolute) differences between \(\varvec{x}^{\text {PARO}}\) and \(\varvec{x}^{\text {ARO}}\) remains mostly constant (see Table 2), which leads to a decrease in the relative objective value differences as instances grow.
However, note that if the Stage-1 solution represents a decision that is to be implemented in practice, even the possibility of a small percentage improvement warrants the extra effort of obtaining an (approximate) PARO solution. We note that for ARO we use the first ARO solution found, \(\varvec{x}^{\text {ARO}}\); there may exist yet another ARO solution for which the improvement percentages of PARO over ARO are larger than those reported in Fig. 2.
By construction, the relative improvement of PARO over ARO is positive for the maximum difference scenario \(\varvec{z}^{*}\). However, \(\varvec{x}^{\text {PARO}}\) does not necessarily dominate \(\varvec{x}^{\text {ARO}}\) (see Footnote 6); for specific scenarios \(\varvec{x}^{\text {ARO}}\) may still perform better. Indeed, Fig. 2 shows that while the relative improvement is positive for the large majority of instances, it is occasionally negative. This may be partially due to the fact that the obtained solutions \(\varvec{x}^{\text {ARO}}\) and \(\varvec{x}^{\text {PARO}}\) are approximate, owing to the bilinear approximations. There are also instances where the relative difference between PARO and ARO is zero, i.e., a different Stage-1 solution \(\varvec{x}\) does not always translate into different performance on the two reported measures.
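The dominance check underlying these comparisons can be sketched as follows for a finite sample of scenarios (an illustrative sketch with hypothetical cost data; the paper's formal definition is over all scenarios in the uncertainty set):

```python
def dominates(costs_a, costs_b, tol=1e-9):
    """True if solution A is at least as good as B in every sampled scenario
    (lower cost) and strictly better in at least one of them."""
    weakly_better = all(a <= b + tol for a, b in zip(costs_a, costs_b))
    strictly_better = any(a < b - tol for a, b in zip(costs_a, costs_b))
    return weakly_better and strictly_better

# A is never worse and strictly better in the last scenario -> A dominates B:
print(dominates([3.0, 5.0, 4.0], [3.0, 5.0, 4.5]))  # True
# Each solution is better somewhere -> neither dominates:
print(dominates([3.0, 6.0], [4.0, 5.0]))            # False
```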
The relative improvement of PARO over PRO is smaller than that of PARO over PRO(LDR). Thus, reported differences are partially due to the different Stage-1 decision, and partially due to the Stage-2 decision rule.
7 Conclusion
In this paper, we dealt with Pareto efficiency in two-stage adaptive robust optimization problems. As in static robust optimization, the large majority of solution techniques focus only on worst-case optimality and may yield solutions that are not Pareto efficient. To alleviate this, the concept of Pareto Adaptive Robustly Optimal (PARO) solutions has been introduced in the literature, and it is formalized in a general framework in this paper (see Footnote 7).
Using FME as the predominant technique, we have analyzed the relation between PRO and PARO and investigated optimality of various decision rule structures in both worst-case and non-worst-case scenarios. We have shown that PARO here-and-now decisions exist and that there exists a PWL decision rule that is PARO.
Moreover, we have provided several practical approaches to generate or approximate PARO solutions. Numerical experiments on a facility location example demonstrate that PARO solutions can significantly improve performance in non-worst-case scenarios over ARO and PRO solutions.
A potential direction for future research would be to further investigate constructive approaches to find or approximate PARO solutions. In particular, it would be valuable to have tractable algorithms for larger instances and/or more general classes of problems than the ones that can be tackled using the currently presented approaches. Another interesting avenue for future (numerical) research would be to understand more precisely the circumstances under which multiplicity of ARO solutions occurs, and when first-found ARO solutions are (un)likely to be PARO.
Notes
To ease exposition, we overload and reuse certain acronyms, such as ARO for “Adaptive Robust Optimization” and “Adaptive Robustly Optimal”, as long as they can be readily disambiguated from the context.
To limit notational burden, we overload the notation \(\varvec{x}^c\) and \(\varvec{z}^c\) and use these for both the optimization variables in (19) and their values in the optimal solution.
In preliminary numerical experiments, the average relative improvement over 10 uniformly sampled scenarios was also computed. Results are very similar to the relative improvement in the nominal scenario and are therefore omitted.
The time limit is imposed solely for practical reasons. The resulting maximal optimality gap over all instances is \(0.79\%\).
PARO iteration count and solution time refer to Algorithm 2, i.e., they are additional to the required number of iterations and solution time for obtaining the ARO solution.
By definition of PARO, the solution \(\varvec{x}^{\text {PARO}}\) cannot be dominated, and the solution \(\varvec{x}^{\text {ARO}}\) is dominated by at least one PARO solution (if it is not PARO). However, that is not necessarily the current PARO solution. The current PARO solution is solely guaranteed to outperform \(\varvec{x}^{\text {ARO}}\) for all scenarios in set \(M^k\) in Algorithm 2.
For ARO problems, every non-PARO solution is dominated by a PARO solution, even if the former is Pareto Robustly Optimal (PRO), as defined by Iancu and Trichakis [16].
Note that constraintwise uncertainty in ARO differs from constraintwise uncertainty in static RO problems. Whereas in the latter uncertainty can always be treated ‘constraintwise’ [3, p. 11], in ARO this is not the case in general, due to the adaptive variables. Constraintwise uncertainty in ARO refers to problems that satisfy the construction of Definition A.1.
References
Bemporad, A., Borrelli, F., Morari, M.: Min–max control of constrained uncertain discrete-time linear systems. IEEE Trans. Autom. Control 48(9), 1600–1606 (2003)
Ben-Tal, A., Goryashko, A., Guslitzer, E., Nemirovski, A.: Adjustable robust solutions of uncertain linear programs. Math. Program. 99, 351–376 (2004)
Ben-Tal, A., El Ghaoui, L., Nemirovski, A.: Robust Optimization. Princeton University Press, Princeton Series in Applied Mathematics, Princeton (2009)
Ben-Tal, A., El Housni, O., Goyal, V.: A tractable approach for designing piecewise affine policies in two-stage adjustable robust optimization. Math. Program. 182, 57–102 (2020)
Bertsimas, D., Goyal, V.: On the power and limitations of affine policies in two-stage adaptive optimization. Math. Program. 134, 491–531 (2012)
Bertsimas, D., Tsitsiklis, J.N.: Introduction to Linear Optimization. Athena Scientific, Nashua (1997)
Bertsimas, D., Brown, D.B., Caramanis, C.: Theory and applications of robust optimization. SIAM Rev. 53(3), 464–501 (2011)
Bertsimas, D., Bidkhori, H., Dunning, I.: The price of flexibility, unpublished manuscript (2019)
De Ruiter, F.J.C.T., Brekelmans, R.C.M., den Hertog, D.: The impact of the existence of multiple adjustable robust solutions. Math. Program. 160, 531–545 (2016)
Fourier, J.: Reported in: Analyse des travaux de l’Académie, Royale des Sciences, pendant l’année 1824, Partie mathématique. In: Mémoires de l’Académie des sciences de l’Institut de France, vol 7, Académie des sciences (1827)
Gabrel, V., Murat, C., Thiele, A.: Recent advances in robust optimization: an overview. Eur. J. Oper. Res. 235, 471–483 (2014)
Gorissen, B.L., Yanıkoğlu, I., den Hertog, D.: A practical guide to robust optimization. Omega 53, 124–137 (2015)
Gurobi Optimization LLC: Gurobi optimizer reference manual. http://www.gurobi.com (2020)
Guslitser, E.: Uncertainty-immunized solutions in linear programming. Master's thesis, Technion - Israel Institute of Technology. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.623.2382&rep=rep1&type=pdf (2002)
Hadjiyiannis, M.J., Goulart, P.J., Kuhn, D.: A scenario approach for estimating the suboptimality of linear decision rules in two-stage robust optimization. In: 50th IEEE Conference on Decision and Control and European Control Conference, pp. 7386–7391 (2011)
Iancu, D.A., Trichakis, N.: Pareto efficiency in robust optimization. Manag. Sci. 60(1), 130–147 (2014)
Jabr, R.A.: Adjustable robust OPF with renewable energy sources. IEEE Trans. Power Syst. 28(4), 4742–4751 (2013)
Koçyiğit, Ç., Iyengar, G., Kuhn, D., Wiesemann, W.: Distributionally robust mechanism design. Manag. Sci. 66(1), 159–189 (2019)
Konno, H.: A cutting plane algorithm for solving bilinear programs. Math. Program. 11, 14–27 (1976)
Kuhn, D., Wiesemann, W., Georghiou, A.: Primal and dual linear decision rules in stochastic and robust optimization. Math. Program. 130, 177–209 (2009)
Marandi, A., den Hertog, D.: When are static and adjustable robust optimization problems with constraint-wise uncertainty equivalent? Math. Program. 170(2), 555–568 (2018)
Motzkin, T.: Beiträge zur Theorie der linearen Ungleichungen. Azriel, Jerusalem (1936)
Nahapetyan, A.H.: Bilinear programming. In: Floudas, C.A., Pardalos, P.M. (eds.) Encyclopedia of Optimization, pp. 279–288. Springer, Berlin (2009)
Polinder, G.J., Breugem, T., Dollevoet, T., Maróti, G.: An adjustable robust optimization approach for periodic timetabling. Transp. Res. B-Methodol. 128, 50–68 (2019)
Rockafellar, T.R.: Convex Analysis. Princeton University Press, Princeton (1970)
Ten Eikelder, S.C.M., Ajdari, A., Bortfeld, T., den Hertog, D.: Adjustable robust treatment-length optimization in radiation therapy. Optim. Eng. 23, 1949–1986 (2022)
Yanıkoğlu, I., Gorissen, B.L., den Hertog, D.: A survey of adjustable robust optimization. Eur. J. Oper. Res. 277, 799–813 (2019)
Zeng, B., Zhao, L.: Solving two-stage robust optimization problems using a column-and-constraint generation method. Oper. Res. Lett. 41(5), 457–461 (2013)
Zhen, J., den Hertog, D.: Computing the maximum volume inscribed ellipsoid of a polytopic projection. INFORMS J. Comput. 30(1), 31–42 (2018)
Zhen, J., den Hertog, D., Sim, M.: Adjustable robust optimization via Fourier–Motzkin elimination. Oper. Res. 66(4), 1086–1100 (2018)
Zhen, J., de Moor, D., den Hertog, D.: An extension of the Reformulation-Linearization Technique to nonlinear optimization. Optimization (2021)
Zhen, J., Marandi, A., de Moor, D., den Hertog, D., Vandenberghe, L.: Disjoint bilinear programming: a two-stage robust optimization perspective. INFORMS J. Comput. 34(5), 2410–2427 (2022)
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendices
Optimality of decision rule structures via an FME lens
As in Zhen et al. [30], we use FME as a proof technique for ARO and analyze various decision rule structures. The results in this section are either new or more general than those obtained in [30]; moreover, FME leads to more concise (and perhaps more intuitive) proofs of known results on optimal decision rule structures.
We consider several special cases of problem (3) for which particular decision rule structures are known to be optimal. We use FME to prove generalizations of these results for linear two-stage ARO problems. In particular, using FME as a proof technique enables us to show that the particular decision rule structure is not only ARO (i.e., worst-case optimal), but is ARF for each ARF Stage-1 decision \(\varvec{x}\). These results are used in analyzing PARO in Sect. 4.
We consider the cases where uncertainty appears (i) constraintwise, (ii) in a hybrid structure (part constraintwise, part non-constraintwise), and (iii) in a block structure, and we consider (iv) the case with a simplex uncertainty set, which includes the case with only one uncertain parameter.
In the current section, if FME is applied, w.l.o.g. it is applied on the adaptive variables in the order \(y_1(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z})\), i.e., according to their index. For the analysis of particular decision rule structures, it is crucial to keep track of the original constraints during the FME procedure. A frequently used technical result on this is provided in Lemma B.1 in Appendix B.1.
1.1 (i) Constraintwise uncertainty
Constraintwise uncertainty is formally defined as follows.
Definition A.1
ARO problem (3) has constraintwise uncertainty if there is a partition \(\varvec{z} = (\varvec{z}_{(0)},\varvec{z}_{(1)},\dotsc ,\varvec{z}_{(m)})\) such that \(\varvec{z}_{(0)},\dotsc ,\varvec{z}_{(m)}\) are disjoint, the objective depends only on \(\varvec{z}_{(0)}\), and constraint i depends only on \(\varvec{z}_{(i)}\), \(i=1,\dotsc ,m\). Additionally, \(U = \{(\varvec{z}_{(0)},\dotsc ,\varvec{z}_{(m)})~|~\varvec{z}_{(i)} \in U^i, i=0,\dotsc ,m\}\), with \(U^i \subseteq {\mathbb {R}}^{|\varvec{z}_{(i)}|}\) for all \(i=0,\dotsc ,m\). \(\square \)
Ben-Tal et al. [2] show that for constraintwise uncertainty (see Footnote 8), the optimal objective values of the static and the adaptive problem are equal, i.e., there exists an optimal static decision rule. Using FME, a generalization of their result is easily proved. We first provide an example.
Example A.1
Consider the following ARO problem with constraintwise uncertainty:
with \(U = [0,1]^3\). Uncertain parameter \(z_1\) occurs only in the first constraint and \((z_2,z_3)\) occur only in the second constraint. Using FME, we first eliminate \(y_1(\varvec{z})\) and subsequently eliminate \(y_2(\varvec{z})\):
From the bounds on \(y_2(\varvec{z})\) four linear constraints can be derived, two of which depend on x and are non-trivial:
One can verify that the (unique) ARO solution is \(x^{*} = \frac{1}{2}\). Additionally, note that the term \(\frac{1}{2}z_2 + \frac{1}{2}z_3\) appears in both upper bounds with a positive sign. As this is the only term that depends on \((z_2,z_3)\), it can be replaced by its worst-case value 0. Similarly, the term \(-\frac{1}{2}z_1\) appears in the lower bound on \(y_2(\varvec{z})\) with a negative sign, and can be replaced by its worst-case value \(-\frac{1}{2}\). This gives the following bounds on \(y_1(\varvec{z})\) and \(y_2(\varvec{z})\):
For \(y_2(\varvec{z})\), the only feasible (and hence ARO) decision rule is \(y_2(\varvec{z}) = \frac{3}{2}\). This implies \(y_1(\varvec{z}) = 1\), and we find that for both adaptive variables the optimal decision rule is static.
According to Lemma B.1, any term such as \(\frac{1}{2}z_2 + \frac{1}{2}z_3\) in Example A.1 appears in all upper bounds with a positive sign and all lower bounds with a negative sign, or vice versa. Hence, if this is the only term depending on \(z_2\) and \(z_3\), these uncertain parameters can be eliminated by replacing them with their worst-case value. The resulting bounds on adaptive variables are independent of uncertain parameters.
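The elimination steps used in these examples follow the standard Fourier–Motzkin procedure: rows are split by the sign of the coefficient of the variable being eliminated, and each lower-bound row is combined with each upper-bound row. A minimal sketch of one elimination step (our own illustration with hypothetical data, not the paper's implementation):

```python
import numpy as np

def fme_eliminate(A, b, j):
    """Eliminate variable j from the system A v <= b by Fourier-Motzkin:
    every pair of a lower-bound row (negative coefficient of v_j) and an
    upper-bound row (positive coefficient) is combined so that v_j cancels;
    rows without v_j are kept unchanged."""
    pos = [i for i in range(len(b)) if A[i, j] > 0]   # upper bounds on v_j
    neg = [i for i in range(len(b)) if A[i, j] < 0]   # lower bounds on v_j
    zero = [i for i in range(len(b)) if A[i, j] == 0]
    rows = [A[p] / A[p, j] - A[n] / A[n, j] for p in pos for n in neg]
    rhs = [b[p] / A[p, j] - b[n] / A[n, j] for p in pos for n in neg]
    rows += [A[z] for z in zero]
    rhs += [b[z] for z in zero]
    if not rows:
        return np.zeros((0, A.shape[1])), np.zeros(0)
    return np.vstack(rows), np.array(rhs)

# Variables v = (x, y):  y <= 1 + x  and  y >= x.
A = np.array([[-1.0, 1.0],
              [1.0, -1.0]])
b = np.array([1.0, 0.0])
A1, b1 = fme_eliminate(A, b, 1)  # eliminating y leaves 0*x <= 1: holds for all x
```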
Instead of directly providing a formal proof of the result for constraintwise uncertainty, it follows as a corollary from our analysis of hybrid uncertainty, which is considered next.
1.2 (ii) Hybrid uncertainty
Hybrid uncertainty is a generalization of constraintwise uncertainty, where part of the uncertain parameters appear constraintwise, and part does not appear constraintwise. This uncertainty structure has previously been considered in Marandi and den Hertog [21]. Hybrid uncertainty is defined as follows.
Definition A.2
ARO problem (3) has hybrid uncertainty if there is a partition \(\varvec{z} = (\varvec{{\hat{z}}},\varvec{z}_{(0)},\dotsc ,\varvec{z}_{(m)})\) such that \(\varvec{{\hat{z}}},\varvec{z}_{(0)},\dotsc ,\varvec{z}_{(m)}\) are disjoint, the objective depends only on \(\varvec{{\hat{z}}}\) and \(\varvec{z}_{(0)}\), and constraint i depends only on \(\varvec{{\hat{z}}}\) and \(\varvec{z}_{(i)}\), \(i=1,\dotsc ,m\). Additionally, \(U = \{(\varvec{{\hat{z}}},\varvec{z}_{(0)},\dotsc ,\varvec{z}_{(m)})~|~\varvec{{\hat{z}}} \in {\hat{U}}, \varvec{z}_{(i)} \in U^i, i=0,\dotsc ,m\}\), with \({\hat{U}} \subseteq {\mathbb {R}}^{|\varvec{{\hat{z}}}|}\) and \(U^i \subseteq {\mathbb {R}}^{|\varvec{z}_{(i)}|}\) for all \(i=0,\dotsc ,m\). \(\square \)
In case of hybrid uncertainty, there exist ARO decision rules that do not depend on the constraintwise uncertain parameters. We illustrate this with a toy example.
Example A.2
We extend Example A.1 to a problem with hybrid uncertainty by introducing a non-constraintwise uncertain parameter \({\hat{z}}\):
with \(U = [0,1]^4\). Uncertain parameter \({\hat{z}}\) occurs in both constraints, \(z_1\) occurs only in the first constraint and \((z_2,z_3)\) occur only in the second constraint. Using FME, we again first eliminate \(y_1(\varvec{z})\) and subsequently eliminate \(y_2(\varvec{z})\):
From the bounds on \(y_2(\varvec{z})\) again four linear constraints for x can be derived. The new parameter \({\hat{z}}\) does not break robustness of solution \(x^{*} = \frac{1}{2}\), so this is still the unique ARO solution. Similar to Example A.1, we can replace both occurrences of the term \(\frac{1}{2}z_2 + \frac{1}{2}z_3\) by its worst-case value 0, and \(-\frac{1}{2}z_1\) can be replaced by its worst-case value \(-\frac{1}{2}\). This yields the following bounds on \(y_1(\varvec{z})\) and \(y_2(\varvec{z})\):
For \(y_2(\varvec{z})\), the only feasible (and hence ARO) LDR is \(y_2({\hat{z}}) = \frac{3}{2} + \frac{1}{2}{\hat{z}}\). This implies \( 1 \le y_1(\varvec{z}) \le 1 + \frac{1}{2}{\hat{z}}\), and any decision rule that satisfies these bounds is ARO. Note that neither decision rule depends on the constraintwise uncertain parameters. One can also pick a PWL decision rule for \(y_2(\varvec{z})\), such as its lower or upper bound. In that case, too, the decision rules for \(y_1\) and \(y_2\) do not depend on \(z_1\), \(z_2\), or \(z_3\).
To formally prove our claim that there exist ARO decision rules that do not depend on the constraintwise uncertain parameters, we first need a result on feasibility.
Lemma A.1
Let \(P_{\text {hybrid}}\) denote an ARO problem of form (3) with hybrid uncertainty and let \(\varvec{x}^{*}\) be ARF to \(P_{\text {hybrid}}\). Then, there exists a decision rule \(\varvec{y}^{*}(\cdot )\) that depends only on \(\varvec{{\hat{z}}}\) such that \((\varvec{x}^{*},\varvec{y}^{*}(\cdot ))\) is ARF to \(P_{\text {hybrid}}\).
Proof
See Appendix B.12. \(\square \)
The following result is an immediate consequence of Lemma A.1 for ARO decisions.
Corollary A.1
Let \(P_{\text {hybrid}}\) denote an ARO problem of form (3) with hybrid uncertainty. For each \(\varvec{x}^{*}\) that is ARO to \(P_{\text {hybrid}}\) there exists a decision rule \(\varvec{y}^{*}(\cdot )\) depending only on \(\varvec{{\hat{z}}}\) such that the pair \((\varvec{x}^{*},\varvec{y}^{*}(\cdot ))\) is ARO to \(P_{\text {hybrid}}\).
Proof
See Appendix B.13. \(\square \)
In case of pure constraintwise uncertainty (\({\hat{U}} = \emptyset \)), Lemma A.1 shows that for each ARF \(\varvec{x}\) there exists a static \(\varvec{y}\) such that \((\varvec{x},\varvec{y})\) is ARF. Additionally, Corollary A.1 shows that for each ARO \(\varvec{x}^{*}\) there exists a static \(\varvec{y}^{*}\) such that \((\varvec{x}^{*},\varvec{y}^{*})\) is ARO.
Marandi and den Hertog [21] prove a similar result to Corollary A.1 for non-linear problems. More precisely, they prove that for problems with hybrid uncertainty there exists an optimal decision rule that is a function of only the non-constraintwise uncertain parameters if the problem is convex in the decision variables, concave in uncertain parameters, has a convex compact uncertainty set and a convex compact feasible region for the adaptive variables.
1.3 (iii) Block uncertainty
Suppose we can split the constraints into blocks, where each block has its own uncertain parameters and adaptive variables, and the uncertainty set is a Cartesian product of the block-wise uncertainty sets. Then there exists an optimal decision rule for each adaptive variable that depends only on the uncertain parameters in its own block.
The formal definition of block uncertainty is as follows. Recall that constraints are indexed \(1,\dotsc ,m\). Let index 0 refer to the objective.
Definition A.3
ARO problem (3) has block uncertainty if there exist partitions \(\varvec{z} =\) \( (\varvec{z}_{(1)},\dotsc ,\varvec{z}_{(V)})\), \(\varvec{y}(\cdot ) = (\varvec{y}_{(1)}(\cdot ),\dotsc ,\varvec{y}_{(V)}(\cdot ))\) and \(\{0,\dotsc ,m\} = K_{(1)} \cup \dotsb \cup K_{(V)}\) (with \(K_{(1)},\dotsc ,K_{(V)}\) disjoint) such that
-
\(U = \{(\varvec{z}_{(1)},\dotsc ,\varvec{z}_{(V)})~|~\varvec{z}_{(v)} \in U^v, v=1,\dotsc ,V\}\), with \(U^v \subseteq {\mathbb {R}}^{|\varvec{z}_{(v)}|}\) for all blocks \(v=1,\dotsc ,V\).
-
A constraint or objective with index in set \(K_{(v)}\) is independent of uncertain parameters \(\varvec{z}_{(w)}\) and adaptive variables \(\varvec{y}_{(w)}\) if block \(w\ne v\).
We first provide an example to develop some intuition for block uncertainty.
Example A.3
Consider again Example A.2. Add the following constraints to (25):
and let \(U = [0,1]^6\) denote the new uncertainty set. Then the first block consists of constraints (25b)–(25e), adaptive variables \(y_1(\varvec{z}), y_2(\varvec{z})\), and uncertain parameters \({\hat{z}}, z_1,\dotsc ,z_3\). The second block consists of constraints (26), adaptive variable \(y_3(\varvec{z})\), and uncertain parameters \(z_4\) and \(z_5\). One can verify that the unique ARO solution remains \(x^{*} = \frac{1}{2}\). The following bounds on \(y_3(\varvec{z})\) are obtained:
One feasible (and hence ARO) decision rule is \(y_3(z_4,z_5) = \frac{1}{2}(1+z_5-z_4)\). The decision rules for \(y_1\) and \(y_2\) remain unchanged. It follows that for each adaptive variable the optimal decision rule is a function of only the uncertain parameters in its own block.
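The block partition of Definition A.3 can be recovered mechanically: two constraints belong to the same block precisely when they are linked, possibly transitively, through shared adaptive variables or uncertain parameters. A minimal union-find sketch (our own illustration with hypothetical constraint data, not the paper's implementation):

```python
from collections import defaultdict

def find_blocks(constraint_vars):
    """Group constraint indices into blocks: constraints that (transitively)
    share an adaptive variable or uncertain parameter end up together."""
    parent = {}
    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a
    def union(a, b):
        parent[find(a)] = find(b)
    symbol_seen = {}
    for c, symbols in constraint_vars.items():
        find(('c', c))  # register constraint node
        for s in symbols:
            if s in symbol_seen:
                union(('c', c), ('c', symbol_seen[s]))
            symbol_seen[s] = c
    blocks = defaultdict(set)
    for c in constraint_vars:
        blocks[find(('c', c))].add(c)
    return sorted(sorted(v) for v in blocks.values())

# Constraints 1-2 share y2; constraint 3 uses only its own symbols y3, z4, z5:
cons = {1: {'y1', 'y2', 'z1'}, 2: {'y2', 'z2', 'z3'}, 3: {'y3', 'z4', 'z5'}}
print(find_blocks(cons))  # -> [[1, 2], [3]]
```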
In order to prove the claim that there exists an optimal decision rule for each adaptive variable that depends only on the uncertain parameters in its own block, we again first consider feasibility.
Lemma A.2
Let \(P_{\text {block}}\) denote an ARO problem of form (3) with block uncertainty and let \(\varvec{x}\) be ARF to \(P_{\text {block}}\). Then there exists a decision rule \(\varvec{y}(\cdot )\) with \(\varvec{y}_{(v)}(\cdot )\) depending only on \(\varvec{z}_{(v)}\), for all \(v=1,\dotsc ,V\), such that \((\varvec{x},\varvec{y}(\cdot ))\) is ARF to \(P_{\text {block}}\).
Proof
See Appendix B.14. \(\square \)
Corollary A.2
Let \(P_{\text {block}}\) denote an ARO problem of form (3) with block uncertainty. For each \(\varvec{x}\) that is ARO to \(P_{\text {block}}\) there exists a decision rule \(\varvec{y}(\cdot )\) with \(\varvec{y}_{(v)}(\cdot )\) depending only on \(\varvec{z}_{(v)}\), for all \(v=1,\dotsc ,V\), such that the pair \((\varvec{x},\varvec{y}(\cdot ))\) is ARO to \(P_{\text {block}}\).
Proof
Follows from Lemma A.2 analogous to the proof of Corollary A.1. \(\square \)
1.4 (iv) Simplex uncertainty or one uncertain parameter
Bertsimas and Goyal [5] prove optimality of LDRs for right-hand side uncertainty and a simplex uncertainty set. Zhen et al. [30] generalize this to both left- and right-hand side uncertainty; their proof uses FME on the dual problem. We use FME on the primal problem, which leads to a more intuitive proof; the following example illustrates the main idea.
Example A.4
Consider the problem
with standard simplex uncertainty set \(U = \{(z_1,z_2,z_3): z_1+z_2+z_3 \le 1, z_1,z_2,z_3\ge 0\}\). Similar to Example A.2, we first eliminate \(y_1(\varvec{z})\) and then \(y_2(\varvec{z})\). This results in the following bounds on the adaptive variables:
Equivalently, these bounds have to be satisfied for each point in \(\text {ext}(U)\). One can verify that \(x^{*} = \frac{1}{2}\) is an ARO solution. Plugging this into (A.4), we get the following bounds for each extreme point:
Because U is a simplex, the four extreme points are affinely independent. Therefore, there is a unique LDR such that the upper bound on \(y_2(\cdot )\) holds with equality for each extreme point. This is also the case for the lower bound, and any convex combination of both decision rules also satisfies the bounds for \(y_2\) in (28). The LDR corresponding to the upper bounds is \(y_2(z_1,z_3) = \frac{1}{2}(3 + z_1+z_3)\), and plugging this into the bounds on \(y_1\) yields a similar system to (28) for \(y_1\). This guarantees the existence of an LDR for \(y_1\); for the upper bound we find \(y_1(z_1,z_3) = \frac{1}{2}(2+z_1 + z_3)\). Note that this does not generalize to uncertainty sets described by more than \(L+1\) extreme points.
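The affine-independence argument amounts to solving a square linear system: prescribing the value of an affine rule at \(L+1\) affinely independent points determines its coefficients uniquely. A small sketch (the points and values below are hypothetical, chosen so the fitted rule mirrors the form \(\frac{1}{2}(3+z_1+z_2)\)):

```python
import numpy as np

def fit_ldr(extreme_points, values):
    """Fit the unique affine decision rule y(z) = q @ z + c that matches the
    given values at L+1 affinely independent points in R^L."""
    Z = np.asarray(extreme_points, dtype=float)   # shape (L+1, L)
    M = np.hstack([Z, np.ones((Z.shape[0], 1))])  # append a column for c
    coeffs = np.linalg.solve(M, np.asarray(values, dtype=float))
    return coeffs[:-1], coeffs[-1]                # (q, c)

# Standard simplex in R^2: extreme points (0,0), (1,0), (0,1).
pts = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
vals = [1.5, 2.0, 2.0]             # prescribed (hypothetical) bound values
q, c = fit_ldr(pts, vals)          # -> y(z) = 0.5*z1 + 0.5*z2 + 1.5
```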
Similar to the cases for hybrid and block uncertainty, we first prove feasibility for each ARF \(\varvec{x}\), and subsequently prove optimality.
Lemma A.3
Let \(P_{\text {simplex}}\) denote an ARO problem of form (3) with a simplex uncertainty set, i.e., \(U = \text {Conv}(\varvec{z}^1,\dotsc ,\varvec{z}^{L+1})\), with \(\varvec{z}^{j} \in {\mathbb {R}}^{L}\) such that \(\varvec{z}^1,\dotsc ,\varvec{z}^{L+1}\) are affinely independent. Let \(\varvec{x}\) be ARF to \(P_{\text {simplex}}\). Then there exists an LDR \(\varvec{y}(\cdot )\) such that \((\varvec{x},\varvec{y})\) is ARF to \(P_{\text {simplex}}\).
Proof
See Appendix B.15. \(\square \)
Similar to Corollary A.1, we have the following result for ARO decisions.
Corollary A.3
Let \(P_{\text {simplex}}\) denote an ARO problem of form (3) with a simplex uncertainty set, i.e., \(U = \text {Conv}(\varvec{z}^1,\dotsc ,\varvec{z}^{L+1})\), with \(\varvec{z}^{j} \in {\mathbb {R}}^{L}\) such that \(\varvec{z}^1,\dotsc ,\varvec{z}^{L+1}\) are affinely independent. For each \(\varvec{x}\) that is ARO to \(P_{\text {simplex}}\) there exists an LDR \(\varvec{y}(\cdot )\) such that the pair \((\varvec{x},\varvec{y}(\cdot ))\) is ARO to \(P_{\text {simplex}}\).
Proof
Follows from Lemma A.3 analogous to the proof of Corollary A.1. \(\square \)
Because the case with one uncertain parameter is a special case of simplex uncertainty, the results of Lemma A.3 and Corollary A.3 also hold for that case.
The results on PARO in Sect. 4.1 make use of the fact that an ARF decision rule with a particular structure exists for every ARF \(\varvec{x}\), i.e., Lemmas A.1 to A.3.
Technical Lemmas and Proofs
In the proofs, if FME is applied, w.l.o.g. it is applied on the adaptive variables in the order \(y_1(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z})\), i.e., according to their index.
1.1 Bounds on eliminated adaptive variables
Lemma B.1
Let \(\varvec{x}\) be ARF to (3). Let \(\varphi _i(\varvec{x},\varvec{z}) = \varvec{r}_i(\varvec{z}) - \varvec{a}_i(\varvec{z})^{\top }\varvec{x}\) for each constraint \(i=1,\dotsc ,m\) of (3b). Consider the system of inequalities \(\varvec{b}_i^{\top } \varvec{y}(\varvec{z}) \le \varphi _i(\varvec{x},\varvec{z})\), \(i=1,\dotsc ,m\) and use FME to eliminate all variables. For all \(k=1,\dotsc ,n_y\) we can write the bounds after elimination of variable \(y_k(\varvec{z})\) as
for some coefficients \(\alpha \) and \(\beta \) independent of \(\varvec{z}\), and \(C_k^{-},C_k^{+} \subseteq P(\{1,\dotsc ,m\})\), with \(P(\{1,\dotsc ,m\})\) the power set of \(\{1,\dotsc ,m\}\). Additionally, if \(S_k \in C_k^{-}\) for some k, then \(\alpha (S_k,p) <0\) for all \(p\in S_k\). If \(T_k \in C_k^{+}\) for some k, then \(\alpha (T_k,q) >0\) for all \(q\in T_k\).
Proof
Proof by induction.
Base case:
Elimination of variable \(y_1(\varvec{z})\) yields
Define
then each constraint in \(C_1^{-}\) defines a lower bound on \(y_1(\varvec{z})\) and each constraint in \(C_1^{+}\) defines an upper bound on \(y_1(\varvec{z})\). Each element of \(C_1^{-}\) and \(C_1^{+}\) is an individual ‘original’ constraint index and not a set of constraint indices. For all \(S_1 = \{p\} \in C_1^{-}\) set \(\alpha (S_1,p) = b_{p,1}^{-1}\), and for all \(T_1 = \{q\} \in C_1^{+}\) set \(\alpha (T_1,q) = b_{q,1}^{-1}\). Furthermore, set \(\beta (S_1,l) = b_{p,l}b_{p,1}^{-1}\) for all \(S_1 = \{p\} \in C_1^{-} \cup C_1^{+}\) and all \(l=2,\dotsc ,n_y\). With these definitions, (30) is reformulated in form (29). Additionally, by construction, \(\alpha (S_1,p) <0\) if \(p\in S_1\), \(S_1\in C_1^{-}\) and \(\alpha (T_1,q) >0\) if \(q\in T_1\), \(T_1\in C_1^{+}\).
Induction step:
Suppose the result holds for some \(k-1\) (i.e., after elimination of variable \(y_{k-1}(\varvec{z})\)). Variable \(y_{k}(\varvec{z})\) can occur in two types of constraints: (i) original constraints \(i=1,\dotsc ,m\) that do not depend on \(y_1(\varvec{z}),\dotsc ,y_{k-1}(\varvec{z})\), and (ii) the new constraints acquired after elimination of \(y_1(\varvec{z}),\dotsc ,y_{k-1}(\varvec{z})\). For case (i), define
then each constraint in \(I_k^{-}\) defines a lower bound on \(y_k(\varvec{z})\) and each constraint in \(I_k^{+}\) provides an upper bound on \(y_k(\varvec{z})\). Reformulation to form (29) is similar to the case \(k=1\). Thus, \(\alpha (S_k,p) <0\) if \(p\in S_k\), \(S_k\in I_k^{-}\) and \(\alpha (T_k,p) >0\) if \(p\in T_k\), \(T_k\in I_k^{+}\).
For case (ii), \(y_k(\varvec{z})\) can occur in constraints resulting from picking linear lower and upper bounds on \(y_l(\varvec{z})\) from (29). If these bounds are independent of \(y_{l+1}(\varvec{z}),\dotsc ,y_{k-1}(\varvec{z})\), for \(l=1,\dotsc ,k-1\), they are used directly to eliminate \(y_k(\varvec{z})\). For any such pair of constraints \(S_l \in C_l^{-}\) and \(T_l \in C_l^{+}\), FME yields the following bound on \(y_k(\varvec{z})\) (due to the induction assumption):
We proceed by dividing by the coefficient of \(y_k(\varvec{z})\). If \(\beta (S_l,k) > \beta (T_l,k)\), inequality (31) defines a lower bound for \(y_{k}(\varvec{z})\); if \(\beta (S_l,k) < \beta (T_l,k)\), inequality (31) defines an upper bound for \(y_{k}(\varvec{z})\). Define
so each element \(S_k\) in \(J_k^{-}\) (or \(T_k\) in \(J_k^{+}\)) is a union of the indices of a lower bound constraint (set \(S_l\)) and an upper bound constraint (set \(T_l\)) on \(y_l(\varvec{z})\). The condition \(\beta (S_l,j) = \beta (T_l,j),~\forall j<l\) on the second line ensures that these lower and upper bound constraints on \(y_l(\varvec{z})\) do not specify a constraint on \(y_{l+1}(\varvec{z}),\dotsc ,y_{k-1}(\varvec{z})\).
Set the coefficients for the not yet eliminated variables \(y_{k+1}(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z})\) for form (29) as
If \(S_k \in J_k^{-}\), with \(S_k = S_l \cup T_l\) for some \(S_l \in C_l^{-}\) and \(T_l \in C_l^{+}\), \(l=1,\dotsc ,k-1\), then set
Similarly, if \(T_k \in J_k^{+}\), with \(T_k = S_l \cup T_l\) for some \(S_l \in C_l^{-}\) and \(T_l \in C_l^{+}\) for some \(l=1,\dotsc ,k-1\), then set
Due to the induction hypothesis, \(\alpha (S_l,p)<0\) if \(S_l \in C_l^{-}\) and \(\alpha (T_l,p) >0\) if \(T_l \in C_l^{+}\) for \(l<k\). The denominator in both lines of (32) is positive, so in that case \(\alpha (S_k, p)<0\). The denominator in both lines of (33) is negative, so in that case \(\alpha (T_k, p)>0\). With the new coefficients chosen as above, (31) provides a lower or upper bound on \(y_{k}(\varvec{z})\) of the form inside the maximum or minimum operator in (29), respectively.
Finally, define \(C_k^{-} = I_k^{-} \cup J_k^{-}\) and \(C_k^{+} = I_k^{+} \cup J_k^{+}\). Each constraint in \(C_k^{-}\) defines a lower bound on \(y_k(\varvec{z})\) and each constraint in \(C_k^{+}\) defines an upper bound on \(y_k(\varvec{z})\). Moreover, set \(C_k = C_k^{-} \cup C_k^{+}\) contains all constraints after elimination of \(y_1(\varvec{z}),\dotsc ,y_{k-1}(\varvec{z})\) that have \(y_k(\varvec{z})\) as lowest indexed adaptive variable. This completes the induction step. \(\square \)
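The bookkeeping at the heart of Lemma B.1 — each derived bound carries the index set of original constraints from which it was built — can be made concrete by extending a Fourier–Motzkin step with provenance sets. A minimal sketch with hypothetical data (our own illustration, not the paper's implementation):

```python
import numpy as np

def fme_with_provenance(rows, j):
    """Eliminate variable j from constraints a @ v <= rhs while tracking,
    for each derived constraint, the set of original constraint indices it
    was built from (the index sets of Lemma B.1).
    Each row is a tuple (coefficients, rhs, index_set)."""
    pos = [r for r in rows if r[0][j] > 0]   # upper bounds on v_j
    neg = [r for r in rows if r[0][j] < 0]   # lower bounds on v_j
    zero = [r for r in rows if r[0][j] == 0]
    out = list(zero)
    for (ap, bp, sp) in pos:
        for (an, bn, sn) in neg:
            a = ap / ap[j] - an / an[j]      # coefficient of v_j cancels
            out.append((a, bp / ap[j] - bn / an[j], sp | sn))
    return out

# Variables v = (x, y1, y2); constraints indexed 1..3 (hypothetical data):
rows = [(np.array([1.0, 1.0, 0.0]), 1.0, {1}),     # x + y1 <= 1
        (np.array([0.0, -1.0, 1.0]), 0.0, {2}),    # y2 <= y1
        (np.array([-1.0, 0.0, -1.0]), 0.0, {3})]   # -x - y2 <= 0
after_y1 = fme_with_provenance(rows, 1)
# Combining constraints 1 and 2 yields a bound indexed by the union {1, 2}.
```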
1.2 Proof Lemma 1
Consider problem (3), with the objective moved to the constraints using epigraph variable \(t\in {\mathbb {R}}\):
Eliminate all adaptive variables in (34b)–(34c) via FME. Let \(\varphi _0(\varvec{x},t,\varvec{z}) = t - \varvec{c}(\varvec{z})^{\top } \varvec{x}\). In notation of Lemma B.1, FME is performed on
where the coefficient for t is zero in \(\varphi _i\), \(i=1,\dotsc ,m\). According to Lemma B.1, after elimination of variable k, inequalities (29) hold. Suppose for some \(S_k \in C_{k}^{-}\), \(T_k \in C_{k}^{+}\) the upper and lower bounds on \(y_k(\varvec{z})\) do not depend on \(y_{k+1}(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z})\). Then the following constraint is derived for the static robust optimization problem after completing the full FME procedure:
where \(\varphi _p(\cdot )\) is a function of t only if \(p=0\). Constraints of the original system (34c) that are independent of adaptive variables can also be represented in form (36). Original constraints (34b) are part of a particular constraint in form (36) if and only if \(0 \in S_k \cup T_k\) for some \(S_k \in C_{k}^{-}\), \(T_k \in C_{k}^{+}\), \(k=1,\dotsc ,n_y\). Thus, problem (34) after FME can be written as
with
In other words, we have separated the constraints depending on t from the constraints not depending on t. From Lemma B.1 one can see that (37c) is the result of performing FME on the set of constraints (35b), which are the constraints defining set \({\mathcal {X}}\). Thus, (37c) describes set \({\mathcal {X}}_{\text {FME}}\). Furthermore, if we define \(\alpha (S,0) = 0\) if \( 0\notin S\) and \(\gamma (T,0) = 0\) if \(0 \notin T\), constraint (37b) can be rewritten as
because \(\gamma (T,0) > \alpha (S,0)\) according to Lemma B.1. Note that the coefficient for t is zero for all functions \(\varphi \) on the RHS. Thus, for fixed \(\varvec{z} \in U\), constraint (39) defines a lower bound on epigraph variable t that is convex PWL in \(\varvec{x}\). Subsequently, we eliminate t and define
This yields the following problem equivalent to (37):
If \((\varvec{x}^{*},t^{*},\varvec{y}^{*}(\cdot ))\) is optimal to (34), \(\varvec{x}^{*}\) is optimal to (41) with equal objective value. This implies that \(\varvec{y}^{*}(\cdot )\) satisfies
Conversely, if \(\varvec{x}^{*}\) is optimal to (41), there exists a \((t^{*},\varvec{y}^{*}(\cdot ))\) such that \((\varvec{x}^{*},t^{*},\varvec{y}^{*}(\cdot ))\) is optimal to (34) with equal objective value. This implies that any such \(\varvec{y}^{*}(\cdot )\) satisfies (42). Lastly, note that \(\varvec{x}^{*}\) is optimal to (3) if and only if there exists a \(t^{*}\in {\mathbb {R}}\) such that \((t^{*},\varvec{x}^{*})\) is optimal to (34). This completes the proof.
1.3 Proof Lemma 2
By Definition 7(i) a solution \(\varvec{x}^{*}\) is PARO to (3) if and only if
-
There exists a \(\varvec{y}^{*} \in {\mathcal {R}}^{L,n_y}\) such that \((\varvec{x}^{*}, \varvec{y}^{*}(\cdot ))\) is ARO to (3) and there does not exist a pair \((\varvec{{\bar{x}}},\varvec{{\bar{y}}}(\cdot ))\) that is ARO to (3) and the following conditions hold:
$$\begin{aligned} \begin{aligned} \varvec{c}(\varvec{z})^{\top }\varvec{{\bar{x}}} + \varvec{d}^{\top }\varvec{{\bar{y}}}(\varvec{z})&\le \varvec{c}(\varvec{z})^{\top }\varvec{x^{*}} + \varvec{d}^{\top }\varvec{y}^{*}(\varvec{z}),~~ \forall \varvec{z} \in U, \\ \varvec{c}(\varvec{{\bar{z}}})^{\top }\varvec{{\bar{x}}} + \varvec{d}^{\top }\varvec{{\bar{y}}}(\varvec{{\bar{z}}})&< \varvec{c}(\varvec{{\bar{z}}})^{\top }\varvec{x^{*}} + \varvec{d}^{\top }\varvec{y}^{*}(\varvec{{\bar{z}}}),~~ \text {for some } \varvec{{\bar{z}}} \in U. \end{aligned} \end{aligned}$$(43)
By Lemma 1, this holds if and only if
-
\(\varvec{x}^{*}\) is optimal to (8) and there exists a \(\varvec{y}^{*} \in {\mathcal {R}}^{L,n_y}\) such that
$$\begin{aligned} \varvec{d}^{\top }\varvec{y}^{*}(\varvec{z}) = \max _{(S,T)\in M} \{h_{S,T}(\varvec{x}^{*},\varvec{z}) \}~~~\forall \varvec{z}\in U, \end{aligned}$$(44)and there does not exist a pair \((\varvec{{\bar{x}}},\varvec{{\bar{y}}}(\cdot ))\) such that \(\varvec{{\bar{x}}}\) is optimal to (8), \((\varvec{{\bar{x}}},\varvec{{\bar{y}}}(\cdot ))\) satisfies (44), and (43) holds.
Substituting (44) in (43) yields the following set of equivalent conditions:
-
\(\varvec{x}^{*}\) is optimal to (8) and there does not exist another \(\varvec{{\bar{x}}}\) optimal to (8) such that
$$\begin{aligned} \varvec{c}(\varvec{z})^{\top }\varvec{{\bar{x}}} + \max _{(S,T)\in M} \{h_{S,T}(\varvec{{\bar{x}}},\varvec{z}) \}&\le \varvec{c}(\varvec{z})^{\top }\varvec{x^{*}} + \max _{(S,T)\in M} \{h_{S,T}(\varvec{x^{*}},\varvec{z}) \},~~ \forall \varvec{z} \in U, \\ \varvec{c}(\varvec{{\bar{z}}})^{\top }\varvec{{\bar{x}}} + \max _{(S,T)\in M} \{h_{S,T}(\varvec{{\bar{x}}},\varvec{{\bar{z}}}) \}&< \varvec{c}(\varvec{{\bar{z}}})^{\top }\varvec{x^{*}} +\max _{(S,T)\in M} \{h_{S,T}(\varvec{x^{*}},\varvec{{\bar{z}}}) \},~~ \text {for some } \varvec{{\bar{z}}} \in U. \end{aligned}$$
This statement holds if and only if \(\varvec{x}^{*}\) is PRO to (8), by Definition 1.
1.4 Proof Theorem 1
First, we prove the existence of PRO solutions to a general class of static RO problems, with bounded feasible region \({\mathcal {X}}\).
Lemma B.2
Let \(f: {\mathbb {R}}^n \times {\mathbb {R}}^L \mapsto {\mathbb {R}}\), with \(f(\varvec{x},\varvec{z})\) continuous in \(\varvec{z}\). Consider the static RO problem
Let \(U \subseteq {\mathbb {R}}^L\) be closed and convex with a nonempty relative interior. If (i) \({\mathcal {X}}\) is compact and \(f(\varvec{x},\varvec{z})\) is continuous in \(\varvec{x}\), or (ii) \({\mathcal {X}}\) is a finite set, and additionally an RO solution to (45) exists, then a PRO solution to (45) also exists.
Proof of Lemma B.2
Let \(({\mathbb {R}}^L,{\mathcal {B}}({\mathbb {R}}^L))\) be a measurable space, with \({\mathcal {B}}({\mathbb {R}}^L)\) the Borel \(\sigma \)-algebra. For fixed \(\varvec{x}\), function \(f(\varvec{x},\varvec{z})\) is continuous in \(\varvec{z}\), so it is measurable on closed subsets of \({\mathbb {R}}^L\), in particular set U. Define function \(g: {\mathbb {R}}^n \mapsto {\mathbb {R}}\) with
where P denotes a strictly positive probability measure on \({\mathbb {R}}^L\), such as the Gaussian measure. Because \(0\le P(U) \le P({\mathbb {R}}^L)=1\), the Lebesgue integral (46) assumes finite values for any \(\varvec{x}\). Hence, \(f(\varvec{x},\varvec{z})\) is Lebesgue-integrable in its second argument on measured space \(({\mathbb {R}}^L,{\mathcal {B}}({\mathbb {R}}^L),P)\) for any \(\varvec{x}\) and g is well-defined.
We proceed by showing that an optimal solution to the following optimization problem is PRO to (45):
The remainder of the proof consists of two parts. First, we show that an optimal solution to (47) is always attained. Subsequently, we show that such an optimal solution is PRO to (45).
Part 1 (The optimum is attained)
We treat the two cases for \({\mathcal {X}}\) separately.
Case (i): Set \({\mathcal {X}}\) is compact and \(f(\varvec{x},\varvec{z})\) continuous in \(\varvec{x}\). We show that g is continuous. Consider a sequence \(\{\varvec{x}_n\}_{n\in {\mathbb {N}}}\) converging to \(\varvec{x}\). By continuity of f in \(\varvec{x}\), \(\lim _{n \rightarrow \infty } f(\varvec{x}_n, \varvec{z}) = f(\varvec{x},\varvec{z})\). Thus,
Let \(M>0\) be such that \(|f(\varvec{x}_n,\varvec{z})| < M\) for all \(n\in {\mathbb {N}}\) and \(\varvec{z}\in {\mathbb {R}}^L\), and define \(h:{\mathbb {R}}^L \mapsto {\mathbb {R}}\) with \(h(\varvec{z}) =M\) for all \(\varvec{z}\). Then h is Lebesgue-integrable, and we can apply the dominated convergence theorem to switch the order of the limit and integration in (48) to obtain
Hence, \(g(\varvec{x})\) is continuous for each \(\varvec{x} \in {\mathbb {R}}^n\). Let \({\mathcal {X}}^{\text {RO}}\) denote the set of robustly (worst-case) optimal solutions to (45). Then \({\mathcal {X}}^{\text {RO}}\) is compact if \({\mathcal {X}}\) is compact. Problem (47) minimizes a continuous function over a compact domain, so, by the extreme value theorem, a minimum is always attained.
Case (ii): Set \({\mathcal {X}}\) is a finite set. Problem (47) minimizes \(g(\varvec{x})\) over a finite set, so the minimum is attained.
Part 2 (An optimal solution is PRO)
Let \(\varvec{{\hat{x}}}\) denote an optimal solution to (47). We show by contradiction that \(\varvec{{\hat{x}}}\) is PRO to (45). Suppose \(\varvec{{\hat{x}}}\) is not PRO to (45). Then there exists an \(\varvec{{\bar{x}}} \in {\mathcal {X}}^{\text {RO}}\) such that
We proceed by showing that there must exist a ball contained in U with strictly positive measure where strict inequality holds. Let \({\bar{B}}\) denote the ball with radius \(\delta \) centered at \(\varvec{{\bar{z}}}\):
By continuity of \(f(\varvec{{\bar{x}}},\varvec{z}) - f(\varvec{{\hat{x}}},\varvec{z})\) w.r.t. \(\varvec{z}\), there exists a \(\delta >0\) such that for each \(\varvec{z} \in {\bar{B}}\) it holds that \(f(\varvec{{\bar{x}}},\varvec{z}) - f(\varvec{{\hat{x}}},\varvec{z}) < 0\). Note that \(\varvec{{\bar{z}}}\) need not be in the relative interior of U. Hence, the ball \({\bar{B}}\) need not be contained in U. Let \(\varvec{{\tilde{z}}} \in \text {ri}(U)\). We construct a new scenario \(\varvec{z}^{*} = \theta \varvec{{\tilde{z}}} + (1-\theta ) \varvec{{\bar{z}}}\). Because U is convex, \(\varvec{z}^{*} \in \text {ri}(U)\) if \(0 \le \theta <1\) according to Rockafellar [25, Theorem 6.1]. Choosing \(1 - \delta \Vert \varvec{{\tilde{z}}} - \varvec{{\bar{z}}}\Vert _2^{-1}< \theta < 1\) ensures that \(\varvec{z}^{*} \in \text {int}({\bar{B}}) \cap \text {ri}(U) = \text {ri}(U \cap {\bar{B}})\). Consider the ball \(B^{*}\) with radius \(\epsilon >0\) centered at \(\varvec{z}^{*}\):
For sufficiently small \(\epsilon > 0\), it holds that \(\varvec{z} \in B^{*} \Rightarrow \varvec{z} \in U \cap {\bar{B}}\). In other words, for such an \(\epsilon \), each point \(\varvec{z} \in B^{*}\) is in the uncertainty set U and is such that \(f(\varvec{{\bar{x}}},\varvec{z}) < f(\varvec{{\hat{x}}},\varvec{z})\).
Finally, we consider the difference between \(g(\varvec{{\bar{x}}})\) and \(g(\varvec{{\hat{x}}})\) on U. Note that \(|g(\varvec{x})|<\infty \) for all \(\varvec{x}\). The following holds:
The first integral is nonpositive since \(f(\varvec{{\bar{x}}},\varvec{z}) \le f(\varvec{{\hat{x}}},\varvec{z})\) for each \(\varvec{z} \in U\backslash B^{*}\). The second integral is strictly negative since \(f(\varvec{{\bar{x}}},\varvec{z}) < f(\varvec{{\hat{x}}},\varvec{z})\) for \(\varvec{z}\in B^{*}\) and measure P is strictly positive, i.e., \(P(B^{*}) >0\). Hence, \(g(\varvec{{\bar{x}}}) < g(\varvec{{\hat{x}}})\), contradicting the fact that \(\varvec{{\hat{x}}}\) is optimal to (47). \(\square \)
The result of Theorem 1 immediately follows.
Proof of Theorem 1
By Lemma 2, it suffices to prove existence of a PRO solution to (8). Because \({\mathcal {X}}\) is compact and \({\mathcal {X}}={\mathcal {X}}_{\text {FME}}\), set \({\mathcal {X}}_{\text {FME}}\) is compact. By construction of (3), uncertainty set U is assumed to be convex and compact with a nonempty relative interior. Lastly, the objective function of (8) is continuous in \(\varvec{x}\) and \(\varvec{z}\). Hence, all conditions of Lemma B.2 are satisfied, so a PRO solution to (8) exists, and by Lemma 2 a PARO solution to (3) exists. \(\square \)
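When the set of robustly optimal solutions is finite (case (ii) of Lemma B.2), the proof suggests a direct selection recipe: approximate the integral \(g\) by a sample average under any strictly positive measure and keep the minimizer. A minimal sketch, assuming a finite candidate set and using a deterministic grid in place of random sampling (all names are our illustrative choices):

```python
def pro_pick(candidates, f, samples):
    """Among worst-case-optimal candidates, return the one with the smallest
    sample average of f over the uncertainty set -- a discretized proxy for
    minimizing the integral g in (46) under a strictly positive measure."""
    def avg(x):
        return sum(f(x, z) for z in samples) / len(samples)
    return min(candidates, key=avg)
```

For example, with \(f(\varvec{x},z) = x_1 z + x_2(1-z)\) on \(U=[0,1]\), the candidates \((1,0)\) and \((1,1)\) share the worst-case value 1, but \((1,0)\) dominates \((1,1)\) and is the one selected.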
1.5 Proof Lemma 3 via FME
Let \(\varvec{x}\) be ARF to (3). W.l.o.g., suppose in the FME procedure the adaptive variables are eliminated in the order \(y_1,\dotsc ,y_{n_y}\), i.e., according to their index. Let \(F_k(y_{k+1}(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z}),\varvec{z})\) denote the optimal decision rule for \(y_k\) as a function of the decision rules for the adaptive variables with higher index and the uncertain parameter \(\varvec{z}\). We prove by induction on \(k=1,\dotsc ,n_y\) that \(F_k(y_{k+1}(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z}),\varvec{z})\) is jointly PWL in \(y_{k+1},\dotsc ,y_{n_y}\) and \(\varvec{z}\).
According to Lemma B.1, we can write the bounds after elimination of variable \(y_1(\varvec{z})\) as
for some coefficients \(\alpha \) and \(\beta \) independent of \(\varvec{z}\). For fixed \(y_2,\dotsc ,y_{n_y}\), \(\varvec{z}\) and \(\varvec{x}\), the highest possible contribution of \(y_1\) to the objective value is achieved by setting \(y_1\) equal to its upper bound if \(d_1 <0\), and equal to its lower bound if \(d_1 >0\). Thus, \(F_1(y_2(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z}),\varvec{z})\) is equal to either the upper or the lower bound on \(y_1\). Both the upper and lower bound are jointly PWL in \(y_i\), \(i=2,\dotsc ,n_y\) and \(\varvec{z}\).
Now, suppose that for each \(i=1,\dotsc ,k-1\), after elimination of variable \(y_{i}(\varvec{z})\) the optimal decision rule \(F_{i}(y_{i+1}(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z}),\varvec{z})\) is jointly PWL in \(y_{i+1},\dotsc ,y_{n_y}\) and \(\varvec{z}\).
After elimination of \(y_k(\varvec{z})\) we can again write the bounds according to Lemma B.1. For fixed \(y_{k+1},\dotsc ,y_{n_y}\), \(\varvec{z}\) and \(\varvec{x}\), the highest possible contribution of \(y_k\) to the objective value is achieved by minimizing \(\varvec{d}^{\top }\varvec{y}\), i.e., solving
where the last term in the objective (the last summation) may be dropped because it does not depend on \(y_k\). In the objective each decision rule \(F_i\), \(i=1,\dotsc ,k-1\), is a function of the decision rules \(F_{i+1},\dotsc , F_{k-1}\), variables \(y_k(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z})\) and \(\varvec{z}\). Substituting a PWL argument into a PWL function preserves the piecewise linear structure. Thus, (49) amounts to minimizing a univariate PWL function on a closed interval. The optimum is attained at either an interior point or a boundary point; we consider these cases separately.
-
Problem (49) has a boundary minimum. The minimum is attained at either the lower or upper bounds provided by (49b) and (49c). In this case, \(F_k(y_{k+1}(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z}),\varvec{z})\) is clearly jointly PWL in \(y_{k+1}(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z})\) and \(\varvec{z}\).
-
Problem (49) has an interior minimum. The unrestricted minimum of (49a) is at the intersection of two functions that are jointly linear in \(y_{k},\dotsc ,y_{n_y}\) and \(\varvec{z}\). Any intersection point can be expressed as
$$\begin{aligned} s_0(\varvec{z}) + \sum _{i=k}^{n_y} s_i y_i(\varvec{z}) = t_0(\varvec{z}) + \sum _{i=k}^{n_y} t_i y_i(\varvec{z}), \end{aligned}$$for some scalars \(s_0(\varvec{z})\) and \(t_0(\varvec{z})\) depending linearly on \(\varvec{z}\) and some vectors \(\varvec{s},\varvec{t} \in {\mathbb {R}}^{n_y-k+1}\). This is equivalent to
$$\begin{aligned} y_k(\varvec{z}) = \frac{s_0(\varvec{z}) - t_0(\varvec{z}) + \sum _{i=k+1}^{n_y}(s_i-t_i) y_i(\varvec{z}) }{t_k - s_k}, \end{aligned}$$and this is jointly linear in \(y_{k+1},\dotsc ,y_{n_y}\) and \(\varvec{z}\). The pair \(\{(s_0(\varvec{z}),\varvec{s}),(t_0(\varvec{z}),\varvec{t})\}\) that defines the interior minimum intersection point depends on \(y_{k},\dotsc ,y_{n_y}\) and \(\varvec{z}\). Thus, the optimal decision rule \(F_k(y_{k+1}(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z}),\varvec{z})\) is a PWL function of \(y_{k+1},\dotsc ,y_{n_y}\) and \(\varvec{z}\).
This completes the induction step. Lastly, note that \(F_{n_y}(\varvec{z})\) is PWL in \(\varvec{z}\) and that substituting a PWL argument into a PWL function preserves the piecewise linear structure. Thus, going from \(k=n_y\) down to \(k=1\) and at each step substituting \(F_{k}(y_{k+1}(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z}),\varvec{z})\) into \(F_{k-1}(y_{k}(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z}),\varvec{z})\) yields decision rules that are PWL in \(\varvec{z}\) for all variables \(y_1,\dotsc ,y_{n_y}\).
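The boundary/interior case split can be mirrored computationally: for a univariate PWL function, the candidate minimizers are the interval endpoints and the pairwise intersection points of the affine pieces. A sketch for the convex max-of-affine special case of this structure (a simplification: the objective in (49) need not be convex; names are ours):

```python
def minimize_pwl(pieces, lo, hi):
    """Minimize a univariate piecewise-linear function on [lo, hi].
    `pieces` is a list of (slope, intercept) affine pieces and the function
    is their pointwise maximum. Candidate minimizers are the two endpoints
    (boundary case) and all pairwise intersections of the affine pieces
    that fall inside [lo, hi] (interior case)."""
    cands = [lo, hi]
    for i in range(len(pieces)):
        for j in range(i + 1, len(pieces)):
            (s1, t1), (s2, t2) = pieces[i], pieces[j]
            if s1 != s2:
                y = (t2 - t1) / (s1 - s2)  # intersection of the two pieces
                if lo <= y <= hi:
                    cands.append(y)
    val = lambda y: max(s * y + t for s, t in pieces)
    best = min(cands, key=val)
    return best, val(best)
```

For \(\max (-y,\ y-2)\) on \([0,4]\), the interior intersection \(y=1\) attains the minimum value \(-1\).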
1.6 Proof Lemma 3 via linear optimization
Let \(\varvec{x}\) be ARF to (3). We make use of the concept of basic solutions in linear optimization [6]. In standard form, the remaining problem in \(\varvec{y}\) for fixed \(\varvec{z}\) reads:
where \(\varvec{s}\) is a slack variable and \(\varvec{y}\) is represented by the difference of two nonnegative variables. Let \(\varvec{v} \in {\mathbb {R}}^{2n_y+m}\), \(\varvec{M} \in {\mathbb {R}}^{m\times (2n_y+m)}\) and \(\varvec{f} \in {\mathbb {R}}^{2n_y+m}\) denote the vector of decision variables, the equality constraint matrix and the objective vector of (50), respectively:
Each basis is represented by m linearly independent columns of \(\varvec{M}\). Let \(\varvec{W} \in {\mathbb {R}}^{m\times m}\) denote a basis matrix, and let \(\varvec{v}_{\varvec{W}}\) and \(\varvec{f}_{\varvec{W}}\) denote the components of \(\varvec{v}\) and \(\varvec{f}\) corresponding to the basic variables. For any basic solution \(\varvec{v}\) it holds that
and the remaining non-basic components of \(\varvec{v}\) are equal to zero. Denote the basic solution by \((\varvec{v}_{\varvec{W}},\varvec{0}_{\backslash \varvec{W}})\); it is a basic feasible solution (BFS) to (50) if and only if \(\varvec{v}_{\varvec{W}} \ge \varvec{0}\). For optimality of \((\varvec{v}_{\varvec{W}},\varvec{0}_{\backslash \varvec{W}})\) it is additionally required that the reduced costs are nonnegative. Nonnegativity of the reduced costs (i.e., optimality of \((\varvec{v}_{\varvec{W}},\varvec{0}_{\backslash \varvec{W}})\)) reads
We restrict ourselves to those basic solutions for which optimality condition (53) holds; note that this condition is independent of \(\varvec{z}\). It follows that for each basis matrix \(\varvec{W}\) that satisfies (53), its associated basic solution \((\varvec{v}_{\varvec{W}},\varvec{0}_{\backslash \varvec{W}})\) is feasible (and optimal) if and only if \(\varvec{z}\) is in the following subset of U:
Let \(\varvec{y}(\varvec{x},\varvec{z},\varvec{W})\) denote the basic solution corresponding to \(\varvec{W}\) in terms of the original variables \(\varvec{y}\). From (52) it follows that \(\varvec{y}(\varvec{x},\varvec{z},\varvec{W})\) is linear in \(\varvec{z}\).
Any basic solution to (50) corresponds with at least one basis, and each basis is represented by m linearly independent columns of \(\varvec{M}\). Thus, there are at most \(\beta =\left( {\begin{array}{c}2n_y+m\\ m\end{array}}\right) \) bases (i.e., matrices \(\varvec{W}\)) to (50) that satisfy (53), independent of \(\varvec{z}\). Number the matrices \(\varvec{W}_1,\dotsc ,\varvec{W}_{\beta }\). Each of these matrices \(\varvec{W}_j\) has its own LDR \(\varvec{y}(\varvec{x},\varvec{z},\varvec{W}_j)\) that is optimal for all \(\varvec{z} \in U_{\varvec{W}_j}(\varvec{x})\).
Because \(\varvec{x}\) is ARF to (3) and (3) has a finite optimal objective value, problem (50) is feasible and has a finite optimum for all \(\varvec{z} \in U\). Therefore, there exists an optimal basic feasible solution for all \(\varvec{z} \in U\), and the union of all \(U_{\varvec{W}_j}(\varvec{x})\) equals U itself. This implies that, for the given \(\varvec{x}\), the following PWL decision rule is optimal for each \(\varvec{z} \in U\):
Note that a different numbering of the matrices gives a (possibly) different optimal PWL decision rule. In essence, the proof performs sensitivity analysis on the right-hand side vector of (50), the only part of (50) that depends on \(\varvec{z}\).
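The sensitivity-analysis interpretation admits a quick numerical check: for a fixed basis matrix \(\varvec{W}\) and an affine right-hand side, the basic solution (52) is affine in \(\varvec{z}\). A sketch, where the parameterization \(\varvec{b}(\varvec{z}) = \varvec{b}_0 + \varvec{B}_z \varvec{z}\) is our illustrative stand-in for the right-hand side of (50):

```python
import numpy as np

def basic_solution(W, b0, Bz, z):
    """Basic variables for right-hand side b(z) = b0 + Bz @ z under a fixed
    basis matrix W (cf. (52)). Because solving a linear system is linear in
    the right-hand side, the map z -> v_W(z) is affine in z."""
    return np.linalg.solve(W, b0 + Bz @ z)
```

Affinity means the basic solution at a convex combination of scenarios equals the same convex combination of the scenario-wise basic solutions, which is exactly why each basis induces a linear decision rule \(\varvec{y}(\varvec{x},\varvec{z},\varvec{W})\) on \(U_{\varvec{W}}(\varvec{x})\).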
1.7 Proof Theorem 3
Let \(\text {OPT}\) denote the optimal (worst-case) objective value of P. By Definition 7(i), and using that \(\varvec{d} = \varvec{0}\), a solution \(\varvec{x}^{*}\) is PARO to P if and only if the following statement holds:
-
There exists a \(\varvec{y}^{*} \in {\mathcal {R}}^{L,n_y}\) such that \((\varvec{x}^{*}, \varvec{y}^{*}(\cdot ))\) is ARO to P and there does not exist a pair \((\varvec{{\bar{x}}},\varvec{{\bar{y}}}(\cdot ))\) that is ARO to P and
$$\begin{aligned} \begin{aligned} \varvec{c}(\varvec{z})^{\top }\varvec{{\bar{x}}}&\le \varvec{c}(\varvec{z})^{\top }\varvec{x^{*}}, ~~ \forall \varvec{z} \in U,\\ \varvec{c}(\varvec{{\bar{z}}})^{\top }\varvec{{\bar{x}}}&< \varvec{c}(\varvec{{\bar{z}}})^{\top }\varvec{x^{*}}, ~~ \text {for some } \varvec{{\bar{z}}} \in U. \end{aligned} \end{aligned}$$(54)
By definition of set \({\mathcal {X}}\), this holds if and only if
-
\(\varvec{x}^{*} \in {\mathcal {X}}\), \(\text {OPT} = \max _{\varvec{z} \in U} \varvec{c}(\varvec{z})^{\top }\varvec{x}^{*}\) and there does not exist an \(\varvec{{\bar{x}}} \in {\mathcal {X}}\) such that \(\text {OPT} = \max _{\varvec{z} \in U} \varvec{c}(\varvec{z})^{\top }\varvec{{\bar{x}}}\) and (54) holds.
Because for any ARF \(\varvec{x}\) there exists an ARF decision rule \(\varvec{y}(\cdot )\) such that \(\varvec{y}(\varvec{z}) = f_{\varvec{w}}(\varvec{z})\) for some \(\varvec{w}\), it follows that \({\mathcal {X}}\) is equal to
which is the set of feasible \(\varvec{x}\) when Stage-2 decision rules are restricted to be of form \(f_{\varvec{w}}(\varvec{z})\). Hence, the previous set of conditions holds if and only if
-
\(\varvec{x}^{*} \in {\mathcal {X}}_f\), \(\text {OPT} = \max _{\varvec{z} \in U} \varvec{c}(\varvec{z})^{\top }\varvec{x}^{*}\) and there does not exist an \(\varvec{{\bar{x}}} \in {\mathcal {X}}_f\) such that \(\text {OPT} = \max _{\varvec{z} \in U} \varvec{c}(\varvec{z})^{\top }\varvec{{\bar{x}}}\) and (54) holds.
Parameters \(\varvec{w}\) are now Stage-1 decision variables, so \({\mathcal {X}}_f\) does not contain adaptive variables. The set of conditions describes a PRO solution to the static robust optimization problem obtained after plugging in decision rule structure \(f_{\varvec{w}}(\cdot )\).
1.8 Proof Corollary 2
Corollary 2(i) For any vector of parameters \(\varvec{w} \in {\mathbb {R}}^p\), let \(f_{\varvec{w}}(\varvec{{\hat{z}}})\) denote a decision rule that depends only on \(\varvec{{\hat{z}}} \in {\hat{U}}\), the non-constraintwise component of uncertain parameter \(\varvec{z}\). From Lemma A.1 it follows that \({\mathcal {X}}\) is equal to
i.e., the feasible region for \(\varvec{x}\) remains unchanged if all adaptive variables are restricted to depend only on the non-constraintwise component of \(\varvec{z}\). Hence, setting \({\mathcal {X}}_f = {\mathcal {X}}_{\text {hybrid}}\) in the proof of Theorem 3 yields the result.
Corollary 2(ii) For each block \(v=1,\dotsc ,V\), let \(\varvec{w}(v) \in {\mathbb {R}}^{p(v)}\) denote a vector of parameters and let \(f^{v}_{\varvec{w}(v)}(\varvec{z}_{(v)})\) denote a decision rule that depends only on \(\varvec{z}_{(v)}\), the uncertain parameters in block v. From Lemma A.2 it follows that \({\mathcal {X}}\) is equal to
i.e., the feasible region for \(\varvec{x}\) remains unchanged if all adaptive variables are restricted to depend only on uncertain parameters in their own block. Hence, setting \({\mathcal {X}}_f = {\mathcal {X}}_{\text {block}}\) in the proof of Theorem 3 yields the result.
Corollary 2(iii) From Lemma A.3 it follows that for simplex uncertainty \({\mathcal {X}}\) is equal to
i.e., the feasible region for \(\varvec{x}\) remains unchanged if all adaptive variables are restricted to depend affinely on \(\varvec{z}\). Hence, setting \({\mathcal {X}}_f = {\mathcal {X}}_{\text {simplex}}\) in the proof of Theorem 3 yields the result.
1.9 Proof Lemma 4
The two cases are considered separately.
-
Optimal objective value is zero: Proof by contradiction. Suppose \(\varvec{y}^{*}(\cdot )\) is not a PARO extension of \(\varvec{x}^{*}\). Then, by Definition 8, there exists a \(\varvec{{\tilde{y}}}(\cdot )\) such that \((\varvec{x}^{*},\varvec{{\tilde{y}}}(\cdot ))\) is ARO to (3) and for some \(\varvec{{\tilde{z}}} \in U\) it holds that
$$\begin{aligned} \varvec{c}(\varvec{{\tilde{z}}})^{\top } \varvec{x}^{*} + \varvec{d}^{\top } \varvec{y}^{*}(\varvec{{\tilde{z}}}) > \varvec{c}(\varvec{{\tilde{z}}})^{\top } \varvec{x}^{*} + \varvec{d}^{\top } \varvec{{\tilde{y}}}(\varvec{{\tilde{z}}}). \end{aligned}$$However, then \((\varvec{z},\varvec{y}) = (\varvec{{\tilde{z}}}, \varvec{{\tilde{y}}}(\varvec{{\tilde{z}}}))\) is feasible to (12) with positive objective value. This is a contradiction.
-
Optimal objective value is positive: Let \((\varvec{{\bar{z}}},\varvec{{\bar{y}}})\) denote the optimal solution to (12) and let \({\bar{v}}\) denote the optimal objective value. The decision rule
$$\begin{aligned} \varvec{y}(\varvec{z}) = {\left\{ \begin{array}{ll} \varvec{y}^{*}(\varvec{z}) &{} \text { if } \varvec{z} \ne \varvec{{\bar{z}}} \\ \varvec{{\bar{y}}} &{} \text { otherwise,} \end{array}\right. } \end{aligned}$$dominates the decision rule \(\varvec{y}^{*}(\cdot )\), so the latter is not PARO. We prove the last part of the lemma by contradiction. Suppose there exists a scenario \(\varvec{{\tilde{z}}}\) and a decision \(\varvec{{\tilde{y}}}\) such that
$$\begin{aligned} \Big (\varvec{c}(\varvec{{\tilde{z}}})^{\top } \varvec{x}^{*} + \varvec{d}^{\top } \varvec{y}^{*}(\varvec{{\tilde{z}}})\Big ) - \Big ( \varvec{c}(\varvec{{\tilde{z}}})^{\top } \varvec{x^{*}} + \varvec{d}^{\top }\varvec{{\tilde{y}}} \Big )&> {\bar{v}}, \\ \varvec{A}(\varvec{{\tilde{z}}}) \varvec{x}^{*} + \varvec{B}\varvec{{\tilde{y}}}&\le \varvec{r}(\varvec{{\tilde{z}}}), \end{aligned}$$i.e., \(\varvec{{\tilde{y}}}\) is a feasible wait-and-see decision for scenario \(\varvec{{\tilde{z}}}\), and the resulting objective value of \(\varvec{y}^{*}(\varvec{{\tilde{z}}})\) exceeds that of \(\varvec{{\tilde{y}}}\) by more than \({\bar{v}}\). Then \((\varvec{{\tilde{z}}}, \varvec{{\tilde{y}}})\) is feasible to (12) with a strictly better objective value than \({\bar{v}}\). This is a contradiction.
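On a finite scenario grid, the certificate of Lemma 4 reduces to computing the largest improvement any scenario-wise optimal recourse offers over the candidate rule; the rule is undominated on the grid exactly when this gap is nonpositive. A toy, scalar-recourse sketch (a grid analogue of problem (12), not the continuous problem; all names are ours):

```python
def paro_extension_gap(scenarios, d, y_star, y_best):
    """Grid analogue of the optimal value of problem (12): the largest
    improvement over the candidate rule y_star achievable by any feasible
    recourse, where y_best(z) is the scenario-wise optimal recourse.
    A nonpositive gap on the grid means y_star cannot be dominated there;
    a positive gap exhibits a dominating modification as in the lemma."""
    return max(d * (y_star(z) - y_best(z)) for z in scenarios)
```

A scenario-wise optimal rule yields gap 0; a rule that overpays in even one scenario yields a strictly positive gap, witnessing the dominating rule constructed in the proof.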
1.10 Proof Lemma 6
The proof is by contradiction, analogous to the proof of Theorem 1 of Iancu and Trichakis [16]. Because U is the convex hull of \(\varvec{z}^1,\dotsc ,\varvec{z}^N\), (15b) and (15c) ensure that \(\varvec{x}^{*}\) is ARO to (3) (with \(\varvec{d}=\varvec{0}\)). Suppose \(\varvec{x}^{*}\) is not PARO to (3). According to Definition 7(i) there exists an \(\varvec{{\hat{x}}}\) that is ARO to (3) and
Because \(\varvec{{\hat{x}}}\) is ARO to (3), there also exist \((\varvec{{\hat{y}}^1},\dotsc ,\varvec{{\hat{y}}^N})\) that, together with \(\varvec{{\hat{x}}}\), are feasible to (15).
The linear optimization problem \(\min _{\varvec{z} \in U} \varvec{c}(\varvec{z})^{\top }(\varvec{{\hat{x}}} - \varvec{x}^{*})\) attains its minimum at a vertex of U, so without loss of generality we can assume \(\varvec{{\hat{z}}} \in \text {ext}(U)\). Any point \(\varvec{{\bar{z}}} \in \text {ri}(U)\) can be written as a strict convex combination of the extreme points of U [25], so \(\varvec{{\bar{z}}} = \sum _{i=1}^N \alpha _i \varvec{z}^i\) for some \(\varvec{\alpha } \in {\mathbb {R}}^{N}\) with \(\sum _{i=1}^N \alpha _i = 1\), \(\alpha _i >0\) for all i. Then
where the first term of the RHS is nonpositive and the second term is strictly negative. This contradicts the fact that \((\varvec{x}^{*},\varvec{y}^{1*},\dotsc ,\varvec{y}^{N*})\) is optimal to (15).
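The contradiction rests on the identity that, because \(\varvec{c}(\varvec{z})\) is affine in \(\varvec{z}\), the gap \(\varvec{c}(\varvec{z})^{\top }(\varvec{{\hat{x}}} - \varvec{x}^{*})\) evaluated at a convex combination of vertices equals the same convex combination of the vertex gaps. A numerical sanity check of this identity on illustrative data (the affine form \(\varvec{c}(\varvec{z}) = \varvec{C}\varvec{z} + \varvec{c}_0\) and all values below are our own):

```python
import numpy as np

def gap_at_combination(c, dx, vertices, alpha):
    """Evaluate the gap c(z)^T dx at the convex combination
    z = sum_i alpha_i z^i. For c affine in z this equals
    sum_i alpha_i * c(z^i)^T dx, the identity used in the proof."""
    z = sum(a * v for a, v in zip(alpha, vertices))
    return c(z) @ dx
```

Because each vertex gap is nonpositive and at least one has a strictly positive weight in a strict convex combination, a strictly negative vertex gap forces a strictly negative gap at the interior point, which is the contradiction.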
1.11 Proof Lemma 7
In iteration 0 of Algorithm 2, solution \(\varvec{x}^0\) is the ARO solution resulting from Algorithm 1. In subsequent iterations, solution \(\varvec{x}^k\) is only replaced by a candidate solution \(\varvec{x}^c\) if \(q(\varvec{x}^c) \le \text {OPT}\). Value \(q(\varvec{x}^c)\) is the optimal objective value of \(Q(\varvec{x}^c)\), i.e., it is the worst-case objective value of (3) with fixed Stage-1 decision \(\varvec{x}^c\). Thus, \(\varvec{x}^k\) is only replaced by \(\varvec{x}^c\) if \(\varvec{x}^c\) is ARO, so in any iteration \(\varvec{x}^k\) is ARO. It remains to show that if the optimal objective value of problem \(P_2(\varvec{x}^k,M^k)\) is nonnegative, solution \(\varvec{x}^k\) is PARO. We prove this by contradiction.
Suppose \(\varvec{x}^k\) is not PARO. Then there exists another \(\varvec{x}^{*}\) that is ARO to (3) that additionally satisfies the following two conditions:
-
1.
For each \(\varvec{z} \in U\) there exists a \(\varvec{y}\) such that for all \(\varvec{y}^k\) with \(\varvec{A}(\varvec{z})\varvec{x}^k + \varvec{B}\varvec{y}^k \le \varvec{r}(\varvec{z})\) we have
$$\begin{aligned} \varvec{c}(\varvec{z})^{\top }\varvec{x^{*}} + \varvec{d}^{\top } \varvec{y}&\le \varvec{c}(\varvec{z})^{\top }\varvec{x}^k + \varvec{d}^{\top }\varvec{y}^k,\\ \varvec{A}(\varvec{z})\varvec{x^{*}} + \varvec{B} \varvec{y}&\le \varvec{r}(\varvec{z}). \end{aligned}$$ -
2.
There exists a \(\varvec{z^{*}} \in U\) and a \(\varvec{y^{*}}\) such that for all \(\varvec{y}^k\) with \(\varvec{A}(\varvec{z}^{*})\varvec{x}^k + \varvec{B}\varvec{y}^k \le \varvec{r}(\varvec{z^{*}})\) we have
$$\begin{aligned} \varvec{c}(\varvec{z^{*}})^{\top }\varvec{x}^{*} + \varvec{d}^{\top } \varvec{y^{*}}&< \varvec{c}(\varvec{z}^{*})^{\top }\varvec{x}^k + \varvec{d}^{\top }\varvec{y}^k, \\ \varvec{A}(\varvec{z}^{*})\varvec{x}^{*} + \varvec{B} \varvec{y}^{*}&\le \varvec{r}(\varvec{z}^{*}). \end{aligned}$$
Because \(M^k \subseteq U\), the first condition implies that for each \(\varvec{z}^l \in M^k\) there exists a recourse decision \(\varvec{y}^{l *}\) such that \((\varvec{x}^{*},\varvec{y}^{l *})\) satisfies constraints (19c) and (19d). The second condition is equivalent to the statement that there exists \(\varvec{z}^{*} \in U\) and a \(\varvec{y}^{*}\) such that
Put together, this implies that \((\varvec{z}^{*},\varvec{x}^{*},\varvec{y}^{*},\varvec{y}^{1 *},\dotsc ,\varvec{y}^{ |M^k| *})\) is a feasible solution to \(P_2(\varvec{x}^k,M^k)\) with strictly negative objective value. This contradicts \(p_2^k \ge 0\). Thus, \(\varvec{x}^k\) is PARO.
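The acceptance and stopping logic verified in this proof can be summarized as a small skeleton of Algorithm 2, with the subproblems \(Q(\varvec{x})\) and \(P_2(\varvec{x}^k,M^k)\) abstracted as oracles (a sketch of the control flow only; the oracle signatures and names are our simplification):

```python
def paro_ccg(x0, opt, q, solve_p2, max_iter=100):
    """Skeleton of the constraint-&-column-generation acceptance logic:
    - solve_p2(x) stands in for P2(x, M) and returns (p2, candidate);
    - q(x) stands in for the worst-case value of Q(x);
    the incumbent is replaced only if the candidate stays worst-case
    optimal (q <= opt), and p2 >= 0 certifies PARO per Lemma 7."""
    x = x0
    for _ in range(max_iter):
        p2, x_cand = solve_p2(x)
        if p2 >= 0:
            return x, True          # incumbent certified PARO
        if q(x_cand) <= opt:        # candidate remains worst-case optimal
            x = x_cand
    return x, False                 # iteration budget exhausted
```

With toy oracles that keep improving the incumbent until no dominating candidate remains, the loop terminates with the certified solution.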
1.12 Proof Lemma A.1
We consider only adaptive robust feasibility and not optimality, so the objective of \(P_{\text {hybrid}}\) can be ignored. According to Lemma B.1, each adaptive variable \(y_k(\varvec{z})\), \(k=1,\dotsc ,n_y\), must satisfy bounds (29). For \(P_{\text {hybrid}}\), the term \(\varphi _i(\varvec{{\hat{z}}},\varvec{z}_{(i)}) = r_i(\varvec{{\hat{z}}},\varvec{z}_{(i)}) - \varvec{a}_i(\varvec{{\hat{z}}}, \varvec{z}_{(i)})^{\top }\varvec{x}\) depends only on \(\varvec{{\hat{z}}}\) and \(\varvec{z}_{(i)}\), for each \(i=1,\dotsc ,m\). Sets \({\hat{U}}\) and \(U^i\) are disjoint for each \(i=1,\dotsc ,m\), so this is equivalent to
We proceed by backward induction. For \(k=n_y\), i.e., the last eliminated variable, bounds (55) depend only on \(\varvec{z}\) and not on other adaptive variables. According to Lemma B.1, each term \(\varphi _i(\varvec{z}_{(i)})\), \(i=1,\dotsc ,m\), appears in upper bounds with a positive coefficient and in lower bounds with a negative coefficient for all variables \(y_1(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z})\) (if it appears), or vice versa. Hence, the worst-case scenario for \(\varvec{z}_{(i)} \in U^i\) (in terms of feasibility) is equal for all linear terms in the lower and the upper bound for all \(i=1,\dotsc ,m\). Plugging in this worst-case scenario yields lower and upper bounds on \(y_{n_y}(\varvec{z})\) depending only on \(\varvec{{\hat{z}}}\). Thus, there exists a decision rule for \(y_{n_y}(\cdot )\) that is a function of only the non-constraintwise uncertain parameters \(\varvec{{\hat{z}}}\).
Suppose that for some k the lower and upper bounds (55) for \(y_k(\varvec{z})\) depend only on \(\varvec{{\hat{z}}}\). Thus, there exists a decision rule for \(y_{k}(\cdot )\) that is a function of only \(\varvec{{\hat{z}}}\). Plug this decision rule in the lower and upper bounds (55) for \(y_{k-1}(\varvec{z})\). Then, according to Lemma B.1, each term \(\varphi _i(\varvec{z}_{(i)})\), \(i=1,\dotsc ,m\), appears in upper bounds with a positive coefficient and in lower bounds with a negative coefficient (if it appears), or vice versa. Hence, the worst-case scenario for \(\varvec{z}_{(i)} \in U^i\) (in terms of feasibility) is equal for all linear terms in the lower and the upper bound, for all \(i=1,\dotsc ,m\). Plugging in this worst-case scenario yields lower and upper bounds on \(y_{k-1}(\varvec{z})\) depending only on \(\varvec{{\hat{z}}}\). This completes the induction.
Let \(\varvec{y}(\varvec{{\hat{z}}})\) be the decision rule resulting from the above procedure. Because \(\varvec{x}\) is ARF to \(P_{\text {hybrid}}\), the resulting pair \((\varvec{x},\varvec{y}(\varvec{{\hat{z}}}))\) is ARF to \(P_{\text {hybrid}}\).
1.13 Proof Corollary A.1
We note that if (3) has hybrid uncertainty and the objective (3a) contains adaptive variables, it can equivalently be written as
where \(t \in {\mathbb {R}}\) is an auxiliary Stage-1 decision variable. Problem (56) also has hybrid uncertainty, and a pair \((\varvec{x},\varvec{y}(\cdot ))\) is ARO to (3) if and only if there exists a \(t \in {\mathbb {R}}\) such that \((\varvec{x},\varvec{y}(\cdot ),t)\) is ARO to (56). Thus, in the remainder of the proof we can assume \(\varvec{d}=\varvec{0}\), i.e., the objective is independent of adaptive variables.
According to Lemma A.1, for any ARF \(\varvec{x}\) there exists a decision rule \(\varvec{y}(\cdot )\) that depends only on \(\varvec{{\hat{z}}}\) such that \((\varvec{x},\varvec{y}(\cdot ))\) is ARF to \(P_{\text {hybrid}}\). Any \(\varvec{x}^{*}\) that is ARO to \(P_{\text {hybrid}}\) is in particular ARF to \(P_{\text {hybrid}}\), so for each ARO \(\varvec{x}^{*}\) there also exists such a decision rule \(\varvec{y}^{*}(\cdot )\). Since the objective is independent of adaptive variables, \((\varvec{x}^{*},\varvec{y}(\cdot ))\) is ARO for any ARF \(\varvec{y}(\cdot )\). Hence, \((\varvec{x}^{*},\varvec{y}^{*}(\cdot ))\) is ARO to \(P_{\text {hybrid}}\).
1.14 Proof of Lemma A.2
We consider only adaptive robust feasibility and not optimality, so the objective of \(P_{\text {block}}\) can be ignored. Remove index 0 from its constraint set K(v) (for some v). The set of constraints can be written as
Due to the block uncertainty structure, all adaptive variables can be eliminated by performing FME on each block v separately. According to Lemma B.1, bounds on each adaptive variable \(y_k(\varvec{z})\) can be represented by (29). If for some \(k=1,\dotsc ,n_y\), variable \(y_k(\varvec{z})\) is an element of \(\varvec{y}_{(v)}(\varvec{z})\) for some block v, any \(S \in C_k^{-}\) or \(T \in C_k^{+}\) is a subset of \(K_{(v)}\), the original set of constraints for block v. The following two observations immediately follow for the given block v:
-
For each \(l=1,\dotsc ,n_y\) the coefficient of \(y_l(\varvec{z})\) is zero if \(y_l(\varvec{z})\) is not an element of \(\varvec{y}_{(v)}\), i.e., \(\beta (S,l) = 0\) for all \(S \in C_k^{-} \cup C_k^{+}\).
-
For any p in S or T it holds that \(\varphi _p(\cdot )\) is a function of \(\varvec{z}_{(v)}\) only.
For \(k=n_y\), i.e., the last eliminated variable, this implies that the lower and upper bounds on \(y_{n_y}(\cdot )\) are independent of \(\varvec{z}_{(w)}\) for \(w\ne v\), and any feasible decision rule can be written as a function of \(\varvec{z}_{(v)}\) only. Plugging any such decision rule into the lower and upper bounds for \(k=n_y-1\) yields the same result for \(y_{n_y-1}(\cdot )\). The final result follows from backward induction.
Let \(\varvec{y}(\varvec{z})\) be the decision rule resulting from the above procedure. Because \(\varvec{x}\) is ARF to \(P_{\text {block}}\), the resulting pair \((\varvec{x},\varvec{y}(\varvec{z}))\) is ARF to \(P_{\text {block}}\).
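The per-block decomposition can be illustrated with a toy numeric check (hypothetical helper and data; one scalar adaptive variable per block). Since no bound for block \(v\) involves another block's adaptive variables or uncertain parameters, joint feasibility decomposes into independent per-block eliminations:

```python
def block_feasible(rows):
    """Eliminate one scalar adaptive variable from rows (c, r),
    each meaning c*y <= r; feasible iff all lower bounds are
    below all upper bounds."""
    upper = [r / c for c, r in rows if c > 0]
    lower = [r / c for c, r in rows if c < 0]
    return all(l <= u for l in lower for u in upper)

# Block 1 confines y_(1) to [0, 3]; block 2 confines y_(2) to [2, 5].
blocks = {1: [(1.0, 3.0), (-1.0, 0.0)], 2: [(1.0, 5.0), (-1.0, -2.0)]}
# No constraint couples the blocks, so FME runs on each block separately.
feasible = all(block_feasible(rows) for rows in blocks.values())
```

This mirrors the proof: the decision rule for each block's adaptive variables can be constructed from that block's bounds alone.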
1.15 Proof of Lemma A.3
We consider only adaptive robust feasibility and not optimality, so the objective of \(P_{\text {simplex}}\) can be ignored. According to Lemma B.1, in the FME procedure the bounds on variable \(y_k(\varvec{z})\) are given by (29). It is sufficient to satisfy the bounds on \(y_k(\varvec{z})\) for all extreme points of uncertainty set U, so we can alternatively write:
For each \(j=1,\dotsc ,L+1\), let \(l_k(\varvec{z}^j)\) and \(u_k(\varvec{z}^j)\) denote the lower and upper bound, respectively, on \(y_k(\varvec{z}^j)\) from (57). Affine independence of \(\varvec{z}^1,\dotsc ,\varvec{z}^{L+1}\) implies linear independence of \((1,\varvec{z}^1),\dotsc ,(1,\varvec{z}^{L+1})\). Hence, by basic linear algebra, there exists exactly one \((a_0,\varvec{a}) \in {\mathbb {R}}\times {\mathbb {R}}^{L}\) such that \(a_0 + \varvec{a}^{\top }\varvec{z}^j = l_k(\varvec{z}^j)\) for all \(j=1,\dotsc ,L+1\). Consider the LDR \(y_k(\varvec{z}) = a_0 + \varvec{a}^{\top }\varvec{z}\). Then \(l_k(\varvec{z}^j) = y_k(\varvec{z}^j) \le u_k(\varvec{z}^j)\) for all \(j=1,\dotsc ,L+1\). Hence, \(y_k(\varvec{z})\) is an LDR that satisfies bounds (57). Alternatively, one can construct an LDR that passes through the points \((\varvec{z}^j,u_k(\varvec{z}^j))\) for all \(j=1,\dotsc ,L+1\), or any LDR that is a convex combination of the previous two LDRs.
Thus, we can construct a decision rule for \(y_k(\varvec{z})\) that is linear in \(\varvec{z}\). For all \(k=1,\dotsc ,n_y-1\), this decision rule depends on \(y_{k+1}(\varvec{z}),\dotsc ,y_{n_y}(\varvec{z})\). For variable \(y_{n_y}(\cdot )\), the constructed decision rule is independent of other adaptive variables. Plugging this into the decision rule for \(y_{n_y-1}(\cdot )\) yields a decision rule that is again independent of other adaptive variables, and still linear in \(\varvec{z}\), because the coefficient of \(y_{n_y}(\varvec{z})\) in \(l_{n_y-1}(\varvec{z})\) and \(u_{n_y-1}(\varvec{z})\) does not depend on \(\varvec{z}\) (fixed recourse). Continuing this procedure yields LDRs for all adaptive variables \(y_1(\cdot ),\dotsc ,y_{n_y}(\cdot )\).
Let \(\varvec{y}(\varvec{z})\) be the decision rule resulting from the above procedure. Because \(\varvec{x}\) is ARF to \(P_{\text {simplex}}\), the resulting pair \((\varvec{x},\varvec{y}(\varvec{z}))\) is ARF to \(P_{\text {simplex}}\).
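The interpolation argument can be sketched numerically (hypothetical data; here \(L=2\), so \(L+1=3\) affinely independent scenarios). Solving the square linear system \(a_0 + \varvec{a}^{\top }\varvec{z}^j = l_k(\varvec{z}^j)\) recovers the unique LDR through the lower-bound values:

```python
import numpy as np

# Hypothetical scenarios z^1, z^2, z^3 (rows) and lower-bound values l_k(z^j).
Z = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
l_vals = np.array([1.0, 3.0, 2.0])

# Affine independence of the z^j means the vectors (1, z^j) are linearly
# independent, so the interpolation system has a unique solution (a0, a).
A = np.hstack([np.ones((3, 1)), Z])
a0, a1, a2 = np.linalg.solve(A, l_vals)

def ldr(z):
    """The unique LDR passing through the points (z^j, l_k(z^j))."""
    return a0 + a1 * z[0] + a2 * z[1]
```

By construction `ldr` matches the lower bound at every extreme point, so it satisfies the bounds (57) whenever the system is feasible; the analogous construction through the upper-bound values works the same way.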
Cite this article
Bertsimas, D., ten Eikelder, S.C.M., den Hertog, D. et al. Pareto Adaptive Robust Optimality via a Fourier–Motzkin Elimination lens. Math. Program. 205, 485–538 (2024). https://doi.org/10.1007/s10107-023-01983-z
Keywords
- Robust optimization
- Adaptive robust optimization
- Pareto optimality
- Fourier–Motzkin elimination
- Decision rules