1 Introduction

Robust Optimization (RO) is a widespread methodology for modeling decision-making problems under uncertainty that seeks to optimize worst-case performance [7, 11, 12]. In practice, RO problems usually admit multiple worst-case optimal solutions, the performance of which may differ substantially under non-worst-case uncertainty scenarios. Consequently, the choice of an optimal solution often has material impact on performance under real-world implementations. This important consideration, which was first brought forth by Iancu and Trichakis [16], has been successfully tackled for static, single-stage (linear) RO problems. For the increasingly popular and broad class of dynamic, multi-stage Adaptive Robust Optimization (ARO) problems [2], however, there is no successful approach for choosing an optimal solution, and the purpose of this paper is to bridge this gap.

In particular, for static RO problems, Iancu and Trichakis [16] proposed the choice of so-called Pareto Robustly Optimal (PRO) solutions. PRO solutions are unarguably preferable to non-PRO solutions because, by definition, for a PRO solution there exists no other worst-case optimal solution that performs at least as well for all scenarios in the uncertainty set and strictly better for at least one scenario. The PRO concept has been applied to several static robust optimization settings, such as distributionally robust mechanism design [18].

Going beyond static RO problems, it is well understood that the choice of an optimal solution remains crucial for the broader class of multi-stage ARO problems. As in the static case, ARO solutions chosen purely on worst-case grounds, without regard for performance across the entire spectrum of possible scenarios, can lead to substantial performance losses. For example, see the work by De Ruiter et al. [9], who numerically demonstrate the existence of multiple worst-case optimal solutions for the classical multi-stage inventory-production model considered in Ben-Tal et al. [2], and find that these solutions differ considerably from each other in their non-worst-case performance.

For ARO problems, however, a solution approach that unarguably chooses “good solutions,” similar to PRO solutions for static RO problems, has proved elusive thus far. Extant approaches have all attempted to simply apply the concept of PRO to ARO problems. Specifically, they advocate restricting attention to adaptive variables that depend affinely on the uncertain parameters, commonly referred to as Linear Decision Rules (LDRs). Restricting to LDRs reduces the problem to a static RO problem and enables the search for associated PRO solutions [8, 9, 16]. As we shall show, however, this indirect application of the PRO concept fails to produce solutions that cannot be dominated.

Recently, the concept of Pareto Adaptive Robustly Optimal (PARO) solutions was introduced for a specific ARO application [26], but no general treatment of the topic was included. In this paper, we formalize and study the concept of PARO solutions for linear ARO problems. Similar to PRO solutions for static RO problems, PARO solutions yield worst-case optimal performance and are not dominated by any other such solutions in non-worst-case scenarios. In other words, PARO solutions are unarguably preferable to non-PARO solutions: they offer improved performance in non-worst-case scenarios while maintaining worst-case optimality. From a practical standpoint, this means that implementing PARO solutions can only yield performance benefits, without any associated drawbacks.

To introduce the PARO concept and highlight its practical importance, we provide an illustrative toy example. The example also serves two additional important purposes. First, it enables us to show in a simple setting how PARO solutions can dominate PRO solutions, as remarked above. Second, the example motivates the need for new analysis techniques for studying PARO.

Example 1

In treatment planning for radiation therapy, the goal is to deliver a curative amount of dose to the target volume (tumor tissue), while minimizing the dose to healthy tissues. Consider a simplified case with two target subvolumes. For subvolume \(i\in \{1,2\}\), the required dose level \(d_i\) depends on the radiation sensitivity of the tissue, which is unknown. Assume that, prior to treatment, the required doses are only known to lie in

$$\begin{aligned} U = \{(d_1,d_2)~|~ 50 \le d_i \le 60,~i=1,2\}. \end{aligned}$$

Mid-treatment, the required doses are ascertained via biomarker measurements.

Treatment doses are administered in two stages. The dose administered in the first stage, denoted by x, needs to be decided prior to treatment. The dose administered in the second stage, denoted by y, can be decided after the required doses have been ascertained, i.e., it can be adapted once the uncertainty has been revealed. In each stage, the dose is delivered homogeneously to both subvolumes. The dose in each stage is limited to the interval [20, 40]. The total dose administered is \(x+y\), and the healthy tissue receives a fraction \(\delta >0\) of it. The Stage-1 dose x and a decision rule \(y(\cdot )\) for the adaptive Stage-2 dose can be chosen by solving:

$$\begin{aligned} \min _{x,y(\cdot )} ~&~ \max _{(d_1,d_2) \in U} \delta (x + y(d_1,d_2)) \end{aligned}$$
(1a)
$$\begin{aligned} \text {s.t.}~&~ x + y(d_1,d_2) \ge d_1, \quad \forall (d_1,d_2) \in U, \end{aligned}$$
(1b)
$$\begin{aligned} ~&~ x + y(d_1,d_2) \ge d_2,\quad \forall (d_1,d_2) \in U, \end{aligned}$$
(1c)
$$\begin{aligned} ~&~ 20 \le x \le 40, \end{aligned}$$
(1d)
$$\begin{aligned} ~&~ 20 \le y(d_1,d_2) \le 40,\quad \forall (d_1,d_2) \in U. \end{aligned}$$
(1e)

Problem (1) is an ARO problem with constraintwise uncertainty, for which static decision rules are worst-case optimal [2]. Plugging in \(y(d_1,d_2) = {\hat{y}}\) and solving the resulting static RO model yields a worst-case optimal objective value of \(60\delta \), achieved by all \((x,{\hat{y}})\) such that \(x+{\hat{y}}=60\). For any such solution, the objective value equals \(60\delta \) not only in the worst-case scenario but in all scenarios. Hence, all these solutions are PRO, according to the definition of Iancu and Trichakis [16]. Consequently, the Stage-1 decisions that are PRO lie in the set:

$$\begin{aligned} X^{\text {PRO}} = \{x ~|~ 20 \le x \le 40 \}. \end{aligned}$$

Consider now the decision rule \(y^{*}(d_1,d_2) = \max \{20,d_1-x,d_2-x\}\), which is feasible for all feasible x. Furthermore, this rule minimizes the objective for any fixed x, \(d_1\) and \(d_2\). Plugging this in gives

$$\begin{aligned} \min _{20 \le x \le 40} \max _{(d_1,d_2) \in U} \delta \max \{20 + x,d_1,d_2\}. \end{aligned}$$

For given \((d_1,d_2)\) the objective value is at least \(\delta \max \{d_1,d_2\}\), and this is achieved by all \(x \le 30\). Thus, it should be preferable to implement one of these solutions for the Stage-1 decision. In fact, these solutions, which we call PARO, cannot be dominated by other solutions. Notably, the set of PARO solutions

$$\begin{aligned} X^{\text {PARO}} = \{x~|~ 20 \le x \le 30\}, \end{aligned}$$

is a strict subset of \(X^{\text {PRO}}\). This implies that PARO solutions could dominate PRO solutions that are non-PARO. To exemplify, compare the following three solutions: (i) PARO solution \(x^{*} = 25\) with optimal decision rule \(y^{*}(d_1,d_2)\), (ii) PRO (non-PARO) solution \({\hat{x}} = 35\) with optimal decision rule \(y^{*}(d_1,d_2)\), (iii) PRO solution \({\hat{x}} = 35\) with static decision rule \({\hat{y}}=25\).

Table 1 Difference between PARO and PRO solutions for Example 1

Table 1 shows the performance for three scenarios. In the worst-case scenario (60, 60) all solutions perform equally. In scenario (50, 55), solution (iii) is outperformed by the other two solutions; in scenario (50, 50), both solutions (ii) and (iii) are outperformed by PARO solution (i). There is no scenario where \({\hat{x}}\) results in a strictly better objective value than \(x^{*}\), irrespective of the decision rule used. Thus, the PRO solution \({\hat{x}}\) is dominated by the PARO solution \(x^{*}\).
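To make the comparison concrete, the following minimal Python sketch reproduces the comparison behind Table 1. The common positive factor \(\delta \) is omitted, so the values printed are the total dose \(x+y\) per scenario.

```python
# Minimal sketch reproducing the comparison of Table 1 (delta omitted as a common
# positive factor, so values are the total dose x + y per scenario).
solutions = {
    "(i)   PARO x*=25, adaptive y*": (25, lambda x, d1, d2: max(20, d1 - x, d2 - x)),
    "(ii)  PRO  x^=35, adaptive y*": (35, lambda x, d1, d2: max(20, d1 - x, d2 - x)),
    "(iii) PRO  x^=35, static y^=25": (35, lambda x, d1, d2: 25),
}
scenarios = [(60, 60), (50, 55), (50, 50)]

for name, (x, rule) in solutions.items():
    totals = [x + rule(x, d1, d2) for d1, d2 in scenarios]
    print(name, totals)
# Expected totals per scenario: (i) [60, 55, 50], (ii) [60, 55, 55], (iii) [60, 60, 60].
```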

Besides showing that PRO solutions can be dominated in ARO problems, Example 1 also provides intuition into how this happens. In particular, what unlocks extra performance in ARO problems is the application of decision rules that are not merely worst-case optimal, but rather “Pareto optimal,” i.e., they optimize performance over non-worst-case scenarios as well. Note, however, that although linear decision rules may suffice for worst-case optimality under special circumstances, for Pareto optimality non-linear rules are more often necessary, as the example illustrates.

The application of non-linear decision rules to study PARO solutions invalidates the techniques used in the extant literature on Pareto efficiency in RO, which focuses solely on linear formulations. In other words, the analysis of Pareto efficiency in ARO calls for a new line of attack, which brings us to another contribution we make. Specifically, to study PARO solutions, we rely heavily on Fourier–Motzkin Elimination (FME) as a proof technique. Through the lens of FME we consider optimality of decision rule structures, which then enables us to study PARO. As a byproduct, we illustrate how this proof technique can be applied in ARO more generally, by providing more general and more insightful proofs of known results (not related to Pareto efficiency).

1.1 Findings and contributions

Before we begin our analysis, we summarize the findings and the contributions of this paper. The treatment presented is restricted to two-stage ARO models that are linear in both decision variables and uncertain parameters.

  1. Concept of PARO Solutions. In the context of linear ARO problems, we formalize the concept of Pareto Adaptive Robustly Optimal (PARO) solutions. PARO solutions have the property that no other solution and associated adaptive decision rule exist that dominate them, i.e., perform at least as well under every scenario and strictly better under at least one scenario. As Example 1 above has already shown, in the context of ARO problems, PARO solutions can dominate solutions obtained under other Pareto optimality concepts already proposed in the literature [16]. In practice, PARO solutions can only yield performance benefits compared with non-PARO solutions, as the latter lead to efficiency losses.

  2. Properties of PARO Solutions. We derive several properties of PARO solutions. The main results are that, for linear ARO problems with fixed and continuous recourse, affine dependence on uncertain parameters and a compact feasible region, there exists a first-stage PARO solution, and there exists a piecewise linear (PWL) decision rule that is PARO. To arrive at these results, our analysis relies on FME.

  3. Finding PARO Solutions and their Practical Value. We present several approaches to find and/or approximate PARO solutions in practice, amongst others using techniques based on FME. We also conduct a numerical study for a facility location example. The results reveal that (approximate) PARO solutions can yield substantially better performance in non-worst-case scenarios than worst-case optimal and PRO solutions, thus demonstrating the practical value of the proposed methodology.

  4. FME as a Proof Technique for PARO. Zhen et al. [30] introduce FME as both a solution and proof technique for ARO. We apply and extend the latter idea, and use FME to prove worst-case and Pareto optimality of various decision rule structures. We extend and/or generalize known results in ARO, not related to Pareto optimality, and provide more insightful proofs; for example, one that uses FME to establish the results by Bertsimas and Goyal [5] and Zhen et al. [30] on optimality of LDRs under simplex uncertainty sets.

Finally, to better position our contributions vis-à-vis the extant literature, we note that, as mentioned earlier, PARO solutions were previously discussed in [26]. That work considers a specific 3-variable non-linear ARO problem arising in radiation therapy planning, and its approach for finding PARO solutions relies heavily on the specific formulation. No general treatment of PARO was included, although this demonstrates that PARO may also be relevant for non-linear ARO problems. In the current paper, we formalize the PARO concept, derive properties, such as existence of PARO solutions, and also discuss constructive approaches towards finding them. With regard to FME, Zhen et al. [30] were the first to recognize its applicability to linear ARO problems, owing to its ability to eliminate adaptive variables. They use FME as both a solution and proof technique; for the latter, the main obstacle is the exponential increase in the number of constraints after variable elimination. In the current paper, we apply and extend the ideas of Zhen et al. [30], and use FME as a proof technique. Through the lens of FME we first consider (worst-case) optimality of decision rule structures, and provide more general and more insightful proofs of known results. Subsequently, we investigate Pareto optimality using FME and present numerical results which demonstrate the value of PARO solutions.

1.2 Notation and organization

Boldface characters represent matrices and vectors. All vectors are column vectors and the vector \(\varvec{a}_i\) is the i-th row of matrix \(\varvec{A}\). The space of all measurable functions from \({\mathbb {R}}^n\) to \({\mathbb {R}}^m\) that are bounded on compact sets is denoted by \({\mathcal {R}}^{n,m}\). The vectors \(\varvec{e}_i\), \(\varvec{1}\) and \(\varvec{0}\) are the standard unit basis vector, the vector of all-ones and the vector of all-zeros, respectively. Matrix \(\varvec{I}\) is the identity matrix. The relative interior of a set S is denoted by \(\text {ri}(S)\); its set of extreme points is denoted by \(\text {ext}(S)\).

The manuscript is organized as follows. First, Sect. 2 introduces the ARO setting and the notion of PARO. Section 3 investigates the existence of PARO solutions using FME. Sections 4 and 5 present practical approaches for constructing PARO solutions, for special cases and in general, respectively. In Sect. 6 we present the results of our numerical experiments, and Sect. 7 concludes the manuscript. Appendix A uses FME to establish (worst-case) optimality of decision rule structures.

2 Pareto optimality in (adaptive) robust optimization

We first generalize the definition of PRO of Iancu and Trichakis [16] to non-linear static RO problems. The reason for this is that there turns out to be a relation between Pareto efficiency for non-linear static RO problems and linear ARO problems. Let \(\varvec{x} \in {\mathcal {X}} \subseteq {\mathbb {R}}^{n_x}\) denote the decision variables and let \(\varvec{z} \in U \subseteq {\mathbb {R}}^L\) denote the uncertain parameters. Let \(f: {\mathbb {R}}^{n_x} \times {\mathbb {R}}^L \mapsto {\mathbb {R}}\) and consider the static RO problem

$$\begin{aligned} \min _{\varvec{x}\in {\mathcal {X}}} \max _{\varvec{z} \in U}~f(\varvec{x},\varvec{z}). \end{aligned}$$
(2)

Let \({\mathcal {X}}^{\text {RO}}\) denote the set of robustly optimal (i.e., worst-case optimal) solutions. A robustly optimal solution \(\varvec{x}\) is PRO if there does not exist another robustly optimal solution \(\varvec{{\bar{x}}}\) that performs at least as well as \(\varvec{x}\) for all scenarios in the uncertainty set, while performing strictly better for at least one scenario. If such a solution \(\varvec{{\bar{x}}}\) does exist, it is said to dominate \(\varvec{x}\). In practice, solution \(\varvec{{\bar{x}}}\) will always be preferred over \(\varvec{x}\). If all uncertainty in the objective is moved to the constraints using an epigraph formulation, Pareto robust optimality may also be defined in terms of slack variables [16, Section 4.1], but we do not use that definition here. We use the following formal definition:

Definition 1

(Pareto Robustly Optimal) A solution \(\varvec{x} \in {\mathcal {X}}^{\text {RO}}\) is PRO to (2) if there does not exist another \(\varvec{{\bar{x}}} \in {\mathcal {X}}^{\text {RO}}\) such that

$$\begin{aligned} f(\varvec{{\bar{x}}},\varvec{z})&\le f(\varvec{x},\varvec{z}), ~~\forall \varvec{z} \in U, \\ f(\varvec{{\bar{x}}},\varvec{{\bar{z}}})&< f(\varvec{x},\varvec{{\bar{z}}}), ~~ \text {for some } \varvec{{\bar{z}}} \in U. \end{aligned}$$

We aim to extend the concept of PRO to ARO problems. In particular, we consider the following adaptive linear optimization problem:

$$\begin{aligned} \min _{\varvec{x},\varvec{y}(\cdot )} ~&~ \max _{\varvec{z} \in U}~ \varvec{c}(\varvec{z})^{\top } \varvec{x} + \varvec{d}^{\top } \varvec{y}(\varvec{z}), \end{aligned}$$
(3a)
$$\begin{aligned} \text {s.t.} ~&~ \varvec{A}(\varvec{z})\varvec{x} + \varvec{B}\varvec{y}(\varvec{z}) \le \varvec{r}(\varvec{z}),~~\forall \varvec{z} \in U, \end{aligned}$$
(3b)

where \(\varvec{z}\in U \subseteq {\mathbb {R}}^{L}\) is an uncertain parameter, with U a compact, convex uncertainty set with nonempty relative interior. Variables \(\varvec{x} \in {\mathbb {R}}^{n_x}\) are the Stage-1 (here-and-now) decisions. Usually we will assume \(\varvec{x}\) to be continuous variables, but we emphasize that all results in the paper also hold if (part of) \(\varvec{x}\) is restricted to be integer-valued. Variables \(\varvec{y} \in {\mathcal {R}}^{L,n_y}\) are also continuous and capture the Stage-2 (wait-and-see) decisions, i.e., they are functions of \(\varvec{z}\). The matrix \(\varvec{B} \in {\mathbb {R}}^{m \times n_y}\) and vector \(\varvec{d} \in {\mathbb {R}}^{n_y}\) are assumed to be constant (fixed recourse), and \(\varvec{A}(\varvec{z})\), \(\varvec{r}(\varvec{z})\) and \(\varvec{c}(\varvec{z})\) depend affinely on \(\varvec{z}\):

$$\begin{aligned} \varvec{A}(\varvec{z}) := \varvec{A}^0 + \sum _{k=1}^{L} \varvec{A}^k z_k,~~\varvec{r}(\varvec{z}) := \varvec{r}^0 + \sum _{k=1}^{L} \varvec{r}^k z_k,~~\varvec{c}(\varvec{z}) := \varvec{c}^0 + \sum _{k=1}^{L} \varvec{c}^k z_k, \end{aligned}$$

with \(\varvec{A}^0,\dotsc ,\varvec{A}^{L} \in {\mathbb {R}}^{m \times n_x}\), \(\varvec{r}^0,\dotsc ,\varvec{r}^{L} \in {\mathbb {R}}^{m}\) and \(\varvec{c}^0,\dotsc ,\varvec{c}^{L} \in {\mathbb {R}}^{n_x}\). Uncertainty in the objective (3a) can be moved to the constraint using an epigraph formulation. Nevertheless, it is explicitly stated in the objective to facilitate a convenient definition of PARO. Let \(\text {OPT}\) denote the optimal (worst-case) objective value of (3). We continue by stating several assumptions and definitions regarding adaptive robust feasibility and optimality.

Definition 2

(Adaptive Robustly Feasible) A pair \((\varvec{x},\varvec{y}(\cdot ))\) is Adaptive Robustly Feasible (ARF) to (3) if \(\varvec{A}(\varvec{z})\varvec{x} + \varvec{B}\varvec{y}(\varvec{z}) \le \varvec{r}(\varvec{z}),~~\forall \varvec{z} \in U\).

Sometimes it is useful to consider properties of the first- and second-stage decisions separately. Therefore, we also define adaptive robust feasibility for Stage-1 and Stage-2 decisions individually.

Definition 3

(Adaptive Robustly Feasible \(\varvec{x}\) and/or \(\varvec{y}(\cdot )\))

  (i) A Stage-1 decision \(\varvec{x}\) is ARF to (3) if there exists a \(\varvec{y}(\cdot )\) such that \((\varvec{x}, \varvec{y}(\cdot ))\) is ARF to (3).

  (ii) A Stage-2 decision \(\varvec{y}(\cdot )\) is ARF to (3) if there exists an \(\varvec{x}\) such that \((\varvec{x}, \varvec{y}(\cdot ))\) is ARF to (3).

The set of all ARF solutions \(\varvec{x}\) is given by

$$\begin{aligned} {\mathcal {X}} = \{\varvec{x} \in {\mathbb {R}}^{n_x}~|~\exists \varvec{y} \in {\mathcal {R}}^{L,n_y} : \varvec{A}(\varvec{z})\varvec{x} + \varvec{B}\varvec{y}(\varvec{z}) \le \varvec{r}(\varvec{z}),~~\forall \varvec{z} \in U \}. \end{aligned}$$

We assume set \({\mathcal {X}}\) is nonempty, i.e., there exists an \(\varvec{x}\) that is ARF, and we assume (3) has a finite optimal objective value, i.e., OPT is a finite number. After feasibility, the natural next step is to formally define optimality.

Definition 4

(Adaptive Robustly Optimal) A pair \((\varvec{x},\varvec{y}(\cdot ))\) is Adaptive Robustly Optimal (ARO) to (3) if it is ARF and \(\varvec{c}(\varvec{z})^{\top } \varvec{x} + \varvec{d}^{\top } \varvec{y}(\varvec{z}) \le \text {OPT},~~\forall \varvec{z} \in U.\)

We also define adaptive robust optimality for Stage-1 and Stage-2 decisions individually.

Definition 5

(Adaptive Robustly Optimal \(\varvec{x}\) and/or \(\varvec{y}(\cdot )\))

  (i) A Stage-1 decision \(\varvec{x}\) is ARO to (3) if there exists a \(\varvec{y}(\cdot )\) such that \((\varvec{x}, \varvec{y}(\cdot ))\) is ARO to (3).

  (ii) A Stage-2 decision \(\varvec{y}(\cdot )\) is ARO to (3) if there exists an \(\varvec{x}\) such that \((\varvec{x}, \varvec{y}(\cdot ))\) is ARO to (3).

We are now in a position to define Pareto Adaptive Robust Optimality for two-stage ARO problems.

Definition 6

(Pareto Adaptive Robustly Optimal) A pair \((\varvec{x},\varvec{y}(\cdot ))\) is Pareto Adaptive Robustly Optimal (PARO) to (3) if it is ARO and there does not exist a pair \((\varvec{{\bar{x}}},\varvec{{\bar{y}}}(\cdot ))\) that is ARO and

$$\begin{aligned} \varvec{c}(\varvec{z})^{\top }\varvec{{\bar{x}}} + \varvec{d}^{\top } \varvec{{\bar{y}}}(\varvec{z})&\le \varvec{c}(\varvec{z})^{\top }\varvec{x} + \varvec{d}^{\top }\varvec{y}(\varvec{z}), ~~\forall \varvec{z} \in U, \\ \varvec{c}(\varvec{{\bar{z}}})^{\top }\varvec{{\bar{x}}} + \varvec{d}^{\top }\varvec{{\bar{y}}}(\varvec{{\bar{z}}})&< \varvec{c}(\varvec{{\bar{z}}})^{\top }\varvec{x} + \varvec{d}^{\top }\varvec{y}(\varvec{{\bar{z}}}), ~~ \text {for some } \varvec{{\bar{z}}} \in U. \end{aligned}$$

As before, the definitions can be extended to Stage-1 and Stage-2 decisions individually.

Definition 7

(Pareto Adaptive Robustly Optimal \(\varvec{x}\) and/or \(\varvec{y}(\cdot )\))

  (i) A Stage-1 decision \(\varvec{x}\) is PARO to (3) if there exists a \(\varvec{y}(\cdot )\) such that \((\varvec{x}, \varvec{y}(\cdot ))\) is PARO to (3).

  (ii) A Stage-2 decision \(\varvec{y}(\cdot )\) is PARO to (3) if there exists an \(\varvec{x}\) such that \((\varvec{x}, \varvec{y}(\cdot ))\) is PARO to (3).

Our main interest is in Definition 7(i). The reason for this is that the here-and-now decision \(\varvec{x}\) is usually the only one that the decision maker has to commit to. In contrast, instead of using decision rule \(\varvec{y}(\cdot )\), one can often resort to re-solving the optimization problem for the second stage once the value of the uncertain parameter has been revealed. This is known as the folding horizon approach, and it is applicable as long as there is time to re-solve between observing \(\varvec{z}\) and having to implement \(\varvec{y}(\varvec{z})\). There is no such alternative for \(\varvec{x}\), however, and different decisions in Stage 1 may lead to different adaptation possibilities in Stage 2.

Lastly, Pareto optimality can also be investigated for Stage-2 decisions if the Stage-1 decision \(\varvec{x}\) is fixed.

Definition 8

(Pareto Adaptive Robustly Optimal extension \(\varvec{y}(\cdot )\)) A Stage-2 decision \(\varvec{y}(\cdot )\) is a PARO extension to \(\varvec{x}^{*}\) for (3) if \((\varvec{x}^{*},\varvec{y}(\cdot ))\) is ARO to (3) and there does not exist another \(\varvec{{\bar{y}}}(\cdot )\) such that \((\varvec{x}^{*},\varvec{{\bar{y}}}(\cdot ))\) is ARO to (3) and

$$\begin{aligned} \varvec{c}(\varvec{z})^{\top }\varvec{x}^{*} + \varvec{d}^{\top }\varvec{{\bar{y}}}(\varvec{z})&\le \varvec{c}(\varvec{z})^{\top }\varvec{x}^{*} + \varvec{d}^{\top }\varvec{y}(\varvec{z}), ~~\forall \varvec{z} \in U, \\ \varvec{c}(\varvec{{\bar{z}}})^{\top }\varvec{x}^{*} + \varvec{d}^{\top }\varvec{{\bar{y}}}(\varvec{{\bar{z}}})&< \varvec{c}(\varvec{{\bar{z}}})^{\top }\varvec{x}^{*} + \varvec{d}^{\top }\varvec{y}(\varvec{{\bar{z}}}), ~~ \text {for some } \varvec{{\bar{z}}} \in U. \end{aligned}$$
Fig. 1

Illustration of the PARO concept. Each graph represents the objective value of (3) for a given pair \((\varvec{x},\varvec{y}(\varvec{z}))\) as a function of the uncertain parameter \(\varvec{z}\). Solution \((\hat{\varvec{x}}, \hat{\varvec{y}}(\varvec{z}))\) (solid line) is dominated by \((\hat{\varvec{x}}, \bar{\varvec{y}}(\varvec{z}))\) (solid-dotted line). Thus, decision rule \(\bar{\varvec{y}}(\varvec{z})\) may be a PARO extension of \(\hat{\varvec{x}}\), whereas decision rule \(\hat{\varvec{y}}(\varvec{z})\) is not. Solution \((\varvec{x}^{*}, \varvec{y}^{*}(\varvec{z}))\) (dashed line) dominates both \((\hat{\varvec{x}}, \hat{\varvec{y}}(\varvec{z}))\) and \((\hat{\varvec{x}}, \bar{\varvec{y}}(\varvec{z}))\) and may be PARO. Solution \(\hat{\varvec{x}}\) may also be a PARO Stage-1 solution

Figure 1 illustrates the PARO concept for a single uncertain parameter, and shows that care must be exercised when drawing conclusions related to PARO. In the example, \((\varvec{x}^{*}, \varvec{y}^{*}(\varvec{z}))\) is possibly PARO, but this cannot be concluded from the figure. Also, Stage-1 solution \(\hat{\varvec{x}}\) might still be PARO, but that cannot be concluded from the figure either. The reason for the latter is that there may be yet another decision rule \(\varvec{{\tilde{y}}}(\varvec{z})\) such that \((\hat{\varvec{x}}, \varvec{{\tilde{y}}}(\varvec{z}))\) is not dominated by \((\varvec{x}^{*}, \varvec{y}^{*}(\varvec{z}))\).

We conclude this section by mentioning three ways in which the PARO concept can be generalized or relaxed, although we do not consider these any further. First, Bertsimas et al. [8] define Pareto optimal adaptive solutions for general (non-linear) two-stage ARO problems, a concept that, for linear problems, is equivalent to our definition of PARO. They subsequently define Pareto optimal affine adaptive solutions, which is equivalent to the definition of PRO after plugging in LDRs, and focus on finding the latter type of solutions. In their numerical studies, Iancu and Trichakis [16] also consider two-stage problems and find PRO solutions after plugging in LDRs.

Second, one can relax only the requirement that the solution be ARO. For example, LDRs often do not guarantee an ARO solution but do exhibit good practical performance [20]. Suppose these yield a worst-case objective value p (> OPT). Then we can define p-PARO solutions as those solutions \((\varvec{x},\varvec{y}(\cdot ))\) that yield an objective value of at most p in each scenario, and are not dominated by another solution \((\varvec{{\bar{x}}},\varvec{{\bar{y}}}(\cdot ))\) that yields an objective value of at most p in each scenario.

Third, PARO may also be defined in terms of slack variables, analogous to the extension of PRO to constraint slacks in Iancu and Trichakis [16, Section 4.1]. In that paper, a slack value vector is introduced that quantifies the relative importance of slack in each constraint. This scalarization allows the computation of the total slack value of a solution in any scenario. Subsequently, PRO (and also PARO) can be defined on this total slack value instead of the objective value. This may be useful in applications where ARO is mainly used for maintaining feasibility, such as immunizing against uncertain renewable energy source output [17] and adjusting to disturbances in railway timetabling [24].

3 Properties of PARO solutions

In this section, we prove existence of PARO solutions for two-stage ARO problems of form (3). First, we use FME to prove that a PARO Stage-1 (here-and-now) solution is equivalent to a PRO solution of a piecewise linear (PWL) convex static RO problem, and use that to prove the existence of a PARO Stage-1 solution. Subsequently, we prove that there exists a PWL decision rule that is PARO to (3).

3.1 Existence of a PARO Stage-1 solution

We prove existence of PARO Stage-1 solutions in three steps. First, we prove that (3) is equivalent to a static RO problem with a convex PWL objective. Subsequently, we prove that a PRO solution to this static RO problem is equivalent to a PARO solution to (3). Lastly, we prove that PRO solutions to such problems always exist.

The proof that (3) is equivalent to a static RO problem with a convex PWL objective makes use of FME [10, 22], an algorithm for solving systems of linear inequalities. We refer to Bertsimas and Tsitsiklis [6] for an introduction to FME in linear optimization. Its usefulness in ARO is due to the fact that it can be used to eliminate adaptive variables, as proposed by Zhen et al. [30]. FME leads to an exponential increase in the number of constraints. Zhen et al. [30] introduce a redundant constraint identification scheme, which helps to reduce the number of redundant constraints, although the number of constraints remains exponential. Zhen et al. [30] also propose to use FME to eliminate only part of the adaptive variables and to use LDRs for the remaining ones. In addition, they use FME to prove (worst-case) optimality of PWL decision rules. Furthermore, they consider optimal decision rules for the adaptive variables in the dual problem: they prove (worst-case) optimality of LDRs in case of simplex uncertainty and of (two-)piecewise linear decision rules in case of box uncertainty. Zhen and den Hertog [29] use a combination of FME and ARO techniques to compute the maximum volume inscribed ellipsoid of a polytopic projection. The following example illustrates the use of FME to eliminate an adaptive variable.

Example 2

We use FME to eliminate adaptive variable y from (1) in Example 1. We move the uncertain objective to the constraints using an epigraph variable \(t\in {\mathbb {R}}\), and rewrite the constraints to obtain:

$$\begin{aligned} \min _{x,t,y(d_1,d_2)} ~&~ t, \end{aligned}$$
(4a)
$$\begin{aligned} \text {s.t.}~&~ 20 \le x \le 40, \end{aligned}$$
(4b)
$$\begin{aligned} ~&~ y(d_1,d_2) \le t/\delta - x, \quad \forall (d_1,d_2) \in U, \end{aligned}$$
(4c)
$$\begin{aligned} ~&~ d_1 - x \le y(d_1,d_2), \quad \forall (d_1,d_2) \in U, \end{aligned}$$
(4d)
$$\begin{aligned} ~&~ d_2 - x \le y(d_1,d_2), \quad \forall (d_1,d_2) \in U, \end{aligned}$$
(4e)
$$\begin{aligned} ~&~ \quad 20 \le y(d_1,d_2) \le 40, \quad \forall (d_1,d_2) \in U. \end{aligned}$$
(4f)

For fixed \((d_1,d_2)\), Constraints (4c)-(4f) specify lower and/or upper bounds on y. By combining each pair of lower and upper bounds on y into a new constraint, we find the following problem in terms of (xt):

$$\begin{aligned} \min _{x,t} ~&~ t, \end{aligned}$$
(5a)
$$\begin{aligned} \text {s.t.}~&~ 20 \le x \le 40, \end{aligned}$$
(5b)
$$\begin{aligned} ~&~ d_1 \le t/\delta , \quad \forall (d_1,d_2) \in U, \end{aligned}$$
(5c)
$$\begin{aligned} ~&~ d_2 \le t/\delta , \quad \forall (d_1,d_2) \in U, \end{aligned}$$
(5d)
$$\begin{aligned} ~&~ 20 \le t/\delta - x, \quad \forall (d_1,d_2) \in U, \end{aligned}$$
(5e)
$$\begin{aligned} ~&~ d_1 - x \le 40, \quad \forall (d_1,d_2) \in U, \end{aligned}$$
(5f)
$$\begin{aligned} ~&~ d_2 - x \le 40, \quad \forall (d_1,d_2) \in U, \end{aligned}$$
(5g)

where we have removed the trivial new constraint \(20 \le 40\). Any feasible solution (x, t) to (5) induces the following bounds on y:

$$\begin{aligned} \max \{d_1-x,d_2-x,20 \} \le y(d_1,d_2) \le \min \{ t/\delta - x, 40\},~~\forall (d_1,d_2) \in U, \end{aligned}$$

and, provided (x, t) is optimal to (5), any decision rule satisfying these inequalities is ARO to (1). Thus, the two-stage problem (1) has been reduced to the static linear RO problem (5). The auxiliary variable t can be eliminated, transforming (5) into an RO problem with a PWL objective.
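The elimination step of Example 2 can also be carried out mechanically. The following minimal Python sketch performs one Fourier–Motzkin elimination step on system (4), with the simplifying assumption \(\delta = 1\); each row encodes a constraint of the form \(a_x x + a_t t + a_y y + a_{d_1} d_1 + a_{d_2} d_2 + \text {const} \le 0\), and the adaptive variable y is eliminated by combining every lower bound on y with every upper bound on y.

```python
# Minimal sketch of one Fourier-Motzkin elimination step on system (4), assuming
# delta = 1. Columns: [x, t, y, d1, d2, const], each row read as "row . vars <= 0".
import numpy as np

rows = np.array([
    [-1,  0,  0, 0, 0,  20],   # 20 <= x
    [ 1,  0,  0, 0, 0, -40],   # x <= 40
    [ 1, -1,  1, 0, 0,   0],   # (4c): y <= t - x   (delta = 1)
    [-1,  0, -1, 1, 0,   0],   # (4d): d1 - x <= y
    [-1,  0, -1, 0, 1,   0],   # (4e): d2 - x <= y
    [ 0,  0, -1, 0, 0,  20],   # (4f): 20 <= y
    [ 0,  0,  1, 0, 0, -40],   # (4f): y <= 40
], dtype=float)

def eliminate(rows, col):
    """Eliminate the variable in column `col` from the system rows . vars <= 0."""
    upper = [r / r[col] for r in rows if r[col] > 0]     # rows giving upper bounds
    lower = [r / -r[col] for r in rows if r[col] < 0]    # rows giving lower bounds
    keep = [r for r in rows if r[col] == 0]
    new = [l + u for l in lower for u in upper]          # combine each (lower, upper) pair
    out = np.array(keep + new)
    return out[:, [j for j in range(rows.shape[1]) if j != col]]

reduced = eliminate(rows, col=2)
print(reduced)
# The rows of (5) appear (in columns x, t, d1, d2, const): d_i <= t, 20 <= t - x and
# d_i - x <= 40, plus the trivial combination -20 <= 0 of 20 <= y and y <= 40.
```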

We focus on applying FME as a proof technique. If FME is applied to the constraint system defining the Stage-1 feasible region \({\mathcal {X}}\) until all adaptive variables are eliminated, the feasible region can be written as

$$\begin{aligned} {\mathcal {X}}_{\text {FME}} = \{\varvec{x} \in {\mathbb {R}}^{n_x}~|~\varvec{G}(\varvec{z}) \varvec{x} \le \varvec{f}(\varvec{z}), ~~\forall \varvec{z} \in U\}, \end{aligned}$$

for some matrix \(\varvec{G}(\varvec{z})\) and vector \(\varvec{f}(\varvec{z})\) depending affinely on \(\varvec{z}\). Zhen et al. [30] show that \({\mathcal {X}} = {\mathcal {X}}_{\text {FME}}\). The following lemma proves that (3) is equivalent to a static RO problem with a convex PWL objective.

Lemma 1

If \((\varvec{x}^{*},\varvec{y}^{*}(\cdot ))\) is ARO to (3), \(\varvec{y}^{*}(\cdot )\) satisfies

$$\begin{aligned} \varvec{d}^{\top }\varvec{y}^{*}(\varvec{z}) = \max _{(S,T)\in M} \{h_{S,T}(\varvec{x}^{*},\varvec{z}) \},~~\forall \varvec{z}\in U, \end{aligned}$$
(6)

and \(\varvec{x}^{*}\) is optimal to

$$\begin{aligned} \min _{\varvec{x} \in {\mathcal {X}}_{\text {FME}}}~&~\max _{\varvec{z} \in U} \varvec{c}(\varvec{z})^{\top } \varvec{x} + \max _{(S,T)\in M} \{h_{S,T}(\varvec{x},\varvec{z}) \}, \end{aligned}$$
(7)

with

$$\begin{aligned} M = \Big \lbrace (S,T)~|~ \exists k=1,\dotsc ,n_y \text { s.t. }&S \in C_k^{-}, T\in C_k^{+}, \\&\beta (S,l) = \beta (T,l),~\forall l>k,~ 0\in S \cup T \Big \rbrace , \end{aligned}$$

and linear functions

$$\begin{aligned} h_{S,T}(\varvec{x},\varvec{z}) = \sum _{p \in S, p>0} \frac{\alpha (S,p)}{\alpha (T,0) - \alpha (S,0)} \varphi _p(\varvec{x},\varvec{z}) - \sum _{q \in T, q>0} \frac{\alpha (T,q)}{\alpha (T,0) - \alpha (S,0)} \varphi _q(\varvec{x},\varvec{z}), \end{aligned}$$

and sets \(C^{-}\), \(C^{+}\), functions \(\varphi (\cdot )\) and coefficients \(\alpha \) and \(\beta \) defined as in Lemma B.1. Conversely, if \(\varvec{x}^{*}\) is optimal to (7), there exists a \(\varvec{y}^{*}(\cdot )\) such that \((\varvec{x}^{*},\varvec{y}^{*}(\cdot ))\) is ARO to (3), and any such \(\varvec{y}^{*}(\cdot )\) satisfies (6).

Proof

See Appendix B.2. \(\square \)

The result of Lemma 1 is also illustrated in Example 2, where, if the auxiliary variable t is eliminated, the resulting problem has a convex PWL objective. If the number of adaptive variables in (3) is small enough that full FME can be performed (order of magnitude: 20 adaptive variables [30]), one can solve (7) via an epigraph formulation in order to obtain an ARO \(\varvec{x}\) to (3).

Lemma 2

A solution \(\varvec{x}^{*}\) is PARO to (3) if and only if it is PRO to

$$\begin{aligned} \min _{\varvec{x} \in {\mathcal {X}}_{\text {FME}}}~&~\max _{\varvec{z} \in U} \varvec{c}(\varvec{z})^{\top } \varvec{x} + \max _{(S,T)\in M} \{h_{S,T}(\varvec{x},\varvec{z}) \}, \end{aligned}$$
(8)

where each element (ST) of set M is a pair of sets of original constraints of (3) and each function \(h_{S,T}(\varvec{x},\varvec{z})\) is bilinear in \(\varvec{x}\) and \(\varvec{z}\).

Proof

See Appendix B.3. \(\square \)

Thus, existence of a PARO solution to (3) is now reduced to existence of a PRO solution to a static RO problem with a convex PWL objective in both \(\varvec{x}\) and \(\varvec{z}\). For problems without adaptive variables in the objective the following result immediately follows.

Corollary 1

If \(\varvec{d}=\varvec{0}\), a solution \(\varvec{x}^{*}\) is PARO to (3) if and only if it is PRO to

$$\begin{aligned} \min _{\varvec{x} \in {\mathcal {X}}_{\text {FME}}} \max _{\varvec{z} \in U} \varvec{c}(\varvec{z})^{\top }\varvec{x}. \end{aligned}$$

Proof

This directly follows from plugging in \(\varvec{d}=\varvec{0}\) in the proof of Lemma 2. \(\square \)

We can now prove one of our main results: existence of a PARO \(\varvec{x}\) for any ARO problem of form (3) with compact feasible region. Our proof uses Lemma 2 and essentially proves existence of a PRO solution to (8).

Theorem 1

If \({\mathcal {X}}\) is compact, there exists a PARO \(\varvec{x}\) to (3).

Proof

See Appendix B.4. \(\square \)

Note that the theorem also holds if \({\mathcal {X}}\) restricts (some elements of) \(\varvec{x}\) to be integer-valued. The boundedness assumption on \({\mathcal {X}}\) cannot be relaxed, because in that case a PRO solution to (8) need not exist. For example, consider the static RO problem \(\max _{x\ge 0}\min _{z \in [0,1]} xz\). The worst-case scenario is \(z=0\), and any \(x\ge 0\) is worst-case optimal. In any other scenario \(z >0\), higher x is better. Any x is dominated by \(x+\epsilon \) with \(\epsilon >0\), and there is no PRO solution.

3.2 Existence of a PARO piecewise linear decision rule

Now that existence of a PARO \(\varvec{x}\) is established, we investigate the structure of decision rule \(\varvec{y}(\cdot )\). We illustrate via an example that for any ARF \(\varvec{x}\) there exists a PWL PARO extension \(\varvec{y}(\cdot )\).

Example 3

Consider the following ARO problem:

$$\begin{aligned} \min _{x,y(\cdot )}~&~ \max _{z \in [0,1]} x - y_1(z) + y_2(z), \end{aligned}$$
(9a)
$$\begin{aligned} \text {s.t.}~&~ x - y_2(z) \le -z - \frac{1}{2}, \quad \forall z \in [0,1], \end{aligned}$$
(9b)
$$\begin{aligned} ~&~ -x + y_1(z) + y_2(z) \le z + 2,\quad \forall z \in [0,1], \end{aligned}$$
(9c)
$$\begin{aligned} ~&~ 1 \le y_1(z) \le 2,\quad \forall z \in [0,1], \end{aligned}$$
(9d)
$$\begin{aligned} ~&~ \frac{3}{2} \le y_2(z) \le 2,\quad \forall z \in [0,1]. \end{aligned}$$
(9e)

Using FME, we first eliminate \(y_1(z)\) and subsequently eliminate \(y_2(z)\):

$$\begin{aligned} 1 \le y_1(z) \le \min \big \lbrace 2, x-y_2(z) + z + 2 \big \rbrace , \end{aligned}$$
(10a)
$$\begin{aligned} \max \left\{ \frac{3}{2}, x + z + \frac{1}{2} \right\} \le y_2(z) \le \min \big \lbrace 2, x+z+1 \big \rbrace . \end{aligned}$$
(10b)

From the bounds on \(y_2\), it can be concluded that the unique ARF (and hence ARO) Stage-1 solution is \(x^{*} = \frac{1}{2}\). This leads to the following bounds on \(y_1(z)\) and \(y_2(z)\):

$$\begin{aligned} 1 \le&y_1(z) \le \min \left\{ 2, z - y_2(z) + \frac{5}{2} \right\} , \end{aligned}$$
(11a)
$$\begin{aligned} \max \left\{ \frac{3}{2}, 1 + z \right\} \le&y_2(z) \le \min \left\{ 2, \frac{3}{2} + z \right\} . \end{aligned}$$
(11b)

Variables \(y_1(z)\) and \(y_2(z)\) have not been eliminated in the objective. Therefore, any decision rule satisfying (11) is ARF to (9) but need not be ARO or PARO.

Variable \(y_1(z)\) does not appear in the bounds of \(y_2(z)\), so we can consider its individual contribution to the objective value. The objective coefficient of \(y_1(z)\) is negative, so for any z (including the worst case) the best possible contribution of \(y_1(z)\) to the objective value is achieved if we set \(y_1(z)\) equal to its upper bound. Therefore, for the given \(x^{*}\), the following choice of \(y_1(z)\), written as a function of \(y_2(z)\), is optimal in every scenario:

$$\begin{aligned} y_1^{*}(z) = \min \left\{ 2, z - y_2(z) + \frac{5}{2} \right\} . \end{aligned}$$

Now that \(y_1(z)\) is eliminated in the objective, it remains to find the optimal decision rule for \(y_2(z)\). Variable \(y_2(z)\) now appears directly in the objective (9a) and through its occurrence in the decision rule \(y_1^{*}(z)\). For fixed z, the optimal \(y_2(z)\) is determined by solving

$$\begin{aligned} \min _{y_2(z)}&- \min \left\{ 2, z - y_2(z) + \frac{5}{2} \right\} + y_2(z), \\ \text {s.t.}~&~ \max \left\{ \frac{3}{2}, 1 + z \right\} \le y_2(z) \le \min \left\{ 2, \frac{3}{2} + z \right\} . \end{aligned}$$

One can easily see that the objective is increasing in \(y_2(z)\), so for any z the best possible contribution of \(y_2(z)\) to the objective value is achieved if we set \(y_2(z)\) equal to its lower bound. Thus, for the given \(x^{*}\), we have the following PWL PARO extension:

$$\begin{aligned} y_2^{*}(z) = \max \left\{ \frac{3}{2}, 1 + z \right\} . \end{aligned}$$

Note that substituting a PWL function into a PWL function preserves the piecewise linear structure. Therefore, we also obtain the following PWL PARO extension for \(y_1^{*}(z)\):

$$\begin{aligned} y_1^{*}(z) = \min \left\{ 2, z - \max \left\{ \frac{3}{2}, 1 + z \right\} + \frac{5}{2}\right\} . \end{aligned}$$

Note that we did not move the adaptive variables in the objective to the constraints using an epigraph variable, as was done in Example 2. Using an epigraph variable for the objective ensures that each decision rule satisfying the bounds is worst-case optimal, but it precludes comparing performance in other scenarios. Computationally, of course, the epigraph approach has the major advantage that the problem remains a linear program. \(\square \)
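The closed-form rules derived in Example 3 can be verified numerically. The following minimal Python sketch compares, for a grid of scenarios z, the objective attained by \(y_1^{*}(z)\) and \(y_2^{*}(z)\) with the scenario-wise optimum obtained by re-solving the problem in \((y_1, y_2)\) for fixed \(x^{*} = 1/2\); the grid and the use of scipy are illustrative assumptions.

```python
# Minimal sketch verifying the decision rules of Example 3 on a grid of scenarios.
import numpy as np
from scipy.optimize import linprog

x_star = 0.5

def rule(z):
    y2 = max(1.5, 1.0 + z)                      # y2*(z) from Example 3
    y1 = min(2.0, z - y2 + 2.5)                 # y1*(z) from Example 3
    return y1, y2

for z in np.linspace(0.0, 1.0, 11):
    # scenario LP in (y1, y2): min x* - y1 + y2 s.t. (9b)-(9e) for this fixed z
    res = linprog(c=[-1.0, 1.0],
                  A_ub=[[1.0, 1.0]], b_ub=[z + 2.5],       # (9c): y1 + y2 <= x* + z + 2
                  bounds=[(1.0, 2.0),                      # (9d)
                          (max(1.5, z + 1.0), 2.0)])       # (9e) and (9b): y2 >= x* + z + 1/2
    y1, y2 = rule(z)
    assert abs((x_star - y1 + y2) - (x_star + res.fun)) < 1e-6
print("closed-form rules match the scenario-wise optimum on the grid")
```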

Bemporad et al. [1] show worst-case optimality of PWL decision rules for polyhedral uncertainty in the right-hand side, i.e., ARO PWL decision rules in our terminology. Zhen et al. [30, Theorem 3] generalize this to problems of form (3) with particular assumptions on the uncertainty set. These decision rules are general PWL in \(\varvec{z}\) for all variables \(y_j\), \(j\ne l\), where \(y_l\) is the last variable eliminated in the FME procedure; the decision rule for \(y_l\) is convex or concave PWL in \(\varvec{z}\). These results solely consider the performance of PWL decision rules in the worst case. Example 3 illustrates that for any ARF \(\varvec{x}\) there exists a PWL PARO extension \(\varvec{y}(\cdot )\). The lemma below formalizes this claim.

Lemma 3

For any \(\varvec{x}\) that is ARF to (3) there exists a PARO extension \(\varvec{y}(\varvec{z})\) that is PWL in \(\varvec{z}\).

We present two proofs of Lemma 3: one via FME, using the idea of Example 3, and one via basic solutions in linear optimization.

Proof of Lemma 3 via FME

See Appendix B.5. \(\square \)

Proof of Lemma 3 via linear optimization

See Appendix B.6. \(\square \)

In both proofs the constructed decision rule is in fact optimal for all scenarios in the uncertainty set. For fixed \(\varvec{x}\), this is indeed necessary for a PARO extension. The following theorem establishes the existence of PARO PWL decision rules.

Theorem 2

If \({\mathcal {X}}\) is compact, there exists a PARO \(\varvec{y}(\cdot )\) for (3) such that \(\varvec{y}(\varvec{z})\) is PWL in \(\varvec{z}\).

Proof

According to Theorem 1 there exists a PARO \(\varvec{x}\), and according to Lemma 3 there exists a PARO extension \(\varvec{y}(\cdot )\) for this \(\varvec{x}\) that is PWL in \(\varvec{z}\). It immediately follows that \(\varvec{y}(\cdot )\) is PARO. \(\square \)

4 Constructing PARO solutions—special cases

The methods used in the existence proofs of Sect. 3 are not computationally tractable and thus provide little guidance for finding PARO solutions in practice. In this section, we discuss several practical methods for finding PARO solutions for special cases of problem (3). Several results assume that \(\varvec{d} = \varvec{0}\), i.e., that Stage-2 variables do not appear in the objective. This can be a limiting assumption, although it is satisfied in applications where ARO is used for maintaining robust feasibility [17, 24]. In the next section, a general methodology that does not rely on this assumption is presented.

First, we consider special cases of problem (3) where particular decision rule structures guarantee PARO solutions in case \(\varvec{d} = \varvec{0}\). Subsequently, we show how for fixed \(\varvec{x}\) we can check whether \(\varvec{y}(\cdot )\) is a PARO extension. After that, we consider an application of the finite subset approach of Hadjiyiannis et al. [15]. Lastly, we present a method for finding a PARO solution if \(\varvec{d} = \varvec{0}\) and a convex hull description of the uncertainty set is available.

4.1 Known worst-case optimal decision rules

In Appendix A, using an FME lens, we analyze various decision rule structures, considering special uncertainty set forms, including so-called constraintwise, hybrid and block forms. This enables us to prove that an ARF decision rule with a particular structure exists for every ARF \(\varvec{x}\), instead of solely proving it is optimal for an ARO \(\varvec{x}\). For example, for ARO problems with hybrid uncertainty, for any ARF Stage-1 decision there exists an ARF decision rule that depends only on the non-constraintwise uncertain parameter. It turns out that, in case there are no adaptive variables in the objective, PRO solutions to the static problem obtained after plugging in that decision rule structure are PARO solutions to the original ARO problem. To formalize this, let \(\varvec{y}(\varvec{z}) = f_{\varvec{w}}(\varvec{z})\) be a decision rule with known form f (e.g., linear or quadratic) and a finite number of parameters \(\varvec{w} \in {\mathbb {R}}^p\), such that \(f_{\varvec{w}}(\varvec{z}) \in {\mathcal {R}}^{L,n_y}\) for any \(\varvec{w}\).

Theorem 3

Let P denote an ARO problem of form (3) with \(\varvec{d} = \varvec{0}\) and where for any ARF \(\varvec{x}\) there exists an ARF decision rule of form \(\varvec{y}^{*}(\varvec{z}) = f_{\varvec{w}}(\varvec{z})\) for some \(\varvec{w}\). Then any \(\varvec{x}^{*}\) that is PRO to the static robust optimization problem obtained after plugging in decision rule structure \(f_{\varvec{w}}(\varvec{z})\) is PARO to P.

Proof

See Appendix B.7. \(\square \)

Due to Lemmas A.1 to A.3, the following result immediately follows for hybrid, block and simplex uncertainty.

Corollary 2

  (i) Let \(P_{\text {hybrid}}\) denote an ARO problem of form (3) with \(\varvec{d}=\varvec{0}\) and hybrid uncertainty. Let Q denote the static robust optimization problem obtained from \(P_{\text {hybrid}}\) by plugging in a decision rule structure that depends only on the non-constraintwise parameter. Any \(\varvec{x}^{*}\) that is PRO to Q is PARO to \(P_{\text {hybrid}}\).

  (ii) Let \(P_{\text {block}}\) denote an ARO problem of form (3) with \(\varvec{d}=\varvec{0}\) and block uncertainty. Let Q denote the static robust optimization problem obtained from \(P_{\text {block}}\) by plugging in a decision rule structure where adaptive variable \(\varvec{y}_{(v)}^{*}(\cdot )\) depends only on \(\varvec{z}_{(v)}\) for all \(v=1,\dotsc ,V\). Then any \(\varvec{x}^{*}\) that is PRO to Q is PARO to \(P_{\text {block}}\).

  (iii) Let \(P_{\text {simplex}}\) denote an ARO problem of form (3) with \(\varvec{d}=\varvec{0}\) and a simplex uncertainty set, i.e., \(U = \text {Conv}(\varvec{z}^1,\dotsc ,\varvec{z}^{L+1})\), with \(\varvec{z}^{j} \in {\mathbb {R}}^{L}\) such that \(\varvec{z}^1,\dotsc ,\varvec{z}^{L+1}\) are affinely independent. Let Q denote the static robust optimization problem obtained from \(P_{\text {simplex}}\) by plugging in an LDR structure. Then any \(\varvec{x}^{*}\) that is PRO to Q is PARO to \(P_{\text {simplex}}\).

Proof

See Appendix B.8. \(\square \)

The case with constraintwise uncertainty is a special case of Corollary 2(i). The case with one uncertain parameter is a special case of Corollary 2(iii). Note that, unlike for worst-case optimization in Appendix A, it is necessary that \(\varvec{d}=\varvec{0}\), because our definition of PRO involves the term \(\varvec{d}\). If \(\varvec{d} \ne \varvec{0}\), the results above do not hold. This is also illustrated in Example 1 in Sect. 1.

The results of Corollary 2 can be combined. For example, for problems with both simplex uncertainty and hybrid uncertainty, Corollary 2(i) and Corollary 2(iii) together imply that one needs to consider only decision rules that are affine in the non-constraintwise parameter, if there are no adaptive variables in the objective. Simplex uncertainty sets arise in a variety of applications and can be used to approximate other uncertainty sets [4].

4.2 Check whether a decision rule is a PARO extension

If the Stage-1 decision \(\varvec{x}\) is fixed, one can verify whether the decision rule \(\varvec{y}\) is a PARO extension (Definition 8) as follows.

Lemma 4

Let \((\varvec{x}^{*}, \varvec{y}^{*}(\cdot ))\) be an ARO solution to (3). Consider the problem

$$\begin{aligned} \max _{\varvec{z},\varvec{y}} ~&~ \varvec{d}^{\top } (\varvec{y}^{*}(\varvec{z}) - \varvec{y}), \end{aligned}$$
(12a)
$$\begin{aligned} \text {s.t.}~&~ \varvec{A}(\varvec{z}) \varvec{x}^{*} + \varvec{B}\varvec{y} \le \varvec{r}(\varvec{z}), \end{aligned}$$
(12b)
$$\begin{aligned} ~&~ \varvec{z} \in U. \end{aligned}$$
(12c)

If the optimal objective value is zero, \(\varvec{y}^{*}(\cdot )\) is a PARO extension of \(\varvec{x}^{*}\). If the optimal objective value is positive, then \(\varvec{y}^{*}(\cdot )\) is not a PARO extension of \(\varvec{x}^{*}\) and the suboptimality of \(\varvec{y}^{*}(\cdot )\) is bounded by the optimal objective value.

Proof

See Appendix B.9. \(\square \)

If the optimal value to (12) is positive and \((\varvec{{\bar{z}}},\varvec{{\bar{y}}})\) denotes an optimal solution to (12), then \(\varvec{{\bar{z}}}\) is a scenario where the suboptimality bound is attained, and \(\varvec{{\bar{y}}}\) is an optimal decision for this scenario. Also, note that if the optimal value of (12) equals zero, the pair \((\varvec{x}^{*}, \varvec{y}^{*}(\cdot ))\) need not be PARO; there may be a different pair \((\varvec{{\hat{x}}}, \varvec{{\hat{y}}}(\cdot ))\) that dominates the current pair.
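The check of Lemma 4 can be approximated in practice by sampling scenarios. The following minimal Python sketch, using the data of Example 1 with the assumptions \(\delta = 1\) and \(x^{*} = 25\), evaluates the inner LP of (12) on a grid of scenarios and records the gap \(\varvec{d}^{\top }\varvec{y}^{*}(\varvec{z}) - \min _{\varvec{y}} \varvec{d}^{\top }\varvec{y}\); because only a finite grid is scanned, the result is a lower bound on the optimal value of (12).

```python
# Minimal sketch of a sampled version of the check in Lemma 4 (Example 1, delta = 1).
import itertools
import numpy as np
from scipy.optimize import linprog

x_star = 25.0
rules = {
    "static   y = 35": lambda d1, d2: 35.0,
    "adaptive y*(z) ": lambda d1, d2: max(20.0, d1 - x_star, d2 - x_star),
}
grid = list(itertools.product(np.linspace(50, 60, 11), repeat=2))

for name, rule in rules.items():
    worst_gap = 0.0
    for d1, d2 in grid:
        # inner LP of (12): min y s.t. x* + y >= d1, x* + y >= d2, 20 <= y <= 40
        res = linprog([1.0], A_ub=[[-1.0], [-1.0]],
                      b_ub=[x_star - d1, x_star - d2], bounds=[(20, 40)])
        worst_gap = max(worst_gap, rule(d1, d2) - res.x[0])
    print(name, "sampled lower bound on (12):", worst_gap)
# Expected: 0.0 for the adaptive rule (consistent with it being a PARO extension of
# x* = 25) and 10.0 for the static rule (it is dominated, e.g., at scenario (50, 50)).
```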

4.3 Unique ARO solution on finite subset of scenarios is PARO

The finite subset approach of Hadjiyiannis et al. [15] can be used in a PARO setting as well. If, among the ARO Stage-1 solutions of the original problem, only one can be extended to an optimal solution of the lower bound problem, then that solution is PARO to the original problem. This is formalized in Lemma 5.

Lemma 5

Let \(S = \{\varvec{z}^1,\dotsc ,\varvec{z}^N\}\) denote a finite set of scenarios, \(S \subseteq U\). Let \(\varvec{x}^{*}\) be the unique ARO Stage-1 solution for which there exist \(\varvec{y}^{1*},\dotsc ,\varvec{y}^{N*}\) such that \((\varvec{x}^{*},\varvec{y}^{1*},\dotsc ,\varvec{y}^{N*})\) is an optimal solution to

$$\begin{aligned} \min _{\varvec{x},\varvec{y}^1,\dotsc ,\varvec{y}^N} ~&~ \max _{i=1,\dotsc ,N} \{\varvec{c}(\varvec{z}^i)^{\top } \varvec{x} + \varvec{d}^{\top } \varvec{y}^i\}, \end{aligned}$$
(13a)
$$\begin{aligned} \text {s.t.}~&~ \varvec{A}(\varvec{z}^i) \varvec{x} + \varvec{B} \varvec{y}^i \le \varvec{r}(\varvec{z}^i),~\forall i=1,\dotsc ,N. \end{aligned}$$
(13b)

Then \(\varvec{x}^{*}\) is PARO to (3).

Proof

Let \((\varvec{{\bar{x}}}, \varvec{{\bar{y}}}(\cdot ))\) be ARO to (3) with \(\varvec{{\bar{x}}}\) unequal to \(\varvec{x}^{*}\). Then the solution \((\varvec{{\bar{x}}},\varvec{{\bar{y}}}(\varvec{z}^1),\dotsc ,\varvec{{\bar{y}}}(\varvec{z}^N))\) is feasible to (13). Because \(\varvec{x}^{*}\) is the unique ARO Stage-1 solution that can be extended to an optimal solution of (13), it holds that

$$\begin{aligned} \varvec{c}(\varvec{z}^i)^{\top } \varvec{x}^{*} + \varvec{d}^{\top } \varvec{y}^{i*} < \varvec{c}(\varvec{z}^i)^{\top } \varvec{{\bar{x}}} + \varvec{d}^{\top } \varvec{{\bar{y}}}(\varvec{z}^i) \text { for some } \varvec{z}^i \in S. \end{aligned}$$

That is, for each \(\varvec{{\bar{x}}}\) that is ARO to (3) and unequal to \(\varvec{x}^{*}\) there is at least one scenario \(\varvec{z}^i\) in U for which \(\varvec{x}^{*}\) outperforms \(\varvec{{\bar{x}}}\). This implies that \(\varvec{x}^{*}\) is PARO to (3). \(\square \)

Objective (13a) optimizes for the worst-case scenario \(\varvec{z}^i\) in set S, and (13b) ensures that \(\varvec{x}^{*}\) is feasible for all scenarios in S. It should be noted that requiring \(\varvec{x}^{*}\) to be both ARO to (3) and a unique optimal solution to (13) is quite restrictive.
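The lower bound problem (13) is an ordinary LP, and the uniqueness requirement of Lemma 5 can be probed by computing the range of Stage-1 values over its optimal face. The following minimal Python sketch does this for the data of Example 1 with the assumption \(\delta = 1\) and the scenario subset \(S = \{(60,60),(50,50)\}\); here the Stage-1 optimum is not unique, so Lemma 5 does not certify a PARO solution for this subset.

```python
# Minimal sketch: solve (13) for a finite scenario subset S (Example 1, delta = 1)
# and compute the range of Stage-1 values x over its (near-)optimal solutions.
import numpy as np
from scipy.optimize import linprog

S = [(60.0, 60.0), (50.0, 50.0)]                  # finite scenario subset of U
N = len(S)
n = 1 + N + 1                                     # variables (x, y^1, ..., y^N, mu)

A_ub, b_ub = [], []
for i, (d1, d2) in enumerate(S):
    yi = 1 + i
    row = np.zeros(n); row[0] = 1; row[yi] = 1; row[-1] = -1
    A_ub.append(row); b_ub.append(0.0)            # epigraph of (13a): x + y^i <= mu
    for d in (d1, d2):
        row = np.zeros(n); row[0] = -1; row[yi] = -1
        A_ub.append(row); b_ub.append(-d)         # (13b): x + y^i >= d
bounds = [(20, 40)] * (1 + N) + [(None, None)]

c_mu = np.zeros(n); c_mu[-1] = 1.0
LB = linprog(c_mu, A_ub=A_ub, b_ub=b_ub, bounds=bounds).fun
print("optimal value of (13):", LB)               # here 60, a lower bound on OPT

A_ub.append(c_mu); b_ub.append(LB + 1e-6)         # restrict to the optimal face of (13)
c_x = np.zeros(n); c_x[0] = 1.0
x_min = linprog(c_x, A_ub=A_ub, b_ub=b_ub, bounds=bounds).x[0]
x_max = linprog(-c_x, A_ub=A_ub, b_ub=b_ub, bounds=bounds).x[0]
print("Stage-1 range over optimal solutions of (13): [%.0f, %.0f]" % (x_min, x_max))
# Prints [20, 40]: the Stage-1 optimum is not unique, so Lemma 5 is not applicable here.
```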

4.4 Convex hull description of scenario set

Next, consider the case where the uncertainty set is given by the convex hull of a finite set of points, i.e., \(U = \text {Conv}(\varvec{z}^1,\dotsc ,\varvec{z}^N)\). Additionally assume that there are no adaptive variables in the objective. Then (3) is equivalent to

$$\begin{aligned} \min _{\varvec{x},\varvec{y}^1,\dotsc ,\varvec{y}^N} ~&~ \max _{i=1,\dotsc ,N} \varvec{c}(\varvec{z}^i)^{\top } \varvec{x}, \end{aligned}$$
(14a)
$$\begin{aligned} \text {s.t.}~&~ \varvec{A}(\varvec{z}^i)\varvec{x} + \varvec{B}\varvec{y}^i \le \varvec{r}(\varvec{z}^i),~\forall i=1,\dotsc ,N. \end{aligned}$$
(14b)

Analogous to Iancu and Trichakis [16], after finding an ARO solution we can perform an additional step: we optimize over the set of ARO solutions for a scenario in the relative interior (denoted by \(\text {ri}(\cdot )\)) of the convex hull of our finite set of scenarios. This produces a PARO Stage-1 solution to (3).

Lemma 6

Let \(\varvec{d}=\varvec{0}\). Let \(U = \text {Conv}(\varvec{z}^1,\dotsc ,\varvec{z}^N)\), \(\varvec{{\bar{z}}} \in \text {ri}(U)\) and let \((\varvec{x}^{*},\varvec{y}^{1*},\dotsc ,\varvec{y}^{N*})\) denote an optimal solution to

$$\begin{aligned} \min _{\varvec{x},\varvec{y}^1,\dotsc ,\varvec{y}^N} ~&~ \varvec{c}(\varvec{{\bar{z}}})^{\top } \varvec{x},\end{aligned}$$
(15a)
$$\begin{aligned} \text {s.t.}~&~ \varvec{A}(\varvec{z}^i) \varvec{x} + \varvec{B} \varvec{y}^i \le \varvec{r}(\varvec{z}^i),\quad \forall i=1,\dotsc ,N, \end{aligned}$$
(15b)
$$\begin{aligned} ~&~ \varvec{c}(\varvec{z^i})^{\top } \varvec{x} + \varvec{d}^{\top } \varvec{y}^i \le \text {OPT},\quad \forall i=1,\dotsc ,N, \end{aligned}$$
(15c)

where OPT denotes the optimal objective value of (14). Then \(\varvec{x}^{*}\) is PARO to (3).

Proof

See Appendix B.10. \(\square \)
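The two-step procedure of Lemma 6 amounts to solving two LPs. The following minimal Python sketch illustrates it on a small made-up instance with \(\varvec{d} = \varvec{0}\): \(U = \text {Conv}\{0,1\}\), Stage-1 variables \(\varvec{x} = (x_1, x_2) \ge \varvec{0}\) with cost \(\varvec{c}(z)^{\top }\varvec{x} = x_1 + z x_2\), one adaptive variable \(y(z) \in [0,1]\) and the single constraint \(x_1 + x_2 + y(z) \ge 1 + z\). All instance data are assumptions made purely for illustration; the construction itself follows (14)-(15).

```python
# Minimal sketch of Lemma 6 on a made-up instance with d = 0 (all data illustrative).
import numpy as np
from scipy.optimize import linprog

Z = [0.0, 1.0]                                   # extreme points of U = Conv{0, 1}
N = len(Z)
n = 2 + N + 1                                    # variables (x1, x2, y^1, y^2, mu)

# Step 1: solve (14) in epigraph form, minimizing mu.
c1 = np.zeros(n); c1[-1] = 1.0
A_ub, b_ub = [], []
for i, z in enumerate(Z):
    row = np.zeros(n); row[0] = 1; row[1] = z; row[-1] = -1
    A_ub.append(row); b_ub.append(0.0)           # c(z^i)^T x <= mu
    row = np.zeros(n); row[0] = -1; row[1] = -1; row[2 + i] = -1
    A_ub.append(row); b_ub.append(-(1.0 + z))    # x1 + x2 + y^i >= 1 + z
bounds = [(0, None)] * 2 + [(0, 1)] * N + [(None, None)]
OPT = linprog(c1, A_ub=A_ub, b_ub=b_ub, bounds=bounds).fun
print("worst-case optimum OPT of (14):", OPT)    # expected: 1.0

# Step 2: solve (15) for z_bar in ri(U), keeping worst-case optimality via (15c).
z_bar = 0.5
c2 = np.zeros(n); c2[0] = 1.0; c2[1] = z_bar     # objective c(z_bar)^T x
for z in Z:
    row = np.zeros(n); row[0] = 1; row[1] = z
    A_ub.append(row); b_ub.append(OPT + 1e-7)    # (15c) with d = 0 (small tolerance)
x_paro = linprog(c2, A_ub=A_ub, b_ub=b_ub, bounds=bounds).x[:2]
print("PARO Stage-1 solution:", x_paro)          # expected: (0, 1)
# The epigraph constraints from Step 1 remain in A_ub but are vacuous in Step 2,
# since mu is free and has zero cost.
```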

5 Constructing PARO solutions—general methodology

Adaptive robust optimization problems of form (3) are in general NP-hard [14], and finding ARO solutions is still the focus of ongoing research [27]. Thus, finding a general methodology that, given an ARO solution to (3), can produce a PARO solution is not an easy task either. In this section we present a general methodology for finding PARO solutions to ARO problems of form (3), which involves solving a sequence of bilinear optimization problems; by solving these heuristically, the method can be used to find ‘approximate’ PARO solutions in practice. We first describe the methodology; after that, we discuss how to solve the bilinear optimization problems.

5.1 Constraint & column generation procedure for finding a PARO solution

The starting point is the constraint-and-column generation (C&CG) method of [28] for finding ARO solutions; we first briefly describe this method. Let \(M \subseteq {U}\) denote a finite set of scenarios, e.g., containing only the nominal scenario. Consider the following LP:

$$\begin{aligned} (P_1(M))~~~~~~\min _{\varvec{x},\varvec{y}^i,\mu } ~&~ \mu , \end{aligned}$$
(16a)
$$\begin{aligned} \text {s.t.} ~&~ \mu \ge \varvec{c}(\varvec{z}^i)^{\top } \varvec{x} + \varvec{d}^{\top } \varvec{y}^i,\quad \forall i=1,\dotsc ,|M|, \end{aligned}$$
(16b)
$$\begin{aligned} ~&~ \varvec{A}(\varvec{z}^i)\varvec{x} + \varvec{B}\varvec{y}^i \le \varvec{r}(\varvec{z}^i),\quad \forall i=1,\dotsc ,|M|. \end{aligned}$$
(16c)

The optimal Stage-1 solution \(\varvec{{\hat{x}}}\) to \(P_1(M)\) is ARO for the set M, but not necessarily ARO (or ARF) for the true uncertainty set U. Define the subproblem

$$\begin{aligned} (Q(\varvec{{\hat{x}}})) ~~~~ \max _{\varvec{z} \in U}&~ \varvec{c}(\varvec{z})^{\top }\varvec{{\hat{x}}} + \min _{\varvec{y}} \{ \varvec{d}^{\top } \varvec{y} ~|~ \varvec{A}(\varvec{z})\varvec{{\hat{x}}} + \varvec{B}\varvec{y} \le \varvec{r}(\varvec{z}) \}, \end{aligned}$$
(17)

and let \(q(\varvec{{\hat{x}}})\) denote the optimal objective value. Either \(q(\varvec{{\hat{x}}})\) is finite with optimal solution \(\varvec{{\hat{z}}}\), or a scenario \(\varvec{{\hat{z}}}\) is identified for which the inner minimization problem is infeasible; in that case we set \(q(\varvec{{\hat{x}}})\) to \(+\infty \) by convention. In either case, the solution \(\varvec{{\hat{z}}}\) is a scenario in U where the Stage-1 solution \(\varvec{{\hat{x}}}\) to \(P_1(M)\) performs worst. If the optimal objective value of \(Q(\varvec{{\hat{x}}})\) is higher than the optimal objective value of \(P_1(M)\), then M did not contain the worst-case scenario, and \(\varvec{{\hat{x}}}\) is not ARO. That is, \(\varvec{{\hat{z}}}\) provides either a feasibility or a (worst-case) optimality cut. The scenario \(\varvec{{\hat{z}}}\) is added to M, and the procedure is repeated until both objective values are sufficiently close. Algorithm 1 provides the pseudocode; see Zeng and Zhao [28] for more details.

Algorithm 1 C&CG method of [28]

Upon termination of Algorithm 1 after k iterations, the result is an ARO solution \(\varvec{x}^{\text {ARO}}:= \varvec{x}^k\); additionally, let \(M^{\text {ARO}}:= M^k\) denote the resulting set of scenarios and let \(OPT:= \mu ^k\) denote the worst-case optimal objective value.
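For concreteness, the following minimal Python sketch runs the C&CG loop of Algorithm 1 on the data of Example 1 with the assumption \(\delta = 1\). The master problem is \(P_1(M)\) from (16); the subproblem (17) is evaluated by enumerating the vertices of the box U, which is valid here because the inner LP value is convex in \((d_1,d_2)\).

```python
# Minimal sketch of the C&CG loop (Algorithm 1) on Example 1, assuming delta = 1.
import itertools
import numpy as np
from scipy.optimize import linprog

U_vertices = list(itertools.product([50.0, 60.0], repeat=2))   # vertices of U

def master(M):
    """Solve P1(M) from (16); variables are (x, y^1, ..., y^|M|, mu)."""
    n = 1 + len(M) + 1
    c = np.zeros(n); c[-1] = 1.0                      # minimize mu
    A_ub, b_ub = [], []
    for i, (d1, d2) in enumerate(M):
        yi = 1 + i
        row = np.zeros(n); row[0] = 1; row[yi] = 1; row[-1] = -1
        A_ub.append(row); b_ub.append(0.0)            # (16b): x + y^i <= mu
        for d in (d1, d2):
            row = np.zeros(n); row[0] = -1; row[yi] = -1
            A_ub.append(row); b_ub.append(-d)         # (16c): x + y^i >= d
    bounds = [(20, 40)] * (1 + len(M)) + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[0], res.x[-1]                        # (x_hat, mu)

def subproblem(x_hat):
    """Q(x_hat) from (17): worst-case scenario and value for a fixed Stage-1 decision."""
    best_val, best_z = -np.inf, None
    for (d1, d2) in U_vertices:
        # inner LP: min y s.t. x_hat + y >= d1, x_hat + y >= d2, 20 <= y <= 40
        res = linprog([1.0], A_ub=[[-1.0], [-1.0]],
                      b_ub=[x_hat - d1, x_hat - d2], bounds=[(20, 40)])
        val = np.inf if res.status != 0 else x_hat + res.x[0]
        if val > best_val:
            best_val, best_z = val, (d1, d2)
    return best_val, best_z

M = [(55.0, 55.0)]                                    # start from a nominal scenario
for k in range(10):
    x_hat, mu = master(M)
    q, z_hat = subproblem(x_hat)
    print(f"iter {k}: x = {x_hat:.1f}, master obj = {mu:.1f}, subproblem obj = {q:.1f}")
    if q <= mu + 1e-6:
        break                                         # x_hat is ARO; stop
    M.append(z_hat)
# Typically terminates once the scenario (60, 60) has been added and mu reaches 60.
```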

A PARO solution can be found by appending another C&CG procedure to Algorithm 1. In each iteration k of the second C&CG procedure, we find a candidate Stage-1 solution \(\varvec{x}^c\) that satisfies three conditions:

Condition 1

  (i) \(\varvec{x}^c\) is feasible for all scenarios \(\varvec{z}^l \in M^k\).

  (ii) \(\varvec{x}^c\) results in an objective value of at most \(v(\varvec{x}^k,\varvec{z}^l)\) for all scenarios \(\varvec{z}^l \in M^k\), with

    $$\begin{aligned} v(\varvec{x}^k,\varvec{z}^l) = {\left\{ \begin{array}{ll} \text {OPT} &{} \forall \varvec{z}^l \in M^{\text {ARO}} \\ \varvec{c}(\varvec{z}^l)^{\top }\varvec{x}^k + \min _{\varvec{y}} \{\varvec{d}^{\top } \varvec{y} | \varvec{A}(\varvec{z}^l)\varvec{x}^k + \varvec{B}\varvec{y} \le \varvec{r}(\varvec{z}^l) \} &{} \forall \varvec{z}^l \in M^k \backslash M^{\text {ARO}}, \end{array}\right. } \end{aligned}$$
    (18)

    i.e., \(\varvec{x}^c\) is feasible on the original scenario set \(M^{\text {ARO}}\), and performs at least as good as \(\varvec{x}^k\) on any scenario subsequently added to \(M^k\).

  3. (iii)

    \(\varvec{x}^c\) results in a strictly lower objective value than the current Stage 1 solution \(\varvec{x}^k\) on a new (to be determined) scenario \(\varvec{z}^c\).

This is achieved by solving the following optimization problem (Footnote 2):

$$\begin{aligned} \min _{\begin{array}{c} \varvec{z}^c,\varvec{x}^c,\varvec{y}^c, \\ \{\varvec{y}^l\}_{l=1}^{|M^k|} \end{array}} ~&~ \max _{\varvec{y}^k:\varvec{A}(\varvec{z}^c) \varvec{x}^k + \varvec{B} \varvec{y}^k \le \varvec{r}(\varvec{z}^c)} (\varvec{c}(\varvec{z}^c)^{\top }\varvec{x}^c + \varvec{d}^{\top }\varvec{y}^c) - (\varvec{c}(\varvec{z}^c)^{\top }\varvec{x}^k + \varvec{d}^{\top }\varvec{y}^k), \end{aligned}$$
(19a)
$$\begin{aligned} \text {s.t.}~&~ \varvec{A}(\varvec{z}^c) \varvec{x}^c + \varvec{B} \varvec{y}^c \le \varvec{r}(\varvec{z}^c), \end{aligned}$$
(19b)
$$\begin{aligned} (P_2(\varvec{x}^k,M^k))\quad ~&~ \varvec{A}(\varvec{z}^l) \varvec{x}^c + \varvec{B} \varvec{y}^l \le \varvec{r}(\varvec{z}^l),\quad \forall l=1,\dotsc ,|M^k|, \end{aligned}$$
(19c)
$$\begin{aligned} ~&~ \varvec{c}(\varvec{z}^l)^{\top } \varvec{x}^c + \varvec{d}^{\top } \varvec{y}^l \le v(\varvec{x}^k,\varvec{z}^l),\quad \forall l=1,\dotsc ,|M^k|, \end{aligned}$$
(19d)
$$\begin{aligned} ~&~ \varvec{z}^c \in U. \end{aligned}$$
(19e)

Constraints (19c) and (19d) enforce Conditions 1(i) and 1(ii), respectively. The objective aims to find the scenario \(\varvec{z}^c\) where the candidate solution \(\varvec{x}^c\) most outperforms \(\varvec{x}^k\). The inner maximization problem finds the optimal recourse decision \(\varvec{y}^k\) for the current solution \(\varvec{x}^k\) at \(\varvec{z}^c\). Constraint (19b) ensures that \(\varvec{x}^c\) is feasible for the new scenario \(\varvec{z}^c\). Note that setting \(\varvec{x}^c = \varvec{x}^k\) (with \(\varvec{y}^c\) equal to the corresponding optimal recourse decision) is feasible and yields objective value 0, so \(P_2(\varvec{x}^k,M^k)\) is always feasible and has a nonpositive optimal objective value.

If the optimal objective value of (19) is strictly negative, then candidate solution \(\varvec{x}^c\) satisfies Conditions 1(i) to 1(iii), i.e., it dominates \(\varvec{x}^k\) on the set \(M^k \cup \{\varvec{z}^c\}\). However, because \(M^k\) is only a subset of U, a candidate solution \(\varvec{x}^c\) is not necessarily ARO (or even ARF). Thus, we again solve problem \(Q(\varvec{x}^c)\); let \(q(\varvec{x}^c)\) denote its objective value and \(\varvec{z}^q\) the optimizer. We distinguish two cases:

  • \(q(\varvec{x}^c) \le \text {OPT}\). Candidate solution \(\varvec{x}^c\) is an ARO solution that dominates \(\varvec{x}^k\) on the set \(M^k \cup \{\varvec{z}^c\}\). For the next iteration of the C&CG procedure we update the current best solution: \(\varvec{x}^{k+1} = \varvec{x}^c\). The new set of scenarios is \(M^{k+1} = M^k \cup \{\varvec{z}^c\}\). Further improvements over \(\varvec{x}^c\) may be possible, so it is not necessarily PARO. In preparation for the next iteration, compute \(v(\varvec{x}^{k+1},\varvec{z}^l)\) for all \(l=1,\dotsc ,|M^{k+1}|\).

  • \(q(\varvec{x}^c) > \text {OPT}\). Candidate solution \(\varvec{x}^c\) is not ARO, so we set \(\varvec{x}^{k+1} = \varvec{x}^k\). For the next iteration we set \(M^{k+1} = M^k \cup \{\varvec{z}^q\}\), to cut off solution \(\varvec{x}^c\) in the next iteration. In preparation for the next iteration, compute \(v(\varvec{x}^{k+1},\varvec{z}^q)\).

The above procedure is repeated until the optimal objective value of (19) is nonnegative (i.e., equal to zero, since it is always nonpositive). Algorithm 2 describes the resulting C&CG algorithm and Lemma 7 proves that it yields a PARO Stage-1 solution.

Algorithm 2 (pseudocode figure): C&CG method for finding a PARO solution

Note that multiple PARO solutions may exist; depending on the input solution \(\varvec{x}^{\text {ARO}}\), Algorithm 2 may generate a different PARO solution.

Lemma 7

A solution \(\varvec{x}^{\text {PARO}}\) obtained from Algorithm 2 is PARO to (3).

Proof

See Appendix B.11. \(\square \)

In the special case that the vertices of U can be explicitly enumerated, the vertex set can be used to initialize the scenario set M. In this case, any candidate solution \(\varvec{x}^c\) is feasible and attains an objective value of at most OPT on all vertices of U, so it is guaranteed to be ARO. Consequently, there is no need to solve subproblem \(Q(\varvec{x}^c)\) in each iteration, and Algorithm 2 can be simplified. Note that this special case is the same as the case discussed in Sect. 4.4, but with \(\varvec{d} \ne \varvec{0}\).

5.2 Hints for solving the bilinear optimization problems

Algorithm 2 requires solving both problems \(P_2\) and Q in each iteration; Q is also solved in each iteration of Algorithm 1. Unfortunately, both are intractable in general. The reason is that for both problems the feasible region of the inner optimization problem depends on the variables of the outer optimization problem. For Q, dualizing the inner minimization problem results in the following bilinear problem:

$$\begin{aligned} \max _{\varvec{z} \in U,~\varvec{\lambda }\le \varvec{0}}&~ \varvec{c}(\varvec{z})^{\top }\varvec{x} + \varvec{\lambda }^{\top }\big (\varvec{r}(\varvec{z}) - \varvec{A}(\varvec{z})\varvec{x}\big ) \end{aligned}$$
(20a)
$$\begin{aligned} \text {s.t.}~~~&~ \varvec{B}^{\top }\varvec{\lambda } = \varvec{d}. \end{aligned}$$
(20b)
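For completeness, the duality step behind (20) is standard LP duality applied to the inner recourse problem for fixed \(\varvec{x}\) and \(\varvec{z}\) (assuming that LP is feasible and bounded):

$$\begin{aligned} \min _{\varvec{y}} \{ \varvec{d}^{\top } \varvec{y} ~|~ \varvec{B}\varvec{y} \le \varvec{r}(\varvec{z}) - \varvec{A}(\varvec{z})\varvec{x} \} = \max _{\varvec{\lambda } \le \varvec{0}} \{ \varvec{\lambda }^{\top } \big (\varvec{r}(\varvec{z}) - \varvec{A}(\varvec{z})\varvec{x}\big ) ~|~ \varvec{B}^{\top }\varvec{\lambda } = \varvec{d} \}. \end{aligned}$$

Substituting this maximum into (17) and merging the two maximizations gives (20); the bilinearity stems from the products between \(\varvec{\lambda }\) and \(\varvec{r}(\varvec{z})\), \(\varvec{A}(\varvec{z})\varvec{x}\).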

Similarly, for \(P_2\) dualizing the inner maximization problem also results in a bilinear optimization problem:

$$\begin{aligned} \min _{\begin{array}{c} \varvec{z}^c,\varvec{x}^c,\varvec{y}^c, \varvec{\lambda } \\ \{\varvec{y}^l\}_{l=1}^{|M^k|} \end{array}} ~&~ \big (\varvec{c}(\varvec{z}^c)^{\top }\varvec{x}^c + \varvec{d}^{\top }\varvec{y}^c\big ) - \Big (\varvec{c}(\varvec{z}^c)^{\top }\varvec{x}^k + \varvec{\lambda }^{\top } (\varvec{r}(\varvec{z}^c) - \varvec{A}(\varvec{z}^c)\varvec{x}^k)\Big ), \end{aligned}$$
(21a)
$$\begin{aligned} \text {s.t.}~&~ \varvec{A}(\varvec{z}^c) \varvec{x}^c + \varvec{B} \varvec{y}^c \le \varvec{r}(\varvec{z}^c), \end{aligned}$$
(21b)
$$\begin{aligned} ~&~ \varvec{A}(\varvec{z}^l) \varvec{x}^c + \varvec{B} \varvec{y}^l \le \varvec{r}(\varvec{z}^l),\quad \forall l=1,\dotsc ,|M^k|, \end{aligned}$$
(21c)
$$\begin{aligned} ~&~ \varvec{c}(\varvec{z}^l)^{\top } \varvec{x}^c + \varvec{d}^{\top } \varvec{y}^l \le v(\varvec{x}^k,\varvec{z}^l),\quad \forall l=1,\dotsc ,|M^k|, \end{aligned}$$
(21d)
$$\begin{aligned} ~&~ \varvec{z}^c \in U, \end{aligned}$$
(21e)
$$\begin{aligned} ~&~ \varvec{B}^{\top }\varvec{\lambda } = \varvec{d},~~\varvec{\lambda } \le \varvec{0}. \end{aligned}$$
(21f)

For both (20) and (21), we propose to use a simple alternating direction heuristic, also known as mountain climbing, which is guaranteed to converge to a local optimum [19].

We describe this approach in more detail for (21). For some initial \(\varvec{z}^c\) one can determine the optimal \(\{\varvec{\lambda },\varvec{x}^c,\varvec{y}^c,\varvec{y}^1,\dotsc ,\varvec{y}^{|M^k|}\}\) by solving an LP. Subsequently, we alternate between optimizing over \(\varvec{z}^c\) and over \(\{\varvec{\lambda },\varvec{x}^c,\varvec{y}^c,\varvec{y}^1,\dotsc ,\varvec{y}^{|M^k|}\}\), keeping the other set of variables fixed at its current value. For either set of variables, the resulting problem is an LP. This is continued until two consecutive LPs yield the same objective value.
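To illustrate the alternating scheme in isolation, the following is a minimal sketch for a generic bilinear program \(\min _{\varvec{z} \in Z, \varvec{w} \in W} \varvec{z}^{\top }\varvec{Q}\varvec{w} + \varvec{a}^{\top }\varvec{z} + \varvec{b}^{\top }\varvec{w}\) over boxes Z and W; all data and names are illustrative assumptions, not the actual problem (21).

```python
# Minimal sketch of the alternating ("mountain climbing") heuristic for a
# generic bilinear program  min_{z in Z, w in W}  z'Qw + a'z + b'w,
# where Z and W are boxes; the data below are illustrative only.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n_z, n_w = 4, 5
Q = rng.normal(size=(n_z, n_w))          # bilinear coupling matrix
a = rng.normal(size=n_z)                 # linear cost on z
b = rng.normal(size=n_w)                 # linear cost on w
z_bounds = [(0.0, 1.0)] * n_z            # Z = [0,1]^{n_z}
w_bounds = [(0.0, 1.0)] * n_w            # W = [0,1]^{n_w}

def solve_lp(cost, bounds):
    """Minimize a linear cost over a box; returns the optimizer."""
    return linprog(c=cost, bounds=bounds, method="highs").x

z = np.full(n_z, 0.5)                    # starting point (e.g. a 'nominal' z)
prev_obj = np.inf
for it in range(100):
    w = solve_lp(b + Q.T @ z, w_bounds)  # z fixed: LP in w
    z = solve_lp(a + Q @ w, z_bounds)    # w fixed: LP in z
    obj = z @ Q @ w + a @ z + b @ w
    if prev_obj - obj < 1e-9:            # no further improvement: local optimum
        break
    prev_obj = obj

print(f"local optimum after {it + 1} iterations, objective {obj:.4f}")
```

In the setting of (21), the role of z is played by \(\varvec{z}^c\) and the role of w by \(\{\varvec{\lambda },\varvec{x}^c,\varvec{y}^c,\varvec{y}^1,\dotsc ,\varvec{y}^{|M^k|}\}\), with the box LPs replaced by the corresponding constrained LPs.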

The solution quality of the mountain climbing procedure depends on the starting value for \(\varvec{z}^c\). One option is to simply pick the nominal scenario, if it is defined. A better starting solution can be obtained by plugging in an LDR for \(\varvec{y}^k\) in \(P_2\), i.e., by solving

$$\begin{aligned} \max _{\varvec{w},\varvec{W}}~\min _{\begin{array}{c} \varvec{z}^c,\varvec{x}^c,\varvec{y}^c, \\ \{\varvec{y}^l\}_{l=1}^{|M^k|} \end{array}} ~&~ (\varvec{c}(\varvec{z}^c)^{\top }\varvec{x}^c + \varvec{d}^{\top }\varvec{y}^c) - (\varvec{c}(\varvec{z}^c)^{\top }\varvec{x}^k + \varvec{d}^{\top }(\varvec{w} + \varvec{W}\varvec{z}^c)) \end{aligned}$$
(22a)
$$\begin{aligned} \text {s.t.}~&~ \varvec{A}(\varvec{z}^c) \varvec{x}^k + \varvec{B} (\varvec{w} + \varvec{W}\varvec{z}^c) \le \varvec{r}(\varvec{z}^c), \end{aligned}$$
(22b)

additionally subject to (19b)–(19e). This is a static linear robust optimization problem.

This mountain climbing procedure can also be used to find a local optimum to (20). A starting solution for \(\varvec{z}\) can be found by plugging in an LDR for \(\varvec{y}\) in (17), and subsequently one can alternate between solving for \(\varvec{z}\) and \(\varvec{\lambda }\) in (20).

By using a heuristic approach to solve (19), it is possible that at a certain iteration k in Algorithm 2 we obtain an estimate \({\hat{p}}_k \ge 0\) while the true \(p_k<0\), so that the algorithm terminates without finding a PARO solution. Nevertheless, solutions obtained this way, even if not proven to be PARO, can still improve upon the original ARO solution. Similarly, if (20) is not solved to optimality, we might falsely conclude that a solution is ARO. In the numerical experiments this risk is reduced by using multiple starting points for \(\varvec{z}^c\).

There are many different approaches to (approximately) solving problems (20) and (21), or equivalently, problems Q and \(P_2\); we conclude this section with some examples and references. In the case of RHS uncertainty, the following approach gives an exact solution to (19). The inner minimization problem is an LP for which we can write down the optimality conditions. Subsequently, the complementary slackness conditions can be linearized using big-M constraints and auxiliary binary variables. This yields an exact mixed-binary convex reformulation (mixed-binary linear if U is polyhedral). This reformulation was previously used in the C&CG method of Zeng and Zhao [28] for ARO problems with a polyhedral uncertainty set, and to solve bilinear optimization problems with disjoint feasible sets [32]. One can verify that in the case of RHS uncertainty, (21) indeed has disjoint polyhedral feasible regions for \(\varvec{\lambda }\) and \(\{\varvec{z}^c,\varvec{x}^c,\varvec{y}^c,\varvec{y}^1,\dotsc ,\varvec{y}^{|M^k|}\}\).
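As an illustration of this linearization, consider the inner problem \(\min _{\varvec{y}} \{\varvec{d}^{\top }\varvec{y} ~|~ \varvec{A}\varvec{x} + \varvec{B}\varvec{y} \le \varvec{r}(\varvec{z})\}\) under RHS uncertainty (so that \(\varvec{A}\) and \(\varvec{c}\) do not depend on \(\varvec{z}\)). A sketch of the resulting optimality system, with \(M_1,M_2\) assumed sufficiently large and \(\varvec{\delta }\) a vector of auxiliary binary variables (one per constraint row), is

$$\begin{aligned} \varvec{A}\varvec{x} + \varvec{B}\varvec{y} \le \varvec{r}(\varvec{z}), \qquad \varvec{B}^{\top }\varvec{\lambda } = \varvec{d}, \qquad \varvec{\lambda } \le \varvec{0}, \qquad -\varvec{\lambda } \le M_1 \varvec{\delta }, \qquad \varvec{r}(\varvec{z}) - \varvec{A}\varvec{x} - \varvec{B}\varvec{y} \le M_2 (\varvec{1} - \varvec{\delta }). \end{aligned}$$

The last two sets of constraints enforce complementary slackness, so any \((\varvec{y},\varvec{\lambda })\) satisfying this system consists of an optimal recourse decision and an optimal dual solution. Since \(\varvec{r}(\varvec{z})\) is affine in \(\varvec{z}\), the system is linear in \((\varvec{z},\varvec{y},\varvec{\lambda },\varvec{\delta })\) jointly, which is what makes the overall reformulation of (19) mixed-binary linear for polyhedral U.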

Another possible approach is the Reformulation-Perspectification-Technique proposed by Zhen et al. [31]. The advantage of this approach is that it provides both an upper and a lower bound on the optimal objective value, and it can easily be embedded in a branch-and-bound framework to find a globally optimal solution.

For alternative approaches to solving bilinear optimization problems, we refer to Konno [19], Nahapetyan [23] and Zhen et al. [32]. Some bilinear formulations can be solved directly by Gurobi 9.0 [13].

6 Numerical experiments

To demonstrate the value of PARO solutions in practice, we focus on an example problem to which (adaptive) robust optimization has been successfully applied: a facility location problem.

6.1 Problem description

Consider a strategic decision-making problem in which a number of facilities are to be opened in order to satisfy the demand of a number of customers. The goal is to choose the locations at which to open a facility such that the cost of opening the facilities plus the transportation cost of satisfying demand is minimized. We consider this problem in a two-stage setting with uncertain demand: facility opening decisions need to be made in Stage 1, before the Stage-2 demand is known.

Suppose there are n locations where a facility can be opened, and m demand locations. Let \(\varvec{x} \in \{0,1\}^n\) be the binary Stage-1 decision variable denoting the facility opening decisions. Opening facility i costs \(f_i\) and yields a capacity \(s_i\), \(i=1,\dotsc ,n\). Let \(\varvec{y} \in {\mathbb {R}}^{n \times m}\) be the Stage-2 decision variable, with \(y_{ij}\) denoting transport from facility i to demand location j; let \(c_{ij}\) denote the associated costs, \(i=1,\dotsc ,n\), \(j=1,\dotsc ,m\). Let \(z_j\) denote the uncertain demand in location j. The two-stage facility location model reads

$$\begin{aligned} \min _{\varvec{x},\varvec{y}(\cdot )} ~&~ \max _{\varvec{z} \in U}~ \sum _{i=1}^n \sum _{j =1}^m c_{ij} y_{ij}(\varvec{z}) + \sum _{i=1}^n f_i x_i, \end{aligned}$$
(23a)
$$\begin{aligned} \text {s.t.}~&~ \sum _{i =1}^n y_{ij}(\varvec{z}) \ge z_j,\quad \forall \varvec{z} \in U,~\forall j=1,\dotsc ,m, \end{aligned}$$
(23b)
$$\begin{aligned} ~&~ \sum _{j=1}^m y_{ij}(\varvec{z}) \le s_i x_i,\quad \forall \varvec{z} \in U,~\forall i =1,\dotsc ,n, \end{aligned}$$
(23c)
$$\begin{aligned} ~&~y_{ij}(\varvec{z}) \ge 0,~\forall \varvec{z} \in U,\quad \forall i=1,\dotsc ,n,~j=1,\dotsc ,m, \end{aligned}$$
(23d)
$$\begin{aligned} ~&~\varvec{x} \in \{0,1\}^n, \end{aligned}$$
(23e)

with uncertainty set

$$\begin{aligned} U = \left\{ \varvec{z} : \sum _{j =1}^m z_j \le {\varGamma },~ l_j \le z_j \le u_j,~\forall j=1,\dotsc ,m \right\} . \end{aligned}$$

6.2 Setup

Model (23) can be written in form (3) with right-hand side uncertainty. In order to find an (approximate) PARO solution \(\varvec{x}_{\text {PARO}}\), Algorithm 2 is used, including C&CG Algorithm 1 to find an ARO solution. To improve the likelihood of finding an ARO solution, problem (20) is solved (with the mountain climbing approach) using multiple starting points for \(\varvec{z}\): the LDR-based starting point and 5 randomly sampled starting points.

For comparison purposes, we also compute a PRO solution to (3) using the methodology of Iancu and Trichakis [16, Theorem 1], which we repeat here for convenience. Specifically, we plug in the LDR \(\varvec{y}(\varvec{z}) = \varvec{w}+\varvec{W}\varvec{z}\) and obtain an optimal solution \((\varvec{x}_1^{*},\varvec{w}_1^{*},\varvec{W}_1^{*})\). Subsequently, we optimize for the nominal scenario \(\varvec{{\bar{z}}}\) whilst ensuring that performance in other scenarios does not deteriorate and that feasibility is maintained:

$$\begin{aligned} \min _{\begin{array}{c} \varvec{x},\varvec{w},\varvec{W}\\ \varvec{x_2},\varvec{w_2},\varvec{W_2} \end{array}} ~&~ \varvec{c}(\varvec{{\bar{z}}})^{\top } \varvec{x_2} + \varvec{d}^{\top }(\varvec{w_2} + \varvec{W_2}\varvec{{\bar{z}}}), \end{aligned}$$
(24a)
$$\begin{aligned} \text {s.t.} ~&~ \varvec{c}(\varvec{z})^{\top } \varvec{x_2} + \varvec{d}^{\top }(\varvec{w_2} + \varvec{W_2}\varvec{z}) \le 0, \quad \forall \varvec{z} \in U, \end{aligned}$$
(24b)
$$\begin{aligned} ~&~ \varvec{A}(\varvec{z})\varvec{x} + \varvec{B}(\varvec{w} + \varvec{W}\varvec{z}) \le \varvec{r}(\varvec{z}),\quad \forall \varvec{z} \in U, \end{aligned}$$
(24c)
$$\begin{aligned} ~&~\varvec{x} = \varvec{x_1^{*}}+\varvec{x_2},~ \varvec{w} = \varvec{w_1^{*}}+\varvec{w_2},~ \varvec{W} = \varvec{W_1^{*}}+\varvec{W_2}. \end{aligned}$$
(24d)

Constraint (24d) states that the new solution equals the original solution (variables with subscript 1) plus an adaptation (variables with subscript 2). Constraint (24b) ensures that the adaptation does not deteriorate performance in any scenario, and the objective is to optimize performance in scenario \(\varvec{{\bar{z}}}\). According to Iancu and Trichakis [16, Theorem 1], the optimal solution for \((\varvec{x},\varvec{w},\varvec{W})\) is PRO to (3). Let \(\varvec{x}_{\text {PRO}}\) denote the optimal solution for \(\varvec{x}\).
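For reference, the semi-infinite constraints in (24) can be handled by the standard dualization over a polyhedral uncertainty set. A sketch for constraint (24b), assuming \(U = \{\varvec{z} : \varvec{P}\varvec{z} \le \varvec{q}\}\) is nonempty and bounded (with \(\varvec{P},\varvec{q}\) introduced here only for illustration) and, as in the facility location instance, that \(\varvec{c}\) does not depend on \(\varvec{z}\): constraint (24b) holds for all \(\varvec{z} \in U\) if and only if there exists \(\varvec{\eta } \ge \varvec{0}\) such that

$$\begin{aligned} \varvec{P}^{\top }\varvec{\eta } = \varvec{W_2}^{\top }\varvec{d}, \qquad \varvec{c}^{\top }\varvec{x_2} + \varvec{d}^{\top }\varvec{w_2} + \varvec{q}^{\top }\varvec{\eta } \le 0. \end{aligned}$$

Each row of (24c) is reformulated analogously, with its own dual multiplier vector.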

We compare the performance of the Stage-1 (here-and-now) solutions \(\varvec{x}_{\text {PARO}}\), \(\varvec{x}_{\text {ARO}}\) and \(\varvec{x}_{\text {PRO}}\). For solutions \(\varvec{x}_{\text {PARO}}\) and \(\varvec{x}_{\text {ARO}}\) we use the optimal recourse decision. For \(\varvec{x}_{\text {PRO}}\) we report the results for two decision rules: (i) the optimal recourse decision, (ii) the LDR. We refer to the four objective values as PARO, ARO, PRO and PRO(LDR). We compute the relative improvement (in \(\%\)) of PARO over the other three objective values for two different cases:

Nominal: Relative improvement in the nominal (Footnote 3) scenario \(\varvec{{\bar{z}}}\).

Maximum: Relative improvement in the scenario with the maximum performance difference between \(\varvec{x}_{\text {ARO}}\) and \(\varvec{x}_{\text {PARO}}\). This scenario, which we denote by \(\varvec{z}^{*}\), is found by solving (19) with fixed \(\varvec{x}^k = \varvec{x}_{\text {ARO}}\) and \(\varvec{x}^c = \varvec{x}_{\text {PARO}}\).

All optimization problems are solved using Gurobi 9.0 [13] with the dual simplex algorithm selected. We note that the influence of different solvers could also be investigated, but this is beyond the scope of this paper. All computations are performed on an Intel Core i7-8565U PC with 16GB RAM, using all 8 threads.

During our numerical studies we found examples where Algorithm 2 was not able to improve upon the initial Stage-1 solution \(\varvec{x}_{\text {ARO}}\). This can occur if the initial \(\varvec{x}_{\text {ARO}}\) happens to be PARO, or if there is a unique ARO solution; after all, not every ARO instance has multiple worst-case optimal Stage-1 solutions. The latter has been reported before in the literature. De Ruiter et al. [9] show that the multi-stage production-inventory model of Ben-Tal et al. [2] has unique here-and-now decisions in almost all time periods if LDRs are used. In that example, the reported multiplicity of solutions is mainly due to non-PRO decision rule coefficients. We find that multiplicity of Stage-1 solutions appears in particular when the problem data are integer.

6.3 Data

We solve the facility location problem for various combinations of the number of demand locations m, the number of potential facility locations n and the maximum total demand \({\varGamma }\). For (m,n), we consider the pairs (10, 10), (10, 20), (20, 20), (20, 40), (50, 50), (50, 75), (100, 100) and (100, 150). For each pair, we consider a low, medium and high value of \({\varGamma }\), set at \(8.5m\), \(10m\) and \(11.5m\), respectively.

For each combination, we generate 100 random instances. Facility capacity \(s_i\) is set at 15 for each i. Other parameters are independently drawn from a discrete uniform distribution. Construction costs \(\varvec{f} \in {\mathbb {R}}^n\) are drawn between 4 and 22. Entries of transportation cost matrix \(\varvec{C} \in {\mathbb {R}}^{n\times m}\) are drawn between 2 and 12.

We set lower and upper bounds \(l_j=8\) and \(u_j=12\) for each demand location \(j=1,\dotsc ,m\). The nominal demand scenario is \({\bar{z}}_j = \frac{1}{2}(8 + {\varGamma }/m)\) for all j. Note that \(\varvec{{\bar{z}}} \in \text {ri}(U)\).
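For concreteness, a minimal sketch of an instance generator matching the description above is given below; the inclusive integer ranges, the seed and the function name are illustrative assumptions.

```python
# Minimal sketch of the random instance generation described above
# (illustrative assumptions: inclusive integer ranges, fixed seed).
import numpy as np

def generate_instance(m, n, gamma_factor=10.0, seed=0):
    rng = np.random.default_rng(seed)
    s = np.full(n, 15)                            # facility capacities s_i = 15
    f = rng.integers(4, 23, size=n)               # opening costs f_i in {4,...,22}
    C = rng.integers(2, 13, size=(n, m))          # transport costs c_ij in {2,...,12}
    Gamma = gamma_factor * m                      # maximum total demand (8.5m, 10m or 11.5m)
    l = np.full(m, 8.0)                           # demand lower bounds l_j
    u = np.full(m, 12.0)                          # demand upper bounds u_j
    z_nom = np.full(m, 0.5 * (8.0 + Gamma / m))   # nominal demand scenario
    return dict(s=s, f=f, C=C, Gamma=Gamma, l=l, u=u, z_nom=z_nom)

instance = generate_instance(m=10, n=10)          # e.g. the (m, n) = (10, 10) configuration
```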

The models with LDRs cannot be solved for the instances with \(m \ge 50\) due to memory limitations; for these instances the comparison with the PRO and PRO(LDR) objective values is therefore not possible. For the other instances, the computation time for the LDR models is limited to 30 minutes (Footnote 4).

6.4 Results

For the configurations with \(m=100\), none of the instances resulted in different Stage-1 solutions \(\varvec{x}_{\text {PARO}}\) and \(\varvec{x}_{\text {ARO}}\). An ARO solution \(\varvec{x}_{\text {ARO}}\) that cannot be improved is itself a PARO solution (up to suboptimality of the C&CG procedures), which may also be a useful result.

For the other configurations, Table 2 reports the number of instances with different Stage-1 solutions \(\varvec{x}_{\text {PARO}}\) and \(\varvec{x}_{\text {ARO}}\), along with the median required number of iterations and median computation time for obtaining these solutions (Footnote 5).

Table 2 Stage-1 solution differences, iteration count and computation time for the facility location problem

The \(\ell _1\)-norm difference between Stage-1 solutions \(\varvec{x}_{\text {PARO}}\) and \(\varvec{x}_{\text {ARO}}\) represents the number of facilities (out of n potential locations) that are opened by one solution but not by the other. For example, an \(\ell _1\)-norm of 2 indicates that one solution opened facility i while the other opened facility j, or that one solution opened both facilities i and j while the other opened neither.

There is no clear trend as the instance size increases, either in the number of instances with differences in Stage-1 solutions or in the magnitude of those differences (\(\Vert \cdot \Vert _1\)). The median number of iterations also remains mostly constant, for both \(\varvec{x}_{\text {ARO}}\) and \(\varvec{x}_{\text {PARO}}\), and there is no clear effect of the \({\varGamma }\)-parameter on the iteration count. Over all instances in all configurations, the maximum number of iterations is 10 for \(\varvec{x}_{\text {ARO}}\) and 4 for \(\varvec{x}_{\text {PARO}}\). Thus, the increase in computation time of the ARO and PARO C&CG procedures for larger instances is predominantly due to the increased size of the optimization problems in each iteration.

Table 3 Difference in worst-case objective value between PRO and ARO, for the facility location problem instances with different Stage-1 solutions \(\varvec{x}_{\text {PARO}}\) and \(\varvec{x}_{\text {ARO}}\)

The LDR model (for obtaining the PRO and PRO(LDR) objective values) does not necessarily find the worst-case optimal solution because (i) it is restricted to linear recourse and is thus an approximation of the true ARO model, and (ii) the 30-minute time limit may be restrictive (this led to an optimality gap of at most \(0.79\%\) over all instances). Thus, the LDR model may overestimate the worst-case cost. At the same time, ARO does not necessarily find the worst-case scenario and may thus underestimate the true worst-case cost. Table 3 reports the worst-case differences. Particularly for instances with \((m,n) = (20,40)\), differences in worst-case performance are observed.

Fig. 2 Boxplots with relative improvement of the PARO solution over alternative solutions, for the facility location problem instances with different Stage-1 solutions \(\varvec{x}_{\text {PARO}}\) and \(\varvec{x}_{\text {ARO}}\)

Figure 2 shows boxplots of the relative objective value improvement of PARO over ARO and (where applicable) over PRO and PRO(LDR) for the instances with different Stage-1 solutions \(\varvec{x}^{\text {PARO}}\) and \(\varvec{x}^{\text {ARO}}\). Figure 2a shows the improvement for nominal scenario \(\varvec{{\bar{z}}}\) and Fig. 2b shows the improvement for maximum difference scenario \(\varvec{z^{*}}\).

As expected, the magnitude of the differences is larger for the maximum difference scenario \(\varvec{z}^{*}\) than for the nominal scenario \(\varvec{{\bar{z}}}\). In most cases the maximum relative improvement is substantial, whereas the median relative improvement is only minor. The relative improvement of PARO over the other solutions decreases as the instance size increases; this is best visible for the relative improvement of PARO over ARO (blue). For the instances with \(m=50\), even the maximum relative improvement is minor. The reason is that the magnitude of the (absolute) differences between \(\varvec{x}^{\text {PARO}}\) and \(\varvec{x}^{\text {ARO}}\) remains mostly constant (see Table 2), which leads to a decrease in the relative objective value differences.

However, note that if the Stage-1 solution represents a decision that is to be implemented in practice, even the possibility of a small percentage improvement warrants the extra effort of obtaining an (approximate) PARO solution. We note that for ARO we use the first ARO solution found, \(\varvec{x}^{\text {ARO}}\); it is possible that there exists yet another ARO solution for which the improvement percentages of PARO over ARO are larger than those reported in Fig. 2.

By construction, the relative improvement of PARO over ARO is positive for the maximum difference scenario \(\varvec{z}^{*}\). However, \(\varvec{x}^{\text {PARO}}\) does not necessarily dominate (Footnote 6) \(\varvec{x}^{\text {ARO}}\); thus, for specific scenarios \(\varvec{x}^{\text {ARO}}\) may still perform better. Indeed, Fig. 2 shows that while for the large majority of instances the relative improvement is positive, it is occasionally negative. This may also be partially due to the fact that the obtained solutions \(\varvec{x}^{\text {ARO}}\) and \(\varvec{x}^{\text {PARO}}\) are approximate, because the bilinear subproblems are solved heuristically. There are also instances where the relative difference between PARO and ARO is zero, i.e., a different Stage-1 solution \(\varvec{x}\) does not always translate into a different performance on the two reported measures.

The relative improvement of PARO over PRO is smaller than that of PARO over PRO(LDR). Thus, reported differences are partially due to the different Stage-1 decision, and partially due to the Stage-2 decision rule.

7 Conclusion

In this paper, we dealt with Pareto efficiency in two-stage adaptive robust optimization problems. As in static robust optimization, the large majority of solution techniques focus only on worst-case optimality and may yield solutions that are not Pareto efficient. To alleviate this, the concept of Pareto Adaptive Robustly Optimal (PARO) solutions has been introduced in the literature, and is formalized in a general framework in this paper (Footnote 7).

Using FME as the predominant technique, we have analyzed the relation between PRO and PARO and investigated the optimality of various decision rule structures in both worst-case and non-worst-case scenarios. We have shown the existence of PARO here-and-now decisions, and that there exists a PWL decision rule that is PARO.

Moreover, we have provided several practical approaches to generate or approximate PARO solutions. Numerical experiments on a facility location example demonstrate that PARO solutions can significantly improve performance in non-worst-case scenarios over ARO and PRO solutions.

A potential direction for future research would be to further investigate constructive approaches to find or approximate PARO solutions. In particular, it would be valuable to have tractable algorithms for larger instances and/or more general classes of problems than those that can be tackled with the currently presented approaches. Another interesting avenue for future (numerical) research would be to understand more precisely the circumstances under which multiplicity of ARO solutions occurs, and when first-found ARO solutions are (un)likely to be PARO.