1 Introduction

Two-stage stochastic programming is a mathematical framework for modeling decision making under uncertainty. In this context, first-stage decisions are made before the uncertainty is revealed, while the second-stage (or recourse) decisions are intended to mitigate the impact of the first-stage decisions after observing the uncertainty realization. This framework is widely used in real-world applications, including the planning of power system operations, supply chains, logistics, inventory, and financial portfolios. Since most two-stage stochastic optimization problems admit no analytical solution, efficient numerical methods are of paramount importance.

One of the first numerical methods explored in the stochastic programming literature utilized bounds for the expected recourse function. Initiated by [15], this approach considers minimization problems in which the recourse is a convex function of the uncertainty. It relies on a partition of the uncertainty support, the law of total probability, and Jensen’s and Edmundson–Madansky’s inequalities to obtain lower and upper bounds. Partition refinement methods improve the approximation error with a monotone sequence of bounds (see [6, 9,10,11,12]). It is worth mentioning that this approach was the first solution method for stochastic optimization problems, as it is based on numerical integration techniques that are easy to implement; this is why the referenced works are fairly old. With the advancement of computers, this approach was replaced by sampling-based techniques, but in this work we return to the partition-based method to highlight the importance of obtaining a solution accompanied by deterministic bounds.

Starting from the law of total probability, previous works have provided an optimistic solution (a solution that underestimates the actual cost) by applying Jensen’s inequality to each partition cell, which yields a computationally tractable lower approximation problem. The optimal value of this proxy provides a lower bound to the original problem because it replaces the expected recourse function by a weighted average of recourse evaluations. A deterministic optimality gap is obtained by fixing the current solution and computing an upper bound that applies the Edmundson–Madansky inequality to each cell of the partition and averages the results with the associated probability masses. Given a refining partition sequence, this method converges to the true optimum in an optimistic manner, i.e., at each iteration, the solution underestimates the actual recourse costs.

The partition refinement methods available in the literature only provide optimistic solutions before convergence. However, some practical applications depend on the implementation of a non-converged sub-optimal solution because of computational time limitations. In these cases, an optimistic solution might generate significantly higher expected costs than its reported objective value. In this context, a solution methodology that provides pre-convergence conservative solutions is of significant importance to practical applications with a high computational burden and time limitations.

The objective of this study is to obtain a conservative convergent solution method for two-stage stochastic linear optimization problems in which the recourse is a convex function of the uncertainty. We formulate a distributionally robust optimization model based on a generalization of the Edmundson–Madansky inequality (see [6, 13, 14]) and solve it to obtain a conservative solution and a tighter upper bound to the original problem. The distributionally robust model minimizes the worst-case expected cost over every extreme probability measure with known partition-adapted conditional expectations. A deterministic optimality gap is obtained by solving the lower-bound problem, which applies Jensen’s inequality to each partition cell, and computing the distance between both bounds. This technique is motivated by the idea that distributionally robust solutions avoid out-of-sample disappointment [19].

Nevertheless, in the presence of high-dimensional uncertainty vectors, the proposed method is challenged by the exponential growth of the number of linear constraints and variables. To handle this, we propose different solution schemes depending on the uncertainty dimensionality: (i) for problems with low-dimensional uncertainty, we develop a deterministic equivalent linear programming model; (ii) for medium-sized uncertainty dimensionality, we propose a column and constraint generation algorithm [24]; and (iii) to handle high-dimensional uncertainty, we propose a simplex-based heuristic method whose complexity grows linearly with the uncertainty dimension. For the latter, we prove convergence when the recourse function is monotone in the uncertainty. Therefore, the main contributions of the study are the following:

  • Raising awareness of the negative impact of optimistic solution methods for two-stage stochastic optimization problems.

  • Reformulation of the upper-bound, conservative, distributionally robust problem as a linear programming model.

  • Development of two acceleration procedures: (1) an exact decomposition approach based on the column and constraint generation algorithm to solve medium-scale problems, and (2) an approximate simplex-based partitioning scheme to find robust solutions for large-scale instances.

  • A new conservative solution framework for two-stage stochastic linear optimization problems based on a deterministic partition refinement algorithm.

The remainder of this paper is organized as follows. Section 2 describes the theoretical and methodological background on the existing partition-based method and related bounding optimization problems. In Sect. 3, we develop the deterministic equivalent reformulation of the upper-bound problem and propose accelerating methods to handle medium- and large-sized problems. Section 4 presents the proposed conservative partition refining (CPR) method for two-stage stochastic optimization and proves the convergence of the sequential algorithm. Section 5 presents the numerical results of the computational experiments showing the superior pre-convergence performance of the proposed method. Finally, in Sect. 6, relevant conclusions are provided.

2 Existing methods

In this section, we introduce the nomenclature and provide a short review of the existing partition methods from which our contributions are derived.

2.1 Notation

We study two-stage stochastic linear programming problems of the form:

$$\begin{aligned} z^* :=\begin{aligned} \underset{{\mathbf {x}}\in X}{\text {Min}}\,\{f({\mathbf {x}}):= {\mathbf {c}}^\top {\mathbf {x}} + {\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })]\} \end{aligned} \end{aligned}$$
(1)

where

$$\begin{aligned} \begin{aligned} Q({\mathbf {x}},\varvec{\xi }) = \underset{{\mathbf {y}}\ge 0}{\text {Min}}&\quad {\mathbf {q}}^\top {\mathbf {y}}\\ \text {s.t.}&\quad {\mathbf {W}}{\mathbf {y}} \ge {\mathbf {h}}(\varvec{\xi }) - {\mathbf {T}}(\varvec{\xi }){\mathbf {x}}. \end{aligned} \end{aligned}$$
(2)

is known as the recourse function or second-stage problem. This model corresponds to a linear optimization problem that minimizes the total cost, where \({\mathbf {x}}\) denotes the first-stage decisions, \({\mathbf {y}}\) denotes the second-stage decisions, and the expectation \({\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })]\) represents the expected cost of the recourse. We assume that the random vector \(\varvec{\xi }\) has a known probability distribution with support \(\Xi \subseteq {\mathbb {R}}^{d_{\xi }}\). We denote the set of feasible first-stage decisions by \(X\subseteq {\mathbb {R}}^{d_x}\). Here, \({\mathbf {c}}\in {\mathbb {R}}^{d_x}\), \(Q:X\times \text {co}(\Xi )\longrightarrow {\mathbb {R}}\), \({\mathbf {q}}\in {\mathbb {R}}^{d_y}\), \({\mathbf {W}}\in {\mathbb {R}}^{m_y\times d_y}\), \({\mathbf {h}}(\varvec{\xi })\in {\mathbb {R}}^{m_y}\), and \({\mathbf {T}}(\varvec{\xi })\in {\mathbb {R}}^{m_y\times d_x}\), where \(\text {co}(\Xi )\) denotes the convex hull of the uncertainty support.

We assume that problem (1) has relatively complete recourse, i.e., problem (2) is feasible for every \({\mathbf {x}}\in X\) and every realization of the unknown data \(\varvec{\xi }\); that the uncertainty support \(\Xi \) is compact; and that the expectation \({\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })]\) exists. We further assume that \(Q({\mathbf {x}},\cdot )\) is a convex function on \(\text {co}(\Xi )\) for all \({\mathbf {x}}\in X\). In particular, this last assumption holds when the unknown data appears only in the components of \({\mathbf {h}}(\varvec{\xi })\) and \({\mathbf {T}}(\varvec{\xi })\).

The existing partition-based method [9, 11, 12, 14, 15, 17] makes use of a partition of the support of the uncertainty to generate a monotonic sequence of bounds. These bounds are given by the optimal objective values of the upper- and lower-bound problems obtained via the classical inequalities of Edmundson–Madansky and Jensen, respectively.

Following previously reported partition methods, we start from a partition of the uncertainty support \(\Xi \). The set \({\mathscr {P}}_n = \{\Xi ^k:\,k=1,\ldots , n \}\) of cells \(\Xi ^k\subseteq \Xi \) is a partition of the support \(\Xi \subset {\mathbb {R}}^{d_{\xi }}\) if:

  1. \({\mathbb {P}} \left( \Xi ^j \cap \Xi ^k \right) = 0, \quad \forall j\ne k,\; j,k\in \{1,\ldots , n \}\),

  2. \({{\bigcup }^{n}_{k=1}} \Xi ^k = \Xi \).

Since most real-life applications involve bounded probability distributions for the uncertainty, in this work we assume a rectangular support for the data-generating probability distribution. Moreover, we assume rectangular partitions with cells of the form \(\Xi ^k = [{\mathbf {a}}^k,{\mathbf {b}}^k]\), because the associated partition refinement procedure keeps the method computationally tractable.

We denote by \(p^k={\mathbb {P}}(\varvec{\xi }\in \Xi ^k)\) the probability mass, and by \(\overline{\varvec{\xi }}^k={\mathbb {E}}[\varvec{\xi }|\varvec{\xi }\in \Xi ^k]\) the conditional mean of cell k, \(k=1,\ldots ,n\). It is worth mentioning that, for a general probability distribution, the computation of the probability mass \(p^k\) and the conditional mean \(\overline{\varvec{\xi }}^k\) is not easy. In Sect. 5, we present different methods to perform this computation.

Finally, even though the disappointment concept has been widely studied in the risk-aversion literature and admits several definitions, in this work we adopt the following definition:

Definition 1

The disappointment is the difference between the out-of-sample cost and the in-sample estimate (i.e., the optimal objective value).

2.2 Existing lower bound

According to [15], if we consider a partition of the uncertainty support \(\Xi \), then, by the law of total probability, the expectation in (1) can be expressed as \({\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })] = \sum _{k=1}^{n}p^k{\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })|\varvec{\xi }\in \Xi ^k]\). Then (1) is equivalent to the following problem:

$$\begin{aligned} \begin{aligned} \underset{{\mathbf {x}}\in X}{\text {Min}}\,\left\{ f({\mathbf {x}}):= {\mathbf {c}}^\top {\mathbf {x}} + \sum _{k=1}^{n}p^k{\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })|\varvec{\xi }\in \Xi ^k]\right\} \end{aligned} \end{aligned}$$
(3)

By Jensen’s inequality we have that

$$\begin{aligned} Q({\mathbf {x}},\overline{\varvec{\xi }}^k) \le {\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })|\varvec{\xi }\in \Xi ^k], \quad k=1,\ldots ,n, \end{aligned}$$

which gives the following lower-bound for the optimal objective value of (3):

$$\begin{aligned} z_n^L&:= \underset{{\mathbf {x}}\in X}{\text {Min}}\left\{ f_n^L({\mathbf {x}}):={\mathbf {c}}^\top {\mathbf {x}} +\sum _{k=1}^{n}p^kQ({\mathbf {x}},\overline{\varvec{\xi }}^k)\right\} \nonumber \\&\le \underset{{\mathbf {x}}\in X}{\text {Min}}\,\left\{ {\mathbf {c}}^\top {\mathbf {x}} + \sum _{k=1}^{n}p^k{\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })|\varvec{\xi }\in \Xi ^k]\right\} . \end{aligned}$$
(4)

Note that the lower-bound problem on the left side of Eq. (4) underestimates the expected cost, because it only considers the finite set of conditional mean scenarios \(\{ \overline{\varvec{\xi }}^k\,:\, k=1,\ldots ,n\}\) to approximate \({\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })]\).

We can explicitly write the lower-bound problem as the following deterministic-equivalent linear program:

$$\begin{aligned} \begin{aligned} \underset{{\mathbf {x}},{\mathbf {y}}^k}{\text {Min}}&\quad {\mathbf {c}}^\top {\mathbf {x}} + \sum _{k=1}^n p^k\cdot {\mathbf {q}}^\top {\mathbf {y}}^k \\ \text {s.t.}&\quad {\mathbf {W}}{\mathbf {y}}^{k} \ge {\mathbf {h}}(\overline{\varvec{\xi }}^{k}) - {\mathbf {T}}(\overline{\varvec{\xi }}^{k}){\mathbf {x}}, \quad k=1,\ldots ,n, \\&\quad \mathbf {y}^{\mathbf {k}} \ge 0, \quad k=1,\ldots ,n, \\&\quad {\mathbf {x}}\in X. \end{aligned} \end{aligned}$$
(5)

Note that the number of blocks of linear constraints in (5) equals the partition size, meaning that the lower-bound problem does not demand a high computational effort. When n is not sufficiently large, the optimal solution \(\mathbf{x }_n^L\) of (5) represents an optimistic decision because the conditional expectation \({\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })|\varvec{\xi }\in \Xi ^k]\) is approximated from below by \(Q({\mathbf {x}},\overline{\varvec{\xi }}^k)\), \(k=1,\ldots ,n\), underestimating the actual conditional expected cost.
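To make the construction concrete, the following JuMP sketch (JuMP is the modeling language used in our experiments, see Sect. 5.2) assembles and solves (5) for a hypothetical two-dimensional instance; all data are illustrative placeholders, and the open-source HiGHS solver stands in for Gurobi.

```julia
# Sketch of the lower-bound deterministic equivalent (5); all data below are
# illustrative placeholders, not taken from the paper.
using JuMP, HiGHS

c = [1.0, 2.0]                       # first-stage cost
q = [3.0, 1.0]                       # second-stage cost
W = [1.0 0.0; 0.0 1.0]               # recourse matrix
Tmat(ξ) = [1.0 0.0; 0.0 1.0]         # technology matrix (constant here)
h(ξ) = ξ                             # right-hand side driven by the uncertainty

p     = [0.5, 0.5]                   # probability masses p^k
ξ_bar = [[1.0, 2.0], [3.0, 1.5]]     # conditional means of the cells
n     = length(p)

model = Model(HiGHS.Optimizer)
set_silent(model)
@variable(model, x[1:2] >= 0)        # first stage (here X = R^2_+)
@variable(model, y[1:2, 1:n] >= 0)   # one recourse vector per cell
@constraint(model, [k = 1:n], W * y[:, k] .>= h(ξ_bar[k]) - Tmat(ξ_bar[k]) * x)
@objective(model, Min, c' * x + sum(p[k] * q' * y[:, k] for k in 1:n))
optimize!(model)
z_L = objective_value(model)         # lower bound z_n^L
x_L = value.(x)                      # optimistic solution x_n^L
```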

2.3 Existing upper bound

To derive an upper bound for the expectation \({\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })]\), Edmundson and Madansky [18] proposed an inequality based on a convex combination of the values of \(Q(\mathbf{x }, \cdot )\) at the extreme points of the convex hull of the support \(\Xi \). Therefore, if \(\{{\mathbf {e}}_j ~:~ j=1,\ldots ,2^{d_{\xi }}\}\) is the set of extreme points of \(\Xi \), since every point \(\varvec{\xi }\in \Xi \) can be expressed as a convex combination of the vertices of \(\Xi \), it is always possible to find \(p_j(\varvec{\xi })\ge 0\), \(j=1,\ldots ,2^{d_{\xi }}\), such that

$$\begin{aligned} \sum _{j=1}^{2^{d_{\xi }}} \, p_j(\varvec{\xi })\cdot {\mathbf {e}}_j = \varvec{\xi } \quad \forall \varvec{\xi }\in \Xi \end{aligned}$$
(6)

and

$$\begin{aligned} \sum _{j=1}^{2^{d_{\xi }}} \, p_j(\varvec{\xi }) = 1 \quad \forall \varvec{\xi }\in \Xi . \end{aligned}$$
(7)

The weights \(p_j(\varvec{\xi })\), \(j=1,\ldots ,2^{d_{\xi }}\), can be interpreted as conditional probabilities, i.e., \(p_j(\varvec{\xi })={\mathbb {P}}({\mathbf {e}}={\mathbf {e}}_j\,|\,\varvec{\xi })\), considering a random vector \({\mathbf {e}}\) with support in the set of extreme points \(\{{\mathbf {e}}_j \, :\, j=1,\ldots ,2^{d_{\xi }}\}\). Given that (6) and (7) are true and \(Q({\mathbf {x}},\cdot )\) is a convex function for any given \({\mathbf {x}}\), the following inequality holds for any set of conditional probabilities \(\{(p_1(\varvec{\xi }),\dots ,p_{2^{d_{\xi }}}(\varvec{\xi }))\}_{\varvec{\xi }\in \Xi }\):

$$\begin{aligned} \begin{aligned} {\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })]&= \int _{\Xi } Q\left( {\mathbf {x}},\sum _{j=1}^{2^{d_{\xi }}} p_j(\varvec{\xi })\cdot {\mathbf {e}}_j\right) d{\mathbb {P}}(\varvec{\xi }) \le \sum _{j=1}^{2^{d_{\xi }}} Q({\mathbf {x}},{\mathbf {e}}_j) \int _{\Xi }\,p_j(\varvec{\xi })d{\mathbb {P}}(\varvec{\xi }). \end{aligned}\nonumber \\ \end{aligned}$$
(8)

Based on the above, we consider the worst case of the right-hand side of inequality (8) because, through an infinite-dimensional maximization problem over the set of conditional probabilities satisfying (6) and (7), we can derive an upper bound for (3) via a finite tractable optimization problem, as shown in the following.

Since (8) holds for any set of conditional probabilities \(\{(p_1(\varvec{\xi }),\dots ,p_{2^{d_{\xi }}} (\varvec{\xi }))\}_{\varvec{\xi }\in \Xi }\) that satisfies (6) and (7), then

$$\begin{aligned} \begin{aligned} {\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })] \le \underset{p_j(\varvec{\xi })}{\text {Max}}&\quad \sum _{j=1}^{2^{d_{\xi }}}\,Q({\mathbf {x}},{\mathbf {e}}_j) \int _{\Xi }\,p_j(\varvec{\xi })d{\mathbb {P}}(\varvec{\xi })\\ \text {s.t.}&\quad \sum _{j=1}^{2^{d_{\xi }}} \, {\mathbf {e}}_j p_j(\varvec{\xi }) = \varvec{\xi } \quad \forall \varvec{\xi }\in \Xi \\&\quad \sum _{j=1}^{2^{d_{\xi }}} \, p_j(\varvec{\xi }) = 1 \quad \forall \varvec{\xi }\in \Xi . \end{aligned} \end{aligned}$$
(9)

Note that (9) is a semi-infinite optimization problem that allows us to derive an upper bound through a tractable finite relaxation. If we replace the two constraints in (9) with their expected value, i.e.,

$$\begin{aligned} \begin{aligned}&\sum _{j=1}^{2^{d_{\xi }}} {\mathbf {e}}_j \int _{\Xi }\,p_j(\varvec{\xi })d{\mathbb {P}}(\varvec{\xi }) = \overline{\varvec{\xi }}\\&\sum _{j=1}^{2^{d_{\xi }}} \int _{\Xi }\,p_j(\varvec{\xi })d{\mathbb {P}}(\varvec{\xi }) = 1,\\ \end{aligned} \end{aligned}$$

and denote \(\delta _j = \int _{\Xi }\,p_j(\varvec{\xi })d{\mathbb {P}}(\varvec{\xi })\), for \(j=1,\ldots ,2^{d_{\xi }},\) we obtain an upper bound based on the following tractable finite linear optimization problem:

$$\begin{aligned} \underset{\varvec{\delta }\in {\mathcal {D}}(\overline{\varvec{\xi }})}{\text {Max}}\, {\mathbb {E}}^{\varvec{\delta }}\,[Q({\mathbf {x}}, {\mathbf {e}})] := \underset{\varvec{\delta }\ge 0}{\text {Max}}&\quad \sum _{j=1}^{2^{d_{\xi }}} Q({\mathbf {x}},{\mathbf {e}}_j)\,\delta _j\nonumber \\ \text {s.t.}&\quad \sum _{j=1}^{2^{d_{\xi }}} {\mathbf {e}}_j\,\delta _j =\overline{\varvec{\xi }},\nonumber \\&\quad \sum _{j=1}^{2^{d_{\xi }}}\delta _j=1, \end{aligned}$$
(10)

where \({\mathcal {D}}(\overline{\varvec{\xi }}) = \left\{ \varvec{\delta }\in {\mathbb {R}}_+^{2^{d_{\xi }}}\,:\,\sum _{j=1}^{2^{d_{\xi }}}{\mathbf {e}}_j\, \delta _j = \overline{\varvec{\xi }},\quad \sum _{j=1}^{2^{d_{\xi }}}\delta _j=1\right\} \). Thus, as per the above developments, the following inequality holds:

$$\begin{aligned} {\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })] \le \underset{\varvec{\delta }\in {\mathcal {D}}(\overline{\varvec{\xi }})}{\text {Max}}\, {\mathbb {E}}^{\varvec{\delta }}\,[Q({\mathbf {x}},{\mathbf {e}})]. \end{aligned}$$
(11)
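For a fixed first-stage decision, the cell-wise problem (10) is a small linear program over the vertices of the cell. The following sketch assembles it for a hypothetical box cell and a toy convex recourse function; both are assumptions for illustration only.

```julia
# Sketch: the worst-case expectation (10) over one box cell [a, b] for a
# fixed first stage x; Q, a, b, ξ_bar are illustrative placeholders.
using JuMP, HiGHS

Q(x, ξ) = max(ξ[1] - x[1], 0.0) + max(ξ[2] - x[2], 0.0)  # toy convex recourse
a, b  = [0.0, 0.0], [2.0, 4.0]                           # cell bounds
ξ_bar = [1.0, 2.5]                                       # conditional mean of the cell
x     = [0.5, 1.0]                                       # fixed first-stage decision

# Enumerate the 2^dξ vertices e_j of the box [a, b].
dξ = length(a)
verts = [[bit == 0 ? a[i] : b[i] for (i, bit) in enumerate(digits(j; base = 2, pad = dξ))]
         for j in 0:2^dξ - 1]

m = Model(HiGHS.Optimizer)
set_silent(m)
@variable(m, δ[1:length(verts)] >= 0)
@constraint(m, sum(δ[j] .* verts[j] for j in eachindex(verts)) .== ξ_bar)
@constraint(m, sum(δ) == 1)
@objective(m, Max, sum(δ[j] * Q(x, verts[j]) for j in eachindex(verts)))
optimize!(m)
ub = objective_value(m)   # upper bound on E[Q(x, ξ) | ξ ∈ cell]
```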

Existing partition-based methods use the derived upper bound (11) at a first-stage decision found by the lower-bound problem (5). Supposing that \({\mathbf {x}}_n^L\) is the solution of (5), the following upper bound is proposed in [9, 12, 13]:

$$\begin{aligned} z^* \le f_n^U({\mathbf {x}}_n^L):= {\mathbf {c}}^\top {\mathbf {x}}_n^L + \sum _{k=1}^n p^k \underset{\varvec{\delta }\in {\mathcal {D}}(\overline{\varvec{\xi }}^k)}{\text {Max}}\left( \sum _{j=1}^{2^{d_{\xi }}} \delta _j^k Q({\mathbf {x}}_n^L,{\mathbf {e}}_j^k)\right) , \end{aligned}$$
(12)

where,

$$\begin{aligned} \begin{aligned} Q({\mathbf {x}}_n^L,{\mathbf {e}}_j^k):=\underset{{\mathbf {y}}\ge 0}{\text {Min}}&\quad {\mathbf {q}}^\top {\mathbf {y}}\\ \text {s.t.}&\quad {\mathbf {W}}{\mathbf {y}} \ge {\mathbf {h}}({\mathbf {e}}_j^k) - {\mathbf {T}}({\mathbf {e}}_j^k){\mathbf {x}}_n^L, \end{aligned} \end{aligned}$$
(13)

for all \(j=1,\ldots ,2^{d_{\xi }}\) and \(k=1,\ldots ,n\).

The existing partition-based method is summarized in the following pseudo-algorithm:

[Algorithm 1: Existing partition-based method]

The existing partition-based methods rely on the lower-bound problem, which underestimates the recourse cost by relying only on the partition-adapted conditional means within the recourse function assessment. Therefore, pre-convergence solutions of the lower-bound problem are optimistic and generate high disappointment levels when they are evaluated under adverse scenarios of the uncertainty realization, as illustrated by the computational experiments in Sect. 5.

3 Proposed upper-bound problem reformulation and solution methods

We are motivated by the concept explored in [19], namely that conservative solutions obtained by solving distributionally robust optimization problems have performance guarantees against out-of-sample disappointment. Our thrust is to obtain pre-convergence conservative solutions in a computationally efficient manner. To this end, we develop in this section a deterministic equivalent reformulation of the upper-bound (distributionally robust) problem that is computationally tractable for low-dimensional uncertainty. We also propose a column and constraint generation algorithm for medium-sized uncertainty dimensionality and, to handle high-dimensional uncertainty, a simplex-based heuristic method whose complexity grows linearly with the uncertainty dimension. For recourse functions that are monotone in the uncertain parameters, we prove convergence of the proposed simplex-based heuristic method. The monotonicity assumption is reasonable in light of the fact that most real-world applications fall into this class of problems.

3.1 Deterministic equivalent model for upper-bound problem

Following the upper approximation of the expected recourse function (11), we define the upper-bound problem

$$\begin{aligned} z_n^U := \underset{{\mathbf {x}}\in X}{\text {Min}}\left\{ f_n^U({\mathbf {x}}):={\mathbf {c}}^\top {\mathbf {x}} + \sum _{k=1}^{n}p^k\underset{\varvec{\delta }\in {\mathcal {D}}(\overline{\varvec{\xi }}^k)}{\text {Max}} {\mathbb {E}}^{\varvec{\delta }}\,[Q({\mathbf {x}},{\mathbf {e}}^k)]\right\} , \end{aligned}$$
(14)

where \({\mathbf {e}}^k\) is the random vector whose support is the set of vertices of the hypercube \(\Xi ^k\) and \({\mathcal {D}}(\overline{\varvec{\xi }}^k)\) is the ambiguity set generated by the conditional mean \(\overline{\varvec{\xi }}^k\) of the cell \(\Xi ^k\).

The solutions obtained from the upper-bound problem can be seen as robust or conservative solutions. Note that problem (14) is a distributionally robust optimization problem, where the conditional probability distribution within each cell is selected as the worst-case distribution preserving the conditional-average information of the cell [5]. Since the upper-bound problem (14) overestimates the actual cost (11), the upper-bound solution has a mathematical certificate against disappointment. This is especially useful when non-converged solutions are actually implemented owing to time or computational-power limitations.

However, solving the upper-bound problem to obtain the conservative solution \({\mathbf {x}}_n^U\) requires significant computational effort, since the number of variables of the inner problem grows exponentially with the uncertainty dimension. In this work, we propose a new framework to obtain a conservative solution by solving the upper-bound problem; to the best of our knowledge, no existing partition-based method solves the upper-bound problem.

For a given cell \(\Xi ^k\) of the rectangular partition, the upper-bound for the conditional-expected recourse cost (10) can be recast according to its dual formulation as follows:

$$\begin{aligned} \begin{aligned} \underset{\varvec{\delta }\in {\mathcal {D}}(\overline{\varvec{\xi }}^k)}{\text {Max}} {\mathbb {E}}^{\varvec{\delta }}\,[Q({\mathbf {x}},{\mathbf {e}}^k)]= \underset{\pi ,\,\varvec{\theta }}{\text {Min}}&\quad \pi + \varvec{\theta }^\top \overline{\varvec{\xi }}^k\\ \text {s.t.}&\quad \pi + \varvec{\theta }^\top {\mathbf {e}}_j^k \ge Q({\mathbf {x}},{\mathbf {e}}_j^k), \quad j=1,\ldots ,2^{d_{\xi }}. \end{aligned}\nonumber \\ \end{aligned}$$
(15)

where \(\pi \in {\mathbb {R}}\) and \(\varvec{\theta }\in {\mathbb {R}}^{d_{\xi }}\) are the dual variables associated with the first and second blocks of linear constraints of (10), respectively.

Replacing the worst-case conditional-expected recourse cost of (14) with its dual formulation (15), the upper-bound problem (14) can be recast as the following linear optimization problem:

$$\begin{aligned} \begin{aligned} \underset{\pi ^k,\, \varvec{\theta }^k,\,{\mathbf {x}}}{\text {Min}}&\quad {\mathbf {c}}^\top {\mathbf {x}} + \sum _{k=1}^{n} p^k (\pi ^k + [\varvec{\theta }^k]^\top \overline{\varvec{\xi }}^k)\\ \text {s.t.}&\quad \pi ^k + [\varvec{\theta }^k]^\top {\mathbf {e}}_j^k \ge Q({\mathbf {x}},{\mathbf {e}}_j^k), \quad j=1,\ldots ,2^{d_{\xi }}, \,k=1,\ldots ,n\\&\quad {\mathbf {x}}\in X. \end{aligned}\nonumber \\ \end{aligned}$$
(16)

Since the recourse function is a minimization problem, we are able to obtain the deterministic equivalent

$$\begin{aligned} \underset{\pi ^k,\,\varvec{\theta }^k,\,{\mathbf {x}},\,{\mathbf {y}}}{\text {Min}}&\quad {\mathbf {c}}^\top {\mathbf {x}} +\sum _{k=1}^{n} p^k(\pi ^k + [\varvec{\theta }^{k}]^\top \overline{\varvec{\xi }}^k) \end{aligned}$$
(17a)
$$\begin{aligned} \text {s.t.}&\quad \pi ^k+[\varvec{\theta }^{k}]^\top \overline{\varvec{\xi }}^k \ge {\mathbf {q}}^\top {\mathbf {y}}_{\overline{\varvec{\xi }}^k}, \quad \forall k = 1,\ldots ,n \end{aligned}$$
(17b)
$$\begin{aligned}&\quad {\mathbf {W}}{\mathbf {y}}_{\overline{\varvec{\xi }}^k} \ge {\mathbf {h}}(\overline{\varvec{\xi }}^k) - {\mathbf {T}}(\overline{\varvec{\xi }}^k){\mathbf {x}}, \quad \forall k = 1,\ldots ,n \end{aligned}$$
(17c)
$$\begin{aligned}&\quad \pi ^k+[\varvec{\theta }^{k}]^\top {\mathbf {e}}_j^k \ge {\mathbf {q}}^\top {\mathbf {y}}_j^k, \quad \forall j \in J^k, \,\forall k = 1,\ldots ,n \end{aligned}$$
(17d)
$$\begin{aligned}&\quad {\mathbf {W}}{\mathbf {y}}_j^k \ge {\mathbf {h}}({\mathbf {e}}_j^k) - {\mathbf {T}}({\mathbf {e}}_j^k){\mathbf {x}}, \quad \forall j \in J^k, \,\forall k = 1,\ldots ,n \end{aligned}$$
(17e)
$$\begin{aligned}&\quad {\mathbf {x}}\in X. \end{aligned}$$
(17f)

which is an equivalent formulation of the original problem (16) whenever \(J^k\) comprises all vertices of cell \(\Xi ^k\). This equivalent formulation replaces the recourse function by the second-stage objective function, includes all second-stage decisions as variables, and adds all second-stage feasibility constraints. Indeed, if the left-hand side (LHS) of the first block of constraints is greater than or equal to the second-stage cost of a feasible second-stage solution, then the LHS is greater than or equal to the minimum second-stage cost given by the recourse function.
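A minimal sketch of (17) on hypothetical data may clarify the constraint structure: for each cell we create the dual variables \(\pi ^k\) and \(\varvec{\theta }^k\) and one recourse block per vertex, plus one at the conditional mean. The helper `box_vertices` and all data are ours, for illustration only.

```julia
# Sketch of the deterministic equivalent (17) with J^k containing all vertices
# of each cell; all problem data are illustrative placeholders.
using JuMP, HiGHS

c = [1.0, 2.0]; q = [3.0, 1.0]
W = [1.0 0.0; 0.0 1.0]
Tmat(ξ) = [1.0 0.0; 0.0 1.0]
h(ξ) = ξ

p     = [0.5, 0.5]                                            # probability masses p^k
ξ_bar = [[1.0, 2.0], [3.0, 1.5]]                              # conditional means
cells = [([0.0, 0.0], [2.0, 3.0]), ([2.0, 0.0], [4.0, 3.0])]  # (a^k, b^k)
n, dξ = length(p), 2
box_vertices(a, b) = [[bit == 0 ? a[i] : b[i] for (i, bit) in
                      enumerate(digits(j; base = 2, pad = length(a)))]
                      for j in 0:2^length(a) - 1]

model = Model(HiGHS.Optimizer); set_silent(model)
@variable(model, x[1:2] >= 0)
@variable(model, πk[1:n])                  # dual variable π^k
@variable(model, θ[1:dξ, 1:n])             # dual variable θ^k
@objective(model, Min, c' * x + sum(p[k] * (πk[k] + θ[:, k]' * ξ_bar[k]) for k in 1:n))
for k in 1:n
    a, b = cells[k]
    for e in [[ξ_bar[k]]; box_vertices(a, b)]   # ξ̄^k first: (17b)-(17c), then (17d)-(17e)
        y = @variable(model, [1:2], lower_bound = 0)
        @constraint(model, πk[k] + θ[:, k]' * e >= q' * y)
        @constraint(model, W * y .>= h(e) - Tmat(e) * x)
    end
end
optimize!(model)
z_U = objective_value(model)   # conservative upper bound z_n^U
```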

Note that (17) is a linear programming problem whenever X is a polyhedral set, and it can be solved efficiently for problems with a low-dimensional uncertainty vector. However, problem (17) is in general intractable for medium- and large-scale instances, as it relies on an exponential number of constraints. To handle this challenge, we present two acceleration procedures: (1) an exact decomposition approach based on the column and constraint generation algorithm [24] to solve medium-dimensional problems, and (2) a heuristic procedure for high-dimensional uncertainty that uses a circumscribed simplex for each partition cell. For the latter, we prove convergence whenever the recourse function is monotone in the uncertainty.

3.2 Column-constraint generation algorithm to handle medium-sized uncertainty dimensionality

For problems with medium-sized uncertainty dimension, we adapt the column and constraint generation framework to iteratively identify a subset of constraints that ensures feasibility for the original problem. The thrust of this algorithm is to obtain an equivalent formulation of (16) with significantly fewer constraints and variables.

According to [24], the column and constraint generation procedure is implemented in a master-oracle scheme. In our context, the master problem is a relaxation of an equivalent formulation of the original problem (16). Given the solution of the master (relaxed) problem, the oracle finds the vertex that generates the worst infeasibility and adds the associated block of linear constraints and decision variables to the master problem. This iterative procedure stops whenever the oracle asserts that the master solution is feasible for the original problem.

For the initial iteration \(\ell = 0\) of the column and constraint generation algorithm, we start with an empty set \(J^k = \emptyset \), i.e., the initial master problem is a relaxation of (17). For initialization purposes, we also add the block of constraints and variables (17b)–(17c) associated with the conditional expectation \(\overline{\varvec{\xi }}^k\) adapted to cell \(\Xi ^k\). This additional block does not affect the original feasible set, since it can be represented as a weighted average of the constraints associated with the vertices.

For a given iteration \(\ell \), with an updated set \(J^k\), we solve (17) and obtain a candidate solution \((\pi _{\ell }^{*},\varvec{\theta }_{\ell }^{*}, {\mathbf {x}}_{\ell }^*)\). Then, we solve the oracle problem

$$\begin{aligned} \mathbf{ORACLE }: \quad \vartheta ^*=\underset{j}{\text {Max}}\,\{Q({\mathbf {x}}_{\ell }^*, {\mathbf {e}}_j^k)-\pi _{\ell }^* -[\varvec{\theta }_{\ell }^*]^\top {\mathbf {e}}_j^k \}, \end{aligned}$$
(18)

to find the highest constraint violation given the current solution. To solve the oracle problem efficiently, we replace the recourse function by its dual representation and combine the two into a single maximization problem

$$\begin{aligned} \begin{aligned} \underset{j,\,\varvec{\lambda }\ge 0}{\text {Max}}&\quad \varvec{\lambda }^\top ({\mathbf {h}}({\mathbf {e}}_j^k) -{\mathbf {T}}({\mathbf {e}}_j^k){\mathbf {x}}^*) - \pi ^* -[\varvec{\theta }^*]^\top {\mathbf {e}}_j^k\\ \text {s.t.}&\quad {\mathbf {W}}^\top \varvec{\lambda } \le {\mathbf {q}}. \end{aligned} \end{aligned}$$
(19)

Following [23], we avoid solving (19) by enumeration of the vertices \(\{{\mathbf {e}}_j^k : j=1,\ldots ,2^{d_{\xi }}\}\) of the cell \(\Xi ^k\); instead, we introduce binary variables to represent each vertex and transform (19) into an equivalent MILP formulation, presented in detail in Appendix A. The algorithm stops whenever the oracle optimal value is non-positive, i.e., the current solution \((\pi _{\ell +1}^{*},\varvec{\theta }_{\ell +1}^{*}, {\mathbf {x}}_{\ell +1}^*)\) is feasible for the original problem. If the oracle optimal value is positive, then we update the set \(J^k\) to include the oracle solution \(j^*\) and repeat the process, solving the master problem again.

We summarize the proposed column and constraint generation algorithm in the following pseudo-algorithm:

[Algorithm 2: Column and constraint generation for the upper-bound problem]
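The following sketch illustrates the master-oracle loop of Algorithm 2 for a single-cell partition. For illustration only, the oracle (18) is solved by direct vertex enumeration rather than by the MILP reformulation of Appendix A, and all problem data are placeholders.

```julia
# Sketch of the column and constraint generation loop for one cell (p^1 = 1);
# the oracle is solved by enumerating vertices instead of the Appendix A MILP.
using JuMP, HiGHS

c = [1.0, 2.0]; q = [3.0, 1.0]
W = [1.0 0.0; 0.0 1.0]
Tmat(ξ) = [1.0 0.0; 0.0 1.0]
h(ξ) = ξ
a, b, ξ_bar = [0.0, 0.0], [2.0, 4.0], [1.0, 2.5]
dξ = 2
verts = [[bit == 0 ? a[i] : b[i] for (i, bit) in enumerate(digits(j; base = 2, pad = dξ))]
         for j in 0:2^dξ - 1]

function recourse(xval, ξ)           # Q(x, ξ) for a FIXED first stage (oracle use)
    sub = Model(HiGHS.Optimizer); set_silent(sub)
    @variable(sub, y[1:2] >= 0)
    @constraint(sub, W * y .>= h(ξ) - Tmat(ξ) * xval)
    @objective(sub, Min, q' * y)
    optimize!(sub)
    return objective_value(sub)
end

function add_block!(master, x, πk, θ, e)   # adds the block (17d)-(17e) for vertex e
    y = @variable(master, [1:2], lower_bound = 0)
    @constraint(master, πk + θ' * e >= q' * y)
    @constraint(master, W * y .>= h(e) - Tmat(e) * x)
end

master = Model(HiGHS.Optimizer); set_silent(master)
@variable(master, x[1:2] >= 0)
@variable(master, πk); @variable(master, θ[1:dξ])
@objective(master, Min, c' * x + πk + θ' * ξ_bar)
add_block!(master, x, πk, θ, ξ_bar)        # initialization at ξ̄ (17b)-(17c)

while true
    optimize!(master)
    xv, πv, θv = value.(x), value(πk), value.(θ)
    viol, jstar = findmax([recourse(xv, e) - πv - θv' * e for e in verts])  # oracle (18)
    viol <= 1e-6 && break                   # no violated vertex: current solution feasible
    add_block!(master, x, πk, θ, verts[jstar])
end
z_U = objective_value(master)
```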

3.3 Simplex-based heuristic method to handle high-dimensional uncertainty

As the master problem of the column and constraint generation algorithm has exponential size in the uncertainty dimensionality, some instances with high-dimensional uncertainty reach the tractability limit of this accelerating method. For these cases, we develop a heuristic method to obtain an upper bound and a corresponding conservative solution.

To handle high-dimensional uncertainty, we propose a heuristic solution method that extends the original box uncertainty support to a circumscribed simplex whose number of vertices depends linearly on the uncertainty dimension. Under the extended support, we reformulate (15) to obtain an upper-bound problem whose complexity grows linearly with the uncertainty dimension.

The proposed extension is the minimum-volume simplex that contains the original cell (see Fig. 1) and has one vertex \(\hat{\varvec{\xi }}^k\) coinciding with one of the cell’s vertices. This simplex has the property that the length of each edge containing the original vertex \(\hat{\varvec{\xi }}\) equals the sum of the lengths of the projections of the original support along each dimension.

Fig. 1 The blue region corresponds to the original support \([a_1,b_1]\times [a_2,b_2]\), and the simplex, represented in green, is its extension. The vertex in black denotes the selected vertex \(\hat{\varvec{\xi }}\) for the uncertainty realization in the recourse, and the highlighted point in the middle indicates the mean. a \(\hat{\varvec{\xi }}=(b_1,b_2)\); b \(\hat{\varvec{\xi }}=(a_1,a_2)\)

For example, Fig. 1 shows the extension of the support \([a_1,b_1]\times [a_2,b_2]\) by the simplex represented in green. We consider two possibilities, \(\{(a_1,a_2),(b_1,b_2)\}\), for the original vertex \(\hat{\varvec{\xi }}^k\), indicated by the black point, corresponding to extreme events of the uncertainty realization. In both cases, the length of the edges of the simplex that contain the vertex \(\hat{\varvec{\xi }}\in \{(a_1,a_2),(b_1,b_2)\}\) equals \((b_1-a_1)+(b_2-a_2)\), the sum of the cell’s edge lengths. We use this property as a simple rule to create the simplex. Figure 2 (b1) and (b2) illustrate how the chosen cell (highlighted in blue) in the refinement procedure is extended to the green simplex for two different partition sizes.
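The following sketch encodes this rule as we read it: starting from the selected cell vertex \(\hat{\varvec{\xi }}\), each of the remaining \(d_{\xi }\) simplex vertices is obtained by moving one coordinate of \(\hat{\varvec{\xi }}\) away from the cell by the total edge length L. The function name and example data are ours, for illustration.

```julia
# Sketch of the circumscribed-simplex rule: v₁ = ξ̂, and each remaining vertex
# moves one coordinate of ξ̂ away from the cell by L = Σᵢ (bᵢ - aᵢ).
function simplex_vertices(a, b, ξ_hat)
    L = sum(b .- a)                       # sum of the cell's edge lengths
    verts = [copy(ξ_hat)]                 # v₁ = ξ̂, a vertex of the cell
    for i in eachindex(ξ_hat)
        v = copy(ξ_hat)
        v[i] += ξ_hat[i] == a[i] ? L : -L # move away from the cell (assumes ξ̂ᵢ ∈ {aᵢ, bᵢ})
        push!(verts, v)
    end
    return verts                          # d_ξ + 1 vertices
end

# Example in the spirit of Fig. 1a: cell [0,2] × [0,3] with ξ̂ = (b₁, b₂)
simplex_vertices([0.0, 0.0], [2.0, 3.0], [2.0, 3.0])
# → [[2.0, 3.0], [-3.0, 3.0], [2.0, -2.0]]   (L = 5)
```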

The proposed simplex-based heuristic method solves the sequential partition refinement problem by considering the extension of each cell of the partition. Based on the extension of cell \(\Xi ^k\), problem (15) can be rewritten as follows:

$$\begin{aligned} \begin{aligned} \underset{\varvec{\delta }\in {\mathcal {R}}(\overline{\varvec{\xi }}^k)}{\text {Max}}\, {\mathbb {E}}^{\varvec{\delta }}\,[Q({\mathbf {x}},{\mathbf {v}}^k)] :=\underset{\pi ,\,\varvec{\theta }}{\text {Min}}&\quad \pi + \varvec{\theta }^\top \overline{\varvec{\xi }}^k\\ \text {s.t.}&\quad \pi + \varvec{\theta }^\top {\mathbf {v}}_j^k \ge Q({\mathbf {x}},{\mathbf {v}}_j^k), \quad j=1,\ldots , (d_{\xi }+1), \end{aligned}\nonumber \\ \end{aligned}$$
(20)

where \({\mathcal {R}}(\overline{\varvec{\xi }}^k)=\left\{ \varvec{\delta }^k\in {\mathbb {R}}_+^{d_{\xi }+1}\, : \, \sum _{j=1}^{d_{\xi }+1}\,\delta _j^k{\mathbf {v}}_j^k=\overline{\varvec{\xi }}^k,\,\sum _{j=1}^{d_{\xi }+1}\,\delta _j^k=1\right\} \), \({\mathbf {v}}^k\) is a random vector with support in the set \(\{{\mathbf {v}}_j^k: j=1,\ldots ,d_{\xi }+1\}\) of vertices of the simplex that contains the cell \(\Xi ^k\), and \({\mathbf {v}}_1^k:=\hat{\varvec{\xi }}^k\) is the original vertex of the hypercube \(\Xi ^k\). Note that each vertex \({\mathbf {v}}_j^k\), \(j=2,\ldots ,(d_{\xi }+1)\), differs from \(\hat{\varvec{\xi }}^k\) in just one component, according to the rule used to create the simplex that contains the cell \(\Xi ^k\). That said, we propose the simplex-based heuristic problem

$$\begin{aligned} \begin{aligned} {\tilde{z}}_n^U:= \underset{\pi ^k,\, \varvec{\theta }^k,\,{\mathbf {x}}}{\text {Min}}&\quad {\mathbf {c}}^\top {\mathbf {x}} + \sum _{k=1}^{n} p^k (\pi ^k + [\varvec{\theta }^k]^\top \overline{\varvec{\xi }}^k)\\ \text {s.t.}&\quad \pi ^k + [\varvec{\theta }^k]^\top {\mathbf {v}}_j^k \ge Q({\mathbf {x}},{\mathbf {v}}_j^k), \quad j=1,\ldots ,(d_{\xi }+1), \,k=1,\ldots ,n\\&\quad {\mathbf {x}}\in X. \end{aligned}\nonumber \\ \end{aligned}$$
(21)

Note that the number of blocks of linear constraints in problem (21) is \(n\cdot (d_{\xi }+1)\), i.e., it depends linearly on the uncertainty dimension. With this alternative upper-bound problem, we can solve the sequential partition refinement problem described in Algorithm 3 by solving (21) instead of (16).

In particular, assuming that the recourse function \(Q({\mathbf {x}},\cdot )\) is monotone in the uncertainty vector \(\varvec{\xi }\in \Xi \), and that the selected original vertex \(\hat{\varvec{\xi }}^k\) of the hypercube \(\Xi ^k\) is the worst case of the recourse, i.e., \(\hat{\varvec{\xi }}^k\in {\mathop {{\mathrm{arg\,max}}}\limits }_{\varvec{\xi }\in \Xi ^k}\,Q({\mathbf {x}},\varvec{\xi })\) for any \({\mathbf {x}}\in X\), the convergence of the partition refinement procedure based on the simplex-based heuristic method is guaranteed.

Proposition 1

Let \(Q({\mathbf {x}},\xi _i)\) be a monotonic function of \(\xi _i\), for all \(i=1,\ldots ,d_{\xi }\), and assume that, for any \({\mathbf {x}}\), \(\hat{\varvec{\xi }}^k\in \mathop {{\mathrm{arg\,max}}}\limits _{\varvec{\xi }\in \Xi ^k}\, Q({\mathbf {x}},\varvec{\xi })\) generates the worst-case recourse cost in the cell \(\Xi ^k\). Then, the sequences \(\{{\tilde{z}}^U_n\}_{n=1}^{\infty }\) and \(\{\tilde{{\mathbf {x}}}^U_n\}_{n=1}^{\infty }\) converge to the optimal objective value and an optimal solution of (1), respectively, where \(\tilde{{\mathbf {x}}}^U_n\) is the optimal solution of (21).

The proof of Proposition 1 is presented in Appendix B.

Fig. 2 (a) Extension of the original support \([a_1,b_1]\times [a_2,b_2]\) for the newsvendor problem with two items; (b1) and (b2) the extension of the chosen cell (highlighted in blue) in the refinement procedure is the same as that of the original support, for different partition sizes

Note that a reformulation similar to (17) can be obtained by simply taking \(J^k\) to be the set of vertices of the circumscribed simplex. For a polyhedral set X, this reformulation is a linear programming problem whose size grows linearly with the dimension of the uncertainty vector.

4 Proposed conservative partition refining (CPR) method for two-stage stochastic programming

In this section, we present a new conservative solution framework for two-stage stochastic linear optimization problems that solves the upper-bound problem. The aim is to obtain a conservative solution that avoids disappointment, i.e., whose objective function cost estimate is an upper limit for the actual expected cost. We propose a tractable reformulation, namely the deterministic equivalent model, for the upper-bound problem and prove convergence of the proposed methodology. We also develop a simple and efficient partition refinement algorithm based on the structure of the optimality gap at each iteration.

We start from a sequential procedure that splits the uncertainty support to obtain a refined partition. For iteration n and partition \({\mathscr {P}}_n\), we solve the upper-bound problem (16) and obtain the conservative solution \({\mathbf {x}}_n^U\). Then, we solve the lower-bound problem for the same partition \({\mathscr {P}}_n\) and compute the optimality gap, i.e., the difference between the optimal values of the upper- and lower-bound problems. The proposed CPR method is outlined in the following pseudo-algorithm:

[Algorithm 3: Conservative partition refining (CPR) method]

The convergence of the CPR method is ensured by Theorem 1, whose proof is presented in Appendix C. It states that the sequences of optimal solutions and objective values of the lower- and upper-bound problems converge to an optimal solution \({\mathbf {x}}^*\) and the optimal objective value \(z^*\) of the two-stage stochastic optimization problem (1), respectively. Note that the proof of convergence does not depend on the partition-refinement procedure.

Theorem 1

  1. The sequence \(\{z_n^U\}_{n=1}^{\infty }\) of optimal objective values of the upper-bound problem, for a family of partitions \(\{{\mathscr {P}}_n\}_{n=1}^{\infty }\) such that \({\mathscr {P}}_{n+1}\) refines \({\mathscr {P}}_n\), is non-increasing.

  2. The sequence \(\{z_n^L\}_{n=1}^{\infty }\) of optimal objective values of the lower-bound problem, for a family of partitions \(\{{\mathscr {P}}_n\}_{n=1}^{\infty }\) such that \({\mathscr {P}}_{n+1}\) refines \({\mathscr {P}}_n\), is non-decreasing.

  3. The sequences \(\{z_n^U\}_{n=1}^{\infty }\) and \(\{z_n^L\}_{n=1}^{\infty }\) are convergent, i.e., \(z_n^L\longrightarrow z^*\longleftarrow z_n^U\) as \(n\longrightarrow \infty \). Moreover, the sequences \(\{x_n^U\}_{n=1}^{\infty }\) and \(\{x_n^L\}_{n=1}^{\infty }\) converge to \({\mathbf {x}}^*\in \text {Argmin}_{{\mathbf {x}}\in X} \, f({\mathbf {x}})\).

It is worth mentioning that the upper-bound sequence obtained by evaluating the objective function of the upper-bound problem at the optimistic solution is non-increasing. This fact ensures the convergence of the existing partition-based method. However, the optimistic solution is sub-optimal for the upper-bound problem, so the existing partition-based method derives a less tight upper bound.

Next, we propose a simple and efficient partition refinement algorithm that defines the sequential upper and lower bounds and, consequently, generates the sequence of conservative solutions \(\{x_n^U\}_{n=1}^{\infty }\).

4.1 Solution algorithm with worst-case partition refinement (SAWPR)

Any partition-based method relies on the refinement of the partition to improve the upper and lower approximations. Next, we detail the three basic steps of the proposed partition refining procedure: (i) selection of the cell \(\Xi ^{k^*}\) to be split, (ii) selection of the uncertainty direction \(i^*\in \{1,\ldots ,d_{\xi }\}\) along which to refine the partition, and (iii) selection of the cutting point.

In the first step of the partition refinement procedure, we aim to select the cell with the highest contribution to the current optimality gap, which is composed of the differences between the upper and lower approximations of the expected recourse function. We define the selected cell as

$$\begin{aligned} k^* \in \underset{k}{\mathop {{\mathrm{arg\,max}}}\limits }\,\left\{ p^k\left( \underset{\varvec{\delta } \in {\mathcal {D}}(\overline{\varvec{\xi }}^k)}{\text {Max}}\,{\mathbb {E} }^{\varvec{\delta }}\,[Q({\mathbf {x}}_n^U,{\mathbf {e}}^k)] -Q({\mathbf {x}}_n^L,\overline{\varvec{\xi }}^k)\right) \right\} . \end{aligned}$$
(22)

The second step of the partition refinement procedure is the selection of the uncertainty direction. Most existing partition-based methods split the selected cell along the direction with the highest measure of non-linearity of \(Q({\mathbf {x}},\cdot )\). This is motivated by the fact that the lower and upper approximations coincide for an affine function. Initially, [5] proposed using dual (subgradient) information at the endpoints of the cell, while [12] compares the difference between the upper and lower approximations. However, the use of dual information increases the computational burden of the sequential partition refinement problem.

In this work, we select the uncertainty direction by solving an optimization problem that resembles the robust optimization model with an uncertainty budget proposed by [4]. For the selected cell \(k^*\), we formulate the adversarial problem (23), which finds the uncertainty realization with the highest cost for the optimistic solution given by the lower-bound problem. The intent is to select the dimension in which the conditional mean is least representative of the conditional distribution adapted to that cell. We impose a unitary budget constraint, i.e., only one component of the random vector is allowed to deviate from its nominal value (the conditional mean). Hence, we select the uncertainty direction \(i^* \) such that \({\hat{\xi }}_{i^*}\not = {\overline{\xi }}_{i^*}^k\), where

$$\begin{aligned} \hat{\varvec{\xi }}\in \, \underset{\varvec{\xi },\, {\mathbf {z}}}{\mathop {{\mathrm{arg\,max}}}\limits }&\quad Q({\mathbf {x}}_n^L,\varvec{\xi })\nonumber \\ \text {s.t.}&\quad {\overline{\xi }}_i^k -z_i({\overline{\xi }}_i^k - a_i^k ) \le \xi _i\quad \forall i,\nonumber \\&\quad \xi _i\le {\overline{\xi }}_i^k + z_i (b_i^k - {\overline{\xi }}_i^k)\quad \forall i,\nonumber \\&\quad \sum _{i=1}^{d_{\xi }} z_i \le 1,\nonumber \\&\quad 0 \le z_i \le 1 \quad \forall i. \end{aligned}$$
(23)

Since (23) is a maximization problem with a convex objective function over a polyhedral feasible set, there is a vertex that is optimal. By construction, the number of vertices of the feasible polyhedron is \(2\,d_\xi \), i.e., it grows linearly with the uncertainty dimension. Thus, to solve (23) efficiently, it suffices to enumerate the vertices and select the one with the highest recourse cost.
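A sketch of this enumeration, with a toy function standing in for \(Q({\mathbf {x}}_n^L,\cdot )\) (the function name and data are ours, for illustration):

```julia
# Sketch: solve (23) by enumerating the 2·dξ vertices of the budget polytope,
# i.e., the conditional mean with one coordinate pushed to a_i or b_i.
recourse(ξ) = max(ξ[1] - 1.0, 0.0) + 2.0 * max(ξ[2] - 1.0, 0.0)  # toy placeholder

function split_direction(a, b, ξ_bar)
    best_val, best_i = -Inf, 0
    for i in eachindex(ξ_bar), endpoint in (a[i], b[i])
        ξ = copy(ξ_bar)
        ξ[i] = endpoint                  # a vertex of the "diamond" (see Fig. 3)
        if recourse(ξ) > best_val
            best_val, best_i = recourse(ξ), i
        end
    end
    return best_i                        # direction i* along which to split
end

split_direction([0.0, 0.0], [2.0, 4.0], [1.0, 2.0])   # → 2 for this toy recourse
```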

Note also that the vertices are intuitive and easily identifiable, since they are defined as the conditional mean \(\overline{\varvec{\xi }}^k\) projected onto the faces of the hypercube \(\Xi ^k\). To illustrate this concept, let us consider \(d_{\xi }=3\) (see Fig. 3) and a given optimistic solution \({\mathbf {x}}_n^L\). A unitary uncertainty budget leads to a “diamond”-shaped polyhedron inscribed in the hypercube \(\Xi ^k\). In other words, the vertices of the feasible set differ from the center (the conditional mean \(\overline{\varvec{\xi }}^k\)) in just one component i, which can assume either extreme value in \(\{a_i^k,b_i^k\}\). Therefore, the number of vertices is \(2\,d_\xi \).

Fig. 3 Ambiguity set representing the feasible region of problem (23) for the case \(d_{\xi }=3\)

The third and last step of the partition refinement procedure is the selection of the cutting point. For simplicity and computational efficiency, we take the cutting point to be the component \({\overline{\xi }}_{i^*}\) of the conditional mean along the uncertainty direction \(i^*\), as in [5, 16].
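The resulting split is straightforward; a sketch, assuming cells stored as bound-vector pairs \(({\mathbf {a}}^k,{\mathbf {b}}^k)\) (the function name is ours):

```julia
# Sketch of the split: cell k* is divided along direction i* at the
# conditional-mean coordinate, producing two child cells.
function split_cell(a, b, ξ_bar, i)
    b_left  = copy(b); b_left[i]  = ξ_bar[i]   # left child:  b_i ↦ ξ̄_i
    a_right = copy(a); a_right[i] = ξ_bar[i]   # right child: a_i ↦ ξ̄_i
    return (a, b_left), (a_right, b)
end

split_cell([0.0, 0.0], [2.0, 4.0], [1.0, 2.0], 2)
# → (([0.0, 0.0], [2.0, 2.0]), ([0.0, 2.0], [2.0, 4.0]))
```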

Finally, we summarize the proposed partition refinement procedure, namely the solution algorithm with worst-case partition refinement (SAWPR), as follows:

[Algorithm 4: Solution algorithm with worst-case partition refinement (SAWPR)]
Fig. 4 Partition of the support with: a 4 cells, b 20 cells, c 100 cells, d 300 cells, e 500 cells, and f 900 cells. The shaded region shows the chosen cell and the red edges show the direction chosen to refine the partition

In Fig. 4, we present the refinement procedure for the farmer’s problem instance with three types of crops as an illustrative example of the SAWPR algorithm. Note that the cells of the partition cluster around the region of the uncertainty support corresponding to scenarios of low land productivity, which represents the worst case for this instance.

5 Empirical study

In this section, we present an empirical study to test the proposed CPR method. As mentioned above (see Sect. 2.1), the partition refinement method depends on the computation of the probability mass \(p^k\) and the conditional mean \(\overline{\varvec{\xi }}^k\) of each cell k, \(k=1,\ldots ,n\). For a general probability distribution, this computation is not easy. We base our empirical study on the uniform probability distribution, since in this case we have an analytic solution for the conditional probability and expectation. However, to cover a more general class of distributions, we provide a high-level guideline on how to deal with distributions without analytic solutions for the conditional probabilities and expectations.

5.1 Implementation remark

We argue that it is possible to efficiently compute the probability mass \(p^k\) and the conditional expectation \(\overline{\varvec{\xi }}^k\) of each cell \(k=1,\ldots ,n\) in the following cases:

  • Independent random variables: Suppose that the coordinates of the random vector \(\varvec{\xi }\) are continuous independent random variables. Let \(\varphi \) be the probability density function of the random vector \(\varvec{\xi }\). Then, by independence, \(\varphi \) can be written as the product of the marginal densities \(\varphi _i\) of the coordinates \(\xi _i\), \(i=1,\ldots ,d_\xi \). Thereby,

    $$\begin{aligned} p^k = P(\Xi ^k)= \int _{\Xi ^k}\varphi (\varvec{\xi })d\varvec{\xi } =\int _{a_1}^{b_1}\varphi _1(\xi _1)d\xi _1\cdots \int _{a_{d_\xi }}^{b_{d_\xi }}\varphi _{d_\xi }(\xi _{d_\xi })d\xi _{d_\xi }, \end{aligned}$$

    and

    $$\begin{aligned} \begin{aligned} \overline{\varvec{\xi }}^k&= {\mathbb {E}}[\varvec{\xi }\, | \, \varvec{\xi }\in \Xi ^k] = \int _{\Xi }\varvec{\xi }\varphi (\varvec{\xi } \, | \, \varvec{\xi }\in \Xi ^k)d\varvec{\xi }\\&=\int _{\Xi ^k} \varvec{\xi }\left[ \frac{\varphi (\varvec{\xi })}{p^k}\right] d\varvec{\xi } = \frac{1}{p^k} \int _{a_1}^{b_1}\xi _1\varphi _1(\xi _1)d\xi _1\cdots \int _{a_{d_\xi }}^{b_{d_\xi }}\xi _{d_\xi }\varphi _{d_\xi }(\xi _{d_\xi })d\xi _{d_\xi }, \end{aligned} \end{aligned}$$

    i.e., the calculation of the probability mass \(p^k\) and the conditional expectation \(\overline{\varvec{\xi }}^k\) reduces to the computation of a sequence of univariate numerical integrals, which is computationally tractable (see [20, 22]).

  • Linearly dependent random variables: Suppose that \(\varvec{\xi }\) can be expressed as \(\varvec{\xi } = {\mathbf {L}}\varvec{\zeta }\), where \({\mathbf {L}}\in {\mathbb {R}}^{d_\xi \times d_\xi }\) is a non-singular matrix and \(\varvec{\zeta }\in {\mathbb {R}}^{d_\xi }\) is a random vector with independent coordinates. Then, we can calculate the probability mass \(p^k\) and the conditional mean \(\overline{\varvec{\xi }}^k\) as in the previous case. To see this, let us denote by \(\Xi ^k = [{\mathbf {a}}^k,{\mathbf {b}}^k]\) the hypercube of cell k, \(k=1,\ldots ,n\), where \({\mathbf {a}}^k = [a_1^k, \ldots ,a_{d_\xi }^k]\) and \({\mathbf {b}}^k = [b_1^k,\ldots ,b_{d_\xi }^k]\). Then,

    $$\begin{aligned} p^k= & {} P(\Xi ^k) = P({\mathbf {a}}^k\le \varvec{\xi }\le {\mathbf {b}}^k) = P({\mathbf {a}}^k\le {\mathbf {L}}\varvec{\zeta }\le {\mathbf {b}}^k) = P({\mathbf {L}}^{-1}{\mathbf {a}}^k\le \varvec{\zeta }\le {\mathbf {L}}^{-1}{\mathbf {b}}^k) \\= & {} P(\varvec{\zeta }\in {\widetilde{\Xi }}^k), \end{aligned}$$

    where \({\widetilde{\Xi }}^k\) is the hypercube defined by \({\widetilde{\Xi }}^k =[{\mathbf {L}}^{-1}{\mathbf {a}}^k,{\mathbf {L}}^{-1}{\mathbf {b}}^k]\), and

    $$\begin{aligned} \overline{\varvec{\xi }}^k = {\mathbb {E}}[\varvec{\xi }\, | \, \varvec{\xi }\in \Xi ^k] = {\mathbb {E}}[{\mathbf {L}}\varvec{\zeta }\, | \, {\mathbf {L}}\varvec{\zeta }\in \Xi ^k] = {\mathbf {L}}{\mathbb {E}}[\varvec{\zeta }\, | \, \varvec{\zeta }\in {\widetilde{\Xi }}^k]. \end{aligned}$$

    Therefore, the computation of the conditional probability \(P({\widetilde{\Xi }}^k)\) and the conditional expectation \({\mathbb {E}}[\varvec{\zeta }\, | \, \varvec{\zeta }\in {\widetilde{\Xi }}^k]\) reduces to the independent case. As an example, suppose that \(\varvec{\xi }\) is a multivariate normal vector. Let \(\Sigma \in {\mathbb {R}}^{d_\xi \times d_\xi }\) be the covariance matrix of \(\varvec{\xi }\), and \(\Sigma = {\mathbf {L}}{\mathbf {L}}^\top \) its Cholesky decomposition. Then, there exists a random vector \(\varvec{\zeta }\) with independent coordinates such that \(\varvec{\xi } = {\mathbf {L}}\varvec{\zeta }\).

  • Generally dependent random variables: In this case, we can compute the probability mass \(p^k\) and the conditional mean \(\overline{\varvec{\xi }}^k\) by the importance sampling method (see [2]) via numerical simulation, as sketched in the code after this list. Let us denote by

    $$\begin{aligned} U(\varvec{\xi };[{\mathbf {a}}^k,{\mathbf {b}}^k]) = {\left\{ \begin{array}{ll} \prod _{i=1}^{d_\xi } \frac{1}{b_i^k - a_i^k},&{} \varvec{\xi }\in [{\mathbf {a}}^k,{\mathbf {b}}^k], \\ 0, &{} \text {otherwise}, \end{array}\right. } \end{aligned}$$

    the probability density function of the multivariate uniform distribution with support \(\Xi ^k = [{\mathbf {a}}^k,{\mathbf {b}}^k]\). Thereby, we can consider the change of probability measure

    $$\begin{aligned} p^k = \int _{\Xi ^k}\varphi (\varvec{\xi })d\varvec{\xi } = \int _{\Xi ^k}\left[ \frac{\varphi (\varvec{\xi })}{U(\varvec{\xi }; [{\mathbf {a}}^k,{\mathbf {b}}^k])}\right] U(\varvec{\xi };[{\mathbf {a}}^k, {\mathbf {b}}^k])d\varvec{\xi }, \end{aligned}$$

    and

    $$\begin{aligned} \overline{\varvec{\xi }}^k= & {} \int _{\Xi }\varvec{\xi }\varphi (\varvec{\xi } \, | \, \varvec{\xi }\in \Xi ^k)d\varvec{\xi } = \int _{\Xi ^k} \varvec{\xi }\left[ \frac{\varphi (\varvec{\xi })}{p^k}\right] d\varvec{\xi } \\= & {} \int _{\Xi ^k} \varvec{\xi }\left[ \frac{\varphi (\varvec{\xi })}{p^k U(\varvec{\xi };[{\mathbf {a}}^k,{\mathbf {b}}^k])}\right] U(\varvec{\xi };[{\mathbf {a}}^k,{\mathbf {b}}^k]) d\varvec{\xi }. \end{aligned}$$

    Therefore, by the law of large numbers,

    $$\begin{aligned} \frac{1}{S}\sum _{s=1}^S \left[ \frac{\varphi (\hat{\varvec{\xi }}_s)}{U(\hat{\varvec{\xi }}_s;[{\mathbf {a}}^k, {\mathbf {b}}^k])}\right] \overset{P}{\longrightarrow } p^k, \end{aligned}$$

    and

    $$\begin{aligned} \frac{1}{S}\sum _{s=1}^S \hat{\varvec{\xi }}_s\left[ \frac{\varphi (\hat{\varvec{\xi }}_s)}{p^k U(\hat{\varvec{\xi }}_s;[{\mathbf {a}}^k, {\mathbf {b}}^k])}\right] \overset{P}{\longrightarrow } \overline{\varvec{\xi }}^k, \end{aligned}$$

    where \(\hat{\varvec{\xi }}_s\), \(s = 1, \ldots , S\), are independent random samples from the uniform distribution with density \(U(\varvec{\xi };[{\mathbf {a}}^k,{\mathbf {b}}^k])\). Based on standard statistical results, for a given error precision and confidence level, one can define a sample size S and compute the above estimates, to be used in our algorithms as proxies for the conditional probability \(p^k\) and expectation \(\overline{\varvec{\xi }}^k\).
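As a minimal sketch of the importance-sampling estimators above (referenced in the last bullet), assuming a user-supplied joint density \(\varphi \) and uniform sampling on the cell; the function name and example are ours, for illustration:

```julia
# Sketch of the importance-sampling estimators: draw S uniform samples on the
# cell [a, b] and reweight by the density ratio φ(ξ) / U(ξ; [a, b]).
function cell_moments(φ, a, b; S = 100_000)
    vol = prod(b .- a)                     # 1 / U(ξ; [a, b]) on the cell
    p̂, m̂ = 0.0, zeros(length(a))
    for _ in 1:S
        ξ = a .+ rand(length(a)) .* (b .- a)   # uniform draw on the cell
        w = φ(ξ) * vol                     # density ratio φ(ξ) / U(ξ; [a, b])
        p̂ += w
        m̂ .+= w .* ξ
    end
    p̂ /= S
    return p̂, m̂ ./ (S * p̂)                 # estimates of p^k and ξ̄^k
end

# Example: φ uniform on [0,1]², cell [0, 0.5] × [0, 0.5]
φ(ξ) = all(0 .<= ξ .<= 1) ? 1.0 : 0.0
cell_moments(φ, [0.0, 0.0], [0.5, 0.5])    # ≈ (0.25, [0.25, 0.25])
```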

5.2 Computational experiments

We test the CPR method on a case study for each uncertainty dimensionality category: low, medium, and high. We assess the aircraft allocation problem [7] (see Appendix D) via the deterministic equivalent reformulation of the upper-bound problem. For medium-sized uncertainty dimensionality, we assess an instance of the farmer’s problem [5] (see Appendix E) with eight cultivated crops. For high uncertainty dimensionality, we assess an instance of the farmer’s problem with twenty cultivated crops and an instance of the stochastic unit commitment (SUC) problem for a small single-bus system with five generators over a 24-hour time span (see Appendix F). The medium-sized uncertainty instance (8 crops) is assessed with the column and constraint generation method, whereas the high uncertainty dimensionality instances (20 crops and SUC) are assessed with the proposed simplex-based heuristic method based on the extension of the uncertainty support. This empirical study is designed to evidence the disappointment generated by both conservative and optimistic solutions and to raise awareness of the negative impact of optimistic solution methods for two-stage stochastic optimization problems; it does not intend to compare the proposed approach with other solution methodologies.

In the first application, different types of aircraft must be allocated to certain routes for the transportation of passengers. The number of allocated aircraft is the first-stage decision, and the recourse of the problem is defined by the number of bumped passengers when the demand for seats outstrips capacity. The right-hand-side uncertainty corresponds to the unknown passenger demand, modeled by a uniform probability distribution. For this problem, the dimension of the uncertainty is 5, the number of first-stage variables is 21, and the number of second-stage variables is 10 for each uncertainty realization.

Regarding the second application, the farmer’s problem is an example of a production model under uncertainty, where the first-stage decisions correspond to the land allocated to raising different types of crops, and the recourse consists in trading the cultivated products in the local market to satisfy a given demand. In this problem, we assume a uniform probability distribution to model the uncertainty in the land productivity for each crop. The number of first-stage variables is equal to the uncertainty dimension, since they correspond to the land allocated to each type of crop, while the number of second-stage variables for each uncertainty realization is twice the number of cultivated products, because they correspond to the quantities of each crop sold and bought in the local market.

To generate different instances of the farmer’s problem, we vary the uncertainty dimension, considering eight and twenty types of crops, to create a computational experiment for each of the following situations:

  • It is impossible to solve the deterministic equivalent linear model in practical time for the enumerative case considering the \(2^{d_{\xi }}\) vertices of the hypercubes of the rectangular partition, but it is possible to handle the medium-sized uncertainty dimensionality with the proposed column and constraint generation algorithm.

  • It is impossible to obtain a conservative solution by solving the reformulated upper-bound problem with the column and constraint generation algorithm, and it is necessary to resort to a heuristic solution method.

For both applications, the refinement algorithm and the sequential partition-based method are implemented in JuMP [8], a modeling language for mathematical optimization embedded in the Julia programming language, and Gurobi is used as the linear optimization solver. The computational experiments were run on an Intel Core i7, 4.0-GHz processor with 32 GB of RAM. Moreover, we consider a uniform probability distribution to alleviate the computational burden of the CPR method in the computation of the probability mass \(p^k\) and the conditional mean \(\overline{\varvec{\xi }}^k\) of each cell k, \(k=1,\ldots ,n\). For a general probability distribution, this computation can be performed via numerical simulation (see Sect. 5.1).

5.3 Results

In general, the optimal solutions of the upper-bound problem, obtained by both the CPR method solved by the column and constraint generation algorithm and the simplex-based heuristic method of the extension of the uncertainty support, represent a conservative decision policy. Indeed, these solutions are obtained by approximating the conditional expectation \({\mathbb {E}}[Q({\mathbf {x}},\varvec{\xi })|\varvec{\xi }\in \xi ^k]\) by the worst-case distributional expectations

$$\begin{aligned} \underset{\varvec{\delta }\in {\mathcal {D}}(\overline{\varvec{\xi }}^k)}{\text {Max}}\,{\mathbb {E} }^{\varvec{\delta }}\,[Q({\mathbf {x}},{\mathbf {e}}^k)] \end{aligned}$$

and

$$\begin{aligned} \underset{\varvec{\delta }\in {\mathcal {R}}(\overline{\varvec{\xi }}^k)}{\text {Max}} \,{\mathbb {E}}^{\varvec{\delta }}\,[Q({\mathbf {x}},{\mathbf {v}}^k)], \end{aligned}$$

considering the marginal distributions of the random vectors \({\mathbf {e}}^k\) and \({\mathbf {v}}^k\) with support in the sets \(\{{\mathbf {e}}_j^k:j=1,\ldots ,2^{d_{\xi }}\}\) and \(\{{\mathbf {v}}_j^k:j=1,\ldots ,d_{\xi }+1\}\), respectively, for all \(k=1,\ldots ,n\).
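For intuition, the following is a minimal sketch of the classical product-form Edmundson–Madanski measure on a rectangular cell, which is one feasible point of the inner maximization over \({\mathcal {D}}(\overline{\varvec{\xi }}^k)\) (and the classical upper bound when the components are independent, as in our uniform setting), not necessarily the full worst case; `Q`, `lo`, `hi`, and `mean` are placeholders:

```julia
# Product-form Edmundson–Madanski bound on a cell [lo, hi] with
# conditional mean `mean` (independent components assumed). Enumerating
# the 2^d vertices is only practical for small d_ξ, which is precisely
# why the column and constraint generation and simplex-based schemes
# are needed for higher dimensions.
function em_upper_bound(Q, x, lo, hi, mean)
    d = length(lo)
    λ = (mean .- lo) ./ (hi .- lo)       # per-dimension weight of the upper face
    ub = 0.0
    for bits in 0:(2^d - 1)              # one bit pattern per vertex
        v = zeros(d)
        w = 1.0
        for i in 1:d
            up = ((bits >> (i - 1)) & 1) == 1
            v[i] = up ? hi[i] : lo[i]
            w *= up ? λ[i] : 1 - λ[i]    # product weights reproduce the mean
        end
        ub += w * Q(x, v)
    end
    return ub
end
```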

Fig. 5 Disappointment in an out-of-sample analysis for the aircraft allocation problem: (a) low recourse cost; (b) high recourse cost

As mentioned above, the existing partition-based methods do not solve the upper-bound problem; instead, they determine the optimistic solution \({\mathbf {x}}_n^L\) given by the lower-bound problem. Nevertheless, depending on the recourse cost, it is sometimes necessary to obtain a conservative solution because the optimistic solution generates a significant disappointment. To study the disappointment of given conservative and optimistic solutions in an out-of-sample analysis, we performed a Monte Carlo simulation with a number N of scenarios to estimate the actual cost by the sample-average cost [3].
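A minimal sketch of this out-of-sample evaluation, where `total_cost(x, ξ)` (a hypothetical placeholder) returns the first-stage plus recourse cost of a fixed decision `x` under scenario `ξ`, and `sample_ξ()` draws one scenario:

```julia
using Statistics

# Estimate the out-of-sample cost of a fixed first-stage decision x by
# Monte Carlo and report the disappointment relative to the model's
# cost estimate (the optimal value of the bounding problem).
function out_of_sample(x, model_estimate, sample_ξ, total_cost; N = 10_000)
    costs = [total_cost(x, sample_ξ()) for _ in 1:N]
    avg = mean(costs)
    half = 1.96 * std(costs) / sqrt(N)      # 95% confidence interval half-width
    disappointment = avg - model_estimate   # positive: costs exceed the estimate
    return (; avg, ci = (avg - half, avg + half), disappointment)
end
```

For a conservative solution, the estimated disappointment should be statistically indistinguishable from zero or negative, which is what the figures below corroborate.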

For the aircraft allocation problem, we consider two instances: (a) low recourse cost and (b) high recourse cost. For the low recourse cost instance, i.e., a low cost for the second-stage variables, we consider a negligible cost for unattended demand. For the high recourse cost instance, we consider a significantly high deficit cost associated with unattended demand. Given these two instances, we depict in Fig. 5 the disappointment by confronting the out-of-sample cost with the cost estimate given by the objective value of the upper- (conservative) and lower- (optimistic) bound problems. For didactic purposes, the results associated with the optimistic solution are presented in red, while the results for the conservative solution are presented in blue. The horizontal axis of the figure reports the computational time to obtain the conservative and optimistic solutions for a given partition size. The cost estimated by the optimal value of the upper- and lower-bound problems (optimal objective value) is represented by the dashed line. The solid line corresponds to the out-of-sample cost evaluation, and the shaded area corresponds to the associated 95% confidence interval. It is worth clarifying that the computational time reported on the horizontal axis does not include the time to compute the out-of-sample cost.

For the low recourse cost instance (Fig. 5a), we observe that both the conservative and optimistic solutions have similar out-of-sample performances. However, for the high recourse cost instance (Fig. 5b), the conservative solution outperforms the optimistic one for any partition size. Note also that, for both instances, the optimistic solution leads to significant disappointment: the out-of-sample cost evaluation is significantly higher than the cost estimated by the optimal value of the lower-bound problem. On the other hand, the conservative solution has a mathematical guarantee against disappointment, which is corroborated by this empirical study: there is no statistically significant disappointment. This means that optimistic solution methods might not be suitable for applications with high recourse costs due to poor out-of-sample performance and potentially high disappointment levels.

Fig. 6 Disappointment in an out-of-sample analysis of the conservative solution and optimistic solution of the farmer's problem with eight crops

In Fig. 6, we present the numerical results for the farmer's problem with eight crops, considering a high recourse cost instance. The out-of-sample evaluation of the conservative solution shows behavior very similar to the optimal objective value of the upper-bound problem. Conversely, the out-of-sample cost evaluation for the optimistic solution is significantly higher than the optimal objective value, i.e., it shows a high disappointment level. Moreover, the conservative solution significantly outperforms the optimistic one in an out-of-sample analysis. It is important to note that the aforementioned effects are amplified whenever the algorithm is far from convergence.

This result warns against the use of the optimistic solution for two-stage stochastic linear programming problems when the recourse cost is significantly high. If the method stops before convergence, due to time or computational-power limitations, the optimistic solution is not reliable, given its poor out-of-sample performance and high disappointment level. On the other hand, the conservative decision is robust even for non-converged solutions.

Fig. 7 Upper and lower bounds obtained by the simplex-based heuristic method of the extension of the uncertainty support for (a) the farmer's problem with twenty crops; (b) the stochastic unit commitment problem for a single-bus system over a 24-hour time span

Finally, Fig. 7a presents the numerical results for the farmer's problem with twenty crops, and Fig. 7b reports the results for the instance of the SUC problem, both obtained by the simplex-based heuristic method for a highly asymmetric recourse cost, meaning that corrective actions are much more expensive in one direction: unattended demand is more expensive than over-production (farmer's problem) or over-generation (SUC problem). As with the upper-bound problem (16), solved by the column and constraint generation algorithm, the solution given by the simplex-based heuristic method (21) also avoids disappointment and significantly outperforms the optimistic solution. As before, the non-converged optimistic solution presented a significant disappointment and poor out-of-sample performance.

Table 1 displays the uncertainty dimension \(d_\xi \), the size (number of variables and constraints) of the upper-bound problem, and the computational time to solve the CPR method by applying the corresponding numerical scheme to each instance.

Table 1 Computational time of the proposed algorithmic schemes for each instance

6 Conclusions

We developed a conservative convergent solution method for two-stage stochastic linear optimization problems whose recourse is a convex function of the uncertainty, based on a distributionally robust optimization model that generalizes the Edmundson–Madanski inequality. Considering that the complexity grows exponentially with the uncertainty dimension, we reformulated the upper-bound problem for computational tractability and proposed three algorithmic schemes: (i) for problems with low-dimensional uncertainty, we developed a deterministic equivalent linear programming model; (ii) for medium-sized uncertainty dimensionality, we proposed a column and constraint generation algorithm; and (iii) to handle high-dimensional uncertainty, we proposed a simplex-based heuristic method whose complexity grows linearly with the uncertainty dimension.

Out-of-sample computational experiments showed that our method avoids disappointment in comparison to the non-converged sub-optimal optimistic solution given by the lower-bound problem based on Jensen's inequality when the recourse cost is high. This raises awareness regarding the use of the optimistic solution provided by the partition-based method to solve two-stage stochastic optimization problems when the recourse cost is significantly high. Many practical applications that exhibit this type of recourse cost and depend on the implementation of a non-converged sub-optimal solution because of computational time limitations will benefit from the developed framework to obtain a conservative solution.