1 Introduction

Bilevel programming problems describe a hierarchical interaction between two subjects. The subject making a decision first is called the leader; the second subject is called the follower. Their decisions are solutions to the upper and lower level problems, respectively. The parameters of the lower level problem depend on the leader's strategy, and the leader takes the follower's optimal solution into account when selecting the upper level strategy. The theory of bilevel problems is described in the monographs [1,2,3] and in the review [4].

In this paper, we study stochastic bilevel programming problems with probabilistic and quantile criteria. These criteria are used in stochastic models to take reliability requirements into account [5]. The probabilistic criterion is defined as the probability of successful operation of the modeled system. The quantile criterion (also known as Value-at-Risk) is the minimal level of losses that cannot be exceeded with a given probability. There are only a few works on the probabilistic and quantile criteria in stochastic bilevel optimization. The probabilistic criterion for a linear problem with fuzzy random data is studied in [6], where an algorithm to solve the problem is suggested. A stochastic bilevel problem with quantile criterion and a discrete distribution of the random parameters is studied in [7], where a method to reduce this problem to a mixed integer programming problem is suggested. Another approach to dealing with reliability requirements in stochastic programming is the use of coherent risk measures [8]. Properties of stochastic bilevel problems with criteria involving coherent risk measures, as well as optimality conditions for these problems, are studied in [9, 10].

When the exact distribution of the random parameters is unknown, the objective function and the constraints of a stochastic problem can be estimated by using a sample. Thus, the original problems are replaced by their sample approximations. The properties of the obtained sample approximations are studied in [11] for the expectation criterion and in [8, 12] for problems with probabilistic constraints. In [13], the convergence of this method is studied for problems with probabilistic and quantile criteria.

In this paper, we study sample approximations of bilevel stochastic programming problems with probabilistic and quantile criteria. We reduce the sample approximations to mixed integer programming problems and give sufficient conditions for their convergence. Also, we describe conditions guaranteeing that the loss function of the problem is a normal integrand. These conditions are required to formulate results on the existence of an optimal solution and on the convergence of the sample approximations.

2 Problem Statement

Let X be a random vector defined on a probability space \((\mathcal X, \mathcal F, \mathbf{P})\), where \(\mathcal X\) is a closed subset of \(\mathbb R^m\). The \(\sigma \)-algebra \(\mathcal F\) is assumed to be complete, i.e., \(S'\in \mathcal F\) if there exists a set \(S\in \mathcal F\) such that \(S'\subset S \) and \(\mathbf{P}(S) = 0\). For simplicity, we assume that \(X(x) = x\) for all \( x\in \mathcal X\). This means that the sample space \(\mathcal X\) is considered as the space of realizations of the random vector X.

Let \(U \subset \mathbb R^r\) be the set of feasible values of the leader's variables. The follower's problem is defined by the linear programming problem

$$\begin{aligned} \mathcal Y^*(u, x) := {{\,\mathrm{Arg}\,}}\min _{y\in \mathbb R^s}\left\{ c(u, x)^\top y \mid y\in \mathcal Y(u, x) \right\} ,\end{aligned}$$
(1)
$$\begin{aligned} \mathcal Y(u, x) := \left\{ y\in \mathbb R^s \mid A(u, x) y \ge b(u, x), \, y \ge 0\right\} . \end{aligned}$$
(2)

where \(u \in U\) is the leader’s variable, y is the follower’s variable, \(A :U\times \mathcal X\rightarrow \mathbb R^{k\times s}\), \(b :U\times \mathcal X\rightarrow \mathbb R^{k}\), \(c :U \times \mathcal X \rightarrow \mathbb R^s\) are a matrix and vectors depending on u and x. Thus, the leader’s variable u and the realization x of the random vector X define the constraints and the objective function of the follower’s problem. The set-valued mappings \(\mathcal Y^*, \mathcal Y:U\times \mathcal X\rightarrow 2^{\mathbb R^s}\) are introduced by (1) and (2). We note that \(\mathcal Y^*(u, x) = \emptyset \) if problem (1) is infeasible or unbounded.

Let \(\varPsi :U\times Y \times \mathcal X \rightarrow \mathbb R^*\) be the leader's loss function, where Y is the closure of the set

$$\begin{aligned} \bigcup _{u \in U,\, x\in \mathcal X} \mathcal Y(u, x), \end{aligned}$$

and \(\mathbb R^* = \mathbb R \cup \{-\infty , +\infty \}\) is the extended real line.

To describe the losses of the leader when the follower chooses an optimal variable \(y \in \mathcal Y^*(u, x)\), we introduce the function \(\varPhi :U\times \mathcal X\rightarrow \mathbb R^*\) by the rule

$$\begin{aligned} \varPhi (u, x) := \inf _{y\in \mathcal Y^*(u, x)} \varPsi (u, y, x). \end{aligned}$$
(3)

The infimum in (3) means that the follower chooses the best decision for the leader among the optimal decisions \(y\in \mathcal Y^*(u, x)\). Thus, the optimistic statement of the bilevel problem formulated below will be studied. We call \(\varPhi \) the optimistic leader’s loss function.
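
To make definition (3) concrete, suppose the leader's loss is linear in y, say \(\varPsi (u, y, x) = d^\top y\). Then \(\varPhi (u, x)\) can be computed by two linear programs: first solve the follower's problem (1), then minimize the leader's loss over the follower's optimal set by fixing the attained follower's objective value. Below is a minimal sketch with scipy for one fixed pair (u, x); the data and the vector d are hypothetical:

```python
import numpy as np
from scipy.optimize import linprog

def optimistic_loss(A, b, c, d):
    """Phi(u, x) of (3) for a leader's loss Psi = d^T y (fixed u and x).

    A, b, c encode the follower's LP (1)-(2): min c^T y s.t. A y >= b, y >= 0.
    linprog uses <= constraints, so the signs of A and b are flipped below.
    """
    # Step 1: solve the follower's problem (1).
    res = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * len(c))
    if not res.success:               # infeasible or unbounded: Y*(u, x) is empty
        return np.inf
    # Step 2: minimize d^T y over the follower's optimal set, i.e. over
    # Y(u, x) intersected with {y : c^T y = optimal value}.  In practice a
    # small tolerance could be added to this equality.
    res2 = linprog(d, A_ub=-A, b_ub=-b, A_eq=np.atleast_2d(c), b_eq=[res.fun],
                   bounds=[(0, None)] * len(c))
    return res2.fun if res2.success else np.inf

A = np.array([[1.0, 1.0], [2.0, 1.0]])
b = np.array([2.0, 3.0])
c = np.array([1.0, 1.0])      # follower's objective c(u, x)
d = np.array([0.0, -1.0])     # leader prefers large y_2 among the optimal y
print(optimistic_loss(A, b, c, d))   # -1.0: the follower-optimal point (1, 1)
```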

Let us consider the probability function

$$\begin{aligned} P_\varphi (u) := \mathbf{P}\{\varPhi (u, X) \le \varphi \}, \end{aligned}$$
(4)

where \(\varphi \in \mathbb R^*\) is a fixed level of the optimistic leader’s loss function \(\varPhi \). The value \(P_\varphi (u)\) in (4) is well defined when the function \(x \mapsto \varPhi (u, x)\) is measurable. Sufficient conditions for this will be suggested below.

The quantile function is defined by the equality

$$\begin{aligned} \varphi _\alpha (u) := \min \left\{ \varphi \in \mathbb R^*\mid P_\varphi (u) \ge \alpha \right\} , \end{aligned}$$
(5)

where \(\alpha \in (0, 1]\) is a fixed probability level.
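
For a quick illustration of definitions (4) and (5), suppose that, for a fixed u, the loss \(\varPhi (u, X)\) takes the values 1, 2, 3 with probabilities 0.5, 0.3, 0.2 (hypothetical numbers). Then

$$\begin{aligned} P_2(u) = \mathbf{P}\{\varPhi (u, X) \le 2\} = 0.8, \qquad \varphi _{0.9}(u) = \min \left\{ \varphi \mid P_\varphi (u) \ge 0.9\right\} = 3, \end{aligned}$$

i.e., the quantile criterion reports the smallest loss level that is not exceeded with probability at least \(\alpha = 0.9\).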

In this paper, we study the probability maximization problem

$$\begin{aligned} U^* := {{\,\mathrm{Arg}\,}}\max _{u\in U} P_\varphi (u),\quad \alpha ^* := \sup _{u\in U}P_\varphi (u), \end{aligned}$$
(6)

and the quantile minimization problem

$$\begin{aligned} V^* := {{\,\mathrm{Arg}\,}}\min _{u\in U} \varphi _\alpha (u),\quad \varphi ^*:=\inf _{u\in U} \varphi _\alpha (u). \end{aligned}$$
(7)

Problems (6) and (7) are optimistic bilevel stochastic programming problems with probabilistic and quantile criteria respectively.

3 Existence of Optimal Solution

It is known [5, 14] that problems (6) and (7) are well defined and have optimal solutions if the function \((u, x) \mapsto \varPhi (u, x)\) is a normal integrand. When the \(\sigma \)-algebra \(\mathcal F\) is complete, a normal integrand can be defined as a function that is lower semicontinuous in \(u\in U\) and \(\mathcal B(U)\times \mathcal F\)-measurable, where \(\mathcal B(U)\) is the Borel \(\sigma \)-algebra of subsets of U. In this section, we suggest conditions under which the function \((u, x) \mapsto \varPhi (u, x)\) is a normal integrand.

Let us consider the follower’s problem (1). According to duality theory, the follower’s variable \(y\in \mathbb R^s\) is optimal in (1) if \(y \in \mathcal Y(u, x)\) and there exists a vector \(\lambda \in \mathbb R^k\) such that

$$\begin{aligned} A(u, x)^\top \lambda \le c(u, x), \, \lambda \ge 0,\end{aligned}$$
(8)
$$\begin{aligned} \left( A(u, x) y - b(u, x)\right) ^\top \lambda = 0,\end{aligned}$$
(9)
$$\begin{aligned} \left( A(u, x)^\top \lambda - c(u, x)\right) ^\top y = 0. \end{aligned}$$
(10)

Denote by \(\varLambda (u, y, x)\) the set of \(\lambda \in \mathbb R^k\) satisfying (8)–(10). Let us introduce the function

$$\begin{aligned} \delta _\varLambda (u, y, \lambda , x) = {\left\{ \begin{array}{ll} 0 &{}\text { if }y \in \mathcal Y(u, x) \text { and }\lambda \in \varLambda (u, y, x),\\ +\infty &{}\text { otherwise}. \end{array}\right. } \end{aligned}$$

Then the optimistic loss function \(\varPhi \) can be represented in the form

$$\begin{aligned} \varPhi (u, x) = \min _{y\in Y}\min _{\lambda \in \varLambda (u, y, x)}\left\{ \varPsi (u, y, x) + \delta _\varLambda (u, y, \lambda , x) \right\} . \end{aligned}$$
(11)

We use the convention \(-\infty + \infty = +\infty \) in (11) and below.

Let us denote by \(Y^*\) the closure of the set \(\bigcup _{u\in U,\, x\in \mathcal X} \mathcal Y^*(u, x)\).

Theorem 1

Let the following conditions hold:

  1. (i)

    the function \((u, y, x) \mapsto \varPsi (u, y, x)\) is lower semicontinuous in \((u, y) \in U\times Y\) and \(\mathcal B(U) \times \mathcal B(Y) \times \mathcal F\)-measurable;

  2. (ii)

    the functions \((u, x) \mapsto A(u, x)\), \((u, x) \mapsto b(u, x)\), \((u, x) \mapsto c(u, x)\) are continuous in \(u\in U\) and measurable in x;

  3. (iii)

    the set \(Y^*\) is bounded;

  4. (iv)

    there exists a compact set \(\varLambda ^*\) such that, for all \(u\in U\), \(y\in Y^*\), \(x\in \mathcal X\), \(\varLambda ^* \cap \varLambda (u, y, x) \ne \emptyset \) if and only if \(\varLambda (u, y, x) \ne \emptyset \).

Then the function \((u, x) \mapsto \varPhi (u, x)\) is a normal integrand.

Proof

Taking into account (iii), (iv) and (11), the optimistic loss function can be rewritten in the form

$$\begin{aligned} \varPhi (u, x) = \min _{y\in Y^*,\, \lambda \in \varLambda ^*}\left\{ \varPsi (u, y, x) + \delta _\varLambda (u, y, \lambda , x) \right\} . \end{aligned}$$
(12)

From (i) it follows that the function \(((u, y), x) \mapsto \varPsi (u, y, x)\) is a normal integrand. From (ii) it follows that the functions \((u, x) \mapsto A(u, x)\), \((u, x) \mapsto b(u, x)\), \((u, x) \mapsto c(u, x)\) are normal integrands [15, Example 14.29]. Therefore, the set

$$\begin{aligned}\left\{ (u, y, \lambda , x) \in U\times Y^*\times \varLambda ^*\times \mathcal X \mid \lambda \in \varLambda (u, y, x)\right\} \end{aligned}$$

is \(\mathcal B(U) \times \mathcal B(Y^*) \times \mathcal B(\varLambda ^*) \times \mathcal F\)-measurable, and its sections \(\{(u, y, \lambda ) \mid \lambda \in \varLambda (u, y, x)\}\) are closed for all \(x\in \mathcal X\), since this set is described by the finitely many inequalities and equations (8)–(10) whose left-hand sides are continuous in \((u, y, \lambda )\) and measurable in x. Hence, the function

$$\begin{aligned} ((u, y, \lambda ), x) \mapsto \varPsi (u, y, x) + \delta _\varLambda (u, y, \lambda , x) \end{aligned}$$

defined on the set \(U\times Y^* \times \varLambda ^*\times \mathcal X\) is a normal integrand. Since the set \(Y^* \times \varLambda ^* \) is compact, the minimum in (12) is attained, and the function \((u, x) \mapsto \varPhi (u, x)\) is a normal integrand [15, Proposition 14.47]. Theorem 1 is proved.

Let us consider the case when the values of the functions b and c are positive.

Corollary 1

Let conditions (i), (ii) of Theorem 1 hold. Suppose that

  1. (v)

    there exist \(\underline{c},\underline{b}\in \mathbb R\) such that

    $$\begin{aligned} \inf _{u\in U, x\in \mathcal X}\min _{i=\overline{1, s}}c_i(u, x)> \underline{c}> 0,\\ \inf _{u\in U, x\in \mathcal X}\min _{j=\overline{1, k}}b_j(u, x)> \underline{b} > 0; \end{aligned}$$
  2. (vi)

    there exists \(\bar{c} \in \mathbb R\) such that

    $$\begin{aligned} \sup _{u\in U, x \in \mathcal X} \min _{y\in \mathcal Y(u, x)} c(u, x)^\top y < \bar{c}; \end{aligned}$$

Then the function \((u, x) \mapsto \varPhi (u, x)\) is a normal integrand.

Proof

Let us notice that from (vi) it follows that \(\mathcal Y(u, x) \ne \emptyset \) for all \(u\in U\), \(x\in \mathcal X\). From (v) and (vi) we get

$$\begin{aligned} \bigcup \limits _{u\in U,\, x\in \mathcal X} \mathcal Y^*(u, x)&\subset \left\{ y\in \mathbb R^s \mid \sup _{u\in U, x \in \mathcal X} c(u, x)^\top y \le \bar{c},\, y \ge 0 \right\} \\&\subset \left\{ y\in \mathbb R^s \mid \max _{i=\overline{1, s}} y_i \le \frac{\bar{c}}{\underline{c}},\, y \ge 0 \right\} . \end{aligned}$$

Thus, condition (iii) of Theorem 1 is satisfied.

From duality theory it is known that for \(y \in \mathcal Y^*(u, x)\) there exists a vector \(\lambda \in \varLambda (u, y, x)\) such that \(c(u, x)^\top y = b(u, x)^\top \lambda \). Hence, the set \(\varLambda ^*\) satisfying condition (iv) of Theorem 1 can be taken in the form

$$\begin{aligned} \varLambda ^* = \left\{ \lambda \in \mathbb R^k \mid \max _{j=\overline{1, k}} \lambda _j \le \frac{\bar{c}}{\underline{b}},\, \lambda \ge 0 \right\} . \end{aligned}$$

By Theorem 1, the function \((u, x) \mapsto \varPhi (u, x)\) is a normal integrand. Corollary 1 is proved.

Example 1

Let us consider the follower’s problem

$$\begin{aligned} \mathcal Y^*(u, x) = {{\,\mathrm{Arg}\,}}\min _{y\in \mathbb R}\left\{ \max \{0, u\} y \mid y\ge 0\right\} , \end{aligned}$$

where \(u\in U = [-1, 1]\). It is easily seen that \(\mathcal Y^*(u, x) = \{0\}\) if \(u > 0\) and \(\mathcal Y^*(u, x) = [0, +\infty )\) if \(u \le 0\). Then for the normal integrand \(\varPsi (u, y, x) = e^{uy}\) the infimum function \(\varPhi (u, x) = \inf _{y\in \mathcal Y^*(u, x)}\varPsi (u, y, x)\) is not lower semicontinuous at \(u = 0\), because \(\varPhi (u, x) = 1\) if \(u \ge 0\) and \(\varPhi (u, x) = 0\) if \(u < 0\). This example shows that condition (v) of Corollary 1 cannot be replaced by the weaker conditions

$$\begin{aligned} \inf _{u\in U, x\in \mathcal X}\min _{i=\overline{1, s}}c_i(u, x) \ge 0,\quad \inf _{u\in U, x\in \mathcal X}\min _{j=\overline{1, k}}b_j(u, x) \ge 0. \end{aligned}$$
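
The jump of \(\varPhi \) at \(u = 0\) can also be observed numerically. A small sketch follows; since the infimum over the unbounded set \([0, +\infty )\) is approximated on a truncated interval \([0, y_{\max }]\), the computed minimum for negative u only approaches the true infimum 0 as \(y_{\max }\) grows:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def phi(u, y_max=1e3):
    """Phi(u, x) of Example 1 with Psi = exp(u*y); the infimum over Y*(u, x)
    is approximated on the truncated interval [0, y_max]."""
    if u > 0:
        return 1.0                      # Y*(u, x) = {0}, so Phi = exp(0) = 1
    res = minimize_scalar(lambda y: np.exp(u * y), bounds=(0.0, y_max),
                          method="bounded")
    return res.fun                      # for u < 0 this tends to 0 as y_max grows

for u in (-0.1, 0.0, 0.1):
    print(u, phi(u))                    # approx. 0, 1, 1: a jump up at u = 0
```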

Example 2

Let us consider a production planning model. In this model, the leader and the follower are the head office and the production division of a company. The leader can get a contract for the production of several types of products. The prices of the products manufactured according to the contract are deterministic. The leader gives the follower a task to produce the products and supplies resources to the follower. The follower pays for using the resources. The prices for using resources are known when the follower produces the products, but they are considered to be random when the task for the follower is stated. The leader's variable \(u \in \mathbb R^r\) (\(u \ge 0\)) consists of the production volumes required by the contract. The follower's variable \(y \in \mathbb R^s\) (\(y \ge 0\)) consists of the volumes of required resources. Let \(c(u, X) = \tilde{c}(X)\) be a random vector of prices for using resources such that \(\bar{c}> c_i(u, x) = \tilde{c}_i(x)> \underline{c} > 0\) for all \(x\in \mathcal X\). The matrix \(A(u, x)\) is constant, and \(A(u, x) y\) is the vector of manufactured products. Let \(b(u, x) = u\). Thus, the constraint \(A(u, x) y \ge b(u, x)\) means that the follower must produce the products according to the leader's decision u. The leader's loss function has the form \(\varPsi (u, y, x) = -\pi ^\top u - \tilde{\pi }(x)^\top (A(u, x) y - b(u, x)) + f(y) \), where f(y) is the cost of buying the resources y, \(\pi \) is the vector of prices for the products manufactured according to the contract, and \(\tilde{\pi }(x)\) is a random vector of prices for additionally manufactured products. The function f can be linear or convex (if big volumes of resources require additional costs). Notice that the conditions of Corollary 1 are satisfied for this model if \(U = \{u \in \mathbb R^r \mid \underline{u} \le u \le \bar{u}\} \), where \(0< \underline{u} < \bar{u}\).
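
If, in addition, the cost f is linear, say \(f(y) = q^\top y\), then \(\varPsi \) is linear in y for fixed (u, x), and the optimistic value \(\varPhi (u, x)\) for one price realization reduces to the two LPs of the sketch given after (3). A sketch with hypothetical data, reusing the function optimistic_loss defined there:

```python
import numpy as np

# Hypothetical data (all numbers illustrative).
A = np.array([[1.0, 2.0], [3.0, 1.0]])  # A y = vector of manufactured products
u = np.array([4.0, 6.0])                # contracted volumes, b(u, x) = u
pi = np.array([5.0, 4.0])               # contract prices
q = np.array([1.0, 1.0])                # leader's resource-buying cost f(y) = q^T y

def leader_loss(u, x_cost, x_price):
    """Phi(u, x) for one realization x = (resource prices, extra-product prices).

    Psi(u, y, x) = -pi^T u - x_price^T (A y - u) + q^T y is linear in y, so the
    optimistic value is optimistic_loss with d = q - A^T x_price plus a constant.
    """
    d = q - A.T @ x_price
    val = optimistic_loss(A, u, x_cost, d)   # follower: min x_cost^T y, A y >= u
    return -pi @ u + x_price @ u + val

rng = np.random.default_rng(0)
x_cost = rng.uniform(0.5, 1.5, size=2)   # realization of c~(x), bounded away from 0
x_price = rng.uniform(1.0, 2.0, size=2)  # realization of pi~(x)
print(leader_loss(u, x_cost, x_price))
```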

Let us formulate a corollary from Theorem 1 on the existence of optimal solutions to problems (6) and (7).

Corollary 2

Let the conditions of Theorem 1 (or the conditions of Corollary 1) hold. Let the set U be compact. Then the set \(U^*\) of optimal solutions to problem (6) is nonempty. If there exists a point \(u\in U\) such that \(\varphi _\alpha (u) < +\infty \), then the set \(V^*\) of optimal solutions to problem (7) is nonempty.

Proof

It is proved in [14, Theorem 6] that, if the function \((u, x) \mapsto \varPhi (u, x)\) is a normal integrand, then the probability function \(u \mapsto P_\varphi (u)\) is upper semicontinuous for all \(\varphi \in \mathbb R\) and the quantile function \(u\mapsto \varphi _\alpha (u)\) is lower semicontinuous for all \(\alpha \in (0, 1]\). The conditions of Theorem 1 (or the conditions of Corollary 1) guarantee that the function \((u, x) \mapsto \varPhi (u, x)\) is a normal integrand. Thus, the assertion of Corollary 2 follows from the Weierstrass theorem.

4 Sample Approximations

In this section, we construct sample approximations of the probability maximization problem (6) and the quantile minimization problem (7) by using a sample \(\left( X^1, X^{2}, \ldots , X^{N}\right) \) generated by the random vector X. The sequence of random vectors \(\left( X^{N}\right) \), \(N\in \mathbb N\), is defined on a complete probability space \((\Omega , \mathcal {F'}, \mathbf {P'})\). The random vectors \(X^N\) are independent, and their distribution functions coincide with the distribution function of X.

Let us estimate the probability function (4) by the frequency of the event \(\left\{ \varPhi (u, X) \le \varphi \right\} \):

$$\begin{aligned} P_\varphi ^{(N)}(u) :=\frac{1}{N} \sum _{\nu =1}^N \chi _{[-\infty , \varphi ]} (\varPhi (u, X^{\nu })), \, N\in \mathbb N, \end{aligned}$$
(13)

where

$$\begin{aligned}\chi _S(x):={\left\{ \begin{array}{ll} 1, &{} x \in S;\\ 0, &{} x \notin S. \end{array}\right. } \end{aligned}$$

Replacing the probability function in (5) by the estimator (13), we obtain the sample estimator of the quantile function:

$$\begin{aligned} \varphi _\alpha ^{(N)}(u) := \min \left\{ \varphi \in \mathbb R^*\mid P^{(N)}_\varphi (u)\ge \alpha \right\} . \end{aligned}$$
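
For a fixed u, both estimators depend on the sample only through the computed losses \(\varPhi (u, x^\nu )\). A minimal numpy sketch of (13) and of the sample quantile (the loss values are hypothetical):

```python
import numpy as np

def empirical_probability(losses, varphi):
    """P^(N)_varphi(u) of (13): the fraction of scenarios with Phi(u, x^nu) <= varphi."""
    return np.mean(np.asarray(losses) <= varphi)

def empirical_quantile(losses, alpha):
    """phi^(N)_alpha(u): the smallest level phi with P^(N)_phi(u) >= alpha."""
    s = np.sort(np.asarray(losses))
    k = int(np.ceil(alpha * len(s)))    # at least k of the N losses must not exceed it
    return s[k - 1]

losses = [3.0, 1.0, 2.5, 0.5, 4.0]      # Phi(u, x^nu) for a fixed u, N = 5
print(empirical_probability(losses, 2.5))   # 0.6
print(empirical_quantile(losses, 0.8))      # 3.0
```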

We consider the sample approximation of the probability maximization problem in the form

$$\begin{aligned} U^{(N)} :={{\,\mathrm{Arg}\,}}\max _{u\in U} P^{(N)}_\varphi (u),\quad \alpha _N := \sup _{u\in U}P^{(N)}_\varphi (u), \end{aligned}$$
(14)

and the sample approximation of the quantile minimization problem in the form

$$\begin{aligned} V^{(N)} := {{\,\mathrm{Arg}\,}}\min _{u\in U} \varphi _\alpha ^{(N)}(u), \quad \varphi _N := \inf _{u\in U}\varphi _\alpha ^{(N)}(u). \end{aligned}$$
(15)

When a realization \(\left( x^1, x^2, \ldots , x^N\right) \) of the sample \(\left( X^1, X^{2}, \ldots , X^{N}\right) \) is fixed, problems (14) and (15) can be considered as stochastic programming problems with a discrete distribution of the random parameters. This allows us to use the technique suggested in [16, 17] for reducing the problems to deterministic mixed integer programming problems.

Recall that \(Y^*\) is the closure of the set \(\bigcup _{u\in U, \, x\in \mathcal X}\mathcal Y^*(u, x)\). Suppose that a set \(\varLambda ^* \subset \mathbb R^k\) is chosen in such a way that, for all \(u\in U\), \(y\in Y^*\), \(x\in \mathcal X\), \(\varLambda ^* \cap \varLambda (u, y, x) \ne \emptyset \) if and only if \(\varLambda (u, y, x) \ne \emptyset \).

Let functions \(\gamma _1:U\times Y \times \mathcal X\times \mathbb R\rightarrow \mathbb R\), \(\gamma _2:U\times Y \times \mathcal X\rightarrow \mathbb R^k\), \(\gamma _3:U\times Y \times \mathcal X\rightarrow \mathbb R^k\), \(\gamma _4:\mathbb R^k\rightarrow \mathbb R^k\), \(\gamma _5:U\times \mathbb R^k \times \mathcal X\rightarrow \mathbb R^s\), \(\gamma _6:U\times \mathbb R^k \times \mathcal X\rightarrow \mathbb R^s\), \(\gamma _7:Y\rightarrow \mathbb R^s\) satisfying the following conditions be known:

  1. 1.

    \(\varPsi (u, y, x) - \varphi \le \gamma _1(u, y, x, \varphi )\) for all \(u\in U\), \(y\in Y^*\), \(x\in \mathcal X\), \(\varphi \in \mathbb R\);

  2. 2.

    \(-\gamma _2(u, y, x)\le A(u, x)y - b(u, x) \le \gamma _3(u, y, x) \) for all \(u\in U\), \(y\in Y^*\), \(x\in \mathcal X\);

  3. 3.

    \(\lambda \le \gamma _4(\lambda )\) for all \(\lambda \in \varLambda ^*\);

  4. 4.

    \(-\gamma _6(u, \lambda , x) \le A(u, x)^\top \lambda - c(u, x)\le \gamma _5(u, \lambda , x)\) for all \(u\in U\), \(\lambda \in \varLambda ^*\), \(x\in \mathcal X\);

  5. 5.

    \(y\le \gamma _7(y)\) for all \(y\in Y^*\).

Let us introduce the variables \(y^\nu \) and \(\lambda ^\nu \) corresponding to the realizations \(x^\nu \), \(\nu =\overline{1, N}\), and the vectors of auxiliary binary variables \(\delta \in \{0, 1\}^N\), \(\eta ^\nu \in \{0, 1\}^k\), \(\zeta ^\nu \in \{0, 1\}^s\), \(\nu =\overline{1, N}\). The meaning of these variables follows from the proof of Theorem 2 given below. The value of \(\delta _\nu \) is equal to 1 if \(\varPhi (u, x^\nu ) \le \varphi \). If \(\delta _\nu = 1\), then the zero elements of \(\eta ^\nu \) correspond to the nonzero elements of the dual variables for the constraints \(A(u, x^\nu ) y^\nu \ge b(u, x^\nu )\), and the zero elements of \(\zeta ^\nu \) correspond to the nonzero elements of the vector \(y^\nu \). Problem (14) reduces to the problem

$$\begin{aligned} \frac{1}{N} \sum _{\nu =1}^N \delta _\nu \rightarrow \max _{u\in U,\, y^\nu \in Y,\, \lambda ^\nu \in \mathbb R^k,\,\delta \in \{0, 1\}^N,\, \eta ^\nu \in \{0, 1\}^k,\, \zeta ^\nu \in \{0, 1\}^s,\, \nu =\overline{1, N}} \end{aligned}$$
(16)

subject to

$$\begin{aligned} \varPsi (u, y^{\nu }, x^{\nu })&\le \varphi + (1-\delta _\nu ) \gamma _1(u, y^{\nu }, x^{\nu }, \varphi ),\end{aligned}$$
(17)
$$\begin{aligned} A(u, x^{\nu }) y^\nu&\ge b(u, x^{\nu }) - (1-\delta _\nu ) \gamma _2(u, y^{\nu }, x^{\nu }), \end{aligned}$$
(18)
$$\begin{aligned} A(u, x^{\nu }) y^{\nu }&\le b(u, x^{\nu })+ (\eta ^{\nu } + (1-\delta _\nu )e_k) \circ \gamma _3(u, y^{\nu }, x^{\nu }),\end{aligned}$$
(19)
$$\begin{aligned} 0 \le \lambda ^{\nu }&\le ((2-\delta _\nu ) e_k - \eta ^{\nu }) \circ \gamma _4(\lambda ^\nu ),\end{aligned}$$
(20)
$$\begin{aligned} A(u, x^{\nu })^\top \lambda ^\nu&\le c(u, x^{\nu })+ (1-\delta _\nu ) \gamma _5(u, \lambda ^\nu , x^{\nu }),\end{aligned}$$
(21)
$$\begin{aligned} A(u, x^{\nu })^\top \lambda ^\nu&\ge c(u, x^{\nu }) - (\zeta ^{\nu } + (1-\delta _\nu )e_s) \circ \gamma _6(u, \lambda ^\nu , x^{\nu }),\end{aligned}$$
(22)
$$\begin{aligned} 0 \le y^{\nu }&\le ((2-\delta _\nu ) e_s - \zeta ^{\nu })\circ \gamma _7(y^\nu ) , \, \nu =\overline{1, N}, \end{aligned}$$
(23)

where \(e_s\), \(e_k\) are vectors of ones of dimensions s and k, respectively, and \(\circ \) denotes the element-wise product of two vectors.

Denote by \(\left( \bar{u}, (\bar{y}^\nu ), (\bar{\lambda }^\nu ), \bar{\delta }, (\bar{\eta }^\nu ), (\bar{\zeta }^\nu )\right) \) an optimal solution to problem (16), where \((\bar{y}^\nu ) := (\bar{y}^1, \bar{y}^2, \ldots , \bar{y}^N)\). The notation \((\bar{\lambda }^\nu )\), \((\bar{\eta }^\nu )\), \((\bar{\zeta }^\nu )\) has the same sense.

Theorem 2

Let the conditions of Theorem 1 hold. Suppose that \(\varLambda ^*\) satisfies condition (iv) of Theorem 1. Then,

  1. 1.

    if \(\bar{u}\) is an optimal value of the variable u in problem (16), then \(\bar{u}\in U^{(N)}\);

  2. 2.

    if \(\bar{\delta }\) is an optimal value of the variable \(\delta \) in problem (16), then

    $$\begin{aligned} \alpha _N = \frac{1}{N}\sum _{\nu =1}^N \bar{\delta }_\nu ; \end{aligned}$$
  3. 3.

    for any optimal \(\tilde{u}\in U^{(N)}\) there exist values \(\tilde{y}^\nu \in Y\), \(\tilde{\lambda }^\nu \in \mathbb R^k\), \(\tilde{\delta }\in \{0, 1\}^N\), \(\tilde{\eta }^\nu \in \{0, 1\}^k\), \(\tilde{\zeta }^\nu \in \{0, 1\}^s\), \(\nu =\overline{1, N}\) such that

    $$\begin{aligned} \left( \tilde{u}, (\tilde{y}^\nu ), (\tilde{\lambda }^\nu ), \tilde{\delta }, (\tilde{\eta }^\nu ), (\tilde{\zeta }^\nu )\right) \end{aligned}$$

    is an optimal solution to problem (16).

Proof

Let \(\left( \bar{u}, (\bar{y}^\nu ), (\bar{\lambda }^\nu ), \bar{\delta }, (\bar{\eta }^\nu ), (\bar{\zeta }^\nu )\right) \) be an optimal solution to problem (16). Let

$$\begin{aligned} K:=\left\{ \nu \mid \bar{\delta }_\nu = 1,\,\nu =\overline{1, N}\right\} . \end{aligned}$$

It follows from inequalities (18)–(23) that, for all \(\nu \in K\), \(\bar{y}^\nu \) belongs to \(\mathcal Y(\bar{u}, x^\nu )\), and inequalities (8)–(10) hold for \(\lambda = \bar{\lambda }^\nu \), \(y =\bar{y}^\nu \). Therefore, \(\bar{y}^\nu \in \mathcal Y^*(\bar{u}, x^\nu )\). Hence, for any \(\nu \in K\),

$$\begin{aligned} \varPhi (\bar{u}, x^\nu ) \le \varPsi (\bar{u}, \bar{y}^\nu ,x^\nu ) \le \varphi . \end{aligned}$$

Thus,

$$\begin{aligned} P^{(N)}_\varphi (\bar{u}) = \frac{1}{N} \sum _{\nu =1}^N \chi _{[-\infty , \varphi ]} (\varPhi (\bar{u}, x^{\nu })) \ge \frac{1}{N}\sum _{\nu \in K} \bar{\delta }_\nu = \frac{1}{N}\sum _{\nu =1}^N \bar{\delta }_\nu = \bar{\alpha }^*, \end{aligned}$$
(24)

where \(\bar{\alpha }^*\) is the optimal objective value of problem (16). We notice that inequality (24) is written for the fixed realization \((x^1, x^2, \ldots , x^N)\) of the sample.

Now, let \(\tilde{u}\) be an optimal solution to problem (14). Let

$$\begin{aligned} \tilde{K}:=\left\{ \nu \mid \mathcal Y^*(\tilde{u}, x^\nu ) \ne \emptyset ,\,\nu =\overline{1, N}\right\} . \end{aligned}$$
(25)

For each \(\nu \in \tilde{K}\) let us choose \(\tilde{y}^\nu \in \mathcal Y^*(\tilde{u}, x^\nu )\), \(\tilde{\lambda }^\nu \in \varLambda ^*\) in such a way that \(\varPhi (\tilde{u}, x^\nu ) = \varPsi (\tilde{u}, \tilde{y}^\nu , x^\nu )\). The existence of such values \(\tilde{y}^\nu \) follows from the representation (12), because the minimum of the lower semicontinuous function \((y, \lambda )\mapsto \varPsi (\tilde{u}, y, x^\nu ) + \delta _\varLambda (\tilde{u}, y, \lambda , x^\nu )\) is attained on the compact set \(Y^*\times \varLambda ^*\). If \(\nu \notin \tilde{K}\), then we take \(\tilde{y}^\nu \in Y^*\), \(\tilde{\lambda }^\nu \in \varLambda ^*\) arbitrarily. If \(\nu \in \tilde{K}\) and \(\varPhi (\tilde{u}, x^{\nu }) \le \varphi \), then \(\tilde{\delta }_\nu = 1\); otherwise \(\tilde{\delta }_\nu = 0\). Let \(\tilde{\zeta }_i^\nu = 1\) if \(\tilde{y}_i^\nu = 0\) and \(\tilde{\zeta }_i^\nu = 0\) if \(\tilde{y}_i^\nu > 0\), \(i = \overline{1, s}\); \(\tilde{\eta }_j^\nu = 1\) if \(\tilde{\lambda }_j^\nu = 0\) and \(\tilde{\eta }_j^\nu = 0\) if \(\tilde{\lambda }_j^\nu > 0\), \(j = \overline{1, k}\). All the constraints (17)–(23) are satisfied for the solution \(\left( \tilde{u}, (\tilde{y}^\nu ), (\tilde{\lambda }^\nu ), \tilde{\delta }, (\tilde{\eta }^\nu ), (\tilde{\zeta }^\nu )\right) \). Thus,

$$\begin{aligned} P^{(N)}_\varphi (\tilde{u}) = \frac{1}{N} \sum _{\nu =1}^N \chi _{[-\infty , \varphi ]} (\varPhi (\tilde{u}, x^{\nu })) = \frac{1}{N}\sum _{\nu =1}^N\tilde{\delta }_\nu \le \bar{\alpha }^*. \end{aligned}$$
(26)

Taking into account the optimality of \(\tilde{u}\), we obtain from inequality (24) that

$$\begin{aligned} P^{(N)}_\varphi (\tilde{u}) \ge P^{(N)}_\varphi (\bar{u}) \ge \bar{\alpha }^*. \end{aligned}$$
(27)

Hence, \(\frac{1}{N}\sum _{\nu =1}^N\tilde{\delta }_\nu = \bar{\alpha }^*\). This proves the third assertion of the theorem. Combining (26) and (27), we get

$$\begin{aligned} P^{(N)}_\varphi (\bar{u}) = P^{(N)}_\varphi (\tilde{u}) = \bar{\alpha }^*. \end{aligned}$$
(28)

This implies the first assertion of the theorem. By definition, \(\bar{\alpha }^* = \frac{1}{N}\sum _{\nu =1}^N \bar{\delta }_\nu \). We conclude from (28) that

$$\begin{aligned} \alpha _N = P^{(N)}_\varphi (\tilde{u}) = \bar{\alpha }^* = \frac{1}{N}\sum _{\nu =1}^N \bar{\delta }_\nu . \end{aligned}$$

This equality proves the second assertion. All the assertions of Theorem 2 are proved.

Remark 1

Let us consider the model from Example 2. If the sets of feasible values of the leader's and follower's variables are bounded, then the functions \(\gamma _i\) can be taken constant. In this case, constraints (17)–(23) are linear or convex depending on the properties of the function f.
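
Following this remark, here is a compact sketch of problem (16) with constraints (17)–(23) for the production model, assuming a linear leader's loss \(\varPsi (u, y, x) = p^\top u + d(x)^\top y\), \(b(u, x) = u\) (so \(k = r\)), a constant matrix A, and a single constant big-M value M replacing \(\gamma _1, \ldots , \gamma _7\). The model is written with the open-source PuLP modeler and solved by CBC; all numerical data are hypothetical:

```python
import numpy as np
import pulp

# Scenario data for the sample approximation (16)-(23); hypothetical numbers.
N, s, k, r = 3, 2, 2, 2
rng = np.random.default_rng(1)
A = [[1.0, 2.0], [3.0, 1.0]]                        # k x s technology matrix
c_nu = rng.uniform(0.5, 1.5, size=(N, s)).tolist()  # realizations c(u, x^nu)
d_nu = rng.uniform(-2.0, 0.0, size=(N, s)).tolist() # y-part of Psi per scenario
p = [-5.0, -4.0]                                    # u-part of Psi (e.g. -pi)
varphi, M = -10.0, 100.0                            # loss level and big-M constant

prob = pulp.LpProblem("sample_probability_maximization", pulp.LpMaximize)
u = [pulp.LpVariable(f"u{i}", lowBound=1.0, upBound=10.0) for i in range(r)]
y = [[pulp.LpVariable(f"y{n}_{i}", lowBound=0) for i in range(s)] for n in range(N)]
lam = [[pulp.LpVariable(f"lam{n}_{j}", lowBound=0) for j in range(k)] for n in range(N)]
delta = [pulp.LpVariable(f"delta{n}", cat="Binary") for n in range(N)]
eta = [[pulp.LpVariable(f"eta{n}_{j}", cat="Binary") for j in range(k)] for n in range(N)]
zeta = [[pulp.LpVariable(f"zeta{n}_{i}", cat="Binary") for i in range(s)] for n in range(N)]

prob += (1.0 / N) * pulp.lpSum(delta)                                # objective (16)
for n in range(N):
    Ay = [pulp.lpSum(A[j][i] * y[n][i] for i in range(s)) for j in range(k)]
    ATlam = [pulp.lpSum(A[j][i] * lam[n][j] for j in range(k)) for i in range(s)]
    prob += (pulp.lpSum(p[i] * u[i] for i in range(r))
             + pulp.lpSum(d_nu[n][i] * y[n][i] for i in range(s))
             <= varphi + (1 - delta[n]) * M)                         # (17)
    for j in range(k):
        prob += Ay[j] >= u[j] - (1 - delta[n]) * M                   # (18), b(u, x) = u
        prob += Ay[j] <= u[j] + (eta[n][j] + 1 - delta[n]) * M       # (19)
        prob += lam[n][j] <= (2 - delta[n] - eta[n][j]) * M          # (20)
    for i in range(s):
        prob += ATlam[i] <= c_nu[n][i] + (1 - delta[n]) * M          # (21)
        prob += ATlam[i] >= c_nu[n][i] - (zeta[n][i] + 1 - delta[n]) * M  # (22)
        prob += y[n][i] <= (2 - delta[n] - zeta[n][i]) * M           # (23)

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("alpha_N =", pulp.value(prob.objective), " u =", [v.value() for v in u])
```

The quantile approximation (29)–(30) is obtained from the same constraint block by declaring \(\varphi \) a decision variable, minimizing it, and adding the constraint pulp.lpSum(delta) >= alpha * N.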

The quantile minimization problem (15) reduces to a mixed integer programming problem:

$$\begin{aligned} \varphi \rightarrow \min _{\varphi \in \mathbb R^*,\, u\in U,\, y^\nu \in Y,\, \lambda ^\nu \in \mathbb R^k,\,\delta \in \{0, 1\}^N,\, \eta ^\nu \in \{0, 1\}^k,\, \zeta ^\nu \in \{0, 1\}^s,\, \nu =\overline{1, N}} \end{aligned}$$
(29)

subject to

$$\begin{aligned} \frac{1}{N} \sum _{\nu =1}^N \delta _\nu \ge \alpha \end{aligned}$$
(30)

and constraints (17)–(23).

Theorem 3

Let the conditions of Theorem 1 hold. Suppose that \(\varLambda ^*\) satisfies condition (iv) of Theorem 1. Then,

  1. 1.

    if \(\bar{u}\) is an optimal value of the variable u in problem (29), then \(\bar{u}\in V^{(N)}\);

  2. 2.

    if \(\bar{\varphi }\) is the optimal value of the variable \(\varphi \) in problem (29), then \(\varphi _N = \bar{\varphi }\);

  3. 3.

    for any optimal \(\tilde{u}\in V^{(N)}\) there exist values \(\tilde{y}^\nu \in Y\), \(\tilde{\lambda }^\nu \in \mathbb R^k\), \(\tilde{\delta }\in \{0, 1\}^N\), \(\tilde{\eta }^\nu \in \{0, 1\}^k\), \(\tilde{\zeta }^\nu \in \{0, 1\}^s\), \(\nu =\overline{1, N}\) such that

    $$\begin{aligned} \left( \varphi _N,\tilde{u}, (\tilde{y}^\nu ), (\tilde{\lambda }^\nu ), \tilde{\delta }, (\tilde{\eta }^\nu ), (\tilde{\zeta }^\nu )\right) \end{aligned}$$

    is an optimal solution to problem (29).

Proof

Let \(\left( \bar{\varphi },\bar{u}, (\bar{y}^\nu ), (\bar{\lambda }^\nu ), \bar{\delta }, (\bar{\eta }^\nu ), (\bar{\zeta }^\nu )\right) \) be an optimal solution to problem (29). Due to constraint (30),

$$\begin{aligned} \frac{1}{N}\sum _{\nu =1}^N \bar{\delta }_\nu \ge \alpha . \end{aligned}$$
(31)

It follows from inequalities (18)–(23) that

$$\begin{aligned} \varPhi (\bar{u}, x^\nu ) \le \varPsi (\bar{u}, \bar{y}^\nu ,x^\nu ) \le \bar{\varphi }\end{aligned}$$
(32)

if \(\bar{\delta }_\nu = 1\) (see the proof of Theorem 2). Since inequalities (31) and (32) hold,

$$\begin{aligned} \varphi _\alpha ^{(N)}(\bar{u}) = \min \left\{ \varphi \mid \frac{1}{N} \sum _{\nu =1}^N \chi _{[-\infty , \varphi ]} (\varPhi (\bar{u}, x^{\nu }))\ge \alpha \right\} \le \bar{\varphi }. \end{aligned}$$
(33)

Now, let \(\tilde{u}\) be an optimal solution to problem (15). For each \(\nu \in \tilde{K}\), let us choose \(\tilde{y}^\nu \in \mathcal Y^*(\tilde{u}, x^\nu )\), \(\tilde{\lambda }^\nu \in \varLambda ^*\) in such a way that \(\varPhi (\tilde{u}, x^\nu ) = \varPsi (\tilde{u}, \tilde{y}^\nu , x^\nu )\), where \(\tilde{K}\) is defined in (25). If \(\nu \notin \tilde{K}\), then we take \(\tilde{y}^\nu \in Y^*\), \(\tilde{\lambda }^\nu \in \varLambda ^*\) arbitrarily. Let \(\tilde{\zeta }_i^\nu = 1\) if \(\tilde{y}_i^\nu = 0\) and \(\tilde{\zeta }_i^\nu = 0\) if \(\tilde{y}_i^\nu > 0\), \(i = \overline{1, s}\); \(\tilde{\eta }_j^\nu = 1\) if \(\tilde{\lambda }_j^\nu = 0\) and \(\tilde{\eta }_j^\nu = 0\) if \(\tilde{\lambda }_j^\nu > 0\), \(j = \overline{1, k}\). If \(\nu \in \tilde{K}\) and \(\varPhi (\tilde{u}, x^{\nu }) \le \varphi _N\), then \(\tilde{\delta }_\nu = 1\); otherwise \(\tilde{\delta }_\nu = 0\). Taking into account the definition of the sample quantile function, we get \(\frac{1}{N}\sum _{\nu =1}^N\tilde{\delta }_\nu \ge \alpha \). Thus, all the constraints (17)–(23) and (30) are satisfied for the solution \(\left( \varphi _N,\tilde{u}, (\tilde{y}^\nu ), (\tilde{\lambda }^\nu ), \tilde{\delta }, (\tilde{\eta }^\nu ), (\tilde{\zeta }^\nu )\right) \). Hence \(\bar{\varphi }\le \varphi _N\).

Since \(\varphi _N\) is the optimal objective value in (15), \(\varphi _N\le \varphi _\alpha ^{(N)}(\bar{u})\). Due to (33), we obtain

$$\begin{aligned} \varphi _N\le \varphi _\alpha ^{(N)}(\bar{u}) \le \bar{\varphi }\le \varphi _N. \end{aligned}$$

Thus, \(\varphi _\alpha ^{(N)}(\bar{u}) =\bar{\varphi }=\varphi _N\). This proves assertions 1 and 2 of Theorem 3. Assertion 3 follows from the existence of the solution \(\left( \varphi _N,\tilde{u}, (\tilde{y}^\nu ), (\tilde{\lambda }^\nu ), \tilde{\delta }, (\tilde{\eta }^\nu ), (\tilde{\zeta }^\nu )\right) \) such that \(\bar{\varphi }= \varphi _N\). Theorem 3 is proved.

5 Convergence of the Sample Approximations

The convergence of sample approximations of stochastic programming problems with probabilistic criterion is studied in [13], where it was proved that \(\lim _{N\rightarrow \infty }\alpha _N = \alpha ^*\) almost surely (a.s.) (with respect to the probability measure \(\mathbf{P}'\)) if the function \((u, x)\mapsto \varPhi (u, x)\) is a normal integrand and U is nonempty and compact. The sufficient conditions guaranteeing that the function \((u, x)\mapsto \varPhi (u, x)\) is a normal integrand are given in Theorem 1 and Corollary 1. Let us formulate the theorem on the convergence of the sample approximations of the bilevel stochastic programming problem with probabilistic criterion. Denote by

$$D(S, T) := \sup _{s\in S}\inf _{t\in T}\Vert s-t\Vert $$

the deviation of the set \(S\subset \mathbb R^r\) from the set \(T\subset \mathbb R^r\).
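
When \(U^{(N)}\) and \(U^*\) are approximated by finite point sets, the deviation can be computed directly; a small numpy sketch:

```python
import numpy as np

def deviation(S, T):
    """D(S, T) = sup over s in S of inf over t in T of ||s - t||,
    for finite point sets given as (n, r) arrays."""
    S, T = np.atleast_2d(S), np.atleast_2d(T)
    dists = np.linalg.norm(S[:, None, :] - T[None, :, :], axis=2)
    return dists.min(axis=1).max()

print(deviation([[0.0], [1.0]], [[0.0]]))  # 1.0
print(deviation([[0.0]], [[0.0], [1.0]]))  # 0.0: D is not symmetric
```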

Theorem 4

Suppose that the function \((u, x)\mapsto \varPhi (u, x)\) is a normal integrand. Let the set U be nonempty and compact. Then \(\lim _{N\rightarrow \infty }\alpha _N = \alpha ^*\) a.s. and \(\lim _{N\rightarrow \infty }D\left( U^{(N)}, U^*\right) = 0\) a.s.

Proof

The convergence of \(\alpha _N\) to \(\alpha ^*\) a.s. is proved in [13, Theorem 7]. Also, it was proved there that, under the conditions of the theorem, every limit point \(\bar{u}\) of a sequence \((u^N)\), where \(u^N \in U^{(N)}\), is optimal in problem (6) a.s., i.e., \(\bar{u} \in U^*\) a.s. To prove the set convergence, suppose that \(\limsup _{N\rightarrow \infty }D\left( U^{(N)}, U^*\right) > 0\) with nonzero probability. This implies that there exists \(\epsilon > 0\) such that \(\limsup _{N\rightarrow \infty }D\left( U^{(N)}, U^*\right) > \epsilon \) with probability \(\beta > 0\). Then, with probability \(\beta \), we can find a sequence \((u^N)\), \(u^N \in U^{(N)}\), such that

$$\limsup _{N\rightarrow \infty }\inf _{u\in U^*}\Vert u^N - u\Vert > \epsilon .$$

From the compactness of the set U and the continuity of the function \(v \mapsto \inf _{u\in U^*}\Vert v - u\Vert \) it follows that there exists a limit point \(\bar{u}\) of the sequence \((u^N)\) such that \(\inf _{u\in U^*}\Vert \bar{u} - u\Vert \ge \epsilon \). Therefore, with probability \(\beta > 0\) there exists a limit point \(\bar{u}\) (depending on the realization of the sample) such that \(\bar{u} \notin U^*\). But \(\bar{u} \in U^*\) a.s. This contradiction proves the theorem.

In applied problems, it is important to know the sample size guaranteeing a given accuracy of the approximation. This question is studied in [18, Theorem 1]. According to this result, if the set U is finite, then

$$ \mathbf{P}'\{U^{(N)} \subset U_\epsilon \} \ge \beta $$

for

$$ N \ge 2\frac{\ln |U| - \ln (1-\beta )}{|\ln (1-\epsilon ^2)|}, $$

where \(\beta \in (0, 1)\), \(\epsilon \in (0, 1)\), and \(U_\epsilon := \{u \in U \mid P_\varphi (u) \ge \alpha ^* - \epsilon \}\) is the set of \(\epsilon \)-optimal solutions to problem (6). This estimate was obtained for arbitrary functions \((u, x) \mapsto \varPhi (u, x)\) that are normal integrands taking values in \(\mathbb R\). Replacing the infinite values of \(\varPhi (u, x)\) in definition (3) by finite values of the same sign does not change \(P_\varphi (u)\) and \(P^{(N)}_\varphi (u)\). Thus, the estimate is valid for the considered bilevel problems (of course, if the conditions of Theorem 1 or Corollary 1 are satisfied).
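
For a quick numeric illustration of this bound (with hypothetical values of |U|, \(\beta \), and \(\epsilon \)):

```python
import math

def sample_size(card_U, beta, eps):
    """Smallest N in the bound of [18, Theorem 1] quoted above, which guarantees
    P'{U^(N) subset of U_eps} >= beta for a finite set U."""
    return math.ceil(2 * (math.log(card_U) - math.log(1 - beta))
                     / abs(math.log(1 - eps ** 2)))

print(sample_size(card_U=1000, beta=0.95, eps=0.1))  # 1971
```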

Sufficient conditions of the convergence of problems with quantile criteria are given in [13, 14].

Theorem 5

([14, Theorem 10]). Suppose that

  1. (i)

    The set U is compact and nonempty.

  2. (ii)

    The function \((u, x)\mapsto \varPhi (u, x)\) is a normal integrand, and \(\varPhi (u, x) > -\infty \) for all \((u, x) \in U\times \mathcal X\).

  3. (iii)

    If \(\varphi ^* \ne +\infty \), then for all \(\epsilon > 0\) there exists a pair \((\tilde{u}, \tilde{\varphi })\in U\times \mathbb R\) such that \(|\tilde{\varphi }- \varphi ^*| \le \epsilon \) and \(P_{\tilde{\varphi }}(\tilde{u}) > \alpha \).

Then \(\lim _{N\rightarrow \infty }\varphi _N = \varphi ^*\) a.s. and every limit point of the sequence \((v^N)\), where \(v^N \in V^{(N)}\), is optimal in problem (7) a.s.

Theorem 5 was proved for arbitrary functions \((u, x) \mapsto \varPhi (u, x)\) that are normal integrands taking values in \((-\infty , +\infty ]\). Thus, it holds for the functions \(\varPhi \) considered in bilevel problems. Due to condition (ii), Theorem 5 does not apply to functions \(\varPhi \) taking the value \(-\infty \).

In the same manner as in the proof of Theorem 4, it can be proved that the assertion on the optimality of limit points in Theorem 5 can be replaced by

$$\lim _{N\rightarrow \infty }D\left( V^{(N)}, V^*\right) = 0 \text { a.s.}$$

The most difficult point in applying Theorem 5 is to check assumption (iii). It is hard to describe sufficient conditions for it, because the dependence \((u, x) \mapsto \varPhi (u, x)\) must be known. However, in some cases (for example, in the case of a linear follower's problem [19]) this dependence can be found. It is easy to check that assumption (iii) of Theorem 5 holds if the function \(x \mapsto \varPhi (u, x)\) is strictly increasing and X has a density that is positive on \(\mathbb R\).

6 Conclusion

In this paper, sample approximations of optimistic bilevel stochastic programming problems with probabilistic and quantile criteria were studied. The sample approximations were reduced to deterministic mixed integer optimization problems. These problems can be solved by using special software for nonlinear optimization. Conditions ensuring the convergence of the sample approximations were given. Since these conditions require that the leader's loss function be a normal integrand, some classes of the considered problems with such leader's loss functions were described. Although the convergence was proved, the sufficient sample size for an infinite set of feasible leader's variables (and, for the quantile minimization problem, even when this set is finite) is still unknown. This question can be studied in future research.