Abstract
The vast majority of the literature on stochastic semidefinite programs (stochastic SDPs) with recourse is concerned with risk-neutral models. In this paper, we introduce mean-risk models for stochastic SDPs and study structural properties as convexity and (Lipschitz) continuity. Special emphasis is placed on stability with respect to changes of the underlying probability distribution. Perturbations of the true distribution may arise from incomplete information or working with (finite discrete) approximations for the sake of computational efficiency. We discuss extended formulations for stochastic SDPs under finite discrete distributions, which turn out to be deterministic (mixed-integer) SDPs that are (almost) block-structured for many popular risk measures.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
Stochastic semidefinite programs with recourse were first considered by Ariyawansa and Zhu in [1], where, for finite discrete distributions, the authors reformulate the risk-neutral stochastic SDP as a block-structured deterministic SDP and discuss an application to the stochastic version of the minimum-volume covering ellipsoid problem (cf. [23, 25]). In [29], the same authors give a multitude of other applications, including problems in geometry, location aided routing, RC circuit design and structural optimization.
Some approaches to the algorithmic treatment of risk neutral programs with linear recourse carry over to expectation based stochastic SDPs. Extending the results of Zhao (cf. [27]), Mehrotra and Özevin derive a polynomial logarithmic barrier algorithm employing Bender’s decomposition (cf. [15]). Using the volumetric barrier of Vaidya (cf. [24]), Ariyawansa and Zhu construct algorithms of similar complexity in [2]. Furthermore, in [13], Jin, Ariyawansa and Zhu propose homogeneous self-dual algorithms with complexities comparable to the ones of the methods mentioned before. Motivated by an application in multi-antenna wireless networks, Gaujal and Mertikopoulos establish a stochastic approximation algorithm in [9].
Chance constrained SDP models are introduced by Ariyawansa and Zhu in [28, Chapter 3], where an application to the stochastic minimum-volume covering ellipsoid problem is considered. A different approach towards risk-aversion is taken by Schultz and Wollenberg [20], who consider stochastic mixed-integer semidefinite programs arising from unit commitment problems in AC transmission systems. Based on Lagrangian relaxation of the nonanticipativity constraint, a decomposition algorithm for minimizing a weighted sum of the expectation and the probability of exceeding a certain threshold is proposed.
The present work extends the models of [20] and [2] by considering more general risk measures. Instead of focussing on a certain application, we discuss structural properties as convexity and (Lipschitz) continuity of the resulting objective functions. Furthermore, we establish sufficient conditions for differentiability in the risk neutral setting.
Consequences for quantitative stability of the stochastic SDP models under perturbations of the underlying distribution are pointed out. Such perturbations may arise from incomplete information about the distribution or the choice to work with a simpler (possibly finite discrete) approximation for reasons of computational efficiency.
Finally, for finite discrete distributions, we establish equivalent almost block-structured SDPs for various risk measures. For instance, these results allow to adopt the well-known L-shaped method by Slyke and Wets (cf. [22]) to various risk-averse stochastic SDP models. Wollenberg has demonstrated how block structures can be exploited to solve large stochastic SDPs in the risk neutral setting (cf. [26]).
2 Two-Stage Stochastic SDPs with Continuous Recourse
Let \(\mathcal {S}^{k}_{+}\) denote the cone of symmetric positive semidefinite matrices in \(\mathbb {R}^{k \times k}\). The componentwise Frobenius product of \(A = (a_{1}, \ldots , a_{s})^{\top } \in (\mathcal {S}^{k}_{+})^{s}\) and \(x \in \mathcal {S}^{k}_{+}\) is defined as \(A \bullet x := \left (\text {tr}(a_{1} x), \ldots , \text {tr}(a_{s} x) \right )^{\top } \in \mathbb {R}^{s}\), where tr denotes the trace of a quadratic matrix. Furthermore, the Frobenius norm on \(\mathcal {S}^{k}_{+}\) is given by
We shall consider the parametric SDP
where \(z \in \mathbb {R}^{s}\) enters as a parameter. The data is comprised of \(c \in \mathcal {S}^{n}_{+}\), \(q \in \mathcal {S}^{m}_{+}\), \(T \in (\mathcal {S}^{n}_{+})^{s}\), \(W \in (\mathcal {S}^{m}_{+})^{s}\) and a nonempty, closed, convex set \(X \subseteq \mathcal {S}^{n}_{+}\). The set X is usually given as a spectrahedron, i.e., the intersection of the solution sets of a finite number of affine matrix inequalities with the cone of positive semidefinite matrices.
Let z = Z(ω) be the realization of a random vector \(Z: {\Omega } \to \mathbb {R}^{s}\) on some probability space \(({\Omega }, \mathcal {F}, \mathbb {P})\). A two-stage stochastic SDP arises from (P(z)) if the decision x has to be taken without knowledge of the particular realization Z(ω), while y can be chosen after observing the previously unknown parameter. In this setting, the optimal decision y is governed by the recourse problem
Let \(\varphi : \mathbb {R}^{s} \to \overline {\mathbb {R}}\) denote the optimal value function of (1) with respect to the right-hand side of the system of matrix equations in its constraints, i.e.,
Introducing the function \(f: \mathcal {S}^{n}_{+} \times \mathbb {R}^{s} \to \overline {\mathbb {R}}\), f(x,z) := c ∙ x + φ(z − T ∙ x) we may rewrite (P(Z(⋅)) as
Due to the assumed interplay between decision and observation, problem (2) is not well-defined without further modelling choices. For any x, f(x,Z(⋅)) belongs to the space \(L^{0}({\Omega }, \mathcal {F}, \mathbb {P})\) of extended real-valued random variables on the underlying probability space. We thus may fix any functional \(\mathcal {R}: \mathcal {X} \to \mathbb {R}\) satisfying
and consider the optimization problem
where the mapping \(Q_{\mathcal {R}}: \mathcal {S}^{n}_{+} \to \mathbb {R}\) is given by \(Q_{\mathcal {R}}(x) = \mathcal {R}[f(x,Z(\cdot ))]\).
We shall work with the following assumptions:
- A1
(Relatively complete recourse) \(W \bullet \mathcal {S}^{m}_{+} \supseteq \text {supp} \mathbb {P} \circ Z^{-1}\).
- A2
(Strict dual feasibility) There is some \(u \in \mathbb {R}^{s}\) such that q − W⊤u is positive definite.
Similar, yet more restrictive assumptions are also made in [15].
Lemma 1
Assume A2, then A1 holds if and only if \(M_{D} := \{u \in \mathbb {R}^{s} | q - W^{\top } u \in \mathcal {S}^{m}_{+} \}\) is compact.
Proof
MD is closed due to the closedness of \(S^{m}_{+}\). Suppose that MD is unbounded, i.e., that there exists a sequence \(\{u_{k}\}_{k \in \mathbb {N}} \subseteq M_{D}\) with \(\lim _{k \to \infty } \|u_{k}\| = \infty \). Define vk := uk/∥uk∥, then ∥vk∥ = 1 holds for all \(k \in \mathbb {N}\). Therefore, the sequence \(\{v_{k} \}_{k \in \mathbb {N}}\) can be assumed to converge to some v≠ 0 without loss of generality. By uk ∈ MD we have \(q -W^{\top } u_{k} \in \mathcal {S}^{m}_{+}\) for all \(k \in \mathbb {N}\). Thus,
Now select any u0 ∈ MD. Then u0 + αv ∈ MD holds for any α ≥ 0 and we have
verifying \(\sup \{v^{\top } u | q - W^{\top } u \in \mathcal {S}^{m}_{+} \} = \infty \). By duality, the set \(\{y \in \mathcal {S}^{m}_{+} | W \bullet y = v \}\) has to be empty, which contradicts A1.
Let MD be compact, then once again by duality for arbitrary \(t \in \mathbb {R}^{s}\), there exists u ∈ MD with \({\min \nolimits } \{q \bullet y | W \bullet y = t, y \in \mathcal {S}^{m}_{+} \} = t^{\top } u\), which implies \(t \in W \bullet \mathcal {S}^{m}_{+}\) and thus A1. □
The lemma above shows that \(\sup \{t^{\top } u | q - W^{\top } u \in \mathcal {S}^{m}_{+} \}\) is attained for any \(t \in \mathbb {R}^{s}\) whenever A1 and A2 hold true.
Lemma 2
Assume A1 and A2, then φ is finite, convex and Lipschitz continuous on \(\mathbb {R}^{s}\) .
Proof
Due to A1 and A2, strong duality holds true for the SDP defining φ. We thus have
As MD is nonempty and compact by Lemma 1, φ is finite on \(\mathbb {R}^{s}\).
Furthermore, for arbitrary λ ∈ [0,1] and \(t_{1}, t_{2} \in \mathbb {R}^{s}\), strong duality implies
which proves the asserted convexity of φ.
To establish Lipschitz continuity, let \(t_{1}, t_{2} \in \mathbb {R}^{s}\) be arbitrary and fixed. Then by strong duality and the compactness of MD, there exists u1,u2 ∈ MD such that \(\varphi (t_{1}) = t_{1}^{\top } u_{1}\) and \(\varphi (t_{2}) = t_{2}^{\top } u_{2}\). By \(t_{1}^{\top } u_{1} \geq t_{1}^{\top } u_{2}\) and \(t_{2}^{\top } u_{2} \geq t_{2}^{\top } u_{1}\) we have
and thus \(|\varphi (t_{1})-\varphi (t_{2})| \leq \max \nolimits \{\|u_{1}\|,\|u_{2}\| \} \|t_{1}-t_{2}\|\). Set \(L_{\varphi } := \max \nolimits \nolimits _{u\in M_{D}}\|u\| < \infty \), then
holds for all \(t_{1}, t_{2} \in \mathbb {R}^{s}\), which completes the proof. □
Remark 1
Under assumptions A1 and A2, φ is finite and convex, which implies directional differentiability by [18, Theorem 25.4]. Furthermore, the subdifferential of φ is convex, compact and admits the representation
By [18, Theorem 25.1], φ is differentiable at t if and only if ∂φ(t) is a singleton. In that case, we have ∂φ(t) = {∇φ(t)}.
Remark 2
In two-stage stochastic linear programming, the counterpart of φ is the optimal value function of a linear program:
with \(q_{l} \in \mathbb {R}^{m}\) and \(W_{l} \in \mathbb {R}^{s \times m}\). By linear programming theory, φl is finite on \(\mathbb {R}^{s}\) iff \(W_{l} (\mathbb {R}^{m}_{+}) =\mathbb {R}^{s}\) and \(M_{D_{l}} = \{u \in \mathbb {R}^{s} | W_{l}^{\top } u \leq q\} \neq \emptyset \). In this situation, φl admits the representation
where \(d_{1},\dots ,d_{N}\) denote the vertices of the polytope \(M_{D_{l}}\). In particular, φl is piecewise linear, convex and Lipschitz continuous.
The following example shows that the assumptions A1 and MD≠∅ are not sufficient to ensure that the optimal value in the problem defining φ(t) is attained for all \(t \in \mathbb {R}^{s}\).
Example 1
For \(t \in \mathbb {R}\), consider the SDP
For any \(t \in \mathbb {R}\) we have
Consequently, A1 is fulfilled. Moreover, we have
As (4) is strictly feasible for any right-hand side \(t \in \mathbb {R}^{s}\), strong duality holds and (5) implies that the infimum of (4) is zero. Furthermore, for any \(t \in \mathbb {R} \setminus \{ 0 \}\) we have
which yields the lower bound y11 ≥ t2/(4y22) > 0 for any y that is feasible for (4). Consequently, the optimal value in (4) is not attained if t≠ 0.
3 Structure of Risk-Averse Stochastic SDPs
Let us now return to problem (3) and consider various choices of \(\mathcal {R}\). To ensure finiteness, we shall work with moment conditions on the Borel probability measure \(\mathbb {P} \circ Z^{-1}\) induced by the underlying random vector Z(⋅). Let \(\mathcal {P}(\mathbb {R}^{s})\) denote the space of all Borel probability measures on \(\mathbb {R}^{s}\) and
be the subspace of measures having finite moments of order p ≥ 1.
Lemma 3
Assume A1, A2 and\(\mathbb {P} \circ Z^{-1} \in {}^{p}_{s}\).Then\(f(x,Z(\cdot )) \in L^{p}({\Omega }, \mathcal {F}, \mathbb {P})\)forall\(x \in \mathcal {S}^{n}_{+}\)andthe mapping\(F: \mathcal {S}^{n}_{+} \to L^{p}({\Omega }, \mathcal {F}, \mathbb {P})\),F(x) := f(x,Z(⋅)) isconvex and Lipschitz continuous with constant∥c∥ + Lφ ⋅∥T∥ w.r.t.theLp-norm.
Proof
For any \(x \in \mathcal {S}^{n}_{+}\) we have
by Lemma 2.
For any \(x_{1}, x_{2} \in \mathcal {S}^{n}_{+}\), λ ∈ [0,1] and \(z \in \mathbb {R}^{s}\), the convexity of φ yields
and thus in particular F(λx1 + (1 − λ)x2) ≤ λF(x1) + (1 − λ)F(x2) with respect to the \(\mathbb {P}\)-almost sure partial order, proving the asserted convexity of F.
Finally,
holds for all \(x_{1}, x_{2} \in \mathcal {S}^{n}_{+}\). □
Remark 3
Lemma 3 shows that under a suitable moment condition on Z the classical Lp-spaces are a natural choice for the domain \(\mathcal {X}\) of \(\mathcal {R}\) (\(1 \leq p < \infty \)). Furthermore, Lemma 2 shows that if Z has bounded support one may choose \(\mathcal {X} = L^{\infty }({\Omega }, \mathcal {F}, \mathbb {P})\).
Definition 1
A mapping \(\mathcal {R}: \mathcal {X} \to \mathbb {R}\) defined on some linear subspace \(\mathcal {X}\) of \(L^{0}({\Omega }, \mathcal {F}, \mathbb {P})\) containing the constants is called a convex risk measure if the following conditions are fulfilled:
- 1.
(Convexity) For any \(Z_{1}, Z_{2} \in \mathcal {X}\) and λ ∈ [0,1] we have
$$ \mathcal{R}[\lambda Z_{1} + (1-\lambda)Z_{2}] \leq \lambda \mathcal{R}[Z_{1}] + (1 - \lambda) \mathcal{R}[Z_{2}]. $$ - 2.
(Monotonicity) \(\mathcal {R}[Z_{1}] \leq \mathcal {R}[Z_{2}]\) for all \(Z_{1}, Z_{2} \in \mathcal {X}\) satisfying Z1 ≤ Z2 with respect to the \(\mathbb {P}\)-almost sure partial order.
- 3.
(Translation equivariance) \(\mathcal {R}[Z_{1} + z_{2}] = \mathcal {R}[Z_{1}] + z_{2}\) for all \(Z_{1} \in \mathcal {X}\) and \(z_{2} \in \mathbb {R}\).
A convex risk measure \(\mathcal {R}\) is coherent if the following holds true:
- 4.
(Positive homogeneity) \(\mathcal {R}[z_{2} Z_{1}] = z_{2} \cdot \mathcal {R}[Z_{1}]\) for all \(Z_{1} \in \mathcal {X}\) and \(z_{2} \in [0,\infty )\).
Definition 2
A mapping \(\mathcal {R}: \mathcal {X} \to \mathbb {R}\) is called law-invariant if for all \(Z_{1}, Z_{2} \in \mathcal {X}\) with \(\mathbb {P} \circ Z_{1}^{-1} = \mathbb {P} \circ Z_{2}^{-1}\) we have \(\mathcal {R}[Z_{1}] = \mathcal {R}[Z_{2}]\).
We shall give some examples of risk-measures frequently used in stochastic programming as listed in [17, pp. 447–448], and [21]. Later we will give extensive formulations of discrete mean-risk SDPs based on these risk-measures:
- (i)
The expectation \(\mathbb {E}: L^{1}({\Omega }, \mathcal {F}, \mathbb {P}) \to \mathbb {R}\) is a law-invariant coherent risk-measure.
- (ii)
The expected excess over threshold \(\eta \in \mathbb {R}\) (as used in [19]) is the mapping \(\mathbb {EE}_{\eta } : L^{1}({\Omega }, \mathcal {F}, \mathbb {P}) \rightarrow \mathbb {R}\) defined by
$$ \mathbb{EE}_{\eta}[Y] = {\int}_{\Omega} \max\{Y(\omega) - \eta, 0\} \mathbb{P}(\text{d}\omega). $$This is a non-decreasing, convex and law-invariant risk measure, but in general not translation-equivariant.
- (iii)
The conditional value-at-risk at level α ∈ (0,1)
$$ \mathbb{CV}\text{@R}_{\alpha}: L^{1}({\Omega}, \mathcal{F}, \mathbb{P}) \to \mathbb{R}, \quad \mathbb{CV}\text{@R}_{\alpha}[Y] = \min_{\eta \in \mathbb{R}} \left\{ \eta + \frac{1}{1 - \alpha} \mathbb{EE}_{\eta}(Y) \right\} $$(6)is law-invariant coherent (cf. [16]).
- (iv)
The value-at-risk at level α ∈ (0,1)
$$ \mathbb{V}\text{@R}_{\alpha}: L^{0}({\Omega}, \mathcal{F}, \mathbb{P}) \to \mathbb{R}, \quad \mathbb{V}\text{@R}_{\alpha}[Y] = \inf\{t ~|~ \mathbb{P}(Z(\omega) \leq t) \geq \alpha\} $$is nondecreasing, law-invariant, translation-equivariant and positively homogenous, but in general non-convex.
- (v)
The upper semi-deviation of order p is the mapping \(\mathbb {M}\text {ad}^{+}_{p} : L^{p}({\Omega }, \mathcal {F}, \mathbb {P}) \rightarrow \mathbb {R}\) defined by
$$ \mathbb{M}\text{ad}^{+}_{p}[Y] = \left( \int \max\{ 0 , Y(\omega) - \mathbb{E}_{\mathbb{P}}[Z] \}^{p} \mathbb{P}(\text{d}\omega) \right)^{\frac{1}{p}}. $$For ρ ∈ [0,1] this gives rise to the law-invariant and coherent risk measure \(\mathbb {E} + \rho \mathbb {M}\text {ad}_{p}\) (cf. [21, p. 276]).
Proposition 1
Assume A1 and A2, let \(\mathcal {X}\) be a convex subset of \(L^{0}({\Omega }, \mathcal {F}, \mathbb {P})\) that contains \(F(\mathcal {S}^{n}_{+})\) and fix a convex and nondecreasing mapping \(\mathcal {R}: \mathcal {X} \to \mathbb {R}\). Then \(Q_{\mathcal {R}}\) is finite and convex on \(\mathcal {S}^{n}_{+}\). In particular, problem (3) is convex.
Proof
Finiteness of \(Q_{\mathcal {R}}\) follows directly from the finiteness of \(\mathcal {R}\). Furthermore, for any \(x_{1}, x_{2} \in \mathcal {S}^{n}_{+}\) and λ ∈ [0,1] we have
The first inequality above holds due to the monotonicity of \(\mathcal {R}\) and the convexity of F (by Lemma 3), while the second one is justified by the convexity of \(\mathcal {R}\). □
Similar to the previous proposition, continuity properties of \(\mathcal {R}\) carry over to Lipschitzian properties of \(Q_{\mathcal {R}}\):
Proposition 2
Assume A1, A2 and \(\mathbb {P} \circ Z^{-1} \in {}^{p}_{s}\). If \(\mathcal {R} : \mathcal {X} \rightarrow \mathbb {R}\) is convex and continuous, then \(Q_{\mathcal {R}}\) is locally Lipschitz continuous on \(\mathcal {S}^{n}_{+}\).
Proof
It is well-known (cf. [7]) that a real-valued continuous and convex mapping on a normed space is locally Lipschitz continuous. Hence the result directly follows from Lemma 3. □
Proposition 3
Assume A1, A2 and \(\mathbb {P} \circ Z^{-1} \in {}^{p}_{s}\). If \(\mathcal {R} : \mathcal {X} \rightarrow \mathbb {R}\) is Lipschitz, then \(Q_{\mathcal {R}}\) is Lipschitz on \(\mathcal {S}^{n}_{+}\).
Proof
Straightforward. □
Remark 4
The required continuity in Proposition 2 is always satisfied if \(\mathcal {R}\) is convex and nondecreasing (w.r.t. to the \(\mathbb {P}\)-a.s. partial order). Lipschitz continuity holds for all coherent risk measures (cf. [12]). Concrete Lipschitz constants for coherent law-invariant risk measures may be obtained from representation results, e.g. from [3].
Proposition 4
Assume A1, A2 and that the support of \(\mathbb {P} \circ Z^{-1}\) is bounded. Furthermore, let \(\mathcal {R}: L^{\infty }({\Omega }, \mathcal {F}, \mathbb {P}) \to \mathbb {R}\) be a coherent risk measure. Then \(Q_{\mathcal {R}}\) is Lipschitz continuous with constant ∥c∥ + Lφ ⋅∥T∥ on \(\mathcal {S}^{n}_{+}\).
Proof
\(\mathcal {R}\) is Lipschitz continuous with constant 1 with respect to the \(L^{\infty }\)-norm on by \(L^{\infty }({\Omega }, \mathcal {F}, \mathbb {P})\) by [8, Lemma 4.3] and F takes essentially bounded values by Remark 3. Thus, for any \(x_{1}, x_{2} \in \mathcal {S}^{n}_{+}\) we have
by Lemma 3. □
We shall now turn our attention to questions of differentiability, but confine the analysis to the risk neutral model.
Lemma 4
Assume A1, A2 and\(\mathbb {P} \circ Z^{-1} \in {}^{1}_{s}\), then thefunctional\(Q_{\mathbb {E}}: \mathcal {S}^{n}_{+} \to \mathbb {R}\),\(Q_{\mathbb {E}}(x) := \mathbb {E}[F(x)]\)isdirectionally differentiableand
holds for all \(x, v \in \mathcal {S}^{n}_{+}\).
Proof
\(Q_{\mathbb {E}}\) is finite valued by Lemma 3, convex by Proposition 1 and thus directionally differentiable (cf. [18, Theorem 25.4]). Furthermore, \(\varphi ^{\prime }(\cdot -Tx;v)\) is a pointwise limit of measurable functions and thus measurable for any \(x, v \in \mathcal {S}^{n}_{+}\). The asserted representation of the directional derivative is justified by Lemma 2 and [4, Proposition 2.1]. □
Sufficient conditions for differentiability \(Q_{\mathbb {E}}\) can be obtained using the same arguments as for linear recourse (cf. [21]).
Lemma 5
Assume A1, A2 and\(\mathbb {P} \circ Z^{-1} \in {}^{1}_{s}\)andlet\(x_{0} \in \mathcal {S}^{n}_{+}\)be suchthat
is a singleton for \((\mathbb {P} \circ Z^{-1})\)-almost all \(z \in \mathbb {R}^{s}\). Then \(Q_{\mathbb {E}}\) is differentiable at x0.
Proof
For \((\mathbb {P} \circ Z^{-1})\)-almost all \(z \in \mathbb {R}^{s}\), \(h_{z}: \mathcal {S}^{n}_{+} \to \mathbb {R}\), hz(x) = c ∙ x + φ(z − T ∙ x) is differentiable with measurable derivative
Consider the functions \(g_{z}: \mathcal {S}^{n}_{+} \to \mathbb {R}\) defined by
then \(\lim _{x \to x_{0}} g_{z}(x) = 0\) holds for \((\mathbb {P} \circ Z^{-1})\)-almost all \(z \in \mathbb {R}^{s}\). Furthermore, Lemma 2 implies ∥gz(x)∥≤ 2(Lφ∥T∥ + ∥c∥) for all \(x \in \mathcal {S}^{n}_{+}\) and \(z \in \mathbb {R}^{s}\). Hence, by Lebesgue’s dominated convergence theorem, we have
Consequently, \(Q_{\mathbb {E}}\) is differentiable at x0 and \(Q_{\mathbb {E}}^{\prime }(x_{0}) = {\int \nolimits }_{\mathbb {R}^{s}} h_{z}^{\prime }(x_{0})~(\mathbb {P} \circ Z^{-1})(dz)\). □
Corollary 1
Assume A1, A2 and that \(\mathbb {P} \circ Z^{-1} \in {}^{1}_{s}\) is absolutely continuous with respect to the Lebesgue measure. Then \(Q_{\mathbb {E}}\) is continuously differentiable on \(\mathcal {S}^{n}_{+}\) .
Proof
Let \(N_{\varphi } \subset \mathbb {R}^{s}\) denote the set of points of nondifferentiability of φ. By [18, Theorem 25.5],
is a null set with respect to the Lebesgue measure for any \(x \in \mathcal {S}^{n}_{+}\), which implies \((\mathbb {P} \circ Z^{-1})[N_{x}] = 0\). Consequently, \(Q_{\mathbb {E}}\) is differentiable on \(\mathcal {S}^{n}_{+}\). Continuity of the derivative follows from [18, Theorem 25.5] and the convexity of \(Q_{\mathbb {E}}\). □
Remark 5
Assuming A1, A2 and \(\mathbb {P} \circ Z^{-1} \in {}^{1}_{s}\), the subdifferential of \(Q_{\mathbb {E}}\) admits the representation
Further details are given in [4].
Corollary 2
Assume A2 and that the underlying random variable Z follows a finite discrete distribution withrealizations\(z_{1}, \ldots , z_{S} \in \mathbb {R}^{s}\)and respectiveprobabilitiesπ1,…,πS > 0. Furthermore,assume that\(\{y \in \mathcal {S}^{m}_{+} | W \bullet y = z_{i} - T \bullet x \}\)is nonempty for anyi ∈{1,…,S} and\(x \in \mathcal {S}^{n}_{+}\).Then
holds for any\(x \in \mathcal {S}^{n}_{+}\).
Proof
The result follows directly from [18, Theorem 23.8]. □
4 Stability
We shall now study the dependence of \(Q_{\mathcal {R}}\) on the underlying probability measure \(\mathbb {P} \circ Z^{1}\). This is motivated by the fact that in applications the true probability distribution of the random parameter may be unknown. In such situations, one may work with an approximation if the optimal value function and the optimal solution set mapping of (3) are at least semicontinuous with respect to changes of the underlying distribution. Throughout this section we will assume
- A1′
(Complete recourse) \(W \bullet \mathcal {S}^{m}_{+} = \mathbb {R}^{s}\).
instead of condition A1.
Let \(({\Omega }_{0}, \mathcal {F}_{0}, \mathbb {P}_{0})\) be an atomless probability space, i.e., assume that for any \(A \in \mathcal {F}_{0}\) with \(\mathbb {P}_{0}(A) > 0\) there exists some \(B \subsetneq A\) with \(B \in \mathcal {F}_{0}\) and \(\mathbb {P}_{0}(B) > 0\), and fix any p ≥ 1. Then for any \(\nu \in {}^{1}_{p}\) there exists some \(Z_{\nu } \in L^{p}({\Omega }_{0}, \mathcal {F}_{0}, \mathbb {P}_{0})\) such that \(\mathbb {P}_{0} \circ Z_{\nu }^{-1}\). Thus, given any law-invariant mapping \(\mathcal {R}_{0}: L^{p}({\Omega }_{0}, \mathcal {F}_{0}, \mathbb {P}_{0}) \to \mathbb {R}\), the function
is well-defined. Furthermore, we can construct a mapping \(\mathcal {R}_{\mathcal {R}_{0}}: L^{p}({\Omega }, \mathcal {F}, \mathbb {P}) \to \mathbb {R}\) by setting \(\mathcal {R}_{\mathcal {R}_{0}}[Z_{1}] := {\Theta }_{\mathcal {R}_{0}}[\mathbb {P} \circ Z_{1}^{-1}]\). To ease the notation, we shall assume that \(({\Omega }, \mathcal {F}, \mathbb {P})\) itself is atomless. Given any law-invariant mapping \(\mathcal {R}: L^{p}({\Omega }, \mathcal {F}, \mathbb {P}) \to \mathbb {R}\), we shall consider the function
For the following analysis, we equip the space \(\mathcal {P}(\mathbb {R}^{s})\) with the topology of weak convergence, where a sequence \(\{ \mu _{k} \}_{k \in \mathbb {N}} \subseteq \mathcal {P}(\mathbb {R}^{s})\) converges to some \(\mu \in \mathcal {P}(\mathbb {R}^{s})\), written \(\mu _{k} \stackrel {w}{\rightarrow } \mu \) if and only if
holds for any bounded and continuous function \(h: \mathbb {R}^{s} \to \mathbb {R}\). It is well known that even for linear recourse one cannot expect weak continuity of \(\mathcal {Q}_{\mathcal {R}}\) on the entire space \(\mathcal {S}^{n}_{+} \times {}^{p}_{s}\). Along the lines of [6], we shall thus restrict the analysis to appropriate subspaces.
Definition 3
A set \({} \subseteq {}^{p}_{s}\) is called locally uniformly ∥⋅∥p-integrating if for any \(\mu \in {}\) and any 𝜖 > 0 there exists some open neighborhood \(\mathcal {N}\) of μ with respect to the topology of weak convergence such that
Example 2
-
(a)
For any K,𝜖 > 0 and p ≥ 1, the set
$$ U(\epsilon, K) := \left\{ \nu \in \mathcal{M}^{p}_{s} : {\int}_{\mathbb{R}^{s}} \|t\|^{1+\epsilon}~\nu(dt) \leq K \right\} $$of measures having uniformly bounded moments of order 1 + 𝜖 is locally uniformly ∥⋅∥p-integrating (cf. [5, Lemma 2.69]).
-
(b)
For any p ≥ 1 and compact set \({\Xi } \subset \mathbb {R}^{s}\), the set
$$ \left\{ \nu \in \mathcal{M}^{p}_{s} : {\int}_{\Xi} 1~\nu(dt) = 1 \right\} $$of measures with support in Ξ is locally uniformly ∥⋅∥p-integrating by [14, Lemma 5.1].
-
(c)
Any singleton \(\{\mu \} \subseteq {}^{p}_{s}\) is locally uniformly ∥⋅∥p-integrating for any p ≥ 1 by [14, Lemma 5.2].
Theorem 1
Let\(\mathcal {R}: L^{p}({\Omega }, \mathcal {F}, \mathbb {P}) \to \mathbb {R}\)withp ≥ 1 belaw-invariant, convex and nondecreasing. AssumeA1\(^{\prime }\)and A2 andlet\({} \subseteq {}^{p}_{s}\)be locally uniformly∥⋅∥p-integrating.Then the following statements hold true:
- 1.
The restriction of\(\mathcal {Q}_{\mathcal {R}}\)tothe set\(\mathcal {S}^{n}_{+} \times {}\)iscontinuous with respect to the product topology of the the standard topologyon\(\mathcal {S}^{n}_{+}\)andthe relative topology of weak convergenceon\({}\).
- 2.
The optimal valuefunction
$$ \phi: \mathcal{M} \to \mathbb{R}, \quad \phi(\mu) := \min_{x} \{\mathcal{Q}_{\mathcal{R}}(x,\mu) | x \in X \} $$is weakly upper semicontinuous.
Additionally assume that X is compact. Then
- 3.
ϕ is weakly continuous.
- 4.
The optimal solution set mapping
$$ {\Phi}: \mathcal{M} \rightrightarrows \mathcal{S}^{n}_{+}, \quad {\Phi}(\mu) := \text{Argmin}_{x} \lbrace \mathcal{Q}_{\mathcal{R}}(x,\mu) | x \in X \rbrace $$is weakly upper semicontinuous in the sense of Berge, i.e., for any \(\mu _{0} \in {}\) and any open set \(\mathcal {O} \subseteq \mathcal {S}^{n}_{+}\) with \({\Phi }(\mu _{0}) \subseteq \mathcal {O}\) there exists a weakly open neighborhood \(\mathcal {N}\) of μ0 such that \({\Phi }(\mu ) \subseteq \mathcal {O}\) for all \(\mu \in \mathcal {N} \cap {}\). Furthermore, Φ(μ) is nonempty and compact for any \(\mu \in {}\).
Proof
Invoking Lemma 2, the result follows from [6, Corollary 2]. □
5 Extensive Formulations for Finite Discrete Distributions
Throughout this section, we shall assume A1, A2 and that the underlying random variable Z follows a finite discrete distribution with realizations \(z_{1}, \ldots , z_{S} \in \mathbb {R}^{s}\) and respective probabilities π1,…,πS > 0. Furthermore, we denote the index set {1,…,S} by \(\mathcal {I}_{S}\).
It is well known that in the risk neutral setting, the stochastic SDP admits a reformulation as a block-structured SDP (cf. [1, 15]):
Proposition 5
The risk neutral stochastic SDP
is equivalent to the SDP
in the sense that the infimal values of the problems coincide. Furthermore, x is an optimal solution for (7) if and only if there exist y1,…,yS such that (x,y1,…,yS) is an optimal solution for (8).
Proof
By definition of φ,
holds for any x ∈ X, \(y_{1}, \ldots , y_{S} \in \mathcal {S}^{m}_{+}\) satisfying T ∙ x + W ∙ yi = zi for all \(i \in \mathcal {I}_{S}\). Thus, the infimal value of (7) is less or equal to the infimal value of (8). Furthermore, (9) is satisfied as equality if and only if
holds for all \(i \in \mathcal {I}_{S}\). The optimal solution set above is nonempty by strong duality, which holds due to A1 and A2. □
We continue with extensive formulations of the SDP (3) for mean-risk models based on the risk measures immediately following Definition 2. In this context, ρ shall always be a nonnegative, predefined parameter indicating risk-aversion in the optimization.
Proposition 6
with \(\eta \in \mathbb {R}\) as a given parameter, can be equivalently restated as
Proof
As the objective function of (11) is increasing with respect to v, any optimal solution (x,v1,…,vS,y1,…,yS) satisfies \(v_{i} = {\max \nolimits } \{c \bullet x + q \bullet y_{i} - \eta , 0 \}\) for all \(i \in \mathcal {I}_{S}\). The asserted equivalence of (10) and (11) then follows as in the proof of Proposition 5. □
Proposition 7
can be equivalently restated as
Proof
This follows directly from the variational representation of \(\mathbb {CV}@R\) in (6). The expected-excess can be pushed into the restrictions by the same trick as in Proposition 6. □
As in in the risk-neutral case, problems (11) and (12) exhibit a block structure, i.e., there is no coupling constraint involving variables associated with different scenarios. This allows for a direct adaptation of the decomposition algorithms established for the expectation based model.
Proposition 8
can be equivalently restated as
Proof
Analogous to Proposition 6. □
Unlike the previous models, the equivalent SDP in Proposition 8 contains an individual coupling constraint for each scenario. While Lagrangian relaxation still is possible, it remains to be examined whether this approach is sensible from a computational point of view.
Proposition 9
Consider the problem
with compact set X. This problem can be equivalently restated as the following SDP with binary variables
if \(M \in \mathbb {R}\) is chosen sufficiently big.
Proof
As in the preceding propositions introduce a dummy variable η to push \(\mathbb {V}@R[\varphi (z - T \bullet x)]\) into the restrictions as \(\eta \geq \mathbb {V}@R[\varphi (z - T \bullet x)]\) and minimize over η. Note that \(\eta \geq \mathbb {V}@R[ \varphi (z - T \bullet x) ]\) is equivalent to
As for given x ∈ X feasible points to the second stage problem corresponding to realization zi are denoted as yi, (14) can be rewritten as
This conditional summation can in turn be cast into inequalities with binary variables δi, \(i \in \mathcal {I}_{S}\),
if M is chosen such that η − q ∙ yi < M for all feasible yi and all η close to \(\mathbb {V}@R[ \varphi (z_{i} - T \bullet x)]\). Since − q ∙ yi ≤−φ(zi − T ∙ x) the existence of M follows from compactness of X, as \(\max \nolimits _{x \in X} \varphi (z_{i} - T \bullet x) < \infty \) for all \(i \in \mathcal {I}_{S}\). □
Equation (13) does not decompose scenariowise due to the coupling constraint \({\sum }_{i = 1}^{S} \delta _{i} \pi _{i} \geq \alpha \), which involves variables from all scenarios. Furthermore, it has an additional binary variable for each scenario. Problems of a similar structure have been considered in the context of minimizing a weighted sum of the expectation and the probability of exceeding a fixed threshold in [20], where Lagrangian relaxation of the coupling constraint enables an approach based on Bender’s decomposition. This direction seems also very promising for the algorithmic treatment of (13).
6 Future Research
Some interesting directions to be considered in a future research project are models involving stochastic costs and stochastic dominance SDP-models. In the presence of stochastic costs the recourse function φ is piecewise quadratic and nonconvex which makes the analysis in Lemma 3 more delicate. For first- and second-order dominance constraints, some of the structural results in [10] and [11] immediately carry over to case of stochastic SDPs. Their numerical treatment shall also be considered in future work.
References
Ariyawansa, K.A., Zhu, Y.: Stochastic semidefinite programming: a new paradigm for stochastic optimization. 4OR 4, 239–253 (2006)
Ariyawansa, K.A., Zhu, Y.: A class of polynomial volumetric barrier decomposition algorithms for stochastic semidefinite programming. Math. Comput. 80, 1639–1661 (2011)
Belomestny, D., Krätschmer, V.: Central limit theorems for law-invariant coherent risk measures. J. Appl. Probab. 49, 1–21 (2012)
Bertsekas, D.P.: Stochastic optimization problems with nondifferentiable cost functionals. J. Optim. Theory Appl. 12, 218–231 (1973)
Claus, M.: Advancing stability analysis of mean-risk stochastic programs: Bilevel and two-stage models. PhD thesis, University of Duisburg-Essen (2016)
Claus, M., Krätschmer, V., Schultz, R.: Weak continuity of risk functionals with applications to stochastic programming. SIAM J. Optim. 27, 91–109 (2017)
Ekeland, I., Temam, R.: Analyse Convexe et Problèmes Variationnels. Dunod, Paris (1974)
Föllmer, H., Schied, A.: Stochastic Finance: An Introduction in Discrete Time, 2nd edn. de Gruyter Studies in Mathematics, vol. 27. de Gruyter, Berlin (2004)
Gaujal, B., Mertikopoulos, P.: A stochastic approximation algorithm for stochastic semidefinite programming. Probab. Eng. Inf. Sci. 30, 431–454 (2016)
Gollmer, R., Neise, F., Schultz, R.: Stochastic programs with first-order dominance constraints induced by mixed-integer linear recourse. SIAM J. Optim. 19, 552–571 (2008)
Gollmer, R., Gotzes, U., Schultz, R.: A note on second-order stochastic dominance constraints induced by mixed-integer linear recourse. Math. Program. Ser. B 126, 179–190 (2011)
Inoue, A.: On the worst case conditional expectation. J. Math. Anal. Appl. 286, 237–247 (2003)
Jin, S., Ariyawansa, K. A., Zhu, Y.: Homogeneous self-dual algorithms for stochastic semidefinite programming. J. Optim. Theory Appl. 155, 1073–1083 (2012)
Krätschmer, V., Schied, A., Zähle, H.: Domains of weak continuity of statistical functionals with a view on robust statistics. J. Multivar. Anal. 158, 1–19 (2017)
Mehrotra, S., Özevin, M. G.: Decomposition-based interior point methods for two-stage stochastic semidefinite programming. SIAM J. Optim. 18, 206–222 (2007)
Pflug, G. Ch.: Some remarks on the value-at-risk and the conditional value-at-risk. In: Uryasev, S. P. (ed.) Probabilistic Constrained Optimization: Methodology and Applications. Nonconvex Optimization and its Applications, vol. 49, pp 272–281. Kluwer Academic Publishers, Dordrecht (2000)
Pflug, G. Ch.: Stochastic optimization and statistical inference. In: Ruszczyǹski, A., Shapiro, A (eds.) Stochastic Programming. Handbooks in Operations Research and Management Science, vol. 10, pp 427–482. Elsevier Science, Amsterdam (2003)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Schultz, R., Tiedemann, S.: Conditional value-at-risk in stochastic programs with mixed-integer recourse. Math. Program. Ser. B 105, 365–386 (2006)
Schultz, R., Wollenberg, T.: Unit commitment under uncertainty in AC transmission systems via risk averse semidefinite stochastic programs. RAIRO Oper. Res. 51, 391–416 (2017)
Shapiro, A., Dentcheva, D., Ruszczynski, A.: Lectures on Stochastic Programming. MPS-SIAM, Philadelphia (2009)
van Slyke, R.M., Wets, R.J.-B.: L-shaped linear programs with applications to optimal control and stochastic programming. SIAM J. Appl. Math. 17, 638–663 (1969)
Sun, P., Freund, R.M.: Computation of minimum-cost covering ellipsoids. Oper. Res. 52, 690–706 (2004)
Vaidya, P.M.: A new algorithm for minimizing convex functions over a convex set. Math. Program. Ser. A 73, 291–341 (1996)
Vandenberghe, L., Boyd, S.: Semidefinite programming. SIAM Rev. 38, 49–95 (1996)
Wollenberg, T.: Two-stage stochastic semidefinite programming: Theory, algorithms, and application to AC power flow under uncertainty. PhD Thesis, University of Duisburg-Essen (2016)
Zhao, G.: A log-barrier method with Benders decomposition for solving two-stage stochastic linear programs. Math. Program. Ser. A 90, 507–536 (2001)
Zhu, Y.: Semidefinite programming under uncertainy. PhD Thesis, Washington State University (2006)
Zhu, Y., Ariyawansa, K.A.: A preliminary set of applications leading to stochastic semidefinite programs and chance-constrained semidefinite programs. Appl. Math. Model. 35, 2425–2442 (2011)
Acknowledgements
The authors gratefully acknowledge the support of the German Research Foundation (DFG) within the collaborative research center TRR 154 “Mathematical Modeling, Simulation and Optimization Using the Example of Gas Networks”.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Claus, M., Schultz, R., Spürkel, K. et al. On Risk-Averse Stochastic Semidefinite Programs with Continuous Recourse. Vietnam J. Math. 47, 865–879 (2019). https://doi.org/10.1007/s10013-019-00368-0
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10013-019-00368-0