Abstract
Ariyawansa and Zhu have proposed a new class of optimization problems termed stochastic semidefinite programs to handle data uncertainty in applications leading to (deterministic) semidefinite programs. For stochastic semidefinite programs with finite event space, they have also derived a class of volumetric barrier decomposition algorithms, and proved polynomial complexity of certain members of the class. In this paper, we consider homogeneous self-dual algorithms for stochastic semidefinite programs with finite event space. We show how the structure in such problems may be exploited so that the algorithms developed in this paper have complexity similar to those of the decomposition algorithms mentioned above.
1 Introduction
While linear programs have enjoyed wide applicability, they are based on deterministic data. Stochastic linear programs (SLPs) [1] were defined to handle uncertainty in the data defining linear programs, dramatically enlarging the range of applications. Deterministic semidefinite programs (DSDPs) have been the focus of intense research during the past 20 years, especially in the context of interior point methods for optimization [2]. DSDPs generalize linear programs and have a wide variety of applications, many of which lie beyond the reach of linear programs. It is therefore natural to seek a generalization of DSDPs that handles uncertainty in data in the way SLPs generalize linear programs. In [3], Ariyawansa and Zhu have defined such a class of problems, termed stochastic semidefinite programs (SSDPs).
Ariyawansa and Zhu [4] have presented a class of volumetric barrier decomposition algorithms for SSDPs and proved polynomial complexity of certain members of the class. Their derivation is for the case where the event space of the random variables in the SSDP is finite with K realizations. Another important feature of the algorithms in [4] is that the most expensive part of the algorithm naturally separates into K subproblems, which allows efficient parallel implementations.
Decomposition can be viewed as a way of exploiting the special structure in SSDPs. In this paper, we propose homogeneous self-dual algorithms for SSDPs (with finite event spaces of their random variables) as an alternative to decomposition for handling the special structure. We show that the complexity of these algorithms is similar to those in [4], and that the most expensive part of the algorithms separates into K independent subproblems.
2 Notation and Terminology
The notation and terminology we use in the rest of this paper follow that in [2]. We let ℝm×n and ℝn∨n denote the vector spaces of real m×n matrices and real symmetric n×n matrices, respectively. We write X⪰0 (X≻0) to mean that X is positive semidefinite (positive definite), and we use U⪰V or V⪯U to mean that U−V⪰0. For U,V∈ℝm×n, we write U•V:=trace(UᵀV) to denote the Frobenius inner product of U and V.
Given A i ∈ℝn∨n for i=1,2,…,m, we define the operator \(\mathcal{A}: \mathbb{R}^{n \vee n}\to\mathbb{R}^{m}\) by
for X∈ℝn∨n. The adjoint operator of \(\mathcal{A}\) with respect to the standard inner products in ℝn∨n and ℝm is the operator \(\mathcal {A}^{\ast} :\mathbb{R}^{m} \to\mathbb{R}^{n \vee n}\) defined by \(\mathcal{A}^{\ast}y:=\sum_{i=1}^{m}y_{i} A_{i}\), for y∈ℝm.
The Kronecker product P⊛Q, where P,Q∈ℝn∨n, is the operator from ℝn∨n to itself defined by \((P\circledast Q)U:=\frac{1}{2}(PUQ^{\mathsf{T}}+QUP^{\mathsf{T}})\), for U∈ℝn∨n.
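This operator is easy to check numerically. The following minimal numpy sketch (the function name is ours, not from the paper) applies (P ⊛ Q) to a symmetric U:

```python
import numpy as np

def sym_kron_apply(P, Q, U):
    """Apply the Kronecker product operator (P ⊛ Q) to a symmetric U:
    (P ⊛ Q)U := (P U Q^T + Q U P^T) / 2."""
    return 0.5 * (P @ U @ Q.T + Q @ U @ P.T)
```

Note that the result is again symmetric whenever U is, and that (I ⊛ I)U = U.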
Finally, we recall that the operator \(\operatorname{\mathit{svec}}:\mathbb{R}^{n \vee n}\to\mathbb{R}^{\bar{n}}\) (see [5]) where \(\bar{n} := (1/2)n(n+1)\) is defined by \(\operatorname{\mathit{svec}}(U):=[ U_{11},\sqrt{2}U_{21}, \ldots,\sqrt{2}U_{n1}, U_{22},\sqrt{2}U_{32}, \ldots,\allowbreak \sqrt{2}U_{n2}, \ldots, U_{nn} ]^{\mathsf{T}}\) for U∈ℝn∨n.
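As a quick illustration (our own helper, not code from [5]), svec can be written by stacking each column of the lower triangle, diagonal entry first, with off-diagonal entries scaled by √2:

```python
import numpy as np

def svec(U):
    """svec of a symmetric matrix U: stack the lower triangle column by
    column, scaling off-diagonal entries by sqrt(2)."""
    n = U.shape[0]
    out = []
    for j in range(n):
        out.append(U[j, j])                      # diagonal entry U_jj
        out.extend(np.sqrt(2.0) * U[j+1:, j])    # sqrt(2) * subdiagonal of column j
    return np.array(out)
```

The √2 scaling makes svec an isometry for the Frobenius inner product: U•V = svec(U)ᵀsvec(V), a fact used in Sect. 5.2.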
3 Homogeneous Self-dual Methods for DSDPs
Given data A i ∈ℝn∨n for i=1,2,…,m, b∈ℝm and C∈ℝn∨n, a DSDP in primal standard form is defined as
where X∈ℝn∨n is the variable. The dual of (2) is
where y∈ℝm and S∈ℝn∨n are the variables.
We briefly review the homogeneous interior point algorithm for DSDPs as in [6]. The homogeneous model for (2)–(3) is as follows:
It is easy to show from (4) that X•S+τκ=0. The main step at each iteration of the homogeneous interior point algorithm (shown below in Algorithm 1) is the computation of the search direction (ΔX,Δy,ΔS) from the symmetrized Newton equations with respect to an invertible matrix P (which is chosen as a function of (X,y,S)) defined by:
where \(r_{p} := b\tau- \mathcal{A}X\), \(R_{d} := \mathcal{A}^{*}y + S -\tau C\), r g :=C•X−b T y+κ, μ:=[1/(n+1)](X•S+τκ), η and γ are two parameters, H P :ℝn∨n⟶ℝn∨n is the symmetrization operator defined by \(H_{P}(U) := \frac{1}{2}(PUP^{-1} + (P^{-1})^{\mathsf{T}}U^{\mathsf {T}}P^{\mathsf{T}})\), and \(\mathcal{E}: \mathbb{R}^{n \vee n}\longrightarrow\mathbb{R}^{n \vee n}\) and \(\mathcal{F}: \mathbb{R}^{n \vee n}\longrightarrow\mathbb{R}^{n \vee n}\) are the linear operators defined by \(\mathcal{E} := P\circledast(P^{-1})^{\mathsf{T}}S\) and \(\mathcal{F} := PX\circledast(P^{-1})^{\mathsf{T}}\), respectively.
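For intuition, the symmetrization operator can be checked numerically; the sketch below (naming is ours) confirms that H_P(U) is symmetric for any U and any invertible P, and that for P = I it reduces to the symmetric part of U:

```python
import numpy as np

def H_P(P, U):
    """Symmetrization operator H_P(U) := (P U P^{-1} + (P^{-1})^T U^T P^T) / 2."""
    Pinv = np.linalg.inv(P)
    return 0.5 * (P @ U @ Pinv + Pinv.T @ U.T @ P.T)
```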
Currently, the most common choices of symmetrization for search directions in practice are as follows [7]:
-
Helmberg–Rendl–Vanderbei–Wolkowicz/Kojima–Shindoh–Hara/Monteiro (HKM) direction, corresponding to P:=S^{1/2}. In this case, we have that \(\mathcal{E} = \mathcal{I}\) and \(\mathcal{F} = X \circledast S^{-1}\).
-
Kojima, Shindoh, and Hara (KSH) direction (rediscovered by Monteiro), corresponding to P:=X^{−1/2}. In this case, we have that \(\mathcal{E} = S\circledast X^{-1}\) and \(\mathcal{F} = \mathcal{I}\).
-
Nesterov–Todd (NT) direction, corresponding to P:=H^{−1/2}, where H is the unique symmetric positive definite matrix satisfying HSH=X, which can be calculated as H=X^{1/2}(X^{1/2}SX^{1/2})^{−1/2}X^{1/2}. In this case, we have \(\mathcal{E} = \mathcal{I}\) and \(\mathcal{F} = H \circledast H\).
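The NT scaling matrix can be computed from the closed-form expression above using symmetric eigendecompositions. The following numpy sketch (helper names are ours) computes H = X^{1/2}(X^{1/2}SX^{1/2})^{−1/2}X^{1/2} and verifies that it satisfies HSH = X:

```python
import numpy as np

def sqrtm_spd(A):
    """Square root of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(w)) @ V.T

def nt_scaling(X, S):
    """NT scaling matrix H = X^{1/2} (X^{1/2} S X^{1/2})^{-1/2} X^{1/2}."""
    Xh = sqrtm_spd(X)
    M = Xh @ S @ Xh
    Minvh = np.linalg.inv(sqrtm_spd(M))
    return Xh @ Minvh @ Xh
```

A short calculation confirms the identity: HSH = X^{1/2}M^{−1/2}(X^{1/2}SX^{1/2})M^{−1/2}X^{1/2} = X^{1/2}M^{−1/2}MM^{−1/2}X^{1/2} = X.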
Lemma 3.1
Suppose that X≻0, S≻0, and A 1,A 2,…,A m are linearly independent. Then for each of the above three choices of P, \(\mathcal{E}^{-1}\) and \(\mathcal{F}^{-1}\) exist, and \(\mathcal{E}^{-1}\mathcal{F}\) and \(\mathcal{F}^{-1}\mathcal{E}\) are positive definite and self-adjoint.
Proof
See [2]. □
We state the generic homogeneous algorithm as in [6].
4 Homogeneous Self-dual Algorithms for SSDPs
4.1 SSDPs with Finite Event Space
The general definition of an SSDP in primal standard form is stated in [3]. We consider the special case in which the event space is finite with K realizations. Let (deterministic) data \(W_{0}^{(i)} \in{\mathbb{R}}^{n_{0} \vee n_{0}}\) for i=1,2,…,m 0, \(h_{0} \in\mathbb{R}^{m_{0}}\) and \(C_{0} \in{\mathbb{R}}^{n_{0} \vee n_{0}}\); and realizations of random data \(B^{(i)}_{k}\in{\mathbb{R}}^{n_{0} \vee n_{0}}\), \(W^{(i)}_{k} \in\mathbb{R}^{n_{1} \vee n_{1}}\) for i=1,2,…,m 1, \(h_{k} \in\mathbb{R}^{m_{1}}\) and \(C_{k} \in\mathbb{R}^{n_{1} \vee n_{1}}\) for k=1,2,…,K be given. Then an SSDP with finite event space in primal standard form is the problem
where \(X_{0} \in{\mathbb{R}}^{n_{0} \vee n_{0}}\) and \(X_{k} \in{\mathbb{R}}^{n_{1} \vee n_{1}} \) for k=1,2,…,K are the first-stage and second-stage variables, respectively.
Problem (6) is a DSDP in primal standard form with block diagonal structure. Algorithms that exploit this special structure are especially important when K is large, as is the case in typical applications.
The dual of (6) is
where \(y \in\mathbb{R}^{(m_{0}+ K m_{1})}\), \(S_{0} \in\mathbb{R}^{n_{0} \vee n_{0}}\), and \(S_{k} \in\mathbb{R}^{n_{1} \vee n_{1}}\) for k=1,2,…,K are the variables.
Now Algorithm 1 can be applied to problem (6)–(7). In practice, K is very large, so problem (6)–(7) is large and, in particular, the computation of the search direction in Algorithm 1 (i.e., the solution of the system (5)) is very expensive. As we shall see in the next section, this computational work can be reduced significantly by exploiting the special structure of problem (6)–(7). In addition, the method we describe in the next section for computing the search direction decomposes into K smaller computations that can be performed in parallel.
4.2 An Efficient Method for Computing Search Directions
We now describe a method for computing the search direction in Algorithm 1 that exploits the special structure in (6), (7). The homogeneous model (4) for problem (6)–(7) is
where \(\mathcal{W}_{k}^{*}\) and \(\mathcal{B}_{k}^{*}\) are adjoint operators with appropriate dimensions for k=1,2,…,K.
The search direction system corresponding to (8) can be derived via (5) as
where \(r_{p_{0}} := h_{0}\tau- \mathcal{W}_{0} X_{0}\); \(r_{p_{k}} := h_{k}\tau- \mathcal{B}_{k} X_{0} - \mathcal{W}_{k} X_{k}\); \(R_{d_{0}} := \mathcal{W}_{0}^{*} y_{0} + \sum_{k=1}^{K} \mathcal {B}_{k}^{*} y_{k} + S_{0} - \tau C_{0}\); \(R_{d_{k}} := \mathcal{W}_{k}^{*} y_{k} + S_{k} - \tau C_{k}\); \(r_{g} := \kappa- \sum_{k=0}^{K} h_{k}^{\mathsf{T}}y_{k} + \sum_{k=0}^{K}C_{k} \bullet X_{k}\); \(\mu:= (\sum_{k = 0}^{K} X_{k} \bullet S_{k} + \tau\kappa)/(n_{0} +Kn_{1} + 1)\); η and γ are two parameters; and \(\mathcal{E}_{k}, \mathcal{F}_{k}\), and \(H_{P_{k}}\) are linear operators which depend only on X k and S k , for k=0,1,…,K.
Now we present the crux of our method for finding the search direction as a solution to (9). By the sixth equation of (9) and the fact that \(\mathcal{F}_{k}^{-1}\) for k=1,2,…,K exist, we have \(\Delta S_{k} = -\mathcal{F}_{k}^{-1} \mathcal{E}_{k} \Delta X_{k} + \mathcal {F}_{k}^{-1} (\gamma\mu I_{k} - H_{P_{k}}(X_{k}S_{k}))\) for k=1,2,…,K. Substituting this into the fourth equation of (9), we get \(-\mathcal{W}_{k}^{*} \Delta y_{k} + \Delta\tau C_{k} + \mathcal{F}_{k}^{-1} \mathcal{E}_{k} \Delta X_{k} = \eta R_{d_{k}} + \mathcal{F}_{k}^{-1} (\gamma \mu I_{k} - H_{P_{k}}(X_{k}S_{k}))\) for k=1,2,…,K. By this equation and by the fact that \((\mathcal{F}_{k}^{-1} \mathcal{E}_{k})^{-1} = \mathcal{E}_{k}^{-1} \mathcal{F}_{k}\), because \(\mathcal{E}_{k}^{-1}\) exist, for k=1,2,…,K, we can express ΔX k as
Substituting (10) into the second equation of (9), we have \(\mathcal{B}_{k}\Delta X_{0} + \mathcal{W}_{k} (\mathcal{E}_{k}^{-1} \mathcal{F}_{k} \mathcal{W}_{k}^{*} \Delta y_{k} - \mathcal{E}_{k}^{-1} \mathcal {F}_{k}(\Delta\tau C_{k} + \eta R_{d_{k}}) + \mathcal{E}_{k}^{-1}(\gamma\mu I_{k} - H_{P_{k}}(X_{k}S_{k}))) - h_{k} \Delta\tau= \eta r_{p_{k}}\) for k=1,2,…,K. Thus,
where
for k=1,2,…,K. By the fifth equation of (9) and the fact that \(\mathcal{F}_{0}^{-1}\) exists, we get \(\Delta S_{0} = -\mathcal{F}_{0}^{-1} \mathcal{E}_{0} \Delta X_{0} + \mathcal{F}_{0}^{-1}(\gamma\mu I_{0} - H_{P_{0}}(X_{0}S_{0}))\). We use this equality and (11) in the third equation of (9) and get \(-\mathcal{W}_{0}^{*}\Delta y_{0} - \sum_{k=1}^{K}\mathcal{B}_{k}^{*}(-M_{k}^{-1} \mathcal{B}_{k} \Delta X_{0} + q_{k} \Delta\tau+ \nu_{k}) + \Delta\tau C_{0} + \mathcal{F}_{0}^{-1} \mathcal{E}_{0} \Delta X_{0}- \mathcal{F}_{0}^{-1}(\gamma\mu I_{0} - H_{P_{0}}(X_{0} S_{0})) = \eta R_{d_{0}}\). From this relation, we have
where
Now, substituting (13) into the first equation of (9), we have \(\mathcal{W}_{0} (\mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*} \Delta y_{0} - T_{0} \Delta\tau+ U_{0}) - h_{0} \Delta\tau= \eta r_{p_{0}}\). Because \(\mathcal{W}_{0} \mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*}\) is nonsingular (a fact to be discussed in Sect. 5), this equation leads us to
where
Now we substitute backward. First, we substitute (15) in (13) to get
where
Substituting (17) in (11), we obtain
where
for k=1,2,…,K. Furthermore, we substitute (19) in (10) to get
where
for k=1,2,…,K. Now, we substitute (15), (17), (19), and (21) in the last equation of (9). By the seventh equation of (9), this yields
Finally, Δτ is given by
The remaining components of the search direction are then recovered from (23) by back substitution, and we move to the next iteration of Algorithm 1.
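The elimination pattern of this section, namely solving K independent second-stage blocks, forming a Schur complement in the first-stage variables, and back-substituting, can be illustrated on a generic block-arrow linear system. The sketch below is a plain dense-linear-algebra analogy (all names are ours), not the operator system (9) itself:

```python
import numpy as np

def solve_arrow(A0, Bs, Ds, b0, bs):
    """Solve the block-arrow system
        A0 x0 + sum_k B_k^T x_k = b0
        B_k x0 + D_k x_k        = b_k,   k = 1, ..., K
    by eliminating each x_k. The K solves against D_k are mutually
    independent, mirroring the per-scenario computations in Sect. 4.2."""
    S = A0.copy()        # Schur complement in the first-stage block
    r = b0.copy()
    for Bk, Dk, bk in zip(Bs, Ds, bs):
        S -= Bk.T @ np.linalg.solve(Dk, Bk)
        r -= Bk.T @ np.linalg.solve(Dk, bk)
    x0 = np.linalg.solve(S, r)
    # back substitution: each x_k depends only on its own block and x0
    xs = [np.linalg.solve(Dk, bk - Bk @ x0) for Bk, Dk, bk in zip(Bs, Ds, bs)]
    return x0, xs
```

Just as in the derivation above, the loop forming the Schur complement and the final list of back substitutions both split into K independent pieces.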
5 Complexity Analysis
In this section, we first show that under reasonable conditions the operations described above are valid. Then we estimate the computational complexity of Algorithm 1 with the method described in Sect. 4.2 applied on problem (6)–(7). Finally, we compare that complexity to the complexity of Algorithm 1 applied on problem (6)–(7) treating it as a generic primal-dual DSDP pair.
5.1 Validation of Computations
We assume that \(W_{0}^{(1)}, W_{0}^{(2)},\ldots, W_{0}^{(m_{0})}\), and \(W_{k}^{(1)},W_{k}^{(2)} \ldots, W_{k}^{(m_{1})}\) for k=1,2,…,K are linearly independent. Then M k for k=1,2,…,K in (12) are nonsingular and positive definite by Lemma 3.1.
Now we will show that \(\mathcal{M}_{0}\) in (14) is nonsingular and that \(\mathcal{W}_{0} \mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*}\) is also nonsingular.
Lemma 5.1
Suppose that \(W_{0}^{(1)}, W_{0}^{(2)},\ldots, W_{0}^{(m_{0})}\) and that \(W_{k}^{(1)}, W_{k}^{(2)} \ldots, W_{k}^{(m_{1})}\) are linearly independent for k=1,2,…,K. Then \(\mathcal{M}_{0}\) in (14) and \(\mathcal{W}_{0} \mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*}\) are positive definite.
Proof
From (14), we have \(\mathcal{M}_{0} = \mathcal{F}_{0}^{-1} \mathcal{E}_{0} + \sum_{k=1}^{K} \mathcal{B}_{k}^{*} M_{k}^{-1} \mathcal{B}_{k}\). We have that \(\mathcal{F}_{0}^{-1} \mathcal{E}_{0}\) is positive definite by Lemma 3.1, so it suffices to show that \(\mathcal{B}_{k}^{*}M_{k}^{-1} \mathcal{B}_{k}\) is positive semidefinite for k=1,2,…,K. Letting \(\mathcal{B}_{k} U = [B_{k}^{(1)} \bullet U, B_{k}^{(2)} \bullet U, \ldots, B_{k}^{(m_{1})} \bullet U ]^{\mathsf{T}}\), for each \(U \in\mathbb{R}^{n_{0} \vee n_{0}}\) and \(M_{k}^{-1} = [\phi _{ij}^{(k)} ]_{m_{1} \times m_{1}}\), we have
for k=1,2,…,K. So
The last inequality is due to the fact that \(M_{k}^{-1}\) is positive definite. As in [2], we can then show that \(\mathcal{W}_{0} \mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*}\) is positive definite. □
5.2 Complexity Estimates
Theorem 5.1
Suppose that m 1=m 0, n 1=n 0 and that \(W_{k}^{(1)}, W_{k}^{(2)}, \ldots, W_{k}^{(m_{0})}\) are linearly independent for k=0,1,…,K. By utilizing the method described in Sect. 4.2 for computing the search directions in Algorithm 1, we have that the number of arithmetic operations in each iteration of Algorithm 1 is \(O(K(m_{0}^{3} + m_{0}^{2} n_{0}^{4}) + n_{0}^{6})\).
Proof
The dominant computations of the method in Sect. 4.2 occur at (12), (14), (16), (18), (20), and (22). The corresponding numbers of arithmetic operations for these computations are listed in Table 1. In particular, the computation in (14) will be analyzed in detail below. The total number of arithmetic operations is dominated by \(O(K(m_{0}^{3} + m_{0}^{2} n_{0}^{4}) + n_{0}^{6})\) for the choices of P stated in Sect. 3.
We now analyze the computation in (14). Note that \(\mathcal{B}_{k}^{*}M_{k}^{-1}\mathcal{B}_{k}\) is a mapping from \({\mathbb{R}}^{n_{0} \vee n_{0}}\) to itself. We use the operator \(\operatorname{\mathit{svec}}: {\mathbb{R}}^{n_{0} \vee n_{0}} \to \mathbb{R}^{\bar{n}_{0}}\), where \(\bar{n}_{0} := \frac{1}{2}n_{0}(n_{0}+1)\), to obtain the matrix of \(\mathcal{B}_{k}^{*}M_{k}^{-1}\mathcal{B}_{k}\). For any \(U\in{\mathbb{R}}^{n_{0} \vee n_{0}}\),
The third equality above is from the relationship \(U\bullet V= \operatorname{\mathit{svec}}(U)^{\mathsf{T}}\operatorname{\mathit{svec}}(V)\) for \(U,V\in{\mathbb{R}}^{n_{0} \vee n_{0}}\) (see [5]). So the matrix of \(\mathcal{M}_{0} = \mathcal{F}_{0}^{-1}\mathcal{E}_{0} + \sum_{k=1}^{K}\mathcal{B}_{k}^{*}M_{k}^{-1}\mathcal{B}_{k}\) in \(\mathbb {R}^{n_{0} \vee n_{0}}\) is given by
where H 0 is the matrix of \(\mathcal{F}_{0}^{-1}\mathcal{E}_{0}\), which depends on different choices of symmetrization for search directions.
The number of arithmetic operations needed to compute \(\operatorname{\mathit{svec}}(B_{k}^{(i)})\operatorname{\mathit{svec}}(B_{k}^{(j)})^{\mathsf{T}}\) is \(O(\bar{n}_{0}^{2})=O(n_{0}^{4})\), so the number of arithmetic operations needed for the second term in (24) is \(O(Km_{0}^{2}n_{0}^{4})\). The inverse of (24) needs \(O(\bar{n}_{0}^{3})=O(n_{0}^{6})\) arithmetic operations. Finally, the number of arithmetic operations needed for \(\mathcal{M}_{0}^{-1}\) is \(O(Km_{0}^{2}n_{0}^{4}+n_{0}^{6})\). Thus, the number of arithmetic operations for (14) is dominated by \(O(K(m_{0}^{3}+m_{0}^{2}n_{0}^{4}) + n_{0}^{6})\). □
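In svec coordinates, each term \(\mathcal{B}_{k}^{*}M_{k}^{-1}\mathcal{B}_{k}\) of (24) is the matrix VᵀM_k^{−1}V, where row i of V is svec(B_k^{(i)}). A small numpy sketch of this assembly (helper names are ours); it also exhibits the positive semidefiniteness used in Lemma 5.1:

```python
import numpy as np

def svec(U):
    """svec of a symmetric matrix, off-diagonals scaled by sqrt(2)."""
    n = U.shape[0]
    out = []
    for j in range(n):
        out.append(U[j, j])
        out.extend(np.sqrt(2.0) * U[j+1:, j])
    return np.array(out)

def bmb_matrix(Bs, Minv):
    """Matrix of B* M^{-1} B in svec coordinates:
    sum_{i,j} (M^{-1})_{ij} svec(B_i) svec(B_j)^T = V^T M^{-1} V,
    where row i of V is svec(B_i)."""
    V = np.vstack([svec(B) for B in Bs])
    return V.T @ Minv @ V
```

Since M_k^{−1} is positive definite, VᵀM_k^{−1}V is symmetric positive semidefinite, as the proof of Lemma 5.1 requires.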
If we use a generic homogeneous algorithm such as the one in [5], then the number of arithmetic operations required to compute the search directions for (6), (7) is O(mn 3+m 2 n 2+m 3), where n:=(n 0+Kn 1) and m:=(m 0+Km 1). Setting m 1=m 0 and n 1=n 0 and substituting n=(1+K)n 0 and m=(1+K)m 0 in O(mn 3+m 2 n 2+m 3), we have that the complexity of such a generic method of computing the search directions is \(O(K^{4}(m_{0} n_{0}^{3} + m_{0}^{2} n_{0}^{2}) + K^{3} m_{0}^{3})\). This is much larger than the complexity \(O(K(m_{0}^{3} + m_{0}^{2} n_{0}^{4}) + n_{0}^{6})\) obtained for the method in Sect. 4.2 when K≫m 0 and K≫n 0.
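To see the gap concretely, one can compare the leading-term operation counts (constants ignored; the specific values of K, m 0, n 0 below are ours, chosen only for illustration):

```python
def structured_ops(K, m0, n0):
    """Leading-term count for the structured method of Sect. 4.2."""
    return K * (m0**3 + m0**2 * n0**4) + n0**6

def generic_ops(K, m0, n0):
    """Leading-term count for a generic homogeneous method applied to (6)-(7),
    after substituting n = (1+K) n0 and m = (1+K) m0."""
    return K**4 * (m0 * n0**3 + m0**2 * n0**2) + K**3 * m0**3
```

For K = 1000 and m 0 = n 0 = 10, the structured count is about 10⁹ while the generic count exceeds 10¹⁶, and the ratio grows rapidly with K.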
If problem (6)–(7) has a solution, then the KSH method is globally convergent [6]. The algorithm finds an optimal solution, or determines that the primal-dual pair has no solution of norm less than a given number, in at most O(n^{1/2} L) iterations. Here, n is the size of the problem and L is the logarithm of the ratio of the initial error to the tolerance. So, by using the method in Sect. 4.2 for computing the search direction with KSH symmetrization, the complexity of Algorithm 1 in terms of the total number of arithmetic operations is \(O(K^{1.5}(m_{0}^{3} n_{0}^{0.5} + m_{0}^{2} n_{0}^{4.5}) + K^{0.5} n_{0}^{6.5})\) to find a solution or to ascertain that (8) has no solution. In comparison, the short- and long-step decomposition algorithms of Ariyawansa and Zhu [4] have complexities of O(K^{1.5}) and O(K^{2}), respectively, in terms of the total number of arithmetic operations.
We note that the efficient computation of the Schur complement matrix M k in (12) is crucial, as this is the most expensive step in each iteration: typically about 80 % of the total CPU time is spent there when the algorithm in [7] is used. However, in each iteration the matrices M k can be computed independently of one another, so distributed processing may be used to achieve substantial reductions in computation time.
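Since each M k depends only on the data and iterates of scenario k, the K blocks can be formed concurrently. A minimal sketch of this idea (dense stand-in blocks; all names are ours, and G k plays the role of the scenario-dependent operator in svec coordinates):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def schur_block(args):
    """Form one stand-in Schur complement block M_k = W_k G_k W_k^T,
    independently of all other scenarios."""
    Wk, Gk = args
    return Wk @ Gk @ Wk.T

def all_schur_blocks(Ws, Gs, workers=4):
    """Compute all K blocks concurrently with a thread pool; numpy releases
    the GIL inside the matrix products, so threads can overlap."""
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(schur_block, zip(Ws, Gs)))
```

In a genuine distributed implementation each worker would also hold only its own scenario data, which is what makes the decomposition attractive for large K.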
6 Concluding Remarks
In this paper, we have proposed homogeneous self-dual algorithms for stochastic semidefinite programs whose complexity is comparable to that of the algorithms in [4]. The efficient method we developed for calculating the search directions can take advantage of parallel and distributed processing.
References
Kall, P., Wallace, S.: Stochastic Programming. Wiley, New York (1994)
Todd, M.J.: Semidefinite optimization. Acta Numer. 10, 515–560 (2001)
Ariyawansa, K.A., Zhu, Y.: Stochastic semidefinite programming: a new paradigm for stochastic optimization. 4OR 4(3), 239–253 (2006). An earlier version of this paper appeared as Technical Report 2004-10 of the Department of Mathematics, Washington State University, Pullman, WA 99164-3113, in October 2004
Ariyawansa, K.A., Zhu, Y.: A class of polynomial volumetric barrier decomposition algorithms for stochastic semidefinite programming. Math. Comput. 80, 1639–1661 (2011)
Todd, M.J., Toh, K.C., Tütüncü, R.H.: On the Nesterov-Todd direction in semidefinite programming. SIAM J. Optim. 8, 769–796 (1998)
Potra, F., Sheng, R.: On homogeneous interior-point algorithms for semidefinite programming. Optim. Methods Softw. 9, 161–184 (1998)
Toh, K.C., Todd, M.J., Tütüncü, R.H.: SDPT3—a MATLAB software package for semidefinite programming. Optim. Methods Softw. 11/12, 545–581 (1999)
Acknowledgements
The work of S.J. was performed while he was visiting Washington State University. Research supported in part by the Chinese National Foundation under Grants No. 51139005 and 51179147. K.A.A. research supported in part by the US Army Research Office under Grant DAAD 19-00-1-0465 and by Award W11NF-08-1-0530. Y.Z. research supported in part by ASU West MGIA Grant 2007.
Jin, S., Ariyawansa, K.A. & Zhu, Y. Homogeneous Self-dual Algorithms for Stochastic Semidefinite Programming. J Optim Theory Appl 155, 1073–1083 (2012). https://doi.org/10.1007/s10957-012-0110-x