1 Introduction

While linear programs have enjoyed wide applicability, they are based on deterministic data. Stochastic linear programs (SLPs) [1] were defined to handle uncertainty in the data defining linear programs, dramatically enlarging their range of applicability. Deterministic semidefinite programs (DSDPs) have been the focus of intense research during the past 20 years, especially in the context of interior point methods for optimization [2]. DSDPs generalize linear programs and have a wide variety of applications, many of which lie beyond the reach of linear programs. It is therefore natural to seek a generalization of DSDPs that handles uncertainty in data in the same way that SLPs generalize linear programs. In [3], Ariyawansa and Zhu have defined such a class of problems, termed stochastic semidefinite programs (SSDPs).

Ariyawansa and Zhu [4] have presented a class of volumetric barrier decomposition algorithms for SSDPs and proved polynomial complexity of certain members of the class. Their derivation is for the case where the event space of the random variables in the SSDP is finite with K realizations. Another important feature of the algorithms in [4] is that the most expensive part of the algorithm naturally separates into K subproblems, which allows efficient parallel implementations.

Decomposition can be viewed as a way of exploiting the special structure in SSDPs. In this paper, we propose homogeneous self-dual algorithms for SSDPs (with finite event spaces of their random variables) as an alternative to decomposition for handling the special structure. We show that the complexity of these algorithms is similar to those in [4], and that the most expensive part of the algorithms separates into K independent subproblems.

2 Notation and Terminology

The notation and terminology we use in the rest of this paper follow that in [2]. We let \(\mathbb{R}^{m \times n}\) and \(\mathbb{R}^{n \vee n}\) denote the vector spaces of real \(m \times n\) matrices and real symmetric \(n \times n\) matrices, respectively. We write \(X \succeq 0\) (\(X \succ 0\)) to mean that X is positive semidefinite (positive definite), and we use \(U \succeq V\) or \(V \preceq U\) to mean that \(U - V \succeq 0\). For \(U, V \in \mathbb{R}^{m \times n}\) we write \(U \bullet V := \operatorname{trace}(U^{\mathsf{T}} V)\) to denote the Frobenius inner product of U and V.

Given \(A_i \in \mathbb{R}^{n \vee n}\) for \(i = 1, 2, \ldots, m\), we define the operator \(\mathcal{A}: \mathbb{R}^{n \vee n}\to\mathbb{R}^{m}\) by

$$ \mathcal{A}X := [ A_1\bullet X, A_2\bullet X, \ldots, A_m\bullet X ]^{\mathsf{T}}, $$
(1)

for \(X \in \mathbb{R}^{n \vee n}\). The adjoint operator of \(\mathcal{A}\) with respect to the standard inner products in \(\mathbb{R}^{n \vee n}\) and \(\mathbb{R}^{m}\) is the operator \(\mathcal {A}^{\ast} :\mathbb{R}^{m} \to\mathbb{R}^{n \vee n}\) defined by \(\mathcal{A}^{\ast}y:=\sum_{i=1}^{m}y_{i} A_{i}\), for \(y \in \mathbb{R}^{m}\).
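For concreteness (an illustration with hypothetical random data, not from the paper), the following minimal NumPy sketch implements \(\mathcal{A}\) and \(\mathcal{A}^{\ast}\) and checks the adjoint identity \((\mathcal{A}X)^{\mathsf{T}}y = X \bullet (\mathcal{A}^{\ast}y)\):

```python
import numpy as np

def frob(U, V):
    """Frobenius inner product U • V := trace(U^T V)."""
    return float(np.trace(U.T @ V))

def A_op(As, X):
    """A X := [A_1 • X, ..., A_m • X]^T."""
    return np.array([frob(Ai, X) for Ai in As])

def A_adj(As, y):
    """A* y := sum_i y_i A_i."""
    return sum(yi * Ai for yi, Ai in zip(y, As))

def rand_sym(rng, n):
    M = rng.standard_normal((n, n))
    return M + M.T

rng = np.random.default_rng(0)
n, m = 4, 3
As = [rand_sym(rng, n) for _ in range(m)]   # hypothetical symmetric data matrices A_i
X = rand_sym(rng, n)
y = rng.standard_normal(m)

# Adjoint identity: <A X, y> = X • (A* y).
assert np.isclose(A_op(As, X) @ y, frob(X, A_adj(As, y)))
```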

The symmetrized Kronecker product \(P \circledast Q\), where \(P, Q \in \mathbb{R}^{n \times n}\), is the operator from \(\mathbb{R}^{n \vee n}\) to itself defined by \((P\circledast Q)U:=\frac{1}{2}(PUQ^{\mathsf{T}}+QUP^{\mathsf{T}})\), for \(U \in \mathbb{R}^{n \vee n}\).
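As an illustration (random data, not from the paper), a minimal NumPy sketch of \(P \circledast Q\) acting on a symmetric matrix, checking that the result is again symmetric and that the operator is symmetric in P and Q:

```python
import numpy as np

def skron(P, Q, U):
    """(P ⊛ Q)U := (P U Q^T + Q U P^T) / 2."""
    return 0.5 * (P @ U @ Q.T + Q @ U @ P.T)

rng = np.random.default_rng(1)
n = 3
P = rng.standard_normal((n, n))
Q = rng.standard_normal((n, n))
U = rng.standard_normal((n, n))
U = U + U.T                              # symmetric input

V = skron(P, Q, U)
assert np.allclose(V, V.T)               # output is again symmetric
assert np.allclose(V, skron(Q, P, U))    # ⊛ is symmetric in P and Q
```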

Finally, we recall that the operator \(\operatorname{\mathit{svec}}:\mathbb{R}^{n \vee n}\to\mathbb{R}^{\bar{n}}\) (see [5]) where \(\bar{n} := (1/2)n(n+1)\) is defined by \(\operatorname{\mathit{svec}}(U):=[ U_{11},\sqrt{2}U_{21}, \ldots,\sqrt{2}U_{n1}, U_{22},\sqrt{2}U_{32}, \ldots,\allowbreak \sqrt{2}U_{n2}, \ldots, U_{nn} ]^{\mathsf{T}}\) for \(U \in \mathbb{R}^{n \vee n}\).
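The identity \(U\bullet V = \operatorname{\mathit{svec}}(U)^{\mathsf{T}}\operatorname{\mathit{svec}}(V)\) (see [5]) is used again in Sect. 5. The following minimal sketch (random symmetric data, illustrative only) implements \(\operatorname{\mathit{svec}}\) as defined above and checks that identity:

```python
import numpy as np

def svec(U):
    """Stack the lower triangle column by column, scaling off-diagonals by sqrt(2)."""
    n = U.shape[0]
    out = []
    for j in range(n):
        out.append(U[j, j])
        out.extend(np.sqrt(2.0) * U[j + 1:, j])
    return np.array(out)

rng = np.random.default_rng(2)
n = 4
U = rng.standard_normal((n, n)); U = U + U.T
V = rng.standard_normal((n, n)); V = V + V.T

# Frobenius inner product agrees with the ordinary dot product of the svec vectors.
assert np.isclose(np.trace(U.T @ V), svec(U) @ svec(V))
```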

3 Homogeneous Self-dual Methods for DSDPs

Given data \(A_i \in \mathbb{R}^{n \vee n}\) for \(i = 1, 2, \ldots, m\), \(b \in \mathbb{R}^{m}\), and \(C \in \mathbb{R}^{n \vee n}\), a DSDP in primal standard form is defined as

$$ \begin{array}{@{}lrrr} \mbox{minimize} &C \bullet X & & \\ \mbox{subject to} & \mathcal{A} X & = & b \\ & X & \succeq & 0, \end{array} $$
(2)

where \(X \in \mathbb{R}^{n \vee n}\) is the variable. The dual of (2) is

$$ \begin{array} {lrcrcl} \mbox{maximize} &b^{\mathsf{T}} y & &\\ \mbox{subject to} &\quad \mathcal{A}^{\ast}y & + & S & = & C\\ & & & S & \succeq & 0, \end{array} $$
(3)

where \(y \in \mathbb{R}^{m}\) and \(S \in \mathbb{R}^{n \vee n}\) are the variables.
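As a toy check (data chosen here for illustration, not from the paper), the pair (2)–(3) with \(n = 2\), \(m = 1\), \(A_1 = I\), \(b = 1\), and \(C = \operatorname{diag}(1, 2)\) has optimal solutions \(X^{*} = \operatorname{diag}(1, 0)\), \(y^{*} = 1\), \(S^{*} = \operatorname{diag}(0, 1)\); the sketch below verifies feasibility and the zero duality gap:

```python
import numpy as np

# Hypothetical data: minimize C•X subject to trace(X) = 1, X ⪰ 0.
C = np.diag([1.0, 2.0])
A1 = np.eye(2)
b = np.array([1.0])

# Candidate optimal primal and dual points.
X = np.diag([1.0, 0.0])
y = np.array([1.0])
S = C - y[0] * A1                                    # dual slack S = C - A*y

assert np.isclose(np.trace(A1.T @ X), b[0])          # primal feasibility: A X = b
assert np.all(np.linalg.eigvalsh(X) >= -1e-12)       # X ⪰ 0
assert np.all(np.linalg.eigvalsh(S) >= -1e-12)       # S ⪰ 0
assert np.isclose(np.trace(C.T @ X), b @ y)          # duality gap C•X - bᵀy = 0
```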

We briefly review the homogeneous interior point algorithm for DSDPs as in [6]. The homogeneous model for (2)–(3) is as follows:

$$ \everymath{\displaystyle} \begin{array}{@{}l} \begin{array} {rrrrrcc} \mathcal{A}X & & & -b\tau & & = &0 \\[1.5mm] &-\mathcal{A}^{*}y & -S & +\tau C & & = & 0 \\[1.5mm] -C \bullet X & +b^{\mathsf{T}} y & & & -\kappa & = & 0 \end{array} \\ \noalign{\vspace{1mm}} X\succeq0, \quad S\succeq0, \quad \tau\geq0,\quad \kappa\geq0. \end{array} $$
(4)

It is easy to show from (4) that \(X \bullet S + \tau\kappa = 0\). The main step at each iteration of the homogeneous interior point algorithm (shown below as Algorithm 1) is the computation of the search direction \((\Delta X, \Delta y, \Delta S, \Delta\tau, \Delta\kappa)\) from the symmetrized Newton equations with respect to an invertible matrix P (which is chosen as a function of (X,y,S)) defined by:

$$ \everymath{\displaystyle} \begin{array} {rrrrrcl} \mathcal{A}\Delta X & & & -b\Delta \tau & & = & \eta r_p \\[1.5mm] & -\mathcal{A}^{*}\Delta y & -\Delta S & +\Delta \tau C & & = & \eta R_d \\[1.5mm] -C \bullet \Delta X & +b^{\mathsf{T}} \Delta y & & & -\Delta \kappa & = & \eta r_g \\[1.5mm] & & & \kappa \Delta \tau & +\tau \Delta \kappa & = & \gamma \mu-\tau\kappa \\[1.5mm] \mathcal{E}\Delta X & & +\mathcal{F}\Delta S & & & = & \gamma\mu I-H_P(XS), \end{array} $$
(5)

where \(r_{p} := b\tau- \mathcal{A}X\), \(R_{d} := \mathcal{A}^{*}y + S -\tau C\), \(r_{g} := C \bullet X - b^{\mathsf{T}}y + \kappa\), \(\mu := (X \bullet S + \tau\kappa)/(n+1)\), η and γ are two parameters, \(H_{P}: \mathbb{R}^{n \vee n}\longrightarrow\mathbb{R}^{n \vee n}\) is the symmetrization operator defined by \(H_{P}(U) := \frac{1}{2}(PUP^{-1} + (P^{-1})^{\mathsf{T}}U^{\mathsf {T}}P^{\mathsf{T}})\), and \(\mathcal{E}: \mathbb{R}^{n \vee n}\longrightarrow\mathbb{R}^{n \vee n}\) and \(\mathcal{F}: \mathbb{R}^{n \vee n}\longrightarrow\mathbb{R}^{n \vee n}\) are the linear operators defined by \(\mathcal{E} := P\circledast(P^{-1})^{\mathsf{T}}S\) and \(\mathcal{F} := PX\circledast(P^{-1})^{\mathsf{T}}\), respectively.
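From the definitions above one can check that \(\mathcal{E}\Delta X + \mathcal{F}\Delta S = H_{P}(\Delta X\,S + X\,\Delta S)\), i.e. the left-hand side of the last equation of (5) is the linearization of \(H_{P}(XS)\). The sketch below (random symmetric data, an illustration rather than part of the algorithm) verifies this identity numerically:

```python
import numpy as np

def skron(P, Q, U):
    """(P ⊛ Q)U := (P U Q^T + Q U P^T) / 2."""
    return 0.5 * (P @ U @ Q.T + Q @ U @ P.T)

def H_P(P, U):
    """H_P(U) := (P U P^{-1} + P^{-T} U^T P^T) / 2."""
    Pinv = np.linalg.inv(P)
    return 0.5 * (P @ U @ Pinv + Pinv.T @ U.T @ P.T)

def rand_sym(rng, n):
    M = rng.standard_normal((n, n))
    return M + M.T

rng = np.random.default_rng(3)
n = 4
X, S = rand_sym(rng, n), rand_sym(rng, n)     # symmetric iterates
dX, dS = rand_sym(rng, n), rand_sym(rng, n)   # symmetric directions
P = rng.standard_normal((n, n))               # any invertible scaling matrix
Pinv = np.linalg.inv(P)

lhs = skron(P, Pinv.T @ S, dX) + skron(P @ X, Pinv.T, dS)   # E ΔX + F ΔS
rhs = H_P(P, dX @ S + X @ dS)                               # H_P(ΔX S + X ΔS)
assert np.allclose(lhs, rhs)
```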

Algorithm 1

Generic homogeneous self-dual algorithm for solving (2)–(3)

Currently, the most common choices of symmetrization for search directions in practice are as follows [7]:

  • Helmberg–Rendl–Vanderbei–Wolkowicz/Kojima–Shindoh–Hara/Monteiro (HKM) direction, corresponding to \(P := S^{1/2}\). In this case, we have that \(\mathcal{E} = \mathcal{I}\) and \(\mathcal{F} = X \circledast S^{-1}\).

  • Kojima, Shindoh, and Hara (KSH) direction (rediscovered by Monteiro), corresponding to \(P := X^{-1/2}\). In this case, we have that \(\mathcal{E} = S\circledast X^{-1}\) and \(\mathcal{F} = \mathcal{I}\).

  • Nesterov–Todd (NT) direction, corresponding to \(P := H^{-1/2}\), where H is the unique symmetric positive definite matrix satisfying \(HSH = X\); it can be computed as \(H = X^{1/2}(X^{1/2}SX^{1/2})^{-1/2}X^{1/2}\). In this case, we have \(\mathcal{E} = \mathcal{I}\) and \(\mathcal{F} = H \circledast H\).
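A brief numerical sketch (random positive definite data, for illustration only) that the scaling matrix H given by the formula above indeed satisfies \(HSH = X\):

```python
import numpy as np

def sqrtm_spd(A):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(w)) @ V.T

def rand_spd(rng, n):
    M = rng.standard_normal((n, n))
    return M @ M.T + np.eye(n)

rng = np.random.default_rng(4)
n = 4
X, S = rand_spd(rng, n), rand_spd(rng, n)

Xh = sqrtm_spd(X)                                  # X^{1/2}
inner = sqrtm_spd(Xh @ S @ Xh)                     # (X^{1/2} S X^{1/2})^{1/2}
H = Xh @ np.linalg.inv(inner) @ Xh                 # X^{1/2}(X^{1/2} S X^{1/2})^{-1/2} X^{1/2}

assert np.allclose(H, H.T)                         # H is symmetric
assert np.all(np.linalg.eigvalsh(H) > 0)           # and positive definite
assert np.allclose(H @ S @ H, X)                   # NT scaling: H S H = X
```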

Lemma 3.1

Suppose that X≻0, S≻0, and \(A_1, A_2, \ldots, A_m\) are linearly independent. Then for each of the above three choices of P, \(\mathcal{E}^{-1}\) and \(\mathcal{F}^{-1}\) exist, and \(\mathcal{E}^{-1}\mathcal{F}\) and \(\mathcal{F}^{-1}\mathcal{E}\) are positive definite and self-adjoint.

Proof

See [2]. □

We state the generic homogeneous self-dual algorithm, Algorithm 1, as given in [6].

4 Homogeneous Self-dual Algorithms for SSDPs

4.1 SSDPs with Finite Event Space

The general definition of an SSDP in primal standard form is stated in [3]. We consider the special case in which the event space is finite with K realizations. Let deterministic data \(W_{0}^{(i)} \in{\mathbb{R}}^{n_{0} \vee n_{0}}\) for \(i = 1, 2, \ldots, m_{0}\), \(h_{0} \in\mathbb{R}^{m_{0}}\), and \(C_{0} \in{\mathbb{R}}^{n_{0} \vee n_{0}}\), and realizations of random data \(B^{(i)}_{k}\in{\mathbb{R}}^{n_{0} \vee n_{0}}\), \(W^{(i)}_{k} \in\mathbb{R}^{n_{1} \vee n_{1}}\) for \(i = 1, 2, \ldots, m_{1}\), \(h_{k} \in\mathbb{R}^{m_{1}}\), and \(C_{k} \in\mathbb{R}^{n_{1} \vee n_{1}}\) for \(k = 1, 2, \ldots, K\), be given. Then an SSDP with finite event space in primal standard form is the problem

$$ \everymath{\displaystyle} \begin{array}{lrcrcccrcc} {\mbox{minimize}}& C_0 \bullet X_0 & + & C_1 \bullet X_1 & + & \cdots& + & C_K \bullet X_K \\[2pt] {\mbox{subject to}} &\quad \mathcal{W}_0 X_0 & & & & & & & = & h_0 \\[2pt] & \mathcal{B}_1 X_0 & + & \mathcal{W}_1 X_1 & & & & & = & h_1 \\[2pt] & \vdots& & & & \ddots& & & & \vdots \\[2pt] & \mathcal{B}_K X_0 & & & & & + & \mathcal{W}_K X_K & = & h_K \\[2pt] & X_0, & & X_1, & & \cdots& & X_K & \succeq& 0, \end{array} $$
(6)

where \(X_{0} \in{\mathbb{R}}^{n_{0} \vee n_{0}}\) and \(X_{k} \in{\mathbb{R}}^{n_{1} \vee n_{1}} \) for k=1,2,…,K are the first-stage and second-stage variables, respectively.

Problem (6) is a DSDP in primal standard form with block diagonal structure. Algorithms that exploit this special structure are especially important when K is large as is the case in typical applications.
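To make this dual block-angular structure concrete, the following sketch (tiny hypothetical dimensions, random symmetric data, and the \(\operatorname{\mathit{svec}}\) map from Sect. 2) assembles the coefficient matrix of the constraints of (6) in svec coordinates and checks that each second-stage row block touches only the \(X_0\) columns and its own \(X_k\) columns:

```python
import numpy as np

def svec(U):
    n = U.shape[0]
    out = []
    for j in range(n):
        out.append(U[j, j])
        out.extend(np.sqrt(2.0) * U[j + 1:, j])
    return np.array(out)

def op_matrix(mats):
    """Rows are svec(A_i): the matrix of U -> [A_1•U, ..., A_m•U]^T in svec coordinates."""
    return np.vstack([svec(A) for A in mats])

def rand_sym(rng, n):
    M = rng.standard_normal((n, n))
    return M + M.T

rng = np.random.default_rng(5)
n0 = n1 = 2
m0 = m1 = 2
K = 3
nb0, nb1 = n0 * (n0 + 1) // 2, n1 * (n1 + 1) // 2

W0 = op_matrix([rand_sym(rng, n0) for _ in range(m0)])
Bs = [op_matrix([rand_sym(rng, n0) for _ in range(m1)]) for _ in range(K)]
Ws = [op_matrix([rand_sym(rng, n1) for _ in range(m1)]) for _ in range(K)]

# Assemble the constraint matrix of (6): a first-stage column block plus a block-diagonal tail.
rows = [np.hstack([W0] + [np.zeros((m0, nb1))] * K)]
for k in range(K):
    row = [Bs[k]] + [np.zeros((m1, nb1))] * K
    row[1 + k] = Ws[k]
    rows.append(np.hstack(row))
M = np.vstack(rows)

assert M.shape == (m0 + K * m1, nb0 + K * nb1)
for k in range(K):  # second-stage block k touches only the X_0 and X_k columns
    blk = M[m0 + k * m1: m0 + (k + 1) * m1, :]
    nz = set(np.nonzero(np.any(blk != 0, axis=0))[0])
    assert nz <= set(range(nb0)) | set(range(nb0 + k * nb1, nb0 + (k + 1) * nb1))
```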

The dual of (6) is

$$ \everymath{\displaystyle} \begin{array} {lrcrcccrcrcc} {\mbox{maximize}}& h_0^{\mathsf{T}}y_0 & + & h_1^{\mathsf{T}}y_1 & + & \cdots& + & h_K^{\mathsf{T}}y_K \\[2pt] {\mbox{subject to}} &\quad \mathcal{W}^{*}_0 y_0 & + & \mathcal{B}^{*}_1 y_1 & + & \cdots& + & \mathcal{B}^{*}_K y_K & + & S_0 & = & C_0 \\[2pt] & & & \mathcal{W}^*_1 y_1 & & & & & + & S_1 & = & C_1 \\[2pt] & & & & & \ddots&&& & & & \vdots \\[2pt] & & & & & & & \mathcal{W}^*_K y_K & + & S_K & = & C_K \\[2pt] & S_0, & &S_1, & & \cdots& & S_K & & & \succeq& 0, \end{array} $$
(7)

where \(y_{0} \in\mathbb{R}^{m_{0}}\), \(y_{k} \in\mathbb{R}^{m_{1}}\), \(S_{0} \in{\mathbb{R}}^{n_{0} \vee n_{0}}\), and \(S_{k} \in\mathbb{R}^{n_{1} \vee n_{1}}\) for k=1,2,…,K are the variables.

Now Algorithm 1 can be applied to problem (6)–(7). In practice, K is very large. Then problem (6)–(7) is large, and in particular, the computation of the search direction in Algorithm 1 (i.e., the solution of the system (5)) is very expensive. As we shall see in the next section, this computational work can be reduced significantly by exploiting the special structure of problem (6)–(7). In addition, the method we describe in the next section for computing the search direction decomposes into K smaller computations that can be performed in parallel.

4.2 An Efficient Method for Computing Search Directions

We now describe a method for computing the search direction in Algorithm 1 that exploits the special structure in (6), (7). The homogeneous model (4) for problem (6)–(7) is

$$ \everymath{\displaystyle} \begin{array} {@{}l} \mathcal{W}_0 X_0 - h_0 \tau = 0 \\[2mm] \mathcal{B}_k X_0 +\mathcal{W}_k X_k - h_k \tau = 0 \\[1mm] -\mathcal{W}_0^{*} y_0 - \sum_{k=1}^{K} \mathcal{B}_k^{*} y_k + \tau C_0 - S_0 = 0 \\[5mm] -\mathcal{W}_k^{*} y_k + \tau C_k - S_k = 0 \\[1mm] \sum_{k=0}^{K} h_k^{\mathsf{T}}y_k - \sum_{k=0}^{K} C_k \bullet X_k - \kappa = 0 \\[5mm] X_k\succeq0,\quad S_k\succeq0,\quad k = 0, 1, 2, \ldots, K,\quad \tau\geq0, \quad \kappa\geq0 \end{array} $$
(8)

where \(\mathcal{W}_{k}^{*}\) and \(\mathcal{B}_{k}^{*}\) are adjoint operators with appropriate dimensions for k=1,2,…,K.

The search direction system corresponding to (8) can be derived via (5) as

$$ \everymath{\displaystyle} \begin{array} {@{}l} \mathcal{W}_0 \Delta X_0 - h_0 \Delta\tau = \eta r_{p_0} \\[2mm] \mathcal{B}_k \Delta X_0 + \mathcal{W}_k \Delta X_k - h_k \Delta\tau = \eta r_{p_k} \\[1mm] -\mathcal{W}_0^{*} \Delta y_0 - \sum_{k=1}^{K}\mathcal{B}_k^{*} \Delta y_k + \Delta\tau C_0 - \Delta S_0 = \eta R_{d_0} \\[5mm] -\mathcal{W}_k^{*} \Delta y_k + \Delta\tau C_k - \Delta S_k = \eta R_{d_k} \\[2mm] \mathcal{E}_0 \Delta X_0 + \mathcal{F}_0 \Delta S_0 = \gamma\mu I_0 - H_{P_0} (X_0S_0) \\[2mm] \mathcal{E}_k \Delta X_k + \mathcal{F}_k \Delta S_k = \gamma\mu I_k - H_{P_k}(X_kS_k) \\[2mm] \kappa\Delta\tau+ \tau\Delta\kappa = \gamma\mu- \tau\kappa \\[1mm] \sum_{k=0}^{K}h_k^{\mathsf{T}}\Delta y_k - \sum_{k=0}^{K} C_k \bullet\Delta X_k - \Delta\kappa = \eta r_g, \end{array} $$
(9)

where \(r_{p_{0}} := h_{0}\tau- \mathcal{W}_{0} X_{0}\); \(r_{p_{k}} := h_{k}\tau- \mathcal{B}_{k} X_{0} - \mathcal{W}_{k} X_{k}\); \(R_{d_{0}} : = \mathcal{W}_{0}^{*} y_{0} + \sum_{k=1}^{K} \mathcal {B}_{k}^{*} y_{k} + S_{0} - \tau C_{0}\); \(R_{d_{k}} := \mathcal{W}_{k}^{*} y_{k} + S_{k} - \tau C_{k}\); \(r_{g} := \kappa- \sum_{k=0}^{K} h_{k}^{\mathsf{T}}y_{k} + \sum_{k=0}^{K}C_{k} \bullet X_{k}\); \(\mu:= (\sum_{k = 0}^{K} X_{k} \bullet S_{k} + \tau\kappa)/(n_{0} +Kn_{1} + 1)\); η and γ are two parameters; and \(\mathcal{E}_{k}\), \(\mathcal{F}_{k}\), and \(H_{P_{k}}\) are linear operators that depend only on \(X_{k}\) and \(S_{k}\), for \(k = 1, 2, \ldots, K\).

Now we present the crux of our method for finding the search direction as a solution of (9). By the sixth equation of (9) and the fact that \(\mathcal{F}_{k}^{-1}\) exists for k=1,2,…,K, we have \(\Delta S_{k} = -\mathcal{F}_{k}^{-1} \mathcal{E}_{k} \Delta X_{k} + \mathcal {F}_{k}^{-1} (\gamma\mu I_{k} - H_{P_{k}}(X_{k}S_{k}))\) for k=1,2,…,K. Substituting this into the fourth equation of (9), we get \(-\mathcal{W}_{k}^{*} \Delta y_{k} + \Delta\tau C_{k} + \mathcal{F}_{k}^{-1} \mathcal{E}_{k} \Delta X_{k} = \eta R_{d_{k}} + \mathcal{F}_{k}^{-1} (\gamma \mu I_{k} - H_{P_{k}}(X_{k}S_{k}))\) for k=1,2,…,K. By this equation and the fact that \((\mathcal{F}_{k}^{-1} \mathcal{E}_{k})^{-1} = \mathcal{E}_{k}^{-1} \mathcal{F}_{k}\), since \(\mathcal{E}_{k}^{-1}\) exists for k=1,2,…,K, we can express \(\Delta X_{k}\) as

$$ \Delta X_k = \mathcal{E}_k^{-1} \mathcal{F}_k \mathcal{W}_k^{*} \Delta y_k - \mathcal{E}_k^{-1} \mathcal{F}_k\bigl(\Delta\tau C_k - \eta R_{d_k}\bigr) + \mathcal{E}_k^{-1}\bigl(\gamma\mu I_k - H_{P_k}(X_kS_k)\bigr). $$
(10)

Substituting (10) into the second equation of (9), we have \(\mathcal{B}_{k}\Delta X_{0} + \mathcal{W}_{k} (\mathcal{E}_{k}^{-1} \mathcal{F}_{k} \mathcal{W}_{k}^{*} \Delta y_{k} - \mathcal{E}_{k}^{-1} \mathcal {F}_{k}(\Delta\tau C_{k} - \eta R_{d_{k}}) + \mathcal{E}_{k}^{-1}(\gamma\mu I_{k} - H_{P_{k}}(X_{k}S_{k}))) - h_{k} \Delta\tau= \eta r_{p_{k}}\) for k=1,2,…,K. Thus,

$$ \Delta y_k = - M_k^{-1} \mathcal{B}_k \Delta X_0 + q_k \Delta\tau+ \nu_k, $$
(11)

where

$$ \everymath{\displaystyle} \begin{array}{rcl} M_k &:=& \mathcal{W}_k \mathcal{E}_k^{-1} \mathcal{F}_k \mathcal{W}_k^{*}, \qquad q_k := M_k^{-1}\bigl(\mathcal{W}_k \mathcal{E}_k^{-1} \mathcal{F}_k C_k + h_k\bigr), \\[2mm] \nu_k &:=& M_k^{-1}\bigl(\eta r_{p_k} - \eta\mathcal{W}_k \mathcal{E}_k^{-1} \mathcal{F}_k R_{d_k} - \mathcal{W}_k \mathcal{E}_k^{-1}\bigl(\gamma\mu I_k - H_{P_k}(X_kS_k)\bigr)\bigr), \end{array} $$
(12)

for k=1,2,…,K. By the fifth equation of (9) and the fact that \(\mathcal{F}_{0}^{-1}\) exists, we get \(\Delta S_{0} = -\mathcal{F}_{0}^{-1} \mathcal{E}_{0} \Delta X_{0} + \mathcal{F}_{0}^{-1}(\gamma\mu I_{0} - H_{P_{0}}(X_{0}S_{0}))\). We use this equality and (11) in the third equation of (9) and get \(-\mathcal{W}_{0}^{*}\Delta y_{0} - \sum_{k=1}^{K}\mathcal{B}_{k}^{*}(-M_{k}^{-1} \mathcal{B}_{k} \Delta X_{0} + q_{k} \Delta\tau+ \nu_{k}) + \Delta\tau C_{0} + \mathcal{F}_{0}^{-1} \mathcal{E}_{0} \Delta X_{0}- \mathcal{F}_{0}^{-1}(\gamma\mu I_{0} - H_{P_{0}}(X_{0} S_{0})) = \eta R_{d_{0}}\). From this relation, we have

$$ \Delta X_0 = \mathcal{M}_0^{-1} \mathcal{W}_0^{*} \Delta y_0 - T_0 \Delta\tau + U_0, $$
(13)

where

$$ \everymath{\displaystyle} \begin{array}{@{}rcl} \mathcal{M}_0 & := & \mathcal{F}_0^{-1} \mathcal{E}_0 + \sum_{k=1}^{K} \mathcal{B}_k^{*} M_k^{-1} \mathcal{B}_k, \qquad T_0 := \mathcal{M}_0^{-1} \Biggl(C_0 - \sum_{k=1}^{K} \mathcal{B}_k^{*} q_k\Biggr), \\[4mm] U_0 & := & \mathcal{M}_0^{-1}\Biggl(\sum _{k=1}^{K}\mathcal{B}_k^{*} \nu_k + \eta R_{d_0} + \mathcal{F}_0^{-1} \bigl(\gamma\mu I_0 - H_{P_0}(X_0 S_0) \bigr)\Biggr). \end{array} $$
(14)

Now, substituting (13) into the first equation of (9), we have \(\mathcal{W}_{0} (\mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*} \Delta y_{0} - T_{0} \Delta\tau+ U_{0}) - h_{0} \Delta\tau= \eta r_{p_{0}}\). Because \(\mathcal{W}_{0} \mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*}\) is nonsingular (a fact to be discussed in Sect. 5), this equation leads us to

$$ \Delta y_0 = \alpha_0 \Delta\tau + \beta_0, $$
(15)

where

$$ \everymath{\displaystyle} \begin{array}{@{}l} \alpha_0 := \bigl(\mathcal{W}_0 \mathcal{M}_0^{-1} \mathcal{W}_0^{*} \bigr)^{-1}(\mathcal{W}_0 T_0 + h_0), \\[2mm] \beta_0 := \bigl(\mathcal{W}_0 \mathcal{M}_0^{-1} \mathcal{W}_0^{*} \bigr)^{-1}(\eta r_{p_0} - \mathcal{W}_0 U_0). \end{array} $$
(16)

Now we substitute backward. First, we substitute (15) in (13) to get

$$ \Delta X_0 = \mathcal{M}_0^{-1} \mathcal{W}_0^{*}(\alpha_0 \Delta\tau+ \beta_0) - T_0 \Delta\tau+ U_0 = \varPsi_0 \Delta\tau+ \varPhi_0, $$
(17)

where

$$ \varPsi_0 := \mathcal{M}_0^{-1} \mathcal{W}_0^{*} \alpha_0 - T_0, \qquad \varPhi_0 := \mathcal{M}_0^{-1} \mathcal{W}_0^{*}\beta_0 + U_0. $$
(18)

Substituting (17) in (11), we obtain

$$ \Delta y_k = -\bigl(\mathcal{W}_k \mathcal{E}_k^{-1} \mathcal{F}_k \mathcal{W}_k^{*}\bigr)^{-1} \mathcal{B}_k(\varPsi_0 \Delta\tau+ \varPhi_0) + q_k \Delta\tau+ \nu_k = \alpha_k \Delta \tau+ \beta_k, $$
(19)

where

$$ \alpha_k := -\bigl(\mathcal{W}_k \mathcal{E}_k^{-1} \mathcal{F}_k \mathcal{W}_k^{*}\bigr)^{-1} \mathcal{B}_k \varPsi_0 + q_k, \qquad \beta_k := -\bigl(\mathcal{W}_k \mathcal{E}_k^{-1} \mathcal{F}_k \mathcal{W}_k^{*}\bigr)^{-1} \mathcal{B}_k \varPhi_0 + \nu_k $$
(20)

for k=1,2,…,K. Furthermore, we substitute (19) in (10) to get

$$ \Delta X_k = \mathcal{E}_k^{-1} \mathcal{F}_k \mathcal{W}_k^{*}(\alpha_k \Delta\tau + \beta_k) - \mathcal{E}_k^{-1} \mathcal{F}_k\bigl(\Delta\tau C_k - \eta R_{d_k}\bigr) + \mathcal{E}_k^{-1}\bigl(\gamma\mu I_k - H_{P_k}(X_kS_k)\bigr) = \varPsi_k \Delta\tau + \varPhi_k, $$
(21)

where

$$ \everymath{\displaystyle} \begin{array} {@{}rcl} \varPsi_k & := &\mathcal{E}_k^{-1} \mathcal{F}_k \mathcal{W}_k^{*} \alpha_k - \mathcal{E}_k^{-1} \mathcal{F}_k C_k, \\[2mm] \varPhi_k & := &\mathcal{E}_k^{-1} \mathcal{F}_k \mathcal{W}_k^{*} \beta_k + \mathcal{E}_k^{-1} \mathcal{F}_k \eta R_{d_k} + \mathcal {E}_k^{-1} \bigl(\gamma\mu I_k - H_{P_k}(X_k S_k) \bigr), \end{array} $$
(22)

for k=1,2,…,K. Now we substitute (15), (17), (19), and (21) in the last equation of (9) and use the seventh equation of (9) to eliminate Δκ; this yields

$$\sum_{k=0}^{K}h_k^{\mathsf{T}}( \alpha_k \Delta\tau+ \beta_k) - \sum _{k=0}^{K} C_k \bullet( \varPsi_k \Delta\tau+ \varPhi_k) - \frac{1}{\tau }(-\kappa \Delta\tau+ \gamma\mu- \tau\kappa) = \eta r_g. $$

Finally, Δτ is given by

$$ \begin{array} {lll} \Delta\tau& = &\displaystyle\frac{\tau\eta r_g + \tau\sum _{k=0}^{K}(C_k \bullet\varPhi_k - h_k^{\mathsf{T}}\beta_k) + (\gamma\mu- \tau \kappa)}{\tau\sum_{k=0}^{K}(h_k^{\mathsf{T}}\alpha_k - C_k \bullet \varPsi_k) + \kappa}. \end{array} $$
(23)

Given Δτ from (23), we recover the remaining directions by back-substitution through (15), (17), (19), (21), and the seventh equation of (9), and move to the next iteration of Algorithm 1.
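The elimination order above (second-stage blocks first, then the first-stage block, then the scalar Δτ) is the familiar Schur complement pattern for arrow-structured systems. The following sketch is a generic linear-algebra analogy with random blocks, not the actual system (9): it mirrors the K independent subproblem eliminations followed by one small first-stage solve, and checks the result against a dense solve.

```python
import numpy as np

def rand_spd(rng, n):
    M = rng.standard_normal((n, n))
    return M @ M.T + np.eye(n)

rng = np.random.default_rng(6)
K, n1, n0 = 4, 3, 2

# Arrow-structured model system:
#    D_k x_k + B_k x_0          = r_k,   k = 1, ..., K
#   -sum_k B_k^T x_k + D_0 x_0  = r_0.
Ds = [rand_spd(rng, n1) for _ in range(K)]
Bs = [rng.standard_normal((n1, n0)) for _ in range(K)]
D0 = rand_spd(rng, n0)
rs = [rng.standard_normal(n1) for _ in range(K)]
r0 = rng.standard_normal(n0)

# K independent eliminations (parallelizable), then one small first-stage solve.
Schur = D0 + sum(B.T @ np.linalg.solve(D, B) for D, B in zip(Ds, Bs))
rhs = r0 + sum(B.T @ np.linalg.solve(D, r) for D, B, r in zip(Ds, Bs, rs))
x0 = np.linalg.solve(Schur, rhs)
xs = [np.linalg.solve(D, r - B @ x0) for D, B, r in zip(Ds, Bs, rs)]

# Verify against a dense solve of the full (K*n1 + n0)-dimensional system.
full = np.zeros((K * n1 + n0, K * n1 + n0))
for k in range(K):
    full[k * n1:(k + 1) * n1, k * n1:(k + 1) * n1] = Ds[k]
    full[k * n1:(k + 1) * n1, K * n1:] = Bs[k]
    full[K * n1:, k * n1:(k + 1) * n1] = -Bs[k].T
full[K * n1:, K * n1:] = D0
sol = np.linalg.solve(full, np.concatenate(rs + [r0]))
assert np.allclose(sol, np.concatenate(xs + [x0]))
```

Loosely, the role of the blocks \(D_k\) is played in (9) by the operators \(M_k = \mathcal{W}_k \mathcal{E}_k^{-1}\mathcal{F}_k\mathcal{W}_k^{*}\), and the K independent eliminations correspond to the per-realization computations in (12).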

5 Complexity Analysis

In this section, we first show that under reasonable conditions the operations described above are valid. Then we estimate the computational complexity of Algorithm 1 with the method described in Sect. 4.2 applied on problem (6)–(7). Finally, we compare that complexity to the complexity of Algorithm 1 applied on problem (6)–(7) treating it as a generic primal-dual DSDP pair.

5.1 Validation of Computations

We assume that \(W_{0}^{(1)}, W_{0}^{(2)},\ldots, W_{0}^{(m_{0})}\) and \(W_{k}^{(1)}, W_{k}^{(2)}, \ldots, W_{k}^{(m_{1})}\) for k=1,2,…,K are linearly independent. Then \(M_{k}\) for k=1,2,…,K in (12) are nonsingular and positive definite by Lemma 3.1.

Now we will show that \(\mathcal{M}_{0}\) in (14) is nonsingular and that \(\mathcal{W}_{0} \mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*}\) is also nonsingular.

Lemma 5.1

Suppose that \(W_{0}^{(1)}, W_{0}^{(2)},\ldots, W_{0}^{(m_{0})}\) and \(W_{k}^{(1)}, W_{k}^{(2)}, \ldots, W_{k}^{(m_{1})}\) are linearly independent for k=1,2,…,K. Then \(\mathcal{M}_{0}\) in (14) and \(\mathcal{W}_{0} \mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*}\) are positive definite.

Proof

From (14), we have \(\mathcal{M}_{0} = \mathcal{F}_{0}^{-1} \mathcal{E}_{0} + \sum_{k=1}^{K} \mathcal{B}_{k}^{*} M_{k}^{-1} \mathcal{B}_{k}\). We have that \(\mathcal{F}_{0}^{-1} \mathcal{E}_{0}\) is positive definite by Lemma 3.1, so it suffices to show that \(\mathcal{B}_{k}^{*}M_{k}^{-1} \mathcal{B}_{k}\) is positive semidefinite for k=1,2,…,K. Writing \(\mathcal{B}_{k} U = [B_{k}^{(1)} \bullet U, B_{k}^{(2)} \bullet U, \ldots, B_{k}^{(m_{1})} \bullet U ]^{\mathsf{T}}\), for each \(U \in\mathbb{R}^{n_{0} \vee n_{0}}\), and \(M_{k}^{-1} = [\phi _{ij}^{(k)} ]_{m_{1} \times m_{1}}\), we have

$$\mathcal{B}_k^{*} M_k^{-1} \mathcal{B}_k U = \sum_{i} \sum_{j}\bigl(\phi_{ij}^{(k)} B_k^{(j)} \bullet U\bigr)B_k^{(i)}, $$

for k=1,2,…,K. So

$$\mathcal{B}_k^{*} M_k^{-1} \mathcal{B}_k U \bullet U = \sum_{i} \sum_{j}\bigl(\phi_{ij}^{(k)} B_k^{(j)} \bullet U\bigr)B_k^{(i)} \bullet U = (\mathcal{B}_k U)^{\mathsf{T}}M_k^{-1}( \mathcal{B}_k U) \geq0. $$

The last inequality is due to the fact that \(M_{k}^{-1}\) is positive definite. As in [2], we can then show that \(\mathcal{W}_{0} \mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*}\) is positive definite. □

5.2 Complexity Estimates

Theorem 5.1

Suppose that \(m_{1} = m_{0}\), \(n_{1} = n_{0}\), and that \(W_{k}^{(1)}, W_{k}^{(2)}, \ldots, W_{k}^{(m_{0})}\) are linearly independent for \(k = 0, 1, \ldots, K\). By utilizing the method described in Sect. 4.2 for computing the search directions in Algorithm 1, the number of arithmetic operations in each iteration of Algorithm 1 is \(O(K(m_{0}^{3} + m_{0}^{2} n_{0}^{4}) + n_{0}^{6})\).

Proof

The dominant computations of the method in Sect. 4.2 occur at (12), (14), (16), (18), (20), and (22). The corresponding numbers of arithmetic operations for these computations are listed in Table 1. In particular, the computation in (14) will be analyzed in detail. The total number of arithmetic operations is dominated by \(O(K(m_{0}^{3} + m_{0}^{2} n_{0}^{4}) + n_{0}^{6})\) for choices of P stated in Sect. 3.

Table 1 Complexity estimates for dominant steps in the method of Sect. 4.2

We now analyze the computational work in (14). Note that \(\mathcal{B}_{k}^{*}M_{k}^{-1}\mathcal{B}_{k}\) is a mapping from \({\mathbb{R}}^{n_{0} \vee n_{0}}\) to itself. We use the operator \(\operatorname{\mathit{svec}}: {\mathbb{R}}^{n_{0} \vee n_{0}} \to \mathbb{R}^{\bar{n}_{0}}\), where \(\bar{n}_{0} := \frac{1}{2}n_{0}(n_{0}+1)\), to obtain the matrix representation of \(\mathcal{B}_{k}^{*}M_{k}^{-1}\mathcal{B}_{k}\). For any \(U\in{\mathbb{R}}^{n_{0} \vee n_{0}}\),

$$\operatorname{\mathit{svec}}\bigl(\mathcal{B}_k^{*} M_k^{-1} \mathcal{B}_k U\bigr) = \operatorname{\mathit{svec}}\Biggl(\sum_{i}\sum_{j}\phi_{ij}^{(k)}\bigl(B_k^{(j)} \bullet U\bigr)B_k^{(i)}\Biggr) = \sum_{i}\sum_{j}\phi_{ij}^{(k)}\bigl(B_k^{(j)} \bullet U\bigr)\operatorname{\mathit{svec}}\bigl(B_k^{(i)}\bigr) = \sum_{i}\sum_{j}\phi_{ij}^{(k)}\bigl(\operatorname{\mathit{svec}}\bigl(B_k^{(j)}\bigr)^{\mathsf{T}}\operatorname{\mathit{svec}}(U)\bigr)\operatorname{\mathit{svec}}\bigl(B_k^{(i)}\bigr) = \Biggl(\sum_{i}\sum_{j}\phi_{ij}^{(k)}\operatorname{\mathit{svec}}\bigl(B_k^{(i)}\bigr)\operatorname{\mathit{svec}}\bigl(B_k^{(j)}\bigr)^{\mathsf{T}}\Biggr)\operatorname{\mathit{svec}}(U). $$

The third equality above follows from the relationship \(U\bullet V= \operatorname{\mathit{svec}}(U)^{\mathsf{T}}\operatorname{\mathit{svec}}(V)\) for \(U,V\in{\mathbb{R}}^{n_{0} \vee n_{0}}\) (see [5]). So the matrix representing \(\mathcal{M}_{0} = \mathcal{F}_{0}^{-1}\mathcal{E}_{0} + \sum_{k=1}^{K}\mathcal{B}_{k}^{*}M_{k}^{-1}\mathcal{B}_{k}\) in these coordinates is given by

$$ H_0 + \sum_{k=1}^{K} \sum_{i=1}^{m_0}\sum _{j=1}^{m_0}\phi_{ij}^{(k)}\bigl( \operatorname{\mathit{svec}}\bigl(B_k^{(i)}\bigr)\operatorname{\mathit{svec}} \bigl(B_k^{(j)}\bigr)^{\mathsf{T}}\bigr), $$
(24)

where H 0 is the matrix of \(\mathcal{F}_{0}^{-1}\mathcal{E}_{0}\), which depends on different choices of symmetrization for search directions.

The number of arithmetic operations needed to compute \(\operatorname{\mathit{svec}}(B_{k}^{(i)})\operatorname{\mathit{svec}}(B_{k}^{(j)})^{\mathsf{T}}\) is \(O(\bar{n}_{0}^{2})=O(n_{0}^{4})\), so the number of arithmetic operations needed for the second term in (24) is \(O(Km_{0}^{2}n_{0}^{4})\). The inverse of (24) needs \(O(\bar{n}_{0}^{3})=O(n_{0}^{6})\) arithmetic operations. Finally, the number of arithmetic operations needed for \(\mathcal{M}_{0}^{-1}\) is \(O(Km_{0}^{2}n_{0}^{4}+n_{0}^{6})\). Thus, the number of arithmetic operations for (14) is dominated by \(O(K(m_{0}^{3}+m_{0}^{2}n_{0}^{4}) + n_{0}^{6})\). □
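The svec identity used above is easy to check numerically. The sketch below (hypothetical small dimensions and random data) assembles \(\sum_{i}\sum_{j}\phi_{ij}^{(k)}\operatorname{\mathit{svec}}(B_k^{(i)})\operatorname{\mathit{svec}}(B_k^{(j)})^{\mathsf{T}}\) and compares its action with a direct application of \(\mathcal{B}_k^{*}M_k^{-1}\mathcal{B}_k\):

```python
import numpy as np

def svec(U):
    n = U.shape[0]
    out = []
    for j in range(n):
        out.append(U[j, j])
        out.extend(np.sqrt(2.0) * U[j + 1:, j])
    return np.array(out)

def rand_sym(rng, n):
    M = rng.standard_normal((n, n))
    return M + M.T

rng = np.random.default_rng(7)
n0, m1 = 3, 2
Bs = [rand_sym(rng, n0) for _ in range(m1)]      # B_k^{(1)}, ..., B_k^{(m_1)}
A = rng.standard_normal((m1, m1))
Phi = np.linalg.inv(A @ A.T + np.eye(m1))        # Phi plays the role of M_k^{-1}
U = rand_sym(rng, n0)

# Matrix of B_k^* M_k^{-1} B_k in svec coordinates: a sum of rank-one terms.
H = sum(Phi[i, j] * np.outer(svec(Bs[i]), svec(Bs[j]))
        for i in range(m1) for j in range(m1))

# Direct application of the operator: sum_ij Phi_ij (B^(j) • U) B^(i).
direct = sum(Phi[i, j] * np.trace(Bs[j].T @ U) * Bs[i]
             for i in range(m1) for j in range(m1))

assert np.allclose(H @ svec(U), svec(direct))
```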

If we use a generic homogeneous algorithm such as the one in [5], then the number of arithmetic operations required to compute the search directions for (6), (7) is \(O(mn^{3} + m^{2}n^{2} + m^{3})\), where \(n := n_{0} + Kn_{1}\) and \(m := m_{0} + Km_{1}\). Setting \(m_{1} = m_{0}\) and \(n_{1} = n_{0}\) and substituting \(n = (1+K)n_{0}\) and \(m = (1+K)m_{0}\) in \(O(mn^{3} + m^{2}n^{2} + m^{3})\), we have that the complexity of such a generic method of computing the search directions is \(O(K^{4}(m_{0} n_{0}^{3} + m_{0}^{2} n_{0}^{2}) + K^{3} m_{0}^{3})\). This is much larger than the complexity \(O(K(m_{0}^{3} + m_{0}^{2} n_{0}^{4}) + n_{0}^{6})\) obtained for the method in Sect. 4.2 when \(K \gg m_{0}\) and \(K \gg n_{0}\).

If problem (6)–(7) has a solution, then the KSH method is globally convergent [6]. The algorithm finds an optimal solution or determines that the primal-dual pair has no solution of norm less than a given number in at most \(O(n^{1/2}L)\) iterations. Here, n is the size of the problem and L is the logarithm of the ratio of the initial error to the tolerance. So, by using the method in Sect. 4.2 for computing the search direction with KSH symmetrization, the complexity of Algorithm 1 in terms of the total number of arithmetic operations is \(O(K^{1.5}(m_{0}^{3} n_{0}^{0.5} + m_{0}^{2} n_{0}^{4.5}) + K^{0.5} n_{0}^{6.5})\) to find a solution or ascertain that (8) has no solution. In comparison, the short- and long-step decomposition algorithms of Ariyawansa and Zhu [4] have complexities of \(O(K^{1.5})\) and \(O(K^{2})\), respectively, in terms of the total number of arithmetic operations.

We note that the efficient computation of the Schur complement matrices \(M_{k}\) in (12) is crucial, as this is the most expensive step in each iteration; typically about 80 % of the total CPU time is spent there when the algorithm in [7] is used. However, in each iteration each \(M_{k}\) can be computed independently, and so distributed processing may be used to achieve substantial reductions in computation time.

6 Concluding Remarks

In this paper, we have proposed homogeneous self-dual algorithms for Stochastic Semidefinite Programming whose complexity is comparable to that of the algorithms in [4]. The efficient method we developed for computing the search directions can take advantage of parallel and distributed processing.