Abstract
Ariyawansa and Zhu have proposed a new class of optimization problems termed stochastic semidefinite programs to handle data uncertainty in applications leading to (deterministic) semidefinite programs. For stochastic semidefinite programs with finite event space, they have also derived a class of volumetric barrier decomposition algorithms, and proved polynomial complexity of certain members of the class. In this paper, we consider homogeneous self-dual algorithms for stochastic semidefinite programs with finite event space. We show how the structure in such problems may be exploited so that the algorithms developed in this paper have complexity similar to those of the decomposition algorithms mentioned above.
1 Introduction
While linear programs have enjoyed wide applicability, they are based on deterministic data. Stochastic linear programs (SLPs) [1] were defined to handle uncertainty in the data defining linear programs, dramatically enlarging the range of applications. Deterministic semidefinite programs (DSDPs) have been the focus of intense research during the past 20 years, especially in the context of interior point methods for optimization [2]. DSDPs generalize linear programs and have a wide variety of applications, many of which lie beyond the reach of linear programs. It is therefore natural to seek a generalization of DSDPs that handles uncertainty in data in the way SLPs generalize linear programs. In [3], Ariyawansa and Zhu have defined such a class of problems, termed stochastic semidefinite programs (SSDPs).
Ariyawansa and Zhu [4] have presented a class of volumetric barrier decomposition algorithms for SSDPs and proved polynomial complexity of certain members of the class. Their derivation is for the case where the event space of the random variables in the SSDP is finite with K realizations. Another important feature of the algorithms in [4] is that the most expensive part of the algorithm naturally separates into K subproblems, which allows efficient parallel implementations.
Decomposition can be viewed as a way of exploiting the special structure in SSDPs. In this paper, we propose homogeneous self-dual algorithms for SSDPs (with finite event spaces of their random variables) as an alternative to decomposition for handling the special structure. We show that the complexity of these algorithms is similar to those in [4], and that the most expensive part of the algorithms separates into K independent subproblems.
2 Notation and Terminology
The notation and terminology we use in the rest of this paper follow that in [2]. We let ℝm×n and ℝn∨n denote the vector spaces of real m×n matrices and real symmetric n×n matrices, respectively. We write X⪰0 (X≻0) to mean that X is positive semidefinite (positive definite), and we use U⪰V or V⪯U to mean that U−V⪰0. For U,V∈ℝm×n, we write U•V:=trace(UᵀV) to denote the Frobenius inner product of U and V.
Given A i ∈ℝn∨n for i=1,2,…,m, we define the operator \(\mathcal{A}: \mathbb{R}^{n \vee n}\to\mathbb{R}^{m}\) by
for X∈ℝn∨n. The adjoint operator of \(\mathcal{A}\) with respect to the standard inner products in ℝn∨n and ℝm is the operator \(\mathcal {A}^{\ast} :\mathbb{R}^{m} \to\mathbb{R}^{n \vee n}\) defined by \(\mathcal{A}^{\ast}y:=\sum_{i=1}^{m}y_{i} A_{i}\), for y∈ℝm.
The Kronecker product P⊛Q, where P,Q∈ℝn∨n, is the operator from ℝn∨n to itself defined by \((P\circledast Q)U:=\frac{1}{2}(PUQ^{\mathsf{T}}+QUP^{\mathsf{T}})\), for U∈ℝn∨n.
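This operator is easy to check numerically. The following minimal numpy sketch (the function name is ours, not from the paper) applies (P ⊛ Q) to a symmetric U:

```python
import numpy as np

def sym_kron_apply(P, Q, U):
    """Apply the Kronecker product operator (P ⊛ Q) to a symmetric U:
    (P ⊛ Q)U := (P U Q^T + Q U P^T) / 2."""
    return 0.5 * (P @ U @ Q.T + Q @ U @ P.T)
```

Note that the result is again symmetric whenever U is, and that (I ⊛ I)U = U.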
Finally, we recall that the operator \(\operatorname{\mathit{svec}}:\mathbb{R}^{n \vee n}\to\mathbb{R}^{\bar{n}}\) (see [5]) where \(\bar{n} := (1/2)n(n+1)\) is defined by \(\operatorname{\mathit{svec}}(U):=[ U_{11},\sqrt{2}U_{21}, \ldots,\sqrt{2}U_{n1}, U_{22},\sqrt{2}U_{32}, \ldots,\allowbreak \sqrt{2}U_{n2}, \ldots, U_{nn} ]^{\mathsf{T}}\) for U∈ℝn∨n.
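As a quick illustration (our own helper, not code from [5]), svec can be written by stacking each column of the lower triangle, diagonal entry first, with off-diagonal entries scaled by √2:

```python
import numpy as np

def svec(U):
    """svec of a symmetric matrix U: stack the lower triangle column by
    column, scaling off-diagonal entries by sqrt(2)."""
    n = U.shape[0]
    out = []
    for j in range(n):
        out.append(U[j, j])                      # diagonal entry U_jj
        out.extend(np.sqrt(2.0) * U[j+1:, j])    # sqrt(2) * subdiagonal of column j
    return np.array(out)
```

The √2 scaling makes svec an isometry for the Frobenius inner product: U•V = svec(U)ᵀsvec(V), a fact used in Sect. 5.2.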
3 Homogeneous Self-dual Methods for DSDPs
Given data A i ∈ℝn∨n for i=1,2,…,m, b∈ℝm and C∈ℝn∨n, a DSDP in primal standard form is defined as
where X∈ℝn∨n is the variable. The dual of (2) is
where y∈ℝm and S∈ℝn∨n are the variables.
We briefly review the homogeneous interior point algorithm for DSDPs as in [6]. The homogeneous model for (2)–(3) is as follows:
It is easy to show from (4) that X•S+τκ=0. The main step at each iteration of the homogeneous interior point algorithm (shown below in Algorithm 1) is the computation of the search direction (ΔX,Δy,ΔS) from the symmetrized Newton equations with respect to an invertible matrix P (which is chosen as a function of (X,y,S)) defined by:
where \(r_{p} := b\tau- \mathcal{A}X\), \(R_{d} := \mathcal{A}^{*}y + S -\tau C\), r g :=C•X−b T y+κ, μ:=[1/(n+1)](X•S+τκ), η and γ are two parameters, H P :ℝn∨n⟶ℝn∨n is the symmetrization operator defined by \(H_{P}(U) := \frac{1}{2}(PUP^{-1} + (P^{-1})^{\mathsf{T}}U^{\mathsf {T}}P^{\mathsf{T}})\), and \(\mathcal{E}: \mathbb{R}^{n \vee n}\longrightarrow\mathbb{R}^{n \vee n}\) and \(\mathcal{F}: \mathbb{R}^{n \vee n}\longrightarrow\mathbb{R}^{n \vee n}\) are the linear operators defined by \(\mathcal{E} := P\circledast(P^{-1})^{\mathsf{T}}S\) and \(\mathcal{F} := PX\circledast(P^{-1})^{\mathsf{T}}\), respectively.
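For intuition, the symmetrization operator can be checked numerically; the sketch below (naming is ours) confirms that H_P(U) is symmetric for any U and any invertible P, and that for P = I it reduces to the symmetric part of U:

```python
import numpy as np

def H_P(P, U):
    """Symmetrization operator H_P(U) := (P U P^{-1} + (P^{-1})^T U^T P^T) / 2."""
    Pinv = np.linalg.inv(P)
    return 0.5 * (P @ U @ Pinv + Pinv.T @ U.T @ P.T)
```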
Currently, the most common choices of symmetrization for search directions in practice are as follows [7]:
-
Helmberg–Rendl–Vanderbei–Wolkowicz/Kojima–Shindoh–Hara/Monteiro (HKM) direction, corresponding to P:=S^{1/2}. In this case, we have that \(\mathcal{E} = \mathcal{I}\) and \(\mathcal{F} = X \circledast S^{-1}\).
-
Kojima, Shindoh, and Hara (KSH) direction (rediscovered by Monteiro), corresponding to P:=X^{−1/2}. In this case, we have that \(\mathcal{E} = S\circledast X^{-1}\) and \(\mathcal{F} = \mathcal{I}\).
-
Nesterov–Todd (NT) direction, corresponding to P:=H^{−1/2}, where H is the unique symmetric positive definite matrix satisfying HSH=X, which can be calculated as H=X^{1/2}(X^{1/2}SX^{1/2})^{−1/2}X^{1/2}. In this case, we have \(\mathcal{E} = \mathcal{I}\) and \(\mathcal{F} = H \circledast H\).
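The NT scaling matrix can be computed from the closed-form expression above using symmetric eigendecompositions. The following numpy sketch (helper names are ours) computes H = X^{1/2}(X^{1/2}SX^{1/2})^{−1/2}X^{1/2} and verifies that it satisfies HSH = X:

```python
import numpy as np

def sqrtm_spd(A):
    """Square root of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(w)) @ V.T

def nt_scaling(X, S):
    """NT scaling matrix H = X^{1/2} (X^{1/2} S X^{1/2})^{-1/2} X^{1/2}."""
    Xh = sqrtm_spd(X)
    M = Xh @ S @ Xh
    Minvh = np.linalg.inv(sqrtm_spd(M))
    return Xh @ Minvh @ Xh
```

A short calculation confirms the identity: HSH = X^{1/2}M^{−1/2}(X^{1/2}SX^{1/2})M^{−1/2}X^{1/2} = X^{1/2}M^{−1/2}MM^{−1/2}X^{1/2} = X.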
Lemma 3.1
Suppose that X≻0, S≻0, and A 1,A 2,…,A m are linearly independent. Then for each of the above three choices of P, \(\mathcal{E}^{-1}\) and \(\mathcal{F}^{-1}\) exist, and \(\mathcal{E}^{-1}\mathcal{F}\) and \(\mathcal{F}^{-1}\mathcal{E}\) are positive definite and self-adjoint.
Proof
See [2]. □
We state the generic homogeneous algorithm as in [6].
4 Homogeneous Self-dual Algorithms for SSDPs
4.1 SSDPs with Finite Event Space
The general definition of an SSDP in primal standard form is stated in [3]. We consider the special case in which the event space is finite with K realizations. Let (deterministic) data \(W_{0}^{(i)} \in{\mathbb{R}}^{n_{0} \vee n_{0}}\) for i=1,2,…,m 0, \(h_{0} \in\mathbb{R}^{m_{0}}\) and \(C_{0} \in{\mathbb{R}}^{n_{0} \vee n_{0}}\); and realizations of random data \(B^{(i)}_{k}\in{\mathbb{R}}^{n_{0} \vee n_{0}}\), \(W^{(i)}_{k} \in\mathbb{R}^{n_{1} \vee n_{1}}\) for i=1,2,…,m 1, \(h_{k} \in\mathbb{R}^{m_{1}}\) and \(C_{k} \in\mathbb{R}^{n_{1} \vee n_{1}}\) for k=1,2,…,K be given. Then an SSDP with finite event space in primal standard form is the problem
where \(X_{0} \in{\mathbb{R}}^{n_{0} \vee n_{0}}\) and \(X_{k} \in{\mathbb{R}}^{n_{1} \vee n_{1}} \) for k=1,2,…,K are the first-stage and second-stage variables, respectively.
Problem (6) is a DSDP in primal standard form with block diagonal structure. Algorithms that exploit this special structure are especially important when K is large, as is the case in typical applications.
The dual of (6) is
where \(y \in\mathbb{R}^{(m_{0}+ K m_{1})}\), \(S_{0} \in\mathbb{R}^{n_{0} \vee n_{0}}\), and \(S_{k} \in\mathbb{R}^{n_{1} \vee n_{1}}\) for k=1,2,…,K are the variables.
Now Algorithm 1 can be applied to problem (6)–(7). In practice, K is very large, so problem (6)–(7) is large and, in particular, the computation of the search direction in Algorithm 1 (i.e., the solution of the system (5)) is very expensive. As we shall see in the next section, this computational work can be reduced significantly by exploiting the special structure of problem (6)–(7). In addition, the method we describe in the next section for computing the search direction decomposes into K smaller computations that can be performed in parallel.
4.2 An Efficient Method for Computing Search Directions
We now describe a method for computing the search direction in Algorithm 1 that exploits the special structure in (6), (7). The homogeneous model (4) for problem (6)–(7) is
where \(\mathcal{W}_{k}^{*}\) and \(\mathcal{B}_{k}^{*}\) are adjoint operators with appropriate dimensions for k=1,2,…,K.
The search direction system corresponding to (8) can be derived via (5) as
where \(r_{p_{0}} := h_{0}\tau- \mathcal{W}_{0} X_{0}\); \(r_{p_{k}} := h_{k}\tau- \mathcal{B}_{k} X_{0} - \mathcal{W}_{k} X_{k}\); \(R_{d_{0}} := \mathcal{W}_{0}^{*} y_{0} + \sum_{k=1}^{K} \mathcal {B}_{k}^{*} y_{k} + S_{0} - \tau C_{0}\); \(R_{d_{k}} := \mathcal{W}_{k}^{*} y_{k} + S_{k} - \tau C_{k}\); \(r_{g} := \kappa- \sum_{k=0}^{K} h_{k}^{\mathsf{T}}y_{k} + \sum_{k=0}^{K}C_{k} \bullet X_{k}\); \(\mu:= (\sum_{k = 0}^{K} X_{k} \bullet S_{k} + \tau\kappa)/(n_{0} +Kn_{1} + 1)\); η and γ are two parameters; and \(\mathcal{E}_{k}, \mathcal{F}_{k}\), and \(H_{P_{k}}\) are linear operators which depend only on X k and S k , for k=0,1,…,K.
Now we present the crux of our method for finding the search direction as a solution to (9). By the sixth equation of (9) and the fact that \(\mathcal{F}_{k}^{-1}\) for k=1,2,…,K exist, we have \(\Delta S_{k} = -\mathcal{F}_{k}^{-1} \mathcal{E}_{k} \Delta X_{k} + \mathcal {F}_{k}^{-1} (\gamma\mu I_{k} - H_{P_{k}}(X_{k}S_{k}))\) for k=1,2,…,K. Substituting this into the fourth equation of (9), we get \(-\mathcal{W}_{k}^{*} \Delta y_{k} + \Delta\tau C_{k} + \mathcal{F}_{k}^{-1} \mathcal{E}_{k} \Delta X_{k} = \eta R_{d_{k}} + \mathcal{F}_{k}^{-1} (\gamma \mu I_{k} - H_{P_{k}}(X_{k}S_{k}))\) for k=1,2,…,K. By this equation and by the fact that \((\mathcal{F}_{k}^{-1} \mathcal{E}_{k})^{-1} = \mathcal{E}_{k}^{-1} \mathcal{F}_{k}\), because \(\mathcal{E}_{k}^{-1}\) exist, for k=1,2,…,K, we can express ΔX k as
Substituting (10) into the second equation of (9), we have \(\mathcal{B}_{k}\Delta X_{0} + \mathcal{W}_{k} (\mathcal{E}_{k}^{-1} \mathcal{F}_{k} \mathcal{W}_{k}^{*} \Delta y_{k} - \mathcal{E}_{k}^{-1} \mathcal {F}_{k}(\Delta\tau C_{k} + \eta R_{d_{k}}) + \mathcal{E}_{k}^{-1}(\gamma\mu I_{k} - H_{P_{k}}(X_{k}S_{k}))) - h_{k} \Delta\tau= \eta r_{p_{k}}\) for k=1,2,…,K. Thus,
where
for k=1,2,…,K. By the fifth equation of (9) and the fact that \(\mathcal{F}_{0}^{-1}\) exists, we get \(\Delta S_{0} = -\mathcal{F}_{0}^{-1} \mathcal{E}_{0} \Delta X_{0} + \mathcal{F}_{0}^{-1}(\gamma\mu I_{0} - H_{P_{0}}(X_{0}S_{0}))\). We use this equality and (11) in the third equation of (9) and get \(-\mathcal{W}_{0}^{*}\Delta y_{0} - \sum_{k=1}^{K}\mathcal{B}_{k}^{*}(-M_{k}^{-1} \mathcal{B}_{k} \Delta X_{0} + q_{k} \Delta\tau+ \nu_{k}) + \Delta\tau C_{0} + \mathcal{F}_{0}^{-1} \mathcal{E}_{0} \Delta X_{0}- \mathcal{F}_{0}^{-1}(\gamma\mu I_{0} - H_{P_{0}}(X_{0} S_{0})) = \eta R_{d_{0}}\). From this relation, we have
where
Now, substituting (13) into the first equation of (9), we have \(\mathcal{W}_{0} (\mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*} \Delta y_{0} - T_{0} \Delta\tau+ U_{0}) - h_{0} \Delta\tau= \eta r_{p_{0}}\). Because \(\mathcal{W}_{0} \mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*}\) is nonsingular (a fact to be discussed in Sect. 5), this equation leads us to
where
Now we substitute backward. First, we substitute (15) in (13) to get
where
Substituting (17) in (11), we obtain
where
for k=1,2,…,K. Furthermore, we substitute (19) in (10) to get
where
for k=1,2,…,K. Now, we substitute (15), (17), (19), and (21) in the last equation of (9). By the seventh equation of (9), this yields
Finally, Δτ is given by
The remaining components of the search direction are then recovered from (23) by back substitution, and we move to the next iteration of Algorithm 1.
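The elimination pattern of this section, namely solving K independent second-stage blocks, forming a Schur complement in the first-stage variables, and back-substituting, can be illustrated on a generic block-arrow linear system. The sketch below is a plain dense-linear-algebra analogy (all names are ours), not the operator system (9) itself:

```python
import numpy as np

def solve_arrow(A0, Bs, Ds, b0, bs):
    """Solve the block-arrow system
        A0 x0 + sum_k B_k^T x_k = b0
        B_k x0 + D_k x_k        = b_k,   k = 1, ..., K
    by eliminating each x_k. The K solves against D_k are mutually
    independent, mirroring the per-scenario computations in Sect. 4.2."""
    S = A0.copy()        # Schur complement in the first-stage block
    r = b0.copy()
    for Bk, Dk, bk in zip(Bs, Ds, bs):
        S -= Bk.T @ np.linalg.solve(Dk, Bk)
        r -= Bk.T @ np.linalg.solve(Dk, bk)
    x0 = np.linalg.solve(S, r)
    # back substitution: each x_k depends only on its own block and x0
    xs = [np.linalg.solve(Dk, bk - Bk @ x0) for Bk, Dk, bk in zip(Bs, Ds, bs)]
    return x0, xs
```

Just as in the derivation above, the loop forming the Schur complement and the final list of back substitutions both split into K independent pieces.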
5 Complexity Analysis
In this section, we first show that under reasonable conditions the operations described above are valid. Then we estimate the computational complexity of Algorithm 1 with the method described in Sect. 4.2 applied on problem (6)–(7). Finally, we compare that complexity to the complexity of Algorithm 1 applied on problem (6)–(7) treating it as a generic primal-dual DSDP pair.
5.1 Validation of Computations
We assume that \(W_{0}^{(1)}, W_{0}^{(2)},\ldots, W_{0}^{(m_{0})}\), and \(W_{k}^{(1)},W_{k}^{(2)} \ldots, W_{k}^{(m_{1})}\) for k=1,2,…,K are linearly independent. Then M k for k=1,2,…,K in (12) are nonsingular and positive definite by Lemma 3.1.
Now we will show that \(\mathcal{M}_{0}\) in (14) is nonsingular and that \(\mathcal{W}_{0} \mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*}\) is also nonsingular.
Lemma 5.1
Suppose that \(W_{0}^{(1)}, W_{0}^{(2)},\ldots, W_{0}^{(m_{0})}\) and that \(W_{k}^{(1)}, W_{k}^{(2)} \ldots, W_{k}^{(m_{1})}\) are linearly independent for k=1,2,…,K. Then \(\mathcal{M}_{0}\) in (14) and \(\mathcal{W}_{0} \mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*}\) are positive definite.
Proof
From (14), we have \(\mathcal{M}_{0} = \mathcal{F}_{0}^{-1} \mathcal{E}_{0} + \sum_{k=1}^{K} \mathcal{B}_{k}^{*} M_{k}^{-1} \mathcal{B}_{k}\). We have that \(\mathcal{F}_{0}^{-1} \mathcal{E}_{0}\) is positive definite by Lemma 3.1, so it suffices to show that \(\mathcal{B}_{k}^{*}M_{k}^{-1} \mathcal{B}_{k}\) is positive semidefinite for k=1,2,…,K. Letting \(\mathcal{B}_{k} U = [B_{k}^{(1)} \bullet U, B_{k}^{(2)} \bullet U, \ldots, B_{k}^{(m_{1})} \bullet U ]^{\mathsf{T}}\), for each \(U \in\mathbb{R}^{n_{0} \vee n_{0}}\) and \(M_{k}^{-1} = [\phi _{ij}^{(k)} ]_{m_{1} \times m_{1}}\), we have
for k=1,2,…,K. So
The last inequality is due to the fact that \(M_{k}^{-1}\) is positive definite. As in [2], we can then show that \(\mathcal{W}_{0} \mathcal{M}_{0}^{-1} \mathcal{W}_{0}^{*}\) is positive definite. □
5.2 Complexity Estimates
Theorem 5.1
Suppose that m 1=m 0, n 1=n 0 and that \(W_{k}^{(1)}, W_{k}^{(2)}, \ldots, W_{k}^{(m_{0})}\) are linearly independent for k=0,1,…,K. By utilizing the method described in Sect. 4.2 for computing the search directions in Algorithm 1, we have that the number of arithmetic operations in each iteration of Algorithm 1 is \(O(K(m_{0}^{3} + m_{0}^{2} n_{0}^{4}) + n_{0}^{6})\).
Proof
The dominant computations of the method in Sect. 4.2 occur at (12), (14), (16), (18), (20), and (22). The corresponding numbers of arithmetic operations for these computations are listed in Table 1. In particular, the computation in (14) will be analyzed in detail below. The total number of arithmetic operations is dominated by \(O(K(m_{0}^{3} + m_{0}^{2} n_{0}^{4}) + n_{0}^{6})\) for the choices of P stated in Sect. 3.
We now analyze the computation in (14). Note that \(\mathcal{B}_{k}^{*}M_{k}^{-1}\mathcal{B}_{k}\) is a mapping from \({\mathbb{R}}^{n_{0} \vee n_{0}}\) to itself. We use the operator \(\operatorname{\mathit{svec}}: {\mathbb{R}}^{n_{0} \vee n_{0}} \to \mathbb{R}^{\bar{n}_{0}}\), where \(\bar{n}_{0} := \frac{1}{2}n_{0}(n_{0}+1)\), to obtain the matrix of \(\mathcal{B}_{k}^{*}M_{k}^{-1}\mathcal{B}_{k}\). For any \(U\in{\mathbb{R}}^{n_{0} \vee n_{0}}\),
The third equality above is from the relationship \(U\bullet V= \operatorname{\mathit{svec}}(U)^{\mathsf{T}}\operatorname{\mathit{svec}}(V)\) for \(U,V\in{\mathbb{R}}^{n_{0} \vee n_{0}}\) (see [5]). So the matrix of \(\mathcal{M}_{0} = \mathcal{F}_{0}^{-1}\mathcal{E}_{0} + \sum_{k=1}^{K}\mathcal{B}_{k}^{*}M_{k}^{-1}\mathcal{B}_{k}\) in \(\mathbb {R}^{n_{0} \vee n_{0}}\) is given by
where H 0 is the matrix of \(\mathcal{F}_{0}^{-1}\mathcal{E}_{0}\), which depends on different choices of symmetrization for search directions.
The number of arithmetic operations needed to compute \(\operatorname{\mathit{svec}}(B_{k}^{(i)})\operatorname{\mathit{svec}}(B_{k}^{(j)})^{\mathsf{T}}\) is \(O(\bar{n}_{0}^{2})=O(n_{0}^{4})\), so the number of arithmetic operations needed for the second term in (24) is \(O(Km_{0}^{2}n_{0}^{4})\). The inverse of (24) needs \(O(\bar{n}_{0}^{3})=O(n_{0}^{6})\) arithmetic operations. Finally, the number of arithmetic operations needed for \(\mathcal{M}_{0}^{-1}\) is \(O(Km_{0}^{2}n_{0}^{4}+n_{0}^{6})\). Thus, the number of arithmetic operations for (14) is dominated by \(O(K(m_{0}^{3}+m_{0}^{2}n_{0}^{4}) + n_{0}^{6})\). □
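In svec coordinates, each term \(\mathcal{B}_{k}^{*}M_{k}^{-1}\mathcal{B}_{k}\) of (24) is the matrix VᵀM_k^{−1}V, where row i of V is svec(B_k^{(i)}). A small numpy sketch of this assembly (helper names are ours); it also exhibits the positive semidefiniteness used in Lemma 5.1:

```python
import numpy as np

def svec(U):
    """svec of a symmetric matrix, off-diagonals scaled by sqrt(2)."""
    n = U.shape[0]
    out = []
    for j in range(n):
        out.append(U[j, j])
        out.extend(np.sqrt(2.0) * U[j+1:, j])
    return np.array(out)

def bmb_matrix(Bs, Minv):
    """Matrix of B* M^{-1} B in svec coordinates:
    sum_{i,j} (M^{-1})_{ij} svec(B_i) svec(B_j)^T = V^T M^{-1} V,
    where row i of V is svec(B_i)."""
    V = np.vstack([svec(B) for B in Bs])
    return V.T @ Minv @ V
```

Since M_k^{−1} is positive definite, VᵀM_k^{−1}V is symmetric positive semidefinite, as the proof of Lemma 5.1 requires.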
If we use a generic homogeneous algorithm such as the one in [5], then the number of arithmetic operations required to compute the search directions for (6), (7) is O(mn 3+m 2 n 2+m 3), where n:=(n 0+Kn 1) and m:=(m 0+Km 1). Setting m 1=m 0 and n 1=n 0 and substituting n=(1+K)n 0 and m=(1+K)m 0 in O(mn 3+m 2 n 2+m 3), we have that the complexity of such a generic method of computing the search directions is \(O(K^{4}(m_{0} n_{0}^{3} + m_{0}^{2} n_{0}^{2}) + K^{3} m_{0}^{3})\). This is much larger than the complexity \(O(K(m_{0}^{3} + m_{0}^{2} n_{0}^{4}) + n_{0}^{6})\) obtained for the method in Sect. 4.2 when K≫m 0 and K≫n 0.
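To see the gap concretely, one can compare the leading-term operation counts (constants ignored; the specific values of K, m 0, n 0 below are ours, chosen only for illustration):

```python
def structured_ops(K, m0, n0):
    """Leading-term count for the structured method of Sect. 4.2."""
    return K * (m0**3 + m0**2 * n0**4) + n0**6

def generic_ops(K, m0, n0):
    """Leading-term count for a generic homogeneous method applied to (6)-(7),
    after substituting n = (1+K) n0 and m = (1+K) m0."""
    return K**4 * (m0 * n0**3 + m0**2 * n0**2) + K**3 * m0**3
```

For K = 1000 and m 0 = n 0 = 10, the structured count is about 10⁹ while the generic count exceeds 10¹⁶, and the ratio grows rapidly with K.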
If problem (6)–(7) has a solution, then the KSH method is globally convergent [6]. The algorithm finds an optimal solution, or determines that the primal-dual pair has no solution of norm less than a given number, in at most O(n^{1/2} L) iterations. Here, n is the size of the problem and L is the logarithm of the ratio of the initial error to the tolerance. So, by using the method in Sect. 4.2 for computing the search direction with KSH symmetrization, the complexity of Algorithm 1 in terms of the total number of arithmetic operations is \(O(K^{1.5}(m_{0}^{3} n_{0}^{0.5} + m_{0}^{2} n_{0}^{4.5}) + K^{0.5} n_{0}^{6.5})\) to find a solution or to ascertain that (8) has no solution. In comparison, the short- and long-step decomposition algorithms of Ariyawansa and Zhu [4] have complexities of O(K^{1.5}) and O(K^{2}), respectively, in terms of the total number of arithmetic operations.
We note that the efficient computation of the Schur complement matrix M k in (12) is crucial, as this is the most expensive step in each iteration: typically about 80 % of the total CPU time is spent there when the algorithm in [7] is used. However, in each iteration the matrices M k can be computed independently of one another, so distributed processing may be used to achieve substantial reductions in computation time.
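Since each M k depends only on the data and iterates of scenario k, the K blocks can be formed concurrently. A minimal sketch of this idea (dense stand-in blocks; all names are ours, and G k plays the role of the scenario-dependent operator in svec coordinates):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def schur_block(args):
    """Form one stand-in Schur complement block M_k = W_k G_k W_k^T,
    independently of all other scenarios."""
    Wk, Gk = args
    return Wk @ Gk @ Wk.T

def all_schur_blocks(Ws, Gs, workers=4):
    """Compute all K blocks concurrently with a thread pool; numpy releases
    the GIL inside the matrix products, so threads can overlap."""
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(schur_block, zip(Ws, Gs)))
```

In a genuine distributed implementation each worker would also hold only its own scenario data, which is what makes the decomposition attractive for large K.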
6 Concluding Remarks
In this paper, we have proposed homogeneous self-dual algorithms for stochastic semidefinite programs whose complexity is comparable to that of the algorithms in [4]. The efficient method we developed for calculating the search directions can take advantage of parallel and distributed processing.
References
Kall, P., Wallace, S.: Stochastic Programming. Wiley, New York (1994)
Todd, M.J.: Semidefinite optimization. Acta Numer. 10, 515–560 (2001)
Ariyawansa, K.A., Zhu, Y.: Stochastic semidefinite programming: a new paradigm for stochastic optimization. 4OR 4(3), 239–253 (2006). An earlier version of this paper appeared as Technical Report 2004-10 of the Department of Mathematics, Washington State University, Pullman, WA 99164-3113, in October 2004
Ariyawansa, K.A., Zhu, Y.: A class of polynomial volumetric barrier decomposition algorithms for stochastic semidefinite programming. Math. Comput. 80, 1639–1661 (2011)
Todd, M.J., Toh, K.C., Tütüncü, R.H.: On the Nesterov-Todd direction in semidefinite programming. SIAM J. Optim. 8, 769–796 (1998)
Potra, F., Sheng, R.: On homogeneous interior-point algorithms for semidefinite programming. Optim. Methods Softw. 9, 161–184 (1998)
Toh, K.C., Todd, M.J., Tütüncü, R.H.: SDPT3—a MATLAB software package for semidefinite programming. Optim. Methods Softw. 11/12, 545–581 (1999)
Acknowledgements
The work of S.J. was performed while he was visiting Washington State University. Research supported in part by the Chinese National Foundation under Grants No. 51139005 and 51179147. K.A.A. research supported in part by the US Army Research Office under Grant DAAD 19-00-1-0465 and by Award W11NF-08-1-0530. Y.Z. research supported in part by ASU West MGIA Grant 2007.
Jin, S., Ariyawansa, K.A. & Zhu, Y. Homogeneous Self-dual Algorithms for Stochastic Semidefinite Programming. J Optim Theory Appl 155, 1073–1083 (2012). https://doi.org/10.1007/s10957-012-0110-x