1 Introduction

Linear semidefinite programming problems are important generalizations of linear programming problems [1,2,3]. The semidefinite optimization problem in standard form is the problem of minimizing a linear objective function over the cone of positive semidefinite symmetric matrices subject to linear equality constraints [4, 5]. Many convex nonlinear problems of mathematical programming, as well as problems of discrete and combinatorial optimization, reduce to statements of this type [4, 6]. Moreover, semidefinite programming includes second-order cone programming as a special case [7].

To date, the effective primal-dual methods proposed in [4, 5] for solving linear semidefinite programming problems mainly generalize interior-point techniques developed earlier for linear programming. Simplex-type methods have been studied much less, and there are a number of reasons for this. One of the main ones is that the cone of positive semidefinite matrices is not polyhedral; as a consequence, the feasible region has infinitely many extreme points. Nevertheless, there are some methods which extend the primal simplex method of linear programming.

In [8] a fairly general extension of the simplex method to problems with constraints given in the form of linear matrix inequalities is proposed. Another version of the simplex method for semidefinite programming, using finite approximations of the cone of positive semidefinite matrices, is given in [9]. For conic programming problems with arbitrary closed convex cones, a generalization of the simplex method was considered in [10]. A pivoting procedure for a class of second-order cone programming problems with one second-order cone, but possibly additional non-negative variables, was proposed in [11]. Simplex-type algorithms for second-order cone programming via a semi-infinite programming reformulation were also considered in [12].

The purpose of this paper is to develop a standard simplex pivoting procedure similar to the one used in linear programming. The main attention is given to pivoting in the case when the number of equality constraints does not coincide with the number of variables in the problem, which equals the so-called “triangular” number (the number of elements of a symmetric matrix on and below the main diagonal). A preliminary variant of the method was proposed earlier in [13]. Here we also present a special procedure for finding an extreme point of the feasible set when only an arbitrary point of this set is known. The approach used at this first stage of the method is borrowed from the affine-scaling method proposed in [14]; it allows movement along the faces of the feasible set and, in addition, jumps from one face to another.

The rest of the paper is organized as follows. In Sect. 2, we give the formulation of the problem and present the optimality conditions. In Sect. 3, the main assertions concerning the characterization of extreme points and their nondegeneracy are formulated. In Sects. 4 and 5, the passage from one extreme point of the feasible set to another is described; at the end of Sect. 5, we also prove the local convergence of the method. Finally, in Sect. 6 we consider an approach for finding an extreme point of the feasible set when only an arbitrary feasible point of this set is known.

Here is the notation used throughout this paper. We denote by D(a) the diagonal matrix with a vector a on its diagonal. We also use “, ” for adjoining vectors or components of a vector in a row and “; ” for adjoining them in a column. Thus, for instance, \(a= [a^1; \ldots ; a^n]\). The same rule applies to matrices. The identity matrix of order n is denoted by \(I_n\). The zero n-dimensional vector and the zero \((m \times n)\) matrix are denoted by \(0_n\) and \(0_{mn}\), respectively.

2 The semidefinite optimization problem and optimality conditions

Let \({\mathbb S}^n\) denote the space of real symmetric matrices of order n, and let \({\mathbb S}^n_+\) denote the cone of positive semidefinite matrices from \({\mathbb S}^n\). To specify that a matrix \(M\in {\mathbb S}^n\) is positive semidefinite, we will also use the inequality \(M \succeq 0\). The dimension of \({\mathbb S}^n\) is equal to the so-called nth “triangular” number \(n_{\triangle } = n(n+1)/2\).

The inner product of two matrices \(M_1\) and \(M_2\) from \({\mathbb S}^n\) is defined as \(M_1 \bullet M_2 = {\mathrm{tr\,}}(M_1 M_2)\). In the case where \(M_1\) and \(M_2\) are positive semidefinite matrices from \({\mathbb S}^n\), the inequality \(M_1 \bullet M_2 \ge 0\) holds. Moreover, \(M_1 \bullet M_2 = 0\) if and only if \(M_1M_2 = M_2 M_1 = 0_{nn}\). The cone \({\mathbb S}^n_+\) is self-dual.

Consider the linear semidefinite programming problem in standard form

$$\begin{aligned} \begin{array}{c} \min \ C\bullet X, \\ A_i \bullet X = b^i, \quad i=1,\ldots , m, \quad X \succeq 0. \end{array} \end{aligned}$$
(1)

Here the matrices C, X, and \(A_i\) belong to the space \({\mathbb S}^n\). We assume that the matrices \(A_i\), \(1\le i\le m\), are linearly independent. The dual problem to (1) is

$$\begin{aligned} \begin{array}{c} \max \langle b, u \rangle , \\ \sum \limits _{i=1}^{m}u^i A_i + V = C, \quad V \succeq 0, \\ \end{array} \end{aligned}$$
(2)

where \(b= [b^1; \ldots ; b^m]\), \(V\in {\mathbb S}^n\), and angle brackets stand for the usual Euclidean scalar product in \({\mathbb R}^m\). Below, the matrix \(V = C - \sum _{i=1}^m u^i A_i\), which depends on \(u\in {\mathbb R}^m\), is denoted by V(u). We suppose that both problems (1) and (2) have solutions.

Let \(X_*\) be an optimal solution of problem (1), and let \([u_*,V_*]\) be an optimal solution of problem (2). There is the relation \(V_* = V(u_*)\) between \(u_* \in {\mathbb R}^m \) and \(V_*\), and the symmetric positive semidefinite matrices \(X_*\) and \(V_*\) commute with each other. Therefore, there exists an orthogonal matrix Q such that \(X_* = QD(\eta _*)Q^T\) and \(V_* = Q D(\theta _*) Q^T\), where \(D(\eta _*)\) and \( D(\theta _*)\) are diagonal matrices with the vectors \(\eta _*\) and \(\theta _*\) on their diagonals. These vectors are composed of the eigenvalues of the matrices \(X_*\) and \(V_*\), respectively, i.e.

$$\begin{aligned} \eta _* = \left[ \eta _*^1; \ldots ; \eta _*^n \right] , \quad \theta _* = \left[ \theta _*^1; \ldots ; \theta _*^n \right] . \end{aligned}$$

For any pair of eigenvalues \(\eta _*^i\) and \(\theta _*^i\), \(1\le i\le n\), the complementarity condition holds: \(\eta _*^i \ge 0\), \(\theta _*^i\ge 0\), and \(\eta _*^i\theta _*^i = 0\). The strict complementarity condition means in addition that \(\eta _*^i + \theta _*^i > 0\) for all \(1\le i\le n\).

Consider the optimality conditions for the pair of problems (1) and (2). Since we suppose that solutions of these problems exist, the following system of equalities and inequalities

$$\begin{aligned} \begin{array}{rcl} X\bullet V &{} = &{} 0, \\ A_i \bullet X &{} = &{} b^i, \quad 1\le i \le m, \\ V &{} = &{} C - \sum \nolimits _{i=1}^{m} u^i A_i, \\ X \succeq 0, &{} &{} V\succeq 0 \end{array} \end{aligned}$$
(3)

has a solution.

Let us write the equalities from (3) in a different form using the operation of vectorization of matrices. For a square matrix M of order n, denote by \({\mathrm{vec\,}}M\) the column vector of length \(n^2\) in which the columns of M are stacked one after another. If the matrix M is symmetric, then, instead of \({\mathrm{vec\,}}M\), it is reasonable to use the column vector \(\text{ svec }M\) of smaller dimension \(n_\triangle \). This vector contains the lower part of each column of M starting from the diagonal element, with every off-diagonal element multiplied by \(\sqrt{2}\). Then the scalar product \(M_1 \bullet M_2\) of two matrices \(M_1 \in {\mathbb S}^n\) and \(M_2 \in {\mathbb S}^n\) can be written as the usual scalar product in the space \({\mathbb R}^{n_\triangle }\), i.e. \(M_1 \bullet M_2 = \langle \text{ svec }M_1, \text{ svec }M_2 \rangle \). Using this vector representation of matrices, we can rewrite the equalities from (3) in the following form

$$\begin{aligned} \langle \text{ svec }X, \text{ svec }V \rangle = 0, \quad \mathcal{A}_\mathrm{svec} \text{ svec }X = b, \qquad \text{ svec }V = \text{ svec }C - \mathcal{A}_\mathrm{svec}^T u. \end{aligned}$$
(4)

Here \(\mathcal{A}_\mathrm{svec}\) is the matrix of size \(m\times n_\triangle \) with the rows \(\text{ svec }A_i\), \(1 \le i \le m\).

For passing from \({\mathrm{vec\,}}M\) to \(\text{ svec }M\) and back, we will need the special elimination and duplication matrices (see [15, 16]), denoted by \(\tilde{\mathcal{L}}_n\) and \(\tilde{\mathcal{D}}_n\). They are full-rank matrices of size \(n_\triangle \times n^2\) and \(n^2 \times n_\triangle \), respectively. If M is a symmetric matrix of order n, then \(\text{ svec }M = \tilde{\mathcal{L}}_n {\mathrm{vec\,}}M\) and \({\mathrm{vec\,}}M = \tilde{\mathcal{D}}_n \text{ svec }M\). Observe that \(\tilde{\mathcal{L}}_n\) and \(\tilde{\mathcal{D}}_n\) are somewhat different from the elimination and duplication matrices \(\mathcal{L}_n\) and \(\mathcal{D}_n\) from [15]; specifically, \(\tilde{\mathcal{L}}_n = D (\text{ svec }E_n)\mathcal{L}_n\) and \(\tilde{\mathcal{D}}_n = \mathcal{D}_n D^{-1} (\text{ svec }E_n)\). Here \(E_n\) is the square matrix of order n with all elements equal to one.
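
As an illustration (not part of the original algorithm), the following short sketch, assuming NumPy is available, forms \(\text{ svec }M\) for a symmetric matrix and checks numerically the identity \(M_1 \bullet M_2 = \langle \text{ svec }M_1, \text{ svec }M_2 \rangle \); the function name svec is introduced only for this example.

```python
import numpy as np

def svec(M):
    """Stack the lower part of each column of a symmetric matrix, starting from
    the diagonal element, with every off-diagonal entry multiplied by sqrt(2)."""
    n = M.shape[0]
    cols = [np.concatenate(([M[j, j]], np.sqrt(2.0) * M[j + 1:, j])) for j in range(n)]
    return np.concatenate(cols)

rng = np.random.default_rng(0)
M1 = rng.standard_normal((4, 4)); M1 = (M1 + M1.T) / 2   # random symmetric matrices
M2 = rng.standard_normal((4, 4)); M2 = (M2 + M2.T) / 2

# The trace inner product equals the Euclidean product of the svec images.
print(np.trace(M1 @ M2), np.dot(svec(M1), svec(M2)))
```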

Various ways of solving the system (4), supplemented by the requirement that the matrices X and V be positive semidefinite, lead to various numerical methods for solving problems (1) and (2). Below we consider one of these ways, which can be treated as a generalization of the simplex method for linear programming problems.

3 Extreme points of the feasible set

Let

$$\begin{aligned} \mathcal{F}_{P} = \left\{ X\in {\mathbb S}^n_+: \ A_i \bullet X = b^i, \ 1 \le i \le m \right\} , \end{aligned}$$

be the feasible set of problem (1). In what follows, we will be concerned with faces and extreme points of \(\mathcal{F}_P\). The set \(\mathcal{F}_P\) is the intersection of the cone \({\mathbb S}^n_+\) with the affine set \( \mathcal{F}_A = \left\{ X \in {\mathbb S}^n: \ A_i \bullet X = b^i, \ 1\le i\le m \right\} . \) Since both these sets are convex, faces of the set \(\mathcal{F}_P\) are intersections of faces of \({\mathbb S}^n_+\) and \(\mathcal{F}_A\). Faces of the cone \({\mathbb S}^n_+\) are defined by subspaces \(\mathcal{L}\) of the space \({\mathbb R}^n\): namely, \(\mathcal{G}\) is a face of \({\mathbb S}^n_+\) if and only if \( \mathcal{G}= \mathcal{G}(\mathcal{L}) = \left\{ M\in {\mathbb S}^n_+: \ \mathbf{\mathcal{R}} (M) \subseteq \mathcal{L}\right\} . \) Here \(\mathbf{\mathcal{R}}(M)\) denotes the column space of the matrix M. If the dimension of \(\mathcal{L}\) is equal to r, then \(\text{ rank } \ M \le r \) for all matrices M from \(\mathcal{G}\). Furthermore, the dimension of \(\mathcal{G}(\mathcal{L})\) is equal to \(r_\triangle \). Any matrix \(M\in \mathcal{G}(\mathcal{L})\) can be represented as \( M = Q\varLambda Q^T, \) where Q is a full-rank matrix of size \(n\times r\) and \(\varLambda \in {\mathbb S}^r_+\). For all matrices \(M\in \mathcal{G}(\mathcal{L})\) the matrix Q is the same. The face \(\mathcal{G}^\star (\mathcal{L})\) dual to \(\mathcal{G}(\mathcal{L})\) is the face defined by the orthogonal subspace \(\mathcal{L}^\perp \), i.e. \(\mathcal{G}^\star (\mathcal{L}) = \mathcal{G}(\mathcal{L}^\perp )\).

The minimal face of the cone \({\mathbb S}^n_+\) containing the point X has the following form

$$\begin{aligned} \mathcal{G}_\mathrm{min}(X; {\mathbb S}^n_+) = \{Y\in {\mathbb S}^n_+: \ \mathcal{R}(Y) \subseteq \mathcal{R}(X) \}. \end{aligned}$$

Thus, if the matrix \(X\in {\mathbb S}^n_+\) has rank r, then the face \(\mathcal{G}_\mathrm{min}(X; {\mathbb S}^n_+)\) is isomorphic to the cone \({\mathbb S}^r_+\) and, therefore, has dimension \(r_\triangle \). The dual face \(\mathcal{G}_\mathrm{min}^*(X; {\mathbb S}^n_+)\) is isomorphic to the cone \({\mathbb S}^{n-r}_+\) and has dimension \((n-r)_\triangle \).

Passing from the cone \({\mathbb S}^n_+\) to the feasible set \(\mathcal{F}_P\), we obtain that the minimal face of the point \(X\in \mathcal{F}_P\) with respect to the set \(\mathcal{F}_P\) is now defined as

$$\begin{aligned} \mathcal{G}_\mathrm{min} (X; \mathcal{F}_P) = \mathcal{G}_\mathrm{min}(X; {\mathbb S}^n_+)\bigcap \mathcal{F}_A = \left\{ Y\in \mathcal{F}_P: \ \mathcal{R}(Y) \subseteq \mathcal{R}(X)\right\} . \end{aligned}$$

Let r be the rank of the matrix \(X\in \mathcal{F}_P\), and let

$$\begin{aligned} X= Q D(\eta ) Q^T, \qquad \eta = \left[ \eta _B; \ \eta _N\right] , \qquad \eta _B > 0_r, \qquad \eta _N = 0_{n-r}, \end{aligned}$$
(5)

be the spectral decomposition of X. Here Q is an orthogonal matrix of order n. We denote by \(Q_B\) the \(n\times r\) submatrix of Q formed from the first r columns of Q. We also denote \(A_i^{Q_B} = Q_B^T A_iQ_B\), \(1\le i\le m\). The dimension of the face \(\mathcal{G}_\mathrm{min}(X; \mathcal{F}_P)\) is equal to

$$\begin{aligned} \text{ dim } \, \mathcal{G}_\mathrm{min}(X; \mathcal{F}_P) = r_\triangle - \text{ rank } \left[ A_1^{Q_B},\ldots , A_m^{Q_B}\right] . \end{aligned}$$
(6)

The point X is an extreme point of the feasible set \(\mathcal{F}_P\) if and only if the dimension of the face \(\mathcal{G}_\mathrm{min}(X; \mathcal{F}_P)\) is equal to zero. According to (6), the matrix \(X\in \mathcal{F}_P\) of rank r is an extreme point of the set \(\mathcal{F}_P\) if and only if

$$\begin{aligned} \text{ rank }\left[ A_1^{Q_B},\ldots , A_m^{Q_B}\right] = r_\triangle . \end{aligned}$$
(7)

For linearly independent matrices \(A_1, A_2, \ldots , A_m\) this equality can hold only in the case where \(r_\triangle \le m\) [10].

Therefore, the point \(X\in \mathcal{F}_P\) may be extreme only when its rank r satisfies \(r_\triangle \le m\). We say that the extreme point \(X\in \mathcal{F}_P\) is regular if \(r_\triangle = m\). Otherwise, in the case where \(r_\triangle < m\), we call the extreme point X irregular. Note that all extreme points of \(\mathcal{F}_P\) are irregular if m is not a “triangular” number.
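
The extremality criterion (7) can also be checked numerically. The sketch below is an illustration under the notation above: the helper svec repeats the one from the sketch in Sect. 2, and the names is_extreme_point and A_list are introduced only for this example.

```python
import numpy as np

def svec(M):  # as in the earlier sketch
    n = M.shape[0]
    return np.concatenate([np.concatenate(([M[j, j]], np.sqrt(2.0) * M[j + 1:, j]))
                           for j in range(n)])

def is_extreme_point(X, A_list, tol=1e-9):
    """Criterion (7): the svec images of Q_B^T A_i Q_B must span a space of dimension r(r+1)/2."""
    eigval, Q = np.linalg.eigh(X)
    Q_B = Q[:, eigval > tol]                     # eigenvectors of the positive eigenvalues
    r = Q_B.shape[1]
    rows = np.array([svec(Q_B.T @ A @ Q_B) for A in A_list])
    return np.linalg.matrix_rank(rows, tol=tol) == r * (r + 1) // 2
```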

The tangent space to the cone \({\mathbb S}^n_+\) at the point \(X\in \mathcal{F}_P\), for which the factorization (5) holds, has the following form [17]:

$$\begin{aligned} \mathcal{T}_X = \left\{ Q \left[ \begin{array}{cc} G &{}\quad F \\ F^T &{}\quad 0 \end{array} \right] Q^T: \ G \in {\mathbb S}^r, \ F\in {\mathbb R}^{r\times (n-r)} \right\} . \end{aligned}$$

The dimension of \(\mathcal{T}_X\) is defined by the rank of the matrix X and is equal to \( n_\triangle - (n-r)_\triangle . \) Let us give the definition of a nondegenerate point [18].

Definition 1

The point \(X\in \mathcal{F}_P\) is called nondegenerate, if \(\mathcal{T}_X + \mathcal{N}_A = {\mathbb S}^n\), where \(\mathcal{N}_A\) is the subspace in \({\mathbb S}^n\), which is parallel to the affine set \(\mathcal{F}_A\).

The dimension of the subspace \(\mathcal{N}_A\) is equal to \(n_\triangle -m\). Since \(\text{ dim } \ {\mathbb S}^n= n_\triangle \), the equality \(\mathcal{T}_X + \mathcal{N}_A = {\mathbb S}^n\) can take place only if \( \text{ dim } \ \mathcal{T}_X + \text{ dim } \ \mathcal{N}_A \ge n_\triangle . \) Thus, in order for the extreme point \(X\in \mathcal{F}_P \) to be nondegenerate, the following inequalities relating the dimension of the space \({\mathbb S}^n\), the rank of the matrix X and the number of equality constraints m must be fulfilled: \( r_\triangle \le m \le n_\triangle - (n-r)_\triangle . \) For example, for \(n = 3\) and \(r = 2\) these inequalities read \(3 \le m \le 5\).

Let the factorization (5) hold for a point \(X\in \mathcal{F}_P\), and let \(Q_B\) and \(Q_N\) be the submatrices of the orthogonal matrix Q consisting of the first r and the subsequent \(n-r\) columns of Q, respectively. As shown in [18], the point X is nondegenerate if and only if the matrices

$$\begin{aligned} \left[ \begin{array}{ll} Q_B^T A_i Q_B &{}\quad Q_B^T A_i Q_N \\ Q_N^T A_i Q_B &{}\quad 0_{(n-r)(n-r)} \end{array} \right] , \quad 1\le i \le m, \end{aligned}$$

are linearly independent.

In what follows, we assume that all extreme points of the set \(\mathcal{F}_P\) are nondegenerate. Moreover, if the point X is irregular, then the rank r of X satisfies the condition \(r_\triangle > m -r\). In this case we say that problem (1) is quasi-regular.

4 Pivoting at regular point

Let a starting extreme point \(X_0\in \mathcal{F}_P\) be given, and let the sequence of extreme points \(\{X_k\}\) be generated in such a way that the values of the objective function in problem (1) decrease monotonically from iteration to iteration.

Assume that \(X\in \mathcal{F}_P\) is the current extreme point. Assume also that the rank of X is equal to \(r< n\), and that the spectral decomposition (5) is valid for X. Then (5) can be rewritten in the following form

$$\begin{aligned} X = Q_B D(\eta _B) Q_B^T. \end{aligned}$$
(8)

Here \(\eta _B = [\eta ^1; \ldots ; \eta ^r] > 0_r\), and, as before, \(Q_B\) is the submatrix of the orthogonal matrix Q consisting of the first r columns of Q. For simplicity, we assume in this section that X is a regular extreme point.

If X is not an optimal solution, it is desirable to pass to a new extreme point with a smaller value of the objective function. Let us describe this passage using the optimality conditions (3). The idea of updating the point X is similar to the one used in the simplex method for linear programming. First of all, we determine from (3) the dual variable \(u\in {\mathbb R}^m\) and calculate the weak dual variable \(V= V(u)\).

It follows from properties of the matrix trace that

$$\begin{aligned} X \bullet V = {\mathrm{tr\,}}\left( Q_BD(\eta _B) Q_B^T V \right) = {\mathrm{tr\,}}\left( D(\eta _B) V^{Q_B}\right) = D(\eta _B) \bullet V^{Q_B}, \end{aligned}$$

where \(V^{Q_B} = Q_B^T V Q_B\). Therefore, the equality \(X \bullet V = 0\) is certainly satisfied for any matrix V such that \(V^{Q_B} = 0_{rr}\).

Denote by \(\mathcal{A}^{Q_B}_\mathrm{svec}\) the \( m\times r_\triangle \) matrix, whose rows are the vectors \(\text{ svec }(A_i^{Q_B})\), \(1\le i\le m\). Then the equality \(V^{Q_B} = 0_{rr}\) is reduced to the following system of linear equations with respect to the vector u:

$$\begin{aligned} \text{ svec }V^{Q_B} = \text{ svec }C^{Q_B} - (\mathcal{A}^{Q_B}_\mathrm{svec})^T u = 0_{r_\triangle }, \end{aligned}$$
(9)

where \(C^{Q_B} = Q_B^T C Q_B\).

Since by assumption the extreme point X is regular, (9) is a system of m equations with respect to m unknowns. Moreover, according to (7) the square matrix \(\mathcal{A}^{Q_B}_\mathrm{svec}\) is nonsingular. Solving the system (9), we obtain

$$\begin{aligned} u = \left( \mathcal{A}^{Q_B}_\mathrm{svec}\right) ^{-T}\text{ svec }C^{Q_B}, \end{aligned}$$
(10)

where the notation \(M^{-T}\) is used instead of \((M^T)^{-1}\). In the case when the matrix \(V = V(u)\) is positive semidefinite, X and [u, V] are solutions of problems (1) and (2), respectively.
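
For the regular case, the computation of u from (10) and of the weak dual variable V(u) may be sketched as follows. This is an illustrative fragment: dual_estimate and A_list are names introduced here, and svec repeats the helper from the earlier sketches.

```python
import numpy as np

def svec(M):  # as in the earlier sketch
    n = M.shape[0]
    return np.concatenate([np.concatenate(([M[j, j]], np.sqrt(2.0) * M[j + 1:, j]))
                           for j in range(n)])

def dual_estimate(C, A_list, Q_B):
    """Regular case r_triangle = m: u from (10) and the weak dual variable V(u)."""
    A_svec_QB = np.array([svec(Q_B.T @ A @ Q_B) for A in A_list])   # square basic matrix
    u = np.linalg.solve(A_svec_QB.T, svec(Q_B.T @ C @ Q_B))          # u = (A^{Q_B})^{-T} svec C^{Q_B}
    V = C - sum(ui * Ai for ui, Ai in zip(u, A_list))                 # V(u) = C - sum u^i A_i
    return u, V
```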

Assume further that V is not a positive semidefinite matrix. Observe also that the matrix V is similar to the matrix \(V^Q = Q^T V Q\), and thus its eigenvalues coincide with the eigenvalues of \(V^Q\). But \(V^Q\) is a bordered matrix, since its upper left block is zero. Therefore, if the off-diagonal blocks of \(V^Q\) are non-zero, then among the eigenvalues of \(V^Q\) there is at least one negative eigenvalue [16].

Consider the spectral decomposition \( V = HD(\theta )H^T\) of the matrix V, where H is an orthogonal matrix and \(\theta \) is the vector of eigenvalues of V. Denote by \(h_j\) the jth column of the matrix H (an eigenvector of V). Then V can also be written in the form

$$\begin{aligned} V = \sum _{j=1}^n \theta ^j h_jh_j^T. \end{aligned}$$

Let \(\theta ^{k}\) be a negative eigenvalue, and let \(h_{k}\) be the corresponding eigenvector. Then \(V^{h_k} = h_{k}^T V h_{k} = \theta ^{k} < 0\) or, in a somewhat different representation,

$$\begin{aligned} V^{h_{k}} = C^{h_k} - \langle u, \mathcal{A}^{h_k} \rangle = \theta ^{k} < 0. \end{aligned}$$
(11)

Here we introduce the value \(C^{h_k} = h_k^T C h_k\) and the m-dimensional vector \(\mathcal{A}^{h_k}\) with components \(h_k^T A_i h_k\), \(1\le i\le m\).

Proposition 1

The vector \(h_k\) does not belong to the subspace \(\mathcal{R}(Q_B)\) generated by the columns of the matrix \(Q_B\).

Proof

Indeed, if we assume that \(h_k = Q_B z\) for some nonzero vector \(z\in {\mathbb R}^r\), then we would have \(Vh_k = VQ_B z = \theta ^k Q_B z\). Multiplying this equality on the left by the matrix \(Q_B^T\), we get \(V^{Q_B} z = \theta ^k z\), which is impossible, because \(V^{Q_B}\) is the zero matrix. The assertion is proven. \(\square \)

Using the rank one matrix \(h_kh_k^T\), we pass to the new point \(\bar{X}\) by setting

$$\begin{aligned} \bar{X} = X + \alpha \varDelta X, \quad \varDelta X = Q_B \varDelta Z Q_B^T+ h_{k}h_{k}^T, \end{aligned}$$
(12)

where \(\varDelta Z \in {\mathbb S}^r\), and \(\alpha > 0\).

We select the matrix \(\varDelta Z\) in such a way that

$$\begin{aligned} A_i \bullet Q_B \varDelta Z Q_B^T + A_i \bullet \ h_kh_k^T = 0, \quad 1\le i\le m. \end{aligned}$$
(13)

Then the updated point \(\bar{X}\) satisfies all equality constraints in problem (1).

Since matrices \(M_1M_2\) and \(M_2M_1\) have the same trace, the equalities (13) can be rewritten as

$$\begin{aligned} \left( A_i^{Q_B}\right) \bullet \varDelta Z + h_{k}^T A_i h_{k} = 0, \quad 1\le i\le m. \end{aligned}$$
(14)

If we replace matrices by their vector representations, then this system takes the form

$$\begin{aligned} \mathcal{A}^{Q_B}_\mathrm{svec} \ \text{ svec }\varDelta Z + \mathcal{A}^{h_{k}} = 0_m. \end{aligned}$$
(15)

Solving the system (15), we obtain

$$\begin{aligned} \text{ svec }\varDelta Z = - \left( \mathcal{A}^{Q_{B}}_\mathrm{svec}\right) ^{-1}\mathcal{A}^{h_{k}}. \end{aligned}$$
(16)

Let us compute \(C \bullet \varDelta X\). According to (12) the equality

$$\begin{aligned} C \bullet \varDelta X = C^{Q_B} \bullet \varDelta Z + C^{h_{k}} \end{aligned}$$

holds. Then, taking into account (11), we get after substitution of the vector \(\text{ svec }\varDelta Z\) from (16)

$$\begin{aligned} C \bullet \varDelta X&= \left\langle \text{ svec }C^{Q_B}, \text{ svec }\varDelta Z \right\rangle + C^{h_{k}} = - \left\langle \text{ svec }C^{Q_B}, \left( \mathcal{A}^{Q_B}_\mathrm{svec}\right) ^{-1}\mathcal{A}^{h_{k}} \right\rangle + C^{h_{k}} \nonumber \\&= - \,\left\langle \left( \mathcal{A}^{Q_B}_\mathrm{svec}\right) ^{-T} \text{ svec }C^{Q_B}, \mathcal{A}^{h_{k}} \right\rangle + C^{h_{k}} = C^{h_{k}} -\langle u, \mathcal{A}^{h_{k}}\rangle = \theta ^{k} < 0. \end{aligned}$$
(17)

Hence, the objective function decreases along the direction \(\varDelta X\). Therefore, in this case the matrix \(\mathcal{A}^{Q_B}_\mathrm{svec}\) plays the role of the basic matrix. Moreover, the matrix \(Q_B\) and the vector \(\eta _B\) can be treated as the basic pair of variables, i.e. the basic set consisting of the eigenvectors and eigenvalues.
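
A sketch of assembling the search direction from (16) and (12), given a negative eigenvalue \(\theta ^{k}\) of V(u) with eigenvector \(h_{k}\), is shown below. It is illustrative only: smat denotes the inverse of svec and, like pivot_direction, is a name introduced for this example.

```python
import numpy as np

def svec(M):  # as in the earlier sketch
    n = M.shape[0]
    return np.concatenate([np.concatenate(([M[j, j]], np.sqrt(2.0) * M[j + 1:, j]))
                           for j in range(n)])

def smat(v, r):
    """Inverse of svec for an r x r symmetric matrix."""
    M = np.zeros((r, r))
    idx = 0
    for j in range(r):
        M[j, j] = v[idx]
        M[j + 1:, j] = v[idx + 1:idx + r - j] / np.sqrt(2.0)
        M[j, j + 1:] = M[j + 1:, j]
        idx += r - j
    return M

def pivot_direction(A_list, Q_B, h_k):
    """Regular case: Delta Z from (16) and Delta X from (12)."""
    r = Q_B.shape[1]
    A_svec_QB = np.array([svec(Q_B.T @ A @ Q_B) for A in A_list])   # basic matrix
    A_hk = np.array([h_k @ A @ h_k for A in A_list])                 # vector A^{h_k}
    dZ = smat(np.linalg.solve(A_svec_QB, -A_hk), r)                  # svec dZ = -(A^{Q_B})^{-1} A^{h_k}
    dX = Q_B @ dZ @ Q_B.T + np.outer(h_k, h_k)
    return dZ, dX
```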

For passage to the new extreme point (the updated basic pair) we must determine the step-size \(\alpha > 0\). Denote \(\bar{X}(\alpha ) = X + \alpha \varDelta X\). Then the following formula

$$\begin{aligned} \bar{X}(\alpha ) = X + \alpha \varDelta X = Q_B\left[ D(\eta _B) + \alpha \varDelta Z \right] Q_B^T + \alpha h_kh_k^T \end{aligned}$$

is valid. Since the rank one matrix \(h_kh_k^T\) is positive semidefinite, the matrix \(\bar{X}(\alpha )\) remains positive semidefinite for \(\alpha \) sufficiently small. Therefore, the maximal possible step-size \(\bar{\alpha }\) can be found from the condition that a negative eigenvalue appears for the first time among the eigenvalues of \(M(\alpha ) = D(\eta _B)+ \alpha \varDelta Z\).

Let P be a nonsingular matrix of order r by means of which both symmetric matrices \(D(\eta _B)\) and \(\varDelta Z\) are simultaneously reduced to diagonal form, that is, \(D(\eta _B) = PP^T\), \(\varDelta Z = P D(\lambda ) P^T\). Then \(M(\alpha ) = P[D(\bar{e}) + \alpha D(\lambda ) ] P^T\), where \(\bar{e}\) is the r-dimensional vector with all components equal to one. It can be seen from here that the choice of \(\bar{\alpha }\) depends on the negative component of the vector \(\lambda \) with maximal absolute value. Suppose that it is the component \(\lambda _*\). Then \(\bar{\alpha }= - \lambda _*^{-1}\). If \(\lambda \ge 0_r\), problem (1) has no solution, since \(\bar{X}(\alpha ) \in \mathcal{F}_P\) for any \(\alpha > 0\) and, due to (17), \( C \bullet \bar{X}(\alpha ) \rightarrow - \infty \) as \(\alpha \rightarrow + \infty \).
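
The step-size rule of this paragraph may be sketched as follows (an illustration; here P is taken as \(D(\eta _B)^{1/2}\), so that \(\lambda \) consists of the eigenvalues of \(D(\eta _B)^{-1/2}\varDelta Z D(\eta _B)^{-1/2}\)).

```python
import numpy as np

def max_step(eta_B, dZ):
    """Largest alpha with D(eta_B) + alpha*dZ positive semidefinite,
    via the congruence D(eta_B) = P P^T, dZ = P D(lambda) P^T with P = D(eta_B)^{1/2}."""
    P_inv = np.diag(1.0 / np.sqrt(eta_B))
    lam_min = np.linalg.eigvalsh(P_inv @ dZ @ P_inv).min()
    if lam_min >= 0:
        return np.inf       # no negative component: problem (1) is unbounded below
    return -1.0 / lam_min   # alpha_bar = -1 / lambda_*
```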

Let the face \(\mathcal{G}^*_\mathrm{min}(X; {\mathbb S}^n_+)\) be dual to the minimal face \(\mathcal{G}_\mathrm{min}(X; {\mathbb S}^n_+)\). If the eigenvector \(h_k\) of the matrix V(u) belongs to \(\mathcal{G}^*_\mathrm{min}(X; {\mathbb S}^n_+)\), then the equality \(Q_B^T h_k = 0_r\) holds, and the vector \(h_k\) is in fact a column of the matrix \(Q_N\). Hence, in this case we exclude some vector from the basic set of eigenvectors (the columns of the matrix \(Q_B\)) and introduce in its place a new eigenvector of X, which is a column of \(Q_N\).

5 Pivoting at irregular point

Assume now that the extreme point \(X\in \mathcal{F}_P\) is irregular, that is \(r_\triangle < m\), and assume also for definiteness that \(m = r_\triangle + p\), where \(p< r\). In this case the system of equations (9) is underdetermined and, thus, can have a whole set of solutions.

Among all possible solutions, let us take the normal solution, i.e. the one with minimal norm:

$$\begin{aligned} u = \left( \mathcal{A}^{Q_B}_\mathrm{svec}\right) \left[ \left( \mathcal{A}^{Q_B}_\mathrm{svec} \right) ^T\left( \mathcal{A}^{Q_B}_\mathrm{svec}\right) \right] ^{-1}\text{ svec }C^{Q_B}. \end{aligned}$$
(18)

Here by (7) the matrix \(\mathcal{A}^{Q_B}_\mathrm{svec}\) has full rank equal to \(r_\triangle \).

Again, we compute the weak dual variable \(V = V(u)\). Let \(\theta ^k\) be a negative eigenvalue of the matrix V, and let \(h_k\) be the corresponding eigenvector; it is a column of the orthogonal matrix H. As before, the vector \(h_k\) does not belong to the subspace \(\mathcal{R}(Q_B)\).

Now formula (12) for the search direction \(\varDelta X\) is not applicable, since system (15) for finding the vector \(\text{ svec }\varDelta Z\) is overdetermined. This system has a solution only if the vector \(\mathcal{A}^{h_k}\) belongs to the column space of the matrix \(\mathcal{A}^{Q_B}_\mathrm{svec}\).

In order to eliminate this drawback, we alter the approach for choosing the matrix \(\varDelta X\). Namely, we now define \(\varDelta X\) in the form

$$\begin{aligned} \varDelta X = \left[ Q_B \, h_{k} \right] \left[ \begin{array}{cc} \varDelta Z &{}\quad w \\ w^T &{}\quad 1 \end{array}\right] \left[ Q_B \, h_{k} \right] ^T, \end{aligned}$$
(19)

where \(\varDelta Z\in {\mathbb S}^r\) and \(w\in {\mathbb R}^r\). Observe that this direction \(\varDelta X\) coincides with the direction (12), when \(w = 0_r\).

Let us take

$$\begin{aligned} w = \frac{1}{2}\tilde{W} y, \quad \tilde{W} = \left[ \tilde{w}_1, \ldots , \ \tilde{w}_p \right] , \end{aligned}$$
(20)

where all columns \(\tilde{w}_j\in {\mathbb R}^r\), \(1\le j\le p\), of the matrix \(\tilde{W}\) are linearly independent, and \(y\in {\mathbb R}^p\). Furthermore, we require that the vectors \(q_{w_j} = Q_B \tilde{w}_j\), \(1\le j\le p\), be orthogonal to the vector \(h_k\), i.e.

$$\begin{aligned} \langle h_k, q_{w_j} \rangle = \left\langle Q_B^T h_k, \tilde{w}_j \right\rangle = 0, \quad 1\le j\le p. \end{aligned}$$
(21)

All vectors \(q_{w_j}\), \(1\le j\le p\), belong to the subspace \(\mathcal{R}(Q_B)\). The vector \(q_y = Q_B\tilde{W}y\) also belongs to this subspace, and according to (21) \(h_k^T q_y = 0\).

Now, instead of (14), we have

$$\begin{aligned} A_i^{Q_B} \bullet \varDelta Z + 2 \left\langle Q_B^T A_i h_k, w \right\rangle + h_k^T A_i h_k = 0, \quad 1\le i\le m, \end{aligned}$$

or after substitution of the vector w from (20)

$$\begin{aligned} A_i^{Q_B} \bullet \varDelta Z + \left\langle Q_B^T A_i h_k, \tilde{W} y \right\rangle + h_k^T A_i h_k = 0, \quad 1\le i\le m. \end{aligned}$$
(22)

The second term in (22) can be rewritten also as \( \left\langle Q_B^T A_i h_k, \tilde{W} y \right\rangle = \langle h_k, A_i Q_B\tilde{W} y \rangle . \)

Let \(\mathcal{B}\) be the \(m\times p\) matrix with (i, j) element equal to \(h_k^TA_i q_{w_j}\), \(1\le i\le m\), \(1\le j\le p\), that is

$$\begin{aligned} \mathcal{B}= \left[ \begin{array}{ccc} h_k^T A_1Q_B\tilde{w}_1 &{}\quad \ldots &{}\quad h_k^T A_1Q_B\tilde{w}_p \\ &{}\quad \vdots &{} \quad \\ h_k^T A_m Q_B\tilde{w}_1 &{} \quad \ldots &{} \quad h_k^T A_m Q_B\tilde{w}_p \end{array} \right] . \end{aligned}$$

Note that \(\mathcal{B}y = \left[ A_1^{h_k q_y}; \ldots ; A_m^{h_k q_y} \right] \), where \(A_i^{h_k q_y} = h_k^T A_i q_y\), \(1\le i\le m\).

Putting together all equations (22), we obtain

$$\begin{aligned} \mathcal{A}^{Q_B}_\mathrm{svec} \text{ svec }\varDelta Z + \mathcal{B}y + \mathcal{A}^{h_k} = 0_m. \end{aligned}$$
(23)

The system (23) is a system of m equations with respect to m unknowns, namely, components of the vectors \(\text{ svec }\varDelta Z \in {\mathbb R}^{r_\triangle }\) and \(y\in {\mathbb R}^p\).

We take the \((p\times m)\) matrix \(\tilde{U}\) whose rows are linearly independent vectors \(\tilde{u}_1, \ldots , \tilde{u}_p\) from the null-space of the matrix \((\mathcal{A}^{Q_B}_\mathrm{svec})^T\). Further, we compose from \((\mathcal{A}^{Q_B}_\mathrm{svec})^T\) and \(\tilde{U}\) the square matrix \(\varOmega = \big [ (\mathcal{A}^{Q_B}_\mathrm{svec})^T; \tilde{U}\big ]\) of order m. The matrix \(\varOmega \) is non-singular, and its rows generate the space \({\mathbb R}^m\).

Let \(\varPhi \) denote the square non-singular matrix \( \varPhi = (\mathcal{A}^{Q_B}_\mathrm{svec})^T \ \mathcal{A}^{Q_B}_\mathrm{svec} \) of order \(r_\triangle \). After multiplying the system (23) from the left by the matrix \(\varOmega \), we get an equivalent system which decomposes into two subsystems

$$\begin{aligned} \varPhi \, \text{ svec }\varDelta Z + (\mathcal{A}^{Q_B}_\mathrm{svec})^T \left[ \mathcal{B}y + \mathcal{A}^{h_k} \right] = 0_{r_\triangle }, \qquad \tilde{U} \left[ \mathcal{B}y + \mathcal{A}^{h_k} \right] = 0_p. \end{aligned}$$
(24)

From the first subsystem in (24) we find

$$\begin{aligned} \text{ svec }\varDelta Z = - \varPhi ^{-1} (\mathcal{A}^{Q_B}_\mathrm{svec})^T \left[ \mathcal{B}y + \mathcal{A}^{h_k} \right] . \end{aligned}$$
(25)

Proposition 2

Let \(\mathcal{R}(\mathcal{B})\) and \(\mathcal{R}(\mathcal{A}^{Q_B}_\mathrm{svec})\) be the column spaces of the matrices \(\mathcal{B}\) and \(\mathcal{A}^{Q_B}_\mathrm{svec}\), respectively. Let also \(\mathcal{R}(\mathcal{B}) \cap \mathcal{R}(\mathcal{A}^{Q_B}_\mathrm{svec}) = \{0_m\} \). Then the matrix \(\tilde{U} \mathcal{B}\) is nonsingular.

Proof

We will show that the homogeneous system of equations \(\tilde{U} \mathcal{B}y = 0_p\) has only the trivial solution \(y = 0_p\). By contradiction, assume that this is not the case. Then there exists a nonzero vector \(y\in {\mathbb R}^p\) satisfying this system. Since \(\mathcal{B}\) is a full-rank matrix, the nonzero vector \(z = \mathcal{B}y\) must belong to the null-space of the matrix \(\tilde{U}\), which coincides with the column space of the matrix \(\mathcal{A}^{Q_B}_\mathrm{svec}\). Thus \(z \in \mathcal{R}(\mathcal{B}) \cap \mathcal{R}(\mathcal{A}^{Q_B}_\mathrm{svec})\), which contradicts the assumption of the proposition. The assertion is proven. \(\square \)

Solving the second subsystem in (24) and taking Proposition 2 into consideration, we get

$$\begin{aligned} y = - \left( \tilde{U} \mathcal{B}\right) ^{-1} \tilde{U} \mathcal{A}^{h_k}. \end{aligned}$$
(26)

Denote \(\tilde{\mathcal{B}}= \mathcal{B}\left( \tilde{U} \mathcal{B}\right) ^{-1} \tilde{U}\). Then after substituting the vector y into expression (25) for \(\text{ svec }\varDelta Z\) we get

$$\begin{aligned} \text{ svec }\varDelta Z = \varPhi ^{-1} (\mathcal{A}^{Q_B}_\mathrm{svec})^T \left[ \tilde{\mathcal{B}}- I_m \right] \mathcal{A}^{h_k}. \end{aligned}$$
(27)
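
A sketch of the computation of y from (26) and \(\text{ svec }\varDelta Z\) from (25)–(27) in the irregular case is given below. It is illustrative only: svec repeats the helper from the earlier sketches, W_tilde collects the columns \(\tilde{w}_j\) satisfying (21), and SciPy's null_space is used to obtain \(\tilde{U}\).

```python
import numpy as np
from scipy.linalg import null_space

def svec(M):  # as in the earlier sketch
    n = M.shape[0]
    return np.concatenate([np.concatenate(([M[j, j]], np.sqrt(2.0) * M[j + 1:, j]))
                           for j in range(n)])

def irregular_direction(A_list, Q_B, h_k, W_tilde):
    """Irregular case m = r_triangle + p: y from (26) and svec(Delta Z) from (25)."""
    A_svec_QB = np.array([svec(Q_B.T @ A @ Q_B) for A in A_list])        # m x r_triangle
    A_hk = np.array([h_k @ A @ h_k for A in A_list])
    Qw = Q_B @ W_tilde                                                    # columns q_{w_j}
    B = np.array([[h_k @ A @ Qw[:, j] for j in range(Qw.shape[1])] for A in A_list])
    U_tilde = null_space(A_svec_QB.T).T                                   # rows span null((A^{Q_B})^T)
    y = -np.linalg.solve(U_tilde @ B, U_tilde @ A_hk)                     # (26)
    Phi = A_svec_QB.T @ A_svec_QB
    svec_dZ = -np.linalg.solve(Phi, A_svec_QB.T @ (B @ y + A_hk))         # (25), equivalently (27)
    return y, svec_dZ
```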

Let us now compute the derivative of the objective function along \(\varDelta X\).

Proposition 3

The matrix \(\varDelta X\) is a descent direction for the objective function in problem (1). Moreover, \(C \bullet \varDelta X = \theta ^k\).

Proof

We have

$$\begin{aligned} C \bullet \varDelta X = \left[ \begin{array}{cc} Q_B^T C Q_B &{} Q_B^T C h_k \\ h_k^T C Q_B &{} h_k^T C h_k \end{array} \right] \bullet \left[ \begin{array}{cc} \varDelta Z &{} w \\ w^T &{} 1 \end{array} \right] . \end{aligned}$$

Hence,

$$\begin{aligned} C \bullet \varDelta X = \langle \text{ svec }C^{Q_B}, \text{ svec }\varDelta Z \rangle + 2 \langle C^{Q_B h_k}, w \rangle + C^{h_k}, \end{aligned}$$
(28)

where \(C^{Q_B h_k} = Q_B^T Ch_k\).

We compute separately the first and the second terms on the right-hand side of (28). For the first term we obtain

$$\begin{aligned} \langle \text{ svec }C^{Q_B}, \text{ svec }\varDelta Z \rangle= & {} \langle \text{ svec }C^{Q_B}, \varPhi ^{-1} (\mathcal{A}^{Q_B}_\mathrm{svec})^T \left[ \tilde{\mathcal{B}}- I_m\right] \mathcal{A}^{h_k} \rangle \\= & {} \langle \mathcal{A}^{Q_B}_\mathrm{svec} \varPhi ^{-1} \text{ svec }C^{Q_B}, \left[ \tilde{\mathcal{B}}- I_m\right] \mathcal{A}^{h_k} \rangle \\= & {} \langle u, \left[ \tilde{\mathcal{B}}- I_m\right] \mathcal{A}^{h_k} \rangle = \langle u, \tilde{\mathcal{B}}\mathcal{A}^{h_k} \rangle - \langle u, \mathcal{A}^{h_k} \rangle . \end{aligned}$$

Moreover, due to (26) \( \langle u, \tilde{\mathcal{B}}\mathcal{A}^{h_k} \rangle = - \langle u, \mathcal{B}y \rangle = -\sum _{i=1}^m u^i A_i^{h_k q_y}. \)

For the second term in (28) the following equality

$$\begin{aligned} 2 \langle C^{Q_B h_k}, w \rangle = \langle C^{Q_B h_k}, \tilde{W} y \rangle = \langle Ch_k, Q_B \tilde{W} y \rangle = \langle h_k, Cq_y \rangle = C^{h_k q_y}, \end{aligned}$$

holds, where \(C^{h_k q_y} = h_k^T C q_y\).

According to (11) \( C^{h_k} - \langle u, \mathcal{A}^{h_k} \rangle = h_k^T V h_k = \theta ^k. \) Taking also into account that \(Vh_k = \theta ^k h_k\) and that the vector \(q_y = Q_B\tilde{W} y\) is orthogonal to the vector \(h_k\), we obtain \( C^{h_k q_y} - \sum _{i=1}^m u^i A_i^{h_k q_y} = h_k^T V q_y = \theta ^k h_k^T q_y = 0. \) Therefore,

$$\begin{aligned} C \bullet \varDelta X = \theta ^k < 0. \end{aligned}$$

Thus, the objective function in problem (1) decreases along the direction \(\varDelta X\). The assertion is proven. \(\square \)

The choice of the maximal possible step-size \(\alpha \) is carried out similarly to the regular case. The matrix X is positive semidefinite, and it follows from (19) that the symmetric matrix \(\bar{X}(\alpha ) = X + \alpha \varDelta X\) remains positive semidefinite for \(\alpha \) sufficiently small. The maximal possible \(\bar{\alpha }\) can be found from the condition that this matrix preserves its positive semidefiniteness. Therefore, \(\bar{\alpha }\) must be the minimal positive number such that

$$\begin{aligned} \det \left[ \begin{array}{cc} D(\eta _B) + \alpha \varDelta Z &{}\quad \alpha w \\ \alpha w^T &{} \quad \alpha \end{array} \right] = 0. \end{aligned}$$

For \(\alpha > 0\) we have

$$\begin{aligned} \det \left[ \begin{array}{cc} D(\eta _B) + \alpha \varDelta Z &{}\quad \alpha w \\ \alpha w^T &{}\quad \alpha \end{array} \right] = \alpha \det \left[ D(\eta _B) + \alpha \left( \varDelta Z - ww^T \right) \right] . \end{aligned}$$

Hence, the determination of \(\bar{\alpha }\) reduces to finding the minimal positive \(\alpha \) such that \( \det \left[ D(\eta _B) + \alpha \left( \varDelta Z - ww^T \right) \right] = 0\). Both matrices \(D(\eta _B)\) and \(G = \varDelta Z - ww^T\) are symmetric, and the matrix \(D(\eta _B)\) is positive definite. Therefore, these matrices can be simultaneously reduced by a congruence transformation to diagonal form with the help of a non-singular matrix P, i.e. \( D(\eta _B) = PP^T\), \(G = P D(\lambda ) P^T\).

If among the components of the vector \(\lambda = [\lambda ^1, \ldots , \lambda ^r]^T\) there is at least one negative component, then the value \(\bar{\alpha }\) is finite and equal to \( \bar{\alpha }= -\lambda _*^{-1}\), where \(\lambda _*\) is the negative component of the vector \(\lambda \) with maximal absolute value. Otherwise, problem (1) has no solution.
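
By the determinant identity above, the step-size computation in the irregular case reduces to the regular rule applied to \(G = \varDelta Z - ww^T\); a small illustrative sketch (with P again taken as \(D(\eta _B)^{1/2}\)):

```python
import numpy as np

def max_step_irregular(eta_B, dZ, w):
    """Largest alpha keeping the bordered matrix in (19) positive semidefinite:
    the regular rule applied to G = dZ - w w^T."""
    G = dZ - np.outer(w, w)
    P_inv = np.diag(1.0 / np.sqrt(eta_B))
    lam_min = np.linalg.eigvalsh(P_inv @ G @ P_inv).min()
    return np.inf if lam_min >= 0 else -1.0 / lam_min
```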

Consider now the question of convergence of the method. For simplicity, let us assume that at each iteration we take the negative eigenvalue \(\theta ^k\) whose absolute value is maximal among all negative eigenvalues.

Theorem 1

Let problem (1) be quasi-regular. Let also the starting extreme point \(X_0\in \mathcal{F}_P\) be such that the set \( \mathcal{F}_P(X_0) = \left\{ X \in \mathcal{F}_P: \ C \bullet X \le C \bullet X_0 \right\} \) is bounded. Then either the sequence of extreme points \(\{X_k\}\subset \mathcal{F}_P(X_0)\) generated by the simplex method is finite, in which case the last extreme point is a solution of problem (1), or the sequence \(\{X_k\}\) is infinite and any limit point of \(\{X_k\}\) is a solution of (1) as well.

Proof

If the sequence \(\{X_k\}\) is finite, then the method stops at some iteration. This can occur only if all eigenvalues of the matrix \(V_k = V(u_k)\), obtained in the procedure for determining \(u_k\), are non-negative. Hence, the pair \([u_k,V_k]\) is a feasible point of the dual problem (2). But in this situation all optimality conditions (3) are fulfilled; therefore, \(X_k\) is a solution of problem (1).

Let us now consider the case where the sequence \(\{X_k\}\) is infinite. Since this sequence is bounded, it has limit points. Let \(X_{k_s} \rightarrow \bar{X}\). Because of the assumptions made and the rule for choosing the step-size \(\alpha _k\), regardless of whether the extreme point \(X_k\) of rank r is regular or irregular, the next point \(X_{k+1}\) will be an extreme point with the same rank r. Therefore, the point \(\bar{X}\) is an extreme point too.

All points \(X_{k_s}\) from the sequence \(\{X_{k_s}\}\) are such that the corresponding matrices \(Q_B\) from the factorization (8) have the same Frobenius norm, namely \(\Vert Q_B\Vert _F = \left( \text{ tr } \ Q_B^T Q_B \right) ^{1/2} = \sqrt{r}\). Consequently, these matrices \(Q_B\) belong to a compact set, and it is possible to extract from \(\{X_{k_s}\}\) a subsequence for which the corresponding matrices \(Q_B\) converge to a certain matrix \(\bar{Q}_B\) such that \(\bar{Q}_B^T \bar{Q}_B = I_r\). Without loss of generality, we assume that the sequence \(\{X_{k_s}\}\) itself possesses this property.

Consider the matrix \(( \mathcal{A}^{\bar{Q}_B}_\mathrm{svec} )^T\) entering the system (9) for finding the vector of dual variables \(\bar{u}\) at the point \(\bar{X}\). Since \(\bar{X}\) is an extreme point, this matrix has full row rank. Hence, taking into consideration the continuity of the vectors \(\text{ svec }C^{Q_B}\), we conclude that the solutions \(u_{k_s}\), defined either by formula (10) or by formula (18), converge to \(\bar{u}\).

Define the matrix \(\bar{V}= V(\bar{u})\). This matrix \(\bar{V}\) must be positive semidefinite. Indeed, otherwise \(\bar{V}\) would have a negative eigenvalue. However, the eigenvalues of a matrix depend on it Lipschitz continuously. Therefore, matrices \(V_{k_s}\) sufficiently close to \(\bar{V}\) would also have negative eigenvalues. Hence, those iterations would correspond to passages from the points \(X_{k_s}\) to the next points \(X_{k_s+1}\) with step-sizes \(\alpha _{k_s}> 0\). Moreover, the value of the objective function would decrease by \(\alpha _{k_s} |\bar{\theta }^{k_s}|\), where \(\bar{\theta }^{k_s}\) is a negative component of \(\theta _{k_s}\) with maximal absolute value. However, it follows from (16) and (27) that the matrices \(\varDelta X_{k_s}\) are bounded in norm on \(\mathcal{F}_P(X_0)\). Therefore, we would have \(C \bullet X_{k_s+1} < C\bullet \bar{X}\) at some iteration \(k_s\). In view of the monotone decrease of the values of the objective function along the trajectory, this contradicts the convergence of \(\{X_{k_s}\}\) to \( \bar{X}\). The theorem is proven.

\(\square \)

It is possible to show that, if problems (1) and (2) are nondegenerate and their solutions are strictly complementary, then the bounded set \(\mathcal{F}_P(X_0)\) from the conditions of Theorem 1 exists.

6 Initial stage of the method

Suppose that we have a point \(X \in \mathcal{F}_P\) which is not an extreme point of \(\mathcal{F}_P\). Suppose also that the rank of X is equal to \(r\le n\), and that the spectral decomposition (5) holds for X. As before, we assume that the positive eigenvalues of X are at the beginning of the vector \(\eta = [\eta _B; \eta _N]\), and \(Q_B\) is the \(n\times r\) left submatrix of the orthogonal matrix Q.

We replace the original problem (1) by its restriction to the minimal face \(\mathcal{G}_\mathrm{min}(X;{\mathbb S}^n_+)\):

$$\begin{aligned} \begin{array}{c} \min \ C\bullet X, \\ A_i \bullet X= b^i, \quad 1\le i \le m, \quad X \in \mathcal{G}_\mathrm{min}(X;{\mathbb S}^n_+). \end{array} \end{aligned}$$
(29)

Taking into account that points from \(\mathcal{G}_\mathrm{min}(X;{\mathbb S}^n_+)\) can be represented as \(X =Q_B Z Q_B^T\), where \(Z\in {\mathbb S}^r_+\), we rewrite problem (29) in the following form

$$\begin{aligned} \begin{array}{c} \min \ C^{Q_B} \bullet Z, \\ A_i^{Q_B} \bullet Z= b^i, \quad 1 \le i \le m, \qquad Z \succeq 0. \end{array} \end{aligned}$$
(30)

The optimality conditions (3) for the semidefinite programming problem (30) and its dual can also be written as

$$\begin{aligned} \begin{array}{c} Z \circ V^{Q_B} = 0, \quad V^{Q_B} = C^{Q_B} - \sum \nolimits _{i=1}^{m} u^i A_i^{Q_B}, \\ A_i^{Q_B} \bullet Z = b^i, \quad 1 \le i \le m, \qquad Z \succeq 0, \ V^{Q_B}\succeq 0, \end{array} \end{aligned}$$
(31)

where \(M_1 \circ M_2 = \left[ M_1M_2 + M_2M_1 \right] /2\) is the symmetrized product of the matrices \(M_1\) and \(M_2\). Using the well-known formula \( {\mathrm{vec\,}}M_1M_2M_3 = \left( M_3^T \otimes M_1\right) {\mathrm{vec\,}}M_2, \) where the symbol \(\otimes \) denotes the Kronecker product of matrices, we obtain that the equalities from (31) can be represented as

$$\begin{aligned} \tilde{Z}^\otimes \ \text{ svec }V^{Q_B} = 0_{r_\triangle }, \quad \mathcal{A}_\mathrm{svec}^{Q_B} \ \text{ svec }Z = b, \quad \text{ svec }\, V^{Q_B} = \text{ svec }\, C^{Q_B} - \left( \mathcal{A}_\mathrm{svec}^{Q_B} \right) ^T u. \end{aligned}$$
(32)

Here and in what follows, \(\tilde{Z}^\otimes = \tilde{\mathcal{L}}_r Z^\otimes \tilde{\mathcal{D}}_r\), where \(Z^\otimes = \frac{1}{2}\left[ Z \otimes I_r + I_r \otimes Z \right] \), so that \(Z^\otimes {\mathrm{vec\,}}M = {\mathrm{vec\,}}(Z \circ M)\) and \(\tilde{Z}^\otimes \text{ svec }M = \text{ svec }(Z \circ M)\) for any \(M \in {\mathbb S}^r\).

We substitute the vector \(\text{ svec }\, V^{Q_B}\) from the third equality in (32) into the first one. Then, after multiplying both sides of the resulting equality by the matrix \(\mathcal{A}_\mathrm{svec}^{Q_B}\), we obtain the equation

$$\begin{aligned} \varGamma _{Q_B}(\tilde{Z}^\otimes ) u = \mathcal{A}_\mathrm{svec}^{Q_B} \tilde{Z}^\otimes \text{ svec }C^{Q_B}, \end{aligned}$$
(33)

in which \( \varGamma _{Q_B}(\tilde{Z}^\otimes ) = \mathcal{A}_\mathrm{svec}^{Q_B} \tilde{Z}^\otimes (\mathcal{A}_\mathrm{svec}^{Q_B} )^T\). The matrix \(\varGamma _{Q_B}(\tilde{Z}^\otimes )\) is nonsingular if the matrices \(A_i^{Q_B}\), \(1\le i\le m\), are linearly independent. In the case where \(r_\triangle \ge m\), the matrix \(\varGamma _{Q_B}(\tilde{Z}^\otimes )\) is necessarily nonsingular; this follows from the assumption that the point X is nondegenerate.

Solving the system (33), we get \( u(Z) = \varGamma ^{-1}_{Q_B} (\tilde{Z}^\otimes ) \mathcal{A}_\mathrm{svec}^{Q_B} \tilde{Z}^\otimes \text{ svec }C^{Q_B}. \) Thus, \(\text{ svec }V^{Q_B} = \mathcal{P}(\tilde{Z}^\otimes ) \text{ svec }C^{Q_B} \), where \( \mathcal{P}(\tilde{Z}^\otimes ) = I_{r_\triangle } - (\mathcal{A}_\mathrm{svec}^{Q_B} )^T \varGamma ^{-1}_{Q_B} (\tilde{Z}^\otimes ) \mathcal{A}_\mathrm{svec}^{Q_B} \tilde{Z}^\otimes . \)

After substituting the vector \(\text{ svec }V^{Q_B}\) into the first equality in (31), we derive the system of nonlinear equations

$$\begin{aligned} \tilde{Z}^\otimes \, \mathcal{P}(\tilde{Z}^\otimes ) \, \text{ svec }C^{Q_B} = 0_{r_\triangle } \end{aligned}$$
(34)

with respect to the matrix Z.

Using the left-hand side of equation (34), it is possible to pass from Z to another point \(\bar{Z}\) with a smaller value of the objective function in problem (30), namely \( \bar{Z} = Z - \alpha \varDelta (Z)\), where \(\alpha > 0\) and \(\varDelta (Z)\) is the symmetric matrix for which \( \text{ svec }\varDelta (Z) = \tilde{Z}^\otimes \, \mathcal{P}(\tilde{Z}^\otimes ) \, \text{ svec }C^{Q_B}\). The direction \(\varDelta (Z)\) is such that the equalities \(A_i^{Q_B} \bullet \varDelta (Z) = 0\), \(1\le i\le m\), hold, and \(C^{Q_B} \bullet \varDelta (Z) \ge 0\).
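
A sketch of one step of this initial stage, written with the symmetrized product instead of the Kronecker notation (using the identity \(\tilde{Z}^\otimes \text{ svec }M = \text{ svec }(Z \circ M)\) noted after (32)), is given below; the names sym_prod and scaling_direction are introduced only for this illustration.

```python
import numpy as np

def sym_prod(M1, M2):
    """Symmetrized product M1 o M2 = (M1 M2 + M2 M1) / 2."""
    return (M1 @ M2 + M2 @ M1) / 2.0

def scaling_direction(C_QB, A_QB_list, Z):
    """Direction Delta(Z): solve (33) for u, form V^{Q_B}(u) and return Z o V^{Q_B}."""
    Gamma = np.array([[np.trace(Ai @ sym_prod(Z, Aj)) for Aj in A_QB_list]
                      for Ai in A_QB_list])                     # Gamma_{Q_B}(Z~)
    rhs = np.array([np.trace(Ai @ sym_prod(Z, C_QB)) for Ai in A_QB_list])
    u = np.linalg.solve(Gamma, rhs)                              # system (33)
    V_QB = C_QB - sum(ui * Ai for ui, Ai in zip(u, A_QB_list))
    return sym_prod(Z, V_QB)                                     # svec of this is Z~ P(Z~) svec C^{Q_B}
```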

Denote \( \bar{X} (\alpha ) = Q_B \left[ D(\eta _B) - \alpha \varDelta (Z) \right] Q_B^T. \) Since the matrix \(D(\eta _B)\) is positive definite, the matrix \(D(\eta _B) - \alpha \varDelta (Z)\) also remains positive definite for sufficiently small \(\alpha \). The maximal possible step-size \(\alpha _*\) can be found from the condition that some eigenvalue of the matrix \(D(\eta _B) - \alpha \varDelta (Z)\) becomes zero for the first time. In this case, the feasible point \(\bar{X}(\alpha _*)\) is such that its rank is less than the rank r of the matrix X. If \(\bar{X}(\alpha _*)\) is not an extreme point, the procedure is repeated.

7 Conclusion

We have proposed a numerical simplex-type method for linear semidefinite programming problems. The method can be regarded as a specific way of solving the system of optimality conditions (3). There is also a dual simplex-type method for solving linear semidefinite programming problems [19], in which another approach is used for solving the system of optimality conditions. The pivoting procedures of both methods are quite analogous to the procedures used in the primal and dual simplex methods for linear programming. Simplex-type methods for semidefinite and second-order cone programming are particularly important in the case when one must solve a sequence of problems whose structures are similar to one another.