In semidefinite optimization one investigates nonlinear optimization problems in finite dimensions with a constraint requiring that a certain matrix-valued function is negative semidefinite. This type of problem arises in convex optimization, approximation theory, control theory, combinatorial optimization and engineering. In system and control theory so-called linear matrix inequalities (LMIs) and extensions like bilinear matrix inequalities (BMIs) fit into this class of constraints. Our investigations include various partial orderings for the description of the matrix constraint, and in this way we extend the standard semidefinite case to other types of constraints. We apply the theory of optimality conditions developed in Chap. 5 and the duality theory of Chap. 6 to these extended semidefinite optimization problems.

7.1 Löwner Ordering Cone and Extensions

In so-called semidefinite optimization one investigates finite dimensional optimization problems with an inequality constraint with respect to a special matrix space. To be more specific, let \(\mathcal {S}^{n}\) denote the real linear space of symmetric (n, n)-matrices. It is obvious that this space is a finite dimensional Hilbert space with the scalar product 〈⋅, ⋅〉 defined by

$$\displaystyle \begin{aligned} \langle A,B\rangle =\mbox{trace}(A\cdot B)\ \ \mbox{for all}\ \ A,B\in\mathcal{S}^{n}. \end{aligned} $$
(7.1)
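As a small illustration (with symmetric matrices chosen here purely for demonstration), for

$$\displaystyle \begin{aligned} A=\left(\begin{array}{cc} 1 & 2\\ 2 & 3\end{array}\right),\qquad B=\left(\begin{array}{cc} 0 & 1\\ 1 & 1\end{array}\right) \end{aligned}$$

we get \(A\cdot B=\left(\begin{array}{cc} 2 & 3\\ 3 & 5\end{array}\right)\) and therefore \(\langle A,B\rangle =2+5=7\); for symmetric matrices this value coincides with the elementwise sum \(\sum _{i,j=1}^{n}A_{ij}B_{ij}\).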

Recall that the trace of a matrix is defined as the sum of all diagonal elements of the matrix. Let C be a convex cone in \(\mathcal {S}^{n}\) inducing a partial ordering \(\preccurlyeq \). Then we consider a matrix function \(G:\mathbb {R}^{m} \rightarrow \mathcal {S}^{n}\) defining the inequality constraint

$$\displaystyle \begin{aligned} G(x)\preccurlyeq 0_{\mathcal{S}^{n}}. \end{aligned} $$
(7.2)

If \(f:\mathbb {R}^{m}\rightarrow \mathbb {R}\) denotes a given objective function, then we obtain the conic optimization problem

$$\displaystyle \begin{aligned} \begin{array}{c} \min f(x)\\ \mbox{subject to the constraints}\\ G(x)\preccurlyeq 0_{\mathcal{S}^{n}}\\ x\in\mathbb{R}^{m}. \end{array} \end{aligned} $$
(7.3)

The name of this problem comes from the fact that the matrix inequality has to be interpreted using the ordering cone C. Obviously, the theory developed in this book is fully applicable to this problem structure.

In the special literature one often investigates problems of the form

$$\displaystyle \begin{aligned} \begin{array}{c} \min \hat{f}(X)\\ \mbox{subject to the constraints}\\ \hat{G}(X)\preccurlyeq 0_{\mathcal{S}^{n}}\\ X\in\mathcal{S}^{p} \end{array} \end{aligned} $$
(7.4)

with given functions \(\hat {f}:\mathcal {S}^{p}\rightarrow \mathbb {R}\) and \(\hat {G}:\mathcal {S}^{p}\rightarrow \mathcal {S}^{n}\). In this case the matrix \(X\in \mathcal {S}^{p}\) can be transformed to a vector \(x\in \mathbb {R}^{p\cdot p}\) by stacking the columns of X on top of each other, from the first to the p-th column. Since X is symmetric, this dimension can be reduced and we obtain \(x\in \mathbb {R}^{\frac {p(p+1)}{2}}\). If φ denotes the transformation from the vector x to the matrix X, then the problem (7.4) can be written as

$$\displaystyle \begin{aligned} \begin{array}{c} \min\ (\hat{f}\circ\varphi )(x)\\ \mbox{subject to the constraints}\\ (\hat{G}\circ\varphi )(x)\preccurlyeq 0_{\mathcal{S}^{n}}\\ x\in\mathbb{R}^{\frac{p(p+1)}{2}}. \end{array} \end{aligned}$$

Hence, the optimization problem is of the form of problem (7.3) and it is not necessary to study the nonlinear optimization problem (7.4) separately.
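For illustration (with p = 2), one possible choice of this transformation (the ordering of the entries of x is not unique) is

$$\displaystyle \begin{aligned} x=(X_{11},X_{21},X_{22})^{T}\in\mathbb{R}^{3}, \qquad \varphi (x)=\left(\begin{array}{cc} x_{1} & x_{2}\\ x_{2} & x_{3}\end{array}\right)\in\mathcal{S}^{2}. \end{aligned}$$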

In practice, one works with special ordering cones for the Hilbert space \(\mathcal {S}^{n}\). The Löwner ordering cone and further cones are discussed now.

Remark 7.1 (ordering cones in \(\mathcal {S}^{n}\))

Let \(\mathcal {S}^{n}\) denote the real linear space of symmetric (n, n) matrices.

  1. (a)

    The convex cone

    $$\displaystyle \begin{aligned} \mathcal{S}^{n}_{+}:=\{ X\in \mathcal{S}^{n}\ |\ X \mbox{ is positive semidefinite}\} \end{aligned}$$

    is called the Löwner ordering cone.

    The partial ordering induced by the convex cone \(\mathcal {S}^{n}_{+}\) is also called Löwner partial ordering ≼ (notice that we use the special symbol ≼ for this partial ordering). The problem (7.3) equipped with the Löwner partial ordering is then called a semidefinite optimization problem. The name of this problem comes from the fact that the inequality constraint means that the matrix G(x) has to be negative semidefinite.

    Although the semidefinite optimization problem is only a finite dimensional problem, it is not a usual problem in \(\mathbb {R}^{m}\) because the Löwner partial ordering makes the inequality constraint complicated. In fact, the inequality (7.2) is equivalent to infinitely many inequalities of the form

    $$\displaystyle \begin{aligned} y^{T}G(x)y\leq 0 \ \ \mbox{for all}\ \ y\in\mathbb{R}^{n}. \end{aligned}$$
  2. (b)

    The K-copositive ordering cone is defined by

    $$\displaystyle \begin{aligned} C^{n}_{K}:=\{ X\in \mathcal{S}^{n}\ |\ y^{T}Xy\geq 0 \ \ \mbox{for all}\ \ y\in K \} \end{aligned}$$

    for a given convex cone \(K\subset \mathbb {R}^{n}\), i.e., we consider only matrices for which the quadratic form is nonnegative on the convex cone K. If the partial ordering induced by this convex cone is used in problem (7.3), then we speak of a K-copositive optimization problem.

    It is evident that \(\mathcal {S}^{n}_{+}\subset C^{n}_{K}\) for every convex cone K and \(\mathcal {S}^{n}_{+}= C^{n}_{\mathbb {R}^{n}}\). Therefore, we have for the dual cones \((C^{n}_{K})^{*}\subset (\mathcal {S}^{n}_{+})^{*}\).

    If K equals the positive orthant \(\mathbb {R}^{n}_{+}\), then \(C^{n}_{\mathbb {R}^{n}_{+}}\) is simply called the copositive ordering cone and the problem (7.3) is then called a copositive optimization problem (see also the example following this remark).

  3. (c)

    The nonnegative ordering cone is defined by

    $$\displaystyle \begin{aligned} N^{n}:=\{ X\in \mathcal{S}^{n}\ |\ X_{ij}\geq 0 \ \ \mbox{for all}\ \ i,j\in\{ 1,\ldots ,n\}\}. \end{aligned}$$

    In this case the optimization problem (7.3) with the partial ordering induced by the convex cone \(N^{n}\) reduces to a standard optimization problem of the form

    $$\displaystyle \begin{aligned} \begin{array}{c} \min f(x)\\ \mbox{subject to the constraints}\\ G_{ij}(x)\leq 0\ \ \mbox{for all}\ \ i,j\in\{ 1,\ldots ,n\}\\ x\in\mathbb{R}^{m}. \end{array}\end{aligned} $$

    The number of constraints can actually be reduced to \(\frac {n(n+1)}{2}\) because the matrix G(x) is assumed to be symmetric. So, such a problem can be investigated with the standard theory of nonlinear optimization in finite dimensions.

  4. (d)

    The doubly nonnegative ordering cone is defined by

    $$\displaystyle \begin{aligned} D^{n}:=\mathcal{S}^{n}_{+}\cap N^{n}=\{ X\in \mathcal{S}^{n}\ |\ X\ \mbox{is positive semidefinite and}\ X_{ij}\geq 0 \ \ \mbox{for all}\ \ i,j\in\{ 1,\ldots ,n\}\} . \end{aligned}$$

    If we use the partial ordering induced by this convex cone in the constraint (7.2), then the optimization problem (7.3) can be written as

    $$\displaystyle \begin{aligned} \begin{array}{c} \min f(x)\\ \mbox{subject to the constraints}\\ G(x)\preceq 0_{\mathcal{S}^{n}}\\ G_{ij}(x)\leq 0\ \ \mbox{for all}\ \ i,j\in\{ 1,\ldots ,n\}\\ x\in\mathbb{R}^{m}. \end{array}\end{aligned} $$

    So, we have a semidefinite optimization problem with finitely many additional nonlinear constraints. Obviously, for every convex cone K we have \(D^{n}\subset C_{K}^{n}\) and \((C_{K}^{n})^{*}\subset (D^{n})^{*}\).
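These inclusions are in general strict. For instance, for n = 2 and \(K=\mathbb {R}^{2}_{+}\) the matrix

$$\displaystyle \begin{aligned} X=\left(\begin{array}{cc} 0 & 1\\ 1 & 0\end{array}\right) \end{aligned}$$

satisfies \(y^{T}Xy=2y_{1}y_{2}\geq 0\) for all \(y\in \mathbb {R}^{2}_{+}\) and therefore belongs to \(C^{2}_{\mathbb {R}^{2}_{+}}\), but its eigenvalues are 1 and −1, so \(X\notin \mathcal {S}^{2}_{+}\). Hence the Löwner ordering cone is a proper subset of the copositive ordering cone.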

Before discussing some examples we need an important lemma on the so-called Schur complement.

Lemma 7.2 (Schur complement)

Let \(\displaystyle X=\left ( \begin {array}{cc} A & B^{T}\\ B & C\end {array}\right ) \in \mathcal {S}^{k+l}\) with \(A\in \mathcal {S}^{k}\), \(C\in \mathcal {S}^{l}\) and \(B\in \mathbb {R}^{(l,k)}\) be given, and assume that A is positive definite. Then we have for the Löwner partial ordering

$$\displaystyle \begin{aligned} -X\preceq 0_{\mathcal{S}^{k+l}} \ \ \ \Longleftrightarrow\ \ \ -(C-BA^{-1}B^{T})\preceq 0_{\mathcal{S}^{l}} \end{aligned}$$

(the matrix \(C-BA^{-1}B^{T}\) is called the Schur complement of A in X).

Proof

We have

$$\displaystyle \begin{aligned} -X\preceq 0_{\mathcal{S}^{k+l}} \ \ \ \Longleftrightarrow\ \ \ \min_{z\in\mathbb{R}^{k}}\ \big( z^{T}Az+2y^{T}Bz+y^{T}Cy\big)\geq 0 \ \ \mbox{for all}\ y\in\mathbb{R}^{l}. \end{aligned}$$
Since A is positive definite, for an arbitrarily chosen \(y\in \mathbb {R}^{l}\) this optimization problem has the minimal solution \(-A^{-1}B^{T}y\) with the minimal value

$$\displaystyle \begin{aligned} -y^{T}BA^{-1}B^{T}y+y^{T}Cy=y^{T}(C-BA^{-1}B^{T})y. \end{aligned}$$

Consequently we get

$$\displaystyle \begin{aligned} \begin{array}{rcl} -X\preceq 0_{\mathcal{S}^{k+l}} &\displaystyle \Longleftrightarrow &\displaystyle y^{T}(C-BA^{-1}B^{T})y\geq 0\ \ \mbox{for all}\ y\in\mathbb{R}^{l}\\ &\displaystyle \Longleftrightarrow &\displaystyle -(C-BA^{-1}B^{T})\preceq 0_{\mathcal{S}^{l}}. \end{array} \end{aligned} $$
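As a quick numerical illustration of Lemma 7.2, the following sketch in Python with numpy (the data below is chosen arbitrarily and is not part of the text) compares the smallest eigenvalue of X with that of the Schur complement:

```python
import numpy as np

# Arbitrary illustrative data; A must be positive definite.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])          # A in S^2, positive definite
B = np.array([[1.0, 0.0]])          # B in R^(1,2)
C = np.array([[3.0]])               # C in S^1

X = np.block([[A, B.T],
              [B, C]])              # X in S^3

schur = C - B @ np.linalg.inv(A) @ B.T   # Schur complement of A in X

# Lemma 7.2: X is positive semidefinite if and only if the Schur complement is;
# for this data both smallest eigenvalues are positive.
print(np.linalg.eigvalsh(X).min(), np.linalg.eigvalsh(schur).min())
```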

The following example illustrates the significance of semidefinite optimization.

Example 7.3 (semidefinite optimization)

  1. (a)

    The problem of determining the smallest among the largest eigenvalues of a matrix-valued function \(A:\mathbb {R}^{m}\rightarrow \mathcal {S}^{n}\) leads to the semidefinite optimization problem

    $$\displaystyle \begin{aligned}\begin{array}{c} \min\;\lambda\\ \mbox{subject to the constraints}\\ A(x)-\lambda I \preceq 0_{\mathcal{S}^{n}}\\ x\in\mathbb{R}^{m} \end{array} \end{aligned}$$

    (with the identity matrix \(I\in \mathcal {S}^{n}\) and the Löwner partial ordering ≼). Indeed, A(x) − λI is negative semidefinite if and only if for all eigenvalues \(\lambda _{1},\ldots ,\lambda _{n}\) of A(x) the inequality \(\lambda _{i}\leq \lambda \) is satisfied. Hence, with the minimization of λ we determine the smallest among the largest eigenvalues of A(x) (a small numerical sketch of this problem is given after this example).

  2. (b)

    We consider a nonlinear optimization problem with a quadratic constraint in a finite dimensional setting, i.e. we have

    $$\displaystyle \begin{aligned} \begin{array}{c} \min f(x)\\ \mbox{subject to the constraints}\\ (Ax+b)^{T}(Ax+b)-c^{T}x-\alpha\leq 0\\ x\in\mathbb{R}^{m} \end{array} \end{aligned} $$
    (7.5)

    with an objective function \(f:\mathbb {R}^{m}\rightarrow \mathbb {R}\), a given matrix \(A\in \mathbb {R}^{(k,m)}\), given vectors \(b\in \mathbb {R}^{k}\) and \(c\in \mathbb {R}^{m}\) and a real number α. If ≼ denotes again the Löwner partial ordering, we consider the inequality

    $$\displaystyle \begin{aligned} -\left(\begin{array}{cc} I & Ax+b\\ (Ax+b)^{T} & c^{T}x+\alpha \end{array} \right)\preceq 0_{\mathcal{S}^{k+1}} \end{aligned} $$
    (7.6)

    (\(I\in \mathcal {S}^{k}\) denotes the identity matrix). By Lemma 7.2 this inequality is equivalent to the quadratic constraint

    $$\displaystyle \begin{aligned} (Ax+b)^{T}(Ax+b)-c^{T}x-\alpha\leq 0. \end{aligned}$$

    If the i-th column of the matrix A (with i ∈{1, …, m}) is denoted by \(a^{(i)}\in \mathbb {R}^{k}\), then we set

    $$\displaystyle \begin{aligned} A^{(0)}:=\left(\begin{array}{cc} I & b\\ b^{T} & \alpha \end{array}\right) \end{aligned}$$

    and

    $$\displaystyle \begin{aligned} A^{(i)}:=\left(\begin{array}{cc} 0_{\mathcal{S}^{k}} & a^{(i)}\\ {a^{(i)}}^{T} & c_{i} \end{array}\right) \ \ \mbox{for all}\ i\in\{ 1,\ldots ,m\} ,\end{aligned}$$

    and the inequality (7.6) is equivalent to

    $$\displaystyle \begin{aligned} -A^{(0)}-A^{(1)}x_{1}-\dots -A^{(m)}x_{m}\preceq 0_{\mathcal{S}^{k+1}}. \end{aligned}$$

    Hence, the original problem (7.5) with a quadratic constraint can be written as a semidefinite optimization problem with a linear constraint

    $$\displaystyle \begin{aligned} \begin{array}{c} \min f(x)\\ \mbox{subject to the constraints}\\ -A^{(0)}-A^{(1)}x_{1}-\dots -A^{(m)}x_{m}\preceq 0_{\mathcal{S}^{k+1}}\\ x\in\mathbb{R}^{m}. \end{array} \end{aligned}$$

    Although the partial ordering used in the constraint becomes more complicated by this transformation, the constraint itself, which is now linear instead of quadratic, is much simpler to handle. A similar transformation can be carried out in the case that, in addition, the objective function f is also quadratic. Then we minimize an additional variable and use this variable as an upper bound of the objective function.

  3. (c)

    We consider a system of autonomous linear differential equations

    $$\displaystyle \begin{aligned} \dot{x}(t)=Ax(t)+Bu(t) \ \mbox{almost everywhere on}\ [0,\infty )\end{aligned} $$
    (7.7)

    with given matrices \(A\in \mathbb {R}^{(k,k)}\) and \(B\in \mathbb {R}^{(k,l)}\). Using a feedback control

    $$\displaystyle \begin{aligned} u(t)=Fx(t)\ \mbox{almost everywhere on}\ [0,\infty )\end{aligned} $$

    with an unknown matrix \(F\in \mathbb {R}^{(l,k)}\) we try to make the system (7.7) asymptotically stable, i.e. we require for every solution x of (7.7) that

    $$\displaystyle \begin{aligned}\lim_{t\to\infty} \| x(t)\| =0\end{aligned} $$

    for the Euclidean norm ∥⋅∥ in \(\mathbb {R}^{k}\). In control theory the autonomous linear system (7.7) is called stabilizable, if there exists a matrix \(F\in \mathbb {R}^{(l,k)}\) so that the system (7.7) is asymptotically stable.

    For the determination of an appropriate matrix F we investigate the so-called Lyapunov function \(v:\mathbb {R}^{k}\rightarrow \mathbb {R}\) with

    $$\displaystyle \begin{aligned} v(\tilde{x})=\tilde{x}^{T}P\tilde{x} \ \mbox{for all}\ \tilde{x}\in\mathbb{R}^{k}\end{aligned} $$

    (\(P\in \mathcal {S}^{k}\) is arbitrarily chosen and should be positive definite). Since P is positive definite we have

    $$\displaystyle \begin{aligned} v(\tilde{x})>0\ \mbox{for all}\ \tilde{x}\in\mathbb{R}^{k}\backslash\{0_{\mathbb{R}^{k}}\} .\end{aligned} $$
    (7.8)

    For a solution x of the system (7.7) we obtain

    $$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle {\dot{v}(x(t))}\\ &\displaystyle &\displaystyle \ = \frac{d}{dt}x(t)^{T}Px(t)\\ &\displaystyle &\displaystyle \ = \dot{x}(t)^{T}Px(t)+x(t)^{T}P\dot{x}(t)\\ &\displaystyle &\displaystyle \ = \big(Ax(t)+BFx(t)\big)^{T}Px(t)+x(t)^{T}P\big(Ax(t)+BFx(t)\big)\\ &\displaystyle &\displaystyle \ = x(t)^{T}\big( (A+BF)^{T}P+P(A+BF)\big) x(t). \end{array} \end{aligned} $$

    If the matrices P and F are chosen in such a way that \((A+BF)^{T}P+P(A+BF)\) is negative definite, then there is a positive number α with

    $$\displaystyle \begin{aligned} \dot{v}(x(t))\leq -\alpha\| x(t)\|{}^{2} \ \ \mbox{for all}\ t\in [0,\infty ). \end{aligned} $$
    (7.9)

    The inequalities (7.8) and (7.9) imply

    $$\displaystyle \begin{aligned} \lim_{t\to\infty} v(x(t))=0. \end{aligned} $$
    (7.10)

    Since P is assumed to be positive definite, there is a positive number β with

    $$\displaystyle \begin{aligned} v(\tilde{x})\geq\beta\| \tilde{x}\|{}^{2} \ \ \mbox{for all}\ \tilde{x}\in\mathbb{R}^{k}. \end{aligned}$$

    Then we conclude with (7.10)

    $$\displaystyle \begin{aligned} \lim_{t\to\infty} \| x(t)\| =0, \end{aligned}$$

    i.e. the autonomous linear system (7.7) is stabilizable. Hence, we obtain the stabilization of (7.7) by a feedback control, if we choose a positive definite matrix \(P\in \mathcal {S}^{k}\) and a matrix \(F\in \mathbb {R}^{(l,k)}\) so that \((A+BF)^{T}P+P(A+BF)\) is negative definite.

    In order to fulfill this requirement we consider the semidefinite optimization problem

    $$\displaystyle \begin{aligned} \begin{array}{c} \min\lambda\\ \mbox{subject to the constraints}\\ -\lambda I+(A+BF)^{T}P+P(A+BF)\preceq 0_{\mathcal{S}^{k}}\\ -\lambda I-P\preceq 0_{\mathcal{S}^{k}}\\ \lambda\in\mathbb{R},\ \ P\in\mathcal{S}^{k},\ \ F\in\mathbb{R}^{(l,k)} \end{array} \end{aligned} $$
    (7.11)

    (\(I\in \mathcal {S}^{k}\) denotes the identity matrix and recall that ≼ denotes the Löwner partial ordering). By a suitable transformation this problem formally fits into the class (7.3) of semidefinite problems. Here G has to be defined in an appropriate way. It is important to note that it is not necessary to solve the problem (7.11). Only a feasible solution with λ < 0 is requested. Then the matrices P and F fulfill the requirements for the stabilization of the autonomous linear system (7.7).

  4. (d)

    Finally we discuss an applied problem from structural optimization and consider a structure of k elastic bars connecting a set of p nodes (see Fig. 7.1). The design variables \(x_{i}\) (i = 1, …, k) are the cross-sectional areas of the bars. We assume that nodal load forces \(f_{1},\ldots ,f_{p}\) are given. \(l_{1},\ldots ,l_{k}\) denote the lengths of the bars, v is the maximal volume, and \(\underline{x}_{i}\) and \(\bar {x}_{i}\) are the lower and upper bounds of the cross-sectional areas. The so-called stiffness matrix \(A(x)\in \mathcal {S}^{p}\) is positive definite for all \(x_{1},\ldots ,x_{k}>0\). We want to find a feasible structure with minimal elastic stored energy. Then we obtain the optimization problem

    $$\displaystyle \begin{aligned}\begin{array}{c} \min f^{T}A(x)^{-1}f\\ \mbox{subject to the constraints}\\ l_{1}x_{1}+\ldots +l_{k}x_{k}\leq v\\ \underline{x}_{i}\leq x_{i}\leq\bar{x}_{i}\ \ \mbox{for all}\ i\in\{ 1,\ldots ,k\} \end{array}\end{aligned}$$

    (Fig. 7.1: Cantilever with seven nodes and the load force \(f_{7}\))

    or, with an additional variable \(\lambda\in\mathbb{R}\) bounding the objective function,

    $$\displaystyle \begin{aligned}\begin{array}{c} \min\ \lambda\\ \mbox{subject to the constraints}\\ f^{T}A(x)^{-1}f-\lambda\leq 0\\ l_{1}x_{1}+\ldots +l_{k}x_{k}\leq v\\ \underline{x}_{i}\leq x_{i}\leq\bar{x}_{i}\ \ \mbox{for all}\ i\in\{ 1,\ldots ,k\} . \end{array}\end{aligned}$$

    By Lemma 7.2 the inequality constraint

    $$\displaystyle \begin{aligned} f^{T}A(x)^{-1}f-\lambda\leq 0 \end{aligned}$$

    is equivalent to

    $$\displaystyle \begin{aligned} -\left(\begin{array}{cc} A(x) & f\\ f^{T} & \lambda\end{array}\right) \preceq 0_{\mathcal{S}^{p+1}} \end{aligned}$$

    (recall that ≼ denotes the Löwner partial ordering). Hence, we get a standard semidefinite optimization problem with an additional linear inequality constraint and upper and lower bounds:

    $$\displaystyle \begin{aligned}\begin{array}{c} \min\ \lambda\\ \mbox{subject to the constraints}\\ -\left(\begin{array}{cc} A(x) & f\\ f^{T} & \lambda\end{array}\right) \preceq 0_{\mathcal{S}^{p+1}}\\ l_{1}x_{1}+\ldots +l_{k}x_{k}\leq v\\ \underline{x}_{i}\leq x_{i}\leq\bar{x}_{i}\ \ \mbox{for all}\ i\in\{ 1,\ldots ,k\}\\ \lambda\in\mathbb{R},\ \ x\in\mathbb{R}^{k}. \end{array}\end{aligned}$$
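Returning to part (a) of this example, the following minimal sketch shows how such a semidefinite problem can be solved numerically. It assumes that the Python package cvxpy (together with its default SDP-capable solver) is installed; the matrices A0 and A1 are hypothetical data chosen only for illustration and are not part of the text.

```python
import cvxpy as cp
import numpy as np

# Hypothetical affine matrix function A(x) = A0 + x*A1 with m = 1 (illustrative data only).
A0 = np.array([[1.0, 0.0],
               [0.0, 3.0]])
A1 = np.array([[0.0, 1.0],
               [1.0, 0.0]])

x = cp.Variable()
lam = cp.Variable()

# lam*I - A(x) positive semidefinite, i.e. A(x) - lam*I is negative semidefinite.
constraints = [lam * np.eye(2) - (A0 + x * A1) >> 0]
problem = cp.Problem(cp.Minimize(lam), constraints)
problem.solve()

# For this data the largest eigenvalue of A(x) is 2 + sqrt(1 + x^2),
# so the optimal value is 3, attained at x = 0.
print(lam.value, x.value)
```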

Although the Löwner partial ordering is mostly used for describing the inequality constraint (7.2), we mainly investigate the more general conic optimization problem (7.3) covering the standard semidefinite problem. For the application of the general theory of this book we now investigate properties of the presented ordering cones in more detail.

Lemma 7.4 (properties of the Löwner ordering cone)

For the Löwner ordering cone \(\mathcal {S}^{n}_{+}\) we have:

  1. (a)

    \(\mathit{\mbox{int}}(\mathcal {S}^{n}_{+})= \{ X\in \mathcal {S}^{n}\ |\ X\ \mathit{\mbox{is positive definite}}\}\)

  2. (b)

    \((\mathcal {S}^{n}_{+})^{*}=\mathcal {S}^{n}_{+}\) , i.e. \(\mathcal {S}^{n}_{+}\) is self-dual.

Proof

  1. (a)

    First, we show the inclusion \(\mbox{int}(\mathcal {S}^{n}_{+})\subset \{ X\in\mathcal {S}^{n}\ |\ X\ \mbox{is positive definite}\}\). Let \(X\in \mbox{int}(\mathcal {S}^{n}_{+})\) be arbitrarily chosen. Then we get \(X-\lambda I\in \mathcal {S}^{n}_{+}\) for a sufficiently small λ > 0 (\(I\in \mathcal {S}^{n}\) denotes the identity matrix), i.e.

    $$\displaystyle \begin{aligned} 0\leq x^{T}(X-\lambda I)x=x^{T}Xx-\lambda x^{T}x \ \ \mbox{for all}\ x\in\mathbb{R}^{n} \end{aligned}$$

    implying

    $$\displaystyle \begin{aligned} x^{T}Xx\geq\lambda x^{T}x>0\ \ \mbox{for all}\ x\in\mathbb{R}^{n}\backslash\{ 0_{\mathbb{R}^{n}}\} . \end{aligned}$$

    Consequently, the matrix X is positive definite.

    Next we prove the converse inclusion. Let a positive definite matrix \(X\in \mathcal {S}^{n}\) be arbitrarily given. Then all eigenvalues of X are positive. Since the minimal eigenvalue continuously depends on the elements of the matrix, it follows immediately that X belongs to the interior of \(\mathcal {S}^{n}_{+}\).

  2. (b)

    First, we show the inclusion \((\mathcal {S}^{n}_{+})^{*}\subset \mathcal {S}^{n}_{+}\). Let an arbitrary matrix \(X \in (\mathcal {S}^{n}_{+})^{*}\) be chosen and assume that \(X\notin \mathcal {S}^{n}_{+}\). Then there exists some \(y\in \mathbb {R}^{n}\) so that y TXy < 0. If we set Y := yy T, we have \(Y\in \mathcal {S}^{n}_{+}\) and we obtain

    $$\displaystyle \begin{aligned} \langle X,Y\rangle =\mbox{trace}(Xyy^{T})=y^{T}Xy<0, \end{aligned}$$

    a contradiction to \(X\in (\mathcal {S}^{n}_{+})^{*}\).

    Now, we prove the converse inclusion. Let \(X\in \mathcal {S}^{n}_{+}\) be arbitrarily given. Choose any \(Y\in \mathcal {S}^{n}_{+}\). Since X and Y  are symmetric and positive semidefinite it is known that there are matrices \(\sqrt {X},\sqrt {Y}\in \mathcal {S}^{n}_{+}\) with \((\sqrt {X})^{2}=X\) and \((\sqrt {Y})^{2}=Y\) and we obtain

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \langle X,Y\rangle &\displaystyle = &\displaystyle \mbox{trace}(\sqrt{X}\sqrt{X}\sqrt{Y}\sqrt{Y})\\ &\displaystyle = &\displaystyle \mbox{trace}(\sqrt{X}\sqrt{Y}\sqrt{Y}\sqrt{X})\\ &\displaystyle = &\displaystyle \langle\sqrt{X}\sqrt{Y},\sqrt{X}\sqrt{Y}\rangle\\ &\displaystyle \geq &\displaystyle 0. \end{array} \end{aligned} $$

    Hence, we conclude \(X\in (\mathcal {S}^{n}_{+})^{*}\).

In the special literature the result of Lemma 7.4,(b) is also known as Fejér's trace theorem.

For the K-copositive ordering cone we obtain similar results.

Lemma 7.5 (properties of the K-copositive ordering cone)

Let \(K\subset \mathbb {R}^{n}\) be a convex cone. For the K-copositive ordering cone \(C^{n}_{K}\) we have:

  1. (a)

    \(\{ X\in \mathcal {S}^{n}\ |\ X\ \mathit{\mbox{is positive definite}}\}\subset \mathit{\mbox{int}}(C^{n}_{K})\).

  2. (b)

    In addition, if K is closed, then for \(H_{K}:=\mathit{\mbox{convex hull }}\{ xx^{T} |\, x\hspace{-0.1667em}\in K\}\)

    1. (i)

      H K is closed

    2. (ii)

      \((C^{n}_{K})^{*}=H_{K}\).

Proof

  1. (a)

    By definition we have \(S^{n}_{+}\subset C^{n}_{K}\). Consequently, the assertion follows from Lemma 7.4,(a).

  2. (b)
    1. (i)

      Let an arbitrary sequence \(X_{k}\in H_{K}\) be chosen with the limit \(X\in \mathcal {S}^{n}\) (with respect to the spectral norm). Since K is a cone, for every \(k\in \mathbb {N}\) there are vectors \(x^{(1_{k})},\ldots , x^{(p_{k})}\in K\) with the property

      $$\displaystyle \begin{aligned} X_{k}=\sum_{i=1}^{p}x^{(i_{k})}{x^{(i_{k})}}^{T} \end{aligned}$$

      (notice that by the Carathéodory theorem the number p of vectors can be bounded by \(\frac{n(n+1)}{2}+1\), independently of k). Every \(x^{(i_{k})}\in K\) (\(i\in \{ 1,\ldots ,p\} ,\ \; k\in \mathbb {N}\)) can be written as

      $$\displaystyle \begin{aligned} x^{(i_{k})}=\mu_{i_{k}} s^{(i_{k})} \end{aligned}$$

      with \(\mu _{i_{k}}\geq 0\) and

      $$\displaystyle \begin{aligned} s^{(i_{k})}\in K\cap\{ x\in\mathbb{R}^{n}\ |\ \| x\| =1\} \end{aligned}$$

      (∥⋅∥ denotes the Euclidean norm in \(\mathbb {R}^{n}\)). This set is compact because K is assumed to be closed. Consequently, we obtain for every \(k\in \mathbb {N}\)

      $$\displaystyle \begin{aligned} X_{k}=\sum_{i=1}^{p}\mu_{i_{k}}^{2}s^{(i_{k})}{s^{(i_{k})}}^{T}. \end{aligned}$$

      Since \(s^{(1_{k})},\ldots ,s^{(p_{k})}\) belong to a compact set and \((X_{k})_{k\in \mathbb {N}}\) converges to X, the numbers \(\mu _{1_{k}},\ldots ,\mu _{p_{k}}\) are bounded and there are subsequences \((s^{(i_{k_{l}})})_{l\in \mathbb {N}}\) and \((\mu _{i_{k_{l}}})_{l\in \mathbb {N}}\) (with i ∈{1, …p}) converging to s (i) ∈ K and \(\mu _{i}\in \mathbb {R}\), respectively, with the property

      $$\displaystyle \begin{aligned} X=\sum_{i=1}^{p}\mu_{i}^{2}s^{(i)}{s^{(i)}}^{T}. \end{aligned}$$

      This implies X ∈ H K. Hence, H K is a closed set.

    2. (ii)

      First we show the inclusion \(H_{K}\subset (C^{n}_{K})^{*}\). For an arbitrary X ∈ H K we have the representation

      $$\displaystyle \begin{aligned} X=\sum^{p}_{i=1} x^{(i)}{x^{(i)}}^{T} \ \ \mbox{for some}\ x^{(1)},\ldots ,x^{(p)}\in K \end{aligned}$$

      (notice here that K is a cone). Then we obtain for every \(Y\in C^{n}_{K}\)

      $$\displaystyle \begin{aligned} \begin{array}{rcl} \langle Y,X\rangle &\displaystyle = &\displaystyle \mbox{trace} (Y\cdot X)\\ &\displaystyle = &\displaystyle \mbox{trace}\left( Y\sum^{p}_{i=1}x^{(i)}{x^{(i)}}^{T}\right)\\ &\displaystyle = &\displaystyle \sum^{p}_{i=1}\mbox{trace}(Yx^{(i)}{x^{(i)}}^{T})\\ &\displaystyle = &\displaystyle \sum^{p}_{i=1}{x^{(i)}}^{T}Yx^{(i)}\\ &\displaystyle \geq &\displaystyle 0, \end{array} \end{aligned} $$

    i.e. \(X\in (C^{n}_{K})^{*}\).

    For the proof of the converse inclusion we first show \(H_{K}^{*}\subset C^{n}_{K}\). Let an arbitrary \(X\notin C^{n}_{K}\) be given. Then there is some y ∈ K with y TXy < 0. If we set Y := yy T, then we have Y ∈ H K and

    $$\displaystyle \begin{aligned} \langle Y,X\rangle = \mbox{trace} (Y\cdot X) =\mbox{trace}(Xyy^{T})=y^{T}Xy<0, \end{aligned}$$

    i.e. \(X\notin H_{K}^{*}\). Consequently \(H_{K}^{*}\subset C^{n}_{K}\) and for the dual cones we get

    $$\displaystyle \begin{aligned} (C^{n}_{K})^{*}\subset(H_{K}^{*})^{*}. \end{aligned} $$
    (7.12)

    Next, we show that \((H_{K}^{*})^{*}\subset H_{K}\). For this proof let \(Z\in (H_{K}^{*})^{*}\) be arbitrarily given and assume that ZH K. Since H K is closed by part (i) and convex, by Theorem C.3 there exists some \(V\in \mathcal {S}^{n}\backslash \{ 0_{\mathcal {S}^{n}}\}\) with

    $$\displaystyle \begin{aligned} \langle V,Z\rangle < \inf_{U\in H_{K}} \langle V,U\rangle . \end{aligned} $$
    (7.13)

    This inequality implies

    $$\displaystyle \begin{aligned} \langle V,Z \rangle < 0, \end{aligned} $$
    (7.14)

    if we set \(U=0_{\mathcal {S}^{n}}\). Now assume that \(V\notin H_{K}^{*}\). Then there is some \(\tilde {U}\in H_{K}\) with \(\langle V,\tilde {U}\rangle <0\). Since H K is a cone, we have \(\lambda \tilde {U}\in H_{K}\) for all λ > 0 and

    $$\displaystyle \begin{aligned} 0>\lambda\langle V,\tilde{U}\rangle =\langle V,\lambda\tilde{U}\rangle\ \ \mbox{for all}\ \lambda >0. \end{aligned}$$

    Consequently, \(\langle V,\lambda \tilde {U}\rangle \) can be made arbitrarily small, contradicting the inequality (7.13). So \(V\in H_{K}^{*}\) and because of \(Z\in (H_{K}^{*})^{*}\) we obtain 〈V, Z〉≥ 0 contradicting (7.14). Hence we get Z ∈ H K. With the inclusions \((H_{K}^{*})^{*}\subset H_{K}\) and (7.12) we then conclude \((C^{n}_{K})^{*}\subset H_{K}\), which was to be shown.

In the special literature elements of the dual cone \((C^{n}_{\mathbb {R}^{n}_{+}})^{*}=H_{\mathbb {R}^{n}_{+}}\) (i.e. we set \(K=\mathbb {R}^{n}_{+}\)) are called completely positive matrices.
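For instance, the matrix

$$\displaystyle \begin{aligned} \left(\begin{array}{cc} 2 & 1\\ 1 & 1\end{array}\right) = \left(\begin{array}{c} 1\\ 0\end{array}\right)\left(\begin{array}{c} 1\\ 0\end{array}\right)^{T} + \left(\begin{array}{c} 1\\ 1\end{array}\right)\left(\begin{array}{c} 1\\ 1\end{array}\right)^{T} \end{aligned}$$

is completely positive, since it is a sum of matrices \(xx^{T}\) with \(x\in \mathbb {R}^{2}_{+}\).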

Finally we present similar results for the nonnegative ordering cone and the doubly nonnegative ordering cone.

Lemma 7.6 (properties of the nonnegative and doubly nonnegative ordering cone)

For the nonnegative ordering cone \(N^{n}\) and the doubly nonnegative ordering cone \(D^{n}\) we have:

  1. (a)

    \(\mathit{\mbox{int}}(N^{n})=\{ X\in \mathcal {S}^{n}\ |\ X_{ij}>0 \ \mathit{\mbox{for all}}\ i,j\in \{ 1,\ldots ,n\}\}\)

  2. (b)

    \((N^{n})^{*}=N^{n}\), i.e. \(N^{n}\) is self-dual

  3. (c)

    \(\mathit{\mbox{int}}(D^{n})=\{ X\in \mathcal {S}^{n}\ |\ X\ \mathit{\mbox{is positive definite and elementwise positive}}\}\)

  4. (d)

    \((D^{n})^{*}=D^{n}\), i.e. \(D^{n}\) is self-dual.

Proof

  1. (a)

    This part is obvious.

  2. (b)
    1. (i)

      Let \(X\in N^{n}\) be arbitrarily chosen. Then we get for all \(M\in N^{n}\)

      $$\displaystyle \begin{aligned} \begin{array}{rcl} \langle X,M\rangle = \mbox{trace}(X\cdot M) = \sum_{i=1}^{n}\sum_{j=1}^{n} \underbrace{X_{ij}}_{\geq 0}\cdot\underbrace{M_{ji}}_{\geq 0} \geq 0. \end{array} \end{aligned} $$

      Consequently, we have \(X\in (N^{n})^{*}\).

    2. (ii)

      Now let \(X\in (N^{n})^{*}\) be arbitrarily chosen. If we consider \(M\in N^{n}\) with

      $$\displaystyle \begin{aligned} M_{ij}=\left\{\begin{array}{cl} 1 & \mbox{for}\ i=k \ \mbox{and}\ j=l\\ 0 & \mbox{otherwise} \end{array}\right.\end{aligned}$$

      for arbitrary k, l ∈{1, …, n}, then we conclude

      $$\displaystyle \begin{aligned}0\leq \langle X,M\rangle = X_{kl}. \end{aligned}$$

      So, we obtain \(X\in N^{n}\).

  3. (c)

    With Lemma 7.4,(a) and part (a) of this lemma we get

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \mbox{int}(D^{n}) &\displaystyle = &\displaystyle \mbox{int}(\mathcal{S}^{n}_{+}\cap N^{n})\\ &\displaystyle = &\displaystyle \mbox{int}(\mathcal{S}^{n}_{+})\cap\mbox{int}(N^{n})\\ &\displaystyle = &\displaystyle \{ X\in \mathcal{S}^{n}_{+}\ |\ X \ \mbox{positive definite and elementwise } \mbox{positive}\} . \end{array} \end{aligned} $$
  4. (d)

    With Lemma 7.4,(b) and part (b) of this lemma we obtain

    $$\displaystyle \begin{aligned} (D^{n})^{*} = (\mathcal{S}^{n}_{+})^{*}\cap (N^{n})^{*} = \mathcal{S}^{n}_{+}\cap N^{n} = D^{n}. \end{aligned}$$
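As a small illustration of these relations (for n = 2): the matrix \(\left(\begin{array}{cc} 1 & 2\\ 2 & 1\end{array}\right)\) is elementwise nonnegative but has the eigenvalue −1, so it belongs to \(N^{2}\) but not to \(\mathcal {S}^{2}_{+}\); conversely, \(\left(\begin{array}{cc} 1 & -1\\ -1 & 1\end{array}\right)\) is positive semidefinite but has a negative entry, so it belongs to \(\mathcal {S}^{2}_{+}\) but not to \(N^{2}\). Hence the doubly nonnegative ordering cone \(D^{2}=\mathcal {S}^{2}_{+}\cap N^{2}\) is a proper subset of both cones.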

7.2 Optimality Conditions

The necessary optimality conditions presented in Sect. 5.2 are now applied to the conic optimization problem (7.3) with the partial ordering \(\preccurlyeq \) inducing the ordering cone C. To be more specific, let \(f:\mathbb {R}^{m}\rightarrow \mathbb {R}\) and \(G:\mathbb {R}^{m} \rightarrow \mathcal {S}^{n}\) be given functions and consider the conic optimization problem

$$\displaystyle \begin{aligned} \begin{array}{c} \min f(x)\\ \mbox{subject to the constraints}\\ G(x)\preccurlyeq 0_{\mathcal{S}^{n}}\\ x\in\mathbb{R}^{m}. \end{array} \end{aligned}$$

First, we answer the question under which assumptions the matrix function G is Fréchet differentiable.

Lemma 7.7 (Fréchet derivative of G)

Let the matrix function \(G:\mathbb {R}^{m} \rightarrow \mathcal {S}^{n}\) be elementwise differentiable at some \(\bar {x}\in \mathbb {R}^{m}\). Then the Fréchet derivative of G at \(\bar {x}\) is given by

$$\displaystyle \begin{aligned} G^{\prime}(\bar{x})(h)=\sum_{i=1}^{m} G_{x_{i}}(\bar{x})\, h_{i} \ \ \mathit{\mbox{for all}}\ h\in\mathbb{R}^{m} \end{aligned}$$

with

$$\displaystyle \begin{aligned} G_{x_{i}} := \left(\begin{array}{ccc} \frac{\partial}{\partial x_{i}}G_{11} & \cdots & \frac{\partial}{\partial x_{i}}G_{1n}\\ \vdots & & \vdots\\ \frac{\partial}{\partial x_{i}}G_{n1} & \cdots & \frac{\partial}{\partial x_{i}}G_{nn} \end{array}\right) \ \ \mathit{\mbox{for all}}\ \ i\in\{1,\ldots ,m\} . \end{aligned}$$

Proof

Let \(h\in \mathbb {R}^{m}\) be arbitrarily chosen. Since G is elementwise differentiable at \(\bar {x}\in \mathbb {R}^{m}\), we obtain for the Fréchet derivative of G

$$\displaystyle \begin{aligned} \begin{array}{rcl} G^{\prime}(\bar{x})(h) &\displaystyle = &\displaystyle \left(\begin{array}{ccc} \nabla G_{11}(\bar{x})^{T}h &\displaystyle \cdots &\displaystyle \nabla G_{1n}(\bar{x})^{T}h\\ \vdots &\displaystyle &\displaystyle \vdots\\ \nabla G_{n1}(\bar{x})^{T}h &\displaystyle \cdots &\displaystyle \nabla G_{nn}(\bar{x})^{T}h \end{array}\right)\\ {} &\displaystyle = &\displaystyle \left(\begin{array}{ccc}\displaystyle \sum_{i=1}^{m}{G_{11}}_{x_{i}}(\bar{x})\, h_{i} &\displaystyle \cdots &\displaystyle \displaystyle \sum_{i=1}^{m}{G_{1n}}_{x_{i}}(\bar{x})\, h_{i}\\ \vdots &\displaystyle &\displaystyle \vdots\\ \displaystyle \sum_{i=1}^{m}{G_{n1}}_{x_{i}}(\bar{x})\, h_{i} &\displaystyle \cdots &\displaystyle \displaystyle \sum_{i=1}^{m}{G_{nn}}_{x_{i}}(\bar{x})\, h_{i} \end{array}\right)\\ {} &\displaystyle = &\displaystyle \sum_{i=1}^{m} G_{x_{i}}(\bar{x})\, h_{i} . \end{array} \end{aligned} $$
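As a small illustration of Lemma 7.7 (with a matrix function chosen here purely for demonstration), consider m = 2, n = 2 and

$$\displaystyle \begin{aligned} G(x)=\left(\begin{array}{cc} x_{1} & x_{2}\\ x_{2} & x_{1}^{2}\end{array}\right) \ \ \mbox{for all}\ x\in\mathbb{R}^{2}. \end{aligned}$$

Then

$$\displaystyle \begin{aligned} G_{x_{1}}(x)=\left(\begin{array}{cc} 1 & 0\\ 0 & 2x_{1}\end{array}\right),\qquad G_{x_{2}}(x)=\left(\begin{array}{cc} 0 & 1\\ 1 & 0\end{array}\right), \end{aligned}$$

and for every \(h\in\mathbb{R}^{2}\) we have \(G^{\prime}(x)(h)=G_{x_{1}}(x)\, h_{1}+G_{x_{2}}(x)\, h_{2}\).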

Now we present the Lagrange multiplier rule for the conic optimization problem (7.3).

Theorem 7.8 (Lagrange multiplier rule)

Let \(f:\mathbb {R}^{m}\rightarrow \mathbb {R}\) and \(G:\mathbb {R}^{m} \rightarrow \mathcal {S}^{n}\) be given functions, and let \(\bar {x}\in \mathbb {R}^{m}\) be a minimal solution of the conic optimization problem (7.3). Let f be differentiable at \(\bar {x}\) and let G be elementwise differentiable at \(\bar {x}\). Then there are a real number μ ≥ 0 and a matrix \(L\in C^{*}\) with \((\mu ,L)\neq (0,0_{\mathcal {S}^{n}})\),

$$\displaystyle \begin{aligned} \mu\nabla f(\bar{x})+\left(\begin{array}{c} \langle L,G_{x_{1}}(\bar{x})\rangle\\ \vdots\\ \langle L,G_{x_{m}}(\bar{x})\rangle \end{array}\right) = 0_{\mathbb{R}^{m}} \end{aligned} $$
(7.15)

and

$$\displaystyle \begin{aligned} \langle L,G(\bar{x})\rangle = 0. \end{aligned} $$
(7.16)

If, in addition to the above assumptions the equality

$$\displaystyle \begin{aligned} G^{\prime}(\bar{x})(\mathbb{R}^{m})+\mathit{\mbox{cone}}\, (C+\{ G(\bar{x})\} ) = \mathcal{S}^{n} \end{aligned} $$
(7.17)

is satisfied, then it follows μ > 0.

Proof

Because of the differentiability assumptions we have that f and G are Fréchet differentiable at \(\bar {x}\). Then we apply Corollary 5.4 and obtain the existence of a real number μ ≥ 0 and a matrix \(L\in C^{*}\) with \((\mu ,L)\neq (0,0_{\mathcal {S}^{n}})\),

$$\displaystyle \begin{aligned} \mu\nabla f(\bar{x})+L\circ G^{\prime}(\bar{x})= 0_{\mathbb{R}^{m}} \end{aligned} $$
(7.18)

and

$$\displaystyle \begin{aligned} \langle L,G(\bar{x})\rangle = 0. \end{aligned}$$

For every \(h\in \mathbb {R}^{m}\) we obtain with Lemma 7.7

$$\displaystyle \begin{aligned} \begin{array}{rcl} (L\circ G^{\prime}(\bar{x}))(h) &\displaystyle = &\displaystyle \langle L,G^{\prime}(\bar{x})(h)\rangle\\ &\displaystyle = &\displaystyle \langle L,\sum_{i=1}^{m} G_{x_{i}}(\bar{x})h_{i}\rangle\\ &\displaystyle = &\displaystyle \sum_{i=1}^{m}\langle L,G_{x_{i}}(\bar{x})\rangle h_{i}\\ &\displaystyle = &\displaystyle \left(\begin{array}{c} \langle L,G_{x_{1}}(\bar{x})\rangle\\ \vdots\\ \langle L,G_{x_{m}}(\bar{x})\rangle \end{array}\right)^{T}h. \end{array} \end{aligned} $$

Then the equality (7.18) implies

$$\displaystyle \begin{aligned} \mu\nabla f(\bar{x})+\left(\begin{array}{c} \langle L,G_{x_{1}}(\bar{x})\rangle\\ \vdots\\ \langle L,G_{x_{m}}(\bar{x})\rangle \end{array}\right) = 0_{\mathbb{R}^{m}}. \end{aligned}$$

Hence, one part of the assertion is shown. If we consider the Kurcyusz-Robinson-Zowe regularity assumption (5.9) for the special problem (7.3), we have \(\hat {S}=\mathbb {R}^{m}\) and \(\mbox{cone}(\hat {S}-\{\bar {x}\})=\mathbb {R}^{m}\). So, the equality (7.17) is equivalent to the regularity assumption (5.9). This completes the proof. □

In the case of μ > 0 we can set \(U:=\frac {1}{\mu }L\in C^{*}\) and the equalities (7.15) and (7.16) can be written as

$$\displaystyle \begin{aligned} f_{x_{i}}(\bar{x}) +\langle U,G_{x_{i}}(\bar{x})\rangle = 0 \ \ \mbox{for all}\ i\in\{ 1,\ldots ,m\} \end{aligned}$$

and

$$\displaystyle \begin{aligned} \langle U,G(\bar{x})\rangle = 0. \end{aligned}$$

This gives the extension of the Karush-Kuhn-Tucker conditions to matrix space problems.

In Theorem 7.8 the Lagrange multiplier L is a matrix in the dual cone C . According to the specific choice of the ordering cone C discussed in Lemmas 7.4, 7.5 and 7.6 we take the dual cones given in Lemmas 7.4,(b), 7.5,(b),(ii) and 7.6,(b),(d). For instance, if C denotes the Löwner ordering cone, then the multiplier L is positive semidefinite.

Instead of the regularity assumption (7.17) used in Theorem 7.8 we can also consider a simpler condition.

Lemma 7.9 (regularity condition)

Let the assumption of Theorem 7.8 be satisfied and let C denote the K-copositive ordering cone \(C^{n}_{K}\) for an arbitrary convex cone K. If there exists a vector \(\hat {x}\in \mathbb {R}^{m}\) so that \(\displaystyle G(\bar {x})+\sum ^{m}_{i=1}G_{x_{i}}(\bar {x})(\hat {x}_{i}-\bar {x}_{i})\) is negative definite, then the regularity assumption in Theorem 7.8 is fulfilled.

Proof

By Lemma 7.5,(a) we have

$$\displaystyle \begin{aligned} G(\bar{x})+G^{\prime}(\bar{x})(\hat{x}-\bar{x}) = G(\bar{x})+\sum^{m}_{i=1}G_{x_{i}}(\bar{x})(\hat{x}_{i}-\bar{x}_{i}) \in -\mbox{int}(C^{n}_{K}) \end{aligned} $$

and with Theorem 5.6 the Kurcyusz-Robinson-Zowe regularity assumption is satisfied, i.e. the equality (7.17) is fulfilled. □

It is obvious that in the case of the Löwner partial ordering (\(\mathcal{S}^{n}_{+}=C^{n}_{\mathbb {R}^{n}}\)) Lemma 7.9 is also applicable. Notice that a similar result can be shown for the ordering cones discussed in Lemma 7.6. For the interior of these cones we can then use the results in Lemma 7.6,(a) and (c).
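For illustration (with data chosen here only for demonstration), consider the Löwner ordering cone \(C=\mathcal{S}^{2}_{+}\) and

$$\displaystyle \begin{aligned} G(x)=\left(\begin{array}{cc} x_{1}-1 & 0\\ 0 & x_{2}-1\end{array}\right) \ \ \mbox{for all}\ x\in\mathbb{R}^{2} \end{aligned}$$

with the feasible point \(\bar{x}=(1,1)\). Choosing \(\hat{x}=(0,0)\) we get \(G(\bar{x})+G_{x_{1}}(\bar{x})(\hat{x}_{1}-\bar{x}_{1})+G_{x_{2}}(\bar{x})(\hat{x}_{2}-\bar{x}_{2})=-I\), which is negative definite, so the regularity assumption of Theorem 7.8 is fulfilled at \(\bar{x}\).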

Next, we answer the question under which assumptions the Lagrange multiplier rule given in Theorem 7.8 as a necessary optimality condition is a sufficient optimality condition for the conic optimization problem (7.3).

Theorem 7.10 (sufficient optimality condition)

Let \(f:\mathbb {R}^{m}\rightarrow \mathbb {R}\) and \(G:\mathbb {R}^{m} \rightarrow \mathcal {S}^{n}\) be given functions. For some \(\bar {x}\in \mathbb {R}^{m}\) let f be differentiable and pseudoconvex at \(\bar {x}\) and let G be elementwise differentiable and \((-C+\mathit{\mbox{cone}}(\{ G(\bar {x})\} )-\mathit{\mbox{cone}}(\{ G(\bar {x})\} ))\)-quasiconvex at \(\bar {x}\). If there is a matrix \(L\in C^{*}\) with

$$\displaystyle \begin{aligned} \nabla f(\bar{x})+\left(\begin{array}{c} \langle L,G_{x_{1}}(\bar{x})\rangle\\ \vdots\\ \langle L,G_{x_{m}}(\bar{x})\rangle \end{array}\right) = 0_{\mathbb{R}^{m}} \end{aligned} $$
(7.19)

and

$$\displaystyle \begin{aligned} \langle L,G(\bar{x})\rangle = 0, \end{aligned}$$

then \(\bar {x}\)is a minimal solution of the conic optimization problem (7.3).

Proof

With Lemma 7.7 the equality (7.19) implies

$$\displaystyle \begin{aligned} \nabla f(\bar{x})+L \circ G^{\prime}(\bar{x}) = 0_{\mathbb{R}^{m}}. \end{aligned}$$

By Lemma 5.16 and Corollary 5.15 the assertion follows immediately. □

The quasiconvexity assumption in Theorem 7.10 (compare Definition 5.12) means that for all feasible \(x\in \mathbb {R}^{m}\)

$$\displaystyle \begin{aligned} \begin{array}{rcl} \hspace{-0.1667em}\hspace{-0.1667em}\hspace{-0.1667em}&\displaystyle &\displaystyle G(x)-G(\bar{x})\in -C+\mbox{cone}(\{ G(\bar{x})\}) -\mbox{cone}(\{ G(\bar{x})\})\\ \hspace{-0.1667em}\hspace{-0.1667em}\hspace{-0.1667em}&\displaystyle &\displaystyle \Longrightarrow\ \ \ \sum^{m}_{i=1} G_{x_{i}}(\bar{x}) (x_{i}-\bar{x}_{i}) \in -C+\mbox{cone}(\{ G(\bar{x})\})-\mbox{cone}(\{ G(\bar{x})\}). \end{array} \end{aligned} $$

For all feasible \(x\in \mathbb {R}^{m}\) this implication can be rewritten as

$$\displaystyle \begin{aligned} \begin{array}{rcl} \hspace{-0.1667em}\hspace{-0.1667em}\hspace{-0.1667em}&\displaystyle &\displaystyle G(x)+(\alpha -1-\beta )G(\bar{x})\in -C \ \ \mbox{for some}\ \alpha ,\beta\geq 0\\ \hspace{-0.1667em}\hspace{-0.1667em}\hspace{-0.1667em}&\displaystyle &\displaystyle \Longrightarrow\ \ \ \sum^{m}_{i=1} G_{x_{i}}(\bar{x}) (x_{i}-\bar{x}_{i})+(\gamma -\delta )G(\bar{x}) \in -C \ \ \mbox{for some}\ \gamma ,\delta\geq 0 \end{array} \end{aligned} $$

or

$$\displaystyle \begin{aligned} \begin{array}{rcl} \hspace{-0.1667em}\hspace{-0.1667em}\hspace{-0.1667em}&\displaystyle &\displaystyle G(x)+\bar{\alpha}G(\bar{x})\in -C \ \ \mbox{for some}\ \bar{\alpha}\in\mathbb{R}\\ \hspace{-0.1667em}\hspace{-0.1667em}\hspace{-0.1667em}&\displaystyle &\displaystyle \Longrightarrow\ \ \ \sum^{m}_{i=1} G_{x_{i}}(\bar{x}) (x_{i}-\bar{x}_{i})+\bar{\gamma}G(\bar{x}) \in -C \ \ \mbox{for some}\ \bar{\gamma}\in\mathbb{R} . \end{array} \end{aligned} $$

7.3 Duality

The duality theory developed in Chap. 6 is now applied to the conic optimization problem (7.3) with given functions \(f:\mathbb {R}^{m}\rightarrow \mathbb {R}\) and \(G:\mathbb {R}^{m}\rightarrow \mathcal {S}^{n}\) and the partial ordering \(\preccurlyeq \) inducing the ordering cone C.

For convenience we recall the primal optimization problem

$$\displaystyle \begin{aligned} \begin{array}{c} \min f(x)\\ \mbox{subject to the constraints}\\ G(x)\preccurlyeq 0_{\mathcal{S}^{n}}\\ x\in\mathbb{R}^{m}. \end{array} \end{aligned}$$

According to Sect. 6.1 the dual problem can be written as

$$\displaystyle \begin{aligned} \max_{U\in C^{*}}\inf_{x\in\mathbb{R}^{m}} f(x)+\langle U,G(x)\rangle \end{aligned} $$
(7.20)

or equivalently

$$\displaystyle \begin{aligned}\begin{array}{c} \max \lambda\\ \mbox{subject to the constraints}\\ f(x)+\langle U,G(x)\rangle\geq\lambda \ \ \mbox{for all}\ x\in\mathbb{R}^{m}\\ \lambda\in\mathbb{R},\ U\in C^{*}. \end{array}\end{aligned}$$

We are now able to formulate a weak duality theorem for the conic optimization problem (7.3).

Theorem 7.11 (weak duality theorem)

For every feasible \(\hat {x}\) of the primal problem (7.3) and for every feasible \(\hat {U}\) of the dual problem (7.20) the following inequality is satisfied

$$\displaystyle \begin{aligned} \inf_{x\in\mathbb{R}^{m}} f(x)+\langle \hat{U},G(x)\rangle \leq f(\hat{x}). \end{aligned}$$

Proof

This result follows immediately from Theorem 6.7. □

The following strong duality theorem is a direct consequence of Theorem 6.8.

Theorem 7.12 (strong duality theorem)

Let the composite mapping \((f,G):\mathbb {R}^{m}\rightarrow \mathbb {R}\times \mathcal {S}^{n}\) be convex-like and let the ordering cone C have a nonempty interior int(C). If the primal problem (7.3) is solvable and the generalized Slater condition is satisfied, i.e., there is a vector \(\hat {x}\in \mathbb {R}^{m}\) with \(G(\hat {x})\in -\mathit{\mbox{int}}(C)\), then the dual problem (7.20) is also solvable and the extremal values of the two problems are equal.

For instance, if the ordering cone C is the K-copositive ordering cone \(C^{n}_{K}\) for some convex cone \(K\subset \mathbb {R}^{n}\), then by Lemma 7.5,(a) the generalized Slater condition in Theorem 7.12 is satisfied whenever \(G(\hat {x})\) is negative definite for some \(\hat {x}\in \mathbb {R}^{m}\). In this case a duality gap cannot appear.

With the investigations in Sect. 6.4 it is simple to state the dual problem of a linear semidefinite optimization problem. We specialize the problem (7.3) to the linear problem

$$\displaystyle \begin{aligned} \begin{array}{c} \min \; c^{T}x\\ \mbox{subject to the constraints}\\ B\preccurlyeq A(x)\\ x_{1},\ldots ,x_{m}\geq 0 \end{array} \end{aligned} $$
(7.21)

with \(c\in \mathbb {R}^{m}\), a linear mapping \(A:\mathbb {R}^{m}\rightarrow \mathcal {S}^{n}\) and a matrix \(B\in \mathcal {S}^{n}\). Since A is linear, there are matrices \(A^{(1)},\ldots ,A^{(m)}\in \mathcal {S}^{n}\) so that

$$\displaystyle \begin{aligned} A(x)=A^{(1)}x_{1}+\ldots +A^{(m)}x_{m} \ \ \mbox{for all}\ x\in\mathbb{R}^{m}. \end{aligned}$$

Then the primal linear problem (7.21) can also be written as

$$\displaystyle \begin{aligned} \begin{array}{c} \min \; c^{T}x\\ \mbox{subject to the constraints}\\ B\preccurlyeq A^{(1)}x_{1}+\ldots +A^{(m)}x_{m}\\ x_{1},\ldots ,x_{m}\geq 0. \end{array} \end{aligned} $$
(7.22)

For the formulation of the dual problem of (7.22) we need the adjoint mapping \(A^{*}:\mathcal {S}^{n}\rightarrow \mathbb {R}^{m}\) defined by

$$\displaystyle \begin{aligned} \begin{array}{rcl} A^{*}(U)(x) &\displaystyle = &\displaystyle \langle U,A(x)\rangle\\ &\displaystyle = &\displaystyle \langle U,A^{(1)}x_{1}+\ldots +A^{(m)}x_{m}\rangle\\ &\displaystyle = &\displaystyle \langle U,A^{(1)}\rangle x_{1}+\ldots +\langle U,A^{(m)}\rangle x_{m}\\ &\displaystyle = &\displaystyle \left(\langle U,A^{(1)}\rangle ,\ldots , \langle U,A^{(m)}\rangle\right)\cdot x\\ &\displaystyle &\displaystyle \mbox{for all}\ x\in\mathbb{R}^{m}\ \mbox{and all}\ U\in \mathcal{S}^{n}. \end{array} \end{aligned} $$

Using the general formulation (6.19) we then obtain the dual problem

$$\displaystyle \begin{aligned} \begin{array}{c} \max \; \langle B,U\rangle\\ \mbox{subject to the constraints}\\ \langle A^{(1)},U\rangle\leq c_{1}\\ \vdots\\ \langle A^{(m)},U\rangle\leq c_{m}\\ U\in C^{*}. \end{array} \end{aligned} $$
(7.23)

In the special literature on semidefinite optimization the dual problem (7.23) with \(C^{*}=\mathcal {S}^{n}_{+}\) is very often taken as the primal problem. In this case our primal problem then plays the role of the dual problem in the literature.

Exercises

  1. (7.1)

    Show that the Löwner ordering cone \(\mathcal {S}^{n}_{+}\) is closed and pointed.

  2. (7.2)

    Show for the Löwner ordering cone

    $$\displaystyle \begin{aligned} \mathcal{S}^{n}_{+} = \mbox{convex hull}\;\{ xx^{T}\,|\ x\in\mathbb{R}^{n}\} . \end{aligned}$$
  3. (7.3)

    As an extension of Lemma 7.2 prove the following result: Let \(\displaystyle X=\left ( \begin {array}{cc} A & B^{T}\\ B & C\end {array}\right ) \in \mathcal {S}^{k+l}\) with \(A\in \mathcal {S}^{k}\), \(C\in \mathcal {S}^{l}\) and \(B\in \mathbb {R}^{(l,k)}\) be given, and assume that A is positive definite. Then we have for an arbitrary convex cone \(K\subset \mathbb {R}^{l}\):

    $$\displaystyle \begin{aligned} X\in C^{k+l}_{\mathbb{R}^{k}\times K} \ \ \ \Longleftrightarrow\ \ \ C-BA^{-1}B^{T}\in C^{l}_{K}. \end{aligned}$$
  4. (7.4)

    Show for arbitrary \(A,B\in \mathcal {S}^{n}_{+}\)

    $$\displaystyle \begin{aligned} \langle A,B\rangle =0 \ \ \ \Longleftrightarrow\ \ \ AB=0_{\mathcal{S}^{n}}. \end{aligned}$$
  5. (7.5)

    Let A be a given symmetric (n, n) matrix. Show for an arbitrary (j − i + 1, j − i + 1) block matrix \(A^{ij}\) (1 ≤ i ≤ j ≤ n) with

    $$\displaystyle \begin{aligned} A^{ij}_{kl} = A_{i+k-1,\, i+l-1} \ \mbox{for all}\ k,l\in\{ 1,\ldots ,j-i+1\}: \end{aligned}$$
    $$\displaystyle \begin{aligned} A \ \mbox{positive semidefinite} \ \ \ \Longrightarrow\ \ \ A^{ij} \ \mbox{positive semidefinite}. \end{aligned}$$
  6. (7.6)

    Show that the linear semidefinite optimization problem

    $$\displaystyle \begin{aligned}\begin{array}{c} \min\;x_{2}\\ \mbox{subject to the constraints}\\[.2ex] -\left(\begin{array}{cc} x_{1} & 1\\ 1 & x_{2} \end{array}\right) \preceq 0_{\mathcal{S}^{2}}\\[.2ex] x_{1},x_{2}\in\mathbb{R} \end{array} \end{aligned}$$

    (where ≼ denotes the Löwner partial ordering) is not solvable.

  7. (7.7)

    Let the linear mapping \(G:\mathbb {R}^{2}\rightarrow \mathcal {S}^{2}\) with

    $$\displaystyle \begin{aligned} G(x_{1},x_{2})=\left(\begin{array}{cc} x_{1} & x_{2} \\ x_{2} & 0 \end{array}\right) \ \mbox{for all}\ (x_{1},x_{2})\in\mathbb{R}^{2} \end{aligned}$$

    be given. Show that G does not fulfill the generalized Slater condition given in Theorem 7.12 for \(C=\mathcal {S}^{2}_{+}\).

  8. (7.8)

    Let \(c\in \mathbb {R}^{m}\), \(B\in \mathcal {S}^{n}\) and a linear mapping \(A:\mathbb {R}^{m}\rightarrow \mathcal {S}^{n}\) with

    $$\displaystyle \begin{aligned}A(x)=A^{(1)}x_{1}+\ldots +A^{(m)}x_{m} \ \mbox{for all}\ x\in\mathbb{R}^{m} \end{aligned}$$

    for \(A^{(1)},\ldots ,A^{(m)}\in \mathcal {S}^{n}\) be given. Show that for the linear problem

    $$\displaystyle \begin{aligned} \begin{array}{c} \min \; c^{T}x\\ \mbox{subject to the constraints}\\ B\preccurlyeq A(x)\\ x\in\mathbb{R}^{m} \end{array} \end{aligned}$$

    the dual problem is given by

    $$\displaystyle \begin{aligned} \begin{array}{c} \max \; \langle B,U\rangle\\ \mbox{subject to the constraints}\\ \langle A^{(1)},U\rangle = c_{1}\\ \vdots\\ \langle A^{(m)},U\rangle = c_{m}\\ U\in C^{*}. \end{array} \end{aligned}$$
  9. (7.9)

    Consider the linear semidefinite optimization problem

    $$\displaystyle \begin{aligned}\begin{array}{c} \min\;x_{1}\\ \mbox{subject to the constraints}\\[.2ex] \left(\begin{array}{ccc} 0 & -x_{1} & 0\\ -x_{1} & -x_{2} & 0\\ 0 & 0 & -x_{1}-1 \end{array}\right) \preceq 0_{\mathcal{S}^{3}}\\[.2ex] x_{1},x_{2}\in\mathbb{R} \end{array} \end{aligned}$$

    (where ≼ denotes the Löwner partial ordering). Give the corresponding dual problem and show that the extremal values of the primal and dual problem are not equal. Why is Theorem 7.12 not applicable?