1 Introduction

Semidefinite optimization (SDO) problems are convex optimization problems over the intersection of an affine set and the cone of positive semidefinite matrices. They have received considerable attention and have become one of the most active research areas in mathematical programming. Motivated by the success of interior-point methods (IPMs) for linear optimization (LO), primal-dual IPMs were extended to SDO, an important contribution made by Nesterov and Todd [16, 17]. IPMs can be distinguished by whether they are feasible or infeasible. Feasible IPMs start with a strictly feasible point and maintain feasibility during the solution process. In contrast, interior-point algorithms that accept infeasible initial points are referred to as infeasible IPMs (IIPMs); see [9, 12, 22] for more details.

Early primal-dual IPMs all used the Newton direction as the search direction. Peng et al. [19] presented a new class of feasible IPMs in which the search direction is determined by a so-called self-regular barrier function. Bai et al. [4] introduced a new class of kernel functions; based on the properties of these kernel functions, they derived many new and tight estimates that greatly simplified the analysis of feasible IPMs. Using an algebraically equivalent transformation of the central path, Darvay [5] explored a new technique for finding a class of search directions. Pan et al. [20] established the connection between Darvay's approach and the kernel-function-based approach.

Roos [21] designed a new IIPM for LO based on perturbed problems. Since the algorithm uses only full Newton steps, it has the advantage that no line searches are needed. Later on, Mansouri and Roos [14] extended the algorithm of [21] to SDO. Using a new proximity measure to overcome the difficulty of bounding the proximity after one full Newton step, Zhang et al. [30] presented a simplified analysis of the full-Newton step IIPM for SDO. With another definition of the feasibility step, Kheirfam [8] generalized the simplified full-Newton step IIPM for LO [13] to SDO.

The aim of this paper is to investigate the kernel function which has a finite value at the boundary of the feasible region:

$$\begin{aligned} \psi (t)=(t-1)^2. \end{aligned}$$
(1)

The search direction determined by the kernel function in (1) is the same as the one introduced by Darvay [5]. Liu and Sun [11] proposed a full Nesterov–Todd (NT) step IIPM for SDO, where the feasibility step was induced by a specific kernel function; the kernel function in (1) differs from the one in [11] only by a constant. It was also discussed by Wang and Bai [26] for a feasible IPM, and used by Zhang and Xu [28] for LO to determine the search directions and to measure the proximity of the iterates to the central path. However, the properties of this kernel function, such as exponential convexity, are not discussed in these papers.

Recently, Wang et al. [27] used the exponential convexity of the kernel function to simplify the analysis of full-Newton step feasible IPMs for LO problems. In this paper, using the NT search direction and the definition of the feasibility step due to Kheirfam [8], and motivated by Wang et al. [27], we not only use the kernel function to determine the centering step, but also apply its exponential convexity to derive new estimates, which simplify the analysis of a full NT-step IIPM for SDO.

The article is organized as follows. In Sect. 2, we introduce some special matrix functions and basic results on feasible IPMs for SDO. In Sect. 3, we present the search directions and describe our algorithm in detail. In Sect. 4, we introduce the kernel function and study the properties of the barrier function. In Sect. 5, we analyze the algorithm and derive the complexity bound. In Sect. 6, we give a simple numerical example for the algorithm. Finally, some concluding remarks can be found in Sect. 7.

The following notations are used in the paper. \(\mathbb {R}^n\) denotes the n-dimensional Euclidean space. The set of all \(m\times n\) matrices with real entries is denoted by \(\mathbb {R}^{m\times n}\). \(A^T\) denotes the transpose of \(A\in \mathbb {R}^{m\times n}\). The set of all symmetric \(n\times n\) real matrices is denoted by \(\mathcal {S}^n\). \(\mathcal {S}^n_{++}~(\mathcal {S}^n_{+})\) denotes the set of all matrices in \(\mathcal {S}^n\) which are positive definite (positive semidefinite). I denotes the \(n\times n\) identity matrix. \(A\succ 0~(A\succeq 0)\) means that A is positive definite (positive semidefinite). For any \(V\in \mathcal {S}^n\), we denote by \(\lambda (V)\) the vector of eigenvalues of V arranged in non-increasing order, that is, \(\lambda _{\max }(V)=\lambda _1(V)\ge \lambda _2(V)\ge \cdots \ge \lambda _n(V)=\lambda _{\min }(V).\) For any matrix M, we denote by \(\sigma _1(M)\ge \sigma _2(M)\ge \cdots \ge \sigma _n(M)\) the singular values of M. For \(Q\in \mathcal {S}^n_{++}\), we use \(Q^{\frac{1}{2}}\) to denote the symmetric square root of Q. The sign \(\sim \) denotes similarity of two matrices. Given \(G,H\in \mathbb {R}^{m\times n}\), the inner product between them is defined as \(\langle G,H\rangle :={\mathrm{Tr}}(G^TH)\), the trace of the matrix \(G^TH\). The Frobenius norm of \(M\in \mathbb {R}^{n\times n}\) is \(\Vert M\Vert :=\langle M,M\rangle ^{1/2}\).

2 Preliminaries

2.1 Special matrix functions

Since some special matrix functions are needed in the analysis of the algorithm, we first review some facts about them.

Definition 2.1

[25, Definition 2.2] Let \(V\in S^n_{++}\) and

$$\begin{aligned} V=Q^T{\mathrm{diag}}(\lambda (V))Q, \end{aligned}$$

where Q is an arbitrary orthogonal matrix that diagonalizes V, and let \(\psi (t)\) be the function in (1). The matrix function \(\psi (V): S^n_{++}\rightarrow S^n\) is defined by

$$\begin{aligned} \psi (V)=Q^T{\mathrm{diag}}(\psi (\lambda _1(V)),\psi (\lambda _2(V)), \ldots ,\psi (\lambda _n(V)))Q. \end{aligned}$$
(2)

Definition 2.2

[25, Definition 2.3] The real-valued matrix function \(\varPsi (V): S^n_{++}\rightarrow \mathbb {R}_{+}\) is defined by

$$\begin{aligned} \varPsi (V)=Tr(\psi (V))=\sum _{i=1}^n\psi (\lambda _i(V)). \end{aligned}$$
(3)

Replacing \(\psi (\lambda _i(V))\) in (2) by \(\psi '(\lambda _i(V))\), we obtain the matrix function \(\psi '(V)\). We call \(\varPsi (V)\) the matrix barrier function and \(\psi (t)\) its kernel function.
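To make Definitions 2.1 and 2.2 concrete, the matrix functions can be evaluated from an eigendecomposition of V. The following NumPy sketch is our own illustration (the helper names matrix_psi and Psi are hypothetical, not taken from [25]); it applies the kernel (1) to the eigenvalues of V.

import numpy as np

def matrix_psi(V, derivative=False):
    # psi(V) (or psi'(V)) for symmetric positive definite V, cf. Definition 2.1:
    # apply the scalar kernel psi(t) = (t-1)^2, or psi'(t) = 2(t-1),
    # to the eigenvalues and transform back with the eigenvectors.
    lam, Q = np.linalg.eigh(V)            # V = Q diag(lam) Q^T
    f = 2.0 * (lam - 1.0) if derivative else (lam - 1.0) ** 2
    return Q @ np.diag(f) @ Q.T

def Psi(V):
    # Psi(V) = Tr(psi(V)) = sum_i psi(lambda_i(V)), cf. Definition 2.2
    lam = np.linalg.eigvalsh(V)
    return float(np.sum((lam - 1.0) ** 2))

For instance, Psi(np.eye(n)) returns 0, in agreement with Lemma 2.1 below.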

Lemma 2.1

\(\varPsi (V)\) is strictly convex with respect to \(V\succ 0\) and vanishes at its global minimal point \(V=I\), i.e., \(\psi (I)=\psi '(I)=0_{n\times n}\). Moreover, \(\varPsi (I)=0\).

Proof

Since \(\psi (t)=(t-1)^2\), we have \(\psi '(t)=2(t-1)\) and \(\psi ''(t)=2>0\). Hence \(\psi (t)\) is strictly convex with respect to \(t > 0\) and vanishes at its global minimizer \(t=1\), i.e., \(\psi (1)=\psi '(1)=0\). Using the strict convexity of \(\psi (t)\) and Definitions 2.1 and 2.2, the rest of the proof is similar to that of Proposition 5.2.6 (i) in [19]. \(\square \)

2.2 The central path for SDO

We consider the standard form of the SDO problem:

$$\begin{aligned} (P)\quad \min \langle C,X\rangle ,\quad s.t.~\langle A_i,X\rangle =b_i,~i=1,2,\ldots ,m,~X\succeq 0, \end{aligned}$$
(4)

where \(C, X\in \mathcal {S}^n, b\in \mathbb {R}^m\), and \(A_i\in \mathcal {S}^n,~i=1,2,\ldots ,m\). We call problem (P) the primal form of SDO problem, and X is the primal variable.

The dual problem of (P) is given by

$$\begin{aligned} (D)\quad \max b^Ty,\quad s.t.~\sum _{i=1}^my_iA_i+S=C,~S\succeq 0, \end{aligned}$$
(5)

where \(y\in \mathbb {R}^m\) and \(S\in \mathcal {S}^n\) are the dual variables.

The set of primal-dual feasible solutions is denoted by

$$\begin{aligned}\begin{aligned} \mathcal {F}:=\{(X,y,S)\in {\mathcal {S}_+^n\times \mathbb {R}^m\times \mathcal {S}_+^n}: \langle A_i,X\rangle&=b_i,~i=1,2,\ldots ,m,\\ \sum _{i=1}^m y_iA_i+S&=C\},\end{aligned} \end{aligned}$$

and the relative interior of the primal-dual feasible set is

$$\begin{aligned} \begin{aligned} \mathcal {F}^0:=\{(X,y,S)\in {\mathcal {S}_{++}^n\times \mathbb {R}^m\times \mathcal {S}_{++}^n}:\langle A_i,X\rangle&=b_i,~i=1,2,\ldots ,m,\\ \sum _{i=1}^my_iA_i+S&=C\}. \end{aligned} \end{aligned}$$

We assume that both (P) and (D) satisfy the interior point condition (IPC), i.e., \(\mathcal {F}^0\) is nonempty, and that the matrices \(A_i,~i=1,2,\ldots ,m\), are linearly independent. It is well known [6] that under the IPC the optimality conditions for (P) and (D) can be written as follows:

$$\begin{aligned} \begin{array}{rl}\langle A_i,X\rangle &{}=b_i, i=1,2,\ldots ,m, X\succeq 0,\\ {\displaystyle \sum _{i=1}^m}y_iA_i+S&{}=C, S\succeq 0,\\ XS&{}=0, \end{array} \end{aligned}$$
(6)

where the last equality is called the complementarity equation. The basic idea of primal-dual IPMs is to replace the complementarity condition \(XS=0\) by the parameterized equation \(XS=\mu I\) with \(\mu >0\). Then we get the perturbed system

$$\begin{aligned} \begin{array}{rl}\langle A_i,X\rangle &{}=b_i, i=1,2,\ldots ,m, X\succeq 0,\\ {\displaystyle \sum _{i=1}^m}y_iA_i+S&{}=C, S\succeq 0,\\ XS&{}=\mu {I}. \end{array} \end{aligned}$$
(7)

It is proved in [10, 15] that there is a unique solution \((X(\mu ), y(\mu ), S(\mu ))\) to the central path equations (7) for any barrier parameter \(\mu >0\). Moreover, the limit point \((X^{*}, y^{*}, S^{*})\), as \(\mu \) goes to 0, is a primal-dual optimal solution of the corresponding SDO problem.

2.3 The classic NT search direction

We consider the symmetrization scheme yielding the Nesterov–Todd (NT) direction [17], also simply called the Newton direction [23]. The direction is determined by the following system:

$$\begin{aligned} \begin{array}{rl}\langle A_i,\triangle X\rangle &{}=0, i=1,2,\ldots ,m,\\ {\displaystyle \sum _{i=1}^m}\triangle y_i A_i+\triangle S&{}=0,\\ \triangle X + P\triangle S P^T &{}=\mu S^{-1}-X, \end{array} \end{aligned}$$
(8)

where \((\triangle X,\triangle y,\triangle S)\in \mathcal {S}^n\times \mathbb {R}^m\times \mathcal {S}^n\) is the search direction, and the choice of P is

$$\begin{aligned} P=X^{1/2}(X^{1/2}SX^{1/2})^{-1/2}X^{1/2}=S^{-1/2}(S^{1/2}XS^{1/2})^{1/2}S^{-1/2}. \end{aligned}$$
(9)

Let \(D=P^{1/2}\). Then D can be used to scale X and S to the same matrix V defined by

$$\begin{aligned} V:=\frac{1}{\sqrt{\mu }}D^{-1}XD^{-1}=\frac{1}{\sqrt{\mu }}DSD. \end{aligned}$$
(10)

Obviously

$$\begin{aligned} V^2=\frac{1}{\mu }D^{-1}XSD. \end{aligned}$$
(11)

After we define

$$\begin{aligned} \begin{array}{ll}\bar{A_i}:&{}=DA_{i}D,i=1,2,\ldots ,m,\\ D_X:&{}=\frac{1}{\sqrt{\mu }}D^{-1}\triangle X D^{-1},\\ D_S:&{}=\frac{1}{\sqrt{\mu }}D\triangle S D, \end{array} \end{aligned}$$
(12)

the system (8) is reduced to

$$\begin{aligned} \begin{array}{rl}\langle \bar{A_i},D_X\rangle &{}=0, i=1,2,\ldots ,m,\\ {\displaystyle \sum _{i=1}^m}\triangle y_i \bar{A_i}+D_S&{}=0,\\ D_X +D_S&{}=V^{-1}-V, \end{array} \end{aligned}$$
(13)

which gives the usual scaled NT search direction.
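For illustration, the NT scaling matrix P of (9), the matrix \(D=P^{1/2}\) and the scaled matrix V of (10) can be computed directly from X, S and \(\mu \); the sketch below assumes NumPy and uses the hypothetical helper names sym_sqrt and nt_scaling (a production implementation would reuse Cholesky or eigenvalue factorizations instead of forming explicit inverses).

import numpy as np

def sym_sqrt(M):
    # symmetric square root of a symmetric positive definite matrix
    lam, Q = np.linalg.eigh(M)
    return Q @ np.diag(np.sqrt(lam)) @ Q.T

def nt_scaling(X, S, mu):
    # P as in (9), D = P^{1/2}, and V = D^{-1} X D^{-1} / sqrt(mu) = D S D / sqrt(mu) as in (10)
    X12 = sym_sqrt(X)
    P = X12 @ np.linalg.inv(sym_sqrt(X12 @ S @ X12)) @ X12
    D = sym_sqrt(P)
    Dinv = np.linalg.inv(D)
    V = (Dinv @ X @ Dinv) / np.sqrt(mu)
    return P, D, V

Since P satisfies \(PSP=X\), the two expressions for V in (10) coincide, which can also be checked numerically with this sketch.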

3 Infeasible full NT step IPM

We call the triple \((X,y,S)\) an \(\varepsilon \)-optimal solution of (P) and (D) if the norms of the residuals

$$\begin{aligned} b_i-Tr(A_iX), i=1, \ldots , m \end{aligned}$$

and

$$\begin{aligned} C-\sum _{i=1}^m y_i A_i-S \end{aligned}$$

do not exceed \(\varepsilon \) and the duality gap satisfies \(Tr(XS)\le \varepsilon \).

As usual, we assume (P) and (D) have an optimal solution \((X^{*},y^{*},S^{*})\) with \(Tr(X^{*}S^{*})=0\), and we choose the initial iterate \((X^0,y^0,S^0)\) with

$$\begin{aligned} X^0=S^0=\zeta I, y^0=0, \mu ^0=\zeta ^2, \end{aligned}$$
(14)

where \(\zeta \) is a positive number such that

$$\begin{aligned} X^{*}+S^{*}\preceq \zeta I. \end{aligned}$$
(15)

3.1 The perturbed problems

We denote the initial values of the primal and dual residuals by \(r^0_b\) and \(R^0_C\), respectively:

$$\begin{aligned} \begin{array}{rl} (r^0_b)_i &{}=b_i-Tr(A_iX^0), i=1, \ldots , m,\\ R^0_C &{}=C-{\displaystyle \sum _{i=1}^m} y^0_i A_i-S^0. \end{array} \end{aligned}$$
(16)

For any \(\nu \) with \(0<\nu \le 1\), we consider the perturbed problem \((P_\nu )\)

$$\begin{aligned} min\{\langle C-\nu R^0_C, X \rangle : \langle A_i,X \rangle =b_i-\nu (r^0_b)_i, X\succeq 0\}, \end{aligned}$$

and its dual problem \((D_\nu )\)

$$\begin{aligned} max\left\{ {\displaystyle \sum _{i=1}^m}(b_i-\nu (r^0_b)_i)y_i: \sum _{i=1}^m y_i A_i+S=C-\nu R^0_C, S\succeq 0\right\} . \end{aligned}$$

Note that if \(\nu =1\), then \(X=X^0\) is a strictly feasible solution of \((P_\nu )\) and \((y,S)=(y^0,S^0)\) is a strictly feasible solution of \((D_\nu )\). Hence, if \(\nu =1\), then \((P_\nu )\) and \((D_\nu )\) are strictly feasible. More generally, we have the following lemma.

Lemma 3.1

[14, Lemma 4.1] Let the original problems (P) and (D) be feasible. Then for each \(\nu \) satisfying \(0<\nu \le 1\), the perturbed problems \((P_\nu )\) and \((D_\nu )\) are strictly feasible.

Assuming that (P) and (D) are both feasible, Lemma 3.1 implies that, for each \(\nu \) with \(0<\nu \le 1\), the problems \((P_\nu )\) and \((D_\nu )\) satisfy the IPC, and hence their central paths exist. This means that the system

$$\begin{aligned} \begin{array}{rl}\langle A_i,X\rangle &{}=b_i-\nu (r^0_b)_i, i=1,2,\ldots ,m, X\succeq 0,\\ {\displaystyle \sum _{i=1}^m} y_iA_i+S &{}=C-\nu R^0_C, S\succeq 0,\\ XS&{}=\mu {I} \end{array} \end{aligned}$$
(17)

has a unique solution for every \(\mu >0\). We denote it by \((X(\mu ,\nu ),y(\mu ,\nu ),S(\mu ,\nu ))\); it is the \(\mu \)-center of \((P_\nu )\) and \((D_\nu )\). In the sequel we always take \(\mu =\nu \zeta ^2\).

3.2 Search directions

3.2.1 Search directions for centering steps

We define

$$\begin{aligned} D^c_X:=\frac{1}{\sqrt{\mu }}D^{-1}\triangle ^c X D^{-1}, D^c_S:=\frac{1}{\sqrt{\mu }}D\triangle ^c S D. \end{aligned}$$
(18)

For the centering steps, following [19] and using Definitions 2.1 and 2.2, we replace the right-hand side \(V^{-1}-V\) of the third equation of system (13) by \(-\psi '(V)\); the scaled search direction \((D^c_X,\triangle ^c y,D^c_S)\) is then given by the system

$$\begin{aligned} \begin{array}{rl}\langle \bar{A_i},D^c_X\rangle &{}=0, i=1,2,\ldots ,m,\\ {\displaystyle \sum _{i=1}^m}\triangle ^c y_i \bar{A_i}+D^c_S&{}=0,\\ D^c_X +D^c_S&{}=D_V, \end{array} \end{aligned}$$
(19)

with \(D_V=-\psi '(V)\). The search directions \(\varDelta ^c X\) and \(\varDelta ^c S\) are computed by (18).
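As a minimal sketch of one way to solve (19) numerically, assume the scaled matrices \(\bar{A_i}\) and the scaling matrix D are available (e.g. from the nt_scaling sketch above). The helper name centering_direction is ours, and the elimination through an \(m\times m\) Schur complement shown below is a standard device, not necessarily the implementation intended by the authors.

import numpy as np

def centering_direction(Abar, V, D, mu):
    # Solve (19) with D_V = -psi'(V) = 2(I - V):
    # eliminate D^c_S = -sum_j dy_j Abar_j and D^c_X = D_V - D^c_S;
    # the first block of (19) then gives an m x m linear system for dy.
    n, m = V.shape[0], len(Abar)
    D_V = 2.0 * (np.eye(n) - V)
    M = np.array([[np.trace(Ai @ Aj) for Aj in Abar] for Ai in Abar])  # <Abar_i, Abar_j>
    r = np.array([np.trace(Ai @ D_V) for Ai in Abar])                  # <Abar_i, D_V>
    dy = np.linalg.solve(M, -r)
    DcS = -sum(dy[j] * Abar[j] for j in range(m))
    DcX = D_V - DcS
    # unscale as in (18)
    Dinv = np.linalg.inv(D)
    dX = np.sqrt(mu) * D @ DcX @ D
    dS = np.sqrt(mu) * Dinv @ DcS @ Dinv
    return dX, dy, dS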

3.2.2 Search directions for feasibility step

Following [8] the search direction \((\varDelta ^f X, \varDelta ^f y, \varDelta ^f S)\) is given by the following system:

$$\begin{aligned} \begin{array}{rl}\langle A_i,\triangle ^f X\rangle &{}=\theta \nu (r^0_b)_i, i=1,2,\ldots ,m,\\ {\displaystyle \sum _{i=1}^m} \triangle ^f y_iA_i+\triangle ^f S &{}=\theta \nu R^0_C,\\ \triangle ^f X + P\triangle ^f S P^T &{}= 0, \end{array}\end{aligned}$$
(20)

where \(\theta \) is a fixed barrier update parameter.

We define

$$\begin{aligned} D^f_X:=\frac{1}{\sqrt{\mu }}D^{-1}\triangle ^f X D^{-1}, D^f_S:=\frac{1}{\sqrt{\mu }}D\triangle ^f S D. \end{aligned}$$
(21)

Then the system (20) can be written as

$$\begin{aligned} \begin{array}{rl}\langle \bar{A_i},D^f_X\rangle &{}=\frac{1}{\sqrt{\mu }}\theta \nu (r^0_b)_i, i=1,2,\ldots ,m,\\ {\displaystyle \sum _{i=1}^m} \frac{1}{\sqrt{\mu }}\varDelta ^f y_i\bar{A_i}+D^f_S &{}=\frac{1}{\sqrt{\mu }}\theta \nu DR^0_CD,\\ D^f_X+D^f_S&{}=0. \end{array} \end{aligned}$$
(22)
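The same elimination applies to (22): the third equation gives \(D^f_S=-D^f_X\), and substituting the second equation into the first leads to an \(m\times m\) system for \(\triangle ^f y\). The following sketch (again with hypothetical helper names and under the same assumptions as above) returns \(\triangle ^f y\) and the scaled feasibility directions.

import numpy as np

def feasibility_direction(Abar, D, RC0, rb0, theta, nu, mu):
    # Solve (22): with G = D R^0_C D and D^f_S = -D^f_X,
    #   D^f_X = (sum_j dy_j Abar_j - theta*nu*G) / sqrt(mu),
    # and the first block of (22) yields
    #   sum_j <Abar_i, Abar_j> dy_j = theta*nu*((r^0_b)_i + <Abar_i, G>).
    m = len(Abar)
    G = D @ RC0 @ D
    M = np.array([[np.trace(Ai @ Aj) for Aj in Abar] for Ai in Abar])
    rhs = theta * nu * (np.asarray(rb0, dtype=float)
                        + np.array([np.trace(Ai @ G) for Ai in Abar]))
    dy = np.linalg.solve(M, rhs)
    DfX = (sum(dy[j] * Abar[j] for j in range(m)) - theta * nu * G) / np.sqrt(mu)
    DfS = -DfX
    return dy, DfX, DfS

The unscaled directions \(\triangle ^f X\) and \(\triangle ^f S\) are then recovered from (21), exactly as in the centering case.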

3.3 An iteration of the algorithm

Initially \(X^0=S^0=\zeta I\) and \(\mu ^0=\zeta ^2\), whence \(V^0=I\) and \(\varPsi (V^0)=0\). We assume that at the start of each iteration, just before a \(\mu \)-update, \(\varPsi (V)\) is smaller than or equal to a (small) threshold value \(\tau >0\). This obviously holds at the start of the first iteration.

Now we describe a main iteration of our algorithm. Suppose that for some \(\nu \in (0,1]\) the triple \((X,y,S)\) is strictly feasible for the perturbed problems \((P_\nu )\) and \((D_\nu )\), where \(\mu =\nu \zeta ^2\), and satisfies \(\varPsi (V)\le \tau \). We first perform a feasibility step to generate iterates \((X^f,y^f,S^f)\) which are strictly feasible for the perturbed problems \((P_{\nu ^+})\) and \((D_{\nu ^+})\), where \(\nu ^+=(1-\theta )\nu \) with \(\theta \in (0,1)\), and close to the \(\mu ^+\)-center of \((P_{\nu ^+})\) and \((D_{\nu ^+})\) with \(\mu ^+=\nu ^+\zeta ^2\), i.e., \(\varPsi (V^f)=\varPsi (X^f,S^f,\mu ^+)\le \tau ^f\) for some suitable value of \(\tau ^f\). After the feasibility step, we perform a few centering steps to obtain a strictly feasible triple \((X^+,y^+,S^+)\) of \((P_{\nu ^+})\) and \((D_{\nu ^+})\) such that \(\varPsi (X^+,S^+,\mu ^+)\le \tau \). The algorithm stops once the norms of the residuals and the duality gap are less than the accuracy parameter \(\varepsilon \).

We now summarize the generic full NT-step IIPM for SDO in Fig. 1.

Fig. 1 The algorithm
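Since the figure itself is not reproduced here, the following Python-style outline reconstructs the algorithm from the description in Sect. 3.3; it reuses the hypothetical sketches nt_scaling, centering_direction and feasibility_direction given earlier and is not the authors' code.

import numpy as np

def full_nt_iipm(A, b, C, zeta, theta=None, tau=1.0 / 16, eps=1e-3, max_iter=10000):
    # A: list of symmetric matrices A_i; b: right-hand side vector; C: cost matrix.
    n, m = C.shape[0], len(A)
    theta = 1.0 / (13 * n) if theta is None else theta
    X, y, S = zeta * np.eye(n), np.zeros(m), zeta * np.eye(n)       # (14)
    nu, mu = 1.0, zeta ** 2
    rb0 = np.array([b[i] - np.trace(A[i] @ X) for i in range(m)])   # (16)
    RC0 = C - sum(y[i] * A[i] for i in range(m)) - S                # (16)
    Psi = lambda V: float(np.sum((np.linalg.eigvalsh(V) - 1.0) ** 2))
    for _ in range(max_iter):
        if max(np.trace(X @ S), nu * np.linalg.norm(rb0), nu * np.linalg.norm(RC0)) <= eps:
            break
        # feasibility step (Sect. 3.2.2), taken at the current mu
        P, D, V = nt_scaling(X, S, mu)
        Abar = [D @ A[i] @ D for i in range(m)]
        dyf, DfX, DfS = feasibility_direction(Abar, D, RC0, rb0, theta, nu, mu)
        Dinv = np.linalg.inv(D)
        X = X + np.sqrt(mu) * D @ DfX @ D                           # (21) and (28)
        S = S + np.sqrt(mu) * Dinv @ DfS @ Dinv
        y = y + dyf
        # mu-update
        nu *= 1.0 - theta
        mu = nu * zeta ** 2
        # centering steps until Psi(V) <= tau (at most a few, cf. Sect. 5.5)
        P, D, V = nt_scaling(X, S, mu)
        while Psi(V) > tau:
            Abar = [D @ A[i] @ D for i in range(m)]
            dX, dy, dS = centering_direction(Abar, V, D, mu)
            X, y, S = X + dX, y + dy, S + dS
            P, D, V = nt_scaling(X, S, mu)
    return X, y, S

With theta = 1/(13n) this follows the worst-case parameter choice analyzed in Sect. 5; larger values of theta are used in practice (cf. Sect. 6).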

4 The properties of the barrier function

By \(\psi '(t)=2(t-1)\) we get \(D_V=-\psi '(V)=2(I-V)\). We also use the norm-based proximity measure \(\delta (V)\) defined by

$$\begin{aligned} \delta (V):=\frac{1}{2}\Vert \psi '(V)\Vert =\frac{1}{2}\sqrt{\sum \nolimits _{i=1}^{n}\left( \psi '(\lambda _i(V))\right) ^2}. \end{aligned}$$
(23)

By Lemma 2.1, we have

$$\begin{aligned} \varPsi (V)=0\Leftrightarrow \delta (V)=0\Leftrightarrow V=I. \end{aligned}$$
(24)

Lemma 4.1

\(\psi (t)=\psi '(t)^2/4\), and \(\varPsi (V)=\delta (V)^2\) with \(\delta (V)=\Vert D_V\Vert /2\).

Proof

Since \(\psi (t)=(t-1)^2\), it is straightforward that \(\psi (t)=\psi '(t)^2/4\). By this identity, Definition 2.2 and (23), we obtain

$$\begin{aligned} \varPsi (V)=Tr(\psi (V))=\sum _{i=1}^n\psi (\lambda _i(V))=\frac{1}{4}\sum _{i=1}^{n}\left( \psi '(\lambda _i(V))\right) ^2=\delta (V)^2. \end{aligned}$$

\(\square \)

In order to discuss the property of exponential convexity, we need the following lemma, which is a direct consequence of conclusion (d) of Theorem 3.3.14 in [7, pp. 176–177]. We cite it here without proof; see [7] for more details.

Lemma 4.2

Let \(M,N\in S^n\) be two nonsingular matrices and f(t) a real-valued function such that \(f(e^t)\) is a convex function. Then

$$\begin{aligned} \sum _{i=1}^nf(\sigma _i(MN))\le \sum _{i=1}^nf(\sigma _i(M)\sigma _i(N)), \end{aligned}$$

where \(\sigma _i(M),\sigma _i(N)\) and \(\sigma _i(MN),i=1,2,\ldots ,n\), denote the singular values of M, N and MN, respectively, arranged in non-increasing order.

Lemma 4.3

Let \(t_{1}\ge 1/2\) and \(t_{2}\ge 1/2\). Then

$$\begin{aligned} \psi (\sqrt{t_{1}t_{2}})\le (\psi (t_{1})+\psi (t_{2}))/2. \end{aligned}$$

Proof

A direct calculation shows that the difference between the right- and left-hand sides equals \(\frac{1}{2}(\sqrt{t_{1}}-\sqrt{t_{2}})^2\left( (\sqrt{t_{1}}+\sqrt{t_{2}})^2-2\right) \), so the inequality is equivalent to \(\sqrt{t_{1}}+\sqrt{t_{2}}\ge \sqrt{2}\), which holds whenever \(t_1,t_2\ge 1/2\). \(\square \)

Similarly as in [3, 4], we say that \(\psi (t)\) is exponentially convex (e-convex) whenever \(t \ge 1/2\). It is easily verified that \(\psi (e^t)\) is a convex function, so Lemma 4.2 can be applied to \(\psi (t)\).

Lemma 4.4

Suppose that the matrices \(V_1,V_2\) are symmetric positive definite with \(\lambda _{min}(V_1)\ge 1/2\) and \(\lambda _{min}(V_2)\ge 1/2\). Then

$$\begin{aligned} \varPsi \left( [V_1^{\frac{1}{2}}V_2V_1^{\frac{1}{2}}]^\frac{1}{2}\right) \le \frac{1}{2} \left( \varPsi (V_{1})+\varPsi (V_{2})\right) . \end{aligned}$$

Proof

By the definition of the singular values of a matrix, for any nonsingular matrix \(V\in S^n\),

$$\begin{aligned} \sigma _i(V)=\left( \lambda _i(V^{T}V)\right) ^{\frac{1}{2}}=\left( \lambda _i(VV^{T})\right) ^{\frac{1}{2}}, i=1,2,\ldots ,n. \end{aligned}$$

Hence,

$$\begin{aligned} \sigma _i\left( V_1^{\frac{1}{2}}V_2^{\frac{1}{2}}\right) =\left( \lambda _i(V_1^{\frac{1}{2}}V_2V_1^{\frac{1}{2}})\right) ^{\frac{1}{2}} =\lambda _i\left( [V_1^{\frac{1}{2}}V_2V_1^{\frac{1}{2}}]^\frac{1}{2}\right) ,i=1,2,\ldots ,n. \end{aligned}$$

Since \(V_1,V_2\) are symmetric positive definite, we have

$$\begin{aligned} \sigma _i(V_1)=\lambda _i(V_1), \sigma _i(V_2)=\lambda _i(V_2),i=1,2,\ldots ,n. \end{aligned}$$

By the Definition 2.2, Lemmas 4.2 and 4.3, we obtain

$$\begin{aligned} \begin{aligned} \varPsi \left( [V_1^{\frac{1}{2}}V_2V_1^{\frac{1}{2}}]^\frac{1}{2}\right)&=\sum _{i=1}^n\psi \left( \sigma _i(V_1^{\frac{1}{2}}V_2^{\frac{1}{2}})\right) \\ {}&\le \sum _{i=1}^n\psi \left( \sigma _i(V_1^{\frac{1}{2}})\sigma _i (V_2^{\frac{1}{2}})\right) \\ {}&\le \frac{1}{2}\sum _{i=1}^n\left( \psi \left( \sigma _i^2(V_1^{\frac{1}{2}})\right) +\psi \left( \sigma _i^2(V_2^{\frac{1}{2}})\right) \right) \\ {}&=\frac{1}{2}\sum _{i=1}^n\left( \psi (\sigma _i(V_1))+\psi (\sigma _i(V_2))\right) \\ {}&=\frac{1}{2}\left( \varPsi (V_1)+\varPsi (V_2)\right) . \end{aligned} \end{aligned}$$

\(\square \)
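A quick numerical illustration of Lemma 4.4, using the hypothetical sym_sqrt and Psi helpers from the earlier sketches: for random positive definite \(V_1,V_2\) whose smallest eigenvalues are at least 1/2, the left-hand side should not exceed the right-hand side.

import numpy as np

rng = np.random.default_rng(0)

def random_pd(n):
    # random symmetric positive definite matrix with all eigenvalues >= 1/2
    B = rng.standard_normal((n, n))
    return B @ B.T + 0.5 * np.eye(n)

V1, V2 = random_pd(4), random_pd(4)
V1h = sym_sqrt(V1)
lhs = Psi(sym_sqrt(V1h @ V2 @ V1h))
rhs = 0.5 * (Psi(V1) + Psi(V2))
print(lhs <= rhs)   # expected: True, by Lemma 4.4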

Lemma 4.5

\(\Vert V\Vert \le \sqrt{n}+\delta (V).\)

Proof

It follows from \(V=I-D_V/2\) that \(\Vert V\Vert \le \Vert I\Vert +\Vert D_V\Vert /2=\sqrt{n}+\delta (V)\). \(\square \)

Lemma 4.6

\(1-\delta (V)\le \lambda _i(V) \le 1+\delta (V), 1\le i \le n.\)

Proof

By Lemma 4.1 we have \(\delta (V)=\sqrt{\varPsi (V)}=\sqrt{\sum _{i=1}^{n}(1-\lambda _i(V))^2}\), then \(|1-\lambda _i(V)|\le \delta (V)\), so the lemma follows. \(\square \)

The following lemma shows the effect of a \(\mu \)-update on the value of \(\varPsi (V)\).

Lemma 4.7

Let \(0<\theta <1\). Then, for any positive definite matrix V,

$$\begin{aligned} \varPsi \left( \frac{V}{\sqrt{1-\theta }}\right) \le \varPsi (V)+\frac{\theta }{1-\theta }\Vert V\Vert ^2. \end{aligned}$$
(25)

Proof

Let \(\beta \ge 1\) and define \(v_i:=\lambda _i(V), 1\le i\le n\); then \(v>0\). From the definition of \(\varPsi (V)\), we have

$$\begin{aligned} \varPsi (\beta V)=\sum _{i=1}^n\psi (\beta \lambda _i(V))=\sum _{i=1}^n\psi (\beta v_i)=\varPsi (\beta v). \end{aligned}$$

By the definition of \(\psi (t)\) in (1),

$$\begin{aligned} \psi (\beta t)=(t-1)^2+(\beta ^2-1)t^2+2(1-\beta )t\le \psi (t)+(\beta ^2-1)t^2. \end{aligned}$$
(26)

Let \(\beta =1/\sqrt{1-\theta }\), then

$$\begin{aligned} \begin{aligned} \varPsi \left( \frac{V}{\sqrt{1-\theta }}\right) =\varPsi \left( \frac{v}{\sqrt{1-\theta }}\right)&\le \varPsi (v)+\frac{\theta }{1-\theta }\Vert v\Vert ^2 \\ {}&=\sum _{i=1}^n\psi (v_i)+\frac{\theta }{1-\theta }\sum _{i=1}^n v_i^2 \\ {}&=\sum _{i=1}^n\psi (\lambda _i(V))+\frac{\theta }{1-\theta }\sum _{i=1}^n (\lambda _i(V))^2 \\ {}&=\varPsi (V)+\frac{\theta }{1-\theta }\Vert V\Vert ^2. \end{aligned} \end{aligned}$$

\(\square \)

5 Analysis of the algorithm

5.1 Analysis of the centering step

After a centering step, the new iterates are given by

$$\begin{aligned} X^+:=X+\varDelta ^c X, y^+:=y+\varDelta ^c y, S^+:=S+\varDelta ^c S. \end{aligned}$$

The new V-matrix is

$$\begin{aligned} (V^+)^2=\frac{1}{\mu }D^{-1}X^+S^+D. \end{aligned}$$
(27)

The following results are crucial in the analysis of the algorithm. We list them without proof.

Lemma 5.1

[29, Lemma 4.2] The new iterates \((X^+,y^+,S^+)\) are strictly feasible if \(\varPsi (V)<1\).

Theorem 5.1

[29, Theorem 4.5] If \(\varPsi (V)<1\), then \(\varPsi (V^+)<\varPsi (V)^2\).

Theorem 5.1 implies that the centering step is quadratically convergent.

5.2 Analysis of the feasibility step

After a feasibility step, the new iterates are given by

$$\begin{aligned} X^f:=X+\triangle ^f X, y^f:=y+\triangle ^f y, S^f:=S+\triangle ^f S. \end{aligned}$$
(28)

By (20) and (28), it is easy to show that the new iterates are feasible for the new perturbed problems. A key point in the analysis is to show that \(X^f\) and \(S^f\) are positive definite and satisfy \(\varPsi (V^f)\le \tau ^f\) with \(\tau ^f<1\).

By using (21), we obtain

$$\begin{aligned} \triangle ^f X=\sqrt{\mu }DD^f_XD,\triangle ^f S=\sqrt{\mu }D^{-1}D^f_SD^{-1}. \end{aligned}$$

It follows from (28) that

$$\begin{aligned} \begin{array}{rl} X^f&{}=X+\triangle ^f X=\sqrt{\mu }D\left( V+D^f_X\right) D,\\ S^f&{}=S+\triangle ^f S=\sqrt{\mu }D^{-1}\left( V+D^f_S\right) D^{-1}. \end{array} \end{aligned}$$
(29)

Therefore

$$\begin{aligned} X^fS^f=\mu D\left( V+D^f_X\right) \left( V+D^f_S\right) D^{-1}, \end{aligned}$$
(30)

which implies that

$$\begin{aligned} X^fS^f\sim \mu \left( V+D^f_X\right) \left( V+D^f_S\right) . \end{aligned}$$
(31)

Let

$$\begin{aligned} \begin{array}{rl} D^f_{XS}:&{}=\frac{1}{2}\left( D^f_XD^f_S+D^f_SD^f_X\right) ,\\ M:&{}=\left( D^f_XV-VD^f_X\right) +\frac{1}{2}\left( D^f_XD^f_S-D^f_SD^f_X\right) . \end{array} \end{aligned}$$
(32)

Note that \(D^f_{XS}\) is a symmetric matrix and M a skew-symmetric matrix. From the third equation in system (22), by multiplying both sides from the left with V, we get

$$\begin{aligned} VD^f_X+VD^f_S=0. \end{aligned}$$
(33)

Therefore

$$\begin{aligned} \begin{aligned} \frac{X^fS^f}{\mu }\sim&\left( V+D^f_X\right) \left( V+D^f_S\right) \\=&\, V^2+VD^f_S+D^f_XV+D^f_XD^f_S \\=&\,V^2+\frac{1}{2}\left( D^f_XD^f_S+D^f_SD^f_X\right) +\left( D^f_XV-VD^f_X\right) \\ {}&+\frac{1}{2}\left( D^f_XD^f_S-D^f_SD^f_X\right) \\=&\, V^2+D^f_{XS}+M. \end{aligned} \end{aligned}$$
(34)

The following lemma shows the sufficient condition of the strict feasibility of the new iterates \((X^f,y^f,S^f)\).

Lemma 5.2

Let \(X\succ 0\) and \(S\succ 0\). Then the iterates \((X^f,y^f,S^f)\) are strictly feasible if

$$\begin{aligned} \lambda _{min}(V)>\left| \lambda _i\left( D^f_X\right) \right| , 1 \le i\le n. \end{aligned}$$

Proof

By Lemma 10 in [8], \((X^f,y^f,S^f)\) are strictly feasible if \(V^2+D^f_{XS}\succ 0\). Using \(D^f_S=-D^f_X\) we have \(\lambda _{i}\left( D^f_{XS}\right) =-\lambda ^2_{i}\left( D^f_X\right) \), therefore \(V^2+D^f_{XS}\succ 0\) if \(\lambda _{min}(V)>\left| \lambda _i\left( D^f_X\right) \right| , 1 \le i\le n.\) This completes the proof. \(\square \)

Before \(\mu \)-update, by (10), the new V-matrix is given by

$$\begin{aligned} \hat{V}=\frac{1}{\sqrt{\mu }}\left( D^{-1}X^fS^fD\right) ^{\frac{1}{2}}. \end{aligned}$$
(35)

Now we want to find an upper bound for \(\varPsi (\hat{V})\) by applying exponential convexity, so we assume that the eigenvalues of the matrices \(V+D^f_X\) and \(V+D^f_S\) are at least 1/2, i.e.,

$$\begin{aligned} \lambda _i\left( V+D^f_X\right) \ge \frac{1}{2} \ \text{ and } \ \lambda _i\left( V+D^f_S\right) \ge \frac{1}{2},\quad 1\le i\le n. \end{aligned}$$
(36)

It is obvious that (36) holds if

$$\begin{aligned} \lambda _{min}(V)-\left| \lambda _i\left( D^f_X\right) \right| \ge \frac{1}{2}, 1\le i \le n. \end{aligned}$$
(37)

By (35) we have \(\hat{V}^2\sim \frac{1}{\mu }X^fS^f\sim \frac{1}{\mu }\left( X^f\right) ^{\frac{1}{2}}S^f\left( X^f\right) ^{\frac{1}{2}}\); hence, by (31),

$$\begin{aligned} \hat{V}^2\sim \bar{V}^2:=\left( V+D^f_X\right) ^{\frac{1}{2}}\left( V+D^f_S\right) \left( V+D^f_X\right) ^{\frac{1}{2}}. \end{aligned}$$
(38)

Lemma 5.3

\(\varPsi (\hat{V})\le \varPsi (V)+\left\| D^f_X\right\| ^2.\)

Proof

Using (38), the definitions of \(\psi (V)\) and \(\varPsi (V)\), Lemma 4.4 and \(D^f_S=-D^f_X\), we have

$$\begin{aligned} \begin{aligned} \varPsi (\hat{V})=\varPsi (\bar{V})&=\varPsi \left( \left( (V+D^f_X)^{\frac{1}{2}}(V+D^f_S)(V+D^f_X)^{\frac{1}{2}}\right) ^{\frac{1}{2}}\right) \\ {}&\le \frac{1}{2}\left( \varPsi (V+D^f_X)+\varPsi (V+D^f_S)\right) \\ {}&=\frac{1}{2}\left( Tr\left( \psi (V+D^f_X)\right) +Tr\left( \psi (V+D^f_S)\right) \right) \\ {}&=\frac{1}{2}Tr\left( \psi (V+D^f_X)+\psi (V+D^f_S)\right) \\ {}&=\frac{1}{2}Tr\Big (2(V-I)^2+(D^f_X+D^f_S)(V-I) \\ {}&+(V-I)(D^f_X+D^f_S)+(D^f_X)^2+(D^f_S)^2\Big ) \\ {}&=\frac{1}{2}Tr\left( 2(V-I)^2+2(D^f_X)^2\right) \\ {}&=Tr\left( (V-I)^2\right) +Tr\left( (D^f_X)^2\right) \\ {}&=Tr(\psi (V))+Tr\left( (D^f_X)^2\right) \\ {}&=\varPsi (V)+\left\| D^f_X\right\| ^2. \end{aligned} \end{aligned}$$

\(\square \)

5.3 Upper bound for \(\left\| D^f_X\right\| \)

By an argument similar to that in [8, Section 5], we have

$$\begin{aligned} \left\| D^f_X\right\| \le \frac{\theta }{\zeta \lambda _{min}(V)}Tr(X+S). \end{aligned}$$
(39)

Next, we denote \(\delta (V)\) simply as \(\delta \).

Lemma 5.4

$$\begin{aligned} \left\| D^f_X\right\| \le \frac{(\sqrt{n}+\delta )^2+n}{1-\delta }\theta . \end{aligned}$$
(40)

Proof

In [14, Lemma 5.15], it is shown that

$$\begin{aligned} \nu \zeta Tr(X+S)\le \langle X,S \rangle +\nu n\zeta ^2. \end{aligned}$$
(41)

By \(XS\sim \mu V^2\) and Lemma 4.5 we have

$$\begin{aligned} Tr(XS)=\mu \Vert V\Vert ^2\le \nu \zeta ^2(\sqrt{n}+\delta )^2. \end{aligned}$$
(42)

Combining (41) and (42) gives \(Tr(X+S)\le \zeta \left( (\sqrt{n}+\delta )^2+n\right) \). The lemma then follows from (39) and Lemma 4.6. \(\square \)

By (37) and Lemma 4.6, e-convexity holds if \(\left| \lambda _i(D^f_X)\right| \le \left\| D^f_X\right\| \le 1/2-\delta , 1\le i \le n \). Using Lemma 5.4, this is guaranteed whenever

$$\begin{aligned} \theta \le \frac{(1/2-\delta )(1-\delta )}{(\sqrt{n}+\delta )^2+n}. \end{aligned}$$
(43)

Note that the right-hand side of the above inequality is monotonically decreasing with respect to \(\delta \).
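For concreteness, the admissible values of \(\theta \) allowed by (43) are easy to evaluate; the short snippet below (our own check, not from the paper) prints the bound for \(\delta =1/4\), the largest value relevant in the next subsection, next to \(1/(13n)\).

def theta_bound(n, delta):
    # right-hand side of (43)
    return (0.5 - delta) * (1.0 - delta) / ((n ** 0.5 + delta) ** 2 + n)

for n in (2, 10, 100):
    print(n, theta_bound(n, 0.25), 1.0 / (13 * n))
# e.g. for n = 10 the bound is about 8.7e-3, while 1/(13n) is about 7.7e-3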

5.4 The choice of \(\tau \), \(\theta \) and \(\tau ^f\)

Theorem 5.2

Let \(\mu ^+=(1-\theta )\mu \) with \(0<\theta < 1\), and let \(V^f=\frac{\hat{V}}{\sqrt{1-\theta }}\). Then

$$\begin{aligned} \varPsi (V^f)\le \frac{1}{1-\theta }\left( \varPsi (V)+2\theta \sqrt{n\varPsi (V)}+\theta n+(1-2\theta )/4\right) . \end{aligned}$$
(44)

In particular, if \(\varPsi (V)\le \tau =\frac{1}{16}\) and \(\theta =\frac{1}{13n}\), then

$$\begin{aligned} \varPsi (V^f)\le \tau ^f=\frac{1}{2}. \end{aligned}$$

Proof

Before \(\mu \)-update, by Lemma 5.3 we have \(\varPsi (\hat{V})\le \varPsi (V)+\left\| D^f_X\right\| ^2.\) From (34) and (35),

$$\begin{aligned} \hat{V}^2\sim \left( V^2+D^f_{XS}+M\right) . \end{aligned}$$
(45)

Then

$$\begin{aligned} \Vert \hat{V}\Vert ^2=Tr\left( \hat{V}^2\right) =Tr\left( V^2+D^f_{XS}+M\right) . \end{aligned}$$
(46)

Since M is skew-symmetric, and \(D^f_S=-D^f_X\), we obtain

$$\begin{aligned} \Vert \hat{V}\Vert ^2=Tr\left( V^2\right) +Tr\left( D^f_{XS}\right) =\Vert V\Vert ^2-Tr\left( (D^f_X)^2\right) =\Vert V\Vert ^2-\left\| D^f_X\right\| ^2. \end{aligned}$$
(47)

After \(\mu \)-update, by Lemma 4.7 we get

$$\begin{aligned} \begin{aligned} \varPsi (V^f)&\le \varPsi (\hat{V})+\frac{\theta }{1-\theta }\Vert \hat{V}\Vert ^2 \\ {}&\le \varPsi (V)+\left\| D^f_X\right\| ^2+\frac{\theta }{1-\theta }\left( \Vert V\Vert ^2-\left\| D^f_X\right\| ^2\right) . \end{aligned} \end{aligned}$$
(48)

Since \(\left\| D^f_X\right\| ^2\le (1/2-\delta )^2\le 1/4\), the inequality in (44) follows by Lemmas 4.1 and 4.5. If \(\varPsi (V)\le \tau =1/16\), then \(\delta (V)\le 1/4\) by Lemma 4.1; hence, taking \(\theta =1/(13n)\), the inequality (43) holds. Then \(\varPsi (V^f)\le \tau ^f=1/2\) follows from (44) by a simple calculation. \(\square \)
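The particular choice \(\tau =1/16\) and \(\theta =1/(13n)\) can also be checked numerically against (44); the snippet below (our own verification, not a proof) evaluates the right-hand side of (44) for several values of n.

import numpy as np

def rhs_44(n, Psi_V=1.0 / 16):
    theta = 1.0 / (13 * n)
    return (Psi_V + 2 * theta * np.sqrt(n * Psi_V) + theta * n
            + (1 - 2 * theta) / 4) / (1 - theta)

for n in (1, 2, 10, 100, 1000):
    print(n, rhs_44(n))   # all printed values stay below tau^f = 1/2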

5.5 Complexity analysis

After the feasibility step, we have derived an upper bound for \(\varPsi (V^f)\), namely \(\varPsi (V^f)\le 1/2\). By the quadratic convergence of the centering step (Theorem 5.1), at most

$$\begin{aligned} \log _2(\log _2 1/\tau )=\log _2(\log _2 16)=2 \end{aligned}$$

centering steps suffice to get iterates \((X^+,y^+,S^+)\) that satisfy \(\varPsi (X^+,S^+;\mu ^+)\le 1/16\). So each main iteration consists of at most 3 so-called inner iterations, in each of which we need to compute a new search direction.

In each main iteration both the duality gap and the norms of the residuals are reduced by the factor \(1-\theta \). Hence, using \(Tr(X^0S^0)=n\zeta ^2\), the total number of main iterations is bounded above by

$$\begin{aligned} \frac{1}{\theta }\log \frac{max\{n\zeta ^2,\Vert r^0_b\Vert ,\Vert R^0_C\Vert \}}{\varepsilon }. \end{aligned}$$

Since

$$\begin{aligned} \theta =\frac{1}{13n}, \end{aligned}$$

the total number of inner iterations is bounded above by

$$\begin{aligned} 39n\log \frac{max\{n\zeta ^2,\Vert r^0_b\Vert ,\Vert R^0_C\Vert \}}{\varepsilon }. \end{aligned}$$

Now we state our main result without further proof.

Theorem 5.3

If (P) and (D) have an optimal solution \((X^*,y^*,S^*)\) such that \(X^*+S^*\preceq \zeta I\), then after at most

$$\begin{aligned} 39n\log \frac{max\{n\zeta ^2,\Vert r_b^0\Vert ,\Vert R^0_C\Vert \}}{\varepsilon } \end{aligned}$$

iterations, the algorithm finds an \(\varepsilon \)-optimal solution of (P) and (D).
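The bound of Theorem 5.3 is straightforward to evaluate for a given instance; the small helper below (ours, for illustration only) computes the bounds on the number of main and inner iterations from n, \(\zeta \), the initial residual norms and \(\varepsilon \).

import numpy as np

def iteration_bound(n, zeta, norm_rb0, norm_RC0, eps):
    # theta = 1/(13n): at most 13n log(.) main iterations,
    # each with at most 3 inner iterations, cf. Sect. 5.5
    main = 13 * n * np.log(max(n * zeta ** 2, norm_rb0, norm_RC0) / eps)
    return int(np.ceil(main)), int(np.ceil(3 * main))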

6 Numerical results

In this section we consider an application example and give some numerical results.

Maximum eigenvalue minimization. Suppose that the symmetric matrix A(x) depends affinely on \(x\in \mathbb {R}^{k}\): \(A(x)=A_0+x_1A_1+\cdots +x_kA_k\), where \(A_i\in S^{n}\). The problem of minimizing the maximum eigenvalue of the matrix A(x) can be cast as the semidefinite program

$$\begin{aligned} \quad \min \quad t,\quad s.t.\quad tI-A(x)\succeq 0, \end{aligned}$$
(49)

with variables \(x\in \mathbb {R}^{k}\) and \(t\in \mathbb {R}\). Problems of this type arise in control theory, structural optimization, graph theory and combinatorial optimization, and other fields (see [24] for details).

Example 1

[1, Example 14.1] Find scalars \(y_1,y_2,\) and \(y_3\) such that the maximum eigenvalue of \(F=A_0+y_1A_1+y_2A_2+y_3A_3\) with

$$\begin{aligned} A_0= & {} \left[ \begin{matrix} 2&{} -0.5&{} -0.6\\ -0.5&{} 2&{} 0.4\\ -0.6&{} 0.4&{} 3\\ \end{matrix} \right] , A_1=\left[ \begin{matrix} 0&{} 1&{} 0\\ 1&{} 0&{} 0\\ 0&{} 0&{} 0\\ \end{matrix} \right] \\ A_2= & {} \left[ \begin{matrix} 0&{} 0&{} 1\\ 0&{} 0&{} 0\\ 1&{} 0&{} 0\\ \end{matrix} \right] , A_3=\left[ \begin{matrix} 0&{} 0&{} 0\\ 0&{} 0&{} 1\\ 0&{} 1&{} 0\\ \end{matrix} \right] \end{aligned}$$

is minimized.

This problem can be formulated as an SDO problem in the standard dual form (5):

$$\begin{aligned} \quad \max b^Ty,\quad s.t.~\sum _{i=1}^4y_iA_i+S=C,~S\succeq 0, \end{aligned}$$
(50)

where \(b=[0,0,0,1]^T\), \(y=[y_1,y_2,y_3,y_4]^T\), \(C=-A_0\) and \(A_4=I\). By (49), we observe that at optimality \(-y_4\) is the maximum eigenvalue of the matrix F.

The initial point is \((X^0,y^0,S^0)=(\zeta I, 0,\zeta I)\) with \(\zeta =1\). We set \(\tau =\frac{1}{16}\), \(\theta =\frac{1}{13n}\) and \(\varepsilon =10^{-3}\). The \(\varepsilon \)-optimal primal-dual solution computed by the algorithm is given by

$$\begin{aligned} X^*= & {} \left[ \begin{matrix} 0.0009&{} -0.0000&{} -0.0000\\ -0.0000&{} 0.0009&{} 0.0000\\ -0.0000&{} 0.0000&{} 1.0000\\ \end{matrix} \right] , y^*=\left[ \begin{matrix} 0.4996\\ 0.5995\\ -0.3996\\ -2.9973\\ \end{matrix} \right] , \\ S^*= & {} \left[ \begin{matrix} 1.0000&{} 0.0000&{} 0.0000\\ 0.0000&{} 1.0000&{} -0.0000\\ 0.0000&{} -0.0000&{} 0.0009\\ \end{matrix} \right] . \end{aligned}$$

The algorithm reaches this solution in 271 iterations, taking 2.0836 seconds. It should be emphasized that the default value \(\theta =\frac{1}{13n}\) is justified only by the worst-case analysis. If a larger \(\theta \) is taken, for example \(\theta =\frac{1}{4n}\), the algorithm needs just 80 iterations, taking 0.5817 seconds.

By using the first three components of \(y^*\), that is \(y_1,y_2,\) and \(y_3\), the maximum eigenvalue of \(F=A_0+y_1A_1+y_2A_2+y_3A_3\) is found to be 2.9973.
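The reported solution can be checked independently: building F from the first three components of \(y^*\) and computing its largest eigenvalue with NumPy gives a value close to the reported 2.9973; the small discrepancy reflects the stopping tolerance \(\varepsilon \).

import numpy as np

A0 = np.array([[ 2.0, -0.5, -0.6],
               [-0.5,  2.0,  0.4],
               [-0.6,  0.4,  3.0]])
A1 = np.array([[0., 1., 0.], [1., 0., 0.], [0., 0., 0.]])
A2 = np.array([[0., 0., 1.], [0., 0., 0.], [1., 0., 0.]])
A3 = np.array([[0., 0., 0.], [0., 0., 1.], [0., 1., 0.]])

y_star = [0.4996, 0.5995, -0.3996]          # first three components of y* above
F = A0 + y_star[0] * A1 + y_star[1] * A2 + y_star[2] * A3
print(np.linalg.eigvalsh(F).max())          # largest eigenvalue of F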

7 Concluding remarks

Based on the exponential convexity of a simple kernel function, we propose a new complexity analysis of a full NT-step infeasible interior-point algorithm for SDO. The iteration bound of the algorithm coincides with the best known iteration bound for IIPMs. Moreover, the resulting analysis is relatively simple compared to those in [8, 11]. However, from a practical perspective, the full NT-step IIPM may not be very efficient. Recently, by applying the properties of the kernel-function-based barrier function discussed in [18], Asadi and Roos [2] attempted to design a large-update version of Roos' algorithm [21] for LO problems. In future work, we may focus on designing a large-update IIPM involving this kernel function for SDO problems.