1 Introduction

Symmetric optimization (SCO) is the problem of minimizing a linear function over the intersection of an affine subspace and a symmetric cone. Symmetric cones are intimately related to Euclidean Jordan algebras (Faraut and Korányi 1994), and these algebras provide us with a basic toolbox for our analysis. There is an extensive literature on the analysis of interior-point methods (IPMs) for SCO. The basic idea of solving SCO by IPMs is due to Nesterov and Nemirovski (1994); their methods were primarily either primal-based or dual-based. Later on, Nesterov and Todd (1997) proposed interior-point algorithms for a special class of cones called self-scaled cones. The first work connecting Jordan algebras and optimization is due to Güler (1996), who observed that the family of self-scaled cones is identical to the set of symmetric cones, for which a complete classification theory exists. Faybusovich (1997) first extended primal-dual IPMs for semidefinite optimization (SDO) to SCO by using Euclidean Jordan algebras. Muramatsu (2002) presented a commutative class of search directions for SCO and analyzed the complexities of primal-dual IPMs for SCO. Rangarajan (2006) proved the polynomial-time convergence of infeasible IPMs (IIPMs) for conic programming over symmetric cones using a wide neighborhood of the central path for a commutative family of search directions. Schmieta and Alizadeh (2003) studied primal-dual IPMs for SCO extensively within the framework of Euclidean Jordan algebras. By constructing strictly feasible iterates for a sequence of perturbed problems of the given problem and its dual, Roos (2006) introduced a full-Newton step primal-dual IIPM for linear optimization (LO) based on the classical directions. Kheirfam (2011) extended Roos’ algorithm for LO to SDO based on a special self-regular proximity. Gu et al. (2011) extended this algorithm to SCO by using Euclidean Jordan algebras.
Darvay (2003) proposed a full-Newton step primal-dual path-following interior-point algorithm for LO. The search direction of his algorithm is obtained by applying an algebraically equivalent transformation to the centering equation that defines the central path, and then applying Newton’s method to the resulting system of equations. Recently, Wang and Bai (2012) generalized Darvay’s full-Newton step primal-dual path-following interior-point algorithm for LO to SCO by using Euclidean Jordan algebras.

In this paper, by combining Darvay’s and Roos’ works, we propose a full Nesterov–Todd (NT) step primal-dual infeasible interior-point algorithm for SCO based on Euclidean Jordan algebras. At each iteration we use only full NT-steps, which have the advantage that no line searches are needed. By constructing strictly feasible iterates for a sequence of perturbed problems of the given problem and its dual, our algorithm decreases the duality gap and the feasibility residuals at the same rate. The order of the resulting iteration bound coincides with the currently best known iteration bound for SCO.

The paper is organized as follows: In Sect. 2, we briefly recall the theory of Euclidean Jordan algebras and their associated symmetric cones. In Sect. 3, after briefly reviewing the concept of the central path for SCO, we recall Darvay’s technique for SCO, the search directions, and the quadratic convergence result. The primal-dual infeasible interior-point algorithm for SCO is presented in Sect. 4. In Sect. 5, we analyze the algorithm and derive the currently best known iteration bound for IIPMs. Finally, some conclusions and remarks follow in Sect. 6.

2 Euclidean Jordan algebras and symmetric cones

In this section, we introduce Jordan algebras and symmetric cones as well as some of their basic properties, closely following (Schmieta and Alizadeh 2003). These properties serve as our toolbox for the analysis of IIPMs. For a comprehensive study of Jordan algebras, the reader is referred to Faraut and Korányi (1994).

A Jordan algebra \(\mathcal{J}\) is a finite-dimensional vector space endowed with a bilinear map \(\circ: \mathcal{J}\times \mathcal{J}\rightarrow \mathcal{J}\) such that, for all \(x, y\in \mathcal{J}\), \(x\circ y= y\circ x\) and \(x\circ(x^{2}\circ y)=x^{2}\circ(x\circ y)\), where \(x^{2}=x\circ x\). A Jordan algebra \(( \mathcal{J}, \circ)\) is called Euclidean if there exists a symmetric positive definite quadratic form \(Q\) on \(\mathcal{J}\) such that \(Q(x\circ y,z)=Q(x,y\circ z)\). A Jordan algebra has an identity element if there exists a (necessarily unique) element \(e\in \mathcal{J}\) such that \(x\circ e=e\circ x=x\) for all \(x\in \mathcal{J}\). The set \(\mathcal{K}(\mathcal{J}):=\{x^{2}:~x\in \mathcal{J}\}\) is called the cone of squares of the Euclidean Jordan algebra \((\mathcal{J}, \circ, \langle \cdot, \cdot \rangle)\). A cone is symmetric if and only if it is the cone of squares of some Euclidean Jordan algebra (Faraut and Korányi 1994).
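A canonical example is the space of real symmetric matrices with the product \(X\circ Y=(XY+YX)/2\), whose cone of squares is the positive semidefinite cone. The following NumPy sketch (an illustration added here, not part of the original text) checks the defining properties numerically:

```python
import numpy as np

def jordan(X, Y):
    """Jordan product of symmetric matrices: X∘Y = (XY + YX)/2."""
    return (X @ Y + Y @ X) / 2

rng = np.random.default_rng(0)
B, C = rng.standard_normal((2, 4, 4))
X, Y = (B + B.T) / 2, (C + C.T) / 2      # two random symmetric elements
X2 = jordan(X, X)                        # x^2 = x∘x

assert np.allclose(jordan(X, Y), jordan(Y, X))        # commutativity: x∘y = y∘x
assert np.allclose(jordan(X, jordan(X2, Y)),
                   jordan(X2, jordan(X, Y)))          # Jordan identity
assert np.allclose(jordan(X, np.eye(4)), X)           # e is the identity matrix
```

Note that the Jordan product is commutative but not associative in general, which is why the Jordan identity is imposed instead of associativity.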

An element \(c\in \mathcal{J}\) is idempotent if \(c\circ c=c\). Two elements \(x\) and \(y\) are orthogonal if \(x\circ y=0\). An idempotent \(c\) is primitive if it is nonzero and cannot be written as the sum of two nonzero idempotents. A set of primitive idempotents \(\{c_1,c_2,\ldots,c_k\}\) is called a Jordan frame if \(c_i\circ c_j=0\) for all \(i\neq j\in\{1,2,\ldots,k\}\) and \(\sum_{i=1}^{k} c_{i}= e\). For any \(x\in \mathcal{J}\), let \(r\) be the smallest positive integer such that \(\{e,x,x^{2},\ldots,x^{r}\}\) is linearly dependent; \(r\) is called the degree of \(x\) and is denoted by \(\deg(x)\). The rank of \(\mathcal{J}\), denoted by \(\operatorname{rank}(\mathcal{J})\), is defined as the maximum of \(\deg(x)\) over all \(x\in \mathcal{J}\).

Theorem 1

(Theorem III.1.2 in Faraut and Korányi 1994)

Let \((\mathcal{J}, \circ, \langle\cdot,\cdot\rangle)\) be a Euclidean Jordan algebra with \(\operatorname{rank}(\mathcal{J})=r\). Then, for any \(x\in \mathcal{J}\), there exist a Jordan frame \(\{c_1,c_2,\ldots,c_r\}\) and real numbers \(\lambda_1(x),\lambda_2(x),\ldots,\lambda_r(x)\) such that \(x=\sum_{i=1}^{r}\lambda_{i}( x) c_{i}\).

Each \(\lambda_i(x)\) is called an eigenvalue of \(x\). We denote by \(\lambda_{\min}(x)\) (resp. \(\lambda_{\max}(x)\)) the minimal (resp. maximal) eigenvalue of \(x\). Via the spectral decomposition we define: the square root \(x^{\frac{1}{2}}:=\sum_{i=1}^{r}\sqrt{\lambda_{i}( x)}\, c_{i}\), whenever all \(\lambda_i(x)\geq0\); the inverse \(x^{-1}:=\sum_{i=1}^{r}\lambda_{i}( x)^{-1} c_{i}\), whenever all \(\lambda_i(x)\neq0\); the square \(x^{2}:=\sum_{i=1}^{r}\lambda_{i}( x)^{2} c_{i}\); and the trace \(\mathit{tr}(x):=\sum_{i=1}^{r}\lambda_{i}( x)\). If \(x^{-1}\) is defined, we call \(x\) invertible.
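In the symmetric-matrix algebra, Theorem 1 is the usual eigendecomposition: the primitive idempotents are the rank-one projectors onto eigenvectors. A numerical sketch of the spectral decomposition and the functions defined through it (an added illustration, not part of the original text):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
X = B @ B.T + np.eye(4)            # symmetric positive definite, so all λ_i > 0

lam, Q = np.linalg.eigh(X)         # eigenvalues λ_i, orthonormal eigenvectors
frame = [np.outer(Q[:, i], Q[:, i]) for i in range(4)]   # primitive idempotents c_i

# Theorem 1: x = Σ λ_i c_i
assert np.allclose(sum(l * c for l, c in zip(lam, frame)), X)

# functions of x act on the eigenvalues
sqrtX = sum(np.sqrt(l) * c for l, c in zip(lam, frame))
invX = sum((1 / l) * c for l, c in zip(lam, frame))
assert np.allclose(sqrtX @ sqrtX, X)            # (x^{1/2})^2 = x
assert np.allclose(invX @ X, np.eye(4))         # x^{-1}∘x = e (here x^{-1} commutes with x)
assert np.isclose(lam.sum(), np.trace(X))       # tr(x) = Σ λ_i
```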

Since “∘” is a bilinear map, for every \(x\in \mathcal{J}\) a linear operator \(L(x)\) can be defined such that \(L(x)y=x\circ y\) for all \(y\in\mathcal{J}\). In particular, \(L(x)e=x\) and \(L(x)x=x^{2}\). For each \(x \in \mathcal{J}\), define

$$P( x):=2L( x)^2-L\bigl( x^2\bigr), $$

where \(L(x)^{2}=L(x)L(x)\). The map \(P(x)\) is called the quadratic representation of \(x\). For any \(x, y\in \mathcal{J}\), \(x\) and \(y\) are said to operator commute if \(L(x)\) and \(L(y)\) commute, i.e., \(L(x)L(y)=L(y)L(x)\). In other words, \(x\) and \(y\) operator commute if \(x\circ(y\circ z)=y\circ(x\circ z)\) for all \(z\in \mathcal{J}\) (Schmieta and Alizadeh 2003). For any \(x, y\in \mathcal{J}\), the inner product is defined as \(\langle x, y\rangle=\mathit{tr}(x\circ y)\), and the Frobenius norm of \(x\) is \(\| x\|_{F}=\sqrt{\langle x, x\rangle}=\sqrt{ \mathit{tr}( x^{2})}=\sqrt{\sum_{i=1}^{r}\lambda_{i}^{2}( x)}\). Observe that \(\|e\|_{F}=\sqrt{r}\), since the identity element \(e\) has eigenvalue 1 with multiplicity \(r\). Furthermore, we have

$$\lambda_{\min}(x)\leq\|x\|_F,\qquad\lambda_{\max}(x)\leq\|x\|_F \quad\mathrm{and}\quad \bigl|\langle x, y\rangle\bigr|\leq\|x\|_F\|y\|_F. $$
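For symmetric matrices the quadratic representation has the closed form \(P(x)y=xyx\). The sketch below (added here for illustration) verifies this against the definition \(P(x)=2L(x)^2-L(x^2)\), together with the Cauchy–Schwarz inequality above:

```python
import numpy as np

def L(X, Y):
    """The operator L(x) applied to y: L(x)y = x∘y."""
    return (X @ Y + Y @ X) / 2

rng = np.random.default_rng(2)
A, B = rng.standard_normal((2, 3, 3))
X, Y = (A + A.T) / 2, (B + B.T) / 2

# P(x)y = 2L(x)(L(x)y) − L(x^2)y; for symmetric matrices this is y ↦ xyx
PY = 2 * L(X, L(X, Y)) - L(X @ X, Y)
assert np.allclose(PY, X @ Y @ X)

# inner product 〈x,y〉 = tr(x∘y), Frobenius norm, Cauchy–Schwarz
inner = np.trace(L(X, Y))
norm = lambda Z: np.sqrt(np.trace(Z @ Z))
assert abs(inner) <= norm(X) * norm(Y) + 1e-12
```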

Lemma 1

(Lemma 3.2 in Faybusovich 2002)

Let x, \(s\in\mathrm{ int}\,\mathcal{K}\). Then, there exists a unique \(w\in \mathrm{int}\,\mathcal{K}\) such that

$$x=P(w){ s}. $$

Moreover,

$$w=P\bigl( x^{\frac{1}{2}}\bigr) \bigl(P\bigl( x^{\frac{1}{2}}\bigr) s \bigr)^{-\frac{1}{2}} \bigl[=P\bigl( s^{-\frac{1}{2}}\bigr) \bigl(P\bigl( s^{\frac{1}{2}}\bigr) x \bigr)^{\frac{1}{2}} \bigr], $$

where \(\mathrm{int}\,\mathcal{K}\) denotes the interior of the cone \(\mathcal{K}\).

The point w is called the scaling point of x and s. Hence, there exists \(\tilde{v}\in\mathrm{int}\,\mathcal{K}\) such that

$$\tilde{v}=P(w)^{-\frac{1}{2}} x=P(w)^{\frac{1}{2}} s, $$

which is the so-called NT-scaling of \(\mathcal{J}\).
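In the symmetric-matrix case \(P(u)z=uzu\), so Lemma 1 reads \(X=WSW\) with \(W=X^{1/2}(X^{1/2}SX^{1/2})^{-1/2}X^{1/2}\). A numerical sketch of the NT scaling point and the scaled point \(\tilde v\) (an added illustration, not part of the original text):

```python
import numpy as np

def psd_power(X, p):
    """X^p for a symmetric positive definite X, via the spectral decomposition."""
    lam, Q = np.linalg.eigh(X)
    return (Q * lam**p) @ Q.T

rng = np.random.default_rng(3)
A, B = rng.standard_normal((2, 3, 3))
X = A @ A.T + np.eye(3)            # x ∈ int K
S = B @ B.T + np.eye(3)            # s ∈ int K

# w = P(x^{1/2})(P(x^{1/2})s)^{-1/2}; for matrices P(u)z = uzu
Xh = psd_power(X, 0.5)
W = Xh @ psd_power(Xh @ S @ Xh, -0.5) @ Xh

# Lemma 1: x = P(w)s, i.e. X = WSW
assert np.allclose(W @ S @ W, X)

# the NT-scaled point: P(w)^{-1/2}x = P(w)^{1/2}s
Wh, Whi = psd_power(W, 0.5), psd_power(W, -0.5)
assert np.allclose(Whi @ X @ Whi, Wh @ S @ Wh)
```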

Let \(x, y\in \mathcal{J}\). We say that \(x\) and \(y\) are similar, denoted \(x\sim y\), if and only if \(x\) and \(y\) share the same set of eigenvalues. We have \(x\in \mathcal{K}\) if and only if \(\lambda_i(x)\geq0\) for all \(i=1,2,\ldots,r\), and \(x\in \mathrm{int}\,\mathcal{K}\) if and only if \(\lambda_i(x)>0\) for all \(i=1,2,\ldots,r\). We also say that \(x\) is positive semidefinite (positive definite) if \(x\in\mathcal{K}\) (\(x\in \mathrm{int}\,\mathcal{K}\)).

Theorem 2

(Theorem III.2.1, Proposition III.2.2 in Faraut and Korányi 1994)

Let \(\mathcal{J}\) be a Euclidean Jordan algebra. Then \(\mathcal{K}(\mathcal{J})\) is a symmetric cone, and is the set of elements x in \(\mathcal{J}\) for which L(x) is positive semidefinite. Furthermore, if x is invertible, then \(P(x)\mathrm{int}\,\mathcal{K}(\mathcal{J})=\mathrm{ int}\,\mathcal{K}(\mathcal{J})\).

Lemma 2

(Lemma 30 in Schmieta and Alizadeh 2003)

Let x, \(s\in \mathrm{int}\,\mathcal{K}\). Then

$$\bigl\| P( x)^{\frac{1}{2}} s- e\bigr\| _F\leq\| x\circ s- e\|_F. $$

Lemma 3

(Theorem 4 in Sturm 2000)

Let x, \(s\in \mathrm{int}\,\mathcal{K}\). Then

$$\lambda_{\min}\bigl(P( x)^{\frac{1}{2}} s\bigr)\geq\lambda_{\min}( x\circ s). $$

Lemma 4

(Lemma 14 in Schmieta and Alizadeh 2003)

Let x, \(s\in \mathcal{J}\). Then, the eigenvalues of x+s are bounded as follows:

$$\lambda_{\min}(x+s)\geq\lambda_{\min}(x)+\lambda_{\min}(s)\geq\lambda_{\min}(x)-\|s\|_F. $$

3 Full NT-step based on Darvay’s technique

In this section, we briefly recall the central path for SCO, and Darvay’s technique extended to SCO by Wang and Bai (2012). Consider the following primal and dual problems:

$$ \min\quad \bigl\{ \langle c, x\rangle:\ Ax=b,\ x\in \mathcal{K} \bigr\} , $$
(P)

and

$$ \max\quad \bigl\{ b^Ty:\ A^Ty+s=c,\ s\in \mathcal{K} \bigr\} , $$
(D)

where \(c\) and the rows of \(A\) lie in \(\mathcal{J}\), and \(b\in\mathbb{R}^{m}\). Moreover, the rows of \(A\) are assumed to be linearly independent. Throughout the paper, we assume that (P) and (D) satisfy the interior-point condition (IPC), i.e., there exist \(x^{0}\in\mathrm{int}\,\mathcal{K}\) and \((y^{0}, s^{0})\) with \(s^{0}\in\mathrm{int}\,\mathcal{K}\) such that \(Ax^{0}=b\) and \(A^{T} y^{0}+s^{0}=c\). This can be achieved via the so-called homogeneous self-dual embedding (Sturm 2000). Under the IPC, finding an optimal solution of (P) and (D) is equivalent to solving the following system (Schmieta and Alizadeh 2003; Faybusovich 1997):

$$\begin{aligned} & Ax=b,\quad x\in \mathcal{K} \\ & A^Ty+s=c,\quad s\in \mathcal{K}\\ & x\circ s=0. \end{aligned}$$
(1)

The basic idea of primal-dual IPMs is to replace the third equation in (1), the so-called complementarity condition for (P) and (D), by the parameterized equation \(x\circ s=\mu e\) with \(\mu>0\). Thus, one considers

$$\begin{aligned} &Ax=b,\quad x\in \mathcal{K} \\ &A^Ty+s=c, \quad s\in \mathcal{K}\\ &x\circ s=\mu e. \end{aligned}$$
(2)

For each \(\mu>0\), the parameterized system (2) has a unique solution \((x(\mu),y(\mu),s(\mu))\), called the \(\mu\)-center of (P) and (D). These solutions form a curve parameterized by \(\mu\), called the central path, which most IPMs follow approximately to reach the optimal set. As \(\mu\rightarrow0\), the limit of the central path exists and yields optimal solutions for (P) and (D) (Vieira 2007).

Analogously to the LO case (Darvay 2003), Wang and Bai (2012) replace the standard centering equation \(x\circ s=\mu e\) by \(\varphi(\frac{x\circ s}{\mu})=\varphi(e)\), where \(\varphi(\cdot)\) is the vector-valued function induced by the univariate function \(\varphi(t)\). Thus, the system (2) becomes

$$\begin{aligned} &Ax=b, \quad x\in \mathcal{K} \\ &A^Ty+s=c, \quad s\in \mathcal{K} \\ &\varphi\biggl(\frac{x\circ s}{\mu}\biggr)=\varphi(e). \end{aligned}$$
(3)

Applying Newton’s method to system (3) and linearizing the third equation via Taylor’s theorem leads to

$$\begin{aligned} &A\varDelta x=0, \\ &A^T\varDelta y+\varDelta s=0, \\ &x\circ \varDelta s+s\circ \varDelta x=\mu \biggl(\varphi^{\prime} \biggl( \frac{x\circ s}{\mu} \biggr) \biggr)^{-1}\circ \biggl(\varphi(e)-\varphi \biggl(\frac{x\circ s}{\mu} \biggr) \biggr). \end{aligned}$$
(4)

Since in general \(L(x)L(s)\neq L(s)L(x)\), system (4) does not always have a unique solution in \(\mathrm{int}\,\mathcal{K}\). It is well known that this difficulty can be resolved by applying a scaling scheme, which is based on the following lemma.

Lemma 5

(Lemma 28 in Schmieta and Alizadeh 2003)

Let \(u\in\mathrm{int}\,\mathcal{K}\). Then

$$x\circ s=\mu e\quad\Leftrightarrow\quad P(u) x\circ P(u)^{-1} s=\mu e. $$

Replacing the third equation of the system (3) by

$$\varphi \biggl(\frac{P(u)x\circ P(u)^{-1}s}{\mu}\biggr)=\varphi(e), $$

and applying Newton’s method to the resulting system leads us to the following system

$$ \begin{aligned} &A\varDelta x=0, \\ &A^T\varDelta y+\varDelta s=0, \\ &P(u)x\circ P(u)^{-1}\varDelta s+P(u)^{-1}s\circ P(u)\varDelta x \\ &\quad{}=\mu \biggl(\varphi^{\prime} \biggl(\frac{P(u)x\circ P(u)^{-1}s}{\mu}\biggr) \biggr)^{-1}\circ \biggl(\varphi(e)-\varphi \biggl(\frac{P(u)x\circ P(u)^{-1}s}{\mu}\biggr) \biggr). \end{aligned} $$
(5)

Let \(u=w^{-\frac{1}{2}}\), where w is the NT-scaling point of x and s as defined in Lemma 1. We define

$$ v:=\frac{P(w)^{-\frac{1}{2}} x}{\sqrt{\mu}} \biggl[=\frac{P(w)^{\frac{1}{2}} s}{\sqrt{\mu}} \biggr], $$
(6)

and

$$ \overline {A}:=\sqrt{\mu}AP(w)^{\frac{1}{2}},\qquad d_x:= \frac{P(w)^{-\frac{1}{2}}\varDelta x}{\sqrt{\mu}},\qquad d_s:=\frac{P(w)^{\frac{1}{2}}\varDelta { s}}{\sqrt{\mu}}. $$
(7)

This enables us to rewrite the system (5), taking \(\varphi(t)=\sqrt{t}\), as follows:

$$\begin{aligned} &\overline{A}d_x=0, \\ &\overline{A}^T\frac{\varDelta {y}}{\mu}+d_s=0,\\ &d_x+d_s=2(e-v)=:p_v. \end{aligned}$$
(8)

The scaled search directions d x and d s are obtained by solving (8), after which Δx and Δs are computed via (7). The new iterate is obtained by taking a full NT-step:

$$x^+:=x+\varDelta x, \qquad s^+:=s+\varDelta s. $$
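Solving (8) amounts to an orthogonal decomposition of \(p_v\): \(d_x\) is the projection onto the null space of \(\bar A\), \(d_s\) the projection onto its row space. A sketch for the LO special case (\(\mathcal{K}=\mathbb{R}^n_+\), componentwise Jordan product, \(P(w)=\operatorname{diag}(w^2)\) with \(w=\sqrt{x/s}\)); this illustration is added here and is not part of the original text:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 2, 5
A = rng.standard_normal((m, n))         # assumed full row rank
x = rng.random(n) + 0.5                 # x in int K = positive orthant
s = rng.random(n) + 0.5                 # s in int K
mu = x @ s / n

w = np.sqrt(x / s)                      # NT scaling point, componentwise
v = np.sqrt(x * s / mu)                 # scaled point, cf. (6)
p_v = 2 * (1 - v)
A_bar = np.sqrt(mu) * A * w             # \bar A = sqrt(mu) A P(w)^{1/2}

# System (8): d_s lies in the row space of \bar A, d_x in its null space.
dy_over_mu = -np.linalg.solve(A_bar @ A_bar.T, A_bar @ p_v)   # Δy/μ
d_s = -A_bar.T @ dy_over_mu
d_x = p_v - d_s

assert np.allclose(A_bar @ d_x, 0)                  # first equation of (8)
assert np.allclose(A_bar.T @ dy_over_mu + d_s, 0)   # second equation of (8)
assert abs(d_x @ d_s) < 1e-9                        # the directions are orthogonal
```

Δx and Δs are then recovered via (7) as `np.sqrt(mu) * w * d_x` and `np.sqrt(mu) * d_s / w`.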

For the analysis of the algorithm, we define a norm-based proximity measure σ(x,s;μ) as follows

$$\begin{aligned} \sigma(v):=\sigma(x, s; \mu):=\frac{\|p_v\|_F}{2}=\|e-v \|_F. \end{aligned}$$
(9)

We can conclude that

$$\begin{aligned} \sigma(v)=0\quad\Leftrightarrow\quad v=e\quad\Leftrightarrow\quad d_x=d_s=0\quad \Leftrightarrow\quad x\circ s=\mu e. \end{aligned}$$
(10)

Hence, the value of σ(v) can be considered as a measure for the distance between the given triple (x,y,s) and the μ-center.

The following lemma, which we recall without proof, is crucial in the analysis of the algorithm: it states that the Newton process is quadratically convergent.

Lemma 6

(Lemma 4.4 in Wang and Bai 2012)

Let σ:=σ(x,s;μ)<1. Then

$$\sigma\bigl(x^+, s^+; \mu\bigr)\leq\frac{\sigma^2}{1+\sqrt{1-\sigma^2}}. $$

Thus \(\sigma(x^+, s^+; \mu)\leq\sigma^2\), which establishes the quadratic convergence of the Newton process.

The following lemma provides an iteration bound \(O (\sqrt{r} \log\frac{\langle x^{0}, s^{0}\rangle}{\epsilon} )\) for the algorithm.

Lemma 7

(Theorem 4.1 in Wang and Bai 2012)

Let \(\theta=\frac{1}{2\sqrt{r}}\). Then the algorithm requires at most

$$O \biggl(\sqrt{r} \log\frac{\langle x^0, s^0\rangle}{\epsilon} \biggr), $$

iterations. The output is a primal-dual pair (x,s) satisfying 〈x,s〉≤ϵ.

4 A full NT-step infeasible IPM

In the case of an infeasible method, we call the triple (x,y,s) an ϵ-solution of (P) and (D) if the norms of the residual vectors b−Ax and c−A T y−s do not exceed ϵ, and also tr(x∘s)≤ϵ. In what follows, we present an infeasible-start algorithm that generates an ϵ-solution of (P) and (D), if it exists, or establishes that no such solution exists.

4.1 The perturbed problems

We assume that (P) and (D) have an optimal solution \((x^*, y^*, s^*)\) with \(\mathit{tr}(x^*\circ s^*)=0\). As usual for IIPMs, we start the algorithm with \((x^0, y^0, s^0)=\xi(e, 0, e)\) and \(\mu^0=\xi^2\), where \(\xi\) is a positive number such that

$$\begin{aligned} x^*+s^*\preceq_\mathcal{K}\xi e, \end{aligned}$$
(11)

where the Löwner partial ordering \(\succeq_{\mathcal{K}}\) induced on \(\mathcal{J}\) by the cone \(\mathcal{K}\) is defined by \(x\succeq_{\mathcal{K}}s\) if \(x-s\in\mathcal{K}\). The initial values of the primal and dual residual vectors are \(r_{p}^{0}=b-Ax^{0}\) and \(r_{d}^{0}=c-A^{T}y^{0}-s^{0}\); in general, \(r_{p}^{0}\neq0\) and \(r_{d}^{0}\neq0\). The iterates generated by the algorithm will be infeasible for (P) and (D), but they will be feasible for the perturbed versions of (P) and (D) that we introduce below. For any \(\nu\) with \(0<\nu\leq1\), we consider the perturbed problem

$$ \min\quad \bigl\{ \bigl\langle c-\nu r_d^{0}, x\bigr\rangle:\ b-Ax=\nu r_p^{0},\ x\in \mathcal{K} \bigr\} , $$
(P ν )

and its dual problem

$$ \max\quad \bigl\{ \bigl(b-\nu r_p^{0}\bigr)^Ty:\ c-A^Ty-s=\nu r_d^{0},\ s\in \mathcal{K} \bigr\} , $$
(D ν )

Note that if ν=1, then x=x 0 and (y,s)=(y 0,s 0) are strictly feasible solutions of (P ν ) and (D ν ), respectively. Hence, for ν=1 both perturbed problems satisfy the IPC. It should be mentioned that the problems (P ν ) and (D ν ) for LO have been studied in Freund (1996). More generally, one has the following lemma (Lemma 4.1 in Gu et al. 2011).

Lemma 8

Let (P) and (D) be feasible and 0<ν≤1. Then, the perturbed problems (P ν ) and (D ν ) satisfy the IPC.

Assuming that (P) and (D) are both feasible, it follows from Lemma 8 that the problems (P ν ) and (D ν ) satisfy the IPC, for each ν∈(0,1]. Then, their central paths exist, meaning that the system

$$\begin{aligned} &b-Ax=\nu r_p^{0},\quad x\in\mathcal{K}, \\ &c-A^Ty-s=\nu r_d^{0},\quad s\in\mathcal{K},\\ &x\circ s=\mu e, \end{aligned}$$
(12)

has a unique solution for any \(\mu>0\). For \(\nu\in(0,1]\) and \(\mu=\nu\mu^0=\nu\xi^2\), we denote this unique solution by \((x(\mu,\nu),y(\mu,\nu),s(\mu,\nu))\); here \(x(\mu,\nu)\) is the \(\mu\)-center of (P ν ) and \((y(\mu,\nu),s(\mu,\nu))\) is the \(\mu\)-center of (D ν ). The parameters \(\mu\) and \(\nu\) will always be in one-to-one correspondence through \(\mu=\nu\mu^0=\nu\xi^2\). For the sake of simplicity, we write \(x(\mu)=x(\mu,\nu)\), \(y(\mu)=y(\mu,\nu)\) and \(s(\mu)=s(\mu,\nu)\). Taking \(\nu=1\) gives \((x(1),y(1),s(1))=(x^0,y^0,s^0)=(\xi e,0,\xi e)\) and \(x^0\circ s^0=\mu^0 e\). Hence, \(x^0\) is the \(\mu^0\)-center of the perturbed problem (P 1) and \((y^0,s^0)\) is the \(\mu^0\)-center of the perturbed problem (D 1).

Similarly to the feasible case, we replace the standard centering equation \(x\circ s=\mu e\) by \(\varphi (\frac{x\circ s}{\mu})=\varphi(e)\). Then system (12) becomes

$$\begin{aligned} &b-Ax=\nu r_p^{0},\quad x\in\mathcal{K}, \end{aligned}$$
(13)
$$\begin{aligned} &c-A^Ty-s=\nu r_d^{0},\quad s\in\mathcal{K}, \end{aligned}$$
(14)
$$\begin{aligned} &\varphi \biggl(\frac{x\circ s}{\mu}\biggr)=\varphi(e). \end{aligned}$$
(15)

4.2 An iteration of our algorithm

We just established that if ν=1 and μ=μ 0, then (x 0,y 0,s 0) is the μ-center of the problems (P ν ) and (D ν ). We measure proximity to the μ-center of the perturbed problems by the quantity σ(x,s;μ) as defined in (9).

Initially, we have σ(x,s;μ)=0. In the sequel, we assume that at the start of each iteration, just before the μ- and ν-update, σ(x,s;μ)≤τ, where τ is a positive threshold value. This certainly holds at the start of the first iteration, since we then have σ(x,s;μ)=0.

Now, we describe one main iteration of our algorithm. The algorithm begins with an infeasible interior point \((x,y,s)\) that is feasible for the perturbed problems (P ν ) and (D ν ) with \(\mu=\nu\mu^0\), and such that \(\mathit{tr}(x\circ s)\leq r\mu\) and \(\sigma(x,s;\mu)\leq\tau\). We reduce ν to \(\nu^+=(1-\theta)\nu\), with θ∈(0,1), and find a new iterate \((x^+,y^+,s^+)\) that is feasible for the perturbed problems (\(P_{\nu^{+}}\)) and (\(D_{\nu^{+}}\)) and satisfies \(\sigma(x^+,s^+;\mu^+)\leq\tau\). Every iteration consists of a feasibility step, a μ-update, and a few centering steps. First, we find a new point \((x^f,y^f,s^f)\) which is feasible for the perturbed problems with \(\nu^+:=(1-\theta)\nu\). Then, μ is decreased to \(\mu^+:=(1-\theta)\mu\). In general, there is no guarantee that \(\sigma(x^f,s^f;\mu^+)\leq\tau\), so a few centering steps are applied to produce a new point \((x^+,y^+,s^+)\) such that \(\sigma(x^+,s^+;\mu^+)\leq\tau\). This process is repeated until the algorithm terminates. We now summarize the steps of the algorithm as Algorithm 1.

Algorithm 1

A full NT-step IIPM based on Darvay’s technique.
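Since the algorithm scheme is given only as a figure, the control flow described above can be sketched in runnable form for the LO special case (\(\mathcal{K}=\mathbb{R}^n_+\), componentwise Jordan product, NT scaling \(w=\sqrt{x/s}\)). The instance, ξ, θ=0.05 and the stopping test below are illustrative choices, not the paper's defaults:

```python
import numpy as np

# Tiny LO instance: min c^T x  s.t.  Ax = b, x >= 0.
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([3.0])
c = np.array([1.0, 2.0, 3.0])
n = 3
xi = 3.0                                   # x* + s* <= xi*e holds for this instance
x, y, s = xi * np.ones(n), np.zeros(1), xi * np.ones(n)
mu, nu = xi ** 2, 1.0
rp0, rd0 = b - A @ x, c - A.T @ y - s      # initial residual vectors
theta, tau, eps = 0.05, 1.0 / 16, 1e-4

def nt_step(x, s, mu, rhs_p, rhs_d):
    """Solve the scaled system ((8) for centering, (19) for feasibility)."""
    w = np.sqrt(x / s)
    p_v = 2 * (1 - np.sqrt(x * s / mu))
    Ab = np.sqrt(mu) * A * w               # \bar A = sqrt(mu) A P(w)^{1/2}
    g = rhs_d * w / np.sqrt(mu)            # scaled dual right-hand side
    dy_mu = np.linalg.solve(Ab @ Ab.T, rhs_p - Ab @ (p_v - g))
    dxs = Ab.T @ dy_mu + p_v - g           # d_x in the scaled space
    return np.sqrt(mu) * w * dxs, mu * dy_mu, np.sqrt(mu) * (p_v - dxs) / w

def sigma(x, s, mu):                       # proximity measure (9)
    return np.linalg.norm(1 - np.sqrt(x * s / mu))

for _ in range(500):
    if max(x @ s, abs(b - A @ x).max(), abs(c - A.T @ y - s).max()) <= eps:
        break                              # eps-solution reached
    # feasibility step, then nu- and mu-update
    dx, dy, ds = nt_step(x, s, mu, theta * nu * rp0, theta * nu * rd0)
    x, y, s = x + dx, y + dy, s + ds
    nu, mu = (1 - theta) * nu, (1 - theta) * mu
    # a few centering steps restore sigma(x, s; mu) <= tau
    while sigma(x, s, mu) > tau:
        dx, dy, ds = nt_step(x, s, mu, np.zeros(1), np.zeros(n))
        x, y, s = x + dx, y + dy, s + ds

assert x.min() > 0 and s.min() > 0         # iterates stay strictly feasible
assert x @ s <= eps and abs(b - A @ x).max() <= eps
```

The final assertions check the ϵ-solution conditions of Sect. 4: small duality gap and small residual norms, decreased at the same rate ν.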

5 Analysis of the feasibility step

First, we describe the feasibility step in detail; the analysis follows in the sequel. Suppose that we have a strictly feasible iterate (x,y,s) for (P ν ) and (D ν ). This means that (x,y,s) satisfies (13) and (14) with μ=νξ 2. We need displacements Δ f x, Δ f y and Δ f s such that

$$ x^f:=x+\varDelta ^fx,\qquad y^f:=y+ \varDelta ^fy,\qquad s^f:=s+\varDelta ^fs, $$
(16)

are feasible for (\(P_{\nu^{+}}\)) and (\(D_{\nu^{+}}\)). One may easily verify that (x f,y f,s f) satisfies (13) and (14), with ν replaced by ν +, only if the first two equations in the following system are satisfied.

$$ \begin{aligned} &A\varDelta ^fx=\theta\nu r_p^{0}, \\ &A^T\varDelta ^fy+\varDelta ^fs=\theta\nu r_d^{0} \\ &P(w)^{-\frac{1}{2}}x\circ P(w)^{\frac{1}{2}}{\varDelta ^f s}+P(w)^{\frac{1}{2}}s\circ P(w)^{-\frac{1}{2}}{\varDelta ^f x} \\ &\quad{}=\mu \biggl(\varphi^{\prime} \biggl(\frac{P(w)^{-\frac{1}{2}}x\circ P(w)^{\frac{1}{2}}s}{\mu}\biggr) \biggr)^{-1}\circ \biggl(\varphi(e)-\varphi \biggl(\frac{P(w)^{-\frac{1}{2}}x\circ P(w)^{\frac{1}{2}}s}{\mu}\biggr) \biggr). \end{aligned} $$
(17)

The third equation is inspired by the third equation in the system (5) that we used to define search directions for the feasible case.

According to (17), after the feasibility step the iterates satisfy the affine equations in (13) and (14), with ν replaced by ν +. The hard part in the analysis will be to guarantee that x f, s f are positive and to guarantee that the new iterate satisfies σ(x f,s f;μ +)≤h<1.

Let (x,y,s) denote the iterate at the start of an iteration, with \(\mathit{tr}(x\circ s)\leq r\mu\) and σ(x,s;μ)≤τ. At the start of the first iteration this certainly holds, because \(\mathit{tr}(x^0\circ s^0)=r\mu^0\) and σ(x 0,s 0;μ 0)=0. Define

$$ d_x^f:=\frac{1}{\sqrt{\mu}}P(w)^{-\frac{1}{2}} \varDelta ^fx,\qquad d_s^f:=\frac{1}{\sqrt{\mu}}P(w)^{\frac{1}{2}} \varDelta ^fs, $$
(18)

where w is the NT-scaling point of x and s. One can easily check that the system (17), which defines the search directions Δ f x, Δ f y and Δ f s, considering \(\varphi(t)=\sqrt{t}\), can be expressed in terms of the scaled search directions \(d_{x}^{f}\) and \(d_{s}^{f}\) as follows

$$\begin{aligned} &\bar{A}d_x^f=\theta \nu r_p^0, \\ &\bar{A}^T\frac{\varDelta ^fy}{\mu}+d_s^f=\frac{1}{\sqrt{\mu}}\theta\nu P(w)^{\frac{1}{2}}r_d^0, \\ &d_x^f+d_s^f=2(e-v)=p_v, \end{aligned}$$
(19)

where \(\bar{A}={\sqrt{\mu}}AP(w)^{\frac{1}{2}}\). To obtain the search directions Δ f x and Δ f s in the original x- and s-space, we use (18), which gives

$$\varDelta ^fx=\sqrt{\mu}P(w)^{\frac{1}{2}}d_x^f,\qquad \varDelta ^fs=\sqrt{\mu}P(w)^{-\frac{1}{2}}d_s^f. $$

The new iterates are obtained by taking a full step, as given by (16). Hence, we have

$$\begin{aligned} x^f =&x+\varDelta ^fx=\sqrt{\mu}P(w)^{\frac{1}{2}} \bigl(v+d_x^f\bigr), \\ s^f =&s+\varDelta ^fs=\sqrt{\mu}P(w)^{-\frac{1}{2}} \bigl(v+d_s^f\bigr). \end{aligned}$$
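One feasibility step can be checked numerically in the LO special case: after the full step, the iterate satisfies the affine equations of (13) and (14) with ν replaced by ν +, and remains strictly feasible. The instance and parameters below are illustrative, not taken from the paper:

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0]])
b = np.array([3.0])
c = np.array([1.0, 2.0, 3.0])
xi, theta, nu = 3.0, 0.1, 1.0
x, y, s = xi * np.ones(3), np.zeros(1), xi * np.ones(3)
mu = nu * xi ** 2
rp0, rd0 = b - A @ x, c - A.T @ y - s

w, v = np.sqrt(x / s), np.sqrt(x * s / mu)         # NT scaling, scaled point
p_v = 2 * (1 - v)
Ab = np.sqrt(mu) * A * w                           # \bar A
g = theta * nu * w * rd0 / np.sqrt(mu)             # scaled dual residual term in (19)
dy_mu = np.linalg.solve(Ab @ Ab.T, theta * nu * rp0 - Ab @ (p_v - g))
dxf = Ab.T @ dy_mu + p_v - g                       # d_x^f
dsf = p_v - dxf                                    # d_s^f, third equation of (19)

xf = x + np.sqrt(mu) * w * dxf                     # Δ^f x = sqrt(mu) P(w)^{1/2} d_x^f
yf = y + mu * dy_mu
sf = s + np.sqrt(mu) * dsf / w

nu_plus = (1 - theta) * nu
assert np.allclose(b - A @ xf, nu_plus * rp0)      # (13) with nu replaced by nu^+
assert np.allclose(c - A.T @ yf - sf, nu_plus * rd0)  # (14) with nu replaced by nu^+
assert xf.min() > 0 and sf.min() > 0               # strict feasibility retained
```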

From the third equation in (19) we derive that

$$\begin{aligned} \bigl(v+d_x^f\bigr)\circ \bigl(v+d_s^f\bigr) =&v^2+v\circ \bigl(2(e-v) \bigr)+d_x^f\circ d_s^f \\ =&v^2+v\circ p_v+d_x^f\circ d_s^f. \end{aligned}$$
(20)

Moreover, from p v =2(ev), we have

$$ v+\frac{1}{2}p_v=e\quad\Longrightarrow\quad v^2+v\circ p_v=e-\frac{1}{4}p_v^2. $$
(21)

Let us introduce the notation

$$ \bar{p}_v:=d_x^f-d_s^f. $$
(22)

In that case, we have

$$ d_x^f\circ d_s^f= \frac{p_v^2-\bar{p}_v^2}{4}. $$
(23)

Using (20), (21) and (23), we get

$$ \bigl(v+d_x^f\bigr)\circ \bigl(v+d_s^f\bigr)=e-\frac{1}{4}p_v^2+ \frac{1}{4}\bigl(p_v^2-\bar{p}_v^2 \bigr) =e-\frac{1}{4}\bar{p}_v^2. $$
(24)

Lemma 9

(Lemma 4.1 in Wang and Bai 2012)

Let x(α)=x+αΔx, s(α)=s+αΔs for 0≤α≤1 with x, \(s \in\mathrm{int}\,\mathcal{K}\) and \(x(\alpha)\circ s(\alpha)\in\mathrm{int}\,\mathcal{K}\) for \(\alpha\in[0, \bar{\alpha}]\). Then, \(x(\bar{\alpha})\in\mathrm{int}\,\mathcal{K}\) and \(s(\bar{\alpha})\in\mathrm{int}\,\mathcal{K}\).

Since \(P(w)^{\frac{1}{2}}\) and \(P(w)^{-\frac{1}{2}}\) are automorphisms of \(\mathrm{int}\,\mathcal{K}\), Theorem 2 implies that x f and s f belong to \(\mathrm{int}\,\mathcal{K}\) if and only if \(v+d^{f}_{x}\) and \(v+d_{s}^{f}\) belong to \(\mathrm{int}\,\mathcal{K}\), respectively. The following lemma shows the strict feasibility of the full NT-step.

Lemma 10

Let \(x\in\mathrm{int}\,\mathcal{K}\) and \(s\in\mathrm{int}\,\mathcal{K}\). Then the iterates (x f,y f,s f) are strictly feasible if

$$\biggl|\lambda_i \biggl(\frac{\bar{p}_v}{2} \biggr) \biggr|<1,\quad i=1, 2, \ldots, r. $$

Proof

Introduce a step length α with α∈[0,1] and define

$$v_x(\alpha)=v+\alpha d^f_x,\qquad v_s(\alpha)=v+\alpha d^f_s. $$

We thus have

$$\begin{aligned} v_x(\alpha)\circ v_s(\alpha) =&v^2+\alpha v \circ\bigl(d_x^f+d_s^f\bigr)+ \alpha^2d_x^f\circ d^f_s \\ =&v^2+\alpha v\circ p_v+\alpha^2d_x^f \circ d_s^f \\ =&(1-\alpha)v^2+\alpha\biggl(e-\frac{p_v^2}{4}\biggr)+ \frac{\alpha^2}{4}\bigl(p_v^2-\bar{p}_v^2 \bigr). \end{aligned}$$

If \(|\lambda_{i}(\frac{\bar{p}_{v}}{2})|<1\), then we have

$$0\prec_{\mathcal{K}}e-\frac{1}{4}\bar{p}_v^2=e- \frac{1}{4}p_v^2+\frac{1}{4} \bigl(p_v^2-\bar{p}_v^2\bigr), $$

and this implies that

$$ \frac{1}{4}\bigl(p_v^2-\bar{ p}_v^2\bigr)\succ_{\mathcal{K}}-e+\frac{1}{4}p_v^2. $$
(25)

Therefore, we get for 0≤α≤1

$$\begin{aligned} &(1-\alpha)v^2+\alpha \biggl(e-\frac{p_v^2}{4} \biggr)+ \frac{\alpha^2}{4} \bigl(p_v^2-{\bar{p}}_v^2 \bigr) \\ &\quad{}\succ_{\mathcal{K}}(1-\alpha)v^2+\alpha \biggl(e- \frac{p_v^2}{4} \biggr)+{\alpha^2} \biggl(-e+\frac{1}{4}p_v^2 \biggr) \\ &\quad{}=(1-\alpha)v^2+\alpha(1-\alpha) \biggl(e-\frac{p_v^2}{4} \biggr) \\ &\quad{}=(1-\alpha)^2v^2+2\alpha(1-\alpha) v \succeq_{\mathcal{K}}0. \end{aligned}$$
(26)

So \(v_{x}(\alpha)\circ v_{s}(\alpha)\in\mathrm{int}\,\mathcal{K}\) for α∈[0,1]. Hence, since x, \(s \in\mathrm{int}\,\mathcal{K}\), Lemma 9 implies that \(v_{x}(1)=v+d_{x}^{f}\in\mathrm{ int}\,\mathcal{K}\) and \(v_{s}(1)=v+d_{s}^{f}\in\mathrm{int}\,\mathcal{K}\). This completes the proof. □

In the sequel, we denote

$$\vartheta(v):=\frac{1}{2}\sqrt{\bigl\| d_x^f\bigr\| _F^2+\bigl\| d_s^f\bigr\| _F^2}, $$

which implies \(\|d_{x}^{f}\|_{F}\leq2\vartheta(v)\) and \(\|d_{s}^{f}\|_{F}\leq2\vartheta(v)\). Moreover we have

$$ \bigl\| d_x^f\circ d_s^f \bigr\| _F\leq\frac{1}{2}\bigl(\bigl\| d_x^f \bigr\| _F^2+ \bigl\| d_s^f \bigr\| _F^2\bigr)=2\vartheta(v)^2, $$
(27)
$$ \bigl|\lambda_i\bigl(d_x^f\circ d_s^f\bigr)\bigr|\leq\bigl\| d_x^f\circ d_s^f\bigr\| _F\leq2\vartheta(v)^2,\quad i=1, 2, \ldots, r. $$
(28)

We proceed by deriving an upper bound for σ(x f,s f,μ +). Recall from definition (9) that

$$\begin{aligned} \sigma\bigl(x^f, s^f; \mu^+\bigr):=\sigma \bigl(v^f\bigr)=\bigl\| e-v^f\bigr\| _F, \end{aligned}$$
(29)

where \(v^{f}:=\frac{1}{\sqrt{\mu(1-\theta)}}P(w^{f})^{-\frac{1}{2}}x^{f} [=\frac{1}{\sqrt{\mu(1-\theta)}}P(w^{f})^{\frac{1}{2}}s^{f} ]\) and w f is the scaling point of x f and s f.

Lemma 11

(Lemma 4.3 in Gu et al. 2011)

One has

$$\sqrt{1-\theta} v^f\sim \bigl[P\bigl(v+d^f_x \bigr)^{\frac{1}{2}}\bigl(v+d^f_s\bigr) \bigr]^{\frac{1}{2}}. $$

Lemma 12

Let \(|\lambda_{i} (\frac{\bar{p}_{v}}{2} ) |<1\) for \(i=1,2,\ldots,r\). Then

$$\sigma\bigl(v^f\bigr)\leq\frac{\sigma(v)^2+2\vartheta(v)^2+\theta \sqrt{r}}{1-\theta+\sqrt{(1-\theta)(1-\sigma(v)^2-2\vartheta(v)^2)}}. $$

Proof

Using Lemma 11, Lemma 2 and (24) we have

$$\begin{aligned} \sigma\bigl(v^f\bigr) =&\bigl\| e-v^f \bigr\| _F\leq\frac{1}{1+\lambda_{\min}(v^f)}\bigl\| e-\bigl(v^f \bigr)^2\bigr\| _F \\ =&\frac{1}{1+\lambda_{\min}(v^f)} \biggl\| e-P \biggl(\frac{v+d^f_x}{\sqrt{1-\theta}} \biggr)^{\frac{1}{2}} \biggl(\frac{v+d^f_s}{\sqrt{1-\theta}} \biggr) \biggr\| _F \\ \leq&\frac{1}{1+\lambda_{\min}(v^f)} \biggl\| e- \biggl(\frac{v+d^f_x}{ \sqrt{1-\theta}} \biggr)\circ \biggl( \frac{v+d^f_s}{\sqrt{1-\theta}} \biggr) \biggr\| _F \\ =&\frac{1}{1+\lambda_{\min}(v^f)} \biggl\| e-\frac{e-\frac{1}{4}\bar{p}_v^2}{1-\theta} \biggr\| _F \\ =&\frac{1}{(1-\theta)(1+\lambda_{\min}(v^f))} \biggl\| -\theta e+\frac{1}{4}\bar{p}_v^2 \biggr\| _F \\ \leq&\frac{1}{(1-\theta)(1+\lambda_{\min}(v^f))} \biggl(\theta\sqrt{r}+ \biggl(\frac{\|\bar{p}_v\|_F}{2} \biggr)^2 \biggr). \end{aligned}$$
(30)

By Lemma 11, Lemma 3, (20), (21), Lemma 4 and (28) we get

$$\begin{aligned} \lambda_{\min}\bigl(\bigl(v^f \bigr)^2\bigr) =&\lambda_{\min} \biggl(P \biggl( \frac{v+d^f_x}{\sqrt{1-\theta}} \biggr)^{\frac{1}{2}} \biggl(\frac{v+d^f_s}{\sqrt{1-\theta}} \biggr) \biggr) \\ \geq&\lambda_{\min} \biggl( \biggl(\frac{v+d^f_x}{\sqrt{1-\theta}} \biggr)\circ \biggl(\frac{v+d^f_s}{\sqrt{1-\theta}} \biggr) \biggr) \\ =&\frac{1}{1-\theta}\lambda_{\min} \biggl(e-\frac{1}{4}{p}^2_v+d^f_x \circ d^f_s \biggr) \\ \geq&\frac{1}{1-\theta} \biggl(1-\frac{1}{4} \|p_v\|^2_F+\lambda_{\min} \bigl(d^f_x\circ d^f_s \bigr) \biggr) \\ \geq&\frac{1}{1-\theta} \bigl(1-\sigma(v)^2-2\vartheta(v)^2 \bigr). \end{aligned}$$
(31)

By substituting (31) into (30) and using (9), the third equation of (19) and (27), we get

$$\begin{aligned} \sigma\bigl(v^f\bigr) \leq&\frac{\sqrt{1-\theta}}{(1-\theta) (\sqrt{1-\sigma(v)^2-2\vartheta(v)^2}+\sqrt{1-\theta} )} \biggl( \biggl(\frac{\|\bar{p}_v\|_F}{2} \biggr)^2+\theta\sqrt{r} \biggr) \\ =&\frac{\sqrt{1-\theta}}{(1-\theta) (\sqrt{1-\sigma(v)^2-2\vartheta(v)^2}+\sqrt{1-\theta} )} \biggl(\frac{1}{4}\bigl\| d_x^f-d_s^f \bigr\| _F^2+\theta\sqrt{r} \biggr) \\ =&\frac{\frac{1}{4} (\|d_x^f+d_s^f\|_F^2-4\langle d_x^f, d_s^f\rangle )+\theta\sqrt{r}}{\sqrt{1-\theta} (\sqrt{1-\sigma(v)^2-2\vartheta(v)^2}+\sqrt{1-\theta} )} \\ =&\frac{\frac{1}{4} (\|p_v\|_F^2-4\langle d_x^f, d_s^f\rangle )+\theta\sqrt{r}}{\sqrt{1-\theta} (\sqrt{1-\sigma(v)^2-2\vartheta(v)^2}+\sqrt{1-\theta} )} \\ =&\frac{\sigma(v)^2-\langle d_x^f, d_s^f\rangle+\theta\sqrt{r}}{\sqrt{1-\theta} (\sqrt{1-\sigma(v)^2-2\vartheta(v)^2}+\sqrt{1-\theta} )} \\ \leq&\frac{\sigma(v)^2+\|d_x^f\|_F\|d_s^f\|_F+\theta\sqrt{r}}{\sqrt{1-\theta} (\sqrt{1-\sigma(v)^2-2\vartheta(v)^2}+\sqrt{1-\theta} )} \\ \leq&\frac{\sigma(v)^2+\frac{1}{2} (\|d_x^f\|_F^2+\|d_s^f\|_F^2 )+\theta\sqrt{r}}{\sqrt{1-\theta} (\sqrt{1-\sigma(v)^2-2\vartheta(v)^2}+\sqrt{1-\theta} )} \\ =&\frac{\sigma(v)^2+2\vartheta(v)^2+\theta\sqrt{r}}{\sqrt{1-\theta} (\sqrt{1-\sigma(v)^2-2\vartheta(v)^2}+\sqrt{1-\theta} )}, \end{aligned}$$

this completes the proof. □

Because we need to have σ(v f)≤h<1 (\(\mathrm{default}\ h=\frac{1}{\sqrt{2}}\)), it follows from Lemma 12 that it suffices to have

$$ \frac{\sigma(v)^2+2\vartheta(v)^2+\theta \sqrt{r}}{1-\theta+\sqrt{(1-\theta)(1-\sigma(v)^2-2\vartheta(v)^2)}}\leq\frac{1}{\sqrt{2}}<1. $$
(32)

At this stage, we choose

$$ \tau=\frac{1}{16},\qquad \theta=\frac{\alpha}{2\sqrt{r}},\quad\alpha\leq1. $$
(33)

Since the left-hand side of (32) is monotonically increasing with respect to σ(v) and ϑ(v)2, for r≥1 and σ(v)≤τ one can verify that

$$\begin{aligned} \vartheta(v)\leq\frac{1}{2\sqrt{2}}\quad\Rightarrow \quad\sigma \bigl(v^f\bigr)\leq\frac{1}{\sqrt{2}}<1. \end{aligned}$$
(34)

5.1 Upper bound for ϑ(v)

In this section, we obtain an upper bound for ϑ(v) which will enable us to find a default value for θ. For this purpose, consider the system (19) which defines the search directions Δ f x, Δ f y and Δ f s in terms of the scaled search directions \(d_{x}^{f}\) and \(d_{s}^{f}\). By eliminating \(d_{s}^{f}\) from (19), we get

$$ \begin{aligned} &\bar{A}d_x^f=\theta \nu r_p^0,\\ &{-}{\bar{A}}^T\frac{\varDelta ^fy}{\mu}+d_x^f=p_v-\frac{1}{\sqrt{\mu}}\theta\nu P(w)^{\frac{1}{2}}r_d^0. \end{aligned} $$
(35)

Multiplying the second equation of (35) on the left by \(\bar{A}\) and using the first equation of (35), it follows that

$$ -\frac{\varDelta ^fy}{\mu}= \bigl(\bar{A}\bar{A}^T \bigr)^{-1}\bar{A} \biggl(p_v-\frac{1}{\sqrt{\mu}}\theta\nu P(w)^{\frac{1}{2}}r_d^0 \biggr)-\theta\nu \bigl(\bar{A} \bar{A}^T \bigr)^{-1}r_p^0. $$

Substitution into the second equation of (35) gives

$$ d_x^f= \bigl(I-\bar{A}^T \bigl(\bar{A} \bar{A}^T \bigr)^{-1}\bar{A} \bigr) \biggl(p_v- \frac{1}{\sqrt{\mu}}\theta\nu P(w)^{\frac{1}{2}}r_d^0 \biggr)+\theta\nu\bar{A}^T \bigl(\bar{A} \bar{A}^T \bigr)^{-1}r_p^0. $$

Let \((x^{*}, y^{*}, s^{*})\) be the optimal solution satisfying (11). Then, we have

$$ r_p^0=A\bigl(x^*-x^0\bigr),\qquad r_d^0=A^T\bigl(y^*-y^0\bigr)+ \bigl(s^*-s^0\bigr), $$

and

$$ \bigl(I-\bar{A}^T \bigl(\bar{A}\bar{A}^T \bigr)^{-1}\bar{A} \bigr)\bar{A}^T\bigl(y^*-y^0 \bigr)=0. $$

Hence, it follows that

$$\begin{aligned} d_x^f =& \bigl(I-\bar{A}^T \bigl(\bar{A} \bar{A}^T \bigr)^{-1}\bar{A} \bigr) \biggl(p_v- \frac{\theta\nu}{\sqrt{\mu}}P(w)^{\frac{1}{2}} \bigl(s^*-s^0 \bigr) \biggr) \\ &{}+\frac{\theta\nu}{\sqrt{\mu}}\bar{A}^T \bigl(\bar{A}\bar{A}^T \bigr)^{-1}\bar{A} P(w)^{-\frac{1}{2}} \bigl(x^*-x^0 \bigr). \end{aligned}$$

For \(d_{s}^{f}\), by using \(d_{x}^{f}+d_{s}^{f}=p_{v}\), we obtain

$$\begin{aligned} d_s^f =& \frac{\theta\nu}{\sqrt{\mu}} \bigl(I- \bar{A}^T \bigl(\bar{A}\bar{A}^T \bigr)^{-1} \bar{A} \bigr)P(w)^{\frac{1}{2}} \bigl(s^*-s^0 \bigr) \\ &{}+\bar{A}^T \bigl(\bar{A}\bar{A}^T \bigr)^{-1}\bar{A} \biggl(p_v-\frac{\theta\nu}{\sqrt{\mu}}P(w)^{-\frac{1}{2}} \bigl(x^*-x^0 \bigr) \biggr). \end{aligned}$$

Therefore, using orthogonality and the Cauchy-Schwarz inequality, after some computations, we obtain

$$\begin{aligned} \bigl\| d_x^f\bigr\| ^2_F+ \bigl\| d_s^f\bigr\| _F^2 \leq& 2 \|p_v\|_F^2 \\ &{}+\frac{3\theta^2\nu^2}{\mu} \bigl( \bigl\| P(w)^{-\frac{1}{2}} \bigl(x^*-x^0 \bigr) \bigr\| _F^2+ \bigl\| P(w)^{\frac{1}{2}} \bigl(s^*-s^0 \bigr) \bigr\| _F^2 \bigr) \\ =&8\sigma(v)^2 +\frac{3\theta^2\nu^2}{\mu} \bigl( \bigl\| P(w)^{-\frac{1}{2}} \bigl(x^*-x^0 \bigr) \bigr\| _F^2+ \bigl\| P(w)^{\frac{1}{2}} \bigl(s^*-s^0 \bigr) \bigr\| _F^2 \bigr). \end{aligned}$$
(36)
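The elided computations can be reconstructed as follows (our notation: \(Q=\bar{A}^T(\bar{A}\bar{A}^T)^{-1}\bar{A}\) is the orthogonal projection onto the row space of \(\bar{A}\), and \(u_x=\frac{\theta\nu}{\sqrt{\mu}}P(w)^{-\frac{1}{2}}(x^*-x^0)\), \(u_s=\frac{\theta\nu}{\sqrt{\mu}}P(w)^{\frac{1}{2}}(s^*-s^0)\) are shorthands introduced here). One possible route is

$$\begin{aligned} \bigl\|d_x^f\bigr\|_F^2+\bigl\|d_s^f\bigr\|_F^2 =&\bigl\|(I-Q) (p_v-u_s)\bigr\|_F^2+\|Qu_x\|_F^2+\bigl\|(I-Q)u_s\bigr\|_F^2+\bigl\|Q(p_v-u_x)\bigr\|_F^2 \\ \leq&2\bigl\|(I-Q)p_v\bigr\|_F^2+2\|Qp_v\|_F^2+3\bigl\|(I-Q)u_s\bigr\|_F^2+3\|Qu_x\|_F^2 \\ \leq&2\|p_v\|_F^2+3\bigl(\|u_x\|_F^2+\|u_s\|_F^2\bigr), \end{aligned}$$

where the first step uses \(\langle Qa, (I-Q)b\rangle=0\) and the second uses \(\|a+b\|_F^2\leq2\|a\|_F^2+2\|b\|_F^2\); the last expression equals the right-hand side of (36) since \(\|u_x\|_F^2+\|u_s\|_F^2=\frac{\theta^2\nu^2}{\mu}(\|P(w)^{-\frac{1}{2}}(x^*-x^0)\|_F^2+\|P(w)^{\frac{1}{2}}(s^*-s^0)\|_F^2)\).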

Since \((x^{*}, y^{*}, s^{*})\) is a primal-dual feasible solution we have \(x^{*}\succeq_{\mathcal{K}}0\) and \(s^{*}\succeq_{\mathcal{K}}0\). Hence

$$ 0\preceq_{\mathcal{K}} x^0-x^*\preceq_{\mathcal{K}}\xi e, \qquad 0 \preceq_{\mathcal{K}} s^0-s^*\preceq_{\mathcal{K}}\xi e. $$

Using that \(P(w)^{\frac{1}{2}}\) is self-adjoint with respect to the inner product 〈⋅,⋅〉 and \(P(w)e=w^{2}\), we may write

$$\begin{aligned} \bigl\| P(w)^{\frac{1}{2}}\bigl(s^0-s^*\bigr)\bigr\| _F^2 =& \bigl\langle P(w) \bigl(s^0-s^*\bigr), \bigl(s^0-s^* \bigr) \bigr\rangle \\ =& \bigl\langle P(w) \bigl(s^0-s^*\bigr), \xi e \bigr\rangle - \bigl\langle P(w) \bigl(s^0-s^*\bigr), \xi e-\bigl(s^0-s^* \bigr) \bigr\rangle \\ \leq& \bigl\langle P(w) \bigl(s^0-s^*\bigr), \xi e \bigr\rangle = \bigl\langle s^0-s^*, P(w)\xi e \bigr\rangle =\xi \bigl\langle s^0-s^*, w^2 \bigr\rangle \\ =&\xi \bigl\langle \xi e, w^2 \bigr\rangle -\xi \bigl\langle \xi e- \bigl(s^0-s^*\bigr), w^2 \bigr\rangle \leq\xi \bigl\langle \xi e, w^2 \bigr\rangle =\xi^2\mathit{tr}\bigl(w^2 \bigr). \end{aligned}$$

In the same way it follows that

$$ \bigl\| P(w)^{-\frac{1}{2}}\bigl(x^0-x^*\bigr) \bigr\| _F^2 \leq\xi^2\mathit{tr}\bigl(w^{-2}\bigr). $$

Substituting the last two inequalities into (36) and using \(\mu=\nu\xi^{2}\) gives

$$\begin{aligned} \bigl\| d_x^f\bigr\| ^2_F+ \bigl\| d_s^f\bigr\| _F^2 \leq& 8 \sigma(v)^2+{3\theta^2\nu} \mathit{tr} \bigl(w^2+w^{-2} \bigr) \\ \leq& 8\sigma(v)^2+\frac{3\theta^2}{\xi^2\lambda_{\min}(v)^2}\mathit{tr}(x+s)^2 \\ \leq& 8\sigma(v)^2+\frac{3\theta^2}{\xi^2(1-\sigma(v))^2}\mathit{tr}(x+s)^2, \end{aligned}$$
(37)

where the second inequality follows by Lemma 4.5 in Gu et al. (2011) and the third inequality follows by (9), because

$$ \sigma(v)^2=\|e-v\|^2_F=\sum _{i=1}^r\bigl(1-\lambda_i(v) \bigr)^2, $$

implies that

$$ 1-\sigma(v)\leq\lambda_i(v)\leq 1+\sigma(v), \quad i=1, 2, \ldots, r. $$

Lemma 13

(Lemma 4.6 in Gu et al. 2011)

Let (x,y,s) be feasible for the perturbed problems \((P_{\nu})\) and \((D_{\nu})\) and let \((x^{0}, y^{0}, s^{0})=(\xi e, 0, \xi e)\) and \((x^{*}, y^{*}, s^{*})\) be as defined in (11). Then

$$ \mathit{tr}(x+s)\leq 2\xi r, $$
(38)

where r=tr(e) is the rank of the associated Euclidean Jordan algebra.

Substituting (38) into (37), we obtain

$$ \bigl\| d_x^f\bigr\| _F^2+ \bigl\| d_s^f\bigr\| _F^2\leq8 \sigma(v)^2+\frac{12\theta^2r^2}{(1-\sigma(v))^2}. $$
(39)
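In detail, inserting the bound \(\mathit{tr}(x+s)\leq2\xi r\) from (38) into the last expression of (37) gives

$$ \frac{3\theta^2}{\xi^2(1-\sigma(v))^2}\bigl(\mathit{tr}(x+s)\bigr)^2\leq\frac{3\theta^2(2\xi r)^2}{\xi^2(1-\sigma(v))^2}=\frac{12\theta^2r^2}{(1-\sigma(v))^2}. $$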

5.2 Value for θ

At this stage, we choose \(\tau=\frac{1}{16}\). Since \(\sigma(v)\leq\tau=\frac{1}{16}\) and the right-hand side of (39) is monotonically increasing in σ(v), we have

$$ \bigl\| d_x^f\bigr\| _F^2+\bigl\| d_s^ f\bigr\| _F^2\leq\frac{1}{32}+\frac{3072\theta^2r^2}{225}. $$

Using \(\theta=\frac{\alpha}{2\sqrt{r}}\), the above relation becomes

$$ \bigl\| d_x^f\bigr\| _F^2+ \bigl\| d_s^f\bigr\| _F^2\leq \frac{1}{32}+\frac{768 r\alpha^2}{225}. $$
(40)

From (34) we know that \(\vartheta(v)\leq\frac{1}{2\sqrt{2}}\) is needed in order to have \(\sigma(v^{f})\leq\frac{1}{\sqrt{2}}\). Due to (40), this will hold if

$$ \frac{1}{32}+\frac{768 r\alpha^2}{225}\leq\frac{1}{2}. $$

If we take

$$ \alpha=\frac{1}{\sqrt{8 r}}, $$
(41)

the above inequality is satisfied.
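A quick check of the arithmetic: with \(\alpha=\frac{1}{\sqrt{8r}}\),

$$ \frac{1}{32}+\frac{768 r\alpha^2}{225}=\frac{1}{32}+\frac{768 r}{225\cdot8r}=\frac{225}{7200}+\frac{3072}{7200}=\frac{3297}{7200}<\frac{3600}{7200}=\frac{1}{2}. $$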

5.3 Complexity analysis

We have seen that if at the start of an iteration the iterate satisfies σ(x,s;μ)≤τ, with \(\tau=\frac{1}{16}\), then after the feasibility step, with θ as defined in (33) and α as in (41), the iterate is strictly feasible and satisfies \(\sigma(x^{f}, s^{f}; \mu^{+})\leq\frac{1}{\sqrt{2}}\).

After the feasibility step, we perform a few centering steps in order to get the iterate \((x^{+}, y^{+}, s^{+})\) which satisfies \(\sigma(x^{+}, s^{+}; \mu^{+})\leq\tau\). By Lemma 6, after k centering steps we will have the iterate \((x^{+}, y^{+}, s^{+})\) that is still feasible for \((P_{\nu^{+}})\) and \((D_{\nu^{+}})\) and satisfies

$$ \sigma\bigl(x^+, s^+; \mu^+\bigr)\leq \biggl(\frac{1}{\sqrt{2}}\biggr)^{2^{k}}. $$

From this, one easily deduces that \(\sigma(x^{+}, s^{+}; \mu^{+})\leq\tau\) will hold after at most

$$ 1+\log_2\log_2\frac{1}{\tau}=1+ \log_2\log_216=1+\log_24=3, $$
(42)

centering steps. So, each main iteration consists of one feasibility step and at most 3 centering steps.
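The count in (42) follows because

$$ \biggl(\frac{1}{\sqrt{2}}\biggr)^{2^{k}}=2^{-2^{k-1}}\leq\tau=\frac{1}{16}=2^{-4}\quad\Longleftrightarrow\quad2^{k-1}\geq4\quad\Longleftrightarrow\quad k\geq3. $$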

In each main iteration both the duality gap and the norms of the residual vectors are reduced by the factor 1−θ. Hence, the total number of main iterations is bounded above by

$$\begin{aligned} \frac{1}{\theta}\log\frac{\max\{r\xi^2, \|r_p^0\|_F, \|r_d^0\|_F\}}{\epsilon}, \end{aligned}$$

This iteration bound was first obtained by Potra (1996), and it is still the best-known iteration bound for IIPMs. Due to (33), (41) and the fact that we need at most 4 inner iterations per main iteration, we may state the main result of the paper.

Theorem 3

If (P) and (D) are feasible and ξ>0 is such that \(x^{*}+s^{*}\preceq_{\mathcal{K}}\xi e\) for some optimal solution \(x^{*}\) of (P) and \((y^{*}, s^{*})\) of (D), then after at most

$$ 16\sqrt{2}~r\log\frac{\max\{r\xi^2, \|r_p^0\|_F, \|r_d^0\|_F\}}{\epsilon}, $$

inner iterations, the algorithm finds an ϵ-optimal solution of (P) and (D).
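The constant in Theorem 3 follows by combining the at most 4 inner iterations per main iteration with the choices (33) and (41):

$$ \frac{4}{\theta}=\frac{4\cdot2\sqrt{r}}{\alpha}=8\sqrt{r}\cdot\sqrt{8r}=8\sqrt{8}\,r=16\sqrt{2}\,r. $$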

6 Conclusions

In this paper, we presented a primal-dual infeasible interior-point algorithm for SCO by combining Darvay's technique and Roos' algorithm. To our knowledge, this is the first time that a primal-dual infeasible interior-point algorithm, with adaptive combinations of Darvay's technique and Roos' algorithm, has been analyzed for SCO in the framework of Euclidean Jordan algebras. The proposed algorithm is closely related to Roos' algorithm. The complexity bound obtained here over symmetric cones, using the NT direction, coincides with the best bound obtained for IIPMs.