1 Introduction

The symmetric optimization (SO) problem is a convex optimization problem that minimizes a linear function over the intersection of an affine subspace and a symmetric cone. This class of problems includes linear optimization (LO), second-order cone optimization (SOCO), and semidefinite optimization (SDO) as special cases. Over the past two decades, SDO has been one of the most active research areas in mathematical programming. Nesterov and Todd [1, 2] provided a theoretical foundation for efficient primal–dual interior-point methods (IPMs) for convex programming problems, including LO, SOCO, and SDO. Potra and Sheng [3] and Kojima, Shida, and Shindoh [4] independently investigated the superlinear convergence of primal–dual infeasible-interior-point algorithms for SDO. Ji, Potra, and Sheng [5] studied the local behavior of the predictor–corrector algorithm considered by Monteiro [6] for SDO using the Monteiro–Zhang family of search directions. Potra and Sheng [7] first analyzed the homogeneous embedding idea for SDO. Luo, Sturm, and Zhang [8] proposed a self-dual embedding model for conically constrained convex optimization, including SDO. For an overview of these and related results, we refer to the monographs [9, 10] and the references therein.

In the field of Lie algebras, it is well known that symmetric cones are deeply connected with Euclidean Jordan algebras. The first work connecting Jordan algebras and optimization is due to Güler [11], who observed that the family of self-scaled cones is identical to the set of symmetric cones, for which a complete classification theory exists. Faybusovich [12] first extended primal–dual IPMs for SDO to SO by using Euclidean Jordan algebras. Muramatsu [13] presented a commutative class of search directions for LO over symmetric cones and analyzed the complexities of primal–dual IPMs for SO. Rangarajan [14] proved the polynomial-time convergence of infeasible IPMs for conic programming over symmetric cones, using a wide neighborhood of the central path for a commutative family of search directions. Schmieta and Alizadeh [15] showed how to transfer the Jordan algebra associated with the second-order cone into the so-called Clifford algebra in the cone of matrices and carried out a unified analysis of many IPMs over symmetric cones. Vieira [16] proposed primal–dual IPMs for SO based on the so-called eligible kernel functions and obtained the currently best known iteration bounds for the large- and small-update methods.

Recently, Darvay [17] proposed a full-Newton step primal–dual path-following interior-point algorithm for LO. The search direction of his algorithm is obtained by applying an algebraically equivalent transformation to the nonlinear equations that define the central path and then applying Newton’s method to the resulting system of equations. Later on, Achache [18] and Wang et al. [19, 20] extended Darvay’s algorithm for LO to convex quadratic optimization (CQO), SOCO, and SDO, respectively. By constructing strictly feasible iterates for a sequence of perturbed problems of the given problem and its dual, Roos [21] presented a full-Newton step primal–dual infeasible interior-point algorithm for LO based on the classical search direction. Using Euclidean Jordan algebras, Gu et al. [22] extended Roos’s algorithm for LO to SO; the order of the resulting iteration bound coincides with the bound derived for LO, which is the currently best known iteration bound for SO.

The purpose of this paper is to generalize Darvay’s full-Newton step primal–dual path-following interior-point algorithm for LO to SO by using Euclidean Jordan algebras. We develop some new analytic tools and adapt the basic analysis used in Darvay [17] and Wang et al. [19, 20] to the SO case. At each iteration, we use only full Nesterov–Todd steps, which have the advantage that no line searches are needed. The currently best known iteration bound for the small-update method is derived. Moreover, our analysis is relatively simple and closely parallels the LO analysis in Darvay [17].

The paper is organized as follows. In Sect. 2, we review the theory of Euclidean Jordan algebras and their associated symmetric cones and develop some new results that are needed in the analysis of the algorithm. In Sect. 3, after briefly reviewing the concept of the central path for SO, we extend Darvay’s technique for LO to SO and obtain the new search directions for SO. The generic primal–dual interior-point algorithm for SO is also presented. In Sect. 4, we analyze the algorithm and derive the currently best known iteration bound for the small-update method. Finally, some conclusions and remarks follow in Sect. 5.

2 Euclidean Jordan Algebras and Their Associated Symmetric Cones

In this section, we review the theory of Euclidean Jordan algebras and their associated symmetric cones. This serves as the basic analytic tool for the analysis of our algorithm, presented in Fig. 1. For a comprehensive treatment of Jordan algebras, the reader is referred to the monograph by Faraut and Koranyi [23].

Fig. 1 Algorithm

Definition 2.1

Let V be an n-dimensional vector space over \(\mathbb{R}\) equipped with a bilinear map \(\circ:(x,y)\mapsto x\circ y\in V\). Then (V,∘) is a Jordan algebra iff, for all \(x,y\in V\),

  1. (i)

\(x\circ y=y\circ x\);

  2. (ii)

    x∘(x 2y)=x 2∘(xy), where x 2=xx.

We define x n:=xx n−1,n≥2. Jordan algebras are not necessarily associative, but they are power associative, x mx n:=x m+n, i.e., the algebra generated by a single element xV is associative (see, e.g., [15]).

An element eV is said to be an identity element iff ex=xe=x for all xV. Note that the identity element e is unique, i.e., if e 1 and e 2 are identity elements of V, then e 1=e 1e 2=e 2e 1=e 2.

Definition 2.2

A Jordan algebra (V,∘) over R with an identity element e is said to be a Euclidean Jordan algebra iff there exists a symmetric, positive definite quadratic form Q on V which is also associative, that is,

$$Q(x\circ y, z) = Q( x, y\circ z).$$

Remark 2.1

In the sequel, we always assume that (V,∘) is a Euclidean Jordan algebra, which we simply denote by V.

Since “∘” is bilinear, for every \(x\in V\) there exists a matrix L(x) such that \(x\circ y=L(x)y\) for every \(y\in V\). In particular, \(L(x)e=x\) and \(L(x)x=x^2\). Furthermore, part (ii) of Definition 2.1 implies that the operators L(x) and \(L(x^2)\) commute: \(L(x^2)L(x)y=x^2\circ(x\circ y)=x\circ(x^2\circ y)=L(x)L(x^2)y\).

For each xV, we define

$$P(x):=2L(x)^2-L\bigl(x^2\bigr),$$
(1)

where \(L(x)^2:=L(x)L(x)\). The map P(x) is called the quadratic representation of x, which is an essential concept in the theory of Jordan algebras.
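For example, in the Euclidean Jordan algebra of real symmetric n×n matrices with the product \(X\circ Y:=\frac{1}{2}(XY+YX)\), a direct computation gives

$$P(X)Y=2L(X)^2Y-L\bigl(X^2\bigr)Y=\frac{X^2Y+2XYX+YX^2}{2}-\frac{X^2Y+YX^2}{2}=XYX,$$

so P(X) reduces to the familiar map \(Y\mapsto XYX\) from semidefinite optimization.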

An element xV is called invertible iff there exists a \(y=\sum_{i=0}^{k}\alpha_{i} x^{i}\) for some finite k<∞ and real numbers α i such that xy=yx=e, and denoted as x −1.

A cone K is said to be symmetric iff it is self-dual and homogeneous. The symmetric cones are precisely the self-scaled cones introduced by Nesterov and Todd [1, 2]. For any Euclidean Jordan algebra V, the corresponding cone of squares

$${K}({V}):=\bigl\{x^2: x\in{V}\bigr\}$$
(2)

is indeed a symmetric cone.
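For instance, the cone of squares of \(\mathbb{R}^n\) with the componentwise product is the non-negative orthant \(\mathbb{R}^n_+\), and for the algebra of real symmetric matrices with \(X\circ Y=\frac{1}{2}(XY+YX)\) it is the cone of positive semidefinite matrices.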

The following theorem gives some major properties of the cone of squares in a Euclidean Jordan algebra.

Theorem 2.1

(Theorem III.2.1, Proposition III.2.2 in [23])

Let V be a Euclidean Jordan algebra. Then K(V) is a symmetric cone, and is the set of elements x in V for which L(x) is positive semidefinite. Furthermore, if x is invertible, then

$$P(x)\operatorname{int} {K}=\operatorname{int} {K},$$

where \(\operatorname{int}{K}\) is the interior of K.

A symmetric cone K in a Euclidean space V is said to be irreducible iff there do not exist non-trivial subspaces \(V_1\), \(V_2\) and symmetric cones \(K_1\subseteq V_1\), \(K_2\subseteq V_2\) such that V is the direct sum of \(V_1\) and \(V_2\), and K is the direct sum of \(K_1\) and \(K_2\).

Lemma 2.1

(Proposition II.4.5 in [23])

Any symmetric cone K is, in a unique way, the direct sum of irreducible symmetric cones.

A Euclidean Jordan algebra is said to be simple iff it cannot be represented as the orthogonal direct sum of two Euclidean Jordan algebras. Simple Euclidean Jordan algebras have been completely classified into five types, whose cones of squares are the irreducible symmetric cones. The details can be found in [23].

Since a Jordan algebra V is power associative, we can define the concepts of rank, minimal and characteristic polynomials, eigenvalues, trace, and determinant for it in the following way (see, e.g., [15]).

For any xV, let r be the smallest integer such that the set {e,x,…,x r} is linearly dependent. Then r is the degree of x which we denote as \(\operatorname{deg}(x)\). The rank of V, \(\operatorname{rank}({V})\), is the largest \(\operatorname{deg}(x)\) of any number xV. An element xV is called regular iff its degree equals the rank of V.

Remark 2.2

In the sequel, unless stated otherwise, we always assume that V is a Euclidean Jordan algebra with \(\operatorname{rank}({V})=r\).

For a regular element xV, since {e,x,…,x r} is linearly dependent, there are real numbers a 1(x),…,a r (x) such that the minimal polynomial of every regular element x is given by

$$f(\lambda;x)=\lambda^r - a_1(x)\lambda^{r-1} +\cdots+(-1)^ra_r (x)$$
(3)

which is also the characteristic polynomial of the regular element x. The coefficient \(a_1(x)\) is called the trace of x, denoted by \(\operatorname{tr}(x)\). The coefficient \(a_r(x)\) is called the determinant of x, denoted by \(\operatorname{det}(x)\).

An element cV is said to be idempotent iff c≠0 and c 2=c. Two idempotents c 1 and c 2 are said to be orthogonal if c 1c 2=0. We say that {c 1,…,c r } is a complete system of orthogonal primitive idempotents, or a Jordan frame, iff each c i is a primitive idempotent, c i c j =0,ij, and \(\sum_{i=1}^{n}c_{i} = e\). Note that Jordan frames always contain r primitive idempotents, where r is the rank of V.

Theorem 2.2

(Spectral decomposition, Theorem III.1.2 in [23])

Let xV. Then there exists a Jordan frame {c 1,…,c r } and real numbers λ 1(x),…,λ r (x) such that

$$ x=\sum_{i=1}^{r}\lambda_i(x)c_i.$$
(4)

The numbers \(\lambda_i(x)\) (with their multiplicities) are the eigenvalues of x. Furthermore,

$$\operatorname{tr}(x)=\sum_{i=1}^r\lambda_i(x) \quad\mbox{\textit{and}}\quad \operatorname{det}(x)=\prod _{i=1}^r \lambda_i(x).$$
(5)

In fact, \(\lambda_1(x),\ldots,\lambda_r(x)\) are exactly the roots of the characteristic polynomial f(λ;x). Note that, since \(e=c_1+\cdots+c_r\) has eigenvalue 1 with multiplicity r, it follows that \(\operatorname{tr}(e) = r\) and \(\operatorname{det}(e) = 1\). For any \(x\in V\), one can easily verify that

$$ x\in{K} \quad\Longleftrightarrow\quad \lambda_i(x)\geq 0,\quad i=1,\ldots, r,$$
(6)

and

$$ x\in\operatorname{int} {K}\quad\Longleftrightarrow\quad \lambda_i(x)> 0,\quad i=1,\ldots, r.$$
(7)
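As an illustration, consider the Jordan algebra associated with the second-order cone: \(V=\mathbb{R}^n\) (n≥2) with the product \(x\circ y:=(x^Ty;\,x_1\bar{y}+y_1\bar{x})\), where we write \(x=(x_1;\bar{x})\) with \(x_1\in\mathbb{R}\) and \(\bar{x}\in\mathbb{R}^{n-1}\). This algebra has identity e=(1;0) and rank r=2, and for \(\bar{x}\neq0\) the spectral decomposition (4) reads

$$x=\bigl(x_1+\|\bar{x}\|\bigr)c_1+\bigl(x_1-\|\bar{x}\|\bigr)c_2,\qquad c_{1,2}=\frac{1}{2}\biggl(1;\,\pm\frac{\bar{x}}{\|\bar{x}\|}\biggr),$$

so that \(\operatorname{tr}(x)=2x_1\), \(\operatorname{det}(x)=x_1^2-\|\bar{x}\|^2\), and (6) recovers the usual description \(x_1\geq\|\bar{x}\|\) of the second-order cone.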

The importance of the spectral decomposition (4) is that it enables us to extend the definition of any real-valued, continuous univariate function ψ(t) to elements of a Euclidean Jordan algebra using eigenvalues.

Throughout the paper, we assume that ψ(t) is a real-valued univariate function on [0,+∞[ that is differentiable on ]0,+∞[ with ψ′(t)>0 for all t>0. We are now ready to show how a vector-valued function can be obtained from the function ψ(t).

Definition 2.3

Let xV with the spectral decomposition defined by (4). The vector-valued function ψ(x) is defined by

$$ \psi(x):=\psi\bigl(\lambda_1(x)\bigr)c_1+\cdots +\psi\bigl(\lambda_r(x)\bigr) c_r.$$
(8)

Furthermore, replacing \(\psi(\lambda_i(x))\) in (8) by \(\psi'(\lambda_i(x))\) for \(i=1,\ldots,r\), we define the vector-valued function ψ′(x) as follows

$$ \psi'(x):=\psi'\bigl(\lambda_1(x)\bigr) c_1+ \cdots+\psi'\bigl(\lambda_r(x)\bigr) c_r.$$
(9)
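To make Definition 2.3 concrete, the following minimal Python sketch evaluates ψ(x) in the rank-two second-order-cone algebra of the example above; the helper names (`spectral_decomposition`, `psi_of_x`, `circ`) are ours and not part of the paper.

```python
import numpy as np

def spectral_decomposition(x):
    """Eigenvalues and Jordan frame of x = (x1; xbar) in the
    rank-two Jordan algebra of the second-order cone."""
    x1, xbar = x[0], x[1:]
    nrm = np.linalg.norm(xbar)
    # For xbar = 0 any unit vector yields a valid Jordan frame.
    u = xbar / nrm if nrm > 0 else np.eye(len(xbar))[0]
    lam = np.array([x1 + nrm, x1 - nrm])     # eigenvalues lambda_1, lambda_2
    c1 = 0.5 * np.concatenate(([1.0], u))    # primitive idempotents c_1, c_2
    c2 = 0.5 * np.concatenate(([1.0], -u))
    return lam, c1, c2

def psi_of_x(x, psi):
    """Vector-valued function (8): psi(x) = psi(lambda_1) c_1 + psi(lambda_2) c_2."""
    lam, c1, c2 = spectral_decomposition(x)
    return psi(lam[0]) * c1 + psi(lam[1]) * c2

def circ(a, b):
    """Jordan product of the second-order-cone algebra."""
    return np.concatenate(([a @ b], a[0] * b[1:] + b[0] * a[1:]))

# Example: the square root psi(t) = sqrt(t) of an interior point of the cone.
x = np.array([2.0, 1.0, 0.5])
root = psi_of_x(x, np.sqrt)
assert np.allclose(circ(root, root), x)   # root o root recovers x
```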

It should be pointed out that \(x\circ s=s\circ x\), but, in general, \(L(x)L(s)\neq L(s)L(x)\). We say that two elements x and y of V operator commute iff \(L(x)L(y)=L(y)L(x)\). In other words, x and y operator commute iff \(x\circ(y\circ z)=y\circ(x\circ z)\) for all \(z\in V\) (see, e.g., [15]).

Lemma 2.2

(Theorem 27 in [15])

Let x and s be two elements of a Euclidean Jordan algebra V. Then x and s operator commute if and only if there is a Jordan frame \(\{c_1,\ldots,c_r\}\) such that

$$x=\lambda_1(x)c_1+\cdots+\lambda_r(x)c_r\quad\mbox{\textit{and}}\quad s=\lambda_1(s)c_1+\cdots+\lambda_r(s)c_r.$$

Corollary 2.1

Let x,sV, and suppose they share a Jordan frame {c 1,…,c r }. Then

$$x \circ s=\lambda_1(x)\lambda_1(s)c_1+\cdots+\lambda_r(x)\lambda_r(s) c_r.$$

Proof

The desired result follows directly from the fact that {c 1,…,c r } is a complete system of orthogonal primitive idempotents. □

Corollary 2.2

Let x,sV and x+s=e. Then x and s operator commute.

Proof

Theorem 2.2 implies that there exists a Jordan frame \(\{c_1,\ldots,c_r\}\) such that

$$x=\lambda_1(x)c_1+\cdots+\lambda_r(x)c_r\quad\mbox{and}\quad e= c_1+\cdots+ c_r.$$

Thus, we get

$$s=e-x=\bigl(1-\lambda_1(x)\bigr)c_1+\cdots+\bigl(1-\lambda_r(x)\bigr)c_r.$$

This implies the desired result. □

Definition 2.4

Let x,sV. The elements x and s are approximately equivalent, and briefly denoted as xs, iff x and s operator commute, and the absolute value of the difference of their respective corresponding eigenvalues is small enough.

Theorem 2.3

Let x,sV, and suppose x and s operator commute. Then

$$\lambda_i(x+s)=\lambda_i(x)+\lambda_i(s),\quad i=1,\ldots,r.$$

Furthermore, if each \(|\lambda_i(s)|\), \(i=1,\ldots,r\), is small enough, one has

$$\psi(x+s)\approx\psi(x)+\psi'(x)\circ s.$$

Proof

Since x,sV, Theorem 2.2 implies that there exists a Jordan frame {c 1,…,c r } such that

$$x=\lambda_1(x)c_1+\cdots+\lambda_r(x)c_r\quad\mbox{and}\quad s=\lambda_1(s)c_1+\cdots+\lambda_r(s)c_r.$$

Hence, we have

$$x+s=\bigl(\lambda_1(x)+\lambda_1(s)\bigr)c_1+\cdots+\bigl(\lambda_r(x)+\lambda_r(s)\bigr)c_r.$$

This implies that

$$\lambda_i(x+s)=\lambda_i(x)+\lambda_i(s),\quad i=1,\ldots,r.$$

On the other hand, since ψ(t) is differentiable, we know that, if each \(|\lambda_i(s)|\), \(i=1,\ldots,r\), is small enough, then

$$\psi\bigl(\lambda_i(x)+\lambda_i(s)\bigr)\approx\psi \bigl(\lambda_i(x)\bigr)+\psi'\bigl(\lambda_i(x)\bigr)\lambda_i(s),\quad i=1,\ldots,r.$$

Hence, we have

$$\psi(x+s)=\sum_{i=1}^r\psi\bigl(\lambda_i(x)+\lambda_i(s)\bigr)c_i\approx\sum_{i=1}^r\bigl(\psi\bigl(\lambda_i(x)\bigr)+\psi'\bigl(\lambda_i(x)\bigr)\lambda_i(s)\bigr)c_i=\psi(x)+\psi'(x)\circ s.$$

The last equality holds due to Corollary 2.1. This completes the proof. □

Corollary 2.3

Let x,sV and x+s=e. If |λ i (s)| with i=1,…,r is small enough, one has

$$\psi(x+s)\approx\psi(x)+\psi'(x)\circ s.$$

Proof

The corollary follows immediately from Corollary 2.2 and Theorem 2.3. □

For any x,sV, we define

$$ \langle x,s\rangle:=\operatorname{tr}(x\circ s)$$
(10)

and refer to it as the trace inner product. The Frobenius norm induced by this trace inner product, denoted \(\|\cdot\|_F\), is defined by

$$ \| x\|_F:=\sqrt{\langle x,x\rangle}.$$
(11)

Then

$$\| x\|_F=\sqrt{\operatorname{tr}\bigl(x^2\bigr)}=\sqrt {\sum_{i=1}^r\lambda_i^2(x)}.$$
(12)

Furthermore, we have

$$ \big|\lambda_\mathrm{max}(x)\big|\leq\| x\|_F\quad \mbox{and}\quad \big|\lambda_\mathrm{min}(x)\big|\leq\| x\|_F,$$
(13)

where \(\lambda_\mathrm{max}(x)\) and \(\lambda_\mathrm{min}(x)\) denote the largest and the smallest eigenvalue of x, respectively.

Lemma 2.3

Let xV. Then

$$\lambda_\mathrm{max}(x)e-x\in{K}\quad\mbox{\textit{and}}\quad x-\lambda_\mathrm{min}(x)e\in{K}.$$

Proof

Theorem 2.2 implies that there exists a Jordan frame \(\{c_1,\ldots,c_r\}\) such that

$$x=\sum_{i=1}^r\lambda_i(x)c_i\quad\mbox{and}\quad e=\sum_{i=1}^rc_i.$$

Hence, we have

$$\lambda_\mathrm{max}(x)e-x=\lambda_\mathrm{max}(x)\sum _{i=1}^rc_i-\sum _{i=1}^r\lambda_i(x)c_i=\sum _{i=1}^r\bigl(\lambda_\mathrm{max}(x)-\lambda_i(x)\bigr)c_i.$$

It follows from (6) that

$$\lambda_\mathrm{max}(x)e-x\in{K}.$$

The proof of the second part of the lemma follows in a similar way. □

Corollary 2.4

Let xV andx F ≤1. Then

$$e-x\in{K}.$$

Proof

Since λ max(x)≤∥x F , the corollary follows immediately from Lemma 2.3. □

Lemma 2.4

Let xV and exxK. Then

$$0\leq\| e-x\circ x\|_F^2 \leq r.$$

Proof

Theorem 2.1 implies that \(\lambda_i(x\circ x)\geq0\) for \(i=1,\ldots,r\). Thus, we have

$$0\leq\| e-x\circ x\|_F^2=\sum _{i=1}^r\lambda_i^2(e-x\circ x)=\sum_{i=1}^r\bigl(1-\lambda_i(x\circ x)\bigr)^2\leq r.$$

The last inequality holds due to the fact that \(0\leq1-\lambda_i(x\circ x)\leq1\). This completes the proof. □

The following well-known inequalities are needed in the analysis of the interior-point algorithm presented in Fig. 1.

Lemma 2.5

(Lemma 2.12 in [22])

Let xV. Then

$$\big\| x^2 \big\|_F\leq\|x\|_F^2.$$

Lemma 2.6

(Lemma 14 in [15])

Let x,sV. Then

$$\lambda_\mathrm{min}(x+s)\geq\lambda_\mathrm{min}(x)+\lambda_\mathrm{min}(s)\geq\lambda_\mathrm{min}(x)-\| s\|_F$$

and

$$\lambda_\mathrm{max}(x+s)\leq\lambda_\mathrm{max}(x)+\lambda_\mathrm{max}(s)\leq\lambda_\mathrm{max}(x)+\| s\|_F.$$

Lemma 2.7

(Lemma 30 in [15])

Let x,sK. Then

$$\big\| P(x)^{1/2}s-e \big\|_F \leq\|x\circ s-e \|_F.$$

Lemma 2.8

(Theorem 4 in [24])

Let x,sK. Then

$$\lambda_\mathrm{min}\bigl(P(x)^{1/2}s\bigr)\geq \lambda_\mathrm{min}(x\circ s).$$

Recall that two matrices X and S are said to be similar iff they share the same eigenvalues, including their multiplicities. In this case, we write XS. Analogously, we say that two elements x and s in V are similar, and briefly denoted as xs, iff x and s share the same eigenvalues, including their multiplicities (see, e.g., [16, 22]).

Lemma 2.9

Let x,sV and xs. Then

$$\operatorname{det}(x)=\operatorname{det}(s)\quad\mbox{\textit{and}}\quad \operatorname{tr}(x)=\operatorname{tr}(s).$$

Furthermore, one has

$$x\in{K}\ (\operatorname{int} {K})\quad\Longleftrightarrow\quad s\in{K}\ (\operatorname{int} {K}).$$

Proof

The desired results follow directly from Theorem 2.2. □

The following lemma gives the so-called NT-scaling of V, which was first established by Faybusovich [25] in the framework of Euclidean Jordan algebras.

Lemma 2.10

(NT-scaling, Lemma 3.2 in [25])

Let \(x,s\in\operatorname{int}{K}\). Then there exists a unique \(w\in\operatorname{int}{K}\) such that

$$x=P(w)s.$$

Moreover,

$$w=P(x)^{\frac{1}{2}} \bigl(P\bigl(x^{\frac{1}{2}}\bigr)s \bigr)^{-\frac{1}{2}}\bigl[=P\bigl(s^{-\frac{1}{2}}\bigr) \bigl(P\bigl(s^{\frac{1}{2}}\bigr)x\bigr)^{\frac{1}{2}} \bigr].$$

The point w is called the scaling point of x and s (in this order).

As a consequence, there exists \(v\in\operatorname{int}{K}\) such that

$$v=P(w)^{-\frac{1}{2}}x=P(w)^{\frac{1}{2}}s.$$
(14)

Note that \(P(w)^{\frac{1}{2}}\) and its inverse \(P(w)^{-\frac{1}{2}}\) are automorphisms of \(\operatorname{int}{K}\) (see, e.g., [22]).
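For orientation, in the LO case \(V=\mathbb{R}^n\) with the componentwise product \(x\circ s=(x_1s_1,\ldots,x_ns_n)\) and \(K=\mathbb{R}^n_+\), one has \(L(x)=\operatorname{Diag}(x)\) and hence \(P(x)=2L(x)^2-L(x^2)=\operatorname{Diag}(x)^2\); the scaling point of Lemma 2.10 and the vector v of (14) then reduce to the componentwise quantities

$$w=\sqrt{x/s}\quad\mbox{and}\quad v=\sqrt{x\circ s}.$$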

Lemma 2.11

Let t>0 and vK. Then

$$\| t e-v\|_F\leq \frac{1}{t+\lambda_\mathrm{min}(v)} \big\|t^2 e-v\circ v\big\|_F.$$

Proof

It follows from Corollary 2.1 and Theorem 2.3 that

$$\| te-v\|_F^2=\sum_{i=1}^r\bigl(t-\lambda_i(v)\bigr)^2=\sum_{i=1}^r\frac{(t^2-\lambda_i^2(v))^2}{(t+\lambda_i(v))^2}\leq\frac{1}{(t+\lambda_\mathrm{min}(v))^2}\big\|t^2e-v\circ v\big\|_F^2,$$

where we used \(\lambda_i(v\circ v)=\lambda_i^2(v)\) and \(\lambda_i(v)\geq0\). This completes the proof. □

3 Primal–Dual Path-Following Interior-Point Algorithm for SO

In this section, after reviewing the concept of the central path for SO, we extend Darvay’s technique for LO to SO and obtain the new search directions for SO. The generic primal–dual path-following interior-point algorithm for SO is also presented.

3.1 The Central Path for SO

We consider the SO problem given in the standard form

$$\mbox{(SOP)}\qquad\min \bigl\{\langle c, x\rangle:Ax=b, x\in {K} \bigr\} ,$$

and its dual problem

$$\mbox{(SOD)}\qquad\max \bigl\{b^T y:A^Ty + s=c, s\in{K} \bigr\} ,$$

where c and the rows of A lie in V, and \(b\in\mathbb{R}^m\). We call (SOP) feasible iff there exists \(x\in K\) such that Ax=b, and strictly feasible if, in addition, \(x\in\operatorname{int}{K}\). Similarly, we call (SOD) feasible iff there exists \((y,s)\in\mathbb{R}^m\times K\) such that \(A^Ty+s=c\), and strictly feasible if, in addition, \(s\in\operatorname{int}{K}\). Throughout the paper, we assume that the matrix A has full row rank, i.e., \(\operatorname{rank}(A)=m\).

Throughout the paper, we assume that both (SOP) and (SOD) satisfy the interior-point condition (IPC), i.e., both (SOP) and (SOD) are strictly feasible. This can be achieved via the so-called homogeneous self-dual embedding (see, e.g., [7, 24]). Under the IPC, finding an optimal solution of (SOP) and (SOD) is equivalent to solving the following system

$$Ax=b,\quad x\in{K},\qquad A^Ty+s=c,\quad s\in{K},\qquad x\circ s=0.$$

(15)

The basic idea of primal–dual IPMs is to replace the third equation in (15), the so-called complementarity condition for (SOP) and (SOD), by the parameterized equation x∘s=μe with μ>0. This yields the following system

$$Ax=b,\quad x\in{K},\qquad A^Ty+s=c,\quad s\in{K},\qquad x\circ s=\mu e.$$

(16)

Since the IPC holds and A has full row rank, the parameterized system (16) has a unique solution (x(μ),y(μ),s(μ)) for each μ>0 [12, 16]; we call x(μ) the μ-center of (SOP) and (y(μ),s(μ)) the μ-center of (SOD). The set of μ-centers (with μ running through all positive real numbers) forms a homotopy path, which is called the central path. As μ→0, the limit of the central path exists, and since the limit points satisfy the complementarity condition x∘s=0, the limit naturally yields an optimal solution for (SOP) and (SOD) (see, e.g., [12, 16]).

3.2 The New Search Directions for SO

Similarly to the LO case [17], we replace the standard centering equation x∘s=μe by \(\psi(\frac{x\circ s}{\mu })=\psi(e)\), where ψ(⋅) is the vector-valued function induced by the univariate function ψ(t). Then we consider the following system

$$Ax=b,\quad x\in{K},\qquad A^Ty+s=c,\quad s\in{K},\qquad \psi\biggl(\frac{x\circ s}{\mu}\biggr)=\psi(e).$$

(17)

A direct application of Newton’s method to the nonlinear system (17), with μ fixed, goes as follows. For any strictly feasible \(x\in\operatorname{int}{K}\) and \(s\in\operatorname{int}{K}\), we want to find displacements (Δx,Δy,Δs) such that

$$A\Delta x=0,\qquad A^T\Delta y+\Delta s=0,\qquad \psi\biggl(\frac{(x+\Delta x)\circ(s+\Delta s)}{\mu}\biggr)=\psi(e).$$

(18)

The third equation of the system (18) is equivalent to

$$ \psi \biggl(\frac{x\circ s}{\mu}+\frac{x\circ\Delta s +\Delta x\circ s+\Delta x \circ\Delta s}{\mu} \biggr)=\psi(e).$$
(19)

Note that, since ψ′(t)>0 makes ψ injective, (19) holds iff

$$\frac{x\circ s}{\mu}+\frac{x\circ\Delta s +\Delta x\circ s+\Delta x\circ\Delta s}{\mu}=e.$$
(20)

Neglecting the quadratic term Δx∘Δs, from Corollary 2.3 we can replace (19) by

$$ \psi \biggl(\frac{x\circ s}{\mu} \biggr)+\psi' \biggl(\frac{x\circ s}{\mu} \biggr)\circ \biggl(\frac{x\circ\Delta s +\Delta x\circ s}{\mu} \biggr)=\psi(e).$$
(21)

This enables us to rewrite the system (18) as follows

$$A\Delta x=0,\qquad A^T\Delta y+\Delta s=0,\qquad \psi'\biggl(\frac{x\circ s}{\mu}\biggr)\circ\frac{x\circ\Delta s+\Delta x\circ s}{\mu}=\psi(e)-\psi\biggl(\frac{x\circ s}{\mu}\biggr).$$

(22)

Due to the fact that, in general, L(x)L(s)≠L(s)L(x), this system does not always have a unique solution. It is well known that this difficulty can be overcome by applying a scaling scheme (see, e.g., [12, 15]), which goes as follows.

Lemma 3.1

(Lemma 28 in [15])

Let \(u\in\operatorname{int}{K}\). Then

$$x\circ s=\mu e\quad \Longleftrightarrow \quad P(u)x\circ P(u)^{-1}s=\mu e.$$

Now we replace the third equation of the system (18) by

$$\psi \biggl(\frac{P(u)(x+\Delta x)\circ P(u)^{-1}(s+\Delta s)}{\mu} \biggr)=\psi(e).$$
(23)

Applying Newton’s method again and neglecting the term \(P(u)\Delta x\circ P(u)^{-1}\Delta s\), from Corollary 2.3, we get

$$\psi\biggl(\frac{P(u)x\circ P(u)^{-1}s}{\mu}\biggr)+\psi'\biggl(\frac{P(u)x\circ P(u)^{-1}s}{\mu}\biggr)\circ\frac{P(u)x\circ P(u)^{-1}\Delta s+P(u)\Delta x\circ P(u)^{-1}s}{\mu}=\psi(e).$$

(24)

In this paper, we consider the NT-scaling scheme [1, 2]. Let \(u=w^{-\frac{1}{2}}\), where w is the NT-scaling point of x and s. We define

$$ v:=\frac{P(w)^{-\frac {1}{2}}x}{\sqrt{\mu}} \biggl[=\frac{P(w)^{\frac{1}{2}}s}{\sqrt{\mu}} \biggr]$$
(25)

and

$$ \bar{A}:=\frac{1}{\sqrt{\mu }}AP(w)^{\frac {1}{2}},\quad\quad d_x:=\frac{P(w)^{-\frac{1}{2}}\Delta x}{\sqrt{\mu}},\quad\quad d_s:=\frac{P(w)^{\frac{1}{2}}\Delta s}{\sqrt{\mu}}.$$
(26)
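In the LO case noted after Lemma 2.10, the definitions (25) and (26) reduce to the componentwise quantities \(v=\sqrt{\frac{x\circ s}{\mu}}\), \(d_x=\sqrt{\frac{s}{\mu x}}\circ\Delta x\), and \(d_s=\sqrt{\frac{x}{\mu s}}\circ\Delta s\), which are the scaled displacements familiar from LO.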

Replacing the third equation of the system (22) by (24), and then using (25) and (26), after some elementary reductions, we obtain

$$\bar{A}d_x=0,\qquad \bar{A}^T\Delta y+d_s=0,\qquad d_x+d_s=p_v,$$

(27)

where

$$p_v=v^{-1}\circ\bigl(\psi'(v\circ v)\bigr)^{-1}\circ\bigl(\psi(e)-\psi(v\circ v)\bigr).$$

The system (27) has a unique solution (see, e.g., [16]). By choosing the function ψ(t) appropriately, this system can be used to define the new search directions. For example:

  • ψ(t)=t yields \(p_v=v^{-1}-v\), which gives the classical NT search direction (see, e.g., [22]);

  • ψ(t)=t^2 yields \(p_{v}=\frac{1}{2}(v^{-3}-v)\), which gives a search direction that differs from the one defined in [16] only by a constant multiplier;

  • \(\psi(t)=t^{\frac{q+1}{2}}\), \(q\geq0\), yields \(p_{v}=\frac{2}{q+1}(v^{-q}-v)\), which includes both of the above cases.

In order to facilitate the analysis of the interior-point algorithm, we restrict our analysis to the case \(\psi(t)=\sqrt{t}\) (i.e., q=0), which yields

$$ p_v=2(e-v).$$
(28)
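Indeed, for \(\psi(t)=\sqrt{t}\) one has \(\psi'(t)=\frac{1}{2\sqrt{t}}\), so that \(\psi(v\circ v)=v\), \((\psi'(v\circ v))^{-1}=2v\), and \(\psi(e)-\psi(v\circ v)=e-v\). Since all factors share the Jordan frame of v, Corollary 2.1 shows that \(p_v\) acts eigenvalue-wise, and

$$p_v=\sum_{i=1}^r\frac{1}{\lambda_i(v)}\cdot2\lambda_i(v)\cdot\bigl(1-\lambda_i(v)\bigr)c_i=2(e-v).$$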

The new search directions \(d_x\) and \(d_s\) are obtained by solving (27) with (28), and then Δx and Δs are computed via (26). If (x,y,s)≠(x(μ),y(μ),s(μ)), then (Δx,Δy,Δs) is nonzero. The new iterate is obtained by taking a full Nesterov–Todd step as follows

$$ x_+:=x+\Delta x,\quad y_+:=y+\Delta y,\quad s_+:=s+\Delta s.$$
(29)

For the analysis of the algorithm, we define a norm-based proximity measure δ(x,s;μ) as follows

$$\delta(v):=\delta(x,s;\mu):=\frac{\| p_v\|_F}{2}=\| e-v\|_F.$$
(30)

From the first two equations of the system (27), it follows that \(d_x\) and \(d_s\) are orthogonal, i.e.,

$$ \langle d_x, d_s\rangle=\langle d_s,d_x\rangle=0.$$
(31)

Furthermore, we can conclude that

$$\delta(v)=0 \quad\Longleftrightarrow\quad v=e\quad\Longleftrightarrow\quad d_x=d_s=0\quad\Longleftrightarrow\quad x\circ s=\mu e.$$
(32)

Hence, the value of δ(v) can be considered as a measure for the distance between the given triple (x,y,s) and the μ-center (x(μ),y(μ),s(μ)).

3.3 The Generic Primal–Dual Path-Following Interior-Point Algorithm for SO

The generic primal–dual path-following interior-point algorithm is now presented in Fig. 1.
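For concreteness, the following minimal Python sketch records the loop structure of the algorithm in Fig. 1, as analyzed in Sect. 4; the oracle `newton_step` (our name, not part of the paper) stands for solving the scaled system (27) with p_v=2(e−v) and transforming back through (26).

```python
from math import sqrt

def full_nt_step_algorithm(x, y, s, r, inner, newton_step, eps):
    """Generic full Nesterov-Todd step path-following method (cf. Fig. 1).

    x, y, s     -- strictly feasible start with delta(x, s; mu) <= 1/2
    r           -- rank of the underlying Euclidean Jordan algebra
    inner       -- trace inner product <a, b> = tr(a o b)
    newton_step -- oracle returning (dx, dy, ds) from (27)-(28) via (26)
    eps         -- accuracy parameter
    """
    mu = inner(x, s) / r            # mu^0, as in Lemma 4.7
    theta = 1.0 / (2.0 * sqrt(r))   # barrier update parameter of Theorem 4.1
    while inner(x, s) >= eps:
        dx, dy, ds = newton_step(x, y, s, mu)
        x, y, s = x + dx, y + dy, s + ds   # full NT step (29); no line search
        mu = (1.0 - theta) * mu            # update the barrier parameter
    return x, y, s
```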

4 Analysis of the Algorithm

In this section, we show that our algorithm solves the SO problem in polynomial time, and we prove the local quadratic convergence of the algorithm.

Let us further define

$$ q_v=d_x-d_s.$$
(33)

Then, we have

$$ d_x=\frac{p_v+q_v}{2}, \quad\quad d_s=\frac {p_v-q_v}{2},\quad\quad d_x\circ d_s=\frac{p_v\circ p_v-q_v\circ q_v}{4}.$$
(34)

It follows from (31) that \(\|p_v\|_F=\|q_v\|_F=2\delta(v)\).

For 0≤α≤1, we define

$$ x(\alpha):=x+\alpha\Delta x\quad\mbox{and}\quad s(\alpha):=s+\alpha\Delta s.$$
(35)

The following lemma gives a sufficient condition for a feasible step-length \(\bar{\alpha}> 0\) such that \(x(\bar{\alpha})\in\operatorname{int}{K}\) and \(s(\bar{\alpha })\in\operatorname{int}{K}\).

Lemma 4.1

Let \(x,s\in\operatorname{int}{K}\) and \(x(\alpha)\circ s(\alpha)\in\operatorname{int}{K}\) for \(\alpha\in[0,\bar{\alpha}]\). Then

$$x(\bar{\alpha})\in\operatorname{int} {K}\quad\mbox{\textit{and}}\quad s(\bar{\alpha})\in \operatorname{int} {K}.$$

Proof

Since \(x(\alpha)\circ s(\alpha)\in\operatorname{int}{K}\) for all \(\alpha\in[0,\bar{\alpha}]\), Lemma 2.15 in [22] (if \(x\circ s\in\operatorname{int}{K}\), then \(\operatorname{det}(x)\neq0\)) implies that \(\operatorname{det}(x(\alpha))\) and \(\operatorname{det}(s(\alpha))\) do not vanish for any \(\alpha\in[0,\bar{\alpha}]\). Since \(\operatorname{det}(x(0))=\operatorname{det}(x)>0\) and \(\operatorname{det}(s(0))=\operatorname{det}(s)>0\), by continuity, \(\operatorname{det}(x(\alpha))\) and \(\operatorname{det}(s(\alpha))\) stay positive for all \(\alpha\in[0,\bar{\alpha}]\). Hence, by Theorem 2.2, all the eigenvalues of \(x(\alpha)\) and \(s(\alpha)\) stay positive for all \(\alpha\in[0,\bar{\alpha}]\); in particular, all the eigenvalues of \(x(\bar{\alpha})\) and \(s(\bar{\alpha})\) are positive. This implies the desired result. □

From (29), (25), and (26), we get

$$x_+=x+\Delta x=\sqrt{\mu}P(w)^{\frac{1}{2}}(v+d_x)\quad\mbox{and}\quad s_+=s+\Delta s=\sqrt{\mu}P(w)^{-\frac{1}{2}}(v+d_s).$$

Since \(P(w)^{\frac{1}{2}}\) and its inverse \(P(w)^{-\frac{1}{2}}\) are automorphisms of \(\operatorname{int}{K}\), Theorem 2.1 implies that x + and s + belong to \(\operatorname{int}{K}\) if and only if v+d x and v+d s belong to \(\operatorname{int}{K}\).

The following lemma shows the strict feasibility of the full Nesterov–Todd step under the condition δ(x,s;μ)<1.

Lemma 4.2

Let δ:=δ(x,s;μ)<1. Then the full Nesterov–Todd step is strictly feasible.

Proof

For 0≤α≤1, we define

$$v_x(\alpha ):=v+\alpha d_x\quad\mbox{and}\quad v_s(\alpha):=v+\alpha d_s.$$
(36)

Then

$$\begin{aligned}v_x(\alpha)\circ v_s(\alpha)&=v\circ v+\alpha v\circ p_v+\alpha^2 d_x\circ d_s\\&=(1-\alpha)(v\circ v)+\alpha\biggl(e-\frac{p_v\circ p_v}{4}\biggr)+\alpha^2\frac{p_v\circ p_v-q_v\circ q_v}{4}\\&=(1-\alpha)(v\circ v)+\alpha\biggl(e-(1-\alpha)\frac{p_v\circ p_v}{4}-\alpha\frac{q_v\circ q_v}{4}\biggr).\end{aligned}$$

(37)

The second to last equality holds due to

$$v\circ v+v\circ p_v=v\circ v+2v\circ(e-v)=e-(e-v)\circ(e-v)=e-\frac {p_v \circ p_v}{4}.$$

Furthermore, since 0≤α≤1 and \(\|p_v\|_F=\|q_v\|_F=2\delta<2\), Lemma 2.5 yields

$$\biggl\Vert(1-\alpha)\frac{p_v \circ p_v}{4}+\alpha\frac{q_v\circ q_v}{4}\biggr\Vert_F\leq(1-\alpha)\frac{\|p_v\|_F^2}{4}+\alpha\frac{\|q_v\|_F^2}{4}=\delta^2<1.$$
Corollary 2.4 implies that

$$e-(1-\alpha)\frac{p_v \circ p_v}{4}-\alpha\frac{q_v\circ q_v}{4}\in \operatorname{int}{K}.$$

Thus

$$(1-\alpha) (v \circ v)+\alpha \biggl(e-(1-\alpha) \frac{p_v \circ p_v}{4}-\alpha \frac{q_v \circ q_v}{4} \biggr)\in\operatorname{int} {K}.$$

Applying Lemma 4.1 with \(\bar{\alpha}=1\), we obtain

$$v_x(1)=v+d_x\in\operatorname{int} {K}\quad\mbox{and}\quad v_s(1)=v+d_s\in\operatorname{int} {K}$$

Hence, the result of the lemma holds. □

According to (25), the v-vector after the step is given by

$$v_+:=\frac{P(w_+)^{-\frac{1}{2}}x_+}{\sqrt{\mu}} \biggl[=\frac{P(w_+)^{\frac{1}{2}}s_+}{\sqrt{\mu}} \biggr],$$
(38)

where \(w_+\) is the scaling point of \(x_+\) and \(s_+\).

Lemma 4.3

(Proposition 5.9.3 in [16])

One has

$$v_+\sim \bigl(P(v+d_x)^{\frac{1}{2}}(v+d_s)\bigr)^{\frac{1}{2}}.$$

In the next lemma, we proceed to prove the local quadratic convergence of the full Nesterov–Todd step to the target point (x(μ),y(μ),s(μ)).

Lemma 4.4

Let δ:=δ(x,s;μ)<1. Then

$$\delta(x_+,s_+;\mu)\leq\frac{\delta^2}{1+\sqrt{1-\delta^2}}.$$

Thus \(\delta(x_+,s_+;\mu)\leq\delta^2\), which shows the quadratic convergence of the algorithm.

Proof

From (37), with α=1, we get

$$ (v+d_x) \circ(v+d_s)= e-\frac{q_v \circ q_v}{4}.$$
(39)

By Lemma 2.6, (13), and Lemma 2.5, we obtain

$$\lambda_\mathrm{min}\bigl((v+d_x)\circ(v+d_s)\bigr)=\lambda_\mathrm{min}\biggl(e-\frac{q_v\circ q_v}{4}\biggr)\geq1-\biggl\Vert\frac{q_v\circ q_v}{4}\biggr\Vert_F\geq1-\frac{\|q_v\|_F^2}{4}=1-\delta^2.$$
It follows from Lemma 4.3 and Lemma 2.8 that

$$\lambda_\mathrm{min}(v_+\circ v_+)=\lambda_\mathrm{min}\bigl(P(v+d_x)^{\frac{1}{2}}(v+d_s)\bigr)\geq\lambda_\mathrm{min}\bigl((v+d_x)\circ(v+d_s)\bigr)\geq1-\delta^2.$$

(40)

Using Lemma 2.11 with t=1, Lemma 2.7, (39), (40), and Lemma 2.5, we have

$$\begin{aligned}\delta(x_+,s_+;\mu)&=\|e-v_+\|_F\leq\frac{\|e-v_+\circ v_+\|_F}{1+\lambda_\mathrm{min}(v_+)}\leq\frac{\bigl\Vert e-P(v+d_x)^{\frac{1}{2}}(v+d_s)\bigr\Vert_F}{1+\sqrt{1-\delta^2}}\\&\leq\frac{\bigl\Vert e-(v+d_x)\circ(v+d_s)\bigr\Vert_F}{1+\sqrt{1-\delta^2}}=\frac{1}{1+\sqrt{1-\delta^2}}\biggl\Vert\frac{q_v\circ q_v}{4}\biggr\Vert_F\leq\frac{\delta^2}{1+\sqrt{1-\delta^2}}.\end{aligned}$$

This completes the proof. □

The following lemma gives an upper bound of the duality gap after a full Nesterov–Todd step.

Lemma 4.5

After a full Nesterov–Todd step, one has

$$\langle x_+, s_+\rangle\leq r\mu.$$

Proof

Since

$$e-\frac{q_v \circ q_v}{4}=(v+d_x) \circ(v+d_s)\in \operatorname{int} {K},$$

and since \(P(w)^{\frac{1}{2}}\) is self-adjoint with respect to the trace inner product, using \(\operatorname{tr}(q_v\circ q_v)=\|q_v\|_F^2\geq0\) we obtain

$$\langle x_+, s_+\rangle=\mu\bigl\langle v+d_x, v+d_s\bigr\rangle=\mu\operatorname{tr}\biggl(e-\frac{q_v \circ q_v}{4}\biggr)=r\mu-\frac{\mu\|q_v\|_F^2}{4}\leq r\mu.$$

This completes the proof. □

In the following lemma, we investigate the effect on the proximity measure of a full Nesterov–Todd step followed by an update of the parameter μ.

Lemma 4.6

Let δ:=δ(x,s;μ)<1 and \(\mu_+=(1-\theta)\mu\), where 0<θ<1. Then

$$\delta(x_+,s_+;\mu_+)\leq\frac{\theta\sqrt{r}+\delta^2}{1-\theta +\sqrt{(1-\theta)(1-\delta^2)}}.$$

Proof

After updating \(\mu_+=(1-\theta)\mu\), the vector \(v_+\) is divided by the factor \(\sqrt{1-\theta}\). From Lemma 2.11 with \(t=\sqrt{1-\theta}\), Lemma 2.7, (39), and (40), we have

$$\begin{aligned}\delta(x_+,s_+;\mu_+)&=\biggl\Vert e-\frac{v_+}{\sqrt{1-\theta}}\biggr\Vert_F=\frac{1}{\sqrt{1-\theta}}\bigl\Vert \sqrt{1-\theta}\,e-v_+\bigr\Vert_F\leq\frac{\bigl\Vert(1-\theta)e-v_+\circ v_+\bigr\Vert_F}{\sqrt{1-\theta}\,\bigl(\sqrt{1-\theta}+\lambda_\mathrm{min}(v_+)\bigr)}\\&\leq\frac{\theta\|e\|_F+\|e-v_+\circ v_+\|_F}{1-\theta+\sqrt{(1-\theta)(1-\delta^2)}}\leq\frac{\theta\sqrt{r}+\delta^2}{1-\theta+\sqrt{(1-\theta)(1-\delta^2)}}.\end{aligned}$$
This completes the proof. □

Corollary 4.1

Let \(\delta:=\delta(x,s;\mu) \leq\frac{1}{2}\) and \(\theta=\frac{1}{2\sqrt{r}}\) with r≥4. Then

$$\delta(x_+,s_+;\mu_+)\leq\frac{1}{2}.$$

Proof

Note that

$$1-\theta=1-\frac{1}{2\sqrt{r}}\geq1-\frac{1}{2\sqrt{4}}=\frac{3}{4}.$$

From Lemma 4.6, using \(\theta\sqrt{r}=\frac{1}{2}\), \(\delta^2\leq\frac{1}{4}\), and \(1-\delta^2\geq\frac{3}{4}\), we get

$$\delta(x_+,s_+;\mu_+)\leq\frac{\frac{1}{2}+\frac{1}{4}}{\frac{3}{4}+\sqrt{\frac{3}{4}\cdot\frac{3}{4}}}=\frac{1}{2}.$$

This completes the proof. □

The following lemma gives an upper bound for the total number of iterations produced by our algorithm.

Lemma 4.7

Suppose that \(x^0\) and \(s^0\) are strictly feasible, \(\mu^0=\frac{\langle x^0, s^0\rangle}{r}\), and \(\delta(x^0,s^0;\mu^0)\leq\frac{1}{2}\). Moreover, let \(x^k\) and \(s^k\) be the iterates obtained after k iterations. Then the inequality \(\langle x^k, s^k\rangle\leq\varepsilon\) is satisfied for

$$k\geq\frac{1}{\theta}\log{\frac{\langle x^0, s^0\rangle }{\varepsilon}}.$$

Proof

Lemma 4.5 implies that

$$\bigl\langle x^k, s^k\bigr\rangle\leq r\mu^k=r(1-\theta)^k\mu^0=(1-\theta)^k \bigl\langle x^0, s^0\bigr\rangle.$$

Then the inequality 〈x k,s k〉≤ε holds if

$$(1-\theta)^k \bigl\langle x^0, s^0\bigr \rangle\leq\varepsilon.$$

Taking logarithms, we obtain

$$k\log{(1-\theta)}+\log{\bigl\langle x^0, s^0\bigr \rangle}\leq\log{\varepsilon},$$

and using −log(1−θ)≥θ, we observe that the above inequality holds if

$$k\theta\geq\log{\bigl\langle x^0, s^0\bigr\rangle}-\log{\varepsilon}=\log{\frac{\langle x^0,s^0\rangle}{\varepsilon}}.$$

Hence, the result of this lemma holds. □

Theorem 4.1

Let \(\theta=\frac{1}{2\sqrt{r}}\). Then the algorithm requires at most

$$O \biggl(\sqrt{r}\log{\frac{\langle x^0,s^0\rangle}{\varepsilon}} \biggr)$$

iterations. The output is a primal–dual pair (x,s) satisfyingx,s〉≤ε.

Proof

With \(\theta=\frac{1}{2\sqrt{r}}\), Theorem 4.1 follows immediately from Lemma 4.7. □

Corollary 4.2

If one takes \(x^0=s^0=e\), then \(\langle x^0, s^0\rangle=\operatorname{tr}(e)=r\), and the iteration bound becomes

$$O \biggl(\sqrt{r}\log{\frac{r}{\varepsilon}} \biggr)$$

which is the currently best known iteration bound for the small-update method.
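As a quick numerical illustration of Corollary 4.2 (the concrete values below are ours, not from the paper), the bound of Lemma 4.7 with \(\theta=\frac{1}{2\sqrt{r}}\) and \(x^0=s^0=e\) can be evaluated directly:

```python
from math import ceil, log, sqrt

def iteration_bound(r, eps):
    """Iteration bound 2*sqrt(r)*log(r/eps) of Lemma 4.7 for x0 = s0 = e."""
    return ceil(2 * sqrt(r) * log(r / eps))

print(iteration_bound(50, 1e-6))   # -> 251 iterations for r = 50, eps = 1e-6
```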

5 Conclusions and Remarks

In this paper, we have shown that the full-Newton step primal–dual path-following interior-point algorithm for LO presented in [17] can be extended to the context of SO. Although the proposed algorithm is a direct extension from LO to SO, the analysis is not straightforward, owing to the non-polyhedral structure of symmetric cones. By employing Euclidean Jordan algebras, we derived the currently best known iteration bound for the small-update method. This unifies the analysis for LO, SOCO, and SDO.

Some interesting topics for further research remain. The search direction used in this paper is based on the NT-scaling scheme; it may be possible to design similar algorithms using other scaling schemes and to obtain polynomial-time iteration bounds. Another topic for further research is the development of full Nesterov–Todd step primal–dual infeasible interior-point algorithms for SO. Finally, an interesting topic is the generalization of the analysis of the (infeasible) interior-point algorithms to the case \(\psi(t)=t^{\frac{q+1}{2}}\), \(q\geq0\).