1 Introduction

Symmetric programming problems [1, 2] are convex optimization problems in which a linear objective function is optimized over the intersection of an affine linear manifold with the Cartesian product of symmetric cones. Linear programming, second-order cone programming [3, 4] and semidefinite programming [5, 6] are well-known and important special cases of symmetric programming. Interior-point methods are among the most effective methods developed for general symmetric programming problems (see, for example, [1, 2, 7,8,9]). Complete interior-point methods have also been developed for particular classes of optimization problems over symmetric cones; see, for example, [10,11,12,13] for linear programming, [3, 4, 14,15,16,17,18] for second-order cone programming, and [5, 6, 19,20,21,22] for semidefinite programming.

Interior-point methods consist of two elements: computing a step, or search direction, that moves the current iterate toward the central path, and computing a displacement step-size that enforces feasibility (more specifically, positive definiteness) of the subsequent iterate. Throughout this paper, we use the term “displacement step” instead of “displacement step-size” for the sake of simplicity. In fact, the main challenge in establishing an iteration of an interior-point algorithm is the determination and computation of the displacement step along the search direction. Calculating the displacement step using classical line search methods is undesirable and, in general, even impossible [20, 23]. Touil et al. [23] developed an interior-point algorithm for semidefinite programming and presented some strategies for determining the displacement step for that algorithm. Extensions of interior-point methods from semidefinite programming to general symmetric cones can be found in the literature (see for example [14,15,16,17,18]), but only for calculating the search direction, not for preserving the feasibility conditions thereafter. This paper presents some strategies for determining the displacement step for a particular interior-point method on general symmetric cones. So, our focus in this paper is not on the search direction but on the displacement step; in other words, we are not developing a new step, but a new step size.

We can distinguish two types of interior-point methods: projective methods [10,11,12, 24, 25] and (feasible or infeasible) central trajectory methods [19, 23, 26, 27]. The interior-point method developed by Touil et al. [23] for semidefinite programming is a feasible central trajectory algorithm. The focus of this paper is on solving symmetric programs by feasible central trajectory methods. More specifically, this work exploits the Jordan algebraic structure of the symmetric cone to derive a feasible primal-dual central trajectory algorithm for symmetric programming and presents some strategies for determining the displacement step for the proposed algorithm, extending the work of Touil et al. [23] for the semidefinite case to the general symmetric case.

In this paper, we associate a perturbed problem with the symmetric programming problem and apply Newton’s method to the corresponding perturbed equations in order to obtain a descent search direction. The main contribution of this paper is that it proposes four different selections for calculating the displacement step along the search direction while solving symmetric programs. Our work is not a trivial generalization of that in [23], however, because a deep understanding of Euclidean Jordan algebras is essential to establish the analysis of this paper. The main difficulties in developing our analysis lie in establishing and proving the selections for determining the displacement step in the framework of Euclidean Jordan algebras.

There are widely available software packages that can handle symmetric programs in general, and second-order cone programs in particular. Optimization solvers such as MOSEK [28] and SDPT3 [29] are based on interior-point methods and are well suited for symmetric programming problems. The efficiency of our proposed algorithms is shown by presenting numerical experiments on second-order cone programs and comparing them with SDPT3 and MOSEK.

This paper is organized as follows. In Sect. 2, we briefly review the basics of the Euclidean Jordan algebra associated with the symmetric cone and provide some preliminary results. In Sect. 3, we introduce the problem formulation and make some assumptions, then we apply Newton’s method to the perturbed problem. In Sect. 4, we propose four different selections for calculating an appropriate displacement step. Section 5 is devoted to presenting a feasible primal-dual central trajectory algorithm. The complexity analysis of the proposed algorithm is presented in Sect. 6. In Sect. 7, we compare the four selections for computing the displacement step and demonstrate the efficiency of the proposed algorithm on some numerical examples of second-order cone programs. Section 8 contains some concluding remarks.

2 Background and preliminary results

In this section, we provide the necessary background for the subsequent sections. First, we review some statistical inequalities needed to derive some of the results in this work. Then, we briefly review the definitions of self-dual, homogeneous and symmetric cones. Finally, we review the basics of the theory of Euclidean Jordan algebras.

2.1 Preliminary statistical inequalities

In this part, we review some statistical inequalities needed for the derivation of some of the results in the paper.

Let \(x_1, x_2, \ldots , x_n \in {\mathbb {R}}\) be a sample of size n. Its mean \(\bar{x}\) and its standard deviation \(\sigma _{\varvec{x}}\) are respectively defined as

$$\begin{aligned} \bar{ x} := \frac{1}{n} \sum _{k=1}^{n} x_k \;\;\;\text {and}\;\;\; \sigma _{\varvec{x}}^2 := \frac{1}{n} \sum _{k=1}^{n} x^2_k - {{\bar{x}}} ^2 = \frac{1}{n} \sum _{k=1}^{n} (x_k - {{\bar{x}}})^2. \end{aligned}$$

The first statement in the following proposition is due to [30] and the second statement is due to [20, Theorem 5].

Proposition 2.1

Assume that \(\varvec{x} \in {\mathbb {R}}^n\). Then we have

$$\begin{aligned}&{{\bar{x}}} - \sigma _{\varvec{x}} \sqrt{n-1} \le \displaystyle \min _{1\le k \le n} x_k \le {{\bar{x}}} - \displaystyle \frac{\sigma _{\varvec{x}}}{\sqrt{n-1}}, \;\;\;\;\text {and}\\&\quad {{\bar{x}}} + \displaystyle \frac{\sigma _{\varvec{x}}}{\sqrt{n-1}} \le \displaystyle \max _{1\le k \le n} x_k \le {{\bar{x}}} + \sigma _{\varvec{x}} \sqrt{n-1}. \end{aligned}$$

In particular, if \(x_k>0\) for all \(k=1, 2, \ldots , n\), then we also have

$$\begin{aligned}&n \ln \left( {{\bar{x}}} - \sigma _{\varvec{x}} \sqrt{n-1} \right) \le A \le \sum _{k=1}^{n} \ln (x_k) \le B \le n \ln ({{\bar{x}}}), \;\;\text {where}\\&\quad A = (n-1) \ln \left( {{\bar{x}}} + \displaystyle \frac{\sigma _{\varvec{x}}}{\sqrt{n-1}} \right) + \ln \left( {{\bar{x}}} - \sigma _{\varvec{x}} \sqrt{n-1} \right) ,\;\;\text {and}\\&\quad B = (n-1) \ln \left( {{\bar{x}}} - \displaystyle \frac{\sigma _{\varvec{x}}}{\sqrt{n-1}} \right) + \ln \left( {{\bar{x}}} + \sigma _{\varvec{x}} \sqrt{n-1} \right) . \end{aligned}$$
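For illustration, the following Python sketch numerically checks the bounds of Proposition 2.1 on an arbitrary positive sample; the sample and variable names are ours and the snippet is not part of the algorithmic development.

```python
import numpy as np

x = np.array([1.0, 1.2, 1.5, 0.9, 1.4])     # an arbitrary positive sample, n = 5
n = x.size
mean = x.mean()
sigma = np.sqrt(np.mean(x**2) - mean**2)     # standard deviation as defined above

# Bounds on the extreme entries.
assert mean - sigma*np.sqrt(n-1) <= x.min() <= mean - sigma/np.sqrt(n-1)
assert mean + sigma/np.sqrt(n-1) <= x.max() <= mean + sigma*np.sqrt(n-1)

# Bounds on the sum of logarithms (meaningful here since mean - sigma*sqrt(n-1) > 0).
A = (n-1)*np.log(mean + sigma/np.sqrt(n-1)) + np.log(mean - sigma*np.sqrt(n-1))
B = (n-1)*np.log(mean - sigma/np.sqrt(n-1)) + np.log(mean + sigma*np.sqrt(n-1))
S = np.log(x).sum()
assert n*np.log(mean - sigma*np.sqrt(n-1)) <= A <= S <= B <= n*np.log(mean)
print("All bounds of Proposition 2.1 hold for this sample.")
```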

2.2 Preliminary conical definitions

A cone is said to be closed iff it is closed under the taking of sequential limits, convex iff it is closed under taking convex combinations, pointed iff it does not contain two opposite nonzero vectors (so the origin is an extreme point), solid iff it has a nonempty interior, and regular iff it is a closed, convex, pointed, solid cone.

Definition 2.1

Let \(\mathcal {V}\) be a finite-dimensional Euclidean vector space over \({\mathbb {R}}\) with an inner product “\(\langle \varvec{\cdot }, \varvec{\cdot } \rangle \)”. The dual cone of a regular cone \(\mathcal {K} \subset \mathcal {V}\) is denoted by \(\mathcal {K}^\star \) and is defined as

$$\begin{aligned} \mathcal {K}^\star :=\{\varvec{s} \in \mathcal {V}: \langle \varvec{x}, \varvec{s} \rangle \ge 0,\ \forall \varvec{x} \in \mathcal {K}\}. \end{aligned}$$

A regular cone \(\mathcal {K}\) is said to be self-dual if it coincides with its dual cone, i.e., \(\mathcal {K} = \mathcal {K}^\star \).

The general linear group of degree n over \({\mathbb {R}}\) is denoted by \(GL(n, {\mathbb {R}})\) and is defined to be the set of all nonsingular matrices of order n with entries from \({\mathbb {R}}\), together with the operation of ordinary matrix multiplication. For a regular cone \(\mathcal {K} \subset \mathcal {V}\), we denote by \(\text {Aut}(\mathcal {K})\) the automorphism group of \(\mathcal {K}\), i.e.,

$$\begin{aligned} \text {Aut}(\mathcal {K}):=\{\varphi \in GL(n, {\mathbb {R}}): \varphi (\mathcal {K})=\mathcal {K}\}. \end{aligned}$$

Definition 2.2

Let \(\mathcal {V}\) be a finite-dimensional real Euclidean space. A regular cone \(\mathcal {K} \subset \mathcal {V}\) is said to be homogeneous if for each \(\varvec{u}, \varvec{v} \in \text {int}\,\mathcal {K}\), there exists a nonsingular linear map \(F: \mathcal {V}\longrightarrow \mathcal {V}\) such that

$$\begin{aligned} F(\mathcal {K})=\mathcal {K}\;(\text {i.e.,}\; F\; \text {is an automorphism of}\; \mathcal {K}),\;\text { and}\;\; F(\varvec{u})=\varvec{v}. \end{aligned}$$

In other words, \(\text {Aut}(\mathcal {K})\) acts transitively on the interior of \(\mathcal {K}\).

A regular cone \(\mathcal {K}\) is said to be symmetric if it is both self-dual and homogeneous.

2.3 Preliminaries of Euclidean Jordan algebras

In this subsection, we review some basic definitions and notions from the theory of Jordan algebras. The text of Faraut and Korányi [31] covers the foundations of this theory. Our presentation in this subsection follows that of [1, Section 2]. In order to make our presentation concrete, we give some examples from the algebra of the second-order cone as an interesting paradigm for understanding the theory of Jordan algebras.

We use “,” for adjoining scalars, vectors and matrices in a row, and use “;” for adjoining them in a column. So, for instance, if \(\varvec{u}\) and \(\varvec{v}\) are vectors, we have

$$\begin{aligned} \left[ \begin{array}{c} \varvec{u} \\ \varvec{v} \end{array} \right] = (\varvec{u}^\mathsf{T }, \varvec{v^\mathsf{T }})^\mathsf{T }= (\varvec{u}; \varvec{v}). \end{aligned}$$

Let \(\mathcal {J}\) be a finite-dimensional vector space over \({\mathbb {R}}\). A map \(\circ : \mathcal {J} \times \mathcal {J} \longrightarrow \mathcal {J}\) is called bilinear if \((\alpha \varvec{x} + \beta \varvec{y})\circ \varvec{z}=\alpha (\varvec{x} \circ \varvec{z})+\beta (\varvec{y} \circ \varvec{z})\) and \(\varvec{x} \circ (\alpha \varvec{y} + \beta \varvec{z})=\alpha (\varvec{x} \circ \varvec{y})+\beta (\varvec{x} \circ \varvec{z})\) for all \(\varvec{x}, \varvec{y}, \varvec{z} \in \mathcal {J}\) and \(\alpha , \beta \in {\mathbb {R}}\). The vector space \(\mathcal {J}\) over \({\mathbb {R}}\) is called an algebra over \({\mathbb {R}}\) if a bilinear map \(\circ : \mathcal {J} \times \mathcal {J} \longrightarrow \mathcal {J}\) exists.

Let \(\varvec{x}\) be an element in an algebra \(\mathcal {J}\), then we define \(\varvec{x}^{(1)}:=\varvec{x}\) and define \(\varvec{x}^{(n)}\) recursively by \(\varvec{x}^{(n)} :=\varvec{x} \circ \varvec{x}^{(n-1)}\) for \(n \ge 2\).

Definition 2.3

Let \(\mathcal {J}\) be an algebra over \({\mathbb {R}}\) with a bilinear product \(\circ : \mathcal {J} \times \mathcal {J} \longrightarrow \mathcal {J}\). Then \((\mathcal {J}, \circ )\) is called a Jordan algebra if for all \(\varvec{x}, \varvec{y} \in \mathcal {J}\) we have

$$\begin{aligned}&\varvec{x} \circ \varvec{y}=\varvec{y} \circ \varvec{x}\;\hbox {(commutativity), and}\\&\varvec{x} \circ (\varvec{x}^{(2)} \circ \varvec{y})=\varvec{x}^{(2)} \circ (\varvec{x} \circ \varvec{y})\;\text {(Jordan's axiom).} \end{aligned}$$

The product \(\varvec{x} \circ \varvec{y}\) between two elements \(\varvec{x}\) and \(\varvec{y}\) of a Jordan algebra \((\mathcal {J}, \circ )\) is called the Jordan multiplication between \(\varvec{x}\) and \(\varvec{y}\). A Jordan algebra \((\mathcal {J}, \circ )\) has an identity element if there exists a (necessarily unique) element \(\varvec{e}\in \mathcal {J}\) such that \(\varvec{x} \circ \varvec{e} = \varvec{x}\) for all \(\varvec{x} \in \mathcal {J}\). A Jordan algebra \((\mathcal {J}, \circ )\) is not necessarily associative, that is, \(\varvec{x} \circ (\varvec{y} \circ \varvec{z}) =(\varvec{x} \circ \varvec{y}) \circ \varvec{z}\) may not generally hold. However, it is power associative, i.e., \(\varvec{x}^{(p)} \circ \varvec{x}^{(q)}=\varvec{x}^{(p+q)}\) for all integers \(p, q \ge 1\).

Example 2.1

For each vector \(\varvec{x} \in {\mathbb {R}}^n\) whose first entry is indexed with 0, we write \(\widetilde{\varvec{x}}\) for the subvector consisting of entries 1 through \(n-1\) (therefore \(\varvec{x} = (x_0; \widetilde{\varvec{x}}) \in {\mathbb {R}}\times {\mathbb {R}}^{{n}-1}\)). We let \(\mathcal {E}^{n}\) denote the \(n\)-dimensional real vector space \({\mathbb {R}}\times {\mathbb {R}}^{{n}-1}\) whose vectors \(\varvec{x}\) are indexed from 0. Consider the multiplication \(\circ : \mathcal {E}^n \times \mathcal {E}^n \longrightarrow \mathcal {E}^n\) defined as

$$\begin{aligned} \varvec{x} \circ \varvec{y} := \left[ \begin{array}{c} x_0y_0 +\varvec{{\widetilde{x}}}^\mathsf{T }\varvec{{\widetilde{y}}} \\ x_0 \varvec{{\widetilde{y}}} + y_0 \varvec{{\widetilde{x}}} \end{array} \right] , \end{aligned}$$
(1)

for \(\varvec{x}, \varvec{y} \in \mathcal {E}^n\). It is easy to see that the space \(\mathcal {E}^{n}\) with the multiplication “\(\circ \)” forms a Jordan algebra with the identity vector \( \varvec{e} := \left( 1; \varvec{0} \right) .\)
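As a concrete illustration (not part of the original development), the following Python sketch implements the multiplication (1) and checks the identity vector, commutativity and Jordan’s axiom numerically; all names are ours.

```python
import numpy as np

def jordan_prod(x, y):
    """x o y = (x0*y0 + x~^T y~ ; x0*y~ + y0*x~), as in (1)."""
    x0, xt = x[0], x[1:]
    y0, yt = y[0], y[1:]
    return np.concatenate(([x0*y0 + xt @ yt], x0*yt + y0*xt))

n = 4
e = np.concatenate(([1.0], np.zeros(n-1)))                  # identity vector e = (1; 0)
x = np.array([2.0, 0.5, -1.0, 0.3])
y = np.array([1.5, 0.2, 0.4, -0.7])

assert np.allclose(jordan_prod(x, e), x)                    # x o e = x
assert np.allclose(jordan_prod(x, y), jordan_prod(y, x))    # commutativity
x2 = jordan_prod(x, x)                                      # x^(2)
# Jordan's axiom: x o (x^(2) o y) = x^(2) o (x o y)
assert np.allclose(jordan_prod(x, jordan_prod(x2, y)),
                   jordan_prod(x2, jordan_prod(x, y)))
```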

Definition 2.4

A Jordan algebra \(\mathcal {J}\) is called Euclidean if there exists a map \( \bullet :(\mathcal {J}, \circ ) \times (\mathcal {J}, \circ )\longrightarrow {\mathbb {R}}\) such that for all \(\varvec{x}, \varvec{y}, \varvec{z} \in \mathcal {J}\) we have

(1) \(\varvec{x} \bullet \varvec{x} > 0\) for all \(\varvec{x} \ne \varvec{0}\) (positive definiteness);

(2) \(\varvec{x}\bullet \varvec{y} = \varvec{y}\bullet \varvec{x}\) (symmetry);

(3) \(\varvec{x}\bullet (\varvec{y} \circ \varvec{z}) =(\varvec{x} \circ \varvec{y})\bullet \varvec{z}\) (associativity).

That is, \(\mathcal {J}\) admits a positive definite, symmetric, quadratic form which is also associative.

The degree of an element \(\varvec{x} \in \mathcal {J}\) (denoted by \(\text {deg}(\varvec{x})\)) is the smallest integer d such that the set \(\left\{ \varvec{e}, \varvec{x}, \varvec{x}^{(2)}, \ldots , \varvec{x}^{(d)} \right\} \) is linearly dependent. The rank of a Jordan algebra \(\mathcal {J}\) (denoted by \(\text {rank}(\mathcal {J})\)) is the largest \(\text {deg}(\varvec{x})\) of any element \(\varvec{x} \in \mathcal {J}\). An element \(\varvec{x} \in \mathcal {J}\) is called regular if \(\deg (\varvec{x})= \text {rank}(\mathcal {J})\). Throughout the rest of this section, we let \(\mathcal {J}\) be a rank-r Euclidean Jordan algebra.

Let \(\varvec{x}\) be an element of degree d in a Euclidean Jordan algebra \(\mathcal {J}\). Since \(\{\varvec{e}, \varvec{x}, \varvec{x}^2, \ldots , \varvec{x}^d\}\) is linearly dependent, there are real numbers \(a_1(\varvec{x}), a_2(\varvec{x}), \ldots , a_d(\varvec{x})\), not all zero, such that

$$\begin{aligned} q(\varvec{x}) :=\varvec{x}^d - a_1(\varvec{x}) \varvec{x}^{d-1} + a_2(\varvec{x}) \varvec{x}^{d-2} + \cdots + (-1)^d a_d(\varvec{x}) \varvec{e} = \varvec{0}. \end{aligned}$$

We call q the minimal polynomial of \(\varvec{x}\). If \(\varvec{x}\) is a regular element of a Euclidean Jordan algebra, then we define its characteristic polynomial to be equal to its minimal polynomial. Since the set of regular elements is dense in \(\mathcal {J}\), we can extend characteristic polynomials to all \(\varvec{x}\) in \(\mathcal {J}\). So, the minimal polynomial coincides with the characteristic polynomial for regular elements and divides the characteristic polynomial of non-regular elements. Let \(\varvec{x}\) be an element in a rank-r algebra \(\mathcal {J}\), then its eigenvalues are the roots \(\lambda _1, \lambda _2, \ldots , \lambda _r\) of its characteristic polynomial \(p(\lambda ) =\lambda ^r - a_1(\varvec{x}) \lambda ^{r-1} + a_2(\varvec{x}) \lambda ^{r-2} + \cdots + (-1)^r a_r(\varvec{x})\).

Two elements \(\varvec{c}_1, \varvec{c}_2 \in \mathcal {J}\) are said to be orthogonal if \(\varvec{c}_1 \circ \varvec{c}_2 = \varvec{0}\). A set of elements of \(\mathcal {J}\) is orthogonal if all its elements are mutually orthogonal to each other. An element \(\varvec{c} \in \mathcal {J}\) is said to be an idempotent if \(\varvec{c}^{(2)}=\varvec{c}\). An idempotent is primitive if it is non-zero and cannot be written as a sum of two (necessarily orthogonal) non-zero idempotents. A subset \(\{ \varvec{c}_1, \varvec{c}_2, \ldots , \varvec{c}_k\}\) of \(\mathcal {J}\) is called a Jordan frame if it is a complete system (i.e., \(\varvec{c}_1 + \varvec{c}_2 + \cdots + \varvec{c}_k = \varvec{e}\)) of orthogonal primitive idempotents.

Note that Jordan frames always have r primitive idempotents in them, so in Jordan frames \(k=r\).

Theorem 2.1

(Spectral decomposition I, [31, Theorem 6]). Let \(\mathcal {J}\) be a Euclidean Jordan algebra. Then for each \(\varvec{x} \in \mathcal {J}\) there exist unique real numbers \(\lambda _1(\varvec{x}), \lambda _2(\varvec{x}), \ldots , \lambda _k(\varvec{x})\), all distinct, and a unique complete system of orthogonal idempotents \(\varvec{c}_1(\varvec{x}), \varvec{c}_2(\varvec{x}), \ldots , \varvec{c}_k(\varvec{x})\) such that

$$\begin{aligned} \varvec{x} = \lambda _1(\varvec{x}) \varvec{c}_1(\varvec{x})+ \cdots +\lambda _k(\varvec{x})\varvec{c}_k(\varvec{x}). \end{aligned}$$

Theorem 2.2

(Spectral decomposition II, [31, Theorem 7]). Let \(\mathcal {J}\) be a rank-r Euclidean Jordan algebra. Then for each \(\varvec{x} \in \mathcal {J}\) there exist real numbers \(\lambda _1(\varvec{x}), \lambda _2(\varvec{x}), \ldots , \lambda _r(\varvec{x})\) and a Jordan frame \(\varvec{c}_1(\varvec{x}), \varvec{c}_2(\varvec{x}), \ldots , \varvec{c}_r(\varvec{x})\) such that

$$\begin{aligned} \varvec{x} = \lambda _1(\varvec{x}) \varvec{c}_1(\varvec{x})+ \cdots +\lambda _r(\varvec{x})\varvec{c}_r(\varvec{x}), \end{aligned}$$
(2)

and \(\lambda _1, \lambda _2, \ldots , \lambda _r\) are the eigenvalues of \(\varvec{x}\).

The elements \(\varvec{x}\) and \(\varvec{y}\) of a Jordan algebra are called simultaneously decomposed if they share a Jordan frame, i.e., \(\varvec{x} = \lambda _1(\varvec{x})\, \varvec{c}_1 + \cdots + \lambda _r(\varvec{x})\, \varvec{c}_r\) and \(\varvec{y} = \lambda _1(\varvec{y})\, \varvec{c}_1 + \cdots + \lambda _r(\varvec{y})\, \varvec{c}_r\) for a Jordan frame \(\{\varvec{c}_1, \ldots ,\varvec{c}_r\}\) (hence, we have \(\varvec{c}_i(\varvec{x})=\varvec{c}_i(\varvec{y})\) for each \(i=1, \ldots , r\)). Two elements \(\varvec{x}\) and \(\varvec{y}\) are said to operator commute if for all \(\varvec{z}\), we have that \(\varvec{x} \circ (\varvec{y}\circ \varvec{z}) = \varvec{y} \circ (\varvec{x} \circ \varvec{z})\). Two elements of a Euclidean Jordan algebra are simultaneously decomposed iff they operator commute [1, Theorem 27].

Let \(\varvec{x}\) be an element in a Jordan algebra \(\mathcal {J}\) with the spectral decomposition given in (2). Then \(\text {trace}(\varvec{x}):=\lambda _1(\varvec{x}) +\cdots + \lambda _r(\varvec{x})\) is called the trace of \(\varvec{x}\) in \(\mathcal {J}\), and \(\det (\varvec{x}):=\lambda _1(\varvec{x}) \cdots \lambda _r(\varvec{x})\) is the determinant of \(\varvec{x}\) in \(\mathcal {J}\).

For a Euclidean Jordan algebra \(\mathcal {J}\), we define the Frobenius inner product between two elements \(\varvec{x}, \varvec{y} \in \mathcal {J}\) as

$$\begin{aligned} \varvec{x} \bullet \varvec{y} := \text {trace}(\varvec{x} \circ \varvec{y}). \end{aligned}$$

Due to the bilinearity of “\(\circ \)”, one can show that the map “\(\bullet \)” satisfies the three conditions in Definition 2.4. (See [1, Section 2]).

Example 2.2

One can easily show that the algebra \(\mathcal {E}^{n}\), with the Jordan multiplication “\(\,\circ \)” defined in (1), forms a Euclidean Jordan algebra under the inner product \(\varvec{x} \bullet \varvec{y} =2 \varvec{x}^\mathsf{T }\varvec{y}=2\left( x_0y_0+\varvec{{\widetilde{x}}}^\mathsf{T }\varvec{{\widetilde{y}}}\right) \).

Now, for \(\varvec{x} \in \mathcal {J}\) having the spectral decomposition given in (2), we can define \(\varvec{x}^2\) as

$$\begin{aligned} \varvec{x}^2:=(\lambda _1 (\varvec{x}))^2 \varvec{c}_1 (\varvec{x}) +\cdots + (\lambda _r (\varvec{x}))^2 \varvec{c}_r (\varvec{x}). \end{aligned}$$

One can easily see that \(\varvec{x}^2 = \varvec{x} \circ \varvec{x}= \varvec{x}^{(2)}\).

If \(\det (\varvec{x}) \ne 0\) (i.e., all \(\lambda _i(\varvec{x}) \ne 0\)), then we say that \(\varvec{x}\) is nonsingular. In this case, the inverse of \(\varvec{x}\), which is denoted by \(\varvec{x}^{-1}\), is the unique element that satisfies \(\varvec{x}^{-1}\circ \varvec{x} = \varvec{e}\). Therefore,

$$\begin{aligned} \varvec{x}^{-1}:= \displaystyle \frac{1}{\lambda _1 (\varvec{x})} \varvec{c}_1 (\varvec{x}) +\cdots + \displaystyle \frac{1}{\lambda _r (\varvec{x})} \varvec{c}_r (\varvec{x}). \end{aligned}$$

More generally, if \(f:{\mathbb {R}}\rightarrow {\mathbb {R}}\) is continuous, then it is also possible to extend the above definition to define \(f(\varvec{x})\) as

$$\begin{aligned} f(\varvec{x}):=f(\lambda _1 (\varvec{x})) \varvec{c}_1 (\varvec{x}) +\cdots + f(\lambda _r (\varvec{x})) \varvec{c}_r (\varvec{x}). \end{aligned}$$

In particular, the square root of \(\varvec{x}\) is defined as

$$\begin{aligned} \varvec{x}^{1/2}:=\sqrt{\lambda _1 (\varvec{x})} \varvec{c}_1 (\varvec{x}) +\cdots + \sqrt{\lambda _r (\varvec{x})} \varvec{c}_r (\varvec{x}), \end{aligned}$$

provided that \(\lambda _i (\varvec{x}) \ge 0, i=1, 2, \ldots , r\).
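For the algebra \(\mathcal {E}^n\) of Example 2.1 (rank \(r=2\)), the spectral decomposition and the induced functions \(f(\varvec{x})\) can be written explicitly: \(\lambda _{1,2}(\varvec{x}) = x_0 \pm \Vert \widetilde{\varvec{x}}\Vert \) and \(\varvec{c}_{1,2}(\varvec{x}) = \frac{1}{2}\left( 1; \pm \widetilde{\varvec{x}}/\Vert \widetilde{\varvec{x}}\Vert \right) \). The following Python sketch is our own illustration (it reuses jordan_prod from the earlier listing) and verifies the decomposition and the definitions of \(\varvec{x}^{-1}\) and \(\varvec{x}^{1/2}\).

```python
import numpy as np

def spectral(x):
    """Eigenvalues and Jordan frame of x in E^n (rank 2)."""
    x0, xt = x[0], x[1:]
    nt = np.linalg.norm(xt)
    u = xt/nt if nt > 0 else np.zeros_like(xt)   # any unit vector works if x~ = 0
    lam = np.array([x0 + nt, x0 - nt])
    c1 = 0.5*np.concatenate(([1.0],  u))
    c2 = 0.5*np.concatenate(([1.0], -u))
    return lam, (c1, c2)

def apply_f(x, f):
    """f(x) := f(lambda_1) c_1 + f(lambda_2) c_2."""
    lam, (c1, c2) = spectral(x)
    return f(lam[0])*c1 + f(lam[1])*c2

x = np.array([2.0, 0.5, -1.0, 0.3])              # x0 > ||x~||, so all eigenvalues > 0
e = np.concatenate(([1.0], np.zeros(x.size - 1)))
lam, (c1, c2) = spectral(x)
assert np.allclose(lam[0]*c1 + lam[1]*c2, x)     # spectral decomposition (2)
xinv  = apply_f(x, lambda t: 1.0/t)
xsqrt = apply_f(x, np.sqrt)
assert np.allclose(jordan_prod(xinv, x), e)      # x^{-1} o x = e
assert np.allclose(jordan_prod(xsqrt, xsqrt), x) # (x^{1/2})^2 = x
```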

The cone of squares of \(\mathcal {J}\) is defined as \(\mathcal {K}_\mathcal {J}:=\left\{ \varvec{x}^2 : \varvec{x} \in \mathcal {J} \right\} \). The logarithmic barrier function, which is defined on the interior of \(\mathcal {K}_\mathcal {J}\) (denoted by \(\text {int}\,\mathcal {K}_\mathcal {J}\)) as \(\ell (\varvec{x}) :=- \ln \det \left( \varvec{x} \right) \), will play an important role for our subsequent development.

In the following theorem, we give the Jordan algebraic characterization of symmetric cones.

Theorem 2.3

[1, Theorem 2] A regular cone is symmetric iff it is the cone of squares of some Euclidean Jordan algebra.

Example 2.3

The cone of squares of the algebra \((\mathcal {E}^n, \circ )\), with “\(\,\circ \)” defined in (1), is the second-order cone, which is defined as

$$\begin{aligned} \mathcal {E}^{n}_+ :=\left\{ \varvec{x} : x_0 \ge \Vert \widetilde{\varvec{x}}\Vert \right\} . \end{aligned}$$

The mappings introduced in the following definition play an important role in the development of interior-point methods for conic programming [1].

Definition 2.5

Let \(\varvec{x}\) be an element in a Jordan algebra \(\mathcal {J}\). Then

(a) The linear map \(L(\varvec{x}) : \mathcal {J} \longrightarrow \mathcal {J}\) is defined as \(L(\varvec{x}) \varvec{y} :=\varvec{x} \circ \varvec{y}\) for all \(\varvec{y} \in \mathcal {J}\).

(b) The quadratic representation of \(\varvec{x}\), \(Q_{(\varvec{x})}: \mathcal {J} \longrightarrow \mathcal {J}\), is defined as \(Q_{(\varvec{x})} := 2 L(\varvec{x})^2 - L(\varvec{x}^2)\).

Note that \(Q_{(\varvec{x})}\) is also a linear operator on \(\mathcal {J}\). Note also that \(L(\varvec{x}) \varvec{e} = \varvec{x}, L(\varvec{x}) \varvec{x} = \varvec{x}^2\), \(L(\varvec{e}) = Q_{(\varvec{e})} =I, \; \text {trace}(\varvec{e}) = r\) and \(\det (\varvec{e})=1\) (since all the eigenvalues of \(\varvec{e}\) are equal to one).
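In \(\mathcal {E}^n\), the operator \(L(\varvec{x})\) is the “arrow” matrix with first row \((x_0, \widetilde{\varvec{x}}^\mathsf{T })\), first column \((x_0; \widetilde{\varvec{x}})\) and \(x_0\) on the remaining diagonal. The following Python sketch is our own illustration (it reuses jordan_prod from the earlier listing); it builds \(L(\varvec{x})\) and \(Q_{(\varvec{x})} = 2L(\varvec{x})^2 - L(\varvec{x}^2)\) and checks the identities noted above.

```python
import numpy as np

def Lmat(x):
    """Matrix of the linear map y -> x o y in E^n (arrow matrix)."""
    n = x.size
    M = x[0]*np.eye(n)
    M[0, 1:] = x[1:]
    M[1:, 0] = x[1:]
    return M

def Qmat(x):
    """Quadratic representation Q_x = 2 L(x)^2 - L(x^2)."""
    return 2*Lmat(x) @ Lmat(x) - Lmat(jordan_prod(x, x))

n = 4
e = np.concatenate(([1.0], np.zeros(n-1)))
x = np.array([2.0, 0.5, -1.0, 0.3])

assert np.allclose(Lmat(x) @ e, x)                    # L(x)e = x
assert np.allclose(Lmat(x) @ x, jordan_prod(x, x))    # L(x)x = x^2
assert np.allclose(Qmat(e), np.eye(n))                # Q_e = I
assert np.allclose(Qmat(x) @ e, jordan_prod(x, x))    # Q_x e = x^2
```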

The Frobenius norm of an element \(\varvec{x} \in \mathcal {J}\) is defined as \(\Vert \varvec{x}\Vert _F := \sqrt{\varvec{x} \bullet \varvec{x}}= \sqrt{\sum _{i=1}^{r} \lambda _i^2(\varvec{x})}\), and the spectral norm as \(\Vert \varvec{x}\Vert := \max _{1\le i \le r} |\lambda _i(\varvec{x})|\). For any \(\varvec{x}, \varvec{y}\in \mathcal {J}\), we have

$$\begin{aligned} \Vert \varvec{x} \circ \varvec{y}\Vert _F \le \Vert \varvec{x}\Vert ~ \Vert \varvec{y}\Vert _{F} \le \Vert \varvec{x}\Vert _{F}~ \Vert \varvec{y}\Vert _{F}\;\;\text {and}\;\; \Vert \varvec{x} + \varvec{y}\Vert _{F}^2 = \Vert \varvec{x}\Vert _{F}^2 + \Vert \varvec{y}\Vert _{F}^2 + 2 \varvec{x}\bullet \varvec{y}. \end{aligned}$$
(3)

All the above notions are also used in the block sense as follows. Let \((\mathcal {J}_1, \circ _1, \bullet _1), (\mathcal {J}_2, \circ _2, \bullet _2), \ldots \), \((\mathcal {J}_q, \circ _q, \bullet _q)\), be Euclidean Jordan algebras with identities \(\varvec{e}_1, \varvec{e}_2, \ldots , \varvec{e}_q\) and cones of squares \(\mathcal {K}_{\mathcal {J}_1}\), \(\mathcal {K}_{\mathcal {J}_2}\), \(\ldots , \mathcal {K}_{\mathcal {J}_q}\), respectively. Let also \(\varvec{x} := (\varvec{x}_1; \varvec{x}_2; \ldots ; \varvec{x}_q)\) and \(\varvec{y} :=(\varvec{y}_1; \varvec{y}_2; \ldots ; \varvec{y}_q)\) with \(\varvec{x}_i, \varvec{y}_i \in \mathcal {J}_i\) for \(i=1, 2, \ldots , q\). Then

(a) \(\mathcal {K}_\mathcal {J} :=\mathcal {K}_{\mathcal {J}_1} \times \mathcal {K}_{\mathcal {J}_2} \times \cdots \times \mathcal {K}_{\mathcal {J}_q}\) is the cone of squares of \(\mathcal {J} := \mathcal {J}_1 \times \mathcal {J}_2 \times \cdots \times \mathcal {J}_q\).

(b) \(\varvec{e} := \left( \varvec{e}_1; \varvec{e}_2; \ldots ; \varvec{e}_q\right) \) is the identity vector of \(\mathcal {J}\).

(c) \(\varvec{x} \circ \varvec{y} := (\varvec{x}_1 \circ _1 \varvec{y}_1; \varvec{x}_2 \circ _2 \varvec{y}_2; \ldots ; \varvec{x}_q \circ _q \varvec{y}_q)\).

(d) \(\varvec{x}\bullet \varvec{y} := {\varvec{x}_1}\bullet _1 \varvec{y}_1 + {\varvec{x}_2}\bullet _2 \varvec{y}_2 + \cdots + {\varvec{x}_q}\bullet _q \varvec{y}_q\).

(e) \(\text {L}(\varvec{x}):= \text {L}(\varvec{x}_1) \oplus \text {L}(\varvec{x}_2) \oplus \cdots \oplus \text {L}(\varvec{x}_q)\), where \(\oplus \) denotes the direct (block-diagonal) sum of operators.

(f) \(Q_{(\varvec{x})}:= Q_{(\varvec{x}_1)} \oplus Q_{(\varvec{x}_2)} \oplus \cdots \oplus Q_{(\varvec{x}_q)}\).

(g) \(f(\varvec{x}) := (f(\varvec{x}_1); f(\varvec{x}_2); \ldots ; f(\varvec{x}_q))\). In particular, \(\varvec{x}^{-1} := \left( {\varvec{x}_1}^{-1}; {\varvec{x}_2}^{-1}; \ldots ; {\varvec{x}_q}^{-1}\right) \).

(h) \(\Vert \varvec{x} \Vert _F^2 := \Vert \varvec{x}_1 \Vert _F^2 + \Vert \varvec{x}_2 \Vert _F^2 + \cdots + \Vert \varvec{x}_q \Vert _F^2\).

(i) The logarithmic barrier is defined on \(\text {int}\,\mathcal {K}_{\mathcal {J}}\) as \(\ell (\varvec{x}) := - \sum _{i=1}^{q} \ln \det \left( \varvec{x}_i \right) \).

(j) \(\varvec{x}\) and \(\varvec{y}\) operator commute iff \(\varvec{x}_i\) and \(\varvec{y}_i\) operator commute for each \(i=1, 2, \ldots , q\).

We write \(\varvec{x} \succeq _{\mathcal {K}_{\mathcal {J}}} \varvec{0}\) (or simply \(\varvec{x} \succeq \varvec{0}\) when \(\mathcal {K}_{\mathcal {J}}\) is known from the context) to mean that \(\varvec{x} \in \mathcal {K}_{\mathcal {J}}\) (i.e., \(\varvec{x}\) is positive semidefinite). We also write \(\varvec{x} \succ \varvec{0}\) to mean that \(\varvec{x} \in \text {int} \mathcal {K}_{\mathcal {J}}\) (i.e., \(\varvec{x}\) is positive definite), and write \(\varvec{x} \succeq \varvec{y}\; (\varvec{x} \succ \varvec{y})\) to mean that \(\varvec{x} - \varvec{y} \succeq \varvec{0}\; (\varvec{x} - \varvec{y} \succ \varvec{0})\). Note that \(\varvec{x} \succeq \varvec{0}\) (\(\varvec{x} \succ \varvec{0}\)) iff \(\lambda _i (\varvec{x}) \ge 0 ~(\lambda _i (\varvec{x}) > 0)\) for \(i=1, 2, \ldots , r\).

3 Problem formulation and Newton’s method

In this section, we introduce the primal and dual forms of the symmetric programming problem and the associated perturbed problem. Then we apply Newton’s method to the perturbed problem.

3.1 Problem formulation and assumptions

Let \(q \ge 1\) be an integer. For each \(i=1, 2, \ldots , q\), let \(n, n_i, r\) and \(r_i\) be positive integers such that \(n:=\sum _{i=1}^{q}n_i\) and \(r:=\sum _{i=1}^{q}r_i\). Let \((\mathcal {J}_1, \circ _1, \bullet _1), (\mathcal {J}_2, \circ _2, \bullet _2), \ldots \), \((\mathcal {J}_q, \circ _q, \bullet _q)\) be Euclidean Jordan algebras with dimensions \(n_1, n_2, \ldots , n_q\), ranks \(r_1, r_2, \ldots , r_q\), identities \(\varvec{e}_1, \varvec{e}_2, \ldots , \varvec{e}_q\) and cones of squares \(\mathcal {K}_{\mathcal {J}_1}, \mathcal {K}_{\mathcal {J}_2}, \ldots , \mathcal {K}_{\mathcal {J}_q}\), respectively, and let

$$\begin{aligned} \begin{array}{lll} (\mathcal {J}, \circ , \bullet ) &{}:=&{} (\mathcal {J}_1, \circ _1, \bullet _1) \times (\mathcal {J}_2, \circ _2, \bullet _2) \times \cdots \times (\mathcal {J}_q, \circ _q, \bullet _q),\\ \mathcal {K}_\mathcal {J}&{}:=&{}\mathcal {K}_{\mathcal {J}_1} \times \mathcal {K}_{\mathcal {J}_2} \times \cdots \times \mathcal {K}_{\mathcal {J}_q}. \end{array} \end{aligned}$$

Let also \(\varvec{y}, \varvec{b} \in {\mathbb {R}}^{m}\), and \(\varvec{x}, \varvec{c}, \varvec{s} \in \mathcal {J}\) and \(A \in {\mathbb {R}}^{m \times n}\) be such that they are conformally partitioned as

$$\begin{aligned}&\varvec{x} := (\varvec{x}_1; \varvec{x}_2; \ldots ; \varvec{x}_q),\; \varvec{s} := (\varvec{s}_1; \varvec{s}_2; \ldots ; \varvec{s}_q),\; \varvec{c} := (\varvec{c}_1; \varvec{c}_2; \ldots ; \varvec{c}_q),\;\text {and}\\&\quad A := (A_1, A_2, \ldots , A_q), \end{aligned}$$

where \(\varvec{x}_i, \varvec{s}_i, \varvec{c}_i \in \mathcal {J}_i\) and \(A_i \in {\mathbb {R}}^{m \times n_i}\) for \(i=1, 2, \ldots , q\). More specifically, the submatrix \(A_i\) consists of m rows \(\varvec{a}^{(i)}_1, \varvec{a}^{(i)}_2, \ldots , \varvec{a}^{(i)}_m \in \mathcal {J}_i\) and acts as a linear transformation that maps \(\varvec{x}_i\) to the m-dimensional vector whose \(j^\text {th}\) component is \(\varvec{a}^{(i)}_j \bullet _i \varvec{x}_i\), for each \(i=1, 2, \ldots , q\).

The symmetric programming problem and its dual in multi-block structures are defined as [1, Section 3]

$$\begin{aligned} \begin{array}{lllllll} &{} &{}\min &{} \varvec{c}_1 \bullet _1 \varvec{x}_1 + \cdots + \varvec{c}_q \bullet _q \varvec{x}_q &{} &{}\max &{}\varvec{b} ^\mathsf{T }\varvec{y} \\ &{}(\hbox {P}_i) ~~~&{}\text{ s.t. } &{} A_1 \varvec{x}_1 \!+\! \cdots + A_q \varvec{x}_q \!= \varvec{b}, ~~~~~~~~ &{}{(\hbox {D}_i)}~~~ &{}\text{ s.t. } &{}{A_i}^\mathsf{T }\varvec{y} \!+\! \varvec{s}_i \!=\! \varvec{c}_i, \; i=1, \ldots , q, \\ &{} &{}&{} \varvec{x}_i \succeq \varvec{0}, \; i\!=\!1, \ldots , q; &{} &{}&{}\varvec{s}_i \succeq \varvec{0},\; \; i\!=\!1, \ldots , q. \end{array} \end{aligned}$$

We can rewrite the pair \((P_i, D_i)\) compactly as

$$\begin{aligned} \begin{array}{lllllll} &{}&{}\min &{} \varvec{c} \bullet \varvec{x} &{}&{}\max &{}\varvec{b}^\mathsf{T }\varvec{y}\\ &{}\text {(P)}~~~~~&{}\text{ s.t. } &{} A \varvec{x} = \varvec{b}, ~~~~~~~~ &{}\text {(D)}~~~~~&{}\text{ s.t. }&{} A^\mathsf{T }\varvec{y} + \varvec{s} = \varvec{c}, \\ &{}&{}&{} \varvec{x} \succeq \varvec{0}; &{}&{}&{}\varvec{s} \succeq \varvec{0}. \end{array} \end{aligned}$$

The general scheme of the central trajectory methods is as follows. We associate perturbed problems with the symmetric programming problems (P) and (D), and then follow the path of centers defined by the perturbed KKT optimality conditions. After that, Newton’s method is applied to the corresponding perturbed equations in order to obtain a descent search direction. As mentioned earlier, we propose four different selections for calculating the displacement step.

Let \(\mu > 0\) be a barrier parameter and \(\sigma \in (0, 1)\) be the centering parameter. The perturbed primal problem corresponding to the primal problem (P) is

$$\begin{aligned} {(\hbox {P}_\mu )} ~~~~~ \begin{array}{lll} \min &{}&{} f_\mu (\varvec{x}) := \varvec{c}\bullet \varvec{x} + \sigma \mu \, \ell (\varvec{x}) + r \mu \ln \mu \\ \text{ s.t. } &{}&{} A \varvec{x} = \varvec{b},\\ &{}&{} \varvec{x} \succ \varvec{0}, \end{array} \end{aligned}$$

and the perturbed dual problem corresponding to the dual problem (D) is

$$\begin{aligned} ~~~~~{(\hbox {D}_\mu )} ~~~~~ \begin{array}{lll} \max &{}&{} g_\mu (\varvec{y}, \varvec{s}) := \varvec{b}^\mathsf{T }\varvec{y} - \sigma \mu \, \ell (\varvec{s}) - r \mu \ln \mu \\ \text{ s.t. } &{}&{} A^\mathsf{T }\varvec{y} + \varvec{s} = \varvec{c},\\ &{}&{} \varvec{s} \succ \varvec{0}. \end{array} \end{aligned}$$

Now, we define the following feasibility sets:

$$\begin{aligned} \begin{array}{lll} \mathcal {F}_P &{}:=&{} \left\{ \varvec{x} \in \mathcal {J}: A \varvec{x} = \varvec{b},\;\varvec{x} \succeq \varvec{0} \right\} ,\\ \mathcal {F}_D &{}:=&{} \left\{ (\varvec{y}; \varvec{s}) \in {\mathbb {R}}^m \times \mathcal {J}: A^\mathsf{T }\varvec{y} + \varvec{s} = \varvec{c},\;\varvec{s} \succeq \varvec{0} \right\} ,\\ \mathcal {F}_P^\circ &{}:=&{} \left\{ \varvec{x} \in \mathcal {J}: A \varvec{x} = \varvec{b},\;\varvec{x} \succ \varvec{0} \right\} ,\\ \mathcal {F}_D^\circ &{}:=&{} \left\{ (\varvec{y}; \varvec{s}) \in {\mathbb {R}}^m \times \mathcal {J}: A^\mathsf{T }\varvec{y} + \varvec{s} = \varvec{c},\;\varvec{s} \succ \varvec{0} \right\} ,\\ \mathcal {F}^\circ &{}:=&{} \mathcal {F}_P^\circ \times \mathcal {F}_D^\circ . \end{array} \end{aligned}$$

Next, we make two assumptions about the primal-dual pair (P) and (D).

Assumption 3.1

The m rows of the matrix A are linearly independent.

Assumption 3.2

The set \(\mathcal {F}^\circ \) is nonempty.

Assumption 3.1 is for convenience. Assumption 3.2 requires that Problem (P\(_\mu \)) and its dual (\(\hbox {D}_\mu \)) have strictly feasible solutions, which guarantees strong duality for the symmetric programming problem. Note that the feasible region of Problems (P\(_\mu \)) and (\(\hbox {D}_\mu \)) is described implicitly by \(\mathcal {F}^\circ \). Due to the coercivity of the function \(f_\mu \) on the feasible set of (P\(_\mu \)), Problem (P\(_\mu \)) has an optimal solution.

The following lemma proves the convergence of the optimal solution of Problem \((P_\mu )\) to the optimal solution of Problem (P) when \(\mu \) approaches zero.

Lemma 3.1

Let \(\bar{\varvec{x}}_\mu \) be an optimal primal solution of Problem \((P_\mu )\), then \(\bar{\varvec{x}} = \lim _{\mu \rightarrow 0} \bar{\varvec{x}}_\mu \) is an optimal solution of Problem (P).

Proof

Let \(f_\mu (\varvec{x}) := f(\varvec{x}, \mu )\) and \(f(\varvec{x}) := f(\varvec{x}, 0)\). Due to the coercivity of the function \(f_\mu \) on the feasible set of \(P_\mu \), Problem (P\(_\mu \)) has an optimal solution, say \(\bar{\varvec{x}}_\mu \), such that

$$\begin{aligned} \nabla _{\varvec{x}}f_\mu (\bar{\varvec{x}}_\mu ) = \nabla _{\varvec{x}}f(\bar{\varvec{x}}_\mu , \mu ) = \varvec{0}. \end{aligned}$$

Then, for all \(\varvec{x} \in \mathcal {F}_P^\circ \), we have that

$$\begin{aligned} \begin{array}{lll} f(\varvec{x}) &{}\ge &{} f(\bar{\varvec{x}}_\mu , \mu ) + (\varvec{x} - \bar{\varvec{x}}_\mu ) \bullet \nabla _{\varvec{x}}f(\bar{\varvec{x}}_\mu , \mu ) + (0-\mu ) \displaystyle \frac{\partial }{\partial \mu } f(\bar{\varvec{x}}_\mu , \mu )\\ &{}\ge &{} f(\bar{\varvec{x}}_\mu , \mu ) + \mu \ln \det \bar{\varvec{x}}_\mu - r \mu \ln \mu - r \mu \\ &{}\ge &{} \varvec{c} \bullet \bar{\varvec{x}}_\mu - \mu \ln \det \bar{\varvec{x}}_\mu + r \mu \ln \mu + \mu \ln \det \bar{\varvec{x}}_\mu - r \mu \ln \mu - r \mu \\ &{}\ge &{} \varvec{c} \bullet \bar{\varvec{x}}_\mu - r \mu . \end{array} \end{aligned}$$

Since \(\varvec{x}\) was arbitrary in \(\mathcal {F}^\circ _P\), this implies that \(\min _{\varvec{x} \in \mathcal {F}^\circ _P} f(\varvec{x}) \ge \varvec{c} \bullet \bar{\varvec{x}}_\mu - r \mu \ge \varvec{c} \bullet \bar{\varvec{x}}_\mu = f(\bar{\varvec{x}}_\mu )\). On the other side, we have \(f(\bar{\varvec{x}}_\mu ) \ge \min _{\varvec{x} \in \mathcal {F}^\circ _P} f(\varvec{x})\). As \(\mu \) goes to 0, it immediately follows that \(f(\bar{\varvec{x}}) = \min _{\varvec{x} \in \mathcal {F}^\circ _P} f(\varvec{x})\). Thus, \(\bar{\varvec{x}}\) is an optimal solution of Problem (P). The proof is complete. \(\square \)

3.2 Newton’s method and commutative directions

As we mentioned, the objective function of Problem \((P_\mu )\) is strictly convex, hence the KKT conditions are necessary and sufficient to characterize an optimal solution of Problem \((P_\mu )\). Consequently, the points \(\bar{\varvec{x}}_\mu \) and \((\bar{\varvec{y}}_\mu , \bar{\varvec{s}}_\mu )\) are optimal solutions of \((P_\mu )\) and \((D_\mu )\) respectively iff they satisfy the perturbed nonlinear system

$$\begin{aligned} \begin{array}{rllllll} A \varvec{x} &{}=&{} \varvec{b},&{}&{} \varvec{x} &{}\succ &{} \varvec{0},\\ A^\mathsf{T }\varvec{y} + \varvec{s} &{}=&{} \varvec{c},&{}&{} \varvec{s} &{}\succ &{} \varvec{0},\\ \varvec{x} \circ \varvec{s} &{}=&{} \sigma \mu \varvec{e}, &{}&{} \mu &{}>&{} 0, \end{array} \end{aligned}$$
(4)

where \(\varvec{e} := \left( \varvec{e}_{n_1}; \varvec{e}_{n_2}; \ldots ; \varvec{e}_{n_q}\right) \) is the identity vector of \(\mathcal {J}\).

We call the set of all solutions of system (4), denoted by \((\varvec{x}_\mu , \varvec{y}_\mu , \varvec{s}_\mu )\) with \(\mu >0\), the central trajectory or central path. We say that a point \((\varvec{x}, \varvec{y}, \varvec{s})\) is near to the central trajectory if it belongs to the set \(\mathcal {N}_\theta (\mu )\), which is defined as

$$\begin{aligned}&\mathcal {N}_\theta (\mu ) := \left\{ (\varvec{x}; \varvec{y}; \varvec{s}) \in \mathcal {F}^\circ _P \times \mathcal {F}^\circ _D: d_F(\varvec{x}, \varvec{s}) \le \theta \mu \right\} ,\;\;\text {where}\\&\quad d_F(\varvec{x}, \varvec{s}) := \Vert Q_{({\varvec{x}}^{1/2})} \varvec{s} - \mu \varvec{e}\Vert _F \;\;\text {and}\;\; \theta \in (0, 1). \end{aligned}$$
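As an illustration, the proximity measure \(d_F\) can be evaluated directly from the quantities defined above. The following Python sketch is our own (it reuses apply_f and Qmat from the earlier listings and takes \(\mu \) as the normalized duality gap) and computes \(d_F(\varvec{x}, \varvec{s})\) for two interior points of \(\mathcal {E}^4_+\); recall that in \(\mathcal {E}^n\) we have \(r=2\), \(\varvec{x} \bullet \varvec{s} = 2\varvec{x}^\mathsf{T }\varvec{s}\) and \(\Vert \varvec{v}\Vert _F = \sqrt{2}\,\Vert \varvec{v}\Vert _2\).

```python
import numpy as np

def d_F(x, s):
    """d_F(x, s) = || Q_{x^{1/2}} s - mu e ||_F with mu = (x bullet s)/r."""
    x_half = apply_f(x, np.sqrt)
    mu = x @ s                                 # (x bullet s)/r = 2 x^T s / 2 in E^n
    e = np.concatenate(([1.0], np.zeros(x.size - 1)))
    v = Qmat(x_half) @ s - mu * e
    return np.sqrt(2.0) * np.linalg.norm(v)    # ||.||_F = sqrt(2) ||.||_2 in E^n

x = np.array([2.0, 0.5, -1.0, 0.3])            # interior points of E^4_+
s = np.array([1.5, 0.2, 0.4, -0.7])
theta = 0.5
mu = x @ s
print("d_F(x, s) =", d_F(x, s), "  threshold theta*mu =", theta*mu)
```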

Now, we can apply Newton’s method to system (4) and obtain the following linear system

$$\begin{aligned} \begin{array}{rll} A \Delta \varvec{x} &{}=&{} \varvec{0},\\ A^\mathsf{T }\Delta \varvec{y} + \Delta \varvec{s} &{}=&{} \varvec{0},\\ \varvec{x} \circ \Delta \varvec{s} + \Delta \varvec{x} \circ \varvec{s} &{}=&{} \sigma \mu \varvec{e} - \varvec{x} \circ \varvec{s}, \end{array} \end{aligned}$$
(5)

where \((\Delta \varvec{x}; \Delta \varvec{y}; \Delta \varvec{s}) \in \mathcal {J} \times {\mathbb {R}}^m \times \mathcal {J}\) is the search direction and \(\mu =\frac{1}{r} \varvec{x} \bullet \varvec{s}\) is the normalized duality gap corresponding to \((\varvec{x}, \varvec{y}, \varvec{s})\).
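To make the construction concrete, the following Python sketch assembles and solves the Newton system (5) for a single block of \(\mathcal {E}^4\) (so \(q=1\), \(r=2\)). It reuses Lmat and jordan_prod from the earlier listings; the data \(A, \varvec{x}, \varvec{s}\) are arbitrary choices of ours, and \(A\varvec{x}\) is treated as an ordinary matrix-vector product (in the paper A acts through the “\(\bullet \)” product, which in \(\mathcal {E}^n\) only rescales the rows of A).

```python
import numpy as np

n, m, sigma = 4, 2, 0.5
rng = np.random.default_rng(0)
A = rng.standard_normal((m, n))
x = np.array([2.0, 0.5, -1.0, 0.3])        # interior points of E^4_+
s = np.array([1.5, 0.2, 0.4, -0.7])
e = np.concatenate(([1.0], np.zeros(n-1)))
mu = x @ s                                  # normalized duality gap (x bullet s)/r, r = 2

# Block system:  [ A      0     0  ] [dx]   [ 0                  ]
#                [ 0      A^T   I  ] [dy] = [ 0                  ]
#                [ L(s)   0    L(x)] [ds]   [ sigma*mu*e - x o s ]
K = np.zeros((m + 2*n, m + 2*n))
K[:m, :n] = A
K[m:m+n, n:n+m] = A.T
K[m:m+n, n+m:] = np.eye(n)
K[m+n:, :n] = Lmat(s)
K[m+n:, n+m:] = Lmat(x)
rhs = np.concatenate([np.zeros(m), np.zeros(n), sigma*mu*e - jordan_prod(x, s)])

sol = np.linalg.solve(K, rhs)
dx, dy, ds = sol[:n], sol[n:n+m], sol[n+m:]
print("residual norm:", np.linalg.norm(K @ sol - rhs))
```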

Note that the strict conic inequalities \(\varvec{x}, \varvec{s} \succ \varvec{0}\) imply that \(d_F(\varvec{x}, \varvec{s}) \le \Vert \varvec{x} \circ \varvec{s} - \mu \varvec{e}\Vert _F\), with equality when \(\varvec{x}\) and \(\varvec{s}\) operator commute [1, Lemma 30]. In fact, many properties that are useful for the analysis of interior-point methods become apparent when \(\varvec{x}\) and \(\varvec{s}\) operator commute. The commutative class is denoted by \(\mathcal {C}(\varvec{x}, \varvec{s})\) and is defined as

$$\begin{aligned} \mathcal {C}(\varvec{x}, \varvec{s}) := \left\{ \varvec{p} \in (\mathcal {J}, \circ , \bullet ) : \varvec{p} \;\text {is nonsingular},\; Q_{(\varvec{p})} \varvec{x}\; \text {and}\; Q_{(\varvec{p}^{-1})} \varvec{s} \;\text {operator commute}\right\} . \end{aligned}$$
(6)

From [1, Lemma 28], the equality \(\varvec{x} \circ \varvec{s} = \mu \varvec{e}\) holds iff the equality \((Q_{(\varvec{p})}\varvec{x}) \circ (Q_{(\varvec{p}^{-1})}\varvec{s}) = \mu \varvec{e}\) holds, for any nonsingular vector \(\varvec{p}\) in \(\mathcal {J}\). Therefore, for any given nonsingular vector \(\varvec{p} \in (\mathcal {J}, \circ , \bullet )\), the system (4) is equivalent to the system

$$\begin{aligned} \begin{array}{rllllll} A \varvec{x} &{}=&{} \varvec{b},&{}&{} \varvec{x} &{}\succ &{} \varvec{0},\\ A^\mathsf{T }\varvec{y} + \varvec{s} &{}=&{} \varvec{c},&{}&{} \varvec{s} &{}\succ &{} \varvec{0},\\ (Q_{(\varvec{p})}\varvec{x}) \circ (Q_{(\varvec{p}^{-1})}\varvec{s}) &{}=&{} \sigma \mu \varvec{e}, &{}&{} \mu &{}>&{} 0. \end{array} \end{aligned}$$
(7)

Let \(\varvec{v} \in (\mathcal {J}, \circ , \bullet )\). With respect to a nonsingular vector \(\varvec{p} \in (\mathcal {J}, \circ , \bullet )\), we define the scaling vectors \(\overline{\varvec{v}}\) and \(\underline{\varvec{v}}\) and the scaling matrix \(\underline{A}\) as

$$\begin{aligned} \overline{\varvec{v}} := Q_{(\varvec{p})} \varvec{v},\; \underline{\varvec{v}} := Q_{(\varvec{p}^{-1})} \varvec{v}, \;\;\text {and}\;\; \underline{A} := A\, Q_{(\varvec{p}^{-1})}. \end{aligned}$$

Using this change of variables and the fact that \(Q_{(\varvec{p})}\left( \mathcal {J}\right) = \mathcal {J}\), we conclude that the system (5) is equivalent to the following Newton system

$$\begin{aligned} \begin{array}{rll} \underline{A} \overline{\Delta \varvec{x}} &{}=&{} \varvec{b} - \underline{A}\overline{ \varvec{x}},\\ \underline{A}^\mathsf{T }\Delta \varvec{y} + \underline{\Delta \varvec{s}} &{}=&{} \underline{\varvec{c}} - \underline{\varvec{s}} - \underline{A}^\mathsf{T }\varvec{y},\\ \overline{\varvec{x}} \circ \underline{\Delta \varvec{s}} + \overline{\Delta \varvec{x}} \circ \underline{\varvec{s}} &{}=&{} \sigma \mu \varvec{e} - \overline{\varvec{x}} \circ \underline{\varvec{s}}. \end{array} \end{aligned}$$
(8)

Here, the normalized duality gap is \(\mu = \frac{1}{r} \overline{\varvec{x}} \bullet \underline{\varvec{s}} =\frac{1}{r} \varvec{x} \bullet \varvec{s}\). In fact,

$$\begin{aligned} \overline{\varvec{x}} \bullet \underline{\varvec{s}} = \left( Q_{(\varvec{p})}\varvec{x}\right) \bullet Q_{(\varvec{p}^{-1})}\varvec{s} = \varvec{x} \bullet Q_{(\varvec{p})}Q_{(\varvec{p}^{-1})}\varvec{s} = \varvec{x} \bullet \varvec{s}. \end{aligned}$$
(9)

Solving the scaled Newton system (8) yields the search direction \(\left( \overline{\Delta \varvec{x}}; \Delta \varvec{y}; \underline{\Delta \varvec{s}}\right) \). Then, we apply the inverse scaling to \(\left( \overline{\Delta \varvec{x}}; \underline{\Delta \varvec{s}}\right) \) to obtain the Newton direction \((\Delta \varvec{x}; \Delta \varvec{s})\). Note that the search direction \(\left( \overline{\Delta \varvec{x}}; \Delta \varvec{y}; \underline{\Delta \varvec{s}}\right) \) belongs to the so-called MZ family of directions (due to Monteiro [21] and Zhang [13]). In fact, this way of scaling was originally proposed for semidefinite programming by Monteiro [21] and Zhang [13], and was later generalized to general symmetric cone programming by Schmieta and Alizadeh [1].

Clearly, the directions obtained from vectors in the set \(\mathcal {C}(\varvec{x}, \varvec{s})\) defined in (6) form a subclass of the MZ family of search directions. Our focus in this paper is on vectors \(\varvec{p} \in \mathcal {C}(\varvec{x}, \varvec{s})\). We discuss the following three choices of \(\varvec{p}\) (see [1, Section 3]): We may choose \(\varvec{p} = \varvec{x}^{1/2}\) to obtain \(\overline{\varvec{x}}= \varvec{e}\), and we may choose \(\varvec{p} = \varvec{s}^{-1/2}\) to obtain \(\underline{\varvec{s}} = \varvec{e}\). These two choices of directions are called the HRVW/KSH/M directions (due to Helmberg et al. [22], Monteiro [21] and Kojima et al. [32]). The third choice of \(\varvec{p}\) is given by

$$\begin{aligned} \varvec{p} = \left( Q_{({\varvec{x}}^{1/2})} \left( Q_{({\varvec{x}}^{1/2})} {\varvec{s}}\right) ^{-1/2}\right) ^{-1/2}, \end{aligned}$$

which yields \(Q_{(\varvec{p})}^2 \varvec{x} = \varvec{s}\), and therefore \({\underline{\varvec{s}}} = Q_{(\varvec{p}^{-1})} \varvec{s} = Q_{(\varvec{p})} \varvec{x} = {\overline{\varvec{x}}}\). This choice of directions is called the NT direction (due to Nesterov and Todd [7]).
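The following Python sketch is our own illustration (reusing apply_f and Qmat from the earlier listings); it computes the NT scaling point for two interior points of \(\mathcal {E}^4_+\) and checks that \(Q_{(\varvec{p})}^2 \varvec{x} = \varvec{s}\), so that the scaled points \(\overline{\varvec{x}}\) and \(\underline{\varvec{s}}\) coincide.

```python
import numpy as np

x = np.array([2.0, 0.5, -1.0, 0.3])          # interior points of E^4_+
s = np.array([1.5, 0.2, 0.4, -0.7])

x_half = apply_f(x, np.sqrt)
# w = Q_{x^{1/2}} (Q_{x^{1/2}} s)^{-1/2}  and  p = w^{-1/2}
w = Qmat(x_half) @ apply_f(Qmat(x_half) @ s, lambda t: t**-0.5)
p = apply_f(w, lambda t: t**-0.5)
p_inv = apply_f(p, lambda t: 1.0/t)

assert np.allclose(Qmat(p) @ (Qmat(p) @ x), s)       # Q_p^2 x = s
assert np.allclose(Qmat(p) @ x, Qmat(p_inv) @ s)     # x-bar = s-underbar (NT scaling)
```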

4 Selections of the displacement step

We start by motivating the need for introducing the so-called displacement step. Note that the positive definiteness of the vectors \(\varvec{x}^+ = \varvec{x} + \Delta \varvec{x}\) and \(\varvec{s}^+ = \varvec{s} + \Delta \varvec{s}\) is not always achieved. In order to circumvent this difficulty, we introduce a parameter \(\alpha >0\), which we call the displacement step-size or simply the displacement step, and we redefine \(\varvec{x}^+, \varvec{y}^+\) and \(\varvec{s}^+\) as

$$\begin{aligned} \varvec{x}^+ := \varvec{x} + \alpha \Delta \varvec{x},\; \varvec{y}^+ := \varvec{y} + \alpha \Delta \varvec{y},\;\text {and}\; \varvec{s}^+ := \varvec{s} + \alpha \Delta \varvec{s}. \end{aligned}$$

Calculating the displacement step using classical line search methods is undesirable and, in general, even impossible [20, 23]. Strategies for determining the displacement step while preserving positive definiteness exist in the literature for the semidefinite programming case only [23]. To the best of our knowledge, no such strategies are available for general symmetric programming; from this stems the importance of the results of this paper. Calculating the search direction of a Newton-type method for problems over symmetric cones has been well established in the literature, but there are gaps in terms of maintaining feasibility. In order to fill such gaps, this section proposes four different selections for computing an appropriate displacement step for symmetric optimization. Lemmas 4.1, 4.2, 4.3 and 4.4 below generalize Lemmas 3.3, 3.4, 3.5 and 3.6, respectively, in [23, Section 3]. Each one of the following four lemmas gives a selection to calculate the displacement step \(\alpha \).

Lemma 4.1

(The first selection lemma). Let \((\varvec{x}; \varvec{y}; \varvec{s}) \in \mathcal {F}^\circ _P \times \mathcal {F}^\circ _D\). If \(\alpha = \rho \min \{\alpha _{\varvec{x}}, \alpha _{\varvec{s}}\}\) with \(0< \rho < 1\), then \(\varvec{x}^+, \varvec{s}^+ \succ \varvec{0}\), where for \(\varvec{v} \in \{\varvec{x}, \varvec{s} \}\) we have

$$\begin{aligned} {{\alpha _{\varvec{v}} = \left\{ \begin{array}{lll} \displaystyle \frac{-1}{\bar{\lambda }_{\varvec{v}}- \delta _{\varvec{v}} \sqrt{r-1}}-\varepsilon , &{}\text {if}&{} \left( \displaystyle \frac{-1}{\bar{\lambda }_{\varvec{v}}- \delta _{\varvec{v}} \sqrt{r-1}}> 0\;\;\text {and}\;\;\min _{1\le i\le r} \lambda _i\left( \varvec{v}^{-1/2} \circ \left( \Delta \varvec{v} \circ \varvec{v}^{-1/2}\right) \right)<0\right) ;\\ \varepsilon , &{}\text {if}&{} \left( \displaystyle \frac{-1}{\bar{\lambda }_{\varvec{v}}- \delta _{\varvec{v}} \sqrt{r-1}}< 0\;\;\text {and}\;\;\min _{1\le i\le r} \lambda _i\left( \varvec{v}^{-1/2} \circ \left( \Delta \varvec{v} \circ \varvec{v}^{-1/2}\right) \right) <0\right) ;\\ 1, &{}\text {if}&{}\displaystyle \min _{1\le i\le r} \lambda _i\left( \varvec{v}^{-1/2} \circ \left( \Delta \varvec{v} \circ \varvec{v}^{-1/2}\right) \right) > 0, \end{array} \right. }} \end{aligned}$$
(10)

where

$$\begin{aligned} \bar{\lambda }_{\varvec{v}}= & {} \displaystyle \frac{1}{r} \displaystyle \sum _{i=1}^{r} \lambda _i\left( \varvec{v}^{-1/2} \circ \left( \Delta \varvec{v} \circ \varvec{v}^{-1/2}\right) \right) , \\ \delta _{\varvec{v}}^{2}= & {} \displaystyle \frac{1}{r} \displaystyle \sum _{i=1}^{r} \lambda _i^2\left( \varvec{v}^{-1/2} \circ \left( \Delta \varvec{v} \circ \varvec{v}^{-1/2}\right) \right) - \bar{\lambda }_{\varvec{v}}^2, \end{aligned}$$

and \(\varepsilon \) is a small positive real. Here, \(\lambda _i\left( \varvec{v}^{-1/2} \circ \left( \Delta \varvec{v} \circ \varvec{v}^{-1/2}\right) \right) , i=1, 2, \ldots , r\), are the eigenvalues of the vector \(\varvec{v}^{-1/2} \circ \left( \Delta \varvec{v} \circ \varvec{v}^{-1/2}\right) \).

Proof

Since \(\varvec{x} \succ \varvec{0}\), the vectors \(\varvec{x}^{\pm 1/2}\) are well-defined and we have \(\varvec{x}^{\pm 1/2} \succ \varvec{0}\). Note that

$$\begin{aligned} \varvec{x}^+= & {} \varvec{x} + \alpha \; \Delta \varvec{x} =\varvec{x}^{1/2} \circ \varvec{x}^{1/2} + \alpha \; \Delta \varvec{x}\\= & {} \varvec{x}^{1/2} \circ \left( \left( \varvec{e} + \alpha \; \varvec{x}^{-1/2} \circ \left( \Delta \varvec{x} \circ \varvec{x}^{-1/2}\right) \right) \circ \varvec{x}^{1/2}\right) . \end{aligned}$$

Therefore, as \(\varvec{x}^{-1/2} \succ \varvec{0}\), we have that

$$\begin{aligned} \begin{array}{lll} \varvec{x}^+ \succ \varvec{0} &{}\iff &{} \varvec{x}^{-1/2} \circ \varvec{x}^+ \succ \varvec{0} \\ &{}\iff &{} \left( \varvec{x}^{-1/2} \circ \varvec{x}^+ \right) \circ \varvec{x}^{-1/2} \succ \varvec{0} \\ &{}\iff &{}\varvec{e} + \alpha \; \varvec{x}^{-1/2} \circ \left( \Delta \varvec{x} \circ \varvec{x}^{-1/2}\right) \succ \varvec{0} \\ &{} \iff &{} 1+ \alpha \lambda _i\left( \varvec{x}^{-1/2} \circ \left( \Delta \varvec{x} \circ \varvec{x}^{-1/2}\right) \right) > 0,\;\text {for}\; i=1, 2, \ldots , r. \end{array} \end{aligned}$$

Hence,

$$\begin{aligned}&1+ \alpha \min _{1\le i\le r} \lambda _i\left( \varvec{x}^{-1/2} \circ \left( \Delta \varvec{x} \circ \varvec{x}^{-1/2}\right) \right)> 0 \\&\quad \Longrightarrow \;\; \alpha \min _{1\le i\le r} \lambda _i\left( \varvec{x}^{-1/2} \circ \left( \Delta \varvec{x} \circ \varvec{x}^{-1/2}\right) \right) > -1. \end{aligned}$$

If \(\min _{1\le i\le r} \lambda _i\left( \varvec{x}^{-1/2} \circ \left( \Delta \varvec{x} \circ \varvec{x}^{-1/2}\right) \right) < 0\), then \(\alpha < -1 / \min _{1\le i\le r} \lambda _i\left( \varvec{x}^{-1/2} \circ \right. \left. \left( \Delta \varvec{x} \circ \varvec{x}^{-1/2}\right) \right) \). Using Proposition 2.1, we conclude that

$$\begin{aligned} \alpha< & {} \displaystyle \frac{-1}{\displaystyle \min _{1\le i\le r} \lambda _i\left( \varvec{x}^{-1/2} \circ \left( \Delta \varvec{x} \circ \varvec{x}^{-1/2}\right) \right) } \;\;\;\text {and}\;\;\; \displaystyle \frac{-1}{\bar{\lambda }_{\varvec{x}}- \delta _{\varvec{x}} \sqrt{r-1}} \\\le & {} \displaystyle \frac{-1}{\displaystyle \min _{1\le i\le r} \lambda _i\left( \varvec{x}^{-1/2} \circ \left( \Delta \varvec{x} \circ \varvec{x}^{-1/2}\right) \right) }. \end{aligned}$$

This gives the expression of \(\alpha _{\varvec{x}}\). Applying the same procedure, we obtain \(\alpha _{\varvec{s}}\). Finally, we choose \(\alpha = \rho \min \{\alpha _{\varvec{x}}, \alpha _{\varvec{s}}\}\) with \(0< \rho < 1\). The proof is complete. \(\square \)
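A minimal Python sketch of the first selection is given below; it is our own illustration, reusing spectral, apply_f and jordan_prod from the earlier listings together with the direction \((\Delta \varvec{x}; \Delta \varvec{y}; \Delta \varvec{s})\) computed in the Newton-system sketch. The helper alpha_from_eigs implements the case analysis (10) from the mean and standard deviation of the supplied eigenvalues.

```python
import numpy as np

def alpha_from_eigs(lam, eps=1.0e-3):
    """Safe step bound from the eigenvalues lam, following the case analysis (10)."""
    r = lam.size
    if lam.min() > 0:
        return 1.0
    mean = lam.mean()
    delta = np.sqrt(np.mean(lam**2) - mean**2)
    bound = -1.0 / (mean - delta*np.sqrt(r - 1))   # safe estimate of -1/min(lam) via Prop. 2.1
    return bound - eps if bound > 0 else eps

def first_selection(x, dx, s, ds, rho=0.95):
    # alpha = rho * min(alpha_x, alpha_s), as in Lemma 4.1.
    def alpha_v(v, dv):
        v_mhalf = apply_f(v, lambda t: t**-0.5)                 # v^{-1/2}
        z = jordan_prod(v_mhalf, jordan_prod(dv, v_mhalf))      # v^{-1/2} o (dv o v^{-1/2})
        lam, _ = spectral(z)
        return alpha_from_eigs(lam)
    return rho * min(alpha_v(x, dx), alpha_v(s, ds))

alpha = first_selection(x, dx, s, ds)      # (x, s, dx, ds) from the Newton-system sketch
x_new, s_new = x + alpha*dx, s + alpha*ds
print("alpha =", alpha,
      " min eig of x+ =", spectral(x_new)[0].min(),
      " min eig of s+ =", spectral(s_new)[0].min())
```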

Lemma 4.2

(The second selection lemma). Let \((\varvec{x}; \varvec{y}; \varvec{s}) \in \mathcal {F}^\circ _P \times \mathcal {F}^\circ _D\). If \(\alpha = \rho \min \{\alpha _{\varvec{x}}, \alpha _{\varvec{s}}\}\) with \(0< \rho < 1\), then \(\varvec{x}^+, \varvec{s}^+ \succ \varvec{0}\), where for \(\varvec{v} \in \{\varvec{x}, \varvec{s} \}\) we have

$$\begin{aligned} \alpha _{\varvec{v}} = \left\{ \begin{array}{lll} \displaystyle \frac{-1}{\bar{\lambda }_{\varvec{v}}- \delta _{\varvec{v}} \sqrt{r-1}}-\varepsilon , &{}\text {if}&{} \left( \displaystyle \frac{-1}{\bar{\lambda }_{\varvec{v}}- \delta _{\varvec{v}} \sqrt{r-1}}> 0\;\;\text {and}\;\;\min _{1\le i\le r} \lambda _i\left( \varvec{v}^{-1} \circ \Delta \varvec{v} \right)<0\right) ;\\ \varepsilon , &{}\text {if}&{} \left( \displaystyle \frac{-1}{\bar{\lambda }_{\varvec{v}}- \delta _{\varvec{v}} \sqrt{r-1}}< 0\;\;\text {and}\;\;\min _{1\le i\le r} \lambda _i\left( \varvec{v}^{-1} \circ \Delta \varvec{v} \right) <0\right) ;\\ 1, &{}\text {if}&{}\displaystyle \min _{1\le i\le r} \lambda _i \left( \varvec{v}^{-1} \circ \Delta \varvec{v} \right) > 0, \end{array} \right. \end{aligned}$$
(11)

where

$$\begin{aligned} \bar{\lambda }_{\varvec{v}} = \displaystyle \frac{1}{r} \displaystyle \sum _{i=1}^{r} \lambda _i\left( \varvec{v}^{-1} \circ \Delta \varvec{v} \right) , \;\; \delta _{\varvec{v}}^{2} = \displaystyle \frac{1}{r} \displaystyle \sum _{i=1}^{r} \lambda _i^2\left( \varvec{v}^{-1} \circ \Delta \varvec{v} \right) - \bar{\lambda }_{\varvec{v}}^2, \end{aligned}$$

and \(\varepsilon \) is a small positive real. Here, \(\lambda _i\left( \varvec{v}^{-1} \circ \Delta \varvec{v} \right) , i=1, 2, \ldots , r\), are the eigenvalues of the vector \(\varvec{v}^{-1} \circ \Delta \varvec{v} \).

Proof

Since \(\varvec{x} \succ \varvec{0}\), the vector \(\varvec{x}^{-1}\) is invertible and positive definite (i.e., \(\varvec{x}^{-1} \succ \varvec{0}\)). Note that

$$\begin{aligned} \varvec{x}^+ = \varvec{x} + \alpha \; \Delta \varvec{x} = \varvec{x} \circ \left( \varvec{e} + \alpha \; \varvec{x}^{-1} \circ \Delta \varvec{x} \right) . \end{aligned}$$

Therefore, as \(\varvec{x}^{-1} \succ \varvec{0}\), we have that

$$\begin{aligned} \begin{array}{lll} \varvec{x}^+ \succ \varvec{0} &{}\iff &{} \varvec{x}^{-1} \circ \varvec{x}^+ \succ \varvec{0} \\ &{}\iff &{}\varvec{e} + \alpha \; \varvec{x}^{-1} \circ \Delta \varvec{x} \succ \varvec{0} \\ &{} \iff &{} 1+ \alpha \lambda _i \left( \varvec{x}^{-1} \circ \Delta \varvec{x} \right) > 0,\;\text {for}\; i=1, 2, \ldots , r. \end{array} \end{aligned}$$

Hence,

$$\begin{aligned} 1+ \alpha \min _{1\le i\le r} \lambda _i\left( \varvec{x}^{-1} \circ \Delta \varvec{x} \right)> 0 \;\; \Longrightarrow \;\; \alpha \min _{1\le i\le r} \lambda _i\left( \varvec{x}^{-1} \circ \Delta \varvec{x} \right) > -1. \end{aligned}$$

If \(\min _{1\le i\le r} \lambda _i\left( \varvec{x}^{-1} \circ \Delta \varvec{x} \right) < 0\), then \(\alpha < -1 / \min _{1\le i\le r} \lambda _i\left( \varvec{x}^{-1} \circ \Delta \varvec{x} \right) \). Using Proposition 2.1, we conclude that

$$\begin{aligned} \alpha < \displaystyle \frac{-1}{\displaystyle \min _{1\le i\le r} \lambda _i\left( \varvec{x}^{-1} \circ \Delta \varvec{x} \right) } \;\;\;\text {and}\;\;\; \displaystyle \frac{-1}{\bar{\lambda }_{\varvec{x}}- \delta _{\varvec{x}} \sqrt{r-1}} \le \displaystyle \frac{-1}{\displaystyle \min _{1\le i\le r} \lambda _i\left( \varvec{x}^{-1} \circ \Delta \varvec{x} \right) }. \end{aligned}$$

This gives the expression of \(\alpha _{\varvec{x}}\). Applying the same procedure, we obtain \(\alpha _{\varvec{s}}\). Finally, we choose \(\alpha = \rho \min \{\alpha _{\varvec{x}}, \alpha _{\varvec{s}}\}\) with \(0< \rho < 1\). The proof is complete. \(\square \)
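The second selection differs from the first only in that the eigenvalues of \(\varvec{v}^{-1} \circ \Delta \varvec{v}\) are used in place of those of \(\varvec{v}^{-1/2} \circ (\Delta \varvec{v} \circ \varvec{v}^{-1/2})\); a short sketch of our own, reusing alpha_from_eigs from the previous listing:

```python
def second_selection(x, dx, s, ds, rho=0.95):
    # Step bound of Lemma 4.2: the eigenvalues of v^{-1} o (Delta v) feed the same
    # case analysis (11) as in the first selection.
    def alpha_v(v, dv):
        lam, _ = spectral(jordan_prod(apply_f(v, lambda t: 1.0/t), dv))
        return alpha_from_eigs(lam)
    return rho * min(alpha_v(x, dx), alpha_v(s, ds))
```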

Lemma 4.3

(The third selection lemma). Let \((\varvec{x}; \varvec{y}; \varvec{s}) \in \mathcal {F}^\circ _P \times \mathcal {F}^\circ _D\). If \(\alpha = \rho \min \{\alpha _{\varvec{x}}, \alpha _{\varvec{s}}\}\) with \(0< \rho < 1\), then \(\varvec{x}^+, \varvec{s}^+ \succ \varvec{0}\), where for \(\varvec{v} \in \{\varvec{x}, \varvec{s} \}\) we have

$$\begin{aligned} \alpha _{\varvec{v}} = \left\{ \begin{array}{lll} -\displaystyle \frac{\bar{\lambda }_{\varvec{v}}- \delta _{\varvec{v}} \sqrt{r-1}}{\bar{\lambda }_{\Delta \varvec{v}}- \delta _{\Delta \varvec{v}} \sqrt{r-1}}-\varepsilon , &{}\text {if}&{} \left( -\displaystyle \frac{\bar{\lambda }_{\varvec{v}}- \delta _{\varvec{v}} \sqrt{r-1}}{\bar{\lambda }_{\Delta \varvec{v}}- \delta _{\Delta \varvec{v}} \sqrt{r-1}}> 0\;\;\text {and}\;\;\min _{1\le i\le r} \lambda _i(\Delta \varvec{v})<0\right) ;\\ \varepsilon , &{}\text {if}&{} \left( -\displaystyle \frac{\bar{\lambda }_{\varvec{v}}- \delta _{\varvec{v}} \sqrt{r-1}}{\bar{\lambda }_{\Delta \varvec{v}}- \delta _{\Delta \varvec{v}} \sqrt{r-1}}< 0\;\;\text {and}\;\;\min _{1\le i\le r} \lambda _i(\Delta \varvec{v}) <0\right) ;\\ 1, &{}\text {if}&{}\displaystyle \min _{1\le i\le r} \lambda _i(\Delta \varvec{v})> 0, \end{array} \right. \end{aligned}$$
(12)

where

$$\begin{aligned} \bar{\lambda }_{\varvec{v}}= & {} \displaystyle \frac{1}{r} \displaystyle \sum _{i=1}^{r} \lambda _i(\varvec{v}),\;\;\; \bar{\lambda }_{\Delta \varvec{v}} = \displaystyle \frac{1}{r} \displaystyle \sum _{i=1}^{r} \lambda _i(\Delta \varvec{v}),\;\;\; \delta _{\varvec{v}}^{2} = \displaystyle \frac{1}{r} \displaystyle \sum _{i=1}^{r} \lambda _i^2(\varvec{v}) - \bar{\lambda }_{\varvec{v}}^2,\\ \delta _{\Delta \varvec{v}}^{2}= & {} \displaystyle \frac{1}{r} \displaystyle \sum _{i=1}^{r} \lambda _i^2(\Delta \varvec{v}) - \bar{\lambda }_{\Delta \varvec{v}}^2, \end{aligned}$$

and \(\varepsilon \) is a small positive real. Here, \(\lambda _i(\varvec{v}), i=1, 2, \ldots , r\), are the eigenvalues of the vector \(\varvec{v}\), and \(\lambda _i(\Delta \varvec{v}), i=1, 2, \ldots , r\), are the eigenvalues of the vector \(\Delta \varvec{v}\).

Proof

It is known that \(\varvec{x}^+ = \varvec{x} + \alpha \Delta \varvec{x} \succ \varvec{0}\) iff \(\min _{1 \le i \le r} \lambda _i(\varvec{x}^+) >0\). Here, \(\lambda _i(\varvec{x}^+), i=1, 2, \ldots , r\), are the eigenvalues of the vector \(\varvec{x}^+\). It is also known (see for example [33]) that

$$\begin{aligned} \displaystyle \min _{1 \le i \le r} \lambda _i(\varvec{x} + \alpha \Delta \varvec{x}) \ge \min _{1 \le i \le r} \lambda _i(\varvec{x}) + \alpha \min _{1 \le i \le r} \lambda _i(\Delta \varvec{x}), \;\;\text {for}\; \alpha > 0. \end{aligned}$$

Then, it is enough to find \(\alpha \) such that

$$\begin{aligned} \displaystyle \min _{1 \le i \le r} \lambda _i(\varvec{x}^+) \ge \min _{1 \le i \le r} \lambda _i(\varvec{x}) +\alpha \; \min _{1 \le i \le r} \lambda _i(\Delta \varvec{x}) >0. \end{aligned}$$

Hence

$$\begin{aligned} \min _{1 \le i \le r} \lambda _i(\Delta \varvec{x})< 0 \;\;\Longrightarrow \;\; \alpha < - \displaystyle \frac{\min _{1 \le i \le r} \lambda _i(\varvec{x})}{\min _{1 \le i \le r} \lambda _i(\Delta \varvec{x})}. \end{aligned}$$

Using Proposition 2.1, we find

$$\begin{aligned} \alpha < - \displaystyle \frac{\min _{1 \le i \le r} \lambda _i(\varvec{x})}{\min _{1 \le i \le r} \lambda _i(\Delta \varvec{x})} \;\;\;\text {and}\;\;\; -\displaystyle \frac{\bar{\lambda }_{\varvec{x}}- \delta _{\varvec{x}} \sqrt{r-1}}{\bar{\lambda }_{\Delta \varvec{x}}- \delta _{\Delta \varvec{x}} \sqrt{r-1}} \le - \displaystyle \frac{\min _{1 \le i \le r} \lambda _i(\varvec{x})}{\min _{1 \le i \le r} \lambda _i(\Delta \varvec{x})}. \end{aligned}$$

This gives the expression of \(\alpha _{\varvec{x}}\). Applying the same procedure, we obtain \(\alpha _{\varvec{s}}\). Finally, we choose \(\alpha = \rho \min \{\alpha _{\varvec{x}}, \alpha _{\varvec{s}}\}\) with \(0< \rho < 1\). The proof is complete. \(\square \)
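To make this selection concrete, the following sketch (ours, written in Python/NumPy rather than the MATLAB setting of Section 7) evaluates (12) from the eigenvalues of \(\varvec{v}\) and \(\Delta \varvec{v}\); the function names and the default value of \(\varepsilon \) are illustrative choices, not part of the lemma.

```python
import numpy as np

def third_selection_alpha(eigs_v, eigs_dv, eps=1e-4):
    """Step bound alpha_v of (12), computed from the eigenvalues of v and Delta v."""
    eigs_v = np.asarray(eigs_v, dtype=float)
    eigs_dv = np.asarray(eigs_dv, dtype=float)
    r = eigs_v.size
    lam_bar_v, lam_bar_dv = eigs_v.mean(), eigs_dv.mean()
    delta_v = np.sqrt(max(np.mean(eigs_v ** 2) - lam_bar_v ** 2, 0.0))    # delta_v
    delta_dv = np.sqrt(max(np.mean(eigs_dv ** 2) - lam_bar_dv ** 2, 0.0)) # delta_{Delta v}
    if eigs_dv.min() > 0:            # third branch of (12): positivity holds for alpha = 1
        return 1.0
    ratio = -(lam_bar_v - delta_v * np.sqrt(r - 1)) / (lam_bar_dv - delta_dv * np.sqrt(r - 1))
    return ratio - eps if ratio > 0 else eps      # first / second branch of (12)

def third_selection_step(eigs_x, eigs_dx, eigs_s, eigs_ds, rho=0.99):
    """alpha = rho * min(alpha_x, alpha_s), as in Lemma 4.3."""
    return rho * min(third_selection_alpha(eigs_x, eigs_dx),
                     third_selection_alpha(eigs_s, eigs_ds))
```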

Lemma 4.4

(The fourth selection lemma). Let \((\varvec{x}; \varvec{y}; \varvec{s}) \in \mathcal {F}^\circ _P \times \mathcal {F}^\circ _D\). If \(\alpha = \rho \min \{\alpha _{\varvec{x}}, \alpha _{\varvec{s}}\}\) with \(0< \rho < 1\), then \(\varvec{x}^+, \varvec{s}^+ \succ \varvec{0}\), where for \(\varvec{v} \in \{\varvec{x}, \varvec{s} \}\) we have

$$\begin{aligned} \alpha _{\varvec{v}} = \left\{ \begin{array}{lll} \displaystyle \min _{i \in I_{\varvec{v}}}~~ \displaystyle \frac{\left( \displaystyle \sum _{j \ne k=1}^{n_i} \left| \{\text {L}(\varvec{v}_i)\}_{jk} \right| \right) - \{\text {L}(\varvec{v}_i)\}_{jj}}{\{\text {L}(\Delta \varvec{v}_i)\}_{jj} - \displaystyle \sum _{j \ne k=1}^{n_i} \left| \{\text {L}(\Delta \varvec{v}_i)\}_{jk} \right| }, &{}\text {if}&{} I_{\varvec{v}} \ne \emptyset ;\\ + \infty , &{}\text {if}&{} I_{\varvec{v}} = \emptyset , \end{array} \right. \end{aligned}$$
(13)

where

$$\begin{aligned} I_{\varvec{v}} := \left\{ i \in \{1, 2, \ldots , q \}: \{\text {L}(\Delta \varvec{v}_i)\}_{jj} - \displaystyle \sum _{j \ne k=1}^{n_i} \left| \{\text {L}(\Delta \varvec{v}_i)\}_{jk} \right| < 0 \right\} . \end{aligned}$$

Proof

Note that a vector \(\varvec{w} \in \mathcal {J}\) is positive definite (i.e., \(\varvec{w} \succ \varvec{0}\)) iff \(\text {L}(\varvec{w}_i)\) is positive definite (i.e., \(\varvec{w}_i \succ \varvec{0}\)) for \(i=1, 2, \ldots , q\) [1, Lemma 12]. Then, \(\varvec{x}^+= \varvec{x} + \alpha \Delta \varvec{x} \succ \varvec{0}\) if

$$\begin{aligned} \{ \text {L}(\varvec{x}^+_i)\}_{jj} > \displaystyle \sum _{j \ne k=1}^{n_i} \left| \{\text {L}(\varvec{x}^+_i) \}_{jk} \right| ,\;\;\forall \;\; i=1, 2, \ldots , q. \end{aligned}$$
(14)

Observe that \(\text {L}(\varvec{x}^+_i) = \text {L}(\varvec{x}_i + \alpha \Delta \varvec{x}_i) = \text {L}(\varvec{x}_i) + \alpha \text {L}(\Delta \varvec{x}_i)\) for \(i=1, 2, \ldots , q\). So, the inequality (14) is equivalent to

$$\begin{aligned} \{\text {L}(\varvec{x}_i)\}_{jj} + \alpha \{\text {L}(\Delta \varvec{x}_i)\}_{jj} > \displaystyle \sum _{j \ne k=1}^{n_i} \left| \{\text {L}(\varvec{x}_i)\}_{jk} + \alpha \{\text {L}(\Delta \varvec{x}_i)\}_{jk} \right| ,\;\;\forall \;\; i=1, 2, \ldots , q. \end{aligned}$$

Because, for \(i=1, 2, \ldots , q\), we have

$$\begin{aligned} \displaystyle \sum _{j \ne k=1}^{n_i} \left( \left| \{\text {L}(\varvec{x}_i)\}_{jk} \right| + \alpha \left| \{\text {L}(\Delta \varvec{x}_i)\}_{jk} \right| \right) \ge \displaystyle \sum _{j \ne k=1}^{n_i} \left| \{\text {L}(\varvec{x}_i)\}_{jk} + \alpha \{\text {L}(\Delta \varvec{x}_i)\}_{jk} \right| , \end{aligned}$$

it is enough to find \(\alpha \) such that

$$\begin{aligned} \{\text {L}(\varvec{x}_i)\}_{jj} + \alpha \{\text {L}(\Delta \varvec{x}_i)\}_{jj} > \displaystyle \sum _{j \ne k=1}^{n_i} \left( \left| \{\text {L}(\varvec{x}_i)\}_{jk} \right| + \alpha \left| \{\text {L}(\Delta \varvec{x}_i)\}_{jk} \right| \right) ,\;\;\forall \;\; i=1, 2, \ldots , q, \end{aligned}$$

or equivalently

$$\begin{aligned}&\alpha \left( \{\text {L}(\Delta \varvec{x}_i)\}_{jj} - \displaystyle \sum _{j \ne k=1}^{n_i} \left| \{\text {L}(\Delta \varvec{x}_i)\}_{jk} \right| \right) > \left( \displaystyle \sum _{j \ne k=1}^{n_i} \left| \{\text {L}(\varvec{x}_i)\}_{jk} \right| \right) \\&\quad -\{\text {L}(\varvec{x}_i)\}_{jj}, \;\;\forall \;\; i=1, 2, \ldots , q. \end{aligned}$$

This gives the expression of \(\alpha _{\varvec{x}}\).

Applying the same procedure for the vector \(\varvec{s}^+ = \varvec{s} + \alpha \Delta \varvec{s}\), we obtain \(\alpha _{\varvec{s}}\). Finally, we choose \(\alpha = \rho \min \{\alpha _{\varvec{x}}, \alpha _{\varvec{s}}\}\) with \(0< \rho < 1\). The proof is complete. \(\square \)
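Analogously, the fourth selection (13) can be sketched as follows (again an illustrative Python rendition, not the authors' code). We read the minimum in (13) as being taken over all rows \(j\) of the blocks indexed by \(I_{\varvec{v}}\), and the inputs are the matrices \(\text {L}(\varvec{v}_i)\) and \(\text {L}(\Delta \varvec{v}_i)\).

```python
import numpy as np

def fourth_selection_alpha(L_v_blocks, L_dv_blocks):
    """Step bound alpha_v of (13) from the matrices L(v_i) and L(Delta v_i)."""
    candidates = []
    for Lv, Ldv in zip(L_v_blocks, L_dv_blocks):
        for j in range(Lv.shape[0]):
            off_dv = np.sum(np.abs(Ldv[j, :])) - abs(Ldv[j, j])   # sum_{k != j} |{L(Dv_i)}_{jk}|
            denom = Ldv[j, j] - off_dv
            if denom < 0:                                          # this block/row contributes to I_v
                off_v = np.sum(np.abs(Lv[j, :])) - abs(Lv[j, j])   # sum_{k != j} |{L(v_i)}_{jk}|
                candidates.append((off_v - Lv[j, j]) / denom)
    return min(candidates) if candidates else np.inf               # I_v empty -> +infinity

def fourth_selection_step(Lx, Ldx, Ls, Lds, rho=0.99):
    """alpha = rho * min(alpha_x, alpha_s), as in Lemma 4.4."""
    return rho * min(fourth_selection_alpha(Lx, Ldx), fourth_selection_alpha(Ls, Lds))
```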

5 The central trajectory algorithm

The central trajectory algorithm for solving the symmetric programming problem is formally stated in Algorithm 5.1. Algorithm 5.1 selects a sequence of displacement steps \(\{\alpha ^{(k)} \}\) and centrality parameters \(\{\sigma ^{(k)}\}\) according to the following rule: for all \(k\ge 0\), we take \(\sigma ^{(k)} = 1 - \delta /\sqrt{r}\), where \(\delta \in \left[ 0, \sqrt{r}\right) \), as will be seen in the next section.

Algorithm 5.1 (The central trajectory algorithm)
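To make the loop structure concrete, the following sketch (ours, not the authors' implementation) outlines one possible realization of the iteration in the special case \(\mathcal {K}_\mathcal {J} = {\mathbb {R}}^n_+\), where the Jordan product is the componentwise product and the scaling step can be omitted. For brevity the displacement step is taken as \(\rho \) times the exact ratio to the boundary of the cone; any of the four selections of Section 4 gives a valid, more conservative value.

```python
import numpy as np

def central_trajectory_lp(A, x, y, s, delta=0.5, eps=1e-6, rho=0.99, max_iter=500):
    """Illustrative rendition of the central trajectory loop for K = R^n_+,
    starting from a strictly feasible triple (x; y; s).  In this algebra the
    Jordan product is componentwise, so system (8) reduces to the plain
    primal-dual Newton system."""
    r = x.size                                   # rank of the cone
    k = 0
    while x @ s > eps and k < max_iter:
        mu = (x @ s) / r
        sigma = 1.0 - delta / np.sqrt(r)         # centrality parameter sigma^(k)
        h = sigma * mu * np.ones(r) - x * s      # right-hand side of the last equation
        # Newton system: A dx = 0, A^T dy + ds = 0, s*dx + x*ds = h
        M = A @ np.diag(x / s) @ A.T
        dy = np.linalg.solve(M, -A @ (h / s))
        ds = -A.T @ dy
        dx = (h - x * ds) / s
        # displacement step: rho times the exact ratio to the boundary of the cone
        alpha = 1.0
        for v, dv in ((x, dx), (s, ds)):
            neg = dv < 0
            if neg.any():
                alpha = min(alpha, float(np.min(-v[neg] / dv[neg])))
        alpha *= rho
        x, y, s = x + alpha * dx, y + alpha * dy, s + alpha * ds
        k += 1
    return x, y, s, k
```

The loop preserves \(A\varvec{x} = \varvec{b}\) and \(A^\mathsf{T }\varvec{y} + \varvec{s} = \varvec{c}\) because \(A\Delta \varvec{x} = \varvec{0}\) and \(A^\mathsf{T }\Delta \varvec{y} + \Delta \varvec{s} = \varvec{0}\), while the choice of \(\alpha \) keeps \(\varvec{x}\) and \(\varvec{s}\) positive, which is exactly the role of the displacement step discussed above.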

In the rest of this section, we prove that the complementarity gap and the function \(f_\mu \) decrease for a given displacement step. The proof of this result depends essentially on the following lemma.

Lemma 5.1

Let \((\varvec{x}; \varvec{y}; \varvec{s}) \in \text {int}\,\mathcal {K}_\mathcal {J} \times {\mathbb {R}}^m \times \text {int}\,\mathcal {K}_\mathcal {J}\), \(\left( \overline{\varvec{x}}; \varvec{y}; \underline{\varvec{s}}\right) \) be obtained by applying scaling to \(\left( \varvec{x}; \varvec{y}; \varvec{s}\right) \), and \(\left( \overline{\Delta \varvec{x}}; \Delta \varvec{y}; \underline{\Delta \varvec{s}}\right) \) be a solution of the system (8). Then we have

  1. (a)

    \(\overline{\Delta \varvec{x}}\bullet \underline{\Delta \varvec{s}} = 0\).

  2. (b)

    \(\overline{\varvec{x}}\bullet \underline{\Delta \varvec{s}} + \overline{\Delta \varvec{x}}\bullet \underline{\varvec{s}} = trace (\varvec{h})\), where \(\varvec{h} = \sigma \mu \varvec{e} - \overline{\varvec{x}} \circ \underline{\varvec{s}}\) such that \(\sigma \in (0, 1)\) and \(\mu = \frac{1}{r} \overline{\varvec{x}}\bullet \underline{\varvec{s}}\).

  3. (c)

    \({\overline{\varvec{x}}^+}\bullet \underline{\varvec{s}}^+ =\left( 1-\alpha \left( 1-\sigma \right) \right) \; \overline{\varvec{x}}\bullet \underline{\varvec{s}}, \;\forall \alpha \in {\mathbb {R}}\), where \(\overline{\varvec{x}}^+ = \overline{\varvec{x}} + \alpha \overline{\Delta \varvec{x}}\) and \(\underline{\varvec{s}}^+ = \underline{\varvec{s}} + \alpha \underline{\Delta \varvec{s}}\).

  4. (d)

    \({\varvec{x}^+}\bullet \varvec{s}^+ =\left( 1-\alpha \left( 1-\sigma \right) \right) \; \varvec{x}\bullet \varvec{s}, \;\forall \alpha \in {\mathbb {R}}\), where \(\varvec{x}^+ = \varvec{x} + \alpha \Delta \varvec{x}\) and \({\varvec{s}}^+ = \varvec{s} + \alpha \Delta \varvec{s}\).

Proof

By the first two equations of the system (8), we get

$$\begin{aligned} \overline{\Delta \varvec{x}}\bullet \underline{\Delta \varvec{s}} =- \overline{\Delta \varvec{x}}\bullet \underline{A}^\mathsf{T }\Delta \varvec{y} =- \left( \underline{A} \overline{\Delta \varvec{x}} \right) ^\mathsf{T }\Delta \varvec{y} = 0. \end{aligned}$$

This proves item (a).

We prove item (b) by noting that

$$\begin{aligned} \begin{array}{lll} trace (\varvec{h}) &{}=&{} trace \left( \sigma \mu \varvec{e} - \overline{\varvec{x}} \circ \underline{\varvec{s}} \right) \\ &{}=&{} trace \left( \overline{\varvec{x}}\circ \underline{\Delta \varvec{s}} + \overline{\Delta \varvec{x}}\circ \underline{\varvec{s}} \right) \\ &{}=&{} trace \left( \overline{\varvec{x}} \circ \underline{\Delta \varvec{s}} \right) + trace \left( \overline{\Delta \varvec{x}} \circ \underline{\varvec{s}} \right) \\ &{}=&{}\overline{\varvec{x}}\bullet \underline{\Delta \varvec{s}} + \overline{\Delta \varvec{x}}\bullet \underline{\varvec{s}}, \end{array} \end{aligned}$$

where we used the last equation of the system (8) to obtain the first equality. To prove item (c), note that

$$\begin{aligned} \begin{array}{lll} {\overline{\varvec{x}}^+}\bullet \underline{\varvec{s}}^+ &{}=&{} \left( \overline{\varvec{x}} + \alpha \overline{\Delta \varvec{x}}\right) \bullet \left( \underline{\varvec{s}} + \alpha \underline{\Delta \varvec{s}}\right) \\ &{}=&{} \overline{\varvec{x}}\bullet \underline{\varvec{s}} + \alpha \left( \overline{\Delta \varvec{x}}\bullet \underline{\varvec{s}} + \overline{\varvec{x}}\bullet \underline{\Delta \varvec{s}} \right) + \alpha ^2 \overline{\Delta \varvec{x}}\bullet \underline{\Delta \varvec{s}}\\ &{}=&{} \overline{\varvec{x}}\bullet \underline{\varvec{s}} + \alpha \; trace \left( \sigma \mu \varvec{e} - \overline{\varvec{x}} \circ \underline{\varvec{s}} \right) \\ &{}=&{} \overline{\varvec{x}}\bullet \underline{\varvec{s}} + \alpha \sigma \mu \; trace \left( \varvec{e}\right) - \alpha \; trace \left( \overline{\varvec{x}} \circ \underline{\varvec{s}} \right) \\ &{}=&{}\overline{\varvec{x}}\bullet \underline{\varvec{s}} + \alpha \sigma \mu r -\alpha \overline{\varvec{x}}\bullet \underline{\varvec{s}}\\ &{}=&{}\overline{\varvec{x}}\bullet \underline{\varvec{s}} + \alpha \sigma \overline{\varvec{x}}\bullet \underline{\varvec{s}} -\alpha \overline{\varvec{x}}\bullet \underline{\varvec{s}}\\ &{}=&{}\left( 1-\alpha \left( 1-\sigma \right) \right) \; \overline{\varvec{x}}\bullet \underline{\varvec{s}}, \end{array} \end{aligned}$$

where the third equality follows from items (a) and (b).

Finally, item (d) follows from item (c) and the fact that \(\overline{\varvec{x}}\bullet \underline{\varvec{s}}= \varvec{x}\bullet \varvec{s}\) (see (9)), and similarly that \({\overline{\varvec{x}}^+}\bullet \underline{\varvec{s}}^+= {{\varvec{x}}^+}\bullet {\varvec{s}}^+\). The proof is complete. \(\square \)
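As a quick numerical sanity check of Lemma 5.1 (ours, for illustration only), the following snippet solves a Newton system of the form (8) written directly in the algebra \({\mathbb {R}}^n\) with the componentwise Jordan product, where no scaling is needed, and verifies item (a), item (b) and the gap identity of items (c)/(d) for arbitrary data.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, sigma, alpha = 3, 8, 0.3, 0.7
A = rng.standard_normal((m, n))
x = rng.uniform(0.5, 2.0, n)                 # strictly positive primal point
s = rng.uniform(0.5, 2.0, n)                 # strictly positive dual slack
y = rng.standard_normal(m)
mu = (x @ s) / n
h = sigma * mu * np.ones(n) - x * s
# componentwise Newton system: A dx = 0, A^T dy + ds = 0, s*dx + x*ds = h
dy = np.linalg.solve(A @ np.diag(x / s) @ A.T, -A @ (h / s))
ds = -A.T @ dy
dx = (h - x * ds) / s
xp, sp = x + alpha * dx, s + alpha * ds
print(abs(dx @ ds))                                         # item (a): ~ 0
print(abs(x @ ds + dx @ s - h.sum()))                       # item (b): ~ 0
print(abs(xp @ sp - (1 - alpha * (1 - sigma)) * (x @ s)))   # items (c)/(d): ~ 0
```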

The following result generalizes the corresponding one in [23, Lemmas 4.2 and 4.3].

Lemma 5.2

Let \((\varvec{x}; \varvec{y}; \varvec{s})\) and \(\left( \varvec{x}^+; \varvec{y}^+; \varvec{s}^+\right) \) be strictly feasible solutions of the pair of problems \((P_\mu , D_\mu )\) with \(\left( \varvec{x}^+; \varvec{y}^+; \varvec{s}^+\right) = \left( \varvec{x} + \alpha \Delta \varvec{x};\; \varvec{y} + \alpha \Delta \varvec{y};\; \varvec{s} + \alpha \Delta \varvec{s}\right) \), where \(\alpha \) is a displacement step and \(\left( \Delta \varvec{x};\Delta \varvec{y};\Delta \varvec{s}\right) \) is the Newton direction. Then we have

  1. (a)

    \({\varvec{x}^+}\bullet \varvec{s}^+ < \overline{\varvec{x}}\bullet \underline{\varvec{s}}\).

  2. (b)

    \(f_\mu (\varvec{x}^+) < f_\mu (\varvec{x})\).

Proof

Note that

$$\begin{aligned} {\varvec{x}^+}\bullet \varvec{s}^+ =\left( 1-\alpha \left( 1-\sigma \right) \right) \; \overline{\varvec{x}}\bullet \underline{\varvec{s}} < \overline{\varvec{x}}\bullet \underline{\varvec{s}}, \end{aligned}$$

where the equality follows from item (d) of Lemma 5.1 and the strict inequality follows from \(\left( 1-\alpha \left( 1-\sigma \right) \right) < 1\) (as \(\alpha > 0\) and \(\sigma \in (0, 1)\)). This proves item (a).

To prove item (b), note that

$$\begin{aligned} f_\mu (\varvec{x}^+) \simeq f_\mu (\varvec{x}) + \nabla _{\varvec{x}} f_\mu (\varvec{x}) \bullet (\varvec{x}^+ - \varvec{x}), \end{aligned}$$

and hence

$$\begin{aligned} f_\mu (\varvec{x}^+) - f_\mu (\varvec{x}) \simeq \alpha \nabla _{\varvec{x}} f_\mu (\varvec{x}) \bullet \Delta \varvec{x}. \end{aligned}$$

Since

$$\begin{aligned} \nabla _{\varvec{x}} f_\mu (\varvec{x}) = - \nabla ^2_{\varvec{x} \varvec{x}} f_\mu (\varvec{x}) \Delta \varvec{x}, \end{aligned}$$

we have

$$\begin{aligned} f_\mu (\varvec{x}^+) - f_\mu (\varvec{x}) \simeq -\alpha \; \Delta \varvec{x}\bullet \nabla ^2_{\varvec{x} \varvec{x}} f_\mu (\varvec{x}) \Delta \varvec{x} <0, \end{aligned}$$

where the strict inequality follows from the positive definiteness of the Hessian matrix \(\nabla ^2_{\varvec{x} \varvec{x}} f_\mu (\varvec{x})\) (as \(f_\mu \) is strictly convex). Thus, \(f_\mu (\varvec{x}^+) < f_\mu (\varvec{x})\). The proof is complete. \(\square \)

6 Complexity analysis

In this section, we analyze the complexity of the proposed central trajectory algorithm for symmetric programming. More specifically, we prove that the iteration-complexity of Algorithm 5.1 is bounded by

$$\begin{aligned} \mathcal {O}\left( \sqrt{r} \ln \left[ \epsilon ^{-1}\, \varvec{x}^{(0)} \bullet \varvec{s}^{(0)}\right] \right) . \end{aligned}$$

Our proof depends essentially on the following two lemmas.

Lemma 6.1

Let \((\varvec{x}; \varvec{y}; \varvec{s}) \in \mathcal {F}^\circ _P \times \mathcal {F}^\circ _D\), \(\left( \overline{\varvec{x}}; \varvec{y}; \underline{\varvec{s}}\right) \) be obtained by applying scaling to \(\left( \varvec{x}; \varvec{y}; \varvec{s}\right) \) with \(\varvec{h} = \sigma \mu \varvec{e} - \overline{\varvec{x}} \circ \underline{\varvec{s}}\), and \(\left( \overline{\Delta \varvec{x}}; \Delta \varvec{y}; \underline{\Delta \varvec{s}}\right) \) be a solution of the system (8). For any \(\alpha \in {\mathbb {R}}\), we set

$$\begin{aligned} \begin{array}{l} (\varvec{x}(\alpha ); \varvec{y}(\alpha ); \varvec{s}(\alpha )) := (\overline{\varvec{x}}; \varvec{y}; \underline{\varvec{s}}) + \alpha (\overline{\Delta \varvec{x}}; \Delta \varvec{y}; \underline{\Delta \varvec{s}}),\\ \mu (\alpha ) := \frac{1}{r} \varvec{x}(\alpha )\bullet \varvec{s}(\alpha ),\\ \varvec{\nu }(\alpha ) := \varvec{x}(\alpha ) \circ \varvec{s}(\alpha ) - \mu (\alpha ) \varvec{e}. \end{array} \end{aligned}$$

Then

$$\begin{aligned} \varvec{\nu }(\alpha ) = (1-\alpha ) (\overline{\varvec{x}} \circ \underline{\varvec{s}} - \mu \varvec{e}) + \alpha ^2 \; \overline{\Delta \varvec{x}} \circ \underline{\Delta \varvec{s}}. \end{aligned}$$
(15)

Proof

Given \(\alpha \in {\mathbb {R}}\), using item (c) of Lemma 5.1, we have

$$\begin{aligned} \varvec{x} (\alpha )\bullet \varvec{s}(\alpha ) = (1- \alpha + \alpha \sigma )\, \overline{\varvec{x}}\bullet \underline{\varvec{s}}, \;\; \text {and hence} \;\; \mu (\alpha ) = (1- \alpha + \alpha \sigma ) \mu . \end{aligned}$$

Thus, we get

$$\begin{aligned} \begin{array}{lll} \varvec{\nu }(\alpha ) &{}=&{} \varvec{x}(\alpha ) \circ \varvec{s}(\alpha ) -\mu (\alpha ) \varvec{e}\\ &{}=&{} \left( \overline{\varvec{x}} + \alpha \overline{\Delta \varvec{x}}\right) \circ \left( \underline{\varvec{s}} + \alpha \underline{\Delta \varvec{s}}\right) - (1- \alpha + \alpha \sigma ) \mu \varvec{e}\\ &{}=&{} (1 - \alpha )(\overline{\varvec{x}} \circ \underline{\varvec{s}} - \mu \varvec{e}) + \alpha (\underbrace{\overline{\varvec{x}} \circ \underline{\varvec{s}} - \sigma \mu \varvec{e}}_{- \varvec{h}})\\ &{}&{}+\, \alpha \underbrace{\left( \overline{\varvec{x}} \circ \underline{\Delta \varvec{s}} + \overline{\Delta \varvec{x}} \circ \underline{\varvec{s}} \right) }_{\varvec{h}} + \alpha ^2 \; \overline{\Delta \varvec{x}} \circ \underline{\Delta \varvec{s}}\\ &{}=&{} (1-\alpha ) (\overline{\varvec{x}} \circ \underline{\varvec{s}} - \mu \varvec{e}) + \alpha ^2 \; \overline{\Delta \varvec{x}} \circ \underline{\Delta \varvec{s}}. \end{array} \end{aligned}$$

This completes the proof. \(\square \)

Lemma 6.2

Let \((\varvec{x}; \varvec{y}; \varvec{s})\in \mathcal {F}^\circ _P \times \mathcal {F}^\circ _D\), \(\left( \overline{\varvec{x}}; \varvec{y}; \underline{\varvec{s}}\right) \) be obtained by applying scaling to \(\left( \varvec{x}; \varvec{y}; \varvec{s}\right) \) such that \(\left\| \overline{\varvec{x}} \circ \underline{\varvec{s}} -\mu \varvec{e}\right\| \le \theta \mu \), for some \(\theta \in [0,1)\) and \(\mu >0\). Let also \(\left( \overline{\Delta \varvec{x}}; \Delta \varvec{y}; \underline{\Delta \varvec{s}}\right) \) be a solution of the system (8), \(\varvec{h} = \sigma \mu \varvec{e} - \overline{\varvec{x}} \circ \underline{\varvec{s}}\), \(\delta _{\varvec{x}}:=\mu \left\| \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1}\right\| _{F}, \delta _{\varvec{s}}:=\left\| \overline{\varvec{x}} \circ \underline{\Delta \varvec{s}} \right\| _{F}\). Then, we have

$$\begin{aligned} \delta _{\varvec{x}} \delta _{\varvec{s}} \le \frac{1}{2} \left( \delta _{\varvec{x}}^2 + \delta _{\varvec{s}}^2 \right) \le \displaystyle \frac{\Vert \varvec{h} \Vert ^2_F}{2(1-\theta )^2}. \end{aligned}$$
(16)

Proof

By the last equation of system (8) and from the operator commutativity, we have

$$\begin{aligned} \varvec{h}= \overline{\varvec{x}} \circ \underline{\Delta \varvec{s}} + \overline{\Delta \varvec{x}} \circ \underline{\varvec{s}} = \overline{\varvec{x}} \circ \underline{\Delta \varvec{s}} +\mu \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1} +\left( \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1}\right) \circ \left( \overline{\varvec{x}} \circ \underline{\varvec{s}} - \mu \varvec{e} \right) . \end{aligned}$$

It immediately follows that

$$\begin{aligned} \begin{array}{lll} \Vert \varvec{h} \Vert _F &{}\ge &{} \left\| \overline{\varvec{x}} \circ \underline{\Delta \varvec{s}} + \mu \, \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1} \right\| _F - \left\| \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1} \right\| _F \left\| \overline{\varvec{x}} \circ \underline{\varvec{s}} - \mu \varvec{e} \right\| \\ &{}\ge &{} \left\| \overline{\varvec{x}} \circ \underline{\Delta \varvec{s}} + \mu \, \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1} \right\| _F - \theta \delta _{\varvec{x}}\\ &{}\ge &{} \sqrt{\left\| \overline{\varvec{x}} \circ \underline{\Delta \varvec{s}} \right\| _F^2 + \left\| \mu \, \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1} \right\| _F^2} - \theta \delta _{\varvec{x}}\\ &{}=&{} \sqrt{\delta _{\varvec{x}}^2 + \delta _{\varvec{s}}^2}- \theta \delta _{\varvec{x}}\\ &{}\ge &{}(1-\theta ) \sqrt{\delta _{\varvec{x}}^2 + \delta _{\varvec{s}}^2}, \end{array} \end{aligned}$$
(17)

where the second inequality follows from the assumption that \(\left\| \overline{\varvec{x}} \circ \underline{\varvec{s}} - \mu \varvec{e} \right\| \le \theta \mu \), and the third inequality follows from (3) and the fact that

$$\begin{aligned}&\left( \overline{\varvec{x}} \circ \underline{\Delta \varvec{s}}\right) \bullet \left( \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1}\right) = trace \left( \left( \overline{\varvec{x}} \circ \underline{\Delta \varvec{s}}\right) \circ \left( \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1}\right) \right) \\&\quad = trace \left( \underline{\Delta \varvec{s}}\circ \overline{\Delta \varvec{x}} \right) =\overline{\Delta \varvec{x}}\bullet \underline{\Delta \varvec{s}}, \end{aligned}$$

which equals zero by item (a) of Lemma 5.1.

The left-hand side inequality in (16) follows by noting that \((\delta _{\varvec{x}} - \delta _{\varvec{s}})^2 \ge 0\), and the right-hand side inequality in (16) follows from the last inequality in (17). The proof is complete. \(\square \)

The following theorem analyzes the behavior of one iteration of Algorithm 5.1. This theorem generalizes [23, Theorem 7.1].

Theorem 6.1

Let \(\theta \in (0, 1)\) and \(\delta \in \left[ 0, \sqrt{r}\right) \) be given such that

$$\begin{aligned} \frac{\theta ^2 + \delta ^2}{2(1-\theta )^2 \left( 1 - \frac{\delta }{\sqrt{r}} \right) } \le \theta \le \frac{1}{2}. \end{aligned}$$
(18)

Suppose that \(\left( \overline{\varvec{x}}; \varvec{y}; \underline{\varvec{s}}\right) \in \mathcal {N}_\theta (\mu )\) and let \(\left( \overline{\Delta \varvec{x}}; \Delta \varvec{y}; \underline{\Delta \varvec{s}} \right) \) denote the solution of system (8) with \(\varvec{h} = \sigma \mu \varvec{e} - \overline{\varvec{x}} \circ \underline{\varvec{s}}\) and \(\sigma = 1 - \frac{\delta }{\sqrt{r}}\). Then, we have

  1. (a)

    \({\overline{\varvec{x}}^+}\bullet \underline{\varvec{s}}^+ = \left( 1 - \frac{\delta }{\sqrt{r}} \right) \overline{\varvec{x}}\bullet \underline{\varvec{s}}\).

  2. (b)

    \(\left( \overline{\varvec{x}}^+;\varvec{y}^+;\underline{\varvec{s}}^+\right) = \left( \overline{\varvec{x}}; \varvec{y}; \underline{\varvec{s}}\right) + \left( \overline{\Delta \varvec{x}}; \Delta \varvec{y}; \underline{\Delta \varvec{s}} \right) \in \mathcal {N}_\theta (\mu )\).

  3. (c)

    \(\left( \varvec{x}^+;\varvec{y}^+;\varvec{s}^+\right) = \left( \varvec{x}; \varvec{y}; \varvec{s}\right) + \left( \Delta \varvec{x}; \Delta \varvec{y}; \Delta \varvec{s} \right) \in \mathcal {N}_\theta (\mu )\).

Proof

Item (a) follows directly from item (c) of Lemma 5.1 with \(\alpha = 1\) and \(\sigma = 1 -\frac{\delta }{\sqrt{r}}\). We now prove item (b). Define

$$\begin{aligned} \mu ^+ := \frac{1}{r}\; {\overline{\varvec{x}}^+}\bullet \underline{\varvec{s}}^+ = \left( 1- \frac{\delta }{\sqrt{r}}\right) \mu \end{aligned}$$
(19)

Since \(\left( \overline{\varvec{x}}; \varvec{y}; \underline{\varvec{s}}\right) \in \mathcal {N}_\theta (\mu )\), we have

$$\begin{aligned} \left\| \sigma \mu \varvec{e} - \overline{\varvec{x}} \circ \underline{\varvec{s}} \right\| _F^2\le & {} \left\| (\sigma -1) \mu \varvec{e} \right\| _F^2 + \left\| \mu \varvec{e} - \overline{\varvec{x}} \circ \underline{\varvec{s}} \right\| _F^2\nonumber \\\le & {} \left( (\sigma -1)^2 r + \theta ^2 \right) \mu ^2 = \left( \delta ^2 + \theta ^2 \right) \mu ^2. \end{aligned}$$
(20)

Since \(\Vert \overline{\varvec{x}} \circ \underline{\varvec{s}} - \mu \varvec{e}\Vert \le \theta \mu \) and \(\varvec{h} = \sigma \mu \varvec{e} - \overline{\varvec{x}} \circ \underline{\varvec{s}}\), using Lemma 6.2 it follows that

$$\begin{aligned} \left\| \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1} \right\| _F \left\| \overline{\varvec{x}} \circ \underline{\Delta \varvec{s}} \right\| _F \le \displaystyle \frac{\left\| \sigma \mu \varvec{e} - \overline{\varvec{x}} \circ \underline{\varvec{s}}\right\| _F^2}{2(1-\theta )^2 \mu }. \end{aligned}$$
(21)

Defining \(\varvec{\nu }^+ := \varvec{\nu }(1) = \overline{\varvec{x}}^+ \circ \underline{\varvec{s}}^+ - \mu ^+ \varvec{e}\) and using (15) with \(\alpha = 1\), (21), (20), (18) and (19), we get

$$\begin{aligned} \Vert \varvec{\nu }^+\Vert _F= & {} \left\| \overline{\Delta \varvec{x}} \circ \underline{\Delta \varvec{s}} \right\| _F \le \left\| \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1} \right\| _F \left\| \overline{\varvec{x}} \circ \underline{\Delta \varvec{s}} \right\| _F \\\le & {} \displaystyle \frac{\left\| \sigma \mu \varvec{e} - \overline{\varvec{x}} \circ \underline{\varvec{s}}\right\| _F^2}{2(1-\theta )^2 \mu } \le \frac{(\delta ^2 + \theta ^2) \mu }{2(1-\theta )^2} \le \theta \left( 1 - \frac{\delta }{\sqrt{r}} \right) \mu = \theta \mu ^+. \end{aligned}$$

Consequently,

$$\begin{aligned} \left\| \overline{\varvec{x}}^+ \circ \underline{\varvec{s}}^+ - \mu ^+ \varvec{e} \right\| _F \le \theta \mu ^+. \end{aligned}$$
(22)

By using the right-hand side inequality in (16), and using (20) and (18), we have

$$\begin{aligned} \left\| \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1} \right\| _F \le \displaystyle \frac{\left\| \sigma \mu \varvec{e} - \overline{\varvec{x}} \circ \underline{\varvec{s}}\right\| _F}{(1-\theta ) \mu } \le \frac{\sqrt{\delta ^2 + \theta ^2}}{(1-\theta )} \le \sqrt{2\theta \left( 1 - \frac{\delta }{\sqrt{r}} \right) } < 1, \end{aligned}$$

where the strict inequality follows from \(\theta \le \frac{1}{2}\) and \(0< 1 - \frac{\delta }{\sqrt{r}} < 1\).

One can easily see that \(\left\| \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1} \right\| _F < 1\) implies that \(\varvec{e} + \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1} \succ \varvec{0}\), and therefore

$$\begin{aligned} \overline{\varvec{x}}^+ = \overline{\Delta \varvec{x}}+ \overline{\varvec{x}} =\left( \varvec{e} + \overline{\Delta \varvec{x}} \circ \overline{\varvec{x}}^{-1}\right) \circ \overline{\varvec{x}} \succ \varvec{0}. \end{aligned}$$

Note that, from (22), we have \(\lambda _{\min }\left( \overline{\varvec{x}}^+ \circ \underline{\varvec{s}}^+ \right) \ge (1-\theta ) \mu ^+ >0\), and therefore \(\overline{\varvec{x}}^+ \circ \underline{\varvec{s}}^+ \succ \varvec{0}\). Since \(\overline{\varvec{x}}^+ \succ \varvec{0}\) and \(\overline{\varvec{x}}^+\) and \(\underline{\varvec{s}}^+\) operator commute, we conclude that \(\underline{\varvec{s}}^+ \succ \varvec{0}\). Using the first equation of system (8), we get

$$\begin{aligned} \underline{A} \overline{\varvec{x}}^+ = \underline{A} \left( \overline{\varvec{x}} + \overline{\Delta \varvec{x}} \right) = \underline{A} \overline{\varvec{x}} + \underline{A}\overline{\Delta \varvec{x}}= \varvec{b}, \;\; \text {and hence} \;\; \overline{\varvec{x}}^+ \in \mathcal {F}_P^\circ . \end{aligned}$$

By using the second equation of system (8), we get

$$\begin{aligned} \underline{A}^\mathsf{T }\varvec{y}^+ + \underline{\varvec{s}}^+= & {} \underline{A}^\mathsf{T }(\varvec{y} + \Delta \varvec{y}) + \left( \underline{\varvec{s}} +\underline{\Delta \varvec{s}}\right) = \underline{A}^\mathsf{T }\varvec{y} + \underline{\varvec{s}} \\&+\underline{A}^\mathsf{T }\Delta \varvec{y} +\underline{\Delta \varvec{s}} = \varvec{c}, \;\; \text {and hence} \;\; \left( \varvec{y}^+; \underline{\varvec{s}}^+\right) \in \mathcal {F}^\circ _D. \end{aligned}$$

Thus, in view of (22), we deduce that \(\left( \overline{\varvec{x}}^+; \varvec{y}^+; \underline{\varvec{s}}^+\right) \in \mathcal {N}_\theta (\mu )\). Item (b) is therefore established. Item (c) follows from item (b) and [1, Proposition 29]. The proof is now complete. \(\square \)

Corollary 6.1

Let \(\theta \) and \(\delta \) be as given in Theorem 6.1 and \(\left( \varvec{x}^{(0)}; \varvec{y}^{(0)}; \varvec{s}^{(0)}\right) \in \mathcal {N}_\theta (\mu )\). Then Algorithm 5.1 generates a sequence of points \(\left\{ \left( \varvec{x}^{(k)}; \varvec{y}^{(k)}; \varvec{s}^{(k)}\right) \right\} \subset \mathcal {N}_\theta (\mu )\) such that

$$\begin{aligned} \varvec{x}^{(k)}\bullet \varvec{s}^{(k)} = \left( 1 - \frac{\delta }{\sqrt{r}} \right) ^k \varvec{x}^{(0)}\bullet \varvec{s}^{(0)},\;\;\; \forall k\ge 0. \end{aligned}$$

Moreover, given a tolerance \(\epsilon >0\), Algorithm 5.1 computes an iterate \(\left( \varvec{x}^{(k)}; \varvec{y}^{(k)}; \varvec{s}^{(k)}\right) \) satisfying \(\varvec{x}^{(k)}\bullet \varvec{s}^{(k)} \le \epsilon \) in at most \(K= \mathcal {O}\left( \sqrt{r} \ln \left[ \epsilon ^{-1}\varvec{x}^{(0)}\bullet \varvec{s}^{(0)} \right] \right) \) iterations.

Proof

Applying item (a) of Theorem 6.1 recursively, for each \(k\ge 0\) we have \(\varvec{x}^{(k)}\bullet \varvec{s}^{(k)} = \left( 1 - \frac{\delta }{\sqrt{r}} \right) ^k \varvec{x}^{(0)}\bullet \varvec{s}^{(0)}\). To guarantee \(\varvec{x}^{(k)}\bullet \varvec{s}^{(k)} \le \epsilon \), it suffices that

$$\begin{aligned} \left( 1 - \frac{\delta }{\sqrt{r}} \right) ^k \varvec{x}^{(0)}\bullet \varvec{s}^{(0)} \le \epsilon . \end{aligned}$$

By taking the natural logarithm of both sides, we get

$$\begin{aligned} k \ln \left( 1 - \frac{\delta }{\sqrt{r}} \right) \le \ln \left( \frac{\epsilon }{\varvec{x}^{(0)}\bullet \varvec{s}^{(0)}} \right) , \end{aligned}$$

Since \(\ln \left( 1 - \frac{\delta }{\sqrt{r}} \right) \le - \frac{\delta }{\sqrt{r}}\), the last inequality holds whenever

$$\begin{aligned} k \left( - \frac{\delta }{\sqrt{r}} \right) \le \ln \left( \frac{\epsilon }{\varvec{x}^{(0)}\bullet \varvec{s}^{(0)}} \right) , \;\; \text {or equivalently}, \;\; k \ge \delta ^{-1} \sqrt{r}\; \ln \left( \frac{\varvec{x}^{(0)}\bullet \varvec{s}^{(0)}}{\epsilon } \right) . \end{aligned}$$

Hence \(K= \left\lceil \delta ^{-1} \sqrt{r}\; \ln \left( \epsilon ^{-1} \varvec{x}^{(0)}\bullet \varvec{s}^{(0)} \right) \right\rceil \) iterations suffice.

The result is established. \(\square \)
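For concreteness, the resulting iteration bound can be evaluated as follows (a small illustrative computation; the particular values of \(r\), \(\delta \), the initial gap and \(\epsilon \) are arbitrary).

```python
import math

def iteration_bound(r, delta, initial_gap, eps):
    """K = ceil(sqrt(r)/delta * ln(x0.s0 / eps)), as in Corollary 6.1."""
    return math.ceil(math.sqrt(r) / delta * math.log(initial_gap / eps))

print(iteration_bound(r=50, delta=0.5, initial_gap=1e3, eps=1e-6))   # 294
```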

7 Numerical experiments

In this section, we present two numerical examples for second-order cone programs to demonstrate the efficiency of our algorithm. Our numerical experiments were carried out on a PC with an Intel(R) Dual CPU at 2.20 GHz and 2 GB of physical memory, running MATLAB Version 7.4.0.287 (R2007a) on the Windows XP Enterprise 32-bit operating system. We denote by

  • \(1\text {st}\) Sel.:   the first selection for calculating the displacement step,

  • \(2\text {nd}\) Sel.:  the second selection for calculating the displacement step,

  • \(3\text {rd}\) Sel.:   the third selection for calculating the displacement step,

  • \(4\text {th}\) Sel.:   the fourth selection for calculating the displacement step,

  • (n, m):      the size of the problem,

  • Iter.:         the number of iterations taken to obtain the optimal solution,

  • CPU(s):   the time (in seconds) required to obtain the optimal solution.

We point out that the matrices used in our examples have full row rank. In all of our experiments, we used the NT direction for choosing the scaling vector \(\varvec{p}^{(k)} \in \mathcal {C}\left( \varvec{x}^{(k)}, \varvec{s}^{(k)}\right) \) because this direction has the advantage of being primal-dual symmetric. The following example is taken from the literature; see [15, Section 6].

Example 7.1

Let n and m be positive integers such that \(n=2m\). We consider the second-order cone programming problem and its dual

$$\begin{aligned} \begin{array}{llllll} (P)\quad &{}\min &{} \varvec{c} ^\mathsf{T }\varvec{x} \qquad \qquad &{}(D)\quad &{}\max &{}\varvec{b}^\mathsf{T }\varvec{y}\\ &{}\text{ s.t. } &{} A \varvec{x} = \varvec{b}, &{}&{}\text{ s.t. }&{} \varvec{c} - A^\mathsf{T }\varvec{y} \in \mathcal {E}^{n}_+,\\ &{}&{} \varvec{x} \in \mathcal {E}^{n}_+; &{}&{}&{} \end{array} \end{aligned}$$

where \(\mathcal {E}^{n}_+\) is the second-order cone defined in Example 2.3, and

$$\begin{aligned}&\begin{array}{lll} \varvec{c} &{}=&{} 10 \varvec{e} -2\; \mathbf ones (n, 1) + 4\; \mathbf rand (n, 1) \in \mathcal {E}^n,\\ \varvec{b} &{}=&{} 10 \varvec{e} - 2\; \mathbf ones (m, 1) + 4\; \mathbf rand (m, 1) \in {\mathbb {R}}^{m},\\ A&{}=&{}\left[ \hat{A}\; \vdots \; \text {Randn}(m, n-m)\right] \in {\mathbb {R}}^{m \times n}, \end{array} \;\text {and} \;\;\hat{a}_{ij} = \left\{ \begin{array}{lll} 2&{}&{}\text {if}\;\; i=j-1,\\ 100&{}&{}\text {if}\;\; i=j,\\ -2&{}&{}\text {if}\;\; i=j+1,\\ 0&{}&{}\text {otherwise}, \end{array}\right. ~~\text {for}\;\;1 \le i, j \le m. \end{aligned}$$

Here, \(\mathbf ones (\ell , 1)\) is a vector of ones of length \(\ell \). We take \(\varvec{x}^{(0)} = \varvec{e} \in \mathcal {E}^n\) and \(\varvec{y}^{(0)} = \varvec{0} \in {\mathbb {R}}^m\) as our initial strictly feasible points. We also take \(\epsilon = 10^{-6}, \sigma = 0.1\) and \(\rho = 0.99\). The well-known interior-point solvers MOSEK [28] and SDPT3 [29] for second-order cone programming were used in our experiments for comparison purposes. The numerical results related to this example are displayed in Table 1.
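The data generation just described can be sketched as follows (our reading of the recipe, in Python/NumPy). In particular, we interpret \(\varvec{e}\) as the identity element \((1, 0, \ldots , 0)^\mathsf{T }\) of the second-order cone algebra, and the term \(10\varvec{e}\) appearing in \(\varvec{b}\) analogously as ten times the first standard basis vector of \({\mathbb {R}}^m\); both readings are assumptions.

```python
import numpy as np

def example_7_1_data(m, seed=0):
    """Illustrative reading of the data recipe of Example 7.1 (borrowed from [15])."""
    rng = np.random.default_rng(seed)
    n = 2 * m
    e_n = np.zeros(n); e_n[0] = 1.0          # assumed identity element of E^n
    e_m = np.zeros(m); e_m[0] = 1.0          # assumed analogue for the vector b
    c = 10 * e_n - 2 * np.ones(n) + 4 * rng.random(n)
    b = 10 * e_m - 2 * np.ones(m) + 4 * rng.random(m)
    # tridiagonal block A_hat: 100 on the diagonal, 2 above it, -2 below it
    A_hat = (100 * np.eye(m)
             + 2 * np.diag(np.ones(m - 1), 1)
             - 2 * np.diag(np.ones(m - 1), -1))
    A = np.hstack([A_hat, rng.standard_normal((m, n - m))])
    return A, b, c
```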

Table 1 The numerical results of Example 7.1

Due to the similarity in results between Examples 7.1 and 7.2, we analyze the numerical results of Table 1 after Example 7.2.

Example 7.2

In this example, the proposed primal-dual central trajectory algorithm was tested on randomly generated second-order cone programming problems. We generate a random matrix \(A\in {\mathbb {R}}^{m \times n}\) with full row rank and random vectors \(\varvec{x}, \varvec{s} \in \mathcal {E}^n_+\) and \(\varvec{y} \in {\mathbb {R}}^m\). We let \(\varvec{b} = A \varvec{x}\) and \(\varvec{c} = A^\mathsf{T }\varvec{y} + \varvec{s}\). Let \(\varvec{x}^{(0)} = \varvec{e}\in \mathcal {E}^{n}, \varvec{y}^{(0)} = \varvec{0} \in {\mathbb {R}}^m\) and \(\varvec{s}^{(0)} = \varvec{e}\in \mathcal {E}^{n}\) be the initial points. Since the sets of strictly feasible solutions of (P) and (D) are non-empty, the generated test problems (P) and (D) have optimal solutions and their optimal values are equal. The parameters used in Algorithm 5.1 were as follows: \(\epsilon = 1.35 \times 10^{-6}, \sigma = 0.1\) and \(\rho = 0.99\).
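The generation of these random strictly feasible instances can be sketched as follows (an illustrative Python rendition; the helper soc_interior and the particular way interior points of \(\mathcal {E}^n_+\) are drawn are our assumptions).

```python
import numpy as np

def example_7_2_data(m, seed=0):
    """Sketch of the random strictly feasible instances of Example 7.2."""
    rng = np.random.default_rng(seed)
    n = 2 * m

    def soc_interior(dim):
        # a point in the interior of the second-order cone: v_1 > ||(v_2, ..., v_dim)||
        v = rng.standard_normal(dim)
        v[0] = np.linalg.norm(v[1:]) + rng.uniform(0.1, 1.0)
        return v

    A = rng.standard_normal((m, n))     # full row rank with probability one
    x, s = soc_interior(n), soc_interior(n)
    y = rng.standard_normal(m)
    b, c = A @ x, A.T @ y + s           # (x; y; s) is strictly feasible by construction
    return A, b, c
```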

In our experiments, we generated test problems with size \(n(=2m)\) ranging from 100 to 1200 and \(r=1\). The random test problems for each \(n\) are generated 5 times, and hence we have 60 random problems in total. In this example, we again use the MOSEK [28] and SDPT3 [29] solvers for comparison purposes, as other well-known interior-point methods for solving second-order cone programming. We present our numerical results in Table 2. Note that the values of “Iter.” and “CPU(s)” are averages over the 5 runs for each \(n\).

Table 2 The numerical results of Example 7.2

In view of Tables 1 and 2, we can see that Algorithm 5.1 with the first, second or fourth selection is able to compute an optimal solution with a favorable running time and requires fewer iterations than the MOSEK and SDPT3 solvers; this is probably due to the fact that these solvers implement infeasible algorithms and use infeasible initial points. We can also see that the fourth selection for calculating the displacement step is remarkably superior to the other three selections in terms of the number of iterations and the running time. It is interesting to note that our conclusion for second-order cone programming problems matches exactly the conclusion obtained from the numerical experiments in [23, Section 5] for solving semidefinite programming problems.

8 Conclusions

In this paper, we have presented a primal-dual central trajectory interior-point method for solving the symmetric programming problem. We have proven the convergence of the optimal solution of the corresponding perturbed problem to the optimal solution of the original problem as the barrier parameter goes to zero. Then, we have applied Newton's method to find a new iterate by computing a good descent direction. The high computational cost of classical approaches, such as line search methods, motivated us to avoid them when calculating the displacement step. Alternatively, in this paper, we have proposed a new approach based on four new selections to calculate the displacement step.

After stating the central trajectory algorithm, we have analyzed the convergence of the proposed algorithm and have proven that the iteration-complexity of the short-step variant is bounded by \(\mathcal {O}(\sqrt{r} \ln [\epsilon ^{-1}\varvec{x}^{(0)}\bullet \varvec{s}^{(0)}])\), where r is the rank of the Cartesian product of the corresponding Euclidean Jordan algebras. Our numerical results for second-order cone programs have demonstrated the efficiency of our approach and have shown the convergence of the four proposed selections to the optimal solution of the problem. Judging by the running time and the number of iterations, these results have also shown that the fourth selection is the most effective choice for reaching the optimal solution.