1 Introduction

The second-order cone programming (SOCP) problem is one of the main problem classes in conic programming. The linear SOCP is a problem in which a linear objective function is minimized over the intersection of a linear manifold with a second-order cone (the Lorentz cone) (see [1]). Many optimization problems, including combinatorial ones, can be reduced to SOCP programs [1,2,3].

At present, several numerical methods for solving SOCP programs are available. Among them, the primal-dual path-following methods are the best known [4, 5]. In [6] dual barrier-projection methods were proposed for SOCP programs; these methods generalize the corresponding methods for linear programming [7]. The primal Newton's method for SOCP was considered in [8]. Both the dual barrier-projection methods and the primal Newton's method were derived with the help of the optimality conditions.

In dual methods the primal variable is expressed as a function of the dual variables. As a result, a system of nonlinear equations with respect to the dual variables, including the slack dual variable, is derived. In [6] this system of nonlinear equations is solved by the fixed-point method, and the resulting dual methods are of affine-scaling type. In the present paper, unlike [6], Newton's method is used for solving the derived system of nonlinear equations. Under the assumption that the solutions of the primal and dual problems are strictly complementary, the dual Newton's methods converge locally to these solutions at a super-linear rate.

The paper is organized as follows. In Sect. 2, we formulate the SOCP program. Section 3 is the principal one: in it the dual iterative methods for SOCP programs, based on Newton's method, are constructed. In Sect. 4 we show that in the case of a non-degenerate problem these dual methods are well-posed. Finally, in Sect. 5, the local convergence of the methods is proved.

In what follows, the identity matrix of order s is denoted by \(I_s\). The symbol \(0_s\) denotes the zero s-dimensional vector, and the symbol \(0_{sk}\) denotes the \(s\times k\) zero matrix. By \(\text{ Diag } \,(x)\) we denote the diagonal matrix with the vector x on its diagonal. Similarly, a block-diagonal matrix with diagonal blocks \(M_1, \ldots , M_k\) is denoted by \(\text{ DIAG } \left( M_1, \ldots , M_k \right) \).
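When the constructions of this paper are prototyped numerically, the above notation maps directly onto standard dense linear algebra primitives. A minimal illustrative sketch (Python with NumPy/SciPy; all variable names here are ours):

```python
import numpy as np
from scipy.linalg import block_diag

s, k = 3, 2
I_s = np.eye(s)               # identity matrix I_s
zero_s = np.zeros(s)          # zero vector 0_s
zero_sk = np.zeros((s, k))    # zero s x k matrix 0_{sk}

x = np.array([1.0, 2.0, 3.0])
Diag_x = np.diag(x)           # Diag(x)

M1, M2 = np.eye(2), 2.0 * np.eye(3)
DIAG_M = block_diag(M1, M2)   # DIAG(M1, M2)
```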

2 The Linear Second-Order Cone Programming Problem

Let \(K \subset \mathbb {R}^n\) denote a closed convex pointed cone with nonempty interior. This cone induces in \(\mathbb {R}^n\) a partial order: \(x_1 \succeq _K x_2\) if \(x_1 - x_2 \in {K}\). The linear cone programming problem is

$$\begin{aligned} \min \ \langle c, x \rangle , \quad \mathcal{A}x = b, \quad x \in \mathcal{K}, \end{aligned}$$
(1)

where \(\mathcal{A}\) is an \(m\,\times \, n\) matrix, \(c \in \mathbb {R}^n\) and \(b \in \mathbb {R}^m\). The semicolons between vectors or components of a vector denote that these vectors or components are placed one under another. The angle brackets denote the usual Euclidean scalar product.

The linear SOCP program is a special case of problem (1). Let \(c_i \in \mathbb {R}^{n_i}\) and \(x_i \in \mathbb {R}^{n_i}\), \(1\le i\le r\). Let also the matrices \(A_i\) have dimensions \(m \times n_i\), \(1\le i\le r\). Consider the problem

$$\begin{aligned} \min \ \sum _{i=1}^r \langle c_i, x_i \rangle , \quad \sum _{i=1}^r A_i x_i = b, \quad x_i \in K^{n_i}_2, \ 1\le i\le r. \end{aligned}$$
(2)

Here \(K^{n_i}_2\) is the second-order cone (the Lorentz cone) in \(\mathbb {R}^{n_i}\), defined as

$$ K^{n_i}_2 = \left\{ x_i = \left[ x_i^0; \, \bar{x}_i \right] \in \mathbb {R}^{n_i}: \ x_i^0 \ge \Vert \bar{x}_i \Vert \right\} , $$

where \(\Vert \cdot \Vert \) is the Euclidean norm. The cone \(K^{n_i}_2\) is self-dual, that is, \(\left( K^{n_i}_2\right) ^* = K^{n_i}_2\).
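For illustration, membership in \(K^{n_i}_2\) reduces to a single norm comparison; a small Python/NumPy sketch (the helper name is ours):

```python
import numpy as np

def in_lorentz_cone(x, tol=0.0):
    """Test x = [x0; x_bar] for x0 >= ||x_bar||, i.e. membership in K_2^n."""
    return x[0] >= np.linalg.norm(x[1:]) - tol

print(in_lorentz_cone(np.array([2.0, 1.0, 1.0])))  # True:  2 >= sqrt(2)
print(in_lorentz_cone(np.array([1.0, 2.0, 2.0])))  # False: 1 <  sqrt(8)
```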

The following problem is dual to (2)

$$\begin{aligned} \begin{array}{c} \max \ \langle b, u \rangle , \\ A_i^T u + y_i = c_i, \ 1\le i\le r; \quad y_1 \succeq _{K^{n_1}_2} 0_{n_1}, \ \ldots , \ y_r \succeq _{K^{n_r}_2} 0_{n_r}, \end{array} \end{aligned}$$
(3)

in which \(u \in \mathbb {R}^m\) and \(y_i \in \mathbb {R}^{n_i}\), \(1\le i\le r\).

Denote \(n = n_1 + \cdots + n_r\). If we set \(c = [c_1; \ldots ; c_r]\), \(x = [x_1; \ldots ; x_r]\), \(y = [y_1; \ldots ; y_r]\) and \(\mathcal{A}= \left[ A_1, \ldots , A_r \right] \), \(\mathcal{K}= K_2^{n_1} \times \cdots \times K_2^{n_r}\), then problem (2) can be written in the form (1). The cone \(\mathcal{K}\) is self-dual. We assume that both problems (2) and (3) have solutions, and that the rows of the matrix \(\mathcal{A}\) are linearly independent. We assume also that \(r>1\).

Let \(y(u) = c - \mathcal{A}^Tu\). By

$$ \mathcal{F}_P = \left\{ x \in \mathcal{K}: \ \mathcal{A}x = b \right\} , \qquad \mathcal{F}_D = \left\{ [u; y]: \ y = y(u), \ y \in \mathcal{K}\right\} $$

we will denote the feasible sets of problems (2) and (3), respectively. We will also use the projection of the set \(\mathcal{F}_D\) onto the space \(\mathbb {R}^m\) of dual variables u, i.e. the set

$$ \left\{ u \in \mathbb {R}^m: \ y(u) \in \mathcal{K}\right\} . $$
If x and [u; y] are solutions of problems (2) and (3), respectively, then they satisfy the following system of equalities

$$\begin{aligned} \langle x, y \rangle = 0, \qquad \mathcal{A}x = b, \qquad y = c- \mathcal{A}^T u, \end{aligned}$$
(4)

and the inclusions \(x \in \mathcal{K}\), \(y \in \mathcal{K}\). Taking these inclusions into account, the equality \(\langle x, y \rangle = 0\) in (4) can be replaced by the n equalities

$$\begin{aligned} x_i \circ y_i = 0_{n_i}, \quad 1\le i\le r, \end{aligned}$$
(5)

where the product of the vectors \(x_i \in \mathbb R^{n_i}\) and \(y_i \in \mathbb R^{n_i}\) is defined in the following way: \(x_i\circ y_i = \left[ x_i^T y_i; \, x_i^0 \bar{y}_i + y_i^0 \bar{x}_i\right] \). By introducing the matrix

$$ \text{ Arr } \,(x_i) = \left[ \begin{array}{cc} x_i^0 &{} \ \bar{x}_i^T \\ \bar{x}_i \ &{} \ x_i^0 I_{n_i-1} \end{array} \right] , $$

the product \(x_i\circ y_i\) can be represented as \(x_i\circ y_i = \text{ Arr } \,(x_i) \, y_i = \text{ Arr } \,(y_i) \, x_i\).

Compose the block-diagonal matrix \( {\mathbf {Arr}}(y) = \text{ DIAG } \left[ \text{ Arr } \, (y_1), \ \ldots , \ \text{ Arr }\, (y_r) \right] . \) Then, taking (5) into account, the equalities (4) can be rewritten as

$$\begin{aligned} {\mathbf {Arr}}\, (y) \, x = {\mathbf {Arr}}\, (x) \, y = 0_n, \qquad \mathcal{A}x = b, \qquad y = c- \mathcal{A}^T u, \end{aligned}$$
(6)

where, recall, \(x\in \mathcal{K}\), \(y\in \mathcal{K}\).
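The operator \(\text{ Arr } \,(\cdot )\) and the product \(\circ \) are straightforward to realize numerically. The following Python/NumPy sketch (function names are ours) builds \(\text{ Arr } \,(x_i)\) and checks the identity \(x_i\circ y_i = \text{ Arr } \,(x_i) \, y_i = \text{ Arr } \,(y_i) \, x_i\) on random data:

```python
import numpy as np

def arr(x):
    """Arrow matrix Arr(x) of x = [x0; x_bar]."""
    A = x[0] * np.eye(x.size)
    A[0, 1:] = x[1:]
    A[1:, 0] = x[1:]
    return A

def jordan(x, y):
    """Jordan product x o y = [x^T y; x0 * y_bar + y0 * x_bar]."""
    return np.concatenate(([x @ y], x[0] * y[1:] + y[0] * x[1:]))

rng = np.random.default_rng(0)
x, y = rng.standard_normal(5), rng.standard_normal(5)
assert np.allclose(jordan(x, y), arr(x) @ y)
assert np.allclose(jordan(x, y), arr(y) @ x)
```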

3 The Dual Newton’s Methods

Consider iterative dual methods for solving problems (2) and (3). These methods are analogs of the primal method proposed in [8]. In the dual methods the dependence x(u), or the more general dependence x(u, y), is used to derive from (6) a system of nonlinear equations depending on the dual variables only.

In order to obtain x(u), we multiply the second equality from (6) by the matrix \(\mathcal{A}^T\) and add it to the first equality from (6). As a result, we obtain the following equation with respect to x:

$$\begin{aligned} {\mathbf \Phi }(y)x = \mathcal{A}^Tb, \end{aligned}$$
(7)

where \({\mathbf \Phi }(y)\) denotes the matrix \({\mathbf \Phi }(y) = \mathcal{A}^T \mathcal{A}+ {\mathbf {Arr}}\, (y)\). The matrix \({\mathbf \Phi }(y)\) is symmetric of order n. If \({\mathbf \Phi }(y)\) is nonsingular, then, solving Eq. (7), we obtain

$$\begin{aligned} x = x(y) = {\mathbf \Phi }^{-1} (y) \mathcal{A}^Tb. \end{aligned}$$

Taking \(y = y(u)\), we conclude that in fact the matrix \({\mathbf \Phi }(y)\) depends on u.

Substituting the found function \(x(u) = x(y(u))\) into the second equation from (6), we obtain a system of nonlinear equations with respect to u, namely,

$$\begin{aligned} \left[ I_m - \mathcal{A}{\mathbf \Phi }^{-1}(y(u)) \mathcal{A}^T \right] b = 0_m. \end{aligned}$$
(8)

The system (8) consists of m equations. The number of unknowns is also equal to m.

Applying the Newton method to solve (8), we obtain the iterative process

$$\begin{aligned} u_{k+1} = u_k - {\mathbf{G} \,}^{-1} (u_k) \left( \mathcal{A}x_k - b \right) . \end{aligned}$$
(9)

Here \(x_k = x(u_k)\) and \( {\mathbf{G} \,}(u) = \frac{d }{du} \, \mathcal{A}\, x(u) = \mathcal{A}\, x_u(u). \)

Treating (7) as an identity with respect to u and differentiating it, we obtain

$$ {\mathbf {Arr}}_u(y(u)) \, x(u) + {\mathbf \Phi }(y(u)) \, x_u(u) = 0_{nm}. $$

If \({\mathbf \Phi }(y(u))\) is a nonsingular matrix, then

$$\begin{aligned} x_u(u) = - {\mathbf \Phi }^{-1} (y(u)) \, {\mathbf {Arr}}_u (y(u)) \, x(u). \end{aligned}$$
(10)

Since \(y(u) = c - \mathcal{A}^T u\), we get \({\mathbf {Arr}}_u (y(u)) = -{\mathbf {Arr}}_y (y) \mathcal{A}^T\).

Proposition 1

For any \(x\in \mathbb {R}^n\) the equality

$$\begin{aligned} {\mathbf {Arr}}_u (y(u)) \, x = - {\mathbf {Arr}}\, (x) \mathcal{A}^T \end{aligned}$$
(11)

holds.

Proof

Let us take the product \(z_i\) of the matrix \(\text{ Arr } (y_i)\) with the vector \(x_i\) and differentiate each row of \(z_i\) with respect to \(y_i\) separately. First of all, the “null” row is the following: \( z_i^0 = \sum _{j=0}^{n_i-1} x_i^j y_i^j. \) Therefore,

$$\begin{aligned} \frac{d}{dy_i} z_i^0 = \left[ x_i^0; \, x_i^1; \, \ldots ; \, x_i^{n_i-1} \right] . \end{aligned}$$
(12)

Further, for each subsequent j-th row: \( z_i^j = y_i^j x_i^0 + y_i^0 x_i^j. \) Hence

$$\begin{aligned} \frac{d}{dy_i}z_i^j = \left[ x_i^j; \, 0; \ldots ; 0; \, x_i^0; \, 0; \ldots ; 0 \right] , \quad 1\le j \le n_i-1. \end{aligned}$$
(13)

From (12) and (13) we derive that \( {\mathbf {Arr}}_y(y)x = {\mathbf {Arr}}(x)\). Hence, the equality (11) holds.    \(\square \)
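Proposition 1 is also easy to confirm numerically: since \(\text{ Arr } \,(y)\) is linear in y, the directional derivative of \(y \mapsto \text{ Arr } \,(y)\,x\) along any direction v must equal \(\text{ Arr } \,(x)\,v\). A finite-difference sketch (Python/NumPy; the arr helper is the one from the previous sketch):

```python
import numpy as np

def arr(x):                      # arrow matrix, as in the sketch above
    A = x[0] * np.eye(x.size)
    A[0, 1:] = x[1:]
    A[1:, 0] = x[1:]
    return A

rng = np.random.default_rng(1)
x, y, v = (rng.standard_normal(6) for _ in range(3))
eps = 1e-7
fd = (arr(y + eps * v) @ x - arr(y) @ x) / eps   # d/dt Arr(y + t v) x at t = 0
assert np.allclose(fd, arr(x) @ v)               # matches Arr(x) v
```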

According to Proposition 1 and to (10), \( {\mathbf{G} \,}(u) = \mathcal{A}{\mathbf \Phi }^{-1} (y(u)) {\mathbf {Arr}}(x(u)) \mathcal{A}^T. \) Thus, the iterative method (9) can be written in the following form

$$\begin{aligned} u_{k+1} = u_k - \left[ \mathcal{A}{\mathbf \Phi }^{-1}(y_k) {\mathbf {Arr}}(x_k) \mathcal{A}^T \right] ^{-1} \left( \mathcal{A}x_k - b\right) , \end{aligned}$$
(14)

where \(x_k = x(u_k)\), \(y_k = y(u_k)\).
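A compact prototype of method (14) might look as follows; this is a sketch only (Python/NumPy, names are ours), under the assumption that \({\mathbf \Phi }(y_k)\) and \({\mathbf{G} \,}(u_k)\) remain nonsingular along the iterations and that the starting point \(u_0\) is sufficiently close to \(u_*\):

```python
import numpy as np
from scipy.linalg import block_diag

def arr(x):                                       # arrow matrix Arr(x)
    A = x[0] * np.eye(x.size)
    A[0, 1:] = A[1:, 0] = x[1:]
    return A

def blk_arr(y, dims):
    """Block-diagonal Arr(y) for y split into blocks of sizes dims."""
    pos = np.cumsum([0] + list(dims))
    return block_diag(*[arr(y[pos[i]:pos[i + 1]]) for i in range(len(dims))])

def dual_newton(A, b, c, dims, u, iters=50, tol=1e-12):
    """Method (14): Newton iterations for the reduced dual system (8)."""
    for _ in range(iters):
        y = c - A.T @ u                           # y(u) = c - A^T u
        Phi = A.T @ A + blk_arr(y, dims)          # Phi(y) = A^T A + Arr(y)
        x = np.linalg.solve(Phi, A.T @ b)         # x(u) from Eq. (7)
        if np.linalg.norm(A @ x - b) < tol:
            break
        G = A @ np.linalg.solve(Phi, blk_arr(x, dims) @ A.T)  # G(u), see (14)
        u = u - np.linalg.solve(G, A @ x - b)
    return u, x, y
```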

It is possible to consider a more general iterative process than (9), in which both variables u and y are updated at each iteration. In order to construct the method, we add to the right-hand side of Eq. (7) the residual of the third equality from (6), multiplied by some parameter \(\tau > 0\). As a result, we obtain instead of (7) the system of equations

$$\begin{aligned} {\mathbf \Phi }(y) x = \mathcal{A}^Tb + \tau \left( y + \mathcal{A}^T u - c \right) \end{aligned}$$
(15)

with the solution \(x(u,y) = {\mathbf \Phi }^{-1} (y) f(u,y)\), where \( f(u, y) = \mathcal{A}^Tb + \tau \left( y + \mathcal{A}^T u - c \right) . \)

Substituting x(u, y) into the first and second equalities from (6), we obtain the system of \(n+m\) equations

$$\begin{aligned} \mathcal{A}{\mathbf \Phi }^{-1}(y) f(u,y) - b = 0_m, \qquad {\mathbf {Arr}}(y) {\mathbf \Phi }^{-1} (y) f(u,y) = 0_n. \end{aligned}$$
(16)

Denote \(w = [u;y]\) and \({\mathbf \Psi }(w) = \left[ {\mathbf \Psi }^{(1)} (w); \, {\mathbf \Psi }^{(2)} (w)\right] \), where

$$ {\mathbf \Psi }^{(1)} (w) = \mathcal{A}{\mathbf \Phi }^{-1}(y) f(u,y) - b, \quad {\mathbf \Psi }^{(2)} (w) = {\mathbf {Arr}}(y) {\mathbf \Phi }^{-1} (y) f(u,y). $$

Lemma 1

Let the point \(w = [u;y]\in \mathcal{F}_D\) be such that the matrix \({\mathbf \Phi }(y)\) is nonsingular. Then the matrix \({\mathbf \Psi }_w (w)\) has the form

$$\begin{aligned} {\mathbf \Psi }_w (w) = \left[ \begin{array}{cc} \tau \mathcal{A}{\mathbf \Phi }^{-1} \mathcal{A}^T &{} \ \mathcal{A}{\mathbf \Phi }^{-1}\left[ \tau I_n - {\mathbf {Arr}}(x(w)) \right] \\ \tau {\mathbf {Arr}}(y) {\mathbf \Phi }^{-1} \mathcal{A}^T &{} \ \left[ I_n - {\mathbf {Arr}}(y){\mathbf \Phi }^{-1} \right] {\mathbf {Arr}}(x(w)) + \tau {\mathbf {Arr}}(y){\mathbf \Phi }^{-1} \end{array} \right] . \end{aligned}$$
(17)

where \({\mathbf \Phi }^{-1} = {\mathbf \Phi }^{-1}(y)\).

Proof

Differentiating \({\mathbf \Psi }^{(1)}\), we obtain: \({\mathbf \Psi }^{(1)}_u (w) = \mathcal{A}x_u(w)\), \({\mathbf \Psi }^{(1)}_y (w) = \mathcal{A}x_y(w)\). Moreover,

$$ {\mathbf \Psi }^{(2)}_u (w) = {\mathbf {Arr}}\, (y) \ x_u(w), \quad {\mathbf \Psi }^{(2)}_y (w) = {\mathbf {Arr}}\, (x(w)) + {\mathbf {Arr}}\, (y) \ x_y(w). $$

Because of (15), the function x(w) satisfies the identity

$$\begin{aligned} \left[ \mathcal{A}^T \mathcal{A}+ {\mathbf {Arr}}\, (y) \right] x(w) \equiv \mathcal{A}^Tb + \tau \left( y + \mathcal{A}^T u - c \right) . \end{aligned}$$
(18)

Differentiating (18) with respect to u, we obtain

$$\begin{aligned} \left[ \mathcal{A}^T \mathcal{A}+ {\mathbf {Arr}}\, (y) \right] x_u(w) = \tau \mathcal{A}^T. \end{aligned}$$
(19)

Respectively, differentiating (18) with respect to y, we derive the equality \( \mathcal{A}^T \mathcal{A}\, x_y(w) + \frac{\partial }{\partial \, y} \, {\mathbf {Arr}}\, (y) x(w) = \tau I_n \) or

$$\begin{aligned} \mathcal{A}^T \mathcal{A}\, x_y(w) + {\mathbf {Arr}}\, ( x(w)) + {\mathbf {Arr}}\, (y) \, x_y(w) = \tau I_n. \end{aligned}$$
(20)

Equalities (19) and (20) can be written as

$$ {\mathbf \Phi }(y) x_u(w) = \tau \mathcal{A}^T, \quad {\mathbf \Phi }(y) x_y(w) + {\mathbf {Arr}}\, (x(w)) = \tau I_n. $$

If \({\mathbf \Phi }(y)\) is a nonsingular matrix, we derive from these equalities that

$$ x_u(w) = \tau {\mathbf \Phi }^{-1} (y)\mathcal{A}^T, \quad x_y(w) = {\mathbf \Phi }^{-1}(y)\left[ \tau I_n - {\mathbf {Arr}}\, (x(w)) \right] . $$

Thus, the matrix \({\mathbf \Psi }_w (w)\) has the form (17).    \(\square \)

If the matrix \({\mathbf \Psi }_w (w)\) is nonsingular for all points w in some neighbourhood of the solution \(w_*\) of problem (3), then it is possible to apply Newton's method to the system of nonlinear equations (16). We obtain the dual iterative method

$$\begin{aligned} w_{k+1} = w_k - {\mathbf \Psi }_w^{-1}(w_k) {\mathbf \Psi }(w_k). \end{aligned}$$
(21)

The starting point \(w_0\) must be taken from some neighbourhood of the solution \(w_*\).

Denote \({\mathbf \Gamma }(y) = \mathcal{A}{\mathbf \Phi }^{-1} (y)\mathcal{A}^T\), \({\mathbf{K}}_1 (y) = \mathcal{A}\, {\mathbf \Phi }^{-1} (y)\), \({\mathbf{K}}_2 (y) = {\mathbf {Arr}}(y) {\mathbf \Phi }^{-1} (y)\). Then the matrix (17) can be written as

$$\begin{aligned} {\mathbf \Psi }_w (w) = \left[ \begin{array}{cc} \tau {\mathbf \Gamma }(y) &{} \ {\mathbf{K}}_1(y)\left[ \tau I_n - {\mathbf {Arr}}(x(w)) \right] \\ \tau {\mathbf {Arr}}(y) {\mathbf{K}}_1^T (y) &{} \ \left[ I_n - {\mathbf{K}}_2(y) \right] {\mathbf {Arr}}(x(w)) + \tau {\mathbf{K}}_2(y) \end{array} \right] . \end{aligned}$$

Let the point \(w\in \mathcal{F}_D\) be such that the matrix \({\mathbf \Psi }_w (w)\) is nonsingular. In this case the matrix \({\mathbf \Gamma }(y) \) is also nonsingular. Denote \({\mathbf{K}}_3(y) = {\mathbf{K}}_1^T (y){\mathbf \Gamma }^{-1} {\mathbf{K}}_1 (y)\) and

$$ \begin{array}{c} {\mathbf \Omega }(w) = \left[ I_n - {\mathbf{K}}_2(y) \right] {\mathbf {Arr}}(x(w)) + \tau {\mathbf{K}}_2(y) - {\mathbf {Arr}}(y) {\mathbf{K}}_3(y) \left[ \tau I_n - {\mathbf {Arr}}(x(w)) \right] . \end{array} $$

Then, by the Frobenius formula, we obtain

$$ {\mathbf \Psi }_w^{-1} (w) = \left[ \begin{array}{cc} {\mathbf{P}}_1 &{} {\mathbf{P}}_2 \\ {\mathbf{P}}_3 &{} {\mathbf{P}}_4 \end{array} \right] , $$

where \({\mathbf{P}}_4 = {\mathbf \Omega }^{-1}\) and

$$ \begin{array}{c} {\mathbf{P}}_1 = \tau ^{-1} {\mathbf \Gamma }^{-1} + \tau ^{-1} {\mathbf \Gamma }^{-1} {\mathbf{K}}_1(y) \left[ \tau I_n - {\mathbf {Arr}}(x(w)) \right] {\mathbf \Omega }^{-1} {\mathbf {Arr}}(y) {\mathbf{K}}_1^T (y) {\mathbf \Gamma }^{-1}, \end{array} $$
$$ {\mathbf{P}}_2 = - \tau ^{-1} {\mathbf \Gamma }^{-1} {\mathbf{K}}_1(y)\left[ \tau I_n - {\mathbf {Arr}}(x(w)) \right] {\mathbf \Omega }^{-1}, \quad {\mathbf{P}}_3 = - {\mathbf \Omega }^{-1} {\mathbf {Arr}}(y) {\mathbf{K}}_1^T (y) {\mathbf \Gamma }^{-1}. $$

Finally, denote \({\mathbf{W}}= {\mathbf \Gamma }^{-1} {\mathbf{K}}_1(y)\left[ \tau I_n - {\mathbf {Arr}}(x(w)) \right] {\mathbf \Omega }^{-1}\). It follows from the previous formulas that \( {\mathbf{P}}_2 = - \tau ^{-1} {\mathbf{W}}\) and \( {\mathbf{P}}_1 = \tau ^{-1} \left[ I_m - {\mathbf{W}}{\mathbf {Arr}}(y) {\mathbf{K}}_1^T (y) \right] {\mathbf \Gamma }^{-1}. \)

Therefore, the formulas (21) for updating the point \([u_k; y_k]\) are the following:

$$ u_{k+1} = u_k - \tau ^{-1} \left[ \left( I_m - {\mathbf{W}}{\mathbf {Arr}}(y_k) {\mathbf{K}}_1^T (y_k) \right) {\mathbf \Gamma }^{-1} \left( \mathcal{A}x_k - b \right) + {\mathbf{W}}{\mathbf {Arr}}(y_k) x_k\right] , $$
$$ y_{k+1} = y_k + {\mathbf \Omega }^{-1} \left[ {\mathbf {Arr}}(y_k) {\mathbf{K}}_1^T (y_k) {\mathbf \Gamma }^{-1} \left( \mathcal{A}x_k - b \right) - {\mathbf {Arr}}(y_k) x_k \right] , $$

where \(x_k = x(w_k) = {\mathbf \Phi }^{-1}(y_k) f(w_k)\), and the matrices \({\mathbf{W}}\), \({\mathbf \Gamma }\) and \({\mathbf \Omega }\) are evaluated at \(w_k = [u_k; y_k]\).
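Alternatively to the explicit block formulas above, one step of method (21) can be prototyped by assembling the Jacobian (17) directly; a self-contained sketch in the same style (Python/NumPy, names are ours):

```python
import numpy as np
from scipy.linalg import block_diag

def arr(x):                                        # arrow matrix Arr(x)
    A = x[0] * np.eye(x.size)
    A[0, 1:] = A[1:, 0] = x[1:]
    return A

def blk_arr(y, dims):                              # block-diagonal Arr(y)
    pos = np.cumsum([0] + list(dims))
    return block_diag(*[arr(y[pos[i]:pos[i + 1]]) for i in range(len(dims))])

def newton_step_w(A, b, c, dims, u, y, tau):
    """One iteration of (21): w_{k+1} = w_k - Psi_w(w_k)^{-1} Psi(w_k)."""
    m, n = A.shape
    Arr_y = blk_arr(y, dims)
    Phi = A.T @ A + Arr_y                          # Phi(y)
    f = A.T @ b + tau * (y + A.T @ u - c)          # f(u, y) from (15)
    x = np.linalg.solve(Phi, f)                    # x(u, y) solves (15)
    Psi = np.concatenate([A @ x - b, Arr_y @ x])   # residuals of (16)
    Pinv = np.linalg.inv(Phi)
    Arr_x = blk_arr(x, dims)
    J = np.block([                                 # Jacobian (17)
        [tau * A @ Pinv @ A.T, A @ Pinv @ (tau * np.eye(n) - Arr_x)],
        [tau * Arr_y @ Pinv @ A.T,
         (np.eye(n) - Arr_y @ Pinv) @ Arr_x + tau * Arr_y @ Pinv],
    ])
    step = np.linalg.solve(J, Psi)
    return u - step[:m], y - step[m:]
```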

4 Non-degeneracy in the Dual Problem

Let us show that the matrix \({\mathbf \Phi }(y)\) is nonsingular if the point \([u,y]\in \mathcal{F}_D\) is non-degenerate.

Definition 1

[1]. The point \([u, y] \in \mathcal{F}_D\) is called non-degenerate if \( \mathcal{T}_\mathcal{K}(y) + \mathcal{R}(\mathcal{A}^T) = \mathbb R^n, \) where \(\mathcal{T}_\mathcal{K}(y)\) is the tangent space to the cone \(\mathcal{K}\) at the point y and \(\mathcal{R}(\mathcal{A}^T)\) is the image of the matrix \(\mathcal{A}^T\).

Let \([u,y]\in \mathcal{F}_D\), and let the vector \(y\in \mathcal{K}\) be partitioned into three blocks of components: \( y = \left[ y_F; y_I; y_N \right] . \) We assume for definiteness that these blocks consist of the components \(y_i\) ordered in the following way: \(y_F = \left[ y_1; \ \ldots ; \ y_{r_F}\right] \), \(y_I = \left[ y_{r_F+1}; \ \ldots ; \ y_{r_F+r_I}\right] \), \(y_N = \left[ y_{r_F+ r_I+1}; \ \ldots ; \ y_{r_F+r_I+ r_N}\right] \). Recall that \(r = r_F + r_I + r_N\).

The partition of the vector y induces the partition of the index set \(J^r = [1: r]\) into three subsets:

$$ J^r_F (y) = [1, \ldots , r_F], \quad J^r_I (y) = [r_F + 1, \ldots , r_F+ r_I], \quad J^r_N(y) = [r_F+r_I + 1, \ldots , r]. $$

If \(i\in J^r_F (y)\), then \(y_i \not = 0_{n_i}\) and \(y_i \in \partial K^{n_i}_2\), where \(\partial K^{n_i}_2\) is the boundary of the cone \(K^{n_i}_2\). If \(i \in J^r_I (y)\), then \(y_i = 0_{n_i}\). Finally, in the case where \( i \in J^r_N (y)\), the inclusion \(y_i \in \text{ int } K^{n_i}_2\) holds. According to the partition of the vector y into three blocks of components, we partition also the matrix \(\mathcal{A}= \left[ \mathcal{A}_F, \mathcal{A}_I, \mathcal{A}_N \right] \) and the vector \(c= \left[ c_F; c_I; c_N \right] \).

For any nonzero component \(y_i\), \(i\in J^r\), the following spectral decomposition

$$\begin{aligned} y_i = \theta _{i,1} {\mathbf{d}}_{i,1} + \theta _{i,n_i} {\mathbf{d}}_{i,n_i} \end{aligned}$$
(22)

takes place [1]. Here the pair of vectors

$$ {\mathbf{d}}_{i,1} = \frac{1}{\sqrt{2}} \left[ 1; \frac{\bar{y}_i}{\Vert \bar{y}_i\Vert } \right] , \qquad {\mathbf{d}}_{i,n_i} = \frac{1}{\sqrt{2}} \left[ 1; -\frac{\bar{y}_i}{\Vert \bar{y}_i\Vert } \right] , $$

is a Jordan frame. The coefficients \(\theta _{i,1}\) and \(\theta _{i,n_i}\) in (22) are the following:

$$ \theta _{i,1} = \frac{1}{\sqrt{2}} \left( y_i^0 + \Vert \bar{y}_i\Vert \right) , \qquad \theta _{i,n_i} = \frac{1}{\sqrt{2}} \left( y_i^0 - \Vert \bar{y}_i\Vert \right) . $$

The vectors \({\mathbf{d}}_{i,1}\) and \({\mathbf{d}}_{i,n_i}\) are orthogonal to each other and have unit length.

If \(y_i \in K^{n_i}_2\), then \(\theta _{i,1}\ge 0\) and \(\theta _{i,n_i} \ge 0\). In the case, where \(y_i \not = 0_{n_i}\) and \(y_i \in \partial K^{n_i}_2\), the equality \(y_i^0 = \Vert \bar{y}_i \Vert \) holds. Hence, only the first coefficient \(\theta _{i,1} = \sqrt{2} y_i^0 = \sqrt{2} \Vert \bar{y}_i \Vert \) differs from zero.

Let us assume that \(y_i \in K^{n_i}_2\) and \(y_i \not =0_{n_i}\). The matrix \(\text{ Arr } \,(y_i)\) is symmetric. Denote by \(H_i\) the orthogonal matrix whose columns are eigenvectors of \(\text{ Arr } \,(y_i)\). The vectors \({\mathbf{d}}_{i,1}\) and \({\mathbf{d}}_{i,n_i}\) are among the eigenvectors of \(\text{ Arr } \,(y_i)\). The matrix \(H_i\) can be taken in the following form

$$\begin{aligned} H_i = \left[ {\mathbf{d}}_{i,1}, h_{i,2}, \, \ldots , \, h_{i,n_i-1}, \, {\mathbf{d}}_{i, n_i} \right] . \end{aligned}$$

The eigenvectors \(h_{i,2}, \ldots , h_{i, n_i-1}\) are arbitrary vectors from the subspace

$$ \mathbb R^{n_i}_0 = \left\{ z = [z^0; \bar{z}] \in \mathbb R^{n_i}: \ z^0 = 0 \right\} . $$

All these vectors have unit length and are orthogonal to each other. Moreover, they are orthogonal to the vectors \({\mathbf{d}}_{i,1}\) and \({\mathbf{d}}_{i,n_i}\).

The eigenvalues \(y_i^0 + \Vert \bar{y}_i\Vert \) and \(y_i^0 - \Vert \bar{y}_i\Vert \) correspond to the eigenvectors \({\mathbf{d}}_{i,1}\) and \({\mathbf{d}}_{i,n_i}\), respectively. The eigenvalue \(y_i^0\) has multiplicity \(n_i -2\) and corresponds to the eigenvectors \(h_{i,2}, \ldots , h_{i, n_i-1}\). Denoting by \(\varSigma _i\) the diagonal matrix

$$\begin{aligned} \varSigma _i = \text{ Diag } \left( \sqrt{2}\theta _{i,1}, y_i^0, \ldots , y_i^0, \sqrt{2}\theta _{i, n_i} \right) , \end{aligned}$$

we have \(\text{ Arr } \, (y_i) = H_i \varSigma _i H_i^T. \)
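The Jordan frame and the decomposition \(\text{ Arr } \, (y_i) = H_i \varSigma _i H_i^T \) can be formed explicitly. A sketch for the case \(\bar{y}_i \ne 0\) (Python/NumPy; names are ours, and the middle eigenvectors \(h_{i,j}\) are produced by a QR factorization):

```python
import numpy as np

def jordan_frame(y):
    """Return H, Sigma with Arr(y) = H @ Sigma @ H.T, y = [y0; y_bar], y_bar != 0."""
    n, y0, nrm = y.size, y[0], np.linalg.norm(y[1:])
    w = y[1:] / nrm
    d1 = np.concatenate(([1.0], w)) / np.sqrt(2.0)    # d_{i,1}
    dn = np.concatenate(([1.0], -w)) / np.sqrt(2.0)   # d_{i,n_i}
    # h_{i,2}, ..., h_{i,n_i-1}: orthonormal complement of span{d1, dn}
    B = np.column_stack([d1, dn, np.vstack([np.zeros(n - 1), np.eye(n - 1)])])
    Q, _ = np.linalg.qr(B)
    H = np.column_stack([d1, Q[:, 2:n], dn])
    eigs = np.concatenate(([y0 + nrm], np.full(n - 2, y0), [y0 - nrm]))
    return H, np.diag(eigs)

def arr(x):                                           # arrow matrix, as above
    A = x[0] * np.eye(x.size)
    A[0, 1:] = A[1:, 0] = x[1:]
    return A

y = np.array([3.0, 1.0, 1.0, 1.0])
H, Sigma = jordan_frame(y)
assert np.allclose(H @ H.T, np.eye(4))                # H is orthogonal
assert np.allclose(H @ Sigma @ H.T, arr(y))
```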

If \(i\in J^r_I(y)\), then \(y_i = 0_{n_i}\). In this case the identity matrix \(I_{n_i}\) can be taken as the orthogonal matrix \(H_i\), and, evidently, \(\varSigma _i = 0_{n_in_i}\) in this case.

Introduce the block-diagonal matrices

$$ {\mathbf{H}}_F = \text{ DIAG } \, \left[ H_1, \ \ldots , \ H_{r_F} \right] , \quad {\mathbf{H}}_I = \text{ DIAG } \, \left[ H_{r_F+1}, \ \ldots , \ H_{r_F+r_I} \right] , \quad {\mathbf{H}}_N = \text{ DIAG } \, \left[ H_{r_F+r_I+1}, \ \ldots , \ H_{r} \right] . $$

The matrices \({\mathbf{H}}_F\), \({\mathbf{H}}_I\) and \({\mathbf{H}}_N\) are orthogonal. In the same way we combine the diagonal matrices \(\varSigma _i\):

$$\begin{aligned} {\mathbf \varSigma }_F = \text{ DIAG } \,\left[ \varSigma _1, \ldots , \varSigma _{r_F}\right] , \quad {\mathbf \varSigma }_I = \text{ DIAG } \, \left[ \varSigma _{r_F+1}, \ldots , \varSigma _{r_F+r_I}\right] , \quad {\mathbf \varSigma }_N = \text{ DIAG } \, \left[ \varSigma _{r_F+r_I+1}, \ldots , \varSigma _{r}\right] . \end{aligned}$$

Let \(\mathcal{A}^{\mathbf{H}}_F = \mathcal{A}_F {\mathbf{H}}_F\), and let \(\tilde{\mathcal{A}}^{{\mathbf{H}}}_F\) be the matrix obtained from \(\mathcal{A}^{{\mathbf{H}}}_F\) by removing all columns except the first columns of the matrices \(A_i H_i\), \(i\in J^r_F(y)\). The matrix \(\tilde{\mathcal{A}}^{{\mathbf{H}}}_F\) has dimension \(m \times r_F\). Denote \(\mathcal{A}^{{\mathbf{H}}}_{FI} = \left[ \tilde{\mathcal{A}}^{{\mathbf{H}}}_F, \mathcal{A}^{{\mathbf{H}}}_I \right] \), where \(\mathcal{A}^{{\mathbf{H}}}_I = \mathcal{A}_I {\mathbf{H}}_I\). The following criterion of non-degeneracy of the point [u; y] is valid [1].

Proposition 2

The point \([u,y]\in \mathcal{F}_D\) is non-degenerate if and only if the columns of the matrix \(\mathcal{A}^{{\mathbf{H}}}_{FI}\) are linearly independent.

It follows from Proposition 2 that at a non-degenerate point [u; y] the inequality \(r_F + n_I \le m\) takes place, where \(n_I = \sum _{i\in J^r_I(y)} n_i\).
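In computations, the criterion of Proposition 2 amounts to a full-column-rank test; a sketch (Python/NumPy; it assumes the matrix \(\mathcal{A}^{{\mathbf{H}}}_{FI}\) has already been assembled from the identified index sets and Jordan frames):

```python
import numpy as np

def dual_nondegenerate(A_H_FI):
    """Proposition 2: [u; y] is non-degenerate iff A^H_FI has full column rank."""
    m, cols = A_H_FI.shape       # cols = r_F + n_I; necessarily cols <= m
    return cols <= m and np.linalg.matrix_rank(A_H_FI) == cols
```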

Proposition 3

[6] Let the point \([u,y]\in \mathcal{F}_D\) be non-degenerate. Then the matrix \({\mathbf \Phi }(y)\) is nonsingular.

We call the dual problem (3) non-degenerate if all points \([u,y]\in \mathcal{F}_D\) are non-degenerate. Below we suppose that problem (3) is non-degenerate.

5 Local Convergence of the Dual Methods

Let \(x_*\) and \([u_*,y_*]\) be the solutions of problems (2) and (3), respectively. We assume without loss of generality that the vector \(x_*\) is partitioned into three blocks of components: \( x_* = \left[ x_{*,F}; \, x_{*,I}; \, x_{*,N} \right] \). The numbers of components in the blocks \(x_{*,F}\), \(x_{*,I}\) and \(x_{*,N}\) are equal to \(r_F\), \(r_I\) and \(r_N\), respectively. Each component \(x_{*,i}\) from the block \(x_{*,F}\) belongs to the boundary of the cone \(K^{n_i}_2\). Each component \(x_{*,i}\) from the block \(x_{*,I}\) is an interior point of \(K^{n_i}_2\). All \(x_{*,i}\) from the block \(x_{*,N}\) are zero vectors.

Besides, let the vector \(y_*\) be decomposed into blocks of components \( y_* = \left[ y_{*,F}; \, y_{*,I}; \, y_{*,N} \right] \), where the numbers of components in the blocks are equal to \(\bar{r}_F\), \(\bar{r}_I\) and \(\bar{r}_N\), respectively. But unlike \(x_*\), the components \(y_{*,i}\) from the block \(y_{*,I}\) are zero vectors, and, on the contrary, each \(y_{*,i}\) from the block \(y_{*,N}\) is an interior point of the cone \(K^{n_i}_2\).

According to (5), the complementarity condition \(x_{*,i} \circ y_{*,i} = 0_{n_i}\), \(1\le i\le r\), holds. The strict complementarity condition means that additionally \( x_{*,i} + y_{*,i} \in \, \text{ int } \, K^{n_i}_2\). In this case \(\bar{r}_F = r_F\), \(\bar{r}_I = r_I\) and \(\bar{r}_N = r_N\). Furthermore, the matrices \(\text{ Arr } (x_{*,i})\) and \(\text{ Arr } (y_{*,i})\) commute. The following decompositions

$$\begin{aligned} \text{ Arr } (x_{*,i}) = H_i \Lambda _{i} H_i^T, \qquad \text{ Arr } (y_{*,i}) = H_i \varSigma _{i} H_i^T, \end{aligned}$$
(23)

take place. Here \(H_i\) is an orthogonal matrix, and \(\Lambda _{i}\) and \(\varSigma _{i}\) are diagonal matrices with eigenvalues of \(\text{ Arr } (x_{*,i})\) and \(\text{ Arr } (y_{*,i})\) at their diagonals, respectively. Below we set \(r_{FI} = r_F + r_I\) and \(J^r_F = [1: r_F]\), \(J^r_I = [r_F+1: r_{FI}]\), \(J^r_N = [r_{FI} + 1:r]\), \(J^r_{FI} = J^r_F \cup J^r_I\).

Similarly to (22), for \(x_{*,i}\) the spectral decomposition

$$\begin{aligned} x_{*,i} = \eta _{i,1} {\mathbf{e}}_{i,1} + \eta _{i, n_i} {\mathbf{e}}_{i, n_i} \end{aligned}$$
(24)

holds, where

$$ {\mathbf{e}}_{i,1} = \frac{1}{\sqrt{2}} \left[ 1; \frac{\bar{x}_{*,i}}{\Vert \bar{x}_{*,i}\Vert } \right] , \qquad {\mathbf{e}}_{i,n_i} = \frac{1}{\sqrt{2}} \left[ 1; -\frac{\bar{x}_{*,i}}{\Vert \bar{x}_{*,i}\Vert } \right] $$

are the frame vectors. The coefficients \(\eta _{i,1}\) and \(\eta _{i,n_i}\) in (24) are the following:

$$ \eta _{i,1} = \frac{1}{\sqrt{2}} \left( x_{*,i}^0 + \Vert \bar{x}_{*,i}\Vert \right) , \qquad \eta _{i,n_i} = \frac{1}{\sqrt{2}} \left( x_{*,i}^0 - \Vert \bar{x}_{*,i}\Vert \right) . $$

Both \({\mathbf{e}}_{i,1}\) and \({\mathbf{e}}_{i,n_i}\) are unit vectors orthogonal to each other.

The orthogonal matrix \(H_i\) in (23) has the form

$$\begin{aligned} H_i = \left[ {\mathbf{e}}_{i,1}, h_{i,2}, \ldots , h_{i,n_i-1}, {\mathbf{e}}_{i,n_i} \right] , \end{aligned}$$
(25)

where \(h_{i,2}, \ldots , h_{i, n_i-1}\) are unit vectors from the subspace \(\mathbb {R}^{n_i}_0\). The matrix \(\Lambda _i = \text{ Diag } \left( \sqrt{2} \eta _{i,1}, \, x_{*,i}^0, \, \ldots , \, x_{*,i}^0, \, \sqrt{2}\eta _{i,n_i} \right) \) is diagonal with the eigenvalues of \(\text{ Arr } (x_{*,i})\) on its diagonal, \(i \in J^r_{FI}\). Note that for \(i \in J^r_F\) the last eigenvalue is zero, that is, \( \Lambda _i = \text{ Diag } \left( 2 x_{*,i}^0, \, x_{*,i}^0, \, \ldots , \, x_{*,i}^0, \, 0\right) . \)

At the solutions \(x_*\) and \(y_*\), according to the complementarity condition, the vector \({\mathbf{e}}_{i,n_i}\) must be collinear to the vector \({\mathbf{d}}_{i,1}\) from the spectral decomposition (22) for \(y_{*,i}\). Hence, the orthogonal matrix (25) can be used also in the spectral decomposition of the matrix \(\text{ Arr } (y_{*,i})\), \(i\in J^r_{FI}\), i.e. \(\text{ Arr } (y_{*,i}) = H_i \varSigma _i H_i^T\), where \(\varSigma _i\) is a diagonal matrix with the eigenvalues of the matrix \(\text{ Arr }(y_{*,i})\) on its diagonal. For \(i\in J^r_F\) the matrix \(\varSigma _i\) has the form \(\varSigma _i = \text{ Diag } \left( 0, y_{*,i}^0, \, \ldots , \, y_{*,i}^0, \, 2 y_{*,i}^0 \right) \). The matrix \(\Lambda _i\) is zero for \(i\in J^r_N\), and, vice versa, the matrix \(\varSigma _i\) is zero when \(i\in J^r_I\).

In addition, let the orthogonal matrix \(H_i\) for \(i\in J^r_N\) be defined by the matrix \(\text{ Arr } (y_{*,i})\), that is, \(\text{ Arr } (y_{*,i}) = H_i \varSigma _i H_i^T\). Then \( H_i = \left[ {\mathbf{d}}_{i,1}, h_{i,2}, \ldots , h_{i,n_i-1}, {\mathbf{d}}_{i,n_i} \right] \) and \( \varSigma _i = \text{ Diag }\left( y_{*,i}^0 + \Vert \bar{y}_{*,i}\Vert , \, y_{*,i}^0, \, \ldots , \, y_{*,i}^0, \, y_{*,i}^0 - \Vert \bar{y}_{*,i}\Vert \right) . \) Moreover, \(\Lambda _i\) is a zero matrix for \(i\in J^r_N\).

Let \({\mathbf \Lambda }= \text{ DIAG } \left( {\mathbf \Lambda }_F, {\mathbf \Lambda }_I, {\mathbf \Lambda }_N \right) \), \({\mathbf \varSigma }= \text{ DIAG } \left( {\mathbf \varSigma }_F, {\mathbf \varSigma }_I, {\mathbf \varSigma }_N \right) \), where

$$ {\mathbf \Lambda }_F = \text{ DIAG } \left( \Lambda _1, \ldots , \Lambda _{r_F} \right) , \quad {\mathbf \varSigma }_F = \text{ DIAG } \left( \varSigma _1, \ldots , \varSigma _{r_F} \right) $$

and

$$ {\mathbf \Lambda }_I = \text{ DIAG } \left( \Lambda _{r_F+1}, \ldots , \Lambda _{r_{FI}} \right) , \quad {\mathbf \varSigma }_I = \text{ DIAG } \left( \varSigma _{r_F+1}, \ldots , \varSigma _{r_{FI}} \right) , $$
$$ {\mathbf \Lambda }_N = \text{ DIAG } \left( \Lambda _{r_{FI}+1}, \ldots , \Lambda _{r} \right) , \quad {\mathbf \varSigma }_N = \text{ DIAG } \left( \varSigma _{r_{FI}+1}, \ldots , \varSigma _{r} \right) . $$

Set also \({\mathbf{H}}= \text{ DIAG } \left( H_1, \ldots , H_r \right) \) and denote: \(\mathcal{A}^{{\mathbf{H}}}_F = \mathcal{A}_F {\mathbf{H}}_F\), \(\mathcal{A}^{{\mathbf{H}}}_I = \mathcal{A}_I {\mathbf{H}}_I\), \(\mathcal{A}^{{\mathbf{H}}}_N = \mathcal{A}_N {\mathbf{H}}_N\), \(\mathcal{A}^{{\mathbf{H}}} = \mathcal{A}{\mathbf{H}}\). For \(\mathcal{A}^{{\mathbf{H}}}\) the decomposition \(\mathcal{A}^{{\mathbf{H}}} = \left[ \mathcal{A}^{{\mathbf{H}}}_F, \, \mathcal{A}^{{\mathbf{H}}}_I, \, \mathcal{A}^{{\mathbf{H}}}_N \right] \) is valid. With the introduced notations the matrix \({\mathbf{G} \,}(u_*)\) can be written in the form

$$\begin{aligned} {\mathbf{G} \,}(u_*) = \mathcal{A}^{{\mathbf{H}}} {\mathbf \Phi }^{-{\mathbf{H}}}(y_*) {\mathbf \Lambda }(x_*) (\mathcal{A}^{{\mathbf{H}}})^T, \qquad {\mathbf \Phi }^{-{\mathbf{H}}}(y_*) = \left( {\mathbf \Phi }^{\mathbf{H}}(y_*) \right) ^{-1}, \end{aligned}$$
(26)

where \(y_* = y(u_*)\) and \({\mathbf \Phi }^{\mathbf{H}}(y_*) = {\mathbf{H}}^T {\mathbf \Phi }(y_*){\mathbf{H}}\).

By the above, we have

$$ {\mathbf \Phi }^{{\mathbf{H}}}(y_*) = \left[ \begin{array}{ccc} \left( \mathcal{A}^{{\mathbf{H}}}_F\right) ^T \mathcal{A}^{{\mathbf{H}}}_F + {\mathbf \varSigma }_F &{} \left( \mathcal{A}^{{\mathbf{H}}}_F\right) ^T \mathcal{A}^{{\mathbf{H}}}_I &{} \left( \mathcal{A}^{{\mathbf{H}}}_F\right) ^T \mathcal{A}^{{\mathbf{H}}}_N \\ \left( \mathcal{A}^{{\mathbf{H}}}_I\right) ^T \mathcal{A}^{{\mathbf{H}}}_F &{} \left( \mathcal{A}^{{\mathbf{H}}}_I\right) ^T \mathcal{A}^{{\mathbf{H}}}_I &{} \left( \mathcal{A}^{{\mathbf{H}}}_I\right) ^T \mathcal{A}^{{\mathbf{H}}}_N \\ \left( \mathcal{A}^{{\mathbf{H}}}_N\right) ^T \mathcal{A}^{{\mathbf{H}}}_F &{} \left( \mathcal{A}^{{\mathbf{H}}}_N\right) ^T \mathcal{A}^{{\mathbf{H}}}_I &{} \left( \mathcal{A}^{{\mathbf{H}}}_N\right) ^T \mathcal{A}^{{\mathbf{H}}}_N + {\mathbf \varSigma }_N \end{array} \right] . $$

All diagonal entries of the matrix \({\mathbf \varSigma }_N\) are strictly positive. The diagonal matrix \({\mathbf \varSigma }_F\) has exactly \(r_F\) zero entries on its diagonal; these zero entries are the first diagonal elements of the matrices \(\varSigma _i\), \(i \in J^r_F\).

Compute \({\mathbf \Phi }^{-{\mathbf{H}}} (y_*)\). For this purpose we first rearrange rows and columns of the matrix. Suppose that the first columns of the matrices \(A_i^H\), \(i\in J^r_F\), are removed from \(A_i^H\), and the separate sub-matrix \(\tilde{\mathcal{A}}_F^{\mathbf{H}}\) is composed from these first columns. The dimension of \(\tilde{\mathcal{A}}_F^{\mathbf{H}}\) is \(m\times r_F\). Denote by \(\hat{\mathcal{A}}_F^{\mathbf{H}}\) the sub-matrix of the matrix \(\mathcal{A}_F^{\mathbf{H}}\) composed from its remaining columns. Add the sub-matrix \(\tilde{\mathcal{A}}_F^{\mathbf{H}}\) to the matrix \(\mathcal{A}_I^{\mathbf{H}}\), putting it before \(\mathcal{A}_I^{\mathbf{H}}\), and denote the resulting \(m \times (r_F + n_I)\) matrix by \(\mathcal{A}_{FI}^{\mathbf{H}}\). Moreover, denote by \(\hat{\mathbf \varSigma }_F\) the diagonal sub-matrix of the matrix \({\mathbf \varSigma }_F\), from which the first diagonal entries of the matrices \(\varSigma _i\), \(i\in J^r_F\), are eliminated. Let \(\mathbf \Pi \) be a permutation matrix realizing the mentioned rearrangement of rows and columns of \({\mathbf \Phi }^{\mathbf{H}}(y_*)\). Then the matrix \({\mathbf \Phi }^{\mathbf{H}}(y_*)\) can be written in the form

$$\begin{aligned} {\mathbf \Phi }^{{\mathbf{H}}}(y_*) = \mathbf \Pi \left[ \begin{array}{ccc} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F\right) ^T \hat{\mathcal{A}}^{{\mathbf{H}}}_F + \hat{{\mathbf \varSigma }}_F &{} \ \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F\right) ^T {\mathcal{A}}^{{\mathbf{H}}}_{FI} &{} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F\right) ^T \mathcal{A}^{{\mathbf{H}}}_N \\ \left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T \hat{\mathcal{A}}^{{\mathbf{H}}}_F &{} \left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T \mathcal{A}^{{\mathbf{H}}}_{FI} &{} \left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T \mathcal{A}^{{\mathbf{H}}}_N \\ \left( \mathcal{A}^{{\mathbf{H}}}_N\right) ^T \hat{\mathcal{A}}^{{\mathbf{H}}}_F &{} \left( \mathcal{A}^{{\mathbf{H}}}_N\right) ^T \mathcal{A}^{{\mathbf{H}}}_{FI} &{} \ \left( \mathcal{A}^{{\mathbf{H}}}_N\right) ^T \mathcal{A}^{{\mathbf{H}}}_N + {\mathbf \varSigma }_N \end{array} \right] \mathbf \Pi ^T. \end{aligned}$$
(27)

Partition the matrix (27) into four blocks:

$$\begin{aligned} {\mathbf \Phi }^{{\mathbf{H}}}(y_*) = \mathbf \Pi \left[ \begin{array}{cc} \mathcal{W}_{11} &{} \mathcal{W}_{12} \\ \mathcal{W}_{12}^T &{} \mathcal{W}_{22} \end{array} \right] \mathbf \Pi ^T, \end{aligned}$$

where

$$ \mathcal{W}_{11} = \left[ \begin{array}{cc} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F\right) ^T \hat{\mathcal{A}}^{{\mathbf{H}}}_F + \hat{\mathbf \varSigma }_F &{} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F\right) ^T \mathcal{A}^{{\mathbf{H}}}_{FI}\\ \left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T \hat{\mathcal{A}}^{{\mathbf{H}}}_F &{} \left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T \mathcal{A}^{{\mathbf{H}}}_{FI} \end{array} \right] , \quad \mathcal{W}_{12} = \left[ \begin{array}{c} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F\right) ^T \mathcal{A}^{{\mathbf{H}}}_N \\ \left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T \mathcal{A}^{{\mathbf{H}}}_N \end{array} \right] $$

and \( \mathcal{W}_{22} = {\mathbf \varSigma }_N + \left( \mathcal{A}^{{\mathbf{H}}}_N\right) ^T \mathcal{A}^{{\mathbf{H}}}_N. \)

If the non-degeneracy condition holds at the point \([u_*, y_*]\), then according to Proposition 3 the matrix \({\mathbf \Phi }^{{\mathbf{H}}}(y_*)\) is nonsingular; since it is also a sum of positive semidefinite matrices, it is positive definite. Therefore, the diagonal blocks \(\mathcal{W}_{11}\) and \(\mathcal{W}_{22}\) are also positive definite matrices.

Using the Frobenius formula, we obtain

$$\begin{aligned} {\mathbf \Phi }^{-{\mathbf{H}}}(y_*) = \mathbf \Pi \left[ \begin{array}{cc} \mathcal{V}_{11} &{} \mathcal{V}_{12} \\ \mathcal{V}_{12}^T &{} \mathcal{V}_{22} \end{array} \right] \mathbf \Pi ^T, \end{aligned}$$

where

$$\begin{aligned} \mathcal{V}_{11} = \mathcal{W}_{11}^{-1} + \mathcal{W}_{11}^{-1} \mathcal{W}_{12} \mathcal{Z}^{-1} \mathcal{W}_{12}^T \mathcal{W}_{11}^{-1}, \quad \mathcal{V}_{12} = -\mathcal{W}_{11}^{-1} \mathcal{W}_{12} \mathcal{Z}^{-1}, \quad \mathcal{V}_{22} = \mathcal{Z}^{-1} \end{aligned}$$
(28)

and \( \mathcal{Z}=\mathcal{W}_{22} - \mathcal{W}_{12}^T \mathcal{W}_{11}^{-1} \mathcal{W}_{12}. \)
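The Frobenius formula used here and in (28) is the classical \(2\times 2\) block inversion via the Schur complement; it is easily checked numerically (Python/NumPy sketch with random positive definite data):

```python
import numpy as np

def frobenius_inverse(W11, W12, W22):
    """Inverse of the symmetric block matrix [[W11, W12], [W12^T, W22]]."""
    W11_inv = np.linalg.inv(W11)
    Z = W22 - W12.T @ W11_inv @ W12                     # Schur complement
    Z_inv = np.linalg.inv(Z)
    V11 = W11_inv + W11_inv @ W12 @ Z_inv @ W12.T @ W11_inv
    V12 = -W11_inv @ W12 @ Z_inv
    return np.block([[V11, V12], [V12.T, Z_inv]])       # cf. (28)

rng = np.random.default_rng(2)
M = rng.standard_normal((7, 7))
W = M @ M.T + 7.0 * np.eye(7)                           # positive definite
V = frobenius_inverse(W[:4, :4], W[:4, 4:], W[4:, 4:])
assert np.allclose(V @ W, np.eye(7))
```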

First, compute the matrix \(\mathcal{W}_{11}^{-1}\). According to Proposition 2, the matrix \(\left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T \mathcal{A}^{{\mathbf{H}}}_{FI}\) at the non-degenerate point \([u_*,y_*]\) is nonsingular. Denote

$$\begin{aligned} \mathcal{Y}= \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F\right) ^T \hat{\mathcal{A}}^{{\mathbf{H}}}_F + \hat{\mathbf \varSigma }_F - \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F\right) ^T \mathcal{A}^{{\mathbf{H}}}_{FI} \left[ \left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T \mathcal{A}^{{\mathbf{H}}}_{FI} \right] ^{-1} \left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T \hat{\mathcal{A}}^{{\mathbf{H}}}_F. \end{aligned}$$
(29)

Denote also \(\mathcal{P}= \mathcal{A}^{{\mathbf{H}}}_{FI} \left[ \left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T \mathcal{A}^{{\mathbf{H}}}_{FI} \right] ^{-1} \left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T\). The matrix \(\mathcal{P}\) is the orthogonal projector onto the linear sub-space \(\mathcal{L}\) spanned by the columns of the matrix \(\mathcal{A}_{FI}^{\mathbf{H}}\). The matrix \(\mathcal{P}_\perp = I_m - \mathcal{P}\) projects onto the orthogonal complement \(\mathcal{L}^\perp \) of the sub-space \(\mathcal{L}\). By (29)

$$\begin{aligned} \mathcal{Y}= \hat{\mathbf \varSigma }_F + \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F\right) ^T \mathcal{P}_\perp \hat{\mathcal{A}}^{{\mathbf{H}}}_F. \end{aligned}$$
(30)

Let \(\mathcal{E}= \left[ \left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T \mathcal{A}^{{\mathbf{H}}}_{FI} \right] ^{-1}\), \(\mathcal{S}= \hat{\mathcal{A}}^{{\mathbf{H}}}_F \mathcal{Y}^{-1} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F\right) ^T\). With the help of the Frobenius formula, we obtain

$$\begin{aligned} \mathcal{W}_{11}^{-1} = \left[ \begin{array}{cc} \mathcal{Y}^{-1} &{} - \mathcal{Y}^{-1} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F\right) ^T \mathcal{A}_{FI}^{\mathbf{H}}\mathcal{E}\\ - \mathcal{E}(\mathcal{A}_{FI}^{\mathbf{H}})^T \hat{\mathcal{A}}^{{\mathbf{H}}}_F \mathcal{Y}^{-1} &{} \mathcal{E}+ \mathcal{E}(\mathcal{A}_{FI}^{\mathbf{H}})^T \mathcal{S}\mathcal{A}_{FI}^{\mathbf{H}}\mathcal{E}\end{array} \right] . \end{aligned}$$
(31)

The matrix \(\mathcal{P}_\perp \) is idempotent, that is, \(\mathcal{P}_\perp = \mathcal{P}_\perp ^2\). Using the Sherman–Morrison–Woodbury formula, we derive from (30)

$$\begin{aligned} \mathcal{Y}^{-1} = \hat{{\mathbf \varSigma }}_F^{-1} - \hat{{\mathbf \varSigma }}_F^{-1} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F\right) ^T \mathcal{P}_\perp \left[ I_m + \mathcal{P}_\perp \hat{\mathcal{A}}^{{\mathbf{H}}}_F \hat{{\mathbf \varSigma }}_F^{-1} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F\right) ^T \mathcal{P}_\perp \right] ^{-1} \mathcal{P}_\perp \hat{\mathcal{A}}^{{\mathbf{H}}}_F \ \hat{{\mathbf \varSigma }}_F^{-1}. \end{aligned}$$
(32)

Introduce the additional notation \(\hat{\mathcal{A}}^{{\mathbf{H}}}_{FI} = \left[ \hat{\mathcal{A}}^{{\mathbf{H}}}_F, \mathcal{A}^{{\mathbf{H}}}_{FI} \right] \). Then the matrix \(\mathcal{Z}\) can be written in the form

$$\begin{aligned} \mathcal{Z}= {\mathbf \varSigma }_N + \left( \mathcal{A}^{{\mathbf{H}}}_N\right) ^T \left[ I_m - \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI} \mathcal{W}_{11}^{-1} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI}\right) ^T \right] \mathcal{A}^{{\mathbf{H}}}_N. \end{aligned}$$
(33)

It can be seen from (33) that the matrix \(\mathcal{Z}\) is the Schur complement of the positive definite matrix \(\mathcal{W}_{11}\) in (27). Therefore, \(\mathcal{Z}\) is a positive definite matrix too.

Proposition 4

Let \(\hat{\mathcal{S}}= \mathcal{P}+ \mathcal{P}_\perp \mathcal{S}\mathcal{P}_\perp \). Then \(\hat{\mathcal{A}}^{{\mathbf{H}}}_{FI} \mathcal{W}_{11}^{-1} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI}\right) ^T = \hat{\mathcal{S}}\).

Proof

This equality can be obtained by direct calculations.    \(\square \)

Corollary 1

According to (33) \(\mathcal{Z}= {\mathbf \varSigma }_N + \left( \mathcal{A}^{{\mathbf{H}}}_N\right) ^T \left( I_m - \hat{\mathcal{S}}\right) \mathcal{A}^{{\mathbf{H}}}_N\). Since \({\mathbf \varSigma }_N\) is a positive definite diagonal matrix, we obtain by the Sherman–Morrison–Woodbury formula

$$\begin{aligned} \begin{array}{c} \mathcal{Z}^{-1} = {\mathbf \varSigma }_N^{-1} - {\mathbf \varSigma }_N^{-1}\left( \mathcal{A}^{{\mathbf{H}}}_N\right) ^T \left( I - \hat{\mathcal{S}}\right) ^{1/2} \\ \cdot \left[ I + \left( I - \hat{\mathcal{S}}\right) ^{1/2} \mathcal{A}^{{\mathbf{H}}}_N {\mathbf \varSigma }_N^{-1} \left( \mathcal{A}^{{\mathbf{H}}}_N\right) ^T \left( I - \hat{\mathcal{S}}\right) ^{1/2} \right] ^{-1} \left( I - \hat{\mathcal{S}}\right) ^{1/2} \mathcal{A}^{{\mathbf{H}}}_N {\mathbf \varSigma }_N^{-1} \end{array} \end{aligned}$$
(34)

Using the matrices (31) and (34), it is possible, by (28), to compute the matrix \({\mathbf \Phi }^{-{\mathbf{H}}} (y_*)\).

Below we will need the definition of non-degeneracy of a point \(x\in \mathcal{F}_P\) in the primal problem (2).

Definition 2

[1]. The point \(x\in \mathcal{F}_P\) is called non-degenerate if \(\mathcal{T}_{\mathcal{K}} (x) + \mathcal{N}(\mathcal{A}) = \mathbb {R}^n\), where \(\mathcal{T}_{\mathcal{K}} (x)\) is the tangent space to the cone \(\mathcal{K}\) at the point x, and \(\mathcal{N}(\mathcal{A})\) is the null-space of the matrix \(\mathcal{A}\).

Denote by \(H_i^L\), \(i\in J^r_F\), the left \(n_i \times (n_i -1)\) sub-matrix of the matrix \(H_i\). In other words, \(H_i^L\) is the matrix \(H_i\), from which the last column \({\mathbf{e}}_{i, n_i}\) is removed. Denote also \(A_i^{H_L} = A_i H_i^L\). Compose from \(A_i^{H_L}\), \(i\in J^r_F\), the matrix \( \mathcal{A}^{{\mathbf{H}}_L}_F = \left[ A_1^{H_L}, \ldots , A_{r_F}^{H_L} \right] \) of dimension \(m\times (n_F - r_F)\), where \(n_F = \sum _{i\in J^r_F} n_i\). Introduce additionally the matrix \(\mathcal{A}^{{\mathbf{H}}_{L}}_{FI} = \left[ \mathcal{A}^{{\mathbf{H}}_L}_F, \mathcal{A}_I^{\mathbf{H}}\right] \). The following criterion of non-degeneracy of the point \(x\in \mathcal{F}_P\) is valid.

Proposition 5

[1]. The point \(x = \left[ x_F; x_I; x_N \right] \) is non-degenerate if and only if the rows of the matrix \(\mathcal{A}^{{\mathbf{H}}_{L}}_{FI}\) are linearly independent.

Lemma 2

Let \(x_*\in \mathcal{F}_P\) and \([u_*,y_*]\in \mathcal{F}_{D}\) be non-degenerate solutions of problems (2) and (3), respectively. Let also the solutions \(x_*\) and \(y_*\) be strictly complementary. Then the matrix \({\mathbf{G} \,}(u_*) = \mathcal{A}{\mathbf \Phi }^{-1} (y_*) {\mathbf {Arr}}(x(u_*)) \mathcal{A}^T\) is nonsingular.

Proof

Since \(\mathcal{A}x_* = b\) and \({\mathbf {Arr}}\, (y_*) x_* = 0_n\), we have \( \left[ {\mathbf {Arr}}\, (y_*) + \mathcal{A}^T \mathcal{A}\right] x_* = \mathcal{A}^T b. \) Hence, \(x(u_*) = x_*\). Moreover, the expression (26) for \({\mathbf{G} \,}(u_*)\) takes place.

Let us show that the homogeneous system of linear equations

$$\begin{aligned} {\mathbf{G} \,}(u_*) z = 0_m \end{aligned}$$
(35)

has only the zero solution \(z=0_m\). It follows from this that the matrix \({\mathbf{G} \,}(u_*)\) is nonsingular.

Denote \( \hat{\mathcal{A}}^{\mathbf{H}}= \mathcal{A}^{\mathbf{H}}\mathbf \Pi \). By (26), \( {\mathbf{G} \,}(u_*) = \hat{\mathcal{A}}^{\mathbf{H}}\, \hat{\mathbf \Phi }^{-{\mathbf{H}}} (y_*) \hat{\mathbf{\Lambda }} (x_*) \left( \hat{\mathcal{A}}^{\mathbf{H}}\right) ^T, \) where \(\hat{\mathcal{A}}^{\mathbf{H}}= \left[ \hat{\mathcal{A}}_{FI}^{\mathbf{H}}, \mathcal{A}_N^{\mathbf{H}}\right] \) and \(\hat{\mathbf \Phi }^{-{\mathbf{H}}} (y_*) = \mathbf{\Pi }^T {\mathbf \Phi }^{-{\mathbf{H}}} (y_*) \mathbf{\Pi }\). The block-diagonal matrix \(\hat{\mathbf \Lambda }= \hat{\mathbf \Lambda }(x_*)\) is obtained from the matrix \({\mathbf \Lambda }\) by the same rearrangement of rows and columns, that is, \(\hat{\mathbf \Lambda }= \mathbf {\Pi }^T {\mathbf \Lambda }\mathbf {\Pi }\). This matrix can also be written in the form \(\hat{\mathbf \Lambda }= \text{ DIAG } \left[ \hat{\mathbf \Lambda }_{FI}, {\mathbf \Lambda }_N \right] \).

The right lower block \({\mathbf \Lambda }_N\) is a zero matrix, therefore

$$ {\mathbf{G} \,}(u_*)= \hat{\mathcal{A}}^{\mathbf{H}}_{FI} \mathcal{V}_{11} \hat{\mathbf \Lambda }_{FI} \left( \hat{\mathcal{A}}_{FI}^{\mathbf{H}}\right) ^T + \mathcal{A}^{\mathbf{H}}_N \mathcal{V}_{12}^T \hat{\mathbf \Lambda }_{FI} \left( \hat{\mathcal{A}}_{FI}^{\mathbf{H}}\right) ^T. $$

Substituting \(\mathcal{V}_{11}\) and \(\mathcal{V}_{12}\), we derive that \({\mathbf{G} \,}(u_*)\) has the following form

$$ \begin{array}{c} {\mathbf{G} \,}(u_*) = \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI} \mathcal{W}_{11}^{-1}\hat{\mathbf \Lambda }_{FI}\left( \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI}\right) ^T \\ + \left[ \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI} \mathcal{W}_{11}^{-1} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI}\right) ^T -I_m \right] \mathcal{A}^{{\mathbf{H}}}_N \mathcal{Z}^{-1} \left( \mathcal{A}^{{\mathbf{H}}}_{N}\right) ^T \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI} \mathcal{W}_{11}^{-1} \hat{\mathbf \Lambda }_{FI} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI}\right) ^T. \end{array} $$

Denote \(\mathcal{U}= \mathcal{A}^{{\mathbf{H}}}_N \mathcal{Z}^{-1} \left( \mathcal{A}^{{\mathbf{H}}}_{N}\right) ^T\). Then, using Proposition 4, we come to the conclusion that

$$ {\mathbf{G} \,}(u_*) = \left[ I_m - \left( I_m - \hat{\mathcal{S}}\right) \mathcal{U}\right] \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI} \mathcal{W}_{11}^{-1} \hat{\mathbf \Lambda }_{FI} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI}\right) ^T. $$

We multiply the equality (35) from the left by the matrix \(\left( \mathcal{A}^{\mathbf{H}}_{FI}\right) ^T\). Since \(I_m - \hat{\mathcal{S}}= \mathcal{P}_\perp - \mathcal{P}_\perp \mathcal{S}\mathcal{P}_\perp \), we derive that \(\left( \mathcal{A}^{\mathbf{H}}_{FI}\right) ^T \left( I_m - \hat{\mathcal{S}}\right) = 0\). Thus, we have

$$\begin{aligned} \left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T\hat{\mathcal{A}}^{{\mathbf{H}}}_{FI} \mathcal{W}_{11}^{-1}\hat{\Lambda }_{FI} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI}\right) ^T z = 0_{l_1}, \end{aligned}$$
(36)

where \(l_1 = r_F + n_I\).

Assume that \(z\not = 0_m\) and consider separately two possibilities.

1) \(\left( \mathcal{A}^{\mathbf{H}}_{FI} \right) ^T z \not = 0_{l_1}\). In this case, taking into account the expression (31) for the matrix \(\mathcal{W}_{11}^{-1}\), we get

$$\begin{aligned} \left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T\hat{\mathcal{A}}^{{\mathbf{H}}}_{FI} \mathcal{W}_{11}^{-1} = \left( \mathcal{A}^{{\mathbf{H}}}_{FI}\right) ^T \left[ \mathcal{P}_\perp \hat{\mathcal{A}}^{\mathbf{H}}_F \mathcal{Y}^{-1}, \left( I_m - \mathcal{P}_\perp \mathcal{S}\right) \mathcal{A}_{FI}^{\mathbf{H}}\mathcal{E}\right] = \left[ 0_{l_1l_2}, I_{l_1} \right] , \end{aligned}$$

where \(l_2 = n_F - r_F\). Hence, equation (36) reduces to the following one: \( \Lambda _{FI} \left( \mathcal{A}^{\mathbf{H}}_{FI} \right) ^T z = 0_{l_1}, \) where \(\Lambda _{FI}\) is the right lower diagonal \(l_1\times l_1\) sub-matrix of the matrix \(\hat{\Lambda }_{FI}\). Because all diagonal entries of the matrix \(\Lambda _{FI}\) are positive numbers, this equality is not fulfilled when \(\left( \mathcal{A}^{\mathbf{H}}_{FI} \right) ^T z \not = 0_{l_1}\).

2) \(\left( \mathcal{A}^{\mathbf{H}}_{FI} \right) ^T z = 0_{l_1}\). Under this assumption \(z\in \mathcal{L}^\perp \), therefore,

$$\begin{aligned} \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI} \mathcal{W}_{11}^{-1} \hat{\mathbf \Lambda }_{FI} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI} \right) ^T z = \mathcal{P}_\perp \hat{\mathcal{A}}^{{\mathbf{H}}}_F \mathcal{Y}^{-1} \hat{\mathbf \Lambda }_F \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F \right) ^T \mathcal{P}_\perp z. \end{aligned}$$
(37)

By (32), \( \mathcal{P}_\perp \hat{\mathcal{A}}^{{\mathbf{H}}}_F \mathcal{Y}^{-1} = \left( I_m + \hat{\mathcal{C}}\right) ^{-1} \mathcal{P}_\perp \hat{\mathcal{A}}^{{\mathbf{H}}}_F \hat{\mathbf \varSigma }_F^{-1}, \) where \(\hat{\mathcal{C}}= \mathcal{P}_\perp \hat{\mathcal{A}}^{{\mathbf{H}}}_F \hat{\mathbf \varSigma }_F^{-1} (\hat{\mathcal{A}}^{{\mathbf{H}}}_F)^T \mathcal{P}_\perp \). From this and (37) it follows that

$$\begin{aligned} \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI} \mathcal{W}_{11}^{-1} \hat{\mathbf \Lambda }_{FI} \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_{FI} \right) ^T z = \left( I_m + \hat{\mathcal{C}}\right) ^{-1}\mathcal{P}_\perp \hat{\mathcal{A}}^{{\mathbf{H}}}_F \hat{\mathbf \varSigma }_F^{-1} \hat{\mathbf \Lambda }_F \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F \right) ^T \mathcal{P}_\perp z = 0. \end{aligned}$$

But the matrix \(\left( I_m + \hat{\mathcal{C}}\right) ^{-1}\) is positive definite. Therefore, this equality is possible only when \( \mathcal{P}_\perp \hat{\mathcal{A}}^{{\mathbf{H}}}_F \hat{\mathbf \varSigma }_F^{-1} \hat{\mathbf \Lambda }_F \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F \right) ^T \mathcal{P}_\perp z = 0. \)

All diagonal entries of the diagonal matrix \(\hat{\mathbf \varSigma }_F^{-1} \hat{\mathbf \Lambda }_F\) are positive except for \(r_F\) diagonal entries equal to zero. All these zero entries correspond to the last diagonal entries of the matrices \(\Lambda _i\) in the decomposition \(\text{ Arr } (x_{*,i}) = H_i \Lambda _i H_i^T\), \(i\in J^r_F\). Denote by \(\mathcal{A}^{{\mathbf{H}}_L}_F\) the sub-matrix of the matrix \(\mathcal{A}^{{\mathbf{H}}}_F\), from which the last columns of the matrices \(A^{H}_i\) are removed.

Let \(p = \hat{\mathbf \varSigma }_F^{-1} \hat{\mathbf \Lambda }_F \left( \hat{\mathcal{A}}^{{\mathbf{H}}}_F \right) ^T \mathcal{P}_\perp z\). The vector p is non-zero. Indeed, otherwise, because of \(\left( \mathcal{A}^{\mathbf{H}}_{FI}\right) ^T z = 0_{l_1}\), the rows of the matrix \(\left[ \mathcal{A}^{{\mathbf{H}}_L}_F, \mathcal{A}^{\mathbf{H}}_{FI} \right] \) would be linearly dependent. This contradicts the non-degeneracy of the point \(x_*\).

For the same reason the equality \(\mathcal{P}_\perp \hat{\mathcal{A}}^{{\mathbf{H}}}_F p = 0\) is also impossible, since otherwise we obtain a contradiction with Proposition 5.    \(\square \)

Lemma 3

Let the assumptions of Lemma 2 hold. Then the matrix \({\mathbf \Psi }_w (w_*)\) is nonsingular, where \(w_* = [u_*; y_*]\) is the solution of problem (3).

Proof

Multiplying the second block column of the matrix \({\mathbf \Psi }_w(w_*)\) from the right by \(\mathcal{A}^T\) and subtracting the result from the first block column, we obtain the matrix

$$\begin{aligned} \left[ \begin{array}{cc} \mathcal{A}{\mathbf \Phi }^{-1}{\mathbf {Arr}}(x_*)\mathcal{A}^T &{} \ \mathcal{A}{\mathbf \Phi }^{-1}\left[ \tau I_n - {\mathbf {Arr}}(x_*) \right] \\ \left[ {\mathbf {Arr}}(y_*){\mathbf \Phi }^{-1} - I_n \right] {\mathbf {Arr}}(x_*)\mathcal{A}^T &{} \ \left[ I_n - {\mathbf {Arr}}(y_*){\mathbf \Phi }^{-1} \right] {\mathbf {Arr}}(x_*) + \tau {\mathbf {Arr}}(y_*){\mathbf \Phi }^{-1} \end{array} \right] , \end{aligned}$$
(38)

where \(x_* = x(w_*)\) is the solution of the primal problem (2), and \({\mathbf \Phi }^{-1} = {\mathbf \Phi }^{-1}(y_*)\).

Further, we multiply the first block row of the matrix (38) from the left by the matrix \(\mathcal{A}^T\) and add it to the second block row. Since \(\left[ \mathcal{A}^T \mathcal{A}+ {\mathbf {Arr}}(y_*) \right] {\mathbf \Phi }^{-1}(y_*) = I_n\), we arrive at the matrix

$$\begin{aligned} \left[ \begin{array}{cc} \mathcal{A}{\mathbf \Phi }^{-1}(y_*){\mathbf {Arr}}(x_*)\mathcal{A}^T &{} \ \mathcal{A}{\mathbf \Phi }^{-1}(y_*)\left[ \tau I_n - {\mathbf {Arr}}(x_*) \right] \\ 0_{nm} &{} \ \tau I_n \end{array} \right] . \end{aligned}$$
(39)

By Lemma 2 the left upper sub-matrix \(\mathcal{A}{\mathbf \Phi }^{-1}(y_*){\mathbf {Arr}}(x_*)\mathcal{A}^T\) is non-singular. For \(\tau > 0\) the right lower sub-matrix of the matrix (39) is also non-singular. Therefore, the matrix (38), and with it the matrix \({\mathbf \Psi }_w (w_*)\), is non-singular too.    \(\square \)

Remark 1

If the point \([u_*,y_*] \in \mathcal{F}_D\) is non-degenerate, then due to continuity the points [u; y] in some neighbourhood of \([u_*,y_*]\) are also non-degenerate. Thus, the algorithmic mappings in methods (14) and (21) are well defined in some domains containing the points \(u_*\) and \(w_*\), respectively.

Theorem 1

Let all conditions of Lemma 2 be valid. Then the iterative methods (14) and (21) converge locally to the solutions \(u_*\) and \(w_*\) at a super-linear rate.

Proof

The proof follows from well-known results concerning Newton's method and from Lemmas 2 and 3.    \(\square \)

6 Conclusion

We have proposed two variants of the dual Newton's method for solving linear second-order cone programming problems. Both variants of the method converge locally at a super-linear rate. From a theoretical point of view, dual methods are preferable to primal methods when the number of equality constraints in the primal problem is not large.