
1 Introduction

Joint Diagonalization (JD) of a set of matrices has attracted considerable attention in the areas of statistical signal processing and multivariate statistics. Its applications include linear blind source separation (BSS), beamforming, and direction of arrival (DoA) estimation, cf. [1]. The classic literature focuses on the problem of symmetric joint diagonalization (SJD) of matrices, where a set of matrices is to be diagonalized via matrix congruence transforms, i.e. multiplication from the left and the right with a matrix and its (Hermitian) transpose, respectively.

In this work, we consider the problem of Non-symmetric Joint Diagonalization (NsJD) of matrices, where the matrices multiplied from the left and the right are different. This general form has been studied in the context of multiple-access multiple-input multiple-output (MIMO) wireless transmission, cf. [2]. In [3], NsJD approaches have demonstrated their use in solving the problem of independent vector analysis. Moreover, the problem of NsJD is closely related to the canonical polyadic decomposition of tensors; we refer to [4, 5] for further discussions of the latter subject.

One contribution of this work is the development of a block Jacobi method for solving the problem of NsJD at a super-linear rate of convergence. Jacobi-type methods have a long history in solving problems of matrix joint diagonalization. Early work in [6] employs a Jacobi algorithm for solving the unitary joint diagonalization problem based on the common off-norm cost function. In the non-unitary setting, Jacobi-type methods have been developed for both the log-likelihood formulation, cf. [7], and the common off-norm case, cf. [8, 9]. The current work provides an extension of [10], which only considers the problem in the real symmetric setting.

2 Uniqueness of Non-symmetric JD

In this work, we denote by \((\cdot )^{\top }\) the matrix transpose, \((\cdot )^{\mathsf {H}}\) the Hermitian transpose, \(\overline{(\cdot )}\) the complex conjugate of entries of a matrix, and by Gl(m) the set of all \(m \times m\) invertible complex matrices. Let \(\{C_{i}\}_{i=1}^{n}\) be a set of \(m{\times }m\) complex matrices, constructed by

$$\begin{aligned} C_{i} = A_{L} {\varOmega }_{i} A_{R}^\mathsf {H}, \quad i=1, \dots , n, \end{aligned}$$
(1)

where \(A_{L},A_{R} \in Gl(m)\) and \({\varOmega }_{i} = {\text {diag}} \big (\omega _{i1}, \ldots , \omega _{im} \big ) \in \mathbb {C}^{m \times m}\) with \({\varOmega }_{i} \ne 0\). It is worth noting that no relation between the matrices \(A_{L}\) and \(A_{R}\) is assumed. The task of NsJD is to find a pair of matrices \(X_{L},X_{R} \in Gl(m)\) such that the set of matrices

$$\begin{aligned} \left\{ \left. X_{L}^{\mathsf {H}} C_{i} X_{R} \right| i = 1, \ldots , n \right\} \end{aligned}$$
(2)

consists of diagonal matrices only. In order to investigate the properties of any algorithm that aims at finding such a joint diagonalizer, it is fundamental to understand under what conditions a unique solution exists. In the remainder of this section, we therefore elaborate on the uniqueness properties of the non-symmetric joint diagonalization problem (2).
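For concreteness, the following short sketch (in Python with NumPy; all variable names are our own) generates a synthetic instance of model (1) and verifies that the pair \(X_{L} = A_{L}^{-\mathsf {H}}\), \(X_{R} = A_{R}^{-\mathsf {H}}\) renders the matrices in (2) diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 10

# Model (1): C_i = A_L @ Omega_i @ A_R^H with (almost surely) invertible A_L, A_R.
A_L = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
A_R = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
Omegas = [np.diag(rng.standard_normal(m) + 1j * rng.standard_normal(m))
          for _ in range(n)]
Cs = [A_L @ Om @ A_R.conj().T for Om in Omegas]

# One exact solution of (2): X_L = A_L^{-H}, X_R = A_R^{-H}.
X_L = np.linalg.inv(A_L).conj().T
X_R = np.linalg.inv(A_R).conj().T

# The transformed matrices X_L^H C_i X_R recover the Omega_i up to rounding:
residual = max(np.max(np.abs(X_L.conj().T @ C @ X_R - Om))
               for C, Om in zip(Cs, Omegas))
print(residual)   # of the order of machine precision times the conditioning
```

As discussed next, this exact solution is by no means the only one.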

As is known from the SJD case, there is an inherent permutation and scaling ambiguity here as well. Let \(D_{L}, D_{R} \in Gl(m)\) be diagonal and \(P \in Gl(m)\) be a permutation matrix. If \(X_{L}^{*} \in Gl(m)\) and \(X_{R}^{*} \in Gl(m)\) are joint diagonalizers of problem (2), then so is the pair \((X_{L}^{*}D_{L}P, X_{R}^{*}D_{R}P)\), since \((X_{L}^{*}D_{L}P)^{\mathsf {H}} C_{i} (X_{R}^{*}D_{R}P) = P^{\top } \overline{D_{L}} \, (X_{L}^{*\mathsf {H}} C_{i} X_{R}^{*}) \, D_{R} P\) remains diagonal. In other words, the joint diagonalizers can only be identified up to individual scalings and a joint permutation. We define the set of pairs of jointly column-wise permuted diagonal \((m \times m)\)-matrices by

$$\begin{aligned} \mathcal {G}(m):= \big \{(D_{1}P,D_{2}P)\big |&D_{1},D_{2} \in Gl(m) \text { are diagonal and }\nonumber \\&P \in Gl(m) \text { is a permutation matrix} \big \}. \end{aligned}$$
(3)

As the set \(\mathcal {G}(m)\) admits a group structure, we can define an equivalence relation on \(Gl(m) \times Gl(m)\) as follows.

Definition 1

(Essential Equivalence). Let \((X_{L},X_{R}) \in Gl(m) \times Gl(m)\). Then \((X_{L},X_{R})\) is said to be essentially equivalent to \((Y_{L},Y_{R}) \in Gl(m) \times Gl(m)\), and vice versa, if there exists \((E_{L},E_{R}) \in \mathcal {G}(m)\) such that

$$\begin{aligned} X_{L} = Y_{L} E_{L} \qquad \text {and}\qquad X_{R} = Y_{R} E_{R}. \end{aligned}$$
(4)

Moreover, we say that the solution of problem (2) is essentially unique if problem (2) admits a unique solution on the set of equivalence classes.
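As a numerical illustration of Definition 1, essential equivalence can be tested by checking whether \(Y_{L}^{-1}X_{L}\) and \(Y_{R}^{-1}X_{R}\) are scaled permutation matrices that share the same permutation. The sketch below implements this test; the function names and the tolerance are our own choices.

```python
import numpy as np

def _perm_pattern(E, tol=1e-8):
    """If E = D @ P for a diagonal D and a permutation P, return the
    permutation pattern (row index of the nonzero entry in each column),
    otherwise return None."""
    mask = np.abs(E) > tol
    if not ((mask.sum(axis=0) == 1).all() and (mask.sum(axis=1) == 1).all()):
        return None
    return tuple(int(np.argmax(col)) for col in mask.T)

def essentially_equivalent(X_L, X_R, Y_L, Y_R, tol=1e-8):
    """Definition 1: (X_L, X_R) ~ (Y_L, Y_R) iff Y_L^{-1} X_L and Y_R^{-1} X_R
    are column-wise scaled permutations sharing the same permutation."""
    p_L = _perm_pattern(np.linalg.solve(Y_L, X_L), tol)
    p_R = _perm_pattern(np.linalg.solve(Y_R, X_R), tol)
    return p_L is not None and p_L == p_R
```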

Due to the fact that

$$\begin{aligned} X_{L}^{\mathsf {H}} C_{i} X_{R} = (X_{L}^{\mathsf {H}} A_{L}) {\varOmega }_{i} (A_{R}^{\mathsf {H}} X_{R}), \end{aligned}$$
(5)

we assume without loss of generality that the \(C_{i} = {\varOmega }_{i}\), \(i=1,\dots ,n\), are already diagonal. In other words, we investigate under what conditions the set \(\mathcal {G}(m)\) contains all solutions of the joint diagonalization problem (2) when the \(C_{i}\)’s are already diagonal.

In order to characterize the uniqueness conditions, we need to define a measure of collinearity for diagonal matrices. Recall \({\varOmega }_{i} = {\text {diag}}(\omega _{i1}, \ldots , \omega _{im}) \in \mathbb {C}^{m \times m}\) for \(i = 1, \ldots , n\). For a fixed diagonal position k, we denote by \({z}_{k} := [\omega _{1k}, \ldots , \omega _{nk}]^{\top } \in \mathbb {C}^{n}\) the vector collecting the k-th diagonal entries of all the matrices. Then, the collinearity measure of the set of \({\varOmega }_{i}\)’s is defined by

$$\begin{aligned} \rho ({\varOmega }_{1}, \ldots , {\varOmega }_{n}) := \max _{1 \le k < l \le m}|c({z}_k,{z}_l)|, \end{aligned}$$
(6)

where \(c({v},{w})\) denotes the cosine of the complex angle between two vectors \({v},{w} \in \mathbb {C}^n\), computed as

$$\begin{aligned} c ({v},{w}):= \left\{ \begin{array}{cl} \frac{{v}^\mathsf {H}{w}}{\Vert {v} \Vert \Vert {w}\Vert } &{} \text { if } {v}\ne 0 \wedge {w} \ne 0, \\ 1 &{} \text { otherwise.} \end{array} \right. \end{aligned}$$
(7)

Here, \(\Vert {v}\Vert \) denotes the Euclidean norm of a vector v. Note that \(0 \le \rho \le 1\), and that \(\rho = 1\) if and only if there exist a complex scalar \(\omega \in \mathbb {C}\) and a pair \({z}_k, {z}_l\) with \(k \ne l\) such that \({z}_k = \omega {z}_l\). In other words, \(\rho =1\) if and only if there exist two diagonal positions (k, k) and (l, l) such that the corresponding entries of the matrices \({\varOmega }_i\) differ only by multiplication with a complex scalar \(\omega \). We adapt the uniqueness analysis developed for the symmetric joint diagonalization case in [11] to the NsJD setting.
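The collinearity measure (6)–(7) can be evaluated directly from the diagonals of the \({\varOmega }_i\)’s. The following sketch (our own, unoptimized helper) does so for a list of diagonal matrices; by Lemma 1 below, a value strictly below one certifies essential uniqueness of the joint diagonalizer.

```python
import numpy as np

def collinearity(Omegas):
    """Collinearity measure rho of (6), with c(v, w) as in (7).
    Omegas is a list of m x m diagonal ndarrays."""
    Z = np.stack([np.diag(Om) for Om in Omegas])   # row i holds diag(Omega_i); column k is z_k
    n, m = Z.shape

    def c(v, w):
        if np.linalg.norm(v) == 0 or np.linalg.norm(w) == 0:
            return 1.0
        return (v.conj() @ w) / (np.linalg.norm(v) * np.linalg.norm(w))

    return max(abs(c(Z[:, k], Z[:, l]))
               for k in range(m) for l in range(k + 1, m))
```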

Lemma 1

Let \({\varOmega }_{i} \in \mathbb {C}^{m \times m}\), for \(i = 1, \ldots , n\), be diagonal, and let \(X_{L}, X_{R} \in Gl(m)\) so that \(X_{L}^{\mathsf {H}} {\varOmega }_{i} X_{R}\) is diagonal as well. Then the pair \((X_{L},X_{R})\) is essentially unique if and only if \(\rho ({\varOmega }_1, \ldots , {\varOmega }_{n}) < 1\).

Proof

First, consider the case \(m=2\) and let

$$\begin{aligned} X_{L}=\begin{bmatrix} l_1&l_2 \\ l_3&l_4 \end{bmatrix} \in Gl(2), \qquad \text {and}\qquad X_{R}=\begin{bmatrix} r_1&r_2 \\ r_3&r_4 \end{bmatrix} \in Gl(2). \end{aligned}$$
(8)

Then \(X_{L}^{\mathsf {H}} {\varOmega }_{i} X_{R}\) is diagonal for \(i=1, \ldots , n\), if and only if

$$\begin{aligned} \left\{ \begin{array}{l} \omega _{i1} \overline{l_{1}} r_{2} + \omega _{i2} \overline{l_{3}} r_{4}=0 \\ \omega _{i1} \overline{l_{2}} r_{1} + \omega _{i2} \overline{l_{4}} r_{3}=0 \end{array} \right. \end{aligned}$$
(9)

for \(i=1, \ldots , n\). The corresponding system of linear equations reads as

$$\begin{aligned} \begin{bmatrix} \omega _{11}&\omega _{21}&\ldots&\omega _{n1} \\ \omega _{12}&\omega _{22}&\ldots&\omega _{n2} \end{bmatrix}^{\mathsf {H}} \begin{bmatrix} l_{1} \overline{r_{2}}&~ l_{2} \overline{r_{1}} \\ l_{3} \overline{r_{4}}&~ l_{4} \overline{r_{3}} \end{bmatrix} =0, \end{aligned}$$
(10)

which admits only the trivial solution if and only if the coefficient matrix of the \(\omega \)’s has rank 2, which in turn is equivalent to \(\rho ({\varOmega }_{1}, \ldots , {\varOmega }_{n})<1\). Specifically, the trivial solution, i.e.

$$\begin{aligned} l_{1} \overline{r_{2}} = l_{2} \overline{r_{1}} = l_{3} \overline{r_{4}} = l_{4} \overline{r_{3}} = 0, \end{aligned}$$
(11)

together with the invertibility of \(X_{L}\) and \(X_{R}\) yields that either

$$\begin{aligned} l_{1} = r_{1} = l_{4} = r_{4} = 0, \quad \text { or }\quad l_{2} = r_{2} = l_{3} = r_{3} = 0. \end{aligned}$$
(12)

Therefore, one can conclude that \((X_{L},X_{R}) \in \mathcal {G}(2)\). For the case \(m >2\), if \(\rho = 1\), then there exists a pair (k, l) such that \(|c(z_k,z_l)|=1\), and the same argument as above, applied to the corresponding \(2 \times 2\) subsystem in the positions k and l, shows that \(\rho = 1\) implies the non-uniqueness of the joint diagonalizer.

For the reverse direction of the statement, we assume that a joint diagonalizer \((X_{L},X_{R})\) is not in \(\mathcal {G}(m)\). Assume first that one of the \({\varOmega }_{i}\)’s, say \({\varOmega }_{1}\), is invertible. Then

$$\begin{aligned} X_{L}^{\mathsf {H}} {\varOmega }_{i} X_{R} (X_{L}^{\mathsf {H}} {\varOmega }_{1} X_{R})^{-1} = X_{L}^{\mathsf {H}} {\varOmega }_{i} {\varOmega }_{1}^{-1} (X_{L}^{\mathsf {H}})^{-1}, \end{aligned}$$
(13)

for \(i=1, \ldots , n\), gives a simultaneous eigendecomposition of the diagonal matrices \({\varOmega }_i {\varOmega }_{1}^{-1}\). Since we assume \((X_{L},X_{R}) \notin \mathcal {G}(m)\), this eigendecomposition is not unique; hence each matrix \({\varOmega }_i {\varOmega }_{1}^{-1}\) must have at least two identical eigenvalues, and these eigenvalues must be located at the same diagonal positions (k, k) and (l, l) for all matrices \({\varOmega }_i {\varOmega }_{1}^{-1}\). In other words, there exists a pair (k, l) with \(k \ne l\) such that

$$\begin{aligned} \frac{\omega _{ik}}{\omega _{1k}}=\frac{\omega _{il}}{\omega _{1l}}, \end{aligned}$$
(14)

which is equivalent to \(|c(z_k,z_l)|=1\) and hence \(\rho ({\varOmega }_{1}, \ldots ,{\varOmega }_{n})=1\). If all the \({\varOmega }_{i}\)’s are singular, we distinguish two cases. Firstly, assume that there is a position on the diagonals, say k, where all \(\omega _{ik}=0\). Then \(|c(z_k,z_l)|=1\) holds for any \(l \ne k\) and thus \(\rho =1\). Secondly, if there is no common position where all the \({\varOmega }_{i}\)’s have a zero entry, then there exists an invertible linear combination of them, say \({\varOmega }_{0}\), which is diagonalized by the same transformations. Considering the extended set \(\{{\varOmega }_{i}\}_{i=0}^{n}\), the same argument as in (13)–(14) for the invertible case applies with \({\varOmega }_1\) replaced by \({\varOmega }_0\).    \(\blacksquare \)

3 Analysis of the Joint Diagonality Measure

To deal with the scaling ambiguity, we restrict the search space for the diagonalizing matrices to the quotient space

$$\begin{aligned} Op(m)=Gl(m)/\{ D \in Gl(m)~|~ D \text { is diagonal.} \}, \end{aligned}$$
(15)

cf. [12] for further details. As representatives of the equivalence classes, we choose elements from the set of complex oblique matrices, i.e.

$$\begin{aligned} \mathcal {OB}(m) := \big \{ X \in \mathbb {C}^{m \times m} | {\text {ddiag}} ( X^{\mathsf {H}}X ) = I_{m}, {\text {rk}} X = m \big \}, \end{aligned}$$
(16)

where \({\text {ddiag}}(Z)\) denotes the diagonal matrix whose diagonal entries are those of Z, and \(I_{m}\) is the \(m{\times }m\) identity matrix. As a measure of joint diagonality, we choose

$$\begin{aligned} f :Op(m) \times Op(m) \rightarrow \mathbb {R}, \qquad (X_{L},X_{R}) \mapsto \frac{1}{4} \sum \limits _{i=1}^{n} \Big \Vert {\text {off}} ( X_{L}^{\mathsf {H}} C_{i} X_{R}) \Big \Vert _{F}^{2}, \end{aligned}$$
(17)

where \({\text {off}} (Z) = Z - {\text {ddiag}}(Z)\) and \(\Vert \cdot \Vert _{\mathrm {F}}\) is the Frobenius norm of matrices.
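In code, the measure (17) reads as follows; this is a direct sketch, and the helper names are our own.

```python
import numpy as np

def off(Z):
    """off(Z) = Z - ddiag(Z), i.e. Z with its diagonal set to zero."""
    return Z - np.diag(np.diag(Z))

def joint_diagonality_cost(X_L, X_R, Cs):
    """Off-norm measure (17): 1/4 * sum_i || off(X_L^H C_i X_R) ||_F^2."""
    return 0.25 * sum(np.linalg.norm(off(X_L.conj().T @ C @ X_R), 'fro') ** 2
                      for C in Cs)
```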

The convergence rate of the Jacobi method depends on the Hessian form of the cost function being non-degenerate at the optimal solution. In what follows, we therefore characterize the critical points of (17) and specify their Hessian form. Let us denote the set of all \(m{\times }m\) matrices with zero diagonal entries by

$$\begin{aligned} \mathfrak {off}(m) = \big \{ Z \in \mathbb {C}^{m{\times }m} | z_{ii} = 0,~\text {for}~i=1,\ldots ,m \big \}. \end{aligned}$$
(18)

Then, for any \(X \in {Op}(m)\), the following map

$$\begin{aligned} \mu _{X} :\! \mathfrak {off}(m) \!\rightarrow \! Op(m), \; Z \mapsto X (I_{m} + Z) {\text {diag}} \left\{ \! \tfrac{1}{\Vert X (e_{1}+z_{1}) \Vert }, \ldots , \tfrac{1}{\Vert X(e_{m}+z_{m}) \Vert } \! \right\} \!, \end{aligned}$$
(19)

where \(Z = [z_{1},\ldots ,z_{m}] \in \mathfrak {off}(m)\) and \(e_{i}\) is the i-th standard basis vector, is a smooth local parameterization around X. Let \(\mathcal {X} := (X_{L},X_{R}) \in Op^{2}(m) := Op(m) \times Op(m)\) and denote by \(\mathcal {H} := (H_{L},H_{R}) \in T_{\mathcal {X}}Op^{2}(m)\) a tangent vector at \(\mathcal {X}\). The first derivative of f evaluated at \(\mathcal {X}\) in tangent direction \(\mathcal {H}\) yields

$$\begin{aligned} {\text {D}} f(\mathcal {X}) (\mathcal {H}) \! = \! \sum \limits _{i=1}^{n} {\text {tr}}&\Big ({\text {off}} \big ( X_{L}^{\mathsf {H}} C_{i} X_{R} \big ) X_{R}^{\mathsf {H}} C_{i}^{\mathsf {H}} H_{L} + {\text {off}} \big ( H_{L}^{\mathsf {H}} C_{i} X_{R} \big ) X_{R}^{\mathsf {H}} C_{i}^{\mathsf {H}} X_{L} +\nonumber \\&\ \ {\text {off}} \big ( X_{L}^{\mathsf {H}} C_{i} X_{R} \big ) H_{R}^{\mathsf {H}} C_{i}^{\mathsf {H}} X_{L} + {\text {off}} \big ( X_{L}^{\mathsf {H}} C_{i} H_{R} \big ) X_{R}^{\mathsf {H}} C_{i}^{\mathsf {H}} X_{L} \Big ). \end{aligned}$$
(20)

Since any joint diagonalizer \(\mathcal {X}^{*}\) is a global minimum, it satisfies \({\text {D}} f(\mathcal {X}^{*}) (\mathcal {H}) = 0\) for every tangent direction \(\mathcal {H}\). We use \(\mu _{\mathcal {X}} := (\mu _{X_{L}},\mu _{X_{R}})\) as the local parameterization of \(Op^{2}(m)\) and compute the Hessian form \(\mathsf {H}_{f}(\mathcal {X}^{*}) :T_{\mathcal {X}^{*}} Op^{2}(m) \times T_{\mathcal {X}^{*}} Op^{2}(m) \rightarrow \mathbb {R}\). Let \(\mathcal {X}^{*}=(X_{L}^{*},X_{R}^{*})\) be a pair of joint diagonalizers. Without loss of generality, assume that \(X_{L}^{*\mathsf {H}}A_{L}\) is diagonal and denote it by \(\varLambda _{L} := {\text {diag}}(\lambda _{L1},\ldots , \lambda _{Lm})\). Similarly, we denote \(\varLambda _{R} := {\text {diag}}(\lambda _{R1},\ldots , \lambda _{Rm}) = A_{R}^{\mathsf {H}} X_{R}^{*}\). A tedious but direct calculation leads to

$$\begin{aligned} \!\!\!\!\mathsf {H}_{f}(\!\mathcal {X}^{*}\!) (\mathcal {H},\!\mathcal {H}) \!=\!\! \left. \tfrac{{\text {d}}^{2}}{{\text {d}}t^{2}} (f \!\circ \! \mu _{\mathcal {X}^{*}}\!) (t{\varTheta },t\varXi ) \right| _{t=0} \!\!\!=\!\! \sum \limits _{j\ne k}^{m} \! \begin{bmatrix} \theta _{jk} \\ \overline{\xi _{kj}} \end{bmatrix}^{\!\top } \underbrace{ \left[ {\begin{matrix} \!\!\sum \limits _{i=1}^{n} \left| \delta _{ij}\right| ^{2} &{} \sum \limits _{i=1}^{n} \delta _{ij} \overline{\delta _{ik}}\! \\ \!\!\sum \limits _{i=1}^{n} \overline{\delta _{ij}} \delta _{ik} &{} \sum \limits _{i=1}^{n} \left| \delta _{ik}\right| ^{2}\! \end{matrix}}\right] }_{=:B_{jk}} \begin{bmatrix} \overline{\theta _{jk}} \\ \xi _{kj} \end{bmatrix}\!,\!\!\!\!\! \end{aligned}$$
(21)

where \(\delta _{ij}\) is the j-th diagonal entry of the diagonal matrix \({\varDelta }_{i} := X_{L}^{*\mathsf {H}} C_{i} X_{R}^{*} = \varLambda _{L} {\varOmega }_{i} \varLambda _{R}\).

Clearly, the Hessian of f at \(\mathcal {X}^{*}\) is at least positive semi-definite, and it is block-diagonal with \(2 \times 2\) blocks with respect to the standard basis of the parameter space \(\mathfrak {off}(m) \times \mathfrak {off}(m)\). The definiteness of the Hessian therefore depends only on the determinants of the blocks \(B_{jk}\), which are computed as

$$\begin{aligned}&{\text {det}}(B_{jk}) = \bigg (\sum \limits _{i=1}^{n} \left| \delta _{ij}\right| ^{2} \bigg ) \bigg (\sum \limits _{i=1}^{n} \left| \delta _{ik}\right| ^{2} \bigg ) - \bigg |\sum \limits _{i=1}^{n} \delta _{ij} \overline{\delta _{ik}} \bigg |^{ 2} \nonumber \\ =&\left| \lambda _{Lj}\right| ^{2}\left| \lambda _{Rj}\right| ^{2} \left| \lambda _{Lk}\right| ^{2} \left| \lambda _{Rk}\right| ^{2} \!\left( \!\! \bigg (\sum \limits _{i=1}^{n} \left| \omega _{ij}\right| ^{2} \bigg ) \! \bigg (\sum \limits _{i=1}^{n} \left| \omega _{ik}\right| ^{2} \bigg ) \!-\! \bigg | \sum \limits _{i=1}^{n} \omega _{ij}\overline{\omega _{ik}} \bigg |^{2} \right) . \end{aligned}$$
(22)

By the Cauchy-Schwarz inequality, \({\text {det}}(B_{jk})\) is non-negative, and it is equal to zero if and only if there is a pair of positions (j, k) such that \(z_{j} \in \mathbb {C}^{n}\) and \(z_{k}\) are linearly dependent, i.e. \(\rho ({\varOmega }_{1},\ldots ,{\varOmega }_{n}) = 1\). Thus, we conclude.

Lemma 2

Let the NsJD problem (2) have an essentially unique joint diagonalizer. Then the Hessian of the off-norm function (17) at the joint diagonalizer is positive definite.
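The positive definiteness asserted in Lemma 2 can be checked numerically from the diagonals \({\varDelta }_{i}\) via the determinant formula (22). The following sketch (our own helper; the stacking convention for the diagonals is an assumption) computes all \(\det (B_{jk})\).

```python
import numpy as np

def hessian_block_dets(Deltas):
    """det(B_jk) from (21)-(22); Deltas is an n x m array whose i-th row holds
    the diagonal of Delta_i = X_L^{*H} C_i X_R^*. All values are >= 0 by
    Cauchy-Schwarz, and all are strictly positive iff the Hessian (21) is
    positive definite."""
    n, m = Deltas.shape
    dets = {}
    for j in range(m):
        for k in range(j + 1, m):
            a = np.sum(np.abs(Deltas[:, j]) ** 2)
            b = np.sum(np.abs(Deltas[:, k]) ** 2)
            c = np.sum(Deltas[:, j] * Deltas[:, k].conj())
            dets[(j, k)] = a * b - abs(c) ** 2
    return dets
```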

4 Block-Jacobi for Non-symmetric Joint Diagonalization

In this section, we develop a block Jacobi algorithm for minimizing the cost function (17). First, let us define the complex one-dimensional subspaces

$$\begin{aligned} V_{ij}:= \big \{ Z = (z_{kl}) \in \mathbb {C}^{m{\times }m} | z_{kl} = 0 \text { for } (k,l) \ne (i,j) \big \}. \end{aligned}$$
(23)

It is clear that \(\oplus _{i \ne j} (V_{ij} \times V_{ji}) = \mathfrak {off}(m) \times \mathfrak {off}(m)\). We then define

$$\begin{aligned} \mathcal {V}_{ij}(\mathcal {X}) := \{ \tfrac{{\text {d}}}{{\text {d}} t} \left. \mu _{\mathcal {X}} (t \cdot Z) \right| _{t=0} | Z \in (V_{ij} \times V_{ji})\}, \end{aligned}$$
(24)

yielding a vector space decomposition of the tangent space \(T_{\mathcal {X}} Op^{2}(m)\).

The block Jacobi-type method iteratively performs a search along the subspaces \(\mathcal {V}_{ij}(\mathcal {X})\) in a cyclic manner. More precisely, let \(({\varTheta }_{jk},{\varTheta }_{kj}) \in V_{jk} \times V_{kj}\) and denote \({\theta } =[\theta _{jk}~\theta _{kj}]^{\top } \in \mathbb {C}^{2}\). For any point \(\mathcal {X} \in Op^{2}(m)\), we construct a family of maps \(\big \{ \nu _{jk}^{(\mathcal {X})} \big \}_{j \ne k}\) by

$$\begin{aligned} \nu _{jk}^{(\mathcal {X})} :\mathbb {C}^{2} \rightarrow Op^{2}(m), \qquad {\theta } \mapsto \mu _{\mathcal {X}}({\varTheta }_{jk},{\varTheta }_{kj}). \end{aligned}$$
(25)

The overall procedure is summarized in Algorithm 1. It is readily seen from (21) that the subspaces \(\mathcal {V}_{ij}\) are orthogonal with respect to \(\mathsf {H}_{f}{(\mathcal {X}^{*})}\). Therefore, the following result guarantees the locally quadratic, and hence super-linear, convergence of the block Jacobi method to an exact joint diagonalizer, provided that this diagonalizer is essentially unique.

Algorithm 1. Block Jacobi method for non-symmetric joint diagonalization.
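The rough sketch below illustrates the sweep structure just described: cyclic passes over the index pairs (j, k), each time minimizing (17) over the two complex parameters of the map (25). It substitutes a generic numerical inner solver for the dedicated 2-parameter update of Algorithm 1, so it is an illustration of the method's structure rather than the algorithm itself; all function names are our own.

```python
import numpy as np
from scipy.optimize import minimize

def off(Z):
    return Z - np.diag(np.diag(Z))

def cost(X_L, X_R, Cs):
    # off-norm measure (17)
    return 0.25 * sum(np.linalg.norm(off(X_L.conj().T @ C @ X_R), 'fro') ** 2
                      for C in Cs)

def mu(X, Z):
    # local parameterization (19): columns of X (I + Z), normalized to unit length
    Y = X @ (np.eye(X.shape[0]) + Z)
    return Y / np.linalg.norm(Y, axis=0)

def block_jacobi_nsjd(Cs, X_L0, X_R0, sweeps=10):
    m = X_L0.shape[0]
    X_L, X_R = X_L0.astype(complex), X_R0.astype(complex)
    for _ in range(sweeps):
        for j in range(m):
            for k in range(m):
                if j == k:
                    continue

                def local_cost(t):
                    # t encodes theta = [theta_jk, theta_kj] in C^2 via real/imag parts
                    Z_L = np.zeros((m, m), complex); Z_L[j, k] = t[0] + 1j * t[1]
                    Z_R = np.zeros((m, m), complex); Z_R[k, j] = t[2] + 1j * t[3]
                    return cost(mu(X_L, Z_L), mu(X_R, Z_R), Cs)

                t = minimize(local_cost, np.zeros(4)).x   # inner search over nu_jk of (25)
                Z_L = np.zeros((m, m), complex); Z_L[j, k] = t[0] + 1j * t[1]
                Z_R = np.zeros((m, m), complex); Z_R[k, j] = t[2] + 1j * t[3]
                X_L, X_R = mu(X_L, Z_L), mu(X_R, Z_R)
    return X_L, X_R
```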

Theorem 1

([13]). Let M be an n-dimensional manifold and let \(x^{*}\) be a local minimum of the smooth cost function \(f :M \rightarrow \mathbb {R}\) with nondegenerate Hessian \(\mathsf {H}_{f}{(x^{*})}\). Let \(\mu _x\) be a family of local parameterizations of M and let \(\oplus _i V_i\) be a decomposition of \(\mathbb {R}^n\). If the subspaces \(\mathcal {V}_i:=T_{0}\mu _{x^{*}} (V_i) \subset T_{x^{*}}M\) are orthogonal with respect to \(\mathsf {H}_{f}{(x^{*})}\), then the block Jacobi method is locally quadratically convergent to \(x^{*}\).

Fig. 1. Convergence properties of the proposed block Jacobi algorithm.

For an experimental evaluation of this result, we consider the task of jointly diagonalizing a set of non-symmetric matrices \(\{\widetilde{C}_{i}\}_{i=1}^{n}\), constructed as

$$\begin{aligned} \widetilde{C}_{i} = A_{L} \varLambda _{i} A_{R}^{\mathsf {H}} + \varepsilon E_{i}, \qquad i = 1, \ldots , n, \end{aligned}$$
(26)

where \(A_{L}, A_{R} \in \mathbb {C}^{m \times m}\) are randomly drawn matrices in \(\mathcal {OB}(m)\), the moduli of the diagonal entries of \(\varLambda _{i}\) are drawn from a uniform distribution on the interval (9, 11), \(E_{i} \in \mathbb {C}^{m \times m}\) represents additive noise whose entries are generated from a uniform distribution on the interval \((-0.5,0.5)\), and \(\varepsilon \in \mathbb {R}\) is the noise level. We set \(m=5\) and \(n=20\), and run six tests with increasing noise levels \(\varepsilon = d \times 10^{-2}\) for \(d = 0, \ldots , 5\). Each experiment is initialized at the same point, which is drawn randomly within an appropriate neighbourhood of the true joint diagonalizer.
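One possible way to generate the test matrices (26) is sketched below. The column-normalization used to sample from \(\mathcal {OB}(m)\), the random phases of the diagonal entries, and the complex-valued noise are our own choices where the description above leaves the details open.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 20
eps = 2e-2                                    # one of the noise levels d * 1e-2

def random_oblique(m):
    # a random matrix with unit-norm columns, i.e. an element of OB(m)
    A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
    return A / np.linalg.norm(A, axis=0)

A_L, A_R = random_oblique(m), random_oblique(m)
Lambdas = [np.diag(rng.uniform(9, 11, m) * np.exp(2j * np.pi * rng.uniform(size=m)))
           for _ in range(n)]                 # moduli uniform on (9, 11), random phases
Es = [rng.uniform(-0.5, 0.5, (m, m)) + 1j * rng.uniform(-0.5, 0.5, (m, m))
      for _ in range(n)]
Cs = [A_L @ Lam @ A_R.conj().T + eps * E for Lam, E in zip(Lambdas, Es)]
```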

Convergence is measured by the distance of the current iterate \(\mathcal {X}^{(k)} := (X_{L}^{(k)},X_{R}^{(k)})\) to the accumulation point \(\mathcal {X}^{*} := (X_{L}^{*},X_{R}^{*})\), i.e. by \(\Vert X_{L}^{(k)}-X_{L}^{*}\Vert _{F} + \Vert X_{R}^{(k)}-X_{R}^{*}\Vert _{F}\). According to Fig. 1, the proposed algorithm converges locally quadratically fast to a pair of joint diagonalizers in the noiseless case \(\varepsilon = 0\), whereas with an increasing noise level the convergence slows down accordingly, as reflected by the more gradual slopes.