Abstract
This paper proposes a novel matrix rank-one decomposition for quaternion Hermitian matrices, which admits a stronger property than the previous results in Ai W et al (Math Program 128(1):253–283, 2011), Huang Y, Zhang S (Math Oper Res 32(3):758–768, 2007), Sturm JF, Zhang S (Math Oper Res 28(2):246–267, 2003). The enhanced property can be used to derive some improved results on the joint numerical range, the \({\mathcal {S}}\)-Procedure and quadratically constrained quadratic programming (QCQP) in the quaternion domain, demonstrating the capability of our new decomposition technique.
1 Introduction
In recent years, we have witnessed a burst of quaternion representations in many fields, including color imaging [9,10,11, 27, 39], signal processing [14, 15], robotics [12], rolling bearing fault diagnosis [42], quaternion convolutional neural networks (QCNNs) [29, 43], etc. Moreover, there are some noticeable steps towards optimizing the corresponding quaternion-represented problems. Specifically, Qi et al. [31, 32] conducted a systematic study on quaternion matrix optimization, and Flamant et al. [16] proposed a general framework for constrained convex quaternion optimization. In terms of algorithms in the quaternion domain, affine projection algorithms [38] and learning algorithms [21] based on gradient and Hessian information have been proposed and analyzed. Hence, the increasing number of quaternion-represented applications and the studies on the associated optimization problems call for a deeper understanding of the quaternion structure that could lead to efficient solution methods. In this paper, we shall focus on one algebraic quaternion structure, matrix rank-one decomposition, and show that such a decomposition admits a stronger property than its counterparts in the real and complex domains by leveraging the intrinsic quaternion nature. We further show that this merit of the rank-one decomposition extends to some of its theoretical implications, such as the \({\mathcal {S}}\)-Procedure, the joint numerical range and quadratically constrained quadratic programming (QCQP) in the quaternion domain, improving on the associated results in the real and complex domains.
The matrix rank-one decomposition that we discuss in this paper is a technique of decomposing a positive semidefinite Hermitian matrix into a sum of rank-one matrices satisfying the so-called equal inner product property, i.e., the inner product between some given matrices and each rank-one term in the decomposition has the same value. The first decomposition of this type was introduced by Sturm and Zhang in [36], with the equal inner product property valid for one matrix, and it was used as a key technique to establish the Linear Matrix Inequality representation of a class of matrix cones whose quadratic forms are co-positive over the real domain. Moreover, such a decomposition technique was found to be useful in quadratic minimization [41] and in designing approximation algorithms for biquadratic optimization [25]. Soon after the work of [36], Huang and Zhang [20] extended the matrix rank-one decomposition to the complex domain such that the equal inner product property holds for two matrices. Interestingly, we find that the property remains valid for four matrices when the rank-one decomposition is conducted in the quaternion domain. Moreover, our proof is fully constructive and the corresponding computational procedure is summarized in Algorithm 1.
Theoretical implications of our novel matrix decomposition technique are quite versatile and yield stronger results than those in the real and complex domains. The first two theoretical implications are in joint numerical range and \({\mathcal {S}}\)-Procedure, both of which have some fundamental impacts and wide applications in many fields. In particular, the joint numerical range is an important tool in linear algebra and convex analysis, and it is found to be useful in spectrum analysis [33] and quantum computing [13, 35]. \({\mathcal {S}}\)-Procedure occupies a crucial position in the field of robust optimization [2, 3, 17], statistics [19], signal processing [26], among others [30]. With our matrix rank-one decomposition result, we manage to establish the convexity of joint numerical range [4, 28] for five matrices and the lossless of \({\mathcal {S}}\)-Procedure for four Hermitian forms. As a comparison, similar results only hold for fewer matrices in the real and complex domains. In addition, our result can also be applied to quadratically constrained quadratic programming (QCQP). To be specific, when the number of quadratic constraints is no larger than 4, we show that a rank-one solution of the SDP relaxation of (QCQP), which is hence an optimal solution of (QCQP), can be recovered from our matrix rank-one decomposition technique.
This paper is organized as follows. In Sect. 2, we introduce some notations and definitions used throughout this paper. Section 3 is devoted to the new quaternion matrix rank-one decomposition theorem. To showcase the capability of our new theorem, we illustrate some improved results in the joint numerical range in Sect. 4 and the \({\mathcal {S}}\)-Procedure in Sect. 5. Finally, we present how to solve the (QCQP) with our novel decomposition technique as another theoretical implication of our result in Sect. 6.
2 Preliminaries
In this section, we introduce some basic algebraic operations in the quaternion domain for scalars, vectors and matrices.
2.1 Quaternion operations for scalars
We define the set of quaternions \({\mathbb {H}}\) as a 4-dimensional normed division algebra over the real numbers \({\mathbb {R}}\). It has a canonical basis \(\{1, {\varvec{i}}, {\varvec{j}}, {\varvec{k}}\}\), where \({\varvec{i}}, {\varvec{j}}, {\varvec{k}}\) are imaginary units such that
$$\begin{aligned} {\varvec{i}}^2 = {\varvec{j}}^2 = {\varvec{k}}^2 = {\varvec{i}}{\varvec{j}}{\varvec{k}}= -1. \end{aligned}$$
Then, any quaternion \(q \in {\mathbb {H}}\) can be written as
$$\begin{aligned} q = q_a + q_b{\varvec{i}}+ q_c{\varvec{j}}+ q_d{\varvec{k}}, \end{aligned}$$
where \(q_a,q_b,q_c,q_d \in {\mathbb {R}}\) are the components of q. The real and imaginary parts of q are denoted as \(\mathrm{Re~}q=q_a\) and \(\mathrm{Im~}q=q_b{\varvec{i}}+ q_c{\varvec{j}}+ q_d{\varvec{k}}\) respectively. Note that, in contrast with the product operation in the real and complex domain, the product operation in the quaternion domain is noncommutative, i.e., \(q \cdot p \ne p \cdot q\) for \(p, q \in {\mathbb {H}}\).
We denote by \({\overline{q}}=\mathrm{Re~}q-\mathrm{Im~}q\) the quaternion conjugate of q, and it holds that \(\overline{(p \cdot q)}={\overline{q}} \cdot {\overline{p}}\). For a given quaternion \(q \in {\mathbb {H}}\), |q| denotes its modulus and can be expressed as
$$\begin{aligned} |q| = \sqrt{q{\overline{q}}} = \sqrt{q_a^2 + q_b^2 + q_c^2 + q_d^2}. \end{aligned}$$
At last, any non-zero quaternion q has an inverse \(q^{-1}={\overline{q}}/|q|^2\) and the inverse of the product of two quaternions is \((p \cdot q)^{-1}=q^{-1} \cdot p^{-1}\).
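These scalar operations can be made concrete in a short sketch. The following pure-Python illustration is ours (the 4-tuple encoding and all function names are assumptions for exposition, not part of the paper):

```python
import math

# A quaternion q = q_a + q_b i + q_c j + q_d k stored as the 4-tuple
# (q_a, q_b, q_c, q_d).  Illustrative sketch only.

def qmul(p, q):
    """Hamilton product p * q (noncommutative)."""
    pa, pb, pc, pd = p
    qa, qb, qc, qd = q
    return (pa*qa - pb*qb - pc*qc - pd*qd,
            pa*qb + pb*qa + pc*qd - pd*qc,
            pa*qc - pb*qd + pc*qa + pd*qb,
            pa*qd + pb*qc - pc*qb + pd*qa)

def qconj(q):
    """Quaternion conjugate: negate the imaginary part."""
    return (q[0], -q[1], -q[2], -q[3])

def qabs(q):
    """Modulus |q| = sqrt(q_a^2 + q_b^2 + q_c^2 + q_d^2)."""
    return math.sqrt(sum(x * x for x in q))

def qinv(q):
    """Inverse q^{-1} = conj(q) / |q|^2 for q != 0."""
    m2 = sum(x * x for x in q)
    return tuple(x / m2 for x in qconj(q))
```

A quick check confirms the stated identities, e.g. \({\varvec{i}}{\varvec{j}}= {\varvec{k}}\ne {\varvec{j}}{\varvec{i}}\), \(\overline{pq} = {\overline{q}}\,{\overline{p}}\) and \((pq)^{-1} = q^{-1}p^{-1}\).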
2.2 Quaternion vectors and quaternion matrices
Similar to the scalar case, a quaternion vector \({\varvec{q}}\in {\mathbb {H}}^n\) can be written as
$$\begin{aligned} {\varvec{q}}= {\varvec{q}}_a + {\varvec{q}}_b{\varvec{i}}+ {\varvec{q}}_c{\varvec{j}}+ {\varvec{q}}_d{\varvec{k}}, \end{aligned}$$
where \({\varvec{q}}_a,{\varvec{q}}_b,{\varvec{q}}_c,{\varvec{q}}_d \in {\mathbb {R}}^n\) are the components of \({\varvec{q}}\). For a quaternion vector \({\varvec{q}}\), \({\varvec{q}}^{\top }\) denotes the transpose of \({\varvec{q}}\), and \({\varvec{q}}^{H}=(\overline{{\varvec{q}}})^{\top } = \overline{({\varvec{q}}^{\top })}\) denotes its conjugate transpose. Any quaternion matrix \(A\in {\mathbb {H}}^{m\times n}\) can be expressed as
$$\begin{aligned} A = A_a + A_b{\varvec{i}}+ A_c{\varvec{j}}+ A_d{\varvec{k}} \end{aligned}$$
with \(A_a,A_b,A_c,A_d \in {\mathbb {R}}^{m\times n}\). The transpose and the conjugate transpose of A are \(A^{\top }\) and \(A^H=({\overline{A}})^{\top } = \overline{(A^{\top })}\), respectively. For two quaternion matrices \(A \in {\mathbb {H}}^{m \times n}\) and \(B \in {\mathbb {H}}^{n \times p}\), we have \((AB)^H = B^HA^H\); however, due to the noncommutativity of the product operation, \((AB)^\top \not = B^\top A^\top \) and \(\overline{(AB)} \not = {\overline{A}} \ {\overline{B}}\). For two vectors \({\varvec{q}}, {\varvec{p}}\in {\mathbb {H}}^n\), their inner product is defined as
$$\begin{aligned} {\varvec{p}}\bullet {\varvec{q}}= \mathrm{Re~}({\varvec{p}}^H{\varvec{q}}). \end{aligned}$$
Similarly, for two matrices \(A ,B \in {\mathbb {H}}^{n \times n}\), their inner product is defined as
$$\begin{aligned} A \bullet B = \mathrm{Re~}\mathrm{tr}(A^HB), \end{aligned}$$
where ‘tr’ denotes the trace of a matrix.
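These matrix rules can be checked numerically with a componentwise encoding. The sketch below is ours (pure Python; the storage convention and helper names are assumptions, and we use the inner product \(A \bullet B = \mathrm{Re~}\mathrm{tr}(A^HB)\) as defined above):

```python
# A quaternion matrix A = A_a + A_b i + A_c j + A_d k stored as a 4-tuple
# (A_a, A_b, A_c, A_d) of real matrices (lists of rows).  Sketch only.

def mm(X, Y):                       # real matrix product
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def madd(*Ms):                      # entrywise sum of real matrices
    return [[sum(M[i][j] for M in Ms) for j in range(len(Ms[0][0]))]
            for i in range(len(Ms[0]))]

def neg(X):
    return [[-x for x in row] for row in X]

def tr(X):
    return sum(X[i][i] for i in range(len(X)))

def qmatmul(A, B):
    """Quaternion matrix product, expanding the Hamilton rules blockwise."""
    Aa, Ab, Ac, Ad = A
    Ba, Bb, Bc, Bd = B
    return (madd(mm(Aa, Ba), neg(mm(Ab, Bb)), neg(mm(Ac, Bc)), neg(mm(Ad, Bd))),
            madd(mm(Aa, Bb), mm(Ab, Ba), mm(Ac, Bd), neg(mm(Ad, Bc))),
            madd(mm(Aa, Bc), neg(mm(Ab, Bd)), mm(Ac, Ba), mm(Ad, Bb)),
            madd(mm(Aa, Bd), mm(Ab, Bc), neg(mm(Ac, Bb)), mm(Ad, Ba)))

def qH(A):
    """Conjugate transpose A^H = (A_a^T, -A_b^T, -A_c^T, -A_d^T)."""
    T = lambda X: [list(col) for col in zip(*X)]
    Aa, Ab, Ac, Ad = A
    return (T(Aa), neg(T(Ab)), neg(T(Ac)), neg(T(Ad)))

def qdot(A, B):
    """A . B = Re tr(A^H B); it reduces to the sum of entrywise
    componentwise products, since the i,j,k cross terms cancel in Re."""
    return sum(a * b
               for As, Bs in zip(A, B)
               for ra, rb in zip(As, Bs)
               for a, b in zip(ra, rb))
```

In particular, one can verify \((AB)^H = B^HA^H\) and that `qdot(A, B)` agrees with the trace of the real part of \(A^HB\).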
2.3 Hermitian and positive semidefinite matrices
We call X a quaternion Hermitian matrix if it satisfies \(X = X^H\). The cone of quaternion Hermitian matrices is denoted as \({\mathcal {H}}^{n}\). Then \({\varvec{u}}^HX{\varvec{u}}\) is real for all \({\varvec{u}}\in {\mathbb {H}}^n\) if and only if \(X \in {\mathcal {H}}^{n}\). We denote by \({\mathcal {S}}_{+}^{n}({\mathcal {S}}_{++}^{n})\), \({\mathcal {C}}_{+}^{n}({\mathcal {C}}_{++}^{n})\) and \({\mathcal {H}}_{+}^{n}({\mathcal {H}}_{++}^{n})\) the cones of real symmetric positive semidefinite (positive definite), complex Hermitian positive semidefinite (positive definite) and quaternion Hermitian positive semidefinite (positive definite) matrices, respectively. The notation \(X \succeq 0\) (\(X \succ 0\)) means that X is positive semidefinite (positive definite). Then, for any matrix \(X \succeq 0\) (\(X \succ 0\)), we have \({\varvec{u}}^HX{\varvec{u}}\) is real and nonnegative (positive) for all \({\varvec{u}}\in {\mathbb {H}}^n\) (\(0 \ne {\varvec{u}}\in {\mathbb {H}}^n\)).
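The "if" direction of the Hermitian characterization can be checked numerically: Hermitianize a random quaternion matrix and confirm that \({\varvec{u}}^HX{\varvec{u}}\) has vanishing imaginary part. This is an illustrative pure-Python sketch with our own 4-tuple quaternion encoding:

```python
import random

def qmul(p, q):
    """Hamilton product of two quaternions stored as 4-tuples."""
    pa, pb, pc, pd = p
    qa, qb, qc, qd = q
    return (pa*qa - pb*qb - pc*qc - pd*qd,
            pa*qb + pb*qa + pc*qd - pd*qc,
            pa*qc - pb*qd + pc*qa + pd*qb,
            pa*qd + pb*qc - pc*qb + pd*qa)

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def quad_form(X, u):
    """u^H X u as a quaternion, for an n x n array X of quaternions."""
    total = (0.0, 0.0, 0.0, 0.0)
    n = len(u)
    for i in range(n):
        for j in range(n):
            term = qmul(qconj(u[i]), qmul(X[i][j], u[j]))
            total = tuple(a + b for a, b in zip(total, term))
    return total

random.seed(3)
n = 3
rq = lambda: tuple(random.gauss(0, 1) for _ in range(4))
M = [[rq() for _ in range(n)] for _ in range(n)]
# Hermitianize: X[i][j] = (M[i][j] + conj(M[j][i])) / 2, so X = X^H.
X = [[tuple((a + b) / 2 for a, b in zip(M[i][j], qconj(M[j][i])))
      for j in range(n)] for i in range(n)]
u = [rq() for _ in range(n)]
val = quad_form(X, u)   # should be real, i.e. imaginary components ~ 0
```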
3 The quaternion matrix rank-one decomposition method
In this section, we discuss a particular rank-one decomposition of matrices over the quaternion domain such that the inner product between some given matrices and each rank-one term in the decomposition has the same value.
In the real domain, Sturm and Zhang [36] showed that for a rank-r symmetric positive semidefinite matrix \(Y \in {\mathcal {S}}_+^{n}\) and a real symmetric matrix \(B \in {\mathcal {S}}^{n}\), there is a rank-one decomposition of Y such that:
$$\begin{aligned} Y = \sum _{i=1}^r{\varvec{y}}_i{\varvec{y}}_i^{\top }, \qquad {\varvec{y}}_i^{\top }B{\varvec{y}}_i = \frac{B \bullet Y}{r}, \quad i = 1,\cdots ,r. \end{aligned}$$
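Over the reals, this one-matrix procedure is short enough to sketch in code. The pure-Python illustration below is ours, not the authors' implementation (the function names and the root-picking step are assumptions in the spirit of [36]): starting from any decomposition \(Y = \sum _i {\varvec{p}}_i{\varvec{p}}_i^{\top }\), it repeatedly merges a pair of vectors straddling the target value \(B \bullet Y / r\).

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def quad_form(B, u, v):
    """u^T B v for a symmetric matrix B given as a list of rows."""
    return sum(u[i] * dot(row, v) for i, row in enumerate(B))

def sturm_zhang(ps, B, tol=1e-12):
    """Equal-inner-product rank-one decomposition (real case, our sketch).

    Input: vectors ps with Y = sum_i p_i p_i^T, and a symmetric B.
    Output: vectors ys with the same sum and y_i^T B y_i = (B . Y)/r for all i.
    """
    r = len(ps)
    delta = sum(quad_form(B, p, p) for p in ps) / r   # target value B.Y / r
    done, ps = [], list(ps)
    while len(ps) > 1:
        vals = [quad_form(B, p, p) for p in ps]
        hi = max(range(len(ps)), key=vals.__getitem__)
        lo = min(range(len(ps)), key=vals.__getitem__)
        if vals[hi] - delta < tol:
            break                                     # all values already equal
        p1, p2 = ps[hi], ps[lo]
        a, c = vals[hi] - delta, vals[lo] - delta     # a > 0 > c
        b = quad_form(B, p1, p2)
        # Solve (p1 + g p2)^T B (p1 + g p2) = delta (1 + g^2), i.e.
        # c g^2 + 2 b g + a = 0; real roots exist since a*c < 0.
        g = (-b - math.sqrt(b * b - a * c)) / c
        s = math.sqrt(1 + g * g)
        v1 = [(x + g * y) / s for x, y in zip(p1, p2)]   # v1^T B v1 = delta
        v2 = [(y - g * x) / s for x, y in zip(p1, p2)]   # preserves the sum
        done.append(v1)
        ps = [p for i, p in enumerate(ps) if i not in (hi, lo)] + [v2]
    return done + ps
```

Each merge preserves \({\varvec{p}}_1{\varvec{p}}_1^{\top } + {\varvec{p}}_2{\varvec{p}}_2^{\top }\) and pins one vector to the common value, so the loop terminates after at most \(r-1\) merges.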
Subsequently, Huang and Zhang [20] proved that the above decomposition could hold for two matrices in the complex domain. In particular, suppose \(Z\in {\mathcal {C}}_+^n\) is a complex Hermitian positive semidefinite matrix of rank r, and \(C_1,C_2 \in {\mathcal {C}}^n\) are two given complex Hermitian matrices. Then, there is a rank-one decomposition of Z such that:
$$\begin{aligned} Z = \sum _{i=1}^r{\varvec{z}}_i{\varvec{z}}_i^H, \qquad {\varvec{z}}_i^HC_k{\varvec{z}}_i = \frac{C_k \bullet Z}{r}, \quad k = 1, 2, ~ i = 1,\cdots ,r. \end{aligned}$$
Thus, a natural question arises: can a similar decomposition hold for more matrices over the quaternion domain?
3.1 The main result
The following theorem is the main result of this paper and gives an affirmative answer to the question above.
Theorem 1
Let \(A_k \in {\mathcal {H}}^{n}\), \(k = 1, 2, 3, 4\), and let \(X \succeq 0\) be a rank-r quaternion Hermitian matrix. There exists a rank-one decomposition of X such that
$$\begin{aligned} X = \sum _{i=1}^r{\varvec{x}}_i{\varvec{x}}_i^H, \qquad {\varvec{x}}_i^HA_k{\varvec{x}}_i = \frac{A_k \bullet X}{r}, \quad i = 1,\cdots ,r, \end{aligned}$$
(2)
and \(k = 1, 2, 3, 4\).
Proof
We start with a weaker version of (2) such that it holds only for two matrices \(A_1\) and \(A_2\), i.e., there exists a rank-one decomposition of X such that:
$$\begin{aligned} X = \sum _{i=1}^r{\varvec{u}}_i{\varvec{u}}_i^H, \qquad {\varvec{u}}_i^HA_k{\varvec{u}}_i = \frac{A_k \bullet X}{r}, \quad k = 1, 2, ~ i = 1,\cdots ,r, \end{aligned}$$
(3)
following an argument similar to that of Theorem 2.1 in [20]. The rest of the proof consists of two steps.
Step 1: prove (2) for \(k=1,2,3\)
The conclusion follows if the decomposition in (3) is also valid for matrix \(A_3\), i.e., \({\varvec{u}}_i^HA_3{\varvec{u}}_i = \frac{A_3 \bullet X}{r}\), \(i = 1,\cdots ,r\). Otherwise, without loss of generality, there exist two vectors \({\varvec{u}}_1\) and \({\varvec{u}}_2\) such that
$$\begin{aligned} {\varvec{u}}_1^HA_3{\varvec{u}}_1> \frac{A_3 \bullet X}{r} \quad \text {and} \quad {\varvec{u}}_2^HA_3{\varvec{u}}_2 < \frac{A_3 \bullet X}{r}. \end{aligned}$$
Denote \({\varvec{u}}_1^HA_1{\varvec{u}}_2 = a_1 + b_1{\varvec{i}}+ c_1{\varvec{j}}+ d_1{\varvec{k}}\) and \({\varvec{u}}_1^HA_2{\varvec{u}}_2 = a_2 + b_2{\varvec{i}}+ c_2{\varvec{j}}+ d_2{\varvec{k}}\). Let \(\omega = \omega _a + \omega _b{\varvec{i}}+ \omega _c{\varvec{j}}+ \omega _d{\varvec{k}}\in {\mathbb {H}}\) and construct
$$\begin{aligned} {\varvec{v}}_1 = \frac{{\varvec{u}}_1\omega + {\varvec{u}}_2}{\sqrt{1+|\omega |^2}}, \qquad {\varvec{v}}_2 = \frac{{\varvec{u}}_1 - {\varvec{u}}_2{\overline{\omega }}}{\sqrt{1+|\omega |^2}}. \end{aligned}$$
Then, it is easy to verify that
$$\begin{aligned} {\varvec{v}}_1{\varvec{v}}_1^H + {\varvec{v}}_2{\varvec{v}}_2^H = {\varvec{u}}_1{\varvec{u}}_1^H + {\varvec{u}}_2{\varvec{u}}_2^H. \end{aligned}$$
(4)
Note that the identity above holds for any \(\omega \), which will allow us some flexibility to find an appropriate \(\omega \) such that
$$\begin{aligned} {\varvec{v}}_1^HA_k{\varvec{v}}_1 = \frac{A_k \bullet X}{r}, \quad k = 1, 2. \end{aligned}$$
(5)
In particular, observe that
$$\begin{aligned} {\varvec{v}}_1^HA_k{\varvec{v}}_1 = \frac{|\omega |^2\,{\varvec{u}}_1^HA_k{\varvec{u}}_1 + 2\mathrm{Re~}({\overline{\omega }}{\varvec{u}}_1^HA_k{\varvec{u}}_2) + {\varvec{u}}_2^HA_k{\varvec{u}}_2}{1+|\omega |^2} = \frac{A_k \bullet X}{r}, \quad k = 1, 2, \end{aligned}$$
as long as
$$\begin{aligned} \mathrm{Re~}({\overline{\omega }}{\varvec{u}}_1^HA_k{\varvec{u}}_2) = 0, \quad k = 1, 2. \end{aligned}$$
Therefore, any solution of the following equations
$$\begin{aligned} \left\{ \begin{array}{l} a_1\omega _a + b_1\omega _b + c_1\omega _c + d_1\omega _d = 0, \\ a_2\omega _a + b_2\omega _b + c_2\omega _c + d_2\omega _d = 0 \end{array} \right. \end{aligned}$$
(8)
will make (5) valid. Moreover, for a particular solution \((\omega _{a*}, \omega _{b*}, \omega _{c*}, \omega _{d*})\) of (8) with \(\omega _{a*}^2 + \omega _{b*}^2 + \omega _{c*}^2+ \omega _{d*}^2 =1\), \(\omega (\alpha ) = \alpha (\omega _{a*}+ \omega _{b*}{\varvec{i}}+ \omega _{c*}{\varvec{j}}+ \omega _{d*}{\varvec{k}})\) is also a solution of (8) with \( |\omega (\alpha )|=\alpha \) for any \(\alpha > 0\), as (8) is homogeneous. To extend the validity of (5) to matrix \(A_3\) for some \(\alpha \), i.e.,
$$\begin{aligned} {\varvec{v}}_1^HA_3{\varvec{v}}_1 = \frac{A_3 \bullet X}{r}, \end{aligned}$$
(9)
we compute
$$\begin{aligned} {\varvec{v}}_1^HA_3{\varvec{v}}_1 = \frac{\alpha ^2{\varvec{u}}_1^HA_3{\varvec{u}}_1 + 2\mathrm{Re~}(\overline{\omega (\alpha )}{\varvec{u}}_1^HA_3{\varvec{u}}_2) + {\varvec{u}}_2^HA_3{\varvec{u}}_2}{1+\alpha ^2}. \end{aligned}$$
(10)
Combining (10) and \(\mathrm{Re~}(\overline{\omega (\alpha )}{\varvec{u}}_1^HA_3{\varvec{u}}_2)=\alpha \mathrm{Re~}(\overline{\omega ^*}{\varvec{u}}_1^HA_3{\varvec{u}}_2)\), identity (9) that we want to prove is equivalent to
$$\begin{aligned} \left( {\varvec{u}}_1^HA_3{\varvec{u}}_1 - \frac{A_3 \bullet X}{r}\right) \alpha ^2 + 2\iota \alpha + \left( {\varvec{u}}_2^HA_3{\varvec{u}}_2 - \frac{A_3 \bullet X}{r}\right) = 0, \end{aligned}$$
(11)
where \(\iota = \mathrm{Re~}(\overline{\omega ^*}{\varvec{u}}_1^HA_3{\varvec{u}}_2)\) and \(\omega ^*=\omega _{a*}+\omega _{b*}{\varvec{i}}+\omega _{c*}{\varvec{j}}+\omega _{d*}{{\varvec{k}}}\). Then (11) is just a quadratic equation in \(\alpha \), and it must have two real roots with opposite signs since \({\varvec{u}}_1^HA_3{\varvec{u}}_1 > \frac{A_3 \bullet X}{r}\) and \({\varvec{u}}_2^HA_3{\varvec{u}}_2 < \frac{A_3 \bullet X}{r}\). Let \(\alpha _*\) be the positive root of (11) and construct \(\omega _* = \alpha _*(\omega _{a*} + \omega _{b*}{\varvec{i}}+ \omega _{c*}{\varvec{j}}+ \omega _{d*}{\varvec{k}})\). Then (9) holds with \(\alpha = \alpha _*\), and thus \({\varvec{v}}_1^HA_k{\varvec{v}}_1 = \frac{A_k \bullet X}{r}\) for \(k = 1, 2, 3\). Since \(X = \sum _{i=1}^r{\varvec{u}}_i{\varvec{u}}_i^H\), we have
$$\begin{aligned} X - {\varvec{v}}_1{\varvec{v}}_1^H = {\varvec{v}}_2{\varvec{v}}_2^H + \sum _{i=3}^r{\varvec{u}}_i{\varvec{u}}_i^H \succeq 0. \end{aligned}$$
Therefore, \(\mathrm{rank}(X - {\varvec{v}}_1{\varvec{v}}_1^H) = r -1\), \({\varvec{u}}_i^HA_k{\varvec{u}}_i \overset{(3)}{=} \frac{A_k \bullet X}{r}\), \(i = 3,\cdots ,r\) for \(k = 1, 2\), and
$$\begin{aligned} A_k \bullet (X - {\varvec{v}}_1{\varvec{v}}_1^H) = \frac{(r-1)(A_k \bullet X)}{r}, \quad k = 1, 2, 3. \end{aligned}$$
Then, we can recursively repeat the above process on \(X - {\varvec{v}}_1{\varvec{v}}_1^H\) and finally obtain a rank-one decomposition of X such that
$$\begin{aligned} X = \sum _{i=1}^r{\varvec{v}}_i{\varvec{v}}_i^H, \qquad {\varvec{v}}_i^HA_k{\varvec{v}}_i = \frac{A_k \bullet X}{r}, \quad k = 1, 2, 3, ~ i = 1,\cdots ,r. \end{aligned}$$
(12)
Step 2: prove (2) for \(k=1,2,3,4\)
The conclusion follows if the decomposition in (12) is also valid for matrix \(A_4\), i.e., \({\varvec{v}}_i^HA_4{\varvec{v}}_i = \frac{A_4 \bullet X}{r}\), \(i = 1,\cdots ,r\). Otherwise, without loss of generality, there exist two vectors \({\varvec{v}}_1\) and \({\varvec{v}}_2\) such that
$$\begin{aligned} {\varvec{v}}_1^HA_4{\varvec{v}}_1> \frac{A_4 \bullet X}{r} \quad \text {and} \quad {\varvec{v}}_2^HA_4{\varvec{v}}_2 < \frac{A_4 \bullet X}{r}. \end{aligned}$$
Denote \({\varvec{v}}_1^HA_1{\varvec{v}}_2 = {\hat{a}}_1 + {\hat{b}}_1{\varvec{i}}+ {\hat{c}}_1{\varvec{j}}+ {\hat{d}}_1{\varvec{k}}\), \({\varvec{v}}_1^HA_2{\varvec{v}}_2 = {\hat{a}}_2 + {\hat{b}}_2{\varvec{i}}+ {\hat{c}}_2{\varvec{j}}+ {\hat{d}}_2{\varvec{k}}\) and \({\varvec{v}}_1^HA_3{\varvec{v}}_2 = {\hat{a}}_3 + {\hat{b}}_3{\varvec{i}}+ {\hat{c}}_3{\varvec{j}}+ {\hat{d}}_3{\varvec{k}}\). Let \({\hat{\omega }} = {\hat{\omega }}_a + {\hat{\omega }}_b{\varvec{i}}+ {\hat{\omega }}_c{\varvec{j}}+ {\hat{\omega }}_d{\varvec{k}}\in {\mathbb {H}}\) and construct
$$\begin{aligned} {\varvec{x}}_1 = \frac{{\varvec{v}}_1{\hat{\omega }} + {\varvec{v}}_2}{\sqrt{1+|{\hat{\omega }}|^2}}, \qquad {\varvec{x}}_2 = \frac{{\varvec{v}}_1 - {\varvec{v}}_2\overline{{\hat{\omega }}}}{\sqrt{1+|{\hat{\omega }}|^2}}. \end{aligned}$$
Then, similar to (4), we can verify that
$$\begin{aligned} {\varvec{x}}_1{\varvec{x}}_1^H + {\varvec{x}}_2{\varvec{x}}_2^H = {\varvec{v}}_1{\varvec{v}}_1^H + {\varvec{v}}_2{\varvec{v}}_2^H \end{aligned}$$
holds for any \({\hat{\omega }}\). Moreover, when \(\mathrm{Re~}(\overline{{\hat{\omega }}}{\varvec{v}}_1^HA_k{\varvec{v}}_2) = 0\), or equivalently \({\hat{a}}_k{\hat{\omega }}_a + {\hat{b}}_k{\hat{\omega }}_b + {\hat{c}}_k{\hat{\omega }}_c + {\hat{d}}_k{\hat{\omega }}_d = 0\), it holds that
$$\begin{aligned} {\varvec{x}}_1^HA_k{\varvec{x}}_1 = \frac{A_k \bullet X}{r}, \quad k = 1, 2, 3. \end{aligned}$$
(14)
In other words, any solution of the following equations
$$\begin{aligned} \left\{ \begin{array}{l} {\hat{a}}_1{\hat{\omega }}_a + {\hat{b}}_1{\hat{\omega }}_b + {\hat{c}}_1{\hat{\omega }}_c + {\hat{d}}_1{\hat{\omega }}_d = 0, \\ {\hat{a}}_2{\hat{\omega }}_a + {\hat{b}}_2{\hat{\omega }}_b + {\hat{c}}_2{\hat{\omega }}_c + {\hat{d}}_2{\hat{\omega }}_d = 0, \\ {\hat{a}}_3{\hat{\omega }}_a + {\hat{b}}_3{\hat{\omega }}_b + {\hat{c}}_3{\hat{\omega }}_c + {\hat{d}}_3{\hat{\omega }}_d = 0 \end{array} \right. \end{aligned}$$
(15)
will make (14) valid. Since (15) is homogeneous and has three equations in four variables, we can find a particular solution \(({\hat{\omega }}_{a*}, {\hat{\omega }}_{b*}, {\hat{\omega }}_{c*}, {\hat{\omega }}_{d*})\) of (15) with \({\hat{\omega }}_{a*}^2 + {\hat{\omega }}_{b*}^2 + {\hat{\omega }}_{c*}^2+ {\hat{\omega }}_{d*}^2 =1\) such that \({\hat{\omega }}({\hat{\alpha }}) = {\hat{\alpha }}({\hat{\omega }}_{a*}+ {\hat{\omega }}_{b*}{\varvec{i}}+ {\hat{\omega }}_{c*}{\varvec{j}}+ {\hat{\omega }}_{d*}{\varvec{k}})\) is also a solution of (15) with \( |{\hat{\omega }}({\hat{\alpha }})|={\hat{\alpha }}\) for any \({\hat{\alpha }} > 0\). To extend the validity of (14) to matrix \(A_4\) for some \({\hat{\alpha }}\), i.e.,
$$\begin{aligned} {\varvec{x}}_1^HA_4{\varvec{x}}_1 = \frac{A_4 \bullet X}{r}, \end{aligned}$$
(16)
we compute
$$\begin{aligned} {\varvec{x}}_1^HA_4{\varvec{x}}_1 = \frac{{\hat{\alpha }}^2{\varvec{v}}_1^HA_4{\varvec{v}}_1 + 2\mathrm{Re~}(\overline{{\hat{\omega }}({\hat{\alpha }})}{\varvec{v}}_1^HA_4{\varvec{v}}_2) + {\varvec{v}}_2^HA_4{\varvec{v}}_2}{1+{\hat{\alpha }}^2}. \end{aligned}$$
(17)
Combining \(\mathrm{Re~}(\overline{{\hat{\omega }}({\hat{\alpha }})}{\varvec{v}}_1^HA_4{\varvec{v}}_2)={\hat{\alpha }}\mathrm{Re~}(\overline{{\hat{\omega }}^*}{\varvec{v}}_1^HA_4{\varvec{v}}_2)\) with (17), identity (16) that we want to prove is equivalent to the following real quadratic equation in terms of \({\hat{\alpha }}\):
$$\begin{aligned} \left( {\varvec{v}}_1^HA_4{\varvec{v}}_1 - \frac{A_4 \bullet X}{r}\right) {\hat{\alpha }}^2 + 2{\hat{\iota }}{\hat{\alpha }} + \left( {\varvec{v}}_2^HA_4{\varvec{v}}_2 - \frac{A_4 \bullet X}{r}\right) = 0, \end{aligned}$$
(18)
where \({\hat{\iota }}=\mathrm{Re~}(\overline{{\hat{\omega }}^*}{\varvec{v}}_1^HA_4{\varvec{v}}_2)\) is independent of \({{\hat{\alpha }}}\), as \({\hat{\omega }}^*={\hat{\omega }}_{a*} + {\hat{\omega }}_{b*}{\varvec{i}}+ {\hat{\omega }}_{c*}{\varvec{j}}+ {\hat{\omega }}_{d*}{\varvec{k}}\) and \({\varvec{v}}_1^HA_4{\varvec{v}}_2\) are both independent of \({{\hat{\alpha }}}\). Note that (18) must have two real roots with opposite signs since \({\varvec{v}}_1^HA_4{\varvec{v}}_1 > \frac{A_4 \bullet X}{r}\) and \({\varvec{v}}_2^HA_4{\varvec{v}}_2 < \frac{A_4 \bullet X}{r}\). Let \({\hat{\alpha }}_*\) be the positive root of (18) and construct \({\hat{\omega }}_* = {\hat{\alpha }}_*({\hat{\omega }}_{a*} + {\hat{\omega }}_{b*}{\varvec{i}}+ {\hat{\omega }}_{c*}{\varvec{j}}+ {\hat{\omega }}_{d*}{\varvec{k}})\). Then (16) holds with \({\hat{\alpha }} = {\hat{\alpha }}_*\), and thus \({\varvec{x}}_1^HA_k{\varvec{x}}_1 = \frac{A_k \bullet X}{r}\) for \(k = 1, 2, 3, 4\). Recalling from (12) that \(X = \sum _{i=1}^r{\varvec{v}}_i{\varvec{v}}_i^H\), we have
$$\begin{aligned} X - {\varvec{x}}_1{\varvec{x}}_1^H = {\varvec{x}}_2{\varvec{x}}_2^H + \sum _{i=3}^r{\varvec{v}}_i{\varvec{v}}_i^H \succeq 0. \end{aligned}$$
Therefore, rank\((X - {\varvec{x}}_1{\varvec{x}}_1^H) = r -1\), \( {\varvec{v}}_i^HA_k{\varvec{v}}_i \overset{(12)}{=} \frac{A_k \bullet X}{r}\), \(i = 3,\cdots ,r\) for \(k = 1, 2, 3\), and
$$\begin{aligned} A_k \bullet (X - {\varvec{x}}_1{\varvec{x}}_1^H) = \frac{(r-1)(A_k \bullet X)}{r}, \quad k = 1, 2, 3, 4. \end{aligned}$$
Repeating the above process recursively on \(X - {\varvec{x}}_1{\varvec{x}}_1^H\), we will finally obtain a rank-one decomposition of X such that (2) holds for \(k=1,2,3,4\). \(\square \)
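The invariance underlying both steps, namely that the constructed pair of vectors preserves the sum of the two original rank-one terms for any \(\omega \), can be verified numerically. The explicit construction used below, \({\varvec{v}}_1 = ({\varvec{u}}_1\omega + {\varvec{u}}_2)/\sqrt{1+|\omega |^2}\) and \({\varvec{v}}_2 = ({\varvec{u}}_1 - {\varvec{u}}_2{\overline{\omega }})/\sqrt{1+|\omega |^2}\), is our reconstruction of one standard choice, so treat this pure-Python sketch (4-tuple quaternions, helper names ours) as illustrative:

```python
import math
import random

def qmul(p, q):
    """Hamilton product of 4-tuple quaternions."""
    pa, pb, pc, pd = p
    qa, qb, qc, qd = q
    return (pa*qa - pb*qb - pc*qc - pd*qd,
            pa*qb + pb*qa + pc*qd - pd*qc,
            pa*qc - pb*qd + pc*qa + pd*qb,
            pa*qd + pb*qc - pc*qb + pd*qa)

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def scale(v, s):          # multiply each entry by a real scalar s
    return [tuple(x * s for x in q) for q in v]

def rmul(v, w):           # right-multiply a quaternion vector entrywise by w
    return [qmul(q, w) for q in v]

def vadd(u, v):
    return [tuple(a + b for a, b in zip(p, q)) for p, q in zip(u, v)]

def vsub(u, v):
    return [tuple(a - b for a, b in zip(p, q)) for p, q in zip(u, v)]

def outer(u, v):          # u v^H as an n x n array of quaternions
    return [[qmul(p, qconj(q)) for q in v] for p in u]

random.seed(7)
n = 3
rq = lambda: tuple(random.gauss(0, 1) for _ in range(4))
u1 = [rq() for _ in range(n)]
u2 = [rq() for _ in range(n)]
omega = rq()
s = 1 / math.sqrt(1 + sum(x * x for x in omega))
v1 = scale(vadd(rmul(u1, omega), u2), s)
v2 = scale(vsub(u1, rmul(u2, qconj(omega))), s)
# Claim to check: v1 v1^H + v2 v2^H == u1 u1^H + u2 u2^H for this omega.
```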
3.2 The algorithm and numerical verification
Note that the rank-one decomposition procedure we provide in the proof of Theorem 1 is actually implementable. To make it clear, we summarize all the computational procedures in Algorithm 1 below. In particular, Step 1 and Step 2 in the proof are described by Algorithm 1 with \(\ell =3\) and \(\ell =4\), respectively.
Next, we verify Theorem 1 by implementing Algorithm 1 and decomposing some randomly generated quaternion Hermitian positive semidefinite matrices. The details of generating those matrices are explained below.
1. Randomly generate four real \(n\times n\) matrices \(A_a, A_b, A_c, A_d\) with each component following the standard normal distribution. Construct the quaternion matrix \(A=A_a+A_b{\varvec{i}}+A_c{\varvec{j}}+A_d{\varvec{k}}\).
2. Construct a \(4n\times 4n\) real matrix
$$\begin{aligned} A_R=\begin{bmatrix} A_a & -A_b & -A_c & -A_d \\ A_b & A_a & -A_d & A_c \\ A_c & A_d & A_a & -A_b \\ A_d & -A_c & A_b & A_a \end{bmatrix}; \end{aligned}$$
3. Compute \(X_R=A_RA_R^{\top }\); then \(X_R\) can be written in the form
$$\begin{aligned} X_R=\begin{bmatrix} X_a & -X_b & -X_c & -X_d \\ X_b & X_a & -X_d & X_c \\ X_c & X_d & X_a & -X_b \\ X_d & -X_c & X_b & X_a \end{bmatrix}; \end{aligned}$$
4. Construct \(X=X_a+X_b{\varvec{i}}+X_c{\varvec{j}}+X_d{\varvec{k}}\). Then, X is a quaternion Hermitian positive semidefinite matrix as its real representation \(X_R\) [22, 24, 34] is positive semidefinite.
The procedure of randomly generating a quaternion Hermitian matrix is the same as above, except that in step 3, we let \(B_R = A_R+A_R^{\top }\) and write it as
$$\begin{aligned} B_R=\begin{bmatrix} B_a & -B_b & -B_c & -B_d \\ B_b & B_a & -B_d & B_c \\ B_c & B_d & B_a & -B_b \\ B_d & -B_c & B_b & B_a \end{bmatrix}. \end{aligned}$$
Then \(B=B_a+B_b{\varvec{i}}+B_c{\varvec{j}}+B_d{\varvec{k}}\) is a quaternion Hermitian matrix as the real representation \(B_R\) [22, 24, 34] is symmetric.
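The generation procedure above can be sketched directly (pure Python; helper names are ours). The snippet also checks the structural consequences stated in the text: the block \(X_a\) is symmetric, the blocks \(X_b, X_c, X_d\) are skew-symmetric (so X is Hermitian), and \(X_R\) is positive semidefinite.

```python
import random

random.seed(0)
n = 3

def rnd():
    return [[random.gauss(0, 1) for _ in range(n)] for _ in range(n)]

def neg(X):
    return [[-v for v in row] for row in X]

def hstack(Ms):                       # concatenate blocks side by side
    return [sum((M[i] for M in Ms), []) for i in range(len(Ms[0]))]

def blockmat(rows):                   # assemble a block matrix
    out = []
    for row in rows:
        out += hstack(row)
    return out

def matmul(X, Y):
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def T(X):
    return [list(c) for c in zip(*X)]

# Step 1-2: random components and the 4n x 4n real representation A_R.
Aa, Ab, Ac, Ad = rnd(), rnd(), rnd(), rnd()
A_R = blockmat([[Aa, neg(Ab), neg(Ac), neg(Ad)],
                [Ab, Aa,      neg(Ad), Ac     ],
                [Ac, Ad,      Aa,      neg(Ab)],
                [Ad, neg(Ac), Ab,      Aa     ]])
# Step 3: X_R = A_R A_R^T, which inherits the same block structure.
X_R = matmul(A_R, T(A_R))
# Step 4: read off the components of X = X_a + X_b i + X_c j + X_d k.
block = lambda I, J: [row[J*n:(J+1)*n] for row in X_R[I*n:(I+1)*n]]
Xa, Xb, Xc, Xd = block(0, 0), block(1, 0), block(2, 0), block(3, 0)
```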
Now, we start to verify Theorem 1 with a toy example. In particular, we apply the above procedures to randomly generate a \(2\times 2\) quaternion Hermitian positive semidefinite matrix of rank 2:
and randomly generate four quaternion Hermitian matrices:
Then, we apply Algorithm 1 to perform the quaternion matrix rank-one decomposition with accuracy \(\epsilon = 10^{-8}\), and obtain:
We compute the decomposition residuals
$$\begin{aligned} \left| {\varvec{x}}_i^HA_k{\varvec{x}}_i - \frac{A_k \bullet X}{r}\right| , \quad i = 1,\cdots ,r, ~ k = 1, 2, 3, 4, \end{aligned}$$
(19)
and obtain
Thus, we can see that Theorem 1 is numerically verified as the residuals are very small.
To further verify Theorem 1 for the more general case, we randomly generate 600 instances. In each instance, matrices \(X, A_1, A_2, A_3, A_4\) are generated according to the above procedure. The size of those matrices is also randomly generated under the following two settings:
where \(\xi \) is uniformly drawn from [0, 1]. In particular, half of the 600 instances are generated under setting (1) and the other half under setting (2). After applying Algorithm 1, the performance on each instance is measured by the mean residual, i.e., the average of the 4r terms defined in (19). We plot the performance of our algorithm on the 600 random instances in Fig. 1 and Fig. 2, where the red line marks the mean error, i.e., the average of the mean residuals over all instances. As we can see, Algorithm 1 is remarkably stable and accurate, as the average of the mean residuals is on the order of \(10^{-13}\), which further numerically verifies Theorem 1.
4 The joint numerical range
The numerical range is an important tool in linear algebra and convex analysis, with wide applications in spectrum analysis [33], quantum computing [13, 35], engineering, etc. The joint numerical range was first proposed in 1979 as an extension of the numerical range [4]; it mainly concerns geometric properties, such as convexity, of the joint field of values of several matrices. The joint numerical range has applications as wide as those of the numerical range, and, theoretically, it is closely connected to other fundamental results such as the \({\mathcal {S}}\)-Lemma, which will be discussed in the next section. The results on joint numerical ranges have also been generalized in various ways [1, 8, 23, 37]. In this section, we show how to further extend the classical results on the convexity of joint numerical ranges [4, 28] to the quaternion domain via the rank-one decomposition of quaternion matrices studied in the last section. We start with some basic concepts that will be used later.
Definition 1
Let A be an \(n \times n\) quaternion matrix. The numerical range of A is given by
$$\begin{aligned} {\mathcal {F}}(A) = \left\{ {\varvec{x}}^HA{\varvec{x}}~:~ {\varvec{x}}\in {\mathbb {H}}^n, ~ {\varvec{x}}^H{\varvec{x}}= 1\right\} . \end{aligned}$$
With this concept in hand, we formally define the joint numerical range of a set of matrices as follows.
Definition 2
The \(joint \ numerical \ range\) of \(n \times n\) quaternion matrices \(A_1, \cdots , A_m\) is defined to be
$$\begin{aligned} {\mathcal {F}}(A_1, \cdots , A_m) = \left\{ ({\varvec{x}}^HA_1{\varvec{x}}, \cdots , {\varvec{x}}^HA_m{\varvec{x}}) ~:~ {\varvec{x}}\in {\mathbb {H}}^n, ~ {\varvec{x}}^H{\varvec{x}}= 1\right\} . \end{aligned}$$
Regarding the convexity of the joint numerical ranges, Hausdorff [18] showed that
“ If \(A_1\) and \(A_2\) are complex Hermitian, then \({\mathcal {F}}(A_1, A_2)\) is a convex set.”
In a slightly different form, Brickman [8] extended the above result to three matrices, i.e., suppose \(A_1, A_2, A_3\) are \(n \times n\) complex Hermitian matrices, then
$$\begin{aligned} \left\{ ({\varvec{x}}^HA_1{\varvec{x}}, {\varvec{x}}^HA_2{\varvec{x}}, {\varvec{x}}^HA_3{\varvec{x}}) ~:~ {\varvec{x}}\in {\mathbb {C}}^n\right\} \end{aligned}$$
(20)
is a convex set, where \({\mathbb {C}}^n\) is the set of complex vectors.
As a matter of fact, the convexity of the joint numerical ranges in complex domain can also be extended to quaternion domain for more matrices. In particular, Au-Yeung and Poon [4] established the following result for quaternion Hermitian matrices.
Theorem 2
(Au-Yeung and Poon [4]) If \(n \not = 2\) and \(A_1, \cdots , A_5 \in {\mathcal {H}}^{n}\) are quaternion Hermitian matrices, then
$$\begin{aligned} \left\{ ({\varvec{x}}^HA_1{\varvec{x}}, \cdots , {\varvec{x}}^HA_5{\varvec{x}}) ~:~ {\varvec{x}}\in {\mathbb {H}}^n, ~ {\varvec{x}}^H{\varvec{x}}= 1\right\} \end{aligned}$$
is a convex set.
Note that the above theorem assumes \(n \not = 2\), and this condition is necessary, as a counterexample is presented in [34]. However, this condition can be relaxed if we consider the slightly different form of (20) by Brickman, where the restriction \({\varvec{x}}^H{\varvec{x}}=1\) is dropped. We present this result in Theorem 3, which we prove via our quaternion matrix rank-one decomposition theorem.
Theorem 3
Suppose that \(A_k \in {\mathcal {H}}^{n}\), \(k = 1, \cdots , 5\). Then
$$\begin{aligned} \left\{ ({\varvec{x}}^HA_1{\varvec{x}}, \cdots , {\varvec{x}}^HA_5{\varvec{x}}) ~:~ {\varvec{x}}\in {\mathbb {H}}^n\right\} \end{aligned}$$
is a convex set.
Proof
To complete the proof, it suffices to show
$$\begin{aligned} \left\{ \begin{pmatrix} {\varvec{x}}^HA_1{\varvec{x}}\\ \cdots \\ {\varvec{x}}^HA_5{\varvec{x}}\end{pmatrix} ~:~ {\varvec{x}}\in {\mathbb {H}}^n \right\} = \left\{ \begin{pmatrix} A_1 \bullet X \\ \cdots \\ A_5 \bullet X \end{pmatrix} ~:~ X \in {\mathcal {H}}_+^{n} \right\} , \end{aligned}$$
as the right-hand side is a convex set. Noting that the right-hand side is the convex hull of the left-hand side, we only need to show the right-hand side is included by the left-hand side.
Take any nonzero vector
$$\begin{aligned} \begin{pmatrix} v_1 \\ v_2 \\ \cdots \\ v_5 \end{pmatrix} = \begin{pmatrix} A_1 \bullet X \\ A_2 \bullet X \\ \cdots \\ A_5 \bullet X \end{pmatrix}, \quad X \in {\mathcal {H}}_+^{n}. \end{aligned}$$
Without loss of generality, suppose \(v_5 \not = 0\); then \((A_k - \frac{v_k}{v_5}A_5) \bullet X = 0\) for \(k = 1, 2, 3, 4\). By Theorem 1, there exists a rank-one decomposition \(X = \sum _{i=1}^r{\varvec{x}}_i{\varvec{x}}_i^H\) such that
$$\begin{aligned} {\varvec{x}}_i^H\left( A_k - \frac{v_k}{v_5}A_5\right) {\varvec{x}}_i = 0, \quad k = 1, 2, 3, 4, \end{aligned}$$
(23)
for \(i = 1, \cdots , r\). Since \(v_5 = A_5 \bullet X = \sum _{i=1}^r{\varvec{x}}_i^HA_5{\varvec{x}}_i\), there is at least one vector, say \({\varvec{x}}_1\), such that \({\varvec{x}}_1^HA_5{\varvec{x}}_1\) shares the same sign as \(v_5\). Let \(\rho = \sqrt{\frac{v_5}{{\varvec{x}}_1^HA_5{\varvec{x}}_1}}\) and \({\varvec{x}}= \rho {\varvec{x}}_1\); then we have
$$\begin{aligned} {\varvec{x}}^HA_5{\varvec{x}}= \rho ^2\,{\varvec{x}}_1^HA_5{\varvec{x}}_1 = v_5, \end{aligned}$$
and (23) also holds for \({\varvec{x}}\). Plugging \({\varvec{x}}\) and \({\varvec{x}}^HA_5{\varvec{x}}= v_5\) into (23), we have
$$\begin{aligned} {\varvec{x}}^HA_k{\varvec{x}}= \frac{v_k}{v_5}\,{\varvec{x}}^HA_5{\varvec{x}}= v_k \end{aligned}$$
for \(k = 1, 2, 3, 4\). Hence the vector \( \begin{pmatrix} v_1 \\ v_2 \\ \cdots \\ v_5 \end{pmatrix} \in \left\{ \begin{pmatrix} {\varvec{x}}^HA_1{\varvec{x}}\\ {\varvec{x}}^HA_2{\varvec{x}}\\ \cdots \\ {\varvec{x}}^HA_5{\varvec{x}}\end{pmatrix} ~:~ {\varvec{x}}\in {\mathbb {H}}^n \right\} \) and the conclusion follows. \(\square \)
5 S-Procedure
It is well known that the \({\mathcal {S}}\)-procedure plays an important role in robust optimization [2, 3, 17], statistics [19], signal processing [26], control and stability problems [5, 6], among others. In this section, we discuss another interesting theoretical implication of our new rank-one decomposition (Theorem 1) in the \({\mathcal {S}}\)-procedure. We first recall the following lemma, due to [40], about the \({\mathcal {S}}\)-procedure in the real domain.
Lemma 1
(\({\mathcal {S}}\)-Procedure) Let \(F({\varvec{x}})={\varvec{x}}^{\top }A_0{\varvec{x}}+ 2{\varvec{b}}_0^{\top }{\varvec{x}}+ c_0\) and \(G_i({\varvec{x}}) = {\varvec{x}}^{\top }A_i{\varvec{x}}+ 2{\varvec{b}}_i^{\top }{\varvec{x}}+ c_i\), \(i=1,\ldots ,m\), be quadratic functions of \({\varvec{x}}\in {\mathbb {R}}^n\). Then \(F({\varvec{x}}) \ge 0\) for all \({\varvec{x}}\) such that \(G_i({\varvec{x}})\ge 0\), \(i=1,\ldots ,m\), if there exist \(\tau _i \ge 0 \) such that
$$\begin{aligned} F({\varvec{x}}) - \sum _{i=1}^m\tau _iG_i({\varvec{x}}) \ge 0, \quad \forall \, {\varvec{x}}\in {\mathbb {R}}^n. \end{aligned}$$
Moreover, if \(m=1\) then the converse holds if there exists \({\varvec{x}}_0\) such that \(G_1({\varvec{x}}_0)>0\).
We remark that the above result essentially studies the relationship between
$$\begin{aligned} G_i({\varvec{x}}) \ge 0, ~ i = 1,\ldots ,m \Longrightarrow F({\varvec{x}}) \ge 0 \end{aligned}$$
(24)
and
$$\begin{aligned} \exists \, \tau _i \ge 0, ~ i = 1,\ldots ,m, \ \text {such that} \ F({\varvec{x}}) - \sum _{i=1}^m\tau _iG_i({\varvec{x}}) \ge 0, \quad \forall \, {\varvec{x}}. \end{aligned}$$
(25)
It is obvious that (25) implies (24), and the converse also holds when \(m=1\). In the following, we say that the \({\mathcal {S}}\)-procedure is lossless if (24) and (25) are equivalent. Although the \({\mathcal {S}}\)-procedure was first studied in the real domain, interestingly, a stronger result has been established in the complex domain.
Lemma 2
(Yakubovich [40]) Suppose \(F, G_1, G_2\) are Hermitian forms and satisfy (24). Moreover, there is \({\varvec{x}}_0 \in {\mathbb {C}}^n\) such that \(G_i({\varvec{x}}_0) > 0, i = 1,2\). Then the S-procedure is lossless for \(m = 2\).
Inspired by such result, we further consider \({\mathcal {S}}\)-procedure in the quaternion domain and manage to show that it is lossless for \(m = 4\) by Theorem 1.
Theorem 4
Suppose \(F, G_1, G_2, G_3, G_4\) are quaternion Hermitian forms and satisfy (24). Moreover, there is
$$\begin{aligned} {\varvec{x}}_0 \in {\mathbb {H}}^n \ \text {such that} \ G_i({\varvec{x}}_0)> 0, \quad i = 1, 2, 3, 4. \end{aligned}$$
(26)
Then the S-procedure is lossless for \(m = 4\).
Proof
To prove the conclusion, it suffices to establish (25) for \(m=4\). Let \(F({\varvec{x}}) = {\varvec{x}}^HA_0{\varvec{x}}\) and \(G_i({\varvec{x}}) = {\varvec{x}}^HA_i{\varvec{x}}\) for \(i = 1, 2, 3, 4\). Consider the following set
$$\begin{aligned} {\mathbb {D}} = \left\{ (F({\varvec{x}}), G_1({\varvec{x}}), G_2({\varvec{x}}), G_3({\varvec{x}}), G_4({\varvec{x}})) ~:~ {\varvec{x}}\in {\mathbb {H}}^n\right\} , \end{aligned}$$
which is a convex cone in \({\mathbb {R}}^5\) by Theorem 3. Moreover, since \(F({\varvec{x}})\) and \(G_i({\varvec{x}})\), \(i = 1, \cdots , 4\), satisfy (24), i.e.,
$$\begin{aligned} G_i({\varvec{x}}) \ge 0, ~ i = 1, \cdots , 4 \Longrightarrow F({\varvec{x}}) \ge 0, \end{aligned}$$
we have \({\mathbb {D}}\cap {\mathbb {E}}=\emptyset \), where \({\mathbb {E}} = \{(y_0,y_1,y_2,y_3,y_4) |~ y_0<0,~ y_i>0,~i=1,2,3,4\}\). Noting that \({\mathbb {E}} \) is also a convex cone, the standard separating hyperplane theorem [7] implies that there exists \({\varvec{t}}=(t_0, t_1, t_2, t_3, t_4) \not = 0\) such that
$$\begin{aligned} \sum _{k=0}^4t_ky_k \le 0, \quad \forall \, (y_0, y_1, y_2, y_3, y_4) \in {\mathbb {E}}, \end{aligned}$$
(27)
and
$$\begin{aligned} \sum _{k=0}^4t_k{\varvec{x}}^HA_k{\varvec{x}}\ge 0, \quad \forall \, {\varvec{x}}\in {\mathbb {H}}^n. \end{aligned}$$
(28)
Then we must have \(t_0 \ge 0\) and \(t_k \le 0\) for \(k = 1, 2, 3, 4\) by inequality (27). Moreover, we can confirm that \(t_0 > 0\): otherwise, since \({\varvec{t}}\ne 0 \) there exists some \(k \in \{1,2,3,4\}\) such that \(t_k < 0\), and thus for \({\varvec{x}}_0\) in (26) it holds that \(\sum _{k = 0}^4t_k{\varvec{x}}_0^HA_k{\varvec{x}}_0 = \sum _{k = 1}^4t_k{\varvec{x}}_0^HA_k{\varvec{x}}_0 < 0\), which contradicts (28). Finally, the validity of (25) with \(m=4\) follows from (28) by dividing both sides of (28) by \(t_0\) and letting \(\tau _k = \frac{-t_k}{t_0} \ge 0 \) for \(k=1,2,3,4\). \(\square \)
6 Quaternion quadratically constrained quadratic optimization
In this section, we consider the following quadratically constrained quadratic programming in the quaternion domain:
where \(Q, A_j \in {\mathcal {H}}^n\), \(b_j, q \in {\mathbb {H}}^n\), \(c_j \in {\mathbb {R}}\), \(j = 1,\cdots ,m\). Normally, solving the above problem is very challenging. However, when the number of constraints m is small, the problem is possibly tractable. For instance, in the complex domain, Huang and Zhang [20] showed that problem (QCQP) with \(m=2\) can be cast as an SDP and thus solved in polynomial time. In the quaternion domain, we shall show that a similar result holds for a larger value of m (in particular, for \(m=4\)) via the aforementioned rank-one decomposition in Theorem 1.
To relate this problem to SDP, we rewrite (QCQP) as the following matrix form:
where \(B_0 = \begin{bmatrix} 0 &{} q^H \\ q &{} Q \\ \end{bmatrix}, \ B_j = \begin{bmatrix} c_j &{} b_j^H \\ b_j &{} A_j \\ \end{bmatrix}, ~j = 1,\cdots ,m\). Problem (29) can be homogenized by introducing a new variable t and requiring \(|t|^2=1\):
where \( B_{m+1} = \begin{bmatrix} 1 &{} 0 \\ 0 &{} 0 \\ \end{bmatrix}\). Then for any solution \(\begin{pmatrix} t \\ {\varvec{x}}\\ \end{pmatrix}\) of (HQCQP), since \(|t|^2=1\), \({\varvec{x}}\cdot {\bar{t}}\) is a solution to (29) and hence a solution to (QCQP). Therefore, we shall work on (HQCQP) in the rest of this section. By letting
$$\begin{aligned} X = \begin{pmatrix} t \\ {\varvec{x}}\\ \end{pmatrix} \begin{pmatrix} t \\ {\varvec{x}}\\ \end{pmatrix}^H \end{aligned}$$
and dropping the rank-one restriction on X,
we obtain an SDP relaxation of (HQCQP):
whose dual is given by
To proceed, we assume (QCQP) satisfies the Slater condition, that is, there exists \({\varvec{x}}_0 \in {\mathbb {H}}^n\) such that
Accordingly, (QCQPR) satisfies the Slater condition as well. Now we are ready to present the main theorem of this section, which states that (QCQP) is essentially equivalent to an SDP when \(m = 4\).
Theorem 5
Suppose (QCQP) satisfies the Slater condition (30) and \(m = 4\). Then (QCQP) and (QCQPR) have the same optimal value. Moreover, an optimal solution to (QCQP) can be constructed from that of (QCQPR).
Proof
Denote by ‘\(\trianglelefteq \)’ either ‘\(\le \)’ or ‘\(=\)’. For the primal optimal solution \(X^*\), the first group of constraints in (QCQPR) can be rewritten as
$$\begin{aligned} B_j \bullet X^* \trianglelefteq 0, \quad j = 1, 2, 3, 4. \end{aligned}$$
Letting \(r=\mathrm{rank}\, X^*\), by Theorem 1, there exist some non-zero quaternion vectors \({\tilde{{\varvec{x}}}}_k = \begin{pmatrix} t_k \\ {{\varvec{x}}}_k \\ \end{pmatrix} \in {\mathbb {H}}^{n+1}, ~k = 1,\cdots ,r\), such that
$$\begin{aligned} X^* = \sum _{k=1}^{r}{\tilde{{\varvec{x}}}}_k{\tilde{{\varvec{x}}}}_k^H, \qquad {\tilde{{\varvec{x}}}}_k^HB_j{\tilde{{\varvec{x}}}}_k = \frac{B_j \bullet X^*}{r}, \quad j = 1,\cdots ,4, ~ k = 1,\cdots ,r. \end{aligned}$$
(31)
Since \(B_{m+1} \bullet X^* = 1\), we have \(\sum _{k=1}^{r}|t_k|^2 = X^*_{11} = 1\) and there exists \(\ell \in \left\{ 1, \cdots , r \right\} \) such that \(t_\ell \not = 0\). Therefore, according to (31), it holds that
$$\begin{aligned} {\tilde{{\varvec{x}}}}_{\ell }^HB_j{\tilde{{\varvec{x}}}}_{\ell } = \frac{B_j \bullet X^*}{r} \trianglelefteq 0, \quad j = 1,\cdots ,4, \end{aligned}$$
which further implies \({\varvec{x}}_{\ell } \cdot \frac{1}{t_\ell }\) is a feasible solution to (QCQP).
Suppose \((y_0^*, y_1^*, y_2^*, y_3^*, y_4^*, Y^*)\) is a dual optimal solution. Since the Slater condition holds, strong duality is valid between (QCQPR) and (DQCQPR). Consequently, the complementary slackness condition holds, i.e.,
$$\begin{aligned} X^* \bullet Y^* = 0. \end{aligned}$$
Moreover, since \(X^*\succeq 0\) and \(Y^*\succeq 0\), we have \(Y^* \bullet {\tilde{{\varvec{x}}}}_{\ell }{\tilde{{\varvec{x}}}}_{\ell }^H = 0\). We also observe from (31) that
$$\begin{aligned} B_j \bullet \frac{{\tilde{{\varvec{x}}}}_{\ell }{\tilde{{\varvec{x}}}}_{\ell }^H}{|t_{\ell }|^2} = \frac{B_j \bullet X^*}{r\,|t_{\ell }|^2} \trianglelefteq 0, \quad j = 1,\cdots ,4, \qquad B_{m+1} \bullet \frac{{\tilde{{\varvec{x}}}}_{\ell }{\tilde{{\varvec{x}}}}_{\ell }^H}{|t_{\ell }|^2} = 1, \end{aligned}$$
which indicates that \(\frac{1}{|t_{\ell }|^2}{\tilde{{\varvec{x}}}}_{\ell }{\tilde{{\varvec{x}}}}_{\ell }^H\) is feasible for (QCQPR).
Combining the above results with \(\begin{pmatrix} 1 \\ {{\varvec{x}}}_{\ell } \cdot \frac{1}{t_{\ell }}\\ \end{pmatrix} \begin{pmatrix} 1 \\ {{\varvec{x}}}_{\ell } \cdot \frac{1}{t_{\ell }}\\ \end{pmatrix}^H = \frac{1}{|t_{\ell }|^2}\cdot {\tilde{{\varvec{x}}}}_{\ell }{\tilde{{\varvec{x}}}}_{\ell }^H\) yields that \(\begin{pmatrix} 1 \\ {{\varvec{x}}}_{\ell } \cdot \frac{1}{t_{\ell }}\\ \end{pmatrix} \begin{pmatrix} 1 \\ {{\varvec{x}}}_{\ell } \cdot \frac{1}{t_{\ell }}\\ \end{pmatrix}^H\) is complementary to \((y_0^*, y_1^*, y_2^*, y_3^*, y_4^*, Y^*)\), and it is an optimal solution to (QCQPR). Therefore \({{\varvec{x}}}_{\ell } \cdot \frac{1}{t_{\ell }}\) is optimal for (QCQP) and the conclusion follows. \(\square \)
References
Ai, W., Huang, Y., Zhang, S.: New results on Hermitian matrix rank-one decomposition. Math. Program. 128(1), 253–283 (2011)
Anitescu, M.: Degenerate nonlinear programming with a quadratic growth condition. SIAM J. Optim. 10(4), 1116–1135 (2000)
Anitescu, M.: A superlinearly convergent sequential quadratically constrained quadratic programming algorithm for degenerate nonlinear programming. SIAM J. Optim. 12(4), 949–978 (2002)
Au-Yeung, Y.H., Poon, Y.T.: A remark on the convexity and positive definiteness concerning Hermitian matrices. Southeast Asian Bull. Math. 3(2), 85–92 (1979)
Ben-Tal, A., Nemirovski, A.: Lectures on modern convex optimization: analysis, algorithms, and engineering applications. SIAM (2001)
Boyd, S., El Ghaoui, L., Feron, E., Balakrishnan, V.: Linear matrix inequalities in system and control theory. SIAM (1994)
Boyd, S., Vandenberghe, L.: Convex optimization. Cambridge University Press (2004)
Brickman, L.: On the field of values of a matrix. Proc. Am. Math. Soc. 12(1), 61–66 (1961)
Chen, B., Shu, H., Coatrieux, G., Chen, G., Sun, X., Coatrieux, J.L.: Color image analysis by quaternion-type moments. J. Math. Imaging Vis. 51(1), 124–144 (2015)
Chen, Y., Qi, L., Zhang, X., Xu, Y.: A low rank quaternion decomposition algorithm and its application in color image inpainting. arXiv preprint arXiv:2009.12203 (2020)
Chen, Y., Xiao, X., Zhou, Y.: Low-rank quaternion approximation for color image processing. IEEE Trans. Image Process. 29, 1426–1439 (2019)
Chou, J.C.: Quaternion kinematic and dynamic differential equations. IEEE Trans. Robot. Autom. 8(1), 53–64 (1992)
Dirr, G., Helmke, U., Kleinsteuber, M., Schulte-Herbrüggen, T.: A new type of C-numerical range arising in quantum computing. In: PAMM: Proceedings in Applied Mathematics and Mechanics, vol. 6, pp. 711–712. Wiley Online Library (2006)
Flamant, J., Chainais, P., Le Bihan, N.: A complete framework for linear filtering of bivariate signals. IEEE Trans. Signal Process. 66(17), 4541–4552 (2018)
Flamant, J., Le Bihan, N., Chainais, P.: Time-frequency analysis of bivariate signals. Appl. Comput. Harmon. Anal. 46(2), 351–383 (2019)
Flamant, J., Miron, S., Brie, D.: A general framework for constrained convex quaternion optimization. arXiv preprint arXiv:2102.02763 (2021)
Goldfarb, D., Iyengar, G.: Robust portfolio selection problems. Math. Oper. Res. 28(1), 1–38 (2003)
Hausdorff, F.: Der Wertvorrat einer Bilinearform. Math. Z. 3(1), 314–316 (1919)
Hoerl, A.E., Kennard, R.W.: Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 12(1), 55–67 (1970)
Huang, Y., Zhang, S.: Complex matrix decomposition and quadratic programming. Math. Oper. Res. 32(3), 758–768 (2007)
Jahanchahi, C., Took, C.C., Mandic, D.P.: A class of quaternion valued affine projection algorithms. Signal Process. 93(7), 1712–1723 (2013)
Jia, Z., Wei, M., Ling, S.: A new structure-preserving method for quaternion Hermitian eigenvalue problems. J. Comput. Appl. Math. 239, 12–24 (2013)
Li, C.K., Poon, Y.T.: The joint essential numerical range of operators: convexity and related results. Studia Math 194, 91–104 (2009)
Li, Y., Wei, M., Zhang, F., Zhao, J.: A real structure-preserving method for the quaternion LU decomposition, revisited. Calcolo 54(4), 1553–1563 (2017)
Ling, C., Nie, J., Qi, L., Ye, Y.: Biquadratic optimization over unit spheres and semidefinite programming relaxations. SIAM J. Optim. 20(3), 1286–1310 (2010)
Luo, Z.Q.: Applications of convex optimization in signal processing and digital communication. Math. Program. 97(1), 177–207 (2003)
Miao, J., Kou, K.I., Liu, W.: Low-rank quaternion tensor completion for recovering color videos and images. Pattern Recogn. 107, 107505 (2020)
Pang, J., Zhang, S.: The joint numerical range and quadratic optimization. Unpublished Manuscript (2004)
Parcollet, T., Morchid, M., Linarès, G.: Quaternion convolutional neural networks for heterogeneous image processing. In: ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8514–8518. IEEE (2019)
Pólik, I., Terlaky, T.: A survey of the S-lemma. SIAM Rev. 49(3), 371–418 (2007)
Qi, L., Luo, Z., Wang, Q., Zhang, X.: Quaternion matrix optimization and the underlying calculus. arXiv preprint arXiv:2009.13884 (2020)
Qi, L., Luo, Z., Wang, Q.W., Zhang, X.: Quaternion matrix optimization: motivation and analysis. J. Optim. Theory Appl., 1–28 (2021)
Rasulov, T., Bahronov, B.: Description of the numerical range of a Friedrichs model with rank two perturbation. Journal of Global Research in Mathematical Archives 9(6), 15–17 (2019)
Rodman, L.: Topics in quaternion linear algebra. Princeton University Press (2014)
Rodman, L., Spitkovsky, I.M., Szkoła, A., Weis, S.: Continuity of the maximum-entropy inference: Convex geometry and numerical ranges approach. J. Math. Phys. 57(1), 015204 (2016)
Sturm, J.F., Zhang, S.: On cones of nonnegative quadratic functions. Math. Oper. Res. 28(2), 246–267 (2003)
Szymański, K., Weis, S., Życzkowski, K.: Classification of joint numerical ranges of three hermitian matrices of size three. Linear Algebra Appl. 545, 148–173 (2018)
Xu, D., Xia, Y., Mandic, D.P.: Optimization in quaternion dynamic systems: gradient, Hessian, and learning algorithms. IEEE Trans. Neural Netw. Learn. Syst. 27(2), 249–261 (2015)
Xu, Y., Yu, L., Xu, H., Zhang, H., Nguyen, T.: Vector sparse representation of color image using quaternion matrix analysis. IEEE Trans. Image Process. 24(4), 1315–1329 (2015)
Yakubovich, V.A.: S-procedure in nonlinear control theory. Vestnik Leningradskogo Universiteta, Ser. Matematika, 62–77 (1971)
Ye, Y., Zhang, S.: New results on quadratic minimization. SIAM J. Optim. 14(1), 245–267 (2003)
Yi, C., Lv, Y., Dang, Z., Xiao, H., Yu, X.: Quaternion singular spectrum analysis using convex optimization and its application to fault diagnosis of rolling bearing. Measurement 103, 321–332 (2017)
Zhu, X., Xu, Y., Xu, H., Chen, C.: Quaternion convolutional neural networks. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 631–647 (2018)
Research supported by NSFC Grants 72171141, 72150001 and 11831002, GIFSUFE Grants CXJJ-2019-391, and Program for Innovative Research Team of Shanghai University of Finance and Economics.
He, C., Jiang, B. & Zhu, X. Quaternion matrix decomposition and its theoretical implications. J Glob Optim 87, 741–758 (2023). https://doi.org/10.1007/s10898-022-01210-7
Keywords
- Matrix rank-one decomposition
- Quaternion
- Joint numerical range
- \({\mathcal {S}}\)-Procedure
- Quadratic optimization