
1 Introduction

Given a matrix \(A\in \mathbb R^{n\times n} \) with distinct eigenvalues, we intend to find the distance from A to the set \( \mathbb D \) of real matrices with multiple eigenvalues as well as the corresponding minimal perturbation, i.e., a matrix \(E_{*} \in \mathbb {R}^{n\times n} \) of the minimal norm such that \(B_{*}=A+E_{*} \in \mathbb D \).

The problem under consideration is known as Wilkinson’s problem [21] and the desired distance, further denoted as \(d(A, \mathbb D)\), is called the Wilkinson distance of A [2, 15]. Wilkinson’s problem is closely related to ill-conditioning of eigenvalue problems. The ill-conditioning of a linear system is determined by the distance of the coefficient matrix from the set of singular matrices. For eigenvalue problems, the set of matrices with multiple eigenvalues plays the role of singularity [23]. The Wilkinson distance can be considered as a measure of sensitivity of the worst-conditioned eigenvalue of A. By eigenvalue perturbation theory, a matrix that is close to a defective matrix has an eigenvalue with large condition number. Conversely, any matrix with an ill-conditioned eigenvalue is close to a defective matrix [18, 22].

For the spectral and the Frobenius norms, the problem has been studied intensively by Wilkinson [22,23,24] as well as by other researchers [2, 4, 5, 10, 14, 18]. In the works [1, 3, 13, 15], generalizations of Wilkinson’s problem for the cases of prescribed eigenvalues or multiplicities and matrix pencils are studied. However, several aspects of the problem still need further clarification.

The present paper is devoted to the stated problem for the case of Frobenius norm. It is organized as follows.

In Sect. 2, we start with algebraic background for the stated problem. We first detail the structure of the set \( \mathbb D \) in the matrix space. The cornerstone notion here is the discriminant of the characteristic polynomial of a matrix. Being a polynomial function in the entries of the matrix, the discriminant permits one to translate the problem of evaluation of \( d(A, \mathbb D)\) into that of finding the distance from a point to an algebraic manifold in the matrix space. This makes it possible to attack the problem within the framework of the approach already exploited by the present authors in the preceding studies [11, 12] on the distance to instability in the matrix space. The approach is aimed at the construction of the so-called distance equation, i.e., the univariate equation whose zero set contains all the critical values of the squared distance function. Its construction is theoretically feasible via application of symbolic methods for elimination of variables in an appropriate multivariate algebraic system. Unfortunately, the practical realization faces the difficulty that the number of variables to be eliminated grows rapidly with the order of the matrix.

To bypass this, we reformulate the problem in terms of the minimal perturbation matrix. In Sect. 3, we prove that this matrix is a rank 1 matrix. We then reduce the problem of finding it to a constrained optimization problem in n variables with an objective function of degree 4. Some examples are presented illustrating the applicability of the developed algorithm.

The discovered property of the perturbation matrix makes it possible to look at the problem from another side. Generically, the 2-norm of a matrix does not equal its Frobenius norm. However, for a rank 1 matrix (and this is exactly the case of the minimal perturbation matrix), these norms coincide. This allows one to verify the results obtained within the symbolic approach against their counterparts obtained for the 2-norm case [14]. This issue is discussed in Sect. 4, while in Sect. 5 both approaches are illustrated for three classes of matrices for which the distance \(d(A, \mathbb D)\) can be explicitly expressed via the eigenvalues of A. These happen to be symmetric, skew-symmetric and orthogonal matrices. Quite unexpectedly for the authors, it turned out that, for some of these classes, every representative has a continuum of nearest matrices in \( \mathbb D \).

Notation. For a matrix \(A \in \mathbb R^{n\times n} \), \(f_A(\lambda ) \) denotes its characteristic polynomial, \({\text {adj}} (A)\) stands for its adjoint matrix, \( d(A, \mathbb D) \) denotes the distance from A to the set \( \mathbb D \) of matrices possessing a multiple eigenvalue. \( E_{*} \) and \( B_{*} = A+ E_{*} \) stand for, correspondingly, the (minimal) perturbation matrix and the nearest to A matrix in \( \mathbb D\) (i.e., \(d(A,\mathbb D)=\Vert A- B_{*}\Vert \)); we then term by \(\lambda _{*}\) the multiple eigenvalue of \(B_{*}\). I (or \( I_n\)) denotes the identity matrix (of the corresponding order). \(\mathcal D\) (or \(\mathcal D_{\lambda } \)) denotes the discriminant of a polynomial (with subscript indicating the variable).

Remark. All the computations were performed in CAS Maple 15.0 (the LinearAlgebra package and the functions discrim and resultant). Although all the approximate computations have been performed with the accuracy \(10^{-40}\), the final results are rounded to \( 10^{-6} \).

2 Algebraic Preliminaries

It is well-known that, in the \((n+1)\)-dimensional space of the coefficients of the polynomial \( f(\lambda )=a_0\lambda ^n + a_1\lambda ^{n-1} +\dots +a_n,\ n \ge 2 \), the manifold of polynomials with multiple zeros is defined by the equation

$$\begin{aligned} D(a_0,a_1,\dots ,a_n)=0 \quad {where} \ D:=\mathcal D_{\lambda }(f(\lambda )) \end{aligned}$$
(1)

denotes the discriminant of the polynomial. The discriminant can be represented in different ways, for instance (here for \( n=4 \)) as the Sylvester determinant

$$ \mathcal D_{\lambda }(a_0\lambda ^4+a_1\lambda ^3+ a_2\lambda ^2 +a_3\lambda +a_4)= \frac{1}{4^2}\left| \begin{array}{rrrrrr} a_1 &{} 2a_2 &{} 3 a_3 &{} 4 a_4 &{} 0 &{} 0 \\ 0 &{} a_1 &{} 2a_2 &{} 3 a_3 &{} 4 a_4 &{} 0 \\ 0 &{} 0 &{} a_1 &{} 2a_2 &{} 3 a_3 &{} 4 a_4 \\ 0 &{} 0 &{} 4a_0 &{} 3a_1 &{} 2 a_2 &{} a_3 \\ 0 &{} 4a_0 &{} 3a_1 &{} 2 a_2 &{} a_3 &{} 0 \\ 4a_0 &{} 3a_1 &{} 2 a_2 &{} a_3 &{} 0 &{} 0 \end{array} \right| \, . $$

The discriminant \( D(a_0,a_1,\dots ,a_n) \) is a homogeneous polynomial over \( \mathbb Z \) of order \(2n-2\) in its variables, and it is irreducible over \( \mathbb Z \).
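These properties are easy to observe in a small computational sketch. The snippet below is illustrative only and uses Python with SymPy (the authors' own computations were carried out in Maple with discrim and resultant).

```python
# Sketch (Python/SymPy; the authors used Maple's discrim and resultant).
import sympy as sp

lam, a0, a1, a2, a3, a4 = sp.symbols('lambda a0 a1 a2 a3 a4')

# Discriminant of a generic quartic as a polynomial in the coefficients.
f = a0*lam**4 + a1*lam**3 + a2*lam**2 + a3*lam + a4
D = sp.expand(sp.discriminant(f, lam))

# Homogeneous of order 2n - 2 = 6 in (a0, ..., a4): every monomial has total degree 6.
print({sum(m) for m in sp.Poly(D, a0, a1, a2, a3, a4).monoms()})      # {6}

# The discriminant vanishes exactly for polynomials with a multiple zero.
print(sp.discriminant(sp.expand((lam - 1)**2 * (lam**2 + 1)), lam))   # 0
```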

The following result [16] is much less known.

Theorem 1 (Jacobi)

If \(f(\lambda )\) possesses a unique multiple zero \( \lambda _{*} \) and its multiplicity equals 2, then the following ratio is valid

$$\begin{aligned} 1 : \lambda _{*} : \lambda _{*}^2 : \dots : \lambda _{*}^n = \frac{\partial D}{\partial a_n} : \frac{\partial D}{\partial a_{n-1}} : \frac{\partial D}{\partial a_{n-2}} : \dots : \frac{\partial D}{\partial a_{0}} \, . \end{aligned}$$
(2)
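The ratio (2) can be verified directly on a small example. The following SymPy sketch (an illustration under the assumption of a cubic with the double zero \( \lambda _{*}=2 \), not the authors' code) evaluates the partial derivatives of the discriminant at the coefficient vector of \( f(\lambda )=(\lambda -2)^2(\lambda +1) \).

```python
# Sketch: check the ratio (2) for f(lambda) = (lambda-2)^2 (lambda+1) with SymPy.
import sympy as sp

lam = sp.Symbol('lambda')
a = sp.symbols('a0:4')                                   # a0, a1, a2, a3

D = sp.discriminant(a[0]*lam**3 + a[1]*lam**2 + a[2]*lam + a[3], lam)
point = {a[0]: 1, a[1]: -3, a[2]: 0, a[3]: 4}            # (lambda-2)^2 (lambda+1)

grads = [sp.diff(D, a[3 - k]).subs(point) for k in range(4)]
# dD/da3 : dD/da2 : dD/da1 : dD/da0  =  1 : lambda_* : lambda_*^2 : lambda_*^3
print([sp.simplify(g / grads[0]) for g in grads])        # [1, 2, 4, 8]
```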

To solve the problem stated in Introduction, one needs to transfer the discriminant manifold (1) into the matrix space. The corresponding manifold is then defined by a homogeneous polynomial of order \( n(n-1)\) in the matrix entries:

$$\begin{aligned} \mathfrak D(B):=\mathcal D_{\lambda }(f_B(\lambda ))=0\ . \end{aligned}$$
(3)

We will further denote this manifold in \( \mathbb R^{n^2} \) as \( \mathbb D \). The problem of distance evaluation between a given matrix A and \( \mathbb D \) can be viewed as a constrained optimization problem:

$$\begin{aligned} d^2(A, \mathbb D)=\min _{B\in \mathbb R^{n\times n}} \Vert B-A\Vert ^2\ \text {subject\, to }\ (3) \, . \end{aligned}$$
(4)

Consider the Lagrange function for this problem

$$ F(B,\mu ):= \Vert B-A\Vert ^2 - \mu \mathfrak D(B) \, . $$

Evidently, \( \partial F / \partial \mu = 0 \) is equivalent to (3). Differentiation with respect to the entries of B yields

$$\begin{aligned} 2 (b_{jk}-a_{jk}) - \mu \partial \mathfrak D(B) /\partial b_{jk}=0\ \text { for } \ \{j,k\} \subset \{1,\dots ,n\} . \end{aligned}$$
(5)

Since the system (3)–(5) is an algebraic one, it admits application of symbolic methods of elimination of variables. We attach to the considered system an extra equation

$$\begin{aligned} z= \Vert B-A\Vert ^2 \end{aligned}$$
(6)

and then aim at finding the so-called distance equation

$$ \mathcal F(z)=0 $$

resulting from the elimination of all the variables but z from this system. Positive zeros of this equation are the critical values of the squared distance function for the problem (4).

Example 1

For the matrix \( A =[a_{jk}]_{j,k=1}^2 \) with the characteristic polynomial \( f_A(\lambda )\), the system (5) is linear with respect to \(\{b_{jk}\}_{j,k=1}^2 \) and the distance equation is easily computed as

$$ \mathcal F(z):=4096(a_{12}-a_{21})^2\left[ (a_{11}-a_{22})^2 + (a_{12}+a_{21})^2 \right] $$
$$\begin{aligned} \times \left\{ \left[ 4z- \mathcal D_{\lambda } (f_A(\lambda )) \right] ^2-16(a_{12}-a_{21})^2z \right\} =0. \end{aligned}$$
(7)

It turns out that for any matrix A such that \( \mathcal D_{\lambda } (f_A(\lambda )) \ne 0 \), the distance equation reduces to the quadratic factor in braces in (7), and \( d^2(A,\mathbb D)\) equals its minimal zero.

For the matrix

$$ A=\left[ \begin{array}{rr} s &{} t \\ -t &{} s \end{array} \right] \quad {\mathrm where} \ t > 0, $$

the polynomial \( \mathcal F(z)\) vanishes identically, since the factor \( (a_{11}-a_{22})^2 + (a_{12}+a_{21})^2 \) in (7) equals zero. Nevertheless, the quadratic factor in braces in (7) possesses a double zero, namely \(z=t^2\), and \( d(A, \mathbb D)=t \). Surprisingly, this distance is provided by a continuum of perturbation matrices (and thus of nearest matrices in \(\mathbb D \)), namely

$$ E_{*}= \frac{t}{2} \left[ \begin{array}{cc} \sin \varphi &{} -1 + \cos \varphi \\ 1 + \cos \varphi &{} -\sin \varphi \end{array} \right] , \ \quad {\mathrm where} \ \varphi \in [0, 2 \pi ). $$

This example suggests that similar difficulties are to be expected when solving the stated distance evaluation problem for orthogonal or skew-symmetric matrices A.    \(\square \)
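The continuum of minimal perturbations found above is easy to confirm numerically; below is a NumPy sketch with arbitrarily chosen values of s and t (an illustration, not the authors' code).

```python
# Sketch: every E_*(phi) has Frobenius norm t and makes the eigenvalue s double.
import numpy as np

s, t = 1.3, 0.7                                   # arbitrary test values
A = np.array([[s, t], [-t, s]])

for phi in np.linspace(0.0, 2*np.pi, 7, endpoint=False):
    E = (t/2) * np.array([[np.sin(phi), -1 + np.cos(phi)],
                          [ 1 + np.cos(phi), -np.sin(phi)]])
    print(np.linalg.norm(E, 'fro'), np.linalg.eigvals(A + E))   # t, eigenvalue s twice
```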

In the general case, computation of the distance equation via the solution of the system (3)–(5)–(6) is hardly feasible due to the drastic increase in the number of variables (i.e., the entries of the matrix B) to be eliminated. To overcome this difficulty, let us reformulate the problem in terms of the entries of the perturbation matrix.

3 Distance Equation and Perturbation Matrix

Theorem 2

The matrices \( E_{*} \) and \( B_{*} \) are linked by the equality

$$\begin{aligned} E_{*}=\varkappa \left[ f_{*}(B_{*}) \right] ^{\top }, \end{aligned}$$
(8)

where

$$ f_{*} (\lambda ):= \frac{f_{B_{*}}(\lambda )}{\lambda -\lambda _{*}}, $$

and \( \varkappa \in \mathbb R \) is some scalar.

Proof

We start with system (5) resulting from application of the Lagrange method to problem (4). Compute \( \partial \mathfrak D(B) /\partial b_{jk} \) as a composite function with the coefficients of characteristic polynomial \(f_B(\lambda )=\lambda ^n+p_1\lambda ^{n-1}+\dots +p_n \) treated as intermediate variables:

$$ \frac{\partial \mathfrak D(B)}{\partial b_{jk}} = \frac{\partial \mathfrak D(B)}{\partial p_0} \frac{\partial p_0}{\partial b_{jk}} + \frac{\partial \mathfrak D(B)}{\partial p_1} \frac{\partial p_1}{\partial b_{jk}} + \dots + \frac{\partial \mathfrak D(B)}{\partial p_n} \frac{\partial p_n}{\partial b_{jk}}. $$

(We set here \( p_0:=1\) and thus the first term in the right-hand side is just 0). Under the condition \( \mathfrak D(B)=0 \) (i.e., the matrix \( B=B_{*} \) possesses a multiple eigenvalue \( \lambda _{*} \)), the Jacobi ratio (2) is fulfilled

$$ 1 : \lambda _{*} : \lambda _{*}^2 : \dots : \lambda _{*}^n = \frac{\partial \mathfrak D}{\partial p_n} : \frac{\partial \mathfrak D}{\partial p_{n-1}} : \frac{\partial \mathfrak D}{\partial p_{n-2}} : \dots : \frac{\partial \mathfrak D}{\partial p_{0}} \, . $$

Therefore,

$$ \frac{\partial \mathfrak D}{\partial p_{\ell }}= \kappa \lambda _{*}^{n-\ell } \quad \text{for } \ \ell \in \{0,1,\dots ,n\} $$

for some constant \( \kappa \in \mathbb R\). Consequently,

$$ \frac{\partial \mathfrak D(B)}{\partial b_{jk}} = \kappa \left( \lambda _{*}^n \frac{\partial p_0}{\partial b_{jk}} + \lambda _{*}^{n-1} \frac{\partial p_1}{\partial b_{jk}} + \dots + \frac{\partial p_n}{\partial b_{jk}} \right) = \kappa \frac{\partial f_B(\lambda _{*})}{\partial b_{jk}}. $$

The preceding considerations lead to a conclusion that the system of Eqs. (5) is equivalent to the matrix equation

$$ 2(B_{*} - A)=\mu \kappa \, \partial f_B(\lambda _{*}) / \partial B \big |_{B=B_{*}}. $$

Next utilize the formula of differentiation of characteristic polynomial with respect to the matrix [19]:

$$ \partial f_B(\lambda )/ \partial B=\left[ {\text {adj}}(\lambda I - B) \right] ^{\top }. $$

Equality (8) then follows from the representation of the adjoint matrix for \( \lambda _{*} I - B_{*} \) as \( f_{*}(B_{*}) \) with \( f_{*} (\lambda ) \) standing for the quotient on division of \( f_B(\lambda ) \) by \( \lambda - \lambda _{*} \).

Corollary 1

Matrices \( E_{*}^{\top } \) and \( B_{*} \) commute and

$$ (\lambda _{*}I-B_{*})E_{*}^{\top }= \mathbb O_{n\times n}. $$

Corollary 2

If A does not have a multiple eigenvalue, then \( E_{*} \) is a rank 1 matrix all of whose eigenvalues are zero.

Proof

Matrix \( f_{*}(B_{*}) = {\text {adj}}(\lambda _{*} I - B_{*}) \) is the rank 1 matrix, since its columns are the eigenvectors of the matrix \( B_{*} \) corresponding to \(\lambda _{*}\) (Cayley–Hamilton theorem).

We next prove that \({\text {tr}} ({\text {adj}}(\lambda _{*} I - B_{*}))=0 \). For any matrix B with spectrum \(\{\lambda _j\}_{j=1}^n \), the matrix \( {\text {adj}}(\lambda I-B) \) has the spectrum [17] (part VII, problem 48):

$$ \left\{ \frac{f_B(\lambda )}{\lambda -\lambda _j} \right\} _{j=1}^n \, . $$

Thus,

$$ {\text {tr}}(E_{*})= \varkappa \, {\text {tr}}({\text {adj}}(\lambda _{*} I-B_{*})) = \varkappa \, f_{B_{*}}^{\prime }(\lambda _{*})=0. $$

Corollary 3

Matrix \( E_{*} \) is normal to \( B_{*} \), i.e., \({\text {tr}} (B_{*}^{\top }E_{*})=0\).

Corollary 4

\( {\text {tr}}(B_{*}) = {\text {tr}}(A) \).

Theorem 3

The value \( d^2(A,\mathbb D) \) is contained in the set of critical values of the function

$$\begin{aligned} G(U):=U^{\top }A A^{\top } U - \left( U^{\top }AU \right) ^2 \ { subject\ to} \ U^{\top }U=1, U \in \mathbb R^n \, \end{aligned}$$
(9)

If \( U_{*} \) is the vector providing \( d^2(A,\mathbb D) \), then the perturbation matrix can be computed by the formula

$$\begin{aligned} E_{*}=U_{*} U_{*}^{\top } (\kappa I-A) \quad { where } \ \kappa := U_{*}^{\top } A U_{*}. \end{aligned}$$
(10)

Proof

Due to Corollary 2, the singular value decomposition for the perturbation matrix E is represented as

$$\begin{aligned} E=\sigma U\cdot V^{\top } \end{aligned}$$
(11)

under restrictions

$$\begin{aligned} U^{\top } U=1,\ V^{\top } V=1,\ U^{\top } V=0. \end{aligned}$$
(12)

From the condition \( {\text {tr}}((A+E)E^{\top })=0 \) we deduce that \( \sigma =- {\text {tr}}(AVU^{\top })=-U^{\top }AV. \) Formulate the constrained optimization problem

$$\begin{aligned} \min (-U^{\top }AV) \quad {\mathrm {subject\ to}}\ (12). \end{aligned}$$
(13)

The derivatives of the corresponding Lagrange function

$$ L(U,V,\mu _1,\mu _2,\mu _3):= -U^{\top }AV - \mu _1( U^{\top } U-1)-\mu _2( V^{\top } V-1)-\mu _3 U^{\top } V $$

result in the system of linear equations

$$\begin{aligned} \partial L / \partial U= & {} - AV-2\mu _1 U-\mu _3 V =\mathbb O_{n\times 1} , \end{aligned}$$
(14)
$$\begin{aligned} \partial L / \partial V= & {} -A^{\top }U-2\, \mu _2 V - \mu _3 U = \mathbb O_{n\times 1} \end{aligned}$$
(15)

with respect to U and V. Multiplication of (14) by \( V^{\top } \) and of (15) by \( U^{\top } \) results (in accordance with (12)) in

$$\begin{aligned} \mu _3 = -V^{\top }AV=-U^{\top }AU. \end{aligned}$$
(16)

Similarly, multiplication of (14) by \(U^{\top } \) and of (15) by \(V^{\top } \) yields

$$\begin{aligned} -2\mu _1 = -2\mu _2 = U^{\top }AV=-\sigma \end{aligned}$$
(17)

and, provided this value is not 0,

$$\begin{aligned} V=-\frac{1}{2\mu _2} (A^{\top }+\mu _3 I)U. \end{aligned}$$
(18)

Substituting (18) in (11) and taking into account (16), we arrive at (10).

If \( \mu _1=\mu _2=0 \) then system (14)–(15) is reduced to \(AV=-\mu _3V, A^{\top }U=-\mu _3U\). This implies that the matrix A should possess a real eigenvalue \( \kappa _1 \) with the corresponding right and left eigenvectors \( V_1 \) and \(U_1\) satisfying the condition \(U_1^{\top }V_1=0\). We claim that, in this case, matrix A has a multiple eigenvalue. For the sake of simplicity, we prove this statement under an extra assumption that all the eigenvalues \(\kappa _1,\dots ,\kappa _n \) of A are real. Suppose, by contradiction, that they are distinct. One has then

$$ \kappa _1 U_1^{\top }V_j = U_1^{\top } A V_j=\kappa _j U_1^{\top }V_j \quad \Rightarrow \ U_1^{\top }V_j=0 \quad {for} \ j\in \{2,\dots ,n\} $$

and for \(V_j\) standing for the right eigenvector corresponding to \( \kappa _j \). Therefore, \(U_1\) is normal to all the vectors \(V_1,V_2,\dots , V_n\) composing a basis of \( \mathbb R^n\). The contradiction proves the assertion. The statement of the theorem remains valid with the corresponding critical value of (9) equal to 0.    \(\square \)

To find the critical values of the function (9), the Lagrange multipliers method is to be applied with the objective function \( G(U)-\mu (U^{\top }U -1 )\). This results in the system

$$\begin{aligned} A A^{\top } U - \left( U^{\top }AU \right) (A+A^{\top })U-\mu U = \mathbb O_{n\times 1} \end{aligned}$$
(19)

where every equation is now just cubic with respect to the entries of U. This is essential progress compared with the system (3)–(5)–(6), and it makes feasible the elimination of variables from the system (19) augmented with \( z-G(U)=0 \) and \(U^{\top }U=1\) (at least for matrices of order \( n \le 8 \)).

Unfortunately, the new system possesses some extraneous solutions, i.e., those not corresponding to the critical values of the distance function.

Example 2

For the matrix

$$ A=\left[ \begin{array}{rr} 0 &{} 1 \\ 13 &{} -6 \end{array} \right] , $$

the system

$$ u_1^2+u_2^2=1, \ u_2 \partial G / \partial u_1 - u_1 \partial G / \partial u_2=0 $$

possesses solutions

$$ u_1=\pm \frac{1}{58} \sqrt{2900+82\sqrt{22}}, \ u_2 = \pm \frac{1}{58} \sqrt{464-87\sqrt{22}} $$

that yield the value \( z= 0\). The true distance equation is given by (7), and \( d^2(A,\mathbb D)=-12\sqrt{58}+94\) is provided by another solution of the system, namely

$$ u_1= \pm \frac{1}{58} \sqrt{1682+203\sqrt{58}}, \ u_2= \pm \frac{1}{58} \sqrt{1682-203\sqrt{58}}. $$

   \(\square \)

The appearance of such extraneous solutions is caused by the non-equivalence of the passage from the originally stated problem to the one of Theorem 3. For instance, representation (10) is deduced under the extra condition that the value (17) does not vanish.
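For n = 2 the critical values of (9) are easy to obtain numerically by parameterizing \( U=(\cos t, \sin t)^{\top } \). The following NumPy sketch treats the matrix of Example 2, discards the extraneous zero critical values, and restores \( E_{*} \) by formula (10); it is an illustrative numerical counterpart of the symbolic computation, not the authors' code.

```python
import numpy as np

A = np.array([[0., 1.], [13., -6.]])

def G(t):
    U = np.array([np.cos(t), np.sin(t)])
    return U @ A @ A.T @ U - (U @ A @ U)**2

def dG(t):                                      # derivative of G along the unit circle
    U = np.array([np.cos(t), np.sin(t)])
    g = 2*A @ A.T @ U - 2*(U @ A @ U)*(A + A.T) @ U
    return g @ np.array([-np.sin(t), np.cos(t)])

# bracket the critical points on [0, pi) (note G(U) = G(-U)) and refine by bisection
crit = []
ts = np.linspace(0.0, np.pi, 2001)
for a, b in zip(ts[:-1], ts[1:]):
    if dG(a) * dG(b) < 0:
        for _ in range(60):
            m = 0.5*(a + b)
            a, b = (m, b) if dG(a)*dG(m) > 0 else (a, m)
        crit.append(0.5*(a + b))

vals = sorted(G(t) for t in crit)               # critical values of (9)
z = min(v for v in vals if v > 1e-8)            # smallest positive one
print(z, 94 - 12*np.sqrt(58))                   # both ~ 2.6107

t = min(crit, key=lambda t: abs(G(t) - z))
U = np.array([np.cos(t), np.sin(t)])
kappa = U @ A @ U
E = np.outer(U, U) @ (kappa*np.eye(2) - A)      # formula (10)
print(np.linalg.norm(E, 'fro')**2)              # equals z
print(np.linalg.eigvals(A + E))                 # double eigenvalue ~ -3
```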

Example 3

For the Frobenius matrix

$$ A=\left[ \begin{array}{rrr} 0 &{} 1 &{}0\\ 0 &{} 0 &{} 1 \\ -91 &{} -55 &{}-13 \end{array} \right] , $$

the distance equation

$$ \mathcal F(z):= 33076090700402342058246544\,z^6-377039198861306289080145178864 z^5 $$
$$ +\, 937864902703881321034450183916\,z^4-771868276098720970149792503999\,z^3 $$
$$ +\, 211070978787821517684022650624\,z^2 -510584100140452518540394496\,z $$
$$ +\, 319295875259784560640000 =0 $$

possesses the following real zeros

$$ z_1\approx 0.739335,\ z_2 \approx 0.765571,\ z_3 \approx 0.980467,\ z_4 \approx 11396.658548. $$

One has \(d(A, \mathbb D) = \sqrt{z_1} \approx 0.859846\) and

$$ E_{*}\approx \left[ \begin{array}{rrr} 0.198499 &{} -0.195124 &{}-0.530440\\ 0.204398 &{} -0.200922 &{} -0.546202 \\ -0.000907&{} 0.000891 &{} 0.002424 \end{array}\right] ,\ $$
$$ B_{*}=A+E_{*}\approx \left[ \begin{array}{rrr} 0.198499 &{} 0.804875 &{} -0.530440\\ 0.204398&{} -0.200923 &{} 0.453797\\ -91.000907&{} -54.999108 &{} -12.997576 \end{array}\right] . $$

The latter matrix possesses the double eigenvalue \( \lambda _{*} \approx 0.824777 \).    \(\square \)
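The printed (rounded) matrices allow a quick numerical check of the structural properties established above. The following NumPy sketch does so; the tolerances are loose because the entries are rounded to \(10^{-6}\).

```python
# Sketch: verify the printed E_* of Example 3 against Theorem 2 and Corollaries 1-4.
import numpy as np

A = np.array([[0., 1., 0.], [0., 0., 1.], [-91., -55., -13.]])
E = np.array([[ 0.198499, -0.195124, -0.530440],
              [ 0.204398, -0.200922, -0.546202],
              [-0.000907,  0.000891,  0.002424]])
B = A + E

print(np.linalg.matrix_rank(E, tol=1e-4))     # 1: rank 1 perturbation (Corollary 2)
print(np.trace(E))                            # ~ 0                    (Corollary 2)
print(np.trace(B.T @ E))                      # ~ 0                    (Corollary 3)
print(np.linalg.norm(E.T @ B - B @ E.T))      # small: E_*^T and B_* commute (Corollary 1)
print(np.linalg.norm(E, 'fro'))               # ~ 0.859846 = d(A, D)
print(np.sort(np.linalg.eigvals(B).real))     # double eigenvalue near 0.824777
```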

Example 4

For the matrix

$$ A=\left[ \begin{array}{rrrr} 5&{} -36 &{} -57 &{} 85\\ 80 &{} 90 &{} 74 &{} 27\\ 9 &{} -91 &{} 81 &{} 65\\ -12&{} 78&{} 5 &{} -63 \end{array} \right] , $$

the distance equation is given by a degree 12 polynomial \(\mathcal F(z) \), irreducible over \( \mathbb Z\), with coefficients of absolute value up to \( 10^{100}\). Its real zeros are

$$ z_1\approx 87.614714, z_2\approx 2588.509661, \ z_3\approx 17853.256334,\ z_4 \approx 32194.078324. $$

One has \(d(A, \mathbb D) = \sqrt{z_1} \approx 9.360273\) and

$$ E_{*}\approx \left[ \begin{array}{rrrr} 3.350324 &{} -0.177130 &{} -3.704042 &{} -0.328216\\ 2.489713 &{} -0.131630 &{} -2.752569 &{} -0.243906\\ 2.565863 &{} -0.135656 &{} -2.836760 &{} 0.251366 \\ 3.898666 &{} -0.206121 &{} -4.310276 &{} 0.381935 \end{array}\right] , $$

with the matrix

$$ B_{*}=A+E_{*} \approx \left[ \begin{array}{rrrr} 8.350324 &{} -36.177130 &{} -60.704042&{} 84.671784\\ 82.489713&{} 89.868370 &{} 71.247430&{} 26.756094\\ 11.565863 &{} -91.135656 &{} 78.163240 &{} 64.748634\\ -8.101333 &{} 77.793879&{} 0.689724&{} -63.381935 \end{array}\right] $$

possessing the double eigenvalue \( \lambda _{*} \approx 69.081077 \).    \(\square \)

We next state some empirical conclusions resulting from experiments with about 30 randomly generated matrices of orders up to \( n= 20 \). Generically,

  (a) the extraneous factor equals \(z^n\), and on its exclusion one has

  (b) the order of the distance equation \(\mathcal F(z)=0\) equals \( n(n-1)\), and, if computed symbolically w.r.t. the entries of A, \( \mathcal F(0)\) has a factor \( \left[ \mathcal D_{\lambda } (f_A(\lambda )) \right] ^2 \);

  (c) \( d^2(A,\mathbb D)\) equals the minimal positive zero of this equation.

Complete computational results for some examples are presented in [20]. For matrices A with integer entries within \([-99,+99] \) (generated by the Maple 15.0 LinearAlgebra:-RandomMatrix procedure), we point out some complexity estimates for the distance equation computation (PC AMD FX-6300, 6 cores, 3.5 GHz):

| n | \( \deg \mathcal F(z) \) | coefficient size | number of real zeros | timing (s) |
| --- | --- | --- | --- | --- |
| 5 | 20 | \({\sim } 10^{170} \) | 10 | 0.03 |
| 10 | 90 | \({\sim } 10^{780} \) | 28 | 0.13 |
| 20 | 380 | \({\sim } 10^{3500} \) | 36 | 1940 |

The adequacy of the results has been additionally checked via computation of the nearest matrix \( B_{*}\). This matrix should

  (a) possess a double eigenvalue;

  (b) have the value \( \Vert B_{*}-A \Vert \) equal to the square root of the least positive zero of \(\mathcal F(z)\);

  (c) satisfy the system of equations (3)–(5) (this property has been tested only for the orders \(n\le 8\));

  (d) have a number of real eigenvalues that differs from that of the matrix A by at most 2.

4 Singular Values

Let \(A\in \mathbb {R}^{n\times n}\) be a nonsingular matrix with the singular value decomposition as follows

$$\begin{aligned} A=WD_n V^{\top }, \end{aligned}$$
(20)

where \(D_n=\mathrm{diag}\,\{\sigma _1,\sigma _2, \ldots ,\sigma _n\}\), with singular values \(\sigma _1\ge \sigma _2\ge \ldots \ge \sigma _n>0\).

The following result [6, 8] gives us the distance to the nearest matrix with rank \(k<n\).

Theorem 4

One has

$$\min _{\mathrm{rank\,}B=k} ||A-B||=||A-A_k||=\left\{ \begin{array}{ll} \sigma _{k+1},&\text{for the 2-norm,}\\ \displaystyle \left[ \sum _{i=k+1}^n\sigma _i^2\right] ^{1/2} &\text{for the Frobenius norm.} \end{array}\right. $$

Here

$$A_k=WD_k V^{\top },\quad D_k=\mathrm{diag}\,\{\sigma _1,\sigma _2, \ldots ,\sigma _k,0,\ldots ,0\}.$$

According to this theorem, the Frobenius distance from the nonsingular A to the set of matrices with multiple eigenvalues satisfies the following inequality:

$$ d(A,\mathbb D) \le \sqrt{\sigma _{n-1}^2+\sigma _n^2}. $$

Indeed, the matrix \( A_{n-2} \) has rank \( n-2 \); hence it possesses the zero eigenvalue of multiplicity at least 2 and therefore belongs to \( \mathbb D \).

As for the distance \( d(A,\mathbb D) \) in the 2-norm, the following result [14] is known:

Theorem 5

Let the singular values of the matrix

$$\begin{aligned} M=\left[ \begin{array}{cc} A-\lambda I_n&{}\gamma I_n\\ \mathbb {O}_{n\times n}&{}A-\lambda I_n \end{array}\right] \end{aligned}$$
(21)

be ordered like \(\sigma _1(\lambda ,\gamma )\ge \sigma _2(\lambda ,\gamma )\ge \ldots \ge \sigma _{2n}(\lambda ,\gamma ) \ge 0 \). Then one has

$$d(A,\mathbb D)=\min _{\lambda \in \mathbb {C}}\max _{\gamma \ge 0} \sigma _{2n-1}(\lambda ,\gamma ).$$

It is well-known that for the matrix \(A \in \mathbb R^{n\times n}, n\ge 2\), Frobenius norm and the 2-norm are related by the inequality [7]

$$||A||_2\le ||A||_F\le \sqrt{n}||A||_2.$$

It is also known that \( ||A||_2=||A||_F \) if and only if \( {\text {rank}}(A) \le 1 \). According to Corollary 2, both norms coincide for the minimal perturbation \(E_{*}\). This results in an algorithm for the \(d(A,\mathbb D)\) computation that is an alternative to the one treated in Sect. 3.
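As an illustration of the min–max characterization of Theorem 5, below is a coarse grid-search sketch in NumPy, restricted to real \(\lambda \) and a finite range of \(\gamma \) (an approximation, not the authors' procedure). The test matrix \(\mathrm{diag}(1,2,4)\) is chosen because, by Sect. 5.1, its minimal perturbation has rank 1 and its Wilkinson distance equals half of the smallest eigenvalue gap, i.e., 1/2, so the 2-norm value returned by the search should coincide with the Frobenius distance.

```python
# Sketch: coarse grid search for min_lambda max_gamma sigma_{2n-1} (Theorem 5),
# with lambda restricted to the real axis; test matrix diag(1, 2, 4).
import numpy as np

A = np.diag([1.0, 2.0, 4.0])
n = A.shape[0]

def s2nm1(lam, gamma):
    M = np.block([[A - lam*np.eye(n), gamma*np.eye(n)],
                  [np.zeros((n, n)),  A - lam*np.eye(n)]])
    return np.linalg.svd(M, compute_uv=False)[-2]     # sigma_{2n-1}

lams   = np.linspace(0.0, 5.0, 201)
gammas = np.linspace(0.0, 5.0, 201)
print(min(max(s2nm1(l, g) for g in gammas) for l in lams))   # ~ 0.5, at lambda=1.5, gamma=0
```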

To find the singular values of the matrix (21), one needs the zeros of the polynomial

$$\begin{aligned} \det (MM^{\top }-\mu I_{2n}) \end{aligned}$$
(22)
$$=\det \left[ \begin{array}{cc} (A-\lambda I_n)(A-\lambda I_n)^{\top }+\gamma ^2I_{n}-\mu I_{n}&{}\gamma (A-\lambda I_n)^{\top }\\ \gamma (A-\lambda I_n)&{}(A-\lambda I_n)(A-\lambda I_n)^{\top } - \mu I_{n} \end{array}\right] $$

treated with respect to \(\mu \) (these zeros are the squares of the singular values); finding them is generally a nontrivial task. We will restrict our consideration to the classes of matrices A for which the Schur formula for the determinant of the block matrix (22) is applicable, i.e., for which it can be transformed into

$$\begin{aligned} \det (\mu ^2 I_n- \mu [2(A-\lambda I_n)(A-\lambda I_n)^{\top }+\gamma ^2I_n]+[(A-\lambda I_n)(A-\lambda I_n)^{\top }]^2). \end{aligned}$$
(23)

These happen to be symmetric, skew-symmetric, and orthogonal matrices. The singular values of the matrix (21) can then be expressed explicitly via the eigenvalues of A.

5 Distance via Matrix Eigenvalues

5.1 Symmetric Matrix

Theorem 6

Let A be a symmetric matrix with distinct eigenvalues \(\lambda _1,\lambda _2,\) \(\ldots ,\lambda _n\). Then

$$d(A,\mathbb D)= \frac{1}{2} \min _{1\le k<\ell \le n}|\lambda _k-\lambda _{\ell }|. $$

If this minimum is attained at the eigenvalues \(\lambda _1 \) and \( \lambda _2 \) with \( \lambda _2 > \lambda _1\), then the perturbation can be found as

$$\begin{aligned} E_{*}=\frac{1}{4}(\lambda _2-\lambda _{1})(P_1+P_{2})(P_1-P_2)^{\top }, \end{aligned}$$
(24)

where \(P_1\) and \(P_2\) are the eigenvectors of A corresponding to \(\lambda _1\) and \(\lambda _{2}\) respectively with \(\Vert P_1\Vert =\Vert P_2\Vert =1\).

Remark. Generically, the matrices \(E_{*}\) and \(B_{*}=A+E_{*}\) are not symmetric.

Proof

For \(j \in \{1,\dots , n\} \), denote by \(P_j \) the eigenvector of A corresponding to \( \lambda _j \) with \(\Vert P_j\Vert =1 \). Then \( P=(P_1,P_2,\ldots ,P_n)\) is the orthogonal matrix such that

$$ P^{\top }A P = \varLambda \quad {\mathrm {where}} \ \varLambda =\mathrm{diag}\,\{\lambda _1,\lambda _2,\ldots ,\lambda _n\}. $$

Since the orthogonal transformation does not influence the Frobenius distance, we reduce \(d(A,\mathbb D)\) to \(d(\varLambda ,\mathbb D)\).

In this case, \(\varLambda -\lambda I=(\varLambda -\lambda I)^{\top }\) and these matrices commute. Hence, the expression (23) is valid. Therefore, the squares of the singular values of the matrix (21) are the zeros of the polynomials

$$\mu ^2-\mu (2(\lambda _j-\lambda )^2+\gamma ^2)+(\lambda _j-\lambda )^4,\ j\in \{ 1,2,\ldots ,n\},$$

namely

$$\mu ^{(j)}_{1,2}=\frac{2(\lambda _j-\lambda )^2+\gamma ^2\pm \gamma \sqrt{\gamma ^2+4(\lambda _j-\lambda )^2}}{2}.$$

Differentiating w.r.t. \(\gamma \), we get the single stationary point \(\gamma =0\). According to [14], to find the 2-norm distance from \(A-\lambda I\) to the manifold of matrices with a multiple zero eigenvalue, one should find the singular values \(\sigma _n\) and \(\sigma _{n-1}\) of the matrix \((A-\lambda I)\). They are \(|\lambda _k-\lambda |\) and \(|\lambda _{\ell }-\lambda |\) for some \(k,\ell \). The minimal w.r.t. \(\lambda \) value of \(\sigma _{n-1}\) equals \(|\lambda _k-\lambda _{\ell }|/2\), attained where \(\lambda _k-\lambda =\lambda -\lambda _{\ell }\).

Assume that

$$\min _{1\le k<\ell \le n}|\lambda _k-\lambda _{\ell }|=|\lambda _1-\lambda _2|.$$

Denote

$$Q:=\left[ \begin{array}{rcccc} \frac{1}{\sqrt{2}}&-\frac{1}{\sqrt{2}}&0&\ldots &0\\ \frac{1}{\sqrt{2}}& \frac{1}{\sqrt{2}}&0&\ldots &0\\ 0&0&1&\ldots &0\\ \ldots &\ldots &\ldots &\ldots &\ldots \\ 0&0&0&\ldots &1 \end{array}\right] ,\ \text{then}\ Q^{\top }\varLambda Q=\left[ \begin{array}{ccccc} \frac{\lambda _1+\lambda _2}{2}&\frac{\lambda _2-\lambda _1}{2}&0&\ldots &0\\ \frac{\lambda _2-\lambda _1}{2}&\frac{\lambda _1+\lambda _2}{2}&0&\ldots &0\\ 0&0&\lambda _3&\ldots &0\\ \ldots &\ldots &\ldots &\ldots &\ldots \\ 0&0&0&\ldots &\lambda _n \end{array}\right] .$$

For this matrix, \(\tilde{E}_{*}=\left[ \begin{array}{ccccc} 0&{}\frac{\lambda _1-\lambda _2}{2}&{}0&{}\ldots &{}0\\ 0&{}0&{}0&{}\ldots &{}0\\ \ldots &{}\ldots &{}\ldots &{}\ldots &{}\ldots \\ 0&{}0&{}0&{}\ldots &{}0 \end{array}\right] \). Obviously, we get

$$E_{*}=PQ\tilde{E}_{*}Q^{\top }P^{\top }=\frac{\lambda _2-\lambda _1}{4}(P_1+P_2)(P_1-P_2)^{\top }.$$

   \(\square \)

Example 5

For the matrix

$$A=\frac{1}{9}\left[ \begin{array}{rrr} -269&{}-98&{}76\\ -98&{}-296&{}22\\ 76&{}22&{}-209 \end{array}\right] , $$

one has

$$\lambda _1=-45,\ \lambda _2=-25,\ \lambda _3=-16; \quad \text{the minimal gap is attained at } \lambda _2, \lambda _3, \text{ with}$$
$$ P_2=[-1/3,\,2/3,\,2/3]^{\top },\quad P_3=[2/3,\,-1/3,\,2/3]^{\top }.$$
$$d(A,\mathbb D)=\frac{|-16-(-25)|}{2}=\frac{9}{2} \quad \text{and} \quad E_{*}= \frac{\lambda _3-\lambda _2}{4}(P_2+P_3)(P_2-P_3)^{\top }=\left[ \begin{array}{rrr} -3/4 & 3/4 & 0 \\ -3/4 & 3/4 & 0 \\ -3 & 3 & 0 \end{array} \right] . $$
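The example is easy to recompute from scratch. Below is a NumPy sketch (eigendecomposition, half of the minimal gap, perturbation by formula (24)); it is an illustration, not the authors' code.

```python
# Sketch: recompute Example 5 via Theorem 6 and formula (24) with NumPy.
import numpy as np

A = np.array([[-269., -98.,  76.],
              [ -98., -296.,  22.],
              [  76.,   22., -209.]]) / 9

w, V = np.linalg.eigh(A)                   # w = [-45, -25, -16], orthonormal eigenvectors
k = np.argmin(np.diff(w))                  # the closest pair (w[k], w[k+1])
print(0.5*(w[k+1] - w[k]))                 # d(A, D) = 9/2

P1, P2 = V[:, k], V[:, k+1]                # unit eigenvectors of the closest pair
E = 0.25*(w[k+1] - w[k])*np.outer(P1 + P2, P1 - P2)      # formula (24)
print(np.linalg.norm(E, 'fro'))            # 9/2
print(np.sort(np.linalg.eigvals(A + E).real))   # [-45, -20.5, -20.5]
```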

5.2 Skew-Symmetric Matrix

Theorem 7

Let the nonzero eigenvalues of a skew-symmetric matrix A be

$$\pm b_1\mathbf {i},\pm b_2\mathbf {i},\ldots ,\pm b_{m}\mathbf {i} \ {\mathrm where} \ 0<b_1<b_2<\ldots <b_m. $$

Then

$$d(A, \mathbb D)=b_1 $$

and the minimal perturbation can be found as

$$\begin{aligned} E_{*}=-b_1 \Re (P_1) \Im (P_1)^{\top }, \end{aligned}$$
(25)

where \(P_1\) is the eigenvector of A corresponding to the eigenvalue \( b_1{\textbf {i}} \) with \(\Vert \Re (P_1)\Vert \) =\(\Vert \Im (P_1)\Vert =1 \).

Proof

For \(j \in \{1,\dots , m\} \), denote by \(P_j \) the eigenvector of A corresponding to \(b_j {\textbf {i}} \) with \(\Vert \Re (P_j)\Vert =\Vert \Im (P_j)\Vert =1 \). If A possesses a zero eigenvalue, denote by \(P_{0}\) the corresponding eigenvector with \(\Vert P_{0} \Vert =1 \). Then the orthogonal matrix

$$P=(\Re (P_1),\Im (P_1),\Re (P_2),\Im (P_2),\ldots ,\Re (P_m),\Im (P_m),\{P_{0}\})$$

is such that

$$P^{\top }A P = \varUpsilon \quad {\mathrm {where} } \ \varUpsilon :=\mathrm{diag}\,\{\varUpsilon _1,\varUpsilon _2,\ldots ,\varUpsilon _m,\{0\}\},$$
$$\varUpsilon _k:= \left[ \begin{array}{cc} 0&{}b_k\\ -b_k&{}0 \end{array}\right] , k\in \{1,2,\ldots ,m\} $$

(we set in braces the entries of the matrices corresponding to the case of existence of zero eigenvalue for A).

Since an orthogonal transformation does not influence the Frobenius distance, we reduce \(d(A,\mathbb D)\) to \(d(\varUpsilon ,\mathbb D)\). In this case,

$$(\varUpsilon -\lambda I)(\varUpsilon -\lambda I)^{\top }=\mathrm{diag}\,\{ \tilde{\varUpsilon }_1,\tilde{\varUpsilon }_2,\ldots ,\tilde{\varUpsilon }_m,\{ \lambda ^2\} \},$$

where

$$\tilde{\varUpsilon }_k:=\left[ \begin{array}{cc} b_k^2+\lambda ^2&0\\ 0&b_k^2+\lambda ^2 \end{array}\right] \quad \text{for} \ k\in \{1,\dots ,m\}. $$

It is evident that

$$(\varUpsilon -\lambda I)(\varUpsilon -\lambda I)^{\top }(\varUpsilon -\lambda I)=(\varUpsilon -\lambda I)^2(\varUpsilon -\lambda I)^{\top }.$$

Hence, the expression (23) is valid.

Therefore, the squares of the singular values of the matrix (21) are the zeros of the polynomials

$$\mu ^2-\mu (2(\lambda ^2+b_k^2)+\gamma ^2)+(\lambda ^2+b_k^2)^2,\ k\in \{1,2,\ldots ,m\},$$

namely

$$\mu ^{(k)}_{1,2}=\frac{1}{2}\left[ 2(\lambda ^2+b_k^2)+\gamma ^2\pm \gamma \sqrt{\gamma ^2+4(\lambda ^2+b_k^2)}\right] .$$

Differentiating w.r.t. \(\gamma \), we get a single stationary point \(\gamma =0\). According to [14], to find the 2-norm distance from \(\varUpsilon -\lambda I\) to the manifold of matrices with multiple zero eigenvalue, it is sufficient to compute the singular values \(\sigma _n\) and \(\sigma _{n-1}\) of this matrix. They are

$$ \mathrm{either}\ \sigma _n=\sigma _{n-1}=\sqrt{b_k^2+\lambda ^2} \ \mathrm{for\ some} \ k, \ \mathrm{or}\ \sigma _{n-1}=\sqrt{b_k^2+\lambda ^2},\sigma _n=|\lambda |. $$

The minimal w.r.t. \(\lambda \) value of \(\sigma _{n-1}\) equals \(b_1\) and is attained at \(\lambda =0\).

The corresponding perturbation

$$E_{*}=P\left[ \begin{array}{ccccc} 0&{}-b_1&{}0&{}\ldots &{}0\\ 0&{}0&{}0&{}\ldots &{}0\\ \ldots &{}\ldots &{}\ldots &{}\ldots &{}\ldots \\ 0&{}0&{}0&{}\ldots &{}0 \end{array}\right] P^{\top }=-b_1\Re (P_1) \Im (P_1)^{\top }. $$

   \(\square \)

Corollary 5

In the notation of Theorem 7, the distance \(d(A, \mathbb D) \) is provided by a continuum of perturbations \( E_{*} \) contained in the set

$$ \left\{ -b_1 (\eta \Re (P_1) +\theta \Im (P_1))(-\eta \Im (P_1)+\theta \Re (P_1))^{\top } \mid \{\eta , \theta \} \subset \mathbb R, \eta ^2+\theta ^2=1 \right\} . $$
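Theorem 7 and formula (25) are easy to test numerically. The following NumPy sketch does so for a randomly generated skew-symmetric matrix; the size and the random seed are arbitrary, and the eigenvector returned by NumPy is rescaled so that its real and imaginary parts have unit norm.

```python
# Sketch: Theorem 7 and formula (25) for a random 4x4 skew-symmetric matrix.
import numpy as np

rng = np.random.default_rng(1)
R = rng.standard_normal((4, 4))
A = R - R.T                                          # skew-symmetric

w, V = np.linalg.eig(A)
b1 = min(abs(w.imag))                                # smallest b_k (n even: no zero eigenvalue)
P1 = V[:, np.argmin(np.abs(w - 1j*b1))]              # eigenvector for +i b1
x = P1.real / np.linalg.norm(P1.real)                # ||Re P1|| = ||Im P1|| = 1
y = P1.imag / np.linalg.norm(P1.imag)

E = -b1 * np.outer(x, y)                             # formula (25)
print(np.linalg.norm(E, 'fro'), b1)                  # both equal b1 = d(A, D)
print(np.round(np.linalg.eigvals(A + E), 8))         # +-i b1 collapses into a double 0
```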

5.3 Orthogonal Matrix

Theorem 8

Let \( n \ge 3\), and the eigenvalues of an orthogonal matrix A, other than \(\pm 1\), be

$$\begin{aligned} \cos \alpha _1\pm \mathbf {i}\sin \alpha _1,\cos \alpha _2\pm \mathbf {i}\sin \alpha _2,\ldots ,\cos \alpha _{m}\pm \mathbf {i}\sin \alpha _{m}, \end{aligned}$$
(26)

where \( 0<\sin \alpha _1\le \sin \alpha _2\le \ldots \le \sin \alpha _m \). Then

$$\begin{aligned} d(A, \mathbb D)= \sin \alpha _1 \, , \end{aligned}$$
(27)

and the minimal perturbation can be found as

$$\begin{aligned} E_{*}=-(\sin \alpha _1) \Re (P_1) \Im (P_1)^{\top }, \end{aligned}$$
(28)

where \(P_1\) is the eigenvector of A corresponding to the eigenvalue \(\cos \alpha _1+{\textbf {i}}\sin \alpha _1\) with \(||\Re (P_1)||=||\Im (P_1)||=1\).

We present two independent proofs for this result: the first one following from Theorem 3 while the second one exploiting the considerations of Sect. 4.

Proof

I. Since \( AA^{\top }=I \), the objective function (9) can be transformed into

$$ G(U)=1- \left( U^{\top }AU\right) ^2, $$

and system (19) is then replaced by

$$\begin{aligned} \left( U^{\top }AU\right) \left( A^{\top }+A\right) U - \mu U=\mathbb O . \end{aligned}$$
(29)

Multiply it by \(U^{\top } A^{\top }\):

$$ \left( U^{\top }AU\right) \left[ U^{\top }(A^{\top })^2U + 1 -\mu \right] =0, $$

and we get two alternatives:

$$ \mathrm{either} \ U^{\top }AU=0 \quad \mathrm{or } \ \mu = 1+ U^{\top }A^2U. $$

If the second alternative takes place, substitute the expression for \( \mu \) into (29):

$$ \left( U^{\top }AU\right) (A^{\top }+A)U-(1+ U^{\top }A^2U)U=\mathbb O. $$

Wherefrom it follows that

$$\begin{aligned} (A^{\top }+A)U=\frac{1+ U^{\top }A^2U}{U^{\top }AU}U. \end{aligned}$$
(30)

If there exists a solution \( U=U_{*} \ne \mathbb O \) for this equation, then \( U_{*} \) is necessarily an eigenvector of \( A^{\top }+A \) corresponding to the eigenvalue

$$ \nu _{*}=(1+ U_{*}^{\top }A^2U_{*})/(U_{*}^{\top }AU_{*}). $$

The matrix \( A^{\top }+A \) is symmetric with the eigenvalues \( 2\cos \alpha _1, \dots , 2\cos \alpha _m \), each of multiplicity 2, and, possibly, \(\pm 2\). Substitution of \( U=U_{*} \) into (30) and multiplication by \(U_{*}^{\top } \) yields

$$ \nu _{*}=2U_{*}^{\top }AU_{*}=2 \cos \alpha _j \ \text{ for } \text{ some } \ j. $$

Therefore, the critical values of the function G(U) are in the set \(\{1-\cos ^2 \alpha _j\}_{j=1}^m \). This results in (27).

The alternative \( U_{*}^{\top }AU_{*}=0 \) for \(U_{*}^{\top }U_{*}=1 \) corresponds to the case where A possesses eigenvalues \(\pm {\textbf {i}} \). The result (27) remains valid.    \(\square \)

Proof

II. For \(j \in \{1,\dots ,m\}\), denote by \( P_j\) the eigenvector of A corresponding to the eigenvalue \(\cos \alpha _j+\mathbf {i}\sin \alpha _j\) with \(||\Re (P_j)||=||\Im (P_j)||=1\). Denote by \(P_{[1]}\) and \(P_{[-1]}\) the eigenvectors corresponding to the eigenvalues 1 and \(-1\), respectively (if any), with \(\Vert P_{[1]}\Vert =\Vert P_{[-1]}\Vert =1\). Then the orthogonal matrix

$$ P=(\Re (P_1),\Im (P_1),\Re (P_2),\Im (P_2),\ldots ,\Re (P_m),\Im (P_m),\{ P_{[1]},P_{[-1]}\}) $$

is such that

$$ P^{\top }A P = \varOmega \quad { \mathrm {where} } \ \varOmega =\mathrm{diag}\,\{\varOmega _1,\varOmega _2,\ldots ,\varOmega _m,\{ 1,-1\} \}, $$

where

$$ \varOmega _k:=\left[ \begin{array}{cc} \cos \alpha _k&{}\sin \alpha _k\\ -\sin \alpha _k&{}\cos \alpha _k \end{array}\right] \quad \mathrm{for} \ k\in \{1,2,\ldots ,m\} $$

(we set in braces the entries of the matrices corresponding to the case of existence of either of eigenvalues 1 or \(-1\) or both for A).

Since the orthogonal transformation does not influence the Frobenius distance, we reduce \(d(A,\mathbb D)\) to \(d(\varOmega ,\mathbb D)\). In this case,

$$(\varOmega -\lambda I)(\varOmega -\lambda I)^{\top }=\mathrm{diag}\,\{\tilde{\varOmega }_1,\tilde{\varOmega }_2,\ldots ,\tilde{\varOmega }_m,\{ (1-\lambda )^2,(1+\lambda )^2\} \},$$

where

$$\tilde{\varOmega }_k:=\left[ \begin{array}{cc} (\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k&0\\ 0&(\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k \end{array}\right] \quad \text{for} \ k\in \{1,2,\dots ,m\}.$$

It is evident that

$$(\varOmega -\lambda I)(\varOmega -\lambda I)^{\top }(\varOmega -\lambda I)=(\varOmega -\lambda I)^2(\varOmega -\lambda I)^{\top }.$$

Hence, expression (23) is valid. In this case, the squares of the singular values of the matrix (21) are the zeros of the polynomials

$$\mu ^2-\mu \left( 2\left( (\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k\right) +\gamma ^2\right) +\left( (\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k\right) ^2, $$

namely:

$$\mu ^{(k)}_{1,2}=\frac{2((\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k)+\gamma ^2\pm \gamma \sqrt{\gamma ^2+4((\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k)}}{2}.$$

Differentiating w.r.t. \(\gamma \), we get a single stationary point \(\gamma =0\). According to [14], to find the 2-norm distance from \(\varOmega -\lambda I\) to the manifold of matrices with multiple zero eigenvalue, one should find the singular values \(\sigma _n\) and \(\sigma _{n-1}\) of this matrix. They are either

$$\sigma _n=\sigma _{n-1}=\sqrt{(\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k} $$

for  some k or

$$ \sigma _{n-1}=\sqrt{(\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k},\sigma _n=|1-\lambda | \, . $$

The minimal value of \(\sigma _{n-1}\) w.r.t. \(\lambda \) equals \(\sin \alpha _1\) in both cases.

The minimal perturbation

$$E_{*}=P\left[ \begin{array}{ccccc} 0&{}-\sin \alpha _1&{}0&{}\ldots &{}0\\ 0&{}0&{}0&{}\ldots &{}0\\ \vdots &{}&{}&{}&{} \vdots \\ 0&{}0&{}0&{}\ldots &{}0 \end{array}\right] P^{\top }=-(\sin \alpha _1)\Re (P_1) \Im (P_1)^{\top }.$$

   \(\square \)

Corollary 6

In the notation of Theorem 8, the distance \(d(A, \mathbb D) \) is provided by a continuum of perturbations \( E_{*} \) contained in the set

$$ \left\{ (-\sin \alpha _1) (\eta \Re (P_1) +\theta \Im (P_1))(-\eta \Im (P_1)+\theta \Re (P_1))^{\top } \mid \{\eta , \theta \} \subset \mathbb R, \eta ^2+\theta ^2=1\right\} . $$

Example 6

For the matrix

$$A=\frac{1}{3}\left[ \begin{array}{rrr} -2&{}-2&{}1\\ 1&{}-2&{}-2\\ -2&{}1&{}-2 \end{array}\right] , $$

one has

$$ \lambda _{1,2}=-\frac{1}{2}\pm \mathbf {i} \frac{\sqrt{3}}{2}, \ \lambda _3=-1, \ P_1=\left[ -\frac{2}{\sqrt{6}},\frac{1}{\sqrt{6}},\frac{1}{\sqrt{6}} \right] ^{\top } + \mathbf{i}\left[ 0,\frac{1}{\sqrt{2}},-\frac{1}{\sqrt{2}} \right] ^{\top }. $$

Here \( d(A, \mathbb D) = \sqrt{3}/2\approx 0.866025 \) and there is an infinite number of corresponding perturbation matrices (10) generated by vectors \( U_{*}\) chosen from the span of \(\Re (P_1) \) and \( \Im (P_1)\). For instance:

$$ \begin{array}{c} U_{*}:=\Re (P_1) \\ \Downarrow \\ E_{*}=\left[ \begin{array}{rrr} 0&1/2&-1/2\\ 0&-1/4&1/4\\ 0&-1/4&1/4 \end{array}\right] ; \end{array} \qquad \begin{array}{c} U_{*}:=\Im (P_1) \\ \Downarrow \\ E_{*}=\left[ \begin{array}{rrr} 0 & 0 & 0\\ -1/2 & 1/4 & 1/4\\ 1/2 & -1/4 & -1/4 \end{array}\right] . \end{array} $$

In both cases, the spectrum of the matrix \( B_{*}\) is \( \{-1,-1/2,-1/2\}\).    \(\square \)
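The example can be double-checked numerically. In the NumPy sketch below (an illustration, not the authors' code), any unit vector U from the plane orthogonal to the eigenvector of the eigenvalue \(-1\), i.e., from the span of \(\Re (P_1)\) and \(\Im (P_1)\), is admissible in formula (10); the two vectors used above are taken as representatives.

```python
# Sketch: check Example 6 with NumPy for the two choices of U_* used above.
import numpy as np

A = np.array([[-2., -2.,  1.],
              [ 1., -2., -2.],
              [-2.,  1., -2.]]) / 3

for U in [np.array([-2., 1., 1.]) / np.sqrt(6),      # Re(P_1)
          np.array([ 0., 1., -1.]) / np.sqrt(2)]:    # Im(P_1)
    kappa = U @ A @ U                                # = cos(alpha_1) = -1/2
    E = np.outer(U, U) @ (kappa*np.eye(3) - A)       # formula (10)
    print(np.linalg.norm(E, 'fro'))                  # sqrt(3)/2 ~ 0.866025
    print(np.round(np.sort(np.linalg.eigvals(A + E).real), 6))   # [-1, -0.5, -0.5]
```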

Remark. In all the cases where the distance \(d(A,\mathbb {D})\) is achieved at \(\gamma =0\) and the two minimal singular values of the matrix (21) coincide, i.e., \(\sigma _{2n-1}(\lambda ,0)=\sigma _{2n}(\lambda ,0)\), we have found a rank 1 minimal perturbation, whereas in the work [14] it is described as a rank 2 matrix.

6 Conclusions

We have investigated Wilkinson's problem of evaluating the distance from a given matrix to the set of matrices possessing multiple eigenvalues. The structure of the minimal perturbation matrix is clarified, which gives us the opportunity to compute symbolically the distance equation, whose zero set contains the critical values of the squared distance function.

The computational complexity of the proposed solution is, as is traditional for analytical approaches, high. Although this price is justified by the reliability of the computed results, we still hope to reduce it in further investigations.

There exists a definite similarity between the considered problem and that of the Routh–Hurwitz distance to instability computation. For instance, the approach suggested in Sect. 3 has its counterpart in the one developed by Ch. Van Loan for the distance to instability problem [11, 12]. This is also a subject for subsequent investigation.