
1 Introduction

Given a matrix \(A\in \mathbb R^{n\times n} \) with distinct eigenvalues, we intend to find the distance from A to the set \( \mathbb D \) of real matrices with multiple eigenvalues as well as the corresponding minimal perturbation, i.e., a matrix \(E_{*} \in \mathbb {R}^{n\times n} \) of the minimal norm such that \(B_{*}=A+E_{*} \in \mathbb D \).

The problem under consideration is known as Wilkinson’s problem [21] and the desired distance, further denoted as \(d(A, \mathbb D)\), is called the Wilkinson distance of A [2, 15]. Wilkinson’s problem is closely related to ill-conditioning of eigenvalue problems. The ill-conditioning of a linear system is determined by the distance of the coefficient matrix from the set of singular matrices. For eigenvalue problems, the set of matrices with multiple eigenvalues plays the role of singularity [23]. The Wilkinson distance can be considered as a measure of sensitivity of the worst-conditioned eigenvalue of A. By eigenvalue perturbation theory, a matrix that is close to a defective matrix has an eigenvalue with large condition number. Conversely, any matrix with an ill-conditioned eigenvalue is close to a defective matrix [18, 22].

For the spectral and the Frobenius norms, the problem has been studied intensively by Wilkinson [22,23,24] as well as by other researchers [2, 4, 5, 10, 14, 18]. In the works [1, 3, 13, 15], generalizations of Wilkinson’s problem for the cases of prescribed eigenvalues or multiplicities and matrix pencils are studied. However, several aspects of the problem still need further clarification.

The present paper is devoted to the stated problem for the case of Frobenius norm. It is organized as follows.

In Sect. 2, we start with algebraic background for the stated problem. We first detail the structure of the set \( \mathbb D \) in the matrix space. The cornerstone notion here is the discriminant of the characteristic polynomial of a matrix. Being a polynomial function in the entries of the matrix, the discriminant permits one to translate the problem of evaluation of \( d(A, \mathbb D)\) into that of finding the distance from a point to an algebraic manifold in the matrix space. This makes it possible to attack the problem within the framework of the approach already exploited by the present authors in the preceding studies [11, 12] on the distance to instability in the matrix space. The approach is aimed at the construction of the so-called distance equation, i.e., the univariate equation whose zero set contains all the critical values of the squared distance function. Its construction is theoretically feasible via application of symbolic methods for elimination of variables in an appropriate multivariate algebraic system. Unfortunately, the practical realization faces the difficulty that the number of variables to be eliminated grows rapidly with the order of the matrix.

To bypass this, we reformulate the problem in terms of the minimal perturbation matrix. In Sect. 3, we prove that this matrix is a rank 1 matrix. We then reduce the problem of finding it to a constrained optimization problem in n variables with an objective function of degree 4. Some examples are presented illustrating the applicability of the developed algorithm.

The discovered property of the perturbation matrix makes it possible to look at the problem from another side. Generically, the 2-norm of a matrix does not equal its Frobenius norm. However, for a rank 1 matrix (and this is exactly the case of the minimal perturbation matrix), these norms coincide. This allows one to verify the results obtained within the symbolic approach against their counterparts obtained for the 2-norm case [14]. This issue is discussed in Sect. 4, while in Sect. 5 both approaches are illustrated for three classes of matrices for which the distance \(d(A, \mathbb D)\) can be explicitly expressed via the eigenvalues of A. These happen to be symmetric, skew-symmetric and orthogonal matrices. Quite unexpectedly for the authors, it turned out that, for some of these classes, every representative has a continuum of nearest matrices in \( \mathbb D \).

Notation. For a matrix \(A \in \mathbb R^{n\times n} \), \(f_A(\lambda ) \) denotes its characteristic polynomial, \({\text {adj}} (A)\) stands for its adjoint matrix, \( d(A, \mathbb D) \) denotes the distance from A to the set \( \mathbb D \) of matrices possessing a multiple eigenvalue. \( E_{*} \) and \( B_{*} = A+ E_{*} \) stand for, correspondingly, the (minimal) perturbation matrix and the nearest to A matrix in \( \mathbb D\) (i.e., \(d(A,\mathbb D)=\Vert A- B_{*}\Vert \)); we then term by \(\lambda _{*}\) the multiple eigenvalue of \(B_{*}\). I (or \( I_n\)) denotes the identity matrix (of the corresponding order). \(\mathcal D\) (or \(\mathcal D_{\lambda } \)) denotes the discriminant of a polynomial (with subscript indicating the variable).

Remark. All the computations were performed in CAS Maple 15.0 (the LinearAlgebra package and the functions discrim and resultant). Although all the approximate computations have been performed with the accuracy \(10^{-40}\), the final results are rounded to \( 10^{-6} \).

2 Algebraic Preliminaries

It is well-known that, in the \((n+1)\)-dimensional space of the coefficients of the polynomial \( f(\lambda )=a_0\lambda ^n + a_1\lambda ^{n-1} +\dots +a_n,\ n \ge 2 \), the manifold of polynomials with multiple zeros is defined by the equation

$$\begin{aligned} D(a_0,a_1,\dots ,a_n)=0 \quad {where} \ D:=\mathcal D_{\lambda }(f(\lambda )) \end{aligned}$$
(1)

denotes the discriminant of the polynomial. The discriminant can be represented in different ways, for instance (here for \( n=4 \)) as the Sylvester determinant

$$ \mathcal D_{\lambda }(a_0\lambda ^4+a_1\lambda ^3+ a_2\lambda ^2 +a_3\lambda +a_4)= \frac{1}{4^2}\left| \begin{array}{rrrrrr} a_1 &{} 2a_2 &{} 3 a_3 &{} 4 a_4 &{} 0 &{} 0 \\ 0 &{} a_1 &{} 2a_2 &{} 3 a_3 &{} 4 a_4 &{} 0 \\ 0 &{} 0 &{} a_1 &{} 2a_2 &{} 3 a_3 &{} 4 a_4 \\ 0 &{} 0 &{} 4a_0 &{} 3a_1 &{} 2 a_2 &{} a_3 \\ 0 &{} 4a_0 &{} 3a_1 &{} 2 a_2 &{} a_3 &{} 0 \\ 4a_0 &{} 3a_1 &{} 2 a_2 &{} a_3 &{} 0 &{} 0 \end{array} \right| \, . $$

The discriminant \( D(a_0,a_1,\dots ,a_n) \) is a homogeneous polynomial over \( \mathbb Z \) of order \(2n-2\) in its variables, and it is irreducible over \( \mathbb Z \).
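These properties are easy to observe in a small computational sketch. The snippet below is illustrative only and uses Python with SymPy (the authors' own computations were carried out in Maple with discrim and resultant).

```python
# Sketch (Python/SymPy; the authors used Maple's discrim and resultant).
import sympy as sp

lam, a0, a1, a2, a3, a4 = sp.symbols('lambda a0 a1 a2 a3 a4')

# Discriminant of a generic quartic as a polynomial in the coefficients.
f = a0*lam**4 + a1*lam**3 + a2*lam**2 + a3*lam + a4
D = sp.expand(sp.discriminant(f, lam))

# Homogeneous of order 2n - 2 = 6 in (a0, ..., a4): every monomial has total degree 6.
print({sum(m) for m in sp.Poly(D, a0, a1, a2, a3, a4).monoms()})      # {6}

# The discriminant vanishes exactly for polynomials with a multiple zero.
print(sp.discriminant(sp.expand((lam - 1)**2 * (lam**2 + 1)), lam))   # 0
```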

The following result [16] is much less known.

Theorem 1 (Jacobi)

If \(f(\lambda )\) possesses a unique multiple zero \( \lambda _{*} \) and its multiplicity equals 2, then the following ratio is valid

$$\begin{aligned} 1 : \lambda _{*} : \lambda _{*}^2 : \dots : \lambda _{*}^n = \frac{\partial D}{\partial a_n} : \frac{\partial D}{\partial a_{n-1}} : \frac{\partial D}{\partial a_{n-2}} : \dots : \frac{\partial D}{\partial a_{0}} \, . \end{aligned}$$
(2)
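The ratio (2) can be verified directly on a small example. The following SymPy sketch (an illustration under the assumption of a cubic with the double zero \( \lambda _{*}=2 \), not the authors' code) evaluates the partial derivatives of the discriminant at the coefficient vector of \( f(\lambda )=(\lambda -2)^2(\lambda +1) \).

```python
# Sketch: check the ratio (2) for f(lambda) = (lambda-2)^2 (lambda+1) with SymPy.
import sympy as sp

lam = sp.Symbol('lambda')
a = sp.symbols('a0:4')                                   # a0, a1, a2, a3

D = sp.discriminant(a[0]*lam**3 + a[1]*lam**2 + a[2]*lam + a[3], lam)
point = {a[0]: 1, a[1]: -3, a[2]: 0, a[3]: 4}            # (lambda-2)^2 (lambda+1)

grads = [sp.diff(D, a[3 - k]).subs(point) for k in range(4)]
# dD/da3 : dD/da2 : dD/da1 : dD/da0  =  1 : lambda_* : lambda_*^2 : lambda_*^3
print([sp.simplify(g / grads[0]) for g in grads])        # [1, 2, 4, 8]
```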

To solve the problem stated in Introduction, one needs to transfer the discriminant manifold (1) into the matrix space. The corresponding manifold is then defined by a homogeneous polynomial of order \( n(n-1)\) in the matrix entries:

$$\begin{aligned} \mathfrak D(B):=\mathcal D_{\lambda }(f_B(\lambda ))=0\ . \end{aligned}$$
(3)

We will further denote this manifold in \( \mathbb R^{n^2} \) as \( \mathbb D \). The problem of distance evaluation between a given matrix A and \( \mathbb D \) can be viewed as a constrained optimization problem:

$$\begin{aligned} d^2(A, \mathbb D)=\min _{B\in \mathbb R^{n\times n}} \Vert B-A\Vert ^2\ \text {subject\, to }\ (3) \, . \end{aligned}$$
(4)

Consider the Lagrange function for this problem

$$ F(B,\mu ):= \Vert B-A\Vert ^2 - \mu \mathfrak D(B) \, . $$

Evidently, \( \partial F / \partial \mu = 0 \) is equivalent to (3). Differentiation with respect to the entries of B yields

$$\begin{aligned} 2 (b_{jk}-a_{jk}) - \mu \partial \mathfrak D(B) /\partial b_{jk}=0\ \text { for } \ \{j,k\} \subset \{1,\dots ,n\} . \end{aligned}$$
(5)

Since the system (3)–(5) is an algebraic one, it admits application of symbolic methods of elimination of variables. We attach to the considered system an extra equation

$$\begin{aligned} z= \Vert B-A\Vert ^2 \end{aligned}$$
(6)

and then aim at finding the so-called distance equation

$$ \mathcal F(z)=0 $$

resulting from the elimination of all the variables but z from this system. Positive zeros of this equation are the critical values of the squared distance function for the problem (4).

Example 1

For the matrix \( A =[a_{jk}]_{j,k=1}^2 \) with the characteristic polynomial \( f_A(\lambda )\), the system (5) is linear with respect to \(\{b_{jk}\}_{j,k=1}^2 \) and the distance equation is easily computed as

$$ \mathcal F(z):=4096(a_{12}-a_{21})^2\left[ (a_{11}-a_{22})^2 + (a_{12}+a_{21})^2 \right] $$
$$\begin{aligned} \times \left\{ \left[ 4z- \mathcal D_{\lambda } (f_A(\lambda )) \right] ^2-16(a_{12}-a_{21})^2z \right\} =0. \end{aligned}$$
(7)

It turns out that for any matrix A such that \( \mathcal D_{\lambda } (f_A(\lambda )) \ne 0 \), the distance equation reduces to the quadratic factor in braces in (7), and \( d^2(A,\mathbb D)\) equals its minimal zero.

For the matrix

$$ A=\left[ \begin{array}{rr} s &{} t \\ -t &{} s \end{array} \right] \quad {\mathrm where} \ t > 0, $$

the polynomial \( \mathcal F(z)\) vanishes identically, since the factor \( (a_{11}-a_{22})^2 + (a_{12}+a_{21})^2 \) in (7) equals zero. Nevertheless, the quadratic factor in braces in (7) possesses a double zero, namely \(z=t^2\), and \( d(A, \mathbb D)=t \). Surprisingly, this distance is provided by a continuum of perturbation matrices (and thus of nearest matrices in \(\mathbb D \)), namely

$$ E_{*}= \frac{t}{2} \left[ \begin{array}{cc} \sin \varphi &{} -1 + \cos \varphi \\ 1 + \cos \varphi &{} -\sin \varphi \end{array} \right] , \ \quad {\mathrm where} \ \varphi \in [0, 2 \pi ). $$

This example suggests that similar difficulties are to be expected when solving the stated distance evaluation problem for orthogonal or skew-symmetric matrices A.    \(\square \)
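The continuum of minimal perturbations found above is easy to confirm numerically; below is a NumPy sketch with arbitrarily chosen values of s and t (an illustration, not the authors' code).

```python
# Sketch: every E_*(phi) has Frobenius norm t and makes the eigenvalue s double.
import numpy as np

s, t = 1.3, 0.7                                   # arbitrary test values
A = np.array([[s, t], [-t, s]])

for phi in np.linspace(0.0, 2*np.pi, 7, endpoint=False):
    E = (t/2) * np.array([[np.sin(phi), -1 + np.cos(phi)],
                          [ 1 + np.cos(phi), -np.sin(phi)]])
    print(np.linalg.norm(E, 'fro'), np.linalg.eigvals(A + E))   # t, eigenvalue s twice
```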

In the general case, computation of the distance equation via the solution of the system (3)–(5)–(6) is hardly feasible due to the drastic increase in the number of variables (i.e., the entries of the matrix B) to be eliminated. To overcome this difficulty, let us reformulate the problem in terms of the entries of the perturbation matrix.

3 Distance Equation and Perturbation Matrix

Theorem 2

The matrices \( E_{*} \) and \( B_{*} \) are linked by the equality

$$\begin{aligned} E_{*}=\varkappa \left[ f_{*}(B_{*}) \right] ^{\top }, \end{aligned}$$
(8)

where

$$ f_{*} (\lambda ):= \frac{f_{B_{*}}(\lambda )}{\lambda -\lambda _{*}}, $$

and \( \varkappa \in \mathbb R \) is some scalar.

Proof

We start with system (5) resulting from application of the Lagrange method to problem (4). Compute \( \partial \mathfrak D(B) /\partial b_{jk} \) as a composite function with the coefficients of characteristic polynomial \(f_B(\lambda )=\lambda ^n+p_1\lambda ^{n-1}+\dots +p_n \) treated as intermediate variables:

$$ \frac{\partial \mathfrak D(B)}{\partial b_{jk}} = \frac{\partial \mathfrak D(B)}{\partial p_0} \frac{\partial p_0}{\partial b_{jk}} + \frac{\partial \mathfrak D(B)}{\partial p_1} \frac{\partial p_1}{\partial b_{jk}} + \dots + \frac{\partial \mathfrak D(B)}{\partial p_n} \frac{\partial p_n}{\partial b_{jk}}. $$

(We set here \( p_0:=1\) and thus the first term in the right-hand side is just 0). Under the condition \( \mathfrak D(B)=0 \) (i.e., the matrix \( B=B_{*} \) possesses a multiple eigenvalue \( \lambda _{*} \)), the Jacobi ratio (2) is fulfilled

$$ 1 : \lambda _{*} : \lambda _{*}^2 : \dots : \lambda _{*}^n = \frac{\partial \mathfrak D}{\partial p_n} : \frac{\partial \mathfrak D}{\partial p_{n-1}} : \frac{\partial \mathfrak D}{\partial p_{n-2}} : \dots : \frac{\partial \mathfrak D}{\partial p_{0}} \, . $$

Therefore,

$$ \frac{\partial \mathfrak D}{\partial p_{\ell }}= \kappa \lambda _{*}^{n-\ell } \quad \text{for } \ \ell \in \{0,1,\dots ,n\} $$

for some constant \( \kappa \in \mathbb R\). Consequently,

$$ \frac{\partial \mathfrak D(B)}{\partial b_{jk}} = \kappa \left( \lambda _{*}^n \frac{\partial p_0}{\partial b_{jk}} + \lambda _{*}^{n-1} \frac{\partial p_1}{\partial b_{jk}} + \dots + \frac{\partial p_n}{\partial b_{jk}} \right) = \kappa \frac{\partial f_B(\lambda _{*})}{\partial b_{jk}}. $$

The preceding considerations lead to a conclusion that the system of Eqs. (5) is equivalent to the matrix equation

$$ 2(B_{*} - A)=\mu \kappa \, \partial f_B(\lambda _{*}) / \partial B \big |_{B=B_{*}}. $$

Next utilize the formula of differentiation of characteristic polynomial with respect to the matrix [19]:

$$ \partial f_B(\lambda )/ \partial B=\left[ {\text {adj}}(\lambda I - B) \right] ^{\top }. $$

Equality (8) then follows from the representation of the adjoint matrix for \( \lambda _{*} I - B_{*} \) as \( f_{*}(B_{*}) \) with \( f_{*} (\lambda ) \) standing for the quotient on division of \( f_B(\lambda ) \) by \( \lambda - \lambda _{*} \).

Corollary 1

Matrices \( E_{*}^{\top } \) and \( B_{*} \) commute and

$$ (\lambda _{*}I-B_{*})E_{*}^{\top }= \mathbb O_{n\times n}. $$

Corollary 2

If A does not have a multiple eigenvalue, then \( E_{*} \) is a rank 1 matrix all of whose eigenvalues are zero.

Proof

Matrix \( f_{*}(B_{*}) = {\text {adj}}(\lambda _{*} I - B_{*}) \) is the rank 1 matrix, since its columns are the eigenvectors of the matrix \( B_{*} \) corresponding to \(\lambda _{*}\) (Cayley–Hamilton theorem).

We next prove that \({\text {tr}} ({\text {adj}}(\lambda _{*} I - B_{*}))=0 \). For any matrix B with spectrum \(\{\lambda _j\}_{j=1}^n \), the matrix \( {\text {adj}}(\lambda I-B) \) has the spectrum [17] (part VII, problem 48):

$$ \left\{ \frac{f_B(\lambda )}{\lambda -\lambda _j} \right\} _{j=1}^n \, . $$

Thus,

$$ {\text {tr}}(E_{*})= \varkappa \, {\text {tr}}({\text {adj}}(\lambda _{*} I-B_{*})) = \varkappa \, f_{B_{*}}^{\prime }(\lambda _{*})=0. $$

Corollary 3

Matrix \( E_{*} \) is normal to \( B_{*} \), i.e., \({\text {tr}} (B_{*}^{\top }E_{*})=0\).

Corollary 4

\( {\text {tr}}(B_{*}) = {\text {tr}}(A) \).

Theorem 3

The value \( d^2(A,\mathbb D) \) is contained in the set of critical values of the function

$$\begin{aligned} G(U):=U^{\top }A A^{\top } U - \left( U^{\top }AU \right) ^2 \ { subject\ to} \ U^{\top }U=1, U \in \mathbb R^n \, \end{aligned}$$
(9)

If \( U_{*} \) is the vector providing \( d^2(A,\mathbb D) \), then the perturbation matrix can be computed by the formula

$$\begin{aligned} E_{*}=U_{*} U_{*}^{\top } (\kappa I-A) \quad { where } \ \kappa := U_{*}^{\top } A U_{*}. \end{aligned}$$
(10)

Proof

Due to Corollary 2, the singular value decomposition for the perturbation matrix E is represented as

$$\begin{aligned} E=\sigma U\cdot V^{\top } \end{aligned}$$
(11)

under restrictions

$$\begin{aligned} U^{\top } U=1,\ V^{\top } V=1,\ U^{\top } V=0. \end{aligned}$$
(12)

From the condition \( {\text {tr}}((A+E)E^{\top })=0 \) we deduce that \( \sigma =- {\text {tr}}(AVU^{\top })=-U^{\top }AV. \) Formulate the constrained optimization problem

$$\begin{aligned} \min (-U^{\top }AV) \quad {\mathrm {subject\ to}}\ (12). \end{aligned}$$
(13)

The derivatives of the corresponding Lagrange function

$$ L(U,V,\mu _1,\mu _2,\mu _3):= -U^{\top }AV - \mu _1( U^{\top } U-1)-\mu _2( V^{\top } V-1)-\mu _3 U^{\top } V $$

result in the system of linear equations

$$\begin{aligned} \partial L / \partial U= & {} - AV-2\mu _1 U-\mu _3 V =\mathbb O_{n\times 1} , \end{aligned}$$
(14)
$$\begin{aligned} \partial L / \partial V= & {} -A^{\top }U-2\, \mu _2 V - \mu _3 U = \mathbb O_{n\times 1} \end{aligned}$$
(15)

with respect to U and V. Multiplication of (14) by \( V^{\top } \) and of (15) by \( U^{\top } \) results (in accordance with (12)) in

$$\begin{aligned} \mu _3 = -V^{\top }AV=-U^{\top }AU. \end{aligned}$$
(16)

Similarly, multiplication of (14) by \(U^{\top } \) and of (15) by \(V^{\top } \) yields

$$\begin{aligned} -2\mu _1 = -2\mu _2 = U^{\top }AV=-\sigma \end{aligned}$$
(17)

and, provided this value is not 0,

$$\begin{aligned} V=-\frac{1}{2\mu _2} (A^{\top }+\mu _3 I)U. \end{aligned}$$
(18)

Substituting (18) in (11) and taking into account (16), we arrive at (10).

If \( \mu _1=\mu _2=0 \) then system (14)–(15) is reduced to \(AV=-\mu _3V, A^{\top }U=-\mu _3U\). This implies that the matrix A should possess a real eigenvalue \( \kappa _1 \) with the corresponding right and left eigenvectors \( V_1 \) and \(U_1\) satisfying the condition \(U_1^{\top }V_1=0\). We claim that, in this case, matrix A has a multiple eigenvalue. For the sake of simplicity, we prove this statement under an extra assumption that all the eigenvalues \(\kappa _1,\dots ,\kappa _n \) of A are real. Suppose, by contradiction, that they are distinct. One has then

$$ \kappa _1 U_1^{\top }V_j = U_1^{\top } A V_j=\kappa _j U_1^{\top }V_j \quad \Rightarrow \ U_1^{\top }V_j=0 \quad {for} \ j\in \{2,\dots ,n\} $$

and for \(V_j\) standing for the right eigenvector corresponding to \( \kappa _j \). Therefore, \(U_1\) is normal to all the vectors \(V_1,V_2,\dots , V_n\) composing a basis of \( \mathbb R^n\). The contradiction proves the assertion. The statement of the theorem remains valid with the corresponding critical value of (9) equal to 0.    \(\square \)

To find the critical values of the function (9), the Lagrange multipliers method is to be applied with the objective function \( G(U)-\mu (U^{\top }U -1 )\). This results in the system

$$\begin{aligned} A A^{\top } U - \left( U^{\top }AU \right) (A+A^{\top })U-\mu U = \mathbb O_{n\times 1} \end{aligned}$$
(19)

where every equation is now just cubic with respect to the entries of U. This is essential progress compared with the system (3)–(5)–(6), and it makes feasible the elimination of variables from the system (19) augmented with \( z-G(U)=0 \) and \(U^{\top }U=1\) (at least for matrices of order \( n \le 8 \)).

Unfortunately, the new system possesses some extraneous solutions, i.e., those not corresponding to the critical values of the distance function.

Example 2

For the matrix

$$ A=\left[ \begin{array}{rr} 0 &{} 1 \\ 13 &{} -6 \end{array} \right] , $$

the system

$$ u_1^2+u_2^2=1, \ u_2 \partial G / \partial u_1 - u_1 \partial G / \partial u_2=0 $$

possesses solutions

$$ u_1=\pm \frac{1}{58} \sqrt{2900+82\sqrt{22}}, \ u_2 = \pm \frac{1}{58} \sqrt{464-87\sqrt{22}} $$

that yield the value \( z= 0\). The true distance equation is given by (7), and \( d^2(A,\mathbb D)=-12\sqrt{58}+94\) is provided by another solution of the system, namely

$$ u_1= \pm \frac{1}{58} \sqrt{1682+203\sqrt{58}}, \ u_2= \pm \frac{1}{58} \sqrt{1682-203\sqrt{58}}. $$

   \(\square \)

The appearance of such extraneous solutions is caused by the non-equivalence of the passage from the originally stated problem to the one of Theorem 3. For instance, representation (10) is deduced under the extra condition that the value (17) does not vanish.
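For n = 2 the critical values of (9) are easy to obtain numerically by parameterizing \( U=(\cos t, \sin t)^{\top } \). The following NumPy sketch treats the matrix of Example 2, discards the extraneous zero critical values, and restores \( E_{*} \) by formula (10); it is an illustrative numerical counterpart of the symbolic computation, not the authors' code.

```python
import numpy as np

A = np.array([[0., 1.], [13., -6.]])

def G(t):
    U = np.array([np.cos(t), np.sin(t)])
    return U @ A @ A.T @ U - (U @ A @ U)**2

def dG(t):                                      # derivative of G along the unit circle
    U = np.array([np.cos(t), np.sin(t)])
    g = 2*A @ A.T @ U - 2*(U @ A @ U)*(A + A.T) @ U
    return g @ np.array([-np.sin(t), np.cos(t)])

# bracket the critical points on [0, pi) (note G(U) = G(-U)) and refine by bisection
crit = []
ts = np.linspace(0.0, np.pi, 2001)
for a, b in zip(ts[:-1], ts[1:]):
    if dG(a) * dG(b) < 0:
        for _ in range(60):
            m = 0.5*(a + b)
            a, b = (m, b) if dG(a)*dG(m) > 0 else (a, m)
        crit.append(0.5*(a + b))

vals = sorted(G(t) for t in crit)               # critical values of (9)
z = min(v for v in vals if v > 1e-8)            # smallest positive one
print(z, 94 - 12*np.sqrt(58))                   # both ~ 2.6107

t = min(crit, key=lambda t: abs(G(t) - z))
U = np.array([np.cos(t), np.sin(t)])
kappa = U @ A @ U
E = np.outer(U, U) @ (kappa*np.eye(2) - A)      # formula (10)
print(np.linalg.norm(E, 'fro')**2)              # equals z
print(np.linalg.eigvals(A + E))                 # double eigenvalue ~ -3
```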

Example 3

For the Frobenius matrix

$$ A=\left[ \begin{array}{rrr} 0 &{} 1 &{}0\\ 0 &{} 0 &{} 1 \\ -91 &{} -55 &{}-13 \end{array} \right] , $$

the distance equation

$$ \mathcal F(z):= 33076090700402342058246544\,z^6-377039198861306289080145178864 z^5 $$
$$ +\, 937864902703881321034450183916\,z^4-771868276098720970149792503999\,z^3 $$
$$ +\, 211070978787821517684022650624\,z^2 -510584100140452518540394496\,z $$
$$ +\, 319295875259784560640000 =0 $$

possesses the following real zeros

$$ z_1\approx 0.739335,\ z_2 \approx 0.765571,\ z_3 \approx 0.980467,\ z_4 \approx 11396.658548. $$

One has \(d(A, \mathbb D) = \sqrt{z_1} \approx 0.859846\) and

$$ E_{*}\approx \left[ \begin{array}{rrr} 0.198499 &{} -0.195124 &{}-0.530440\\ 0.204398 &{} -0.200922 &{} -0.546202 \\ -0.000907&{} 0.000891 &{} 0.002424 \end{array}\right] ,\ $$
$$ B_{*}=A+E_{*}\approx \left[ \begin{array}{rrr} 0.198499 &{} 0.804875 &{} -0.530440\\ 0.204398&{} -0.200923 &{} 0.453797\\ -91.000907&{} -54.999108 &{} -12.997576 \end{array}\right] . $$

The latter matrix possesses the double eigenvalue \( \lambda _{*} \approx 0.824777 \).    \(\square \)
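The printed (rounded) matrices allow a quick numerical check of the structural properties established above. The following NumPy sketch does so; the tolerances are loose because the entries are rounded to \(10^{-6}\).

```python
# Sketch: verify the printed E_* of Example 3 against Theorem 2 and Corollaries 1-4.
import numpy as np

A = np.array([[0., 1., 0.], [0., 0., 1.], [-91., -55., -13.]])
E = np.array([[ 0.198499, -0.195124, -0.530440],
              [ 0.204398, -0.200922, -0.546202],
              [-0.000907,  0.000891,  0.002424]])
B = A + E

print(np.linalg.matrix_rank(E, tol=1e-4))     # 1: rank 1 perturbation (Corollary 2)
print(np.trace(E))                            # ~ 0                    (Corollary 2)
print(np.trace(B.T @ E))                      # ~ 0                    (Corollary 3)
print(np.linalg.norm(E.T @ B - B @ E.T))      # small: E_*^T and B_* commute (Corollary 1)
print(np.linalg.norm(E, 'fro'))               # ~ 0.859846 = d(A, D)
print(np.sort(np.linalg.eigvals(B).real))     # double eigenvalue near 0.824777
```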

Example 4

For the matrix

$$ A=\left[ \begin{array}{rrrr} 5&{} -36 &{} -57 &{} 85\\ 80 &{} 90 &{} 74 &{} 27\\ 9 &{} -91 &{} 81 &{} 65\\ -12&{} 78&{} 5 &{} -63 \end{array} \right] , $$

the distance equation is given by a degree 12 polynomial \(\mathcal F(z) \), irreducible over \( \mathbb Z\), with coefficients of absolute value up to \( 10^{100}\). Its real zeros are

$$ z_1\approx 87.614714, z_2\approx 2588.509661, \ z_3\approx 17853.256334,\ z_4 \approx 32194.078324. $$

One has \(d(A, \mathbb D) = \sqrt{z_1} \approx 9.360273\) and

$$ E_{*}\approx \left[ \begin{array}{rrrr} 3.350324 &{} -0.177130 &{} -3.704042 &{} -0.328216\\ 2.489713 &{} -0.131630 &{} -2.752569 &{} -0.243906\\ 2.565863 &{} -0.135656 &{} -2.836760 &{} 0.251366 \\ 3.898666 &{} -0.206121 &{} -4.310276 &{} 0.381935 \end{array}\right] , $$

with the matrix

$$ B_{*}=A+E_{*} \approx \left[ \begin{array}{rrrr} 8.350324 &{} -36.177130 &{} -60.704042&{} 84.671784\\ 82.489713&{} 89.868370 &{} 71.247430&{} 26.756094\\ 11.565863 &{} -91.135656 &{} 78.163240 &{} 64.748634\\ -8.101333 &{} 77.793879&{} 0.689724&{} -63.381935 \end{array}\right] $$

possessing the double eigenvalue \( \lambda _{*} \approx 69.081077 \).    \(\square \)

We next state some empirical conclusions resulting from experiments with about 30 randomly generated matrices of orders up to \( n= 20 \). Generically,

  (a) the extraneous factor equals \(z^n\), and on its exclusion one has

  (b) the order of the distance equation \(\mathcal F(z)=0\) equals \( n(n-1)\), and, if computed symbolically w.r.t. the entries of A, \( \mathcal F(0)\) has a factor \( \left[ \mathcal D_{\lambda } (f_A(\lambda )) \right] ^2 \);

  (c) \( d^2(A,\mathbb D)\) equals the minimal positive zero of this equation.

Complete computational results for some examples are presented in [20]. For matrices A with integer entries within \([-99,+99] \) (generated by the Maple 15.0 LinearAlgebra:-RandomMatrix procedure), we point out some complexity estimates for the distance equation computation (PC AMD FX-6300, 6 cores, 3.5 GHz):

| n | \( \deg \mathcal F(z) \) | coefficient size | number of real zeros | timing (s) |
| --- | --- | --- | --- | --- |
| 5 | 20 | \({\sim } 10^{170} \) | 10 | 0.03 |
| 10 | 90 | \({\sim } 10^{780} \) | 28 | 0.13 |
| 20 | 380 | \({\sim } 10^{3500} \) | 36 | 1940 |

The adequacy of the results has been additionally checked via computation of the nearest matrix \( B_{*}\). This matrix should

  (a) possess a double eigenvalue;

  (b) have the value \( \Vert B_{*}-A \Vert \) equal to the square root of the least positive zero of \(\mathcal F(z)\);

  (c) satisfy the system of equations (3)–(5) (this property has been tested only for the orders \(n\le 8\));

  (d) have a number of real eigenvalues that differs from that of the matrix A by at most 2.

4 Singular Values

Let \(A\in \mathbb {R}^{n\times n}\) be a nonsingular matrix with the singular value decomposition as follows

$$\begin{aligned} A=WD_n V^{\top }, \end{aligned}$$
(20)

where \(D_n=\mathrm{diag}\,\{\sigma _1,\sigma _2, \ldots ,\sigma _n\}\), with singular values \(\sigma _1\ge \sigma _2\ge \ldots \ge \sigma _n>0\).

The following result [6, 8] gives us the distance to the nearest matrix with rank \(k<n\).

Theorem 4

One has

$$\min _{\mathrm{rank\,}B=k} ||A-B||=||A-A_k||=\left\{ \begin{array}{ll} \sigma _{k+1},&\text{for the 2-norm,}\\ \displaystyle \left[ \sum _{i=k+1}^n\sigma _i^2\right] ^{1/2} &\text{for the Frobenius norm.} \end{array}\right. $$

Here

$$A_k=WD_k V^{\top },\quad D_k=\mathrm{diag}\,\{\sigma _1,\sigma _2, \ldots ,\sigma _k,0,\ldots ,0\}.$$

According to this theorem, the Frobenius distance from the nonsingular A to the set of matrices with multiple eigenvalues satisfies the following inequality:

$$ d(A,\mathbb D) \le \sqrt{\sigma _{n-1}^2+\sigma _n^2}. $$

Indeed, the matrix \( A_{n-2} \) has rank \( n-2 \); hence it possesses the zero eigenvalue of multiplicity at least 2 and therefore belongs to \( \mathbb D \).

As for the distance \( d(A,\mathbb D) \) in the 2-norm, the following result [14] is known:

Theorem 5

Let the singular values of the matrix

$$\begin{aligned} M=\left[ \begin{array}{cc} A-\lambda I_n&{}\gamma I_n\\ \mathbb {O}_{n\times n}&{}A-\lambda I_n \end{array}\right] \end{aligned}$$
(21)

be ordered like \(\sigma _1(\lambda ,\gamma )\ge \sigma _2(\lambda ,\gamma )\ge \ldots \ge \sigma _{2n}(\lambda ,\gamma ) \ge 0 \). Then one has

$$d(A,\mathbb D)=\min _{\lambda \in \mathbb {C}}\max _{\gamma \ge 0} \sigma _{2n-1}(\lambda ,\gamma ).$$

It is well-known that for the matrix \(A \in \mathbb R^{n\times n}, n\ge 2\), Frobenius norm and the 2-norm are related by the inequality [7]

$$||A||_2\le ||A||_F\le \sqrt{n}||A||_2.$$

It is also known that \( ||A||_2=||A||_F \) if and only if \( {\text {rank}}(A) \le 1 \). According to Corollary 2, both norms coincide for the minimal perturbation \(E_{*}\). This results in an algorithm for the \(d(A,\mathbb D)\) computation that is an alternative to the one treated in Sect. 3.
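As an illustration of the min–max characterization of Theorem 5, below is a coarse grid-search sketch in NumPy, restricted to real \(\lambda \) and a finite range of \(\gamma \) (an approximation, not the authors' procedure). The test matrix \(\mathrm{diag}(1,2,4)\) is chosen because, by Sect. 5.1, its minimal perturbation has rank 1 and its Wilkinson distance equals half of the smallest eigenvalue gap, i.e., 1/2, so the 2-norm value returned by the search should coincide with the Frobenius distance.

```python
# Sketch: coarse grid search for min_lambda max_gamma sigma_{2n-1} (Theorem 5),
# with lambda restricted to the real axis; test matrix diag(1, 2, 4).
import numpy as np

A = np.diag([1.0, 2.0, 4.0])
n = A.shape[0]

def s2nm1(lam, gamma):
    M = np.block([[A - lam*np.eye(n), gamma*np.eye(n)],
                  [np.zeros((n, n)),  A - lam*np.eye(n)]])
    return np.linalg.svd(M, compute_uv=False)[-2]     # sigma_{2n-1}

lams   = np.linspace(0.0, 5.0, 201)
gammas = np.linspace(0.0, 5.0, 201)
print(min(max(s2nm1(l, g) for g in gammas) for l in lams))   # ~ 0.5, at lambda=1.5, gamma=0
```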

To find the singular values of the matrix (21), one needs the zeros of the polynomial

$$\begin{aligned} \det (MM^{\top }-\mu I_{2n}) \end{aligned}$$
(22)
$$=\det \left[ \begin{array}{cc} (A-\lambda I_n)(A-\lambda I_n)^{\top }+\gamma ^2I_{n}-\mu I_{n}&{}\gamma (A-\lambda I_n)^{\top }\\ \gamma (A-\lambda I_n)&{}(A-\lambda I_n)(A-\lambda I_n)^{\top } - \mu I_{n} \end{array}\right] $$

treated with respect to \(\mu \) (these zeros are the squares of the singular values); finding them is generally a nontrivial task. We will restrict our consideration to the classes of matrices A for which the Schur formula for the determinant of the block matrix (22) is applicable, i.e., for which it can be transformed into

$$\begin{aligned} \det (\mu ^2 I_n- \mu [2(A-\lambda I_n)(A-\lambda I_n)^{\top }+\gamma ^2I_n]+[(A-\lambda I_n)(A-\lambda I_n)^{\top }]^2). \end{aligned}$$
(23)

These happen to be symmetric, skew-symmetric, and orthogonal matrices. The singular values of the matrix (21) can then be expressed explicitly via the eigenvalues of A.

5 Distance via Matrix Eigenvalues

5.1 Symmetric Matrix

Theorem 6

Let A be a symmetric matrix with distinct eigenvalues \(\lambda _1,\lambda _2,\) \(\ldots ,\lambda _n\). Then

$$d(A,\mathbb D)= \frac{1}{2} \min _{1\le k<\ell \le n}|\lambda _k-\lambda _{\ell }|. $$

If this minimum is attained at the eigenvalues \(\lambda _1 \) and \( \lambda _2 \) with \( \lambda _2 > \lambda _1\), then the perturbation can be found as

$$\begin{aligned} E_{*}=\frac{1}{4}(\lambda _2-\lambda _{1})(P_1+P_{2})(P_1-P_2)^{\top }, \end{aligned}$$
(24)

where \(P_1\) and \(P_2\) are the eigenvectors of A corresponding to \(\lambda _1\) and \(\lambda _{2}\) respectively with \(\Vert P_1\Vert =\Vert P_2\Vert =1\).

Remark. Generically, the matrices \(E_{*}\) and \(B_{*}=A+E_{*}\) are not symmetric.

Proof

For \(j \in \{1,\dots , n\} \), denote by \(P_j \) the eigenvector of A corresponding to \( \lambda _j \) with \(\Vert P_j\Vert =1 \). Then \( P=(P_1,P_2,\ldots ,P_n)\) is the orthogonal matrix such that

$$ P^{\top }A P = \varLambda \quad {\mathrm {where}} \ \varLambda =\mathrm{diag}\,\{\lambda _1,\lambda _2,\ldots ,\lambda _n\}. $$

Since the orthogonal transformation does not influence the Frobenius distance, we reduce \(d(A,\mathbb D)\) to \(d(\varLambda ,\mathbb D)\).

In this case, \(\varLambda -\lambda I=(\varLambda -\lambda I)^{\top }\) and these matrices commute. Hence, the expression (23) is valid. Therefore, the squares of the singular values of the matrix (21) are the zeros of the polynomials

$$\mu ^2-\mu (2(\lambda _j-\lambda )^2+\gamma ^2)+(\lambda _j-\lambda )^4,\ j\in \{ 1,2,\ldots ,n\},$$

namely

$$\mu ^{(j)}_{1,2}=\frac{2(\lambda _j-\lambda )^2+\gamma ^2\pm \gamma \sqrt{\gamma ^2+4(\lambda _j-\lambda )^2}}{2}.$$

Differentiating w.r.t. \(\gamma \), we get the single stationary point \(\gamma =0\). According to [14], to find the 2-norm distance from \(A-\lambda I\) to the manifold of matrices with a multiple zero eigenvalue, one should find the singular values \(\sigma _n\) and \(\sigma _{n-1}\) of the matrix \((A-\lambda I)\). They are \(|\lambda _k-\lambda |\) and \(|\lambda _{\ell }-\lambda |\) for some \(k,\ell \). The minimal w.r.t. \(\lambda \) value of \(\sigma _{n-1}\) equals \(|\lambda _k-\lambda _{\ell }|/2\), attained where \(\lambda _k-\lambda =\lambda -\lambda _{\ell }\).

Assume that

$$\min _{1\le k<\ell \le n}|\lambda _k-\lambda _{\ell }|=|\lambda _1-\lambda _2|.$$

Denote

$$Q:=\left[ \begin{array}{rcccc} \frac{1}{\sqrt{2}}&-\frac{1}{\sqrt{2}}&0&\ldots &0\\ \frac{1}{\sqrt{2}}& \frac{1}{\sqrt{2}}&0&\ldots &0\\ 0&0&1&\ldots &0\\ \ldots &\ldots &\ldots &\ldots &\ldots \\ 0&0&0&\ldots &1 \end{array}\right] ,\ \text{then}\ Q^{\top }\varLambda Q=\left[ \begin{array}{ccccc} \frac{\lambda _1+\lambda _2}{2}&\frac{\lambda _2-\lambda _1}{2}&0&\ldots &0\\ \frac{\lambda _2-\lambda _1}{2}&\frac{\lambda _1+\lambda _2}{2}&0&\ldots &0\\ 0&0&\lambda _3&\ldots &0\\ \ldots &\ldots &\ldots &\ldots &\ldots \\ 0&0&0&\ldots &\lambda _n \end{array}\right] .$$

For this matrix, \(\tilde{E}_{*}=\left[ \begin{array}{ccccc} 0&{}\frac{\lambda _1-\lambda _2}{2}&{}0&{}\ldots &{}0\\ 0&{}0&{}0&{}\ldots &{}0\\ \ldots &{}\ldots &{}\ldots &{}\ldots &{}\ldots \\ 0&{}0&{}0&{}\ldots &{}0 \end{array}\right] \). Obviously, we get

$$E_{*}=PQ\tilde{E}_{*}Q^{\top }P^{\top }=\frac{\lambda _2-\lambda _1}{4}(P_1+P_2)(P_1-P_2)^{\top }.$$

   \(\square \)

Example 5

For the matrix

$$A=\frac{1}{9}\left[ \begin{array}{rrr} -269&{}-98&{}76\\ -98&{}-296&{}22\\ 76&{}22&{}-209 \end{array}\right] , $$

one has

$$\lambda _1=-45,\ \lambda _2=-25,\ \lambda _3=-16; \quad \text{the minimal gap is attained at } \lambda _2, \lambda _3, \text{ with}$$
$$ P_2=[-1/3,\,2/3,\,2/3]^{\top },\quad P_3=[2/3,\,-1/3,\,2/3]^{\top }.$$
$$d(A,\mathbb D)=\frac{|-16-(-25)|}{2}=\frac{9}{2} \quad \text{and} \quad E_{*}= \frac{\lambda _3-\lambda _2}{4}(P_2+P_3)(P_2-P_3)^{\top }=\left[ \begin{array}{rrr} -3/4 & 3/4 & 0 \\ -3/4 & 3/4 & 0 \\ -3 & 3 & 0 \end{array} \right] . $$
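The example is easy to recompute from scratch. Below is a NumPy sketch (eigendecomposition, half of the minimal gap, perturbation by formula (24)); it is an illustration, not the authors' code.

```python
# Sketch: recompute Example 5 via Theorem 6 and formula (24) with NumPy.
import numpy as np

A = np.array([[-269., -98.,  76.],
              [ -98., -296.,  22.],
              [  76.,   22., -209.]]) / 9

w, V = np.linalg.eigh(A)                   # w = [-45, -25, -16], orthonormal eigenvectors
k = np.argmin(np.diff(w))                  # the closest pair (w[k], w[k+1])
print(0.5*(w[k+1] - w[k]))                 # d(A, D) = 9/2

P1, P2 = V[:, k], V[:, k+1]                # unit eigenvectors of the closest pair
E = 0.25*(w[k+1] - w[k])*np.outer(P1 + P2, P1 - P2)      # formula (24)
print(np.linalg.norm(E, 'fro'))            # 9/2
print(np.sort(np.linalg.eigvals(A + E).real))   # [-45, -20.5, -20.5]
```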

5.2 Skew-Symmetric Matrix

Theorem 7

Let the nonzero eigenvalues of a skew-symmetric matrix A be

$$\pm b_1\mathbf {i},\pm b_2\mathbf {i},\ldots ,\pm b_{m}\mathbf {i} \ {\mathrm where} \ 0<b_1<b_2<\ldots <b_m. $$

Then

$$d(A, \mathbb D)=b_1 $$

and the minimal perturbation can be found as

$$\begin{aligned} E_{*}=-b_1 \Re (P_1) \Im (P_1)^{\top }, \end{aligned}$$
(25)

where \(P_1\) is the eigenvector of A corresponding to the eigenvalue \( b_1{\textbf {i}} \) with \(\Vert \Re (P_1)\Vert \) =\(\Vert \Im (P_1)\Vert =1 \).

Proof

For \(j \in \{1,\dots , m\} \), denote by \(P_j \) the eigenvector of A corresponding to \(b_j {\textbf {i}} \) with \(\Vert \Re (P_j)\Vert =\Vert \Im (P_j)\Vert =1 \). If A possesses a zero eigenvalue, denote by \(P_{0}\) the corresponding eigenvector with \(\Vert P_{0} \Vert =1 \). Then the orthogonal matrix

$$P=(\Re (P_1),\Im (P_1),\Re (P_2),\Im (P_2),\ldots ,\Re (P_m),\Im (P_m),\{P_{0}\})$$

is such that

$$P^{\top }A P = \varUpsilon \quad {\mathrm {where} } \ \varUpsilon :=\mathrm{diag}\,\{\varUpsilon _1,\varUpsilon _2,\ldots ,\varUpsilon _m,\{0\}\},$$
$$\varUpsilon _k:= \left[ \begin{array}{cc} 0&{}b_k\\ -b_k&{}0 \end{array}\right] , k\in \{1,2,\ldots ,m\} $$

(we set in braces the entries of the matrices corresponding to the case of existence of zero eigenvalue for A).

Since an orthogonal transformation does not influence the Frobenius distance, we reduce \(d(A,\mathbb D)\) to \(d(\varUpsilon ,\mathbb D)\). In this case,

$$(\varUpsilon -\lambda I)(\varUpsilon -\lambda I)^{\top }=\mathrm{diag}\,\{ \tilde{\varUpsilon }_1,\tilde{\varUpsilon }_2,\ldots ,\tilde{\varUpsilon }_m,\{ \lambda ^2\} \},$$

where

$$\tilde{\varUpsilon }_k:=\left[ \begin{array}{cc} b_k^2+\lambda ^2&0\\ 0&b_k^2+\lambda ^2 \end{array}\right] \quad \text{for} \ k\in \{1,\dots ,m\}. $$

It is evident that

$$(\varUpsilon -\lambda I)(\varUpsilon -\lambda I)^{\top }(\varUpsilon -\lambda I)=(\varUpsilon -\lambda I)^2(\varUpsilon -\lambda I)^{\top }.$$

Hence, the expression (23) is valid.

Therefore, the squares of the singular values of the matrix (21) are the zeros of the polynomials

$$\mu ^2-\mu (2(\lambda ^2+b_k^2)+\gamma ^2)+(\lambda ^2+b_k^2)^2,\ k\in \{1,2,\ldots ,m\},$$

namely

$$\mu ^{(k)}_{1,2}=\frac{1}{2}\left[ 2(\lambda ^2+b_k^2)+\gamma ^2\pm \gamma \sqrt{\gamma ^2+4(\lambda ^2+b_k^2)}\right] .$$

Differentiating w.r.t. \(\gamma \), we get a single stationary point \(\gamma =0\). According to [14], to find the 2-norm distance from \(\varUpsilon -\lambda I\) to the manifold of matrices with multiple zero eigenvalue, it is sufficient to compute the singular values \(\sigma _n\) and \(\sigma _{n-1}\) of this matrix. They are

$$ \mathrm{either}\ \sigma _n=\sigma _{n-1}=\sqrt{b_k^2+\lambda ^2} \ \mathrm{for\ some} \ k, \ \mathrm{or}\ \sigma _{n-1}=\sqrt{b_k^2+\lambda ^2},\sigma _n=|\lambda |. $$

The minimal w.r.t. \(\lambda \) value of \(\sigma _{n-1}\) equals \(b_1\) and is attained at \(\lambda =0\).

The corresponding perturbation

$$E_{*}=P\left[ \begin{array}{ccccc} 0&{}-b_1&{}0&{}\ldots &{}0\\ 0&{}0&{}0&{}\ldots &{}0\\ \ldots &{}\ldots &{}\ldots &{}\ldots &{}\ldots \\ 0&{}0&{}0&{}\ldots &{}0 \end{array}\right] P^{\top }=-b_1\Re (P_1) \Im (P_1)^{\top }. $$

   \(\square \)

Corollary 5

In the notation of Theorem 7, the distance \(d(A, \mathbb D) \) is provided by a continuum of perturbations \( E_{*} \) contained in the set

$$ \left\{ -b_1 (\eta \Re (P_1) +\theta \Im (P_1))(-\eta \Im (P_1)+\theta \Re (P_1))^{\top } \mid \{\eta , \theta \} \subset \mathbb R, \eta ^2+\theta ^2=1 \right\} . $$
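Theorem 7 and formula (25) are easy to test numerically. The following NumPy sketch does so for a randomly generated skew-symmetric matrix; the size and the random seed are arbitrary, and the eigenvector returned by NumPy is rescaled so that its real and imaginary parts have unit norm.

```python
# Sketch: Theorem 7 and formula (25) for a random 4x4 skew-symmetric matrix.
import numpy as np

rng = np.random.default_rng(1)
R = rng.standard_normal((4, 4))
A = R - R.T                                          # skew-symmetric

w, V = np.linalg.eig(A)
b1 = min(abs(w.imag))                                # smallest b_k (n even: no zero eigenvalue)
P1 = V[:, np.argmin(np.abs(w - 1j*b1))]              # eigenvector for +i b1
x = P1.real / np.linalg.norm(P1.real)                # ||Re P1|| = ||Im P1|| = 1
y = P1.imag / np.linalg.norm(P1.imag)

E = -b1 * np.outer(x, y)                             # formula (25)
print(np.linalg.norm(E, 'fro'), b1)                  # both equal b1 = d(A, D)
print(np.round(np.linalg.eigvals(A + E), 8))         # +-i b1 collapses into a double 0
```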

5.3 Orthogonal Matrix

Theorem 8

Let \( n \ge 3\), and the eigenvalues of an orthogonal matrix A, other than \(\pm 1\), be

$$\begin{aligned} \cos \alpha _1\pm \mathbf {i}\sin \alpha _1,\cos \alpha _2\pm \mathbf {i}\sin \alpha _2,\ldots ,\cos \alpha _{m}\pm \mathbf {i}\sin \alpha _{m}, \end{aligned}$$
(26)

where \( 0<\sin \alpha _1\le \sin \alpha _2\le \ldots \le \sin \alpha _m \). Then

$$\begin{aligned} d(A, \mathbb D)= \sin \alpha _1 \, , \end{aligned}$$
(27)

and the minimal perturbation can be found as

$$\begin{aligned} E_{*}=-(\sin \alpha _1) \Re (P_1) \Im (P_1)^{\top }, \end{aligned}$$
(28)

where \(P_1\) is the eigenvector of A corresponding to the eigenvalue \(\cos \alpha _1+{\textbf {i}}\sin \alpha _1\) with \(||\Re (P_1)||=||\Im (P_1)||=1\).

We present two independent proofs for this result: the first one following from Theorem 3 while the second one exploiting the considerations of Sect. 4.

Proof

I. Since \( AA^{\top }=I \), the objective function (9) can be transformed into

$$ G(U)=1- \left( U^{\top }AU\right) ^2, $$

and system (19) is then replaced by

$$\begin{aligned} \left( U^{\top }AU\right) \left( A^{\top }+A\right) U - \mu U=\mathbb O . \end{aligned}$$
(29)

Multiply it by \(U^{\top } A^{\top }\):

$$ \left( U^{\top }AU\right) \left[ U^{\top }(A^{\top })^2U + 1 -\mu \right] =0, $$

and we get two alternatives:

$$ \mathrm{either} \ U^{\top }AU=0 \quad \mathrm{or } \ \mu = 1+ U^{\top }A^2U. $$

If the second alternative takes place, substitute the expression for \( \mu \) into (29):

$$ \left( U^{\top }AU\right) (A^{\top }+A)U-(1+ U^{\top }A^2U)U=\mathbb O. $$

Wherefrom it follows that

$$\begin{aligned} (A^{\top }+A)U=\frac{1+ U^{\top }A^2U}{U^{\top }AU}U. \end{aligned}$$
(30)

If there exists a solution \( U=U_{*} \ne \mathbb O \) for this equation, then \( U_{*} \) is necessarily an eigenvector of \( A^{\top }+A \) corresponding to the eigenvalue

$$ \nu _{*}=(1+ U_{*}^{\top }A^2U_{*})/(U_{*}^{\top }AU_{*}). $$

The matrix \( A^{\top }+A \) is symmetric with the eigenvalues \( 2\cos \alpha _1, \dots , 2\cos \alpha _m \), each of multiplicity 2, and, possibly, \(\pm 2\). Substitution of \( U=U_{*} \) into (30) and multiplication by \(U_{*}^{\top } \) yields

$$ \nu _{*}=2U_{*}^{\top }AU_{*}=2 \cos \alpha _j \ \text{ for } \text{ some } \ j. $$

Therefore, the critical values of the function G(U) are in the set \(\{1-\cos ^2 \alpha _j\}_{j=1}^m \). This results in (27).

The alternative \( U_{*}^{\top }AU_{*}=0 \) for \(U_{*}^{\top }U_{*}=1 \) corresponds to the case where A possesses eigenvalues \(\pm {\textbf {i}} \). The result (27) remains valid.    \(\square \)

Proof

II. For \(j \in \{1,\dots ,m\}\), denote by \( P_j\) the eigenvector of A corresponding to the eigenvalue \(\cos \alpha _j+\mathbf {i}\sin \alpha _j\) with \(||\Re (P_j)||=||\Im (P_j)||=1\). Denote by \(P_{[1]}\) and \(P_{[-1]}\) the eigenvectors corresponding to the eigenvalues 1 and \(-1\), respectively (if any), with \(\Vert P_{[1]}\Vert =\Vert P_{[-1]}\Vert =1\). Then the orthogonal matrix

$$ P=(\Re (P_1),\Im (P_1),\Re (P_2),\Im (P_2),\ldots ,\Re (P_m),\Im (P_m),\{ P_{[1]},P_{[-1]}\}) $$

is such that

$$ P^{\top }A P = \varOmega \quad { \mathrm {where} } \ \varOmega =\mathrm{diag}\,\{\varOmega _1,\varOmega _2,\ldots ,\varOmega _m,\{ 1,-1\} \}, $$

where

$$ \varOmega _k:=\left[ \begin{array}{cc} \cos \alpha _k&{}\sin \alpha _k\\ -\sin \alpha _k&{}\cos \alpha _k \end{array}\right] \quad \mathrm{for} \ k\in \{1,2,\ldots ,m\} $$

(we set in braces the entries of the matrices corresponding to the case of existence of either of eigenvalues 1 or \(-1\) or both for A).

Since the orthogonal transformation does not influence the Frobenius distance, we reduce \(d(A,\mathbb D)\) to \(d(\varOmega ,\mathbb D)\). In this case,

$$(\varOmega -\lambda I)(\varOmega -\lambda I)^{\top }=\mathrm{diag}\,\{\tilde{\varOmega }_1,\tilde{\varOmega }_2,\ldots ,\tilde{\varOmega }_m,\{ (1-\lambda )^2,(1+\lambda )^2\} \},$$

where

$$\tilde{\varOmega }_k:=\left[ \begin{array}{cc} (\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k&0\\ 0&(\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k \end{array}\right] \quad \text{for} \ k\in \{1,2,\dots ,m\}.$$

It is evident that

$$(\varOmega -\lambda I)(\varOmega -\lambda I)^{\top }(\varOmega -\lambda I)=(\varOmega -\lambda I)^2(\varOmega -\lambda I)^{\top }.$$

Hence, expression (23) is valid. In this case, the squares of the singular values of the matrix (21) are the zeros of the polynomials

$$\mu ^2-\mu \left( 2\left( (\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k\right) +\gamma ^2\right) +\left( (\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k\right) ^2, $$

namely:

$$\mu ^{(k)}_{1,2}=\frac{2((\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k)+\gamma ^2\pm \gamma \sqrt{\gamma ^2+4((\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k)}}{2}.$$

Differentiating w.r.t. \(\gamma \), we get a single stationary point \(\gamma =0\). According to [14], to find the 2-norm distance from \(\varOmega -\lambda I\) to the manifold of matrices with multiple zero eigenvalue, one should find the singular values \(\sigma _n\) and \(\sigma _{n-1}\) of this matrix. They are either

$$\sigma _n=\sigma _{n-1}=\sqrt{(\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k} $$

for  some k or

$$ \sigma _{n-1}=\sqrt{(\cos \alpha _k-\lambda )^2+\sin ^2\alpha _k},\sigma _n=|1-\lambda | \, . $$

The minimal value of \(\sigma _{n-1}\) w.r.t. \(\lambda \) equals \(\sin \alpha _1\) in both cases.

The minimal perturbation

$$E_{*}=P\left[ \begin{array}{ccccc} 0&{}-\sin \alpha _1&{}0&{}\ldots &{}0\\ 0&{}0&{}0&{}\ldots &{}0\\ \vdots &{}&{}&{}&{} \vdots \\ 0&{}0&{}0&{}\ldots &{}0 \end{array}\right] P^{\top }=-(\sin \alpha _1)\Re (P_1) \Im (P_1)^{\top }.$$

   \(\square \)

Corollary 6

In the notation of Theorem 8, the distance \(d(A, \mathbb D) \) is provided by a continuum of perturbations \( E_{*} \) contained in the set

$$ \left\{ (-\sin \alpha _1) (\eta \Re (P_1) +\theta \Im (P_1))(-\eta \Im (P_1)+\theta \Re (P_1))^{\top } \mid \{\eta , \theta \} \subset \mathbb R, \eta ^2+\theta ^2=1\right\} . $$

Example 6

For the matrix

$$A=\frac{1}{3}\left[ \begin{array}{rrr} -2&{}-2&{}1\\ 1&{}-2&{}-2\\ -2&{}1&{}-2 \end{array}\right] , $$

one has

$$ \lambda _{1,2}=-\frac{1}{2}\pm \mathbf {i} \frac{\sqrt{3}}{2}, \ \lambda _3=-1, \ P_1=\left[ -\frac{2}{\sqrt{6}},\frac{1}{\sqrt{6}},\frac{1}{\sqrt{6}} \right] ^{\top } + \mathbf{i}\left[ 0,\frac{1}{\sqrt{2}},-\frac{1}{\sqrt{2}} \right] ^{\top }. $$

Here \( d(A, \mathbb D) = \sqrt{3}/2\approx 0.866025 \) and there is an infinite number of corresponding perturbation matrices (10) generated by vectors \( U_{*}\) chosen from the span of \(\Re (P_1) \) and \( \Im (P_1)\). For instance:

$$ \begin{array}{c} U_{*}:=\Re (P_1) \\ \Downarrow \\ E_{*}=\left[ \begin{array}{rrr} 0&1/2&-1/2\\ 0&-1/4&1/4\\ 0&-1/4&1/4 \end{array}\right] ; \end{array} \qquad \begin{array}{c} U_{*}:=\Im (P_1) \\ \Downarrow \\ E_{*}=\left[ \begin{array}{rrr} 0 & 0 & 0\\ -1/2 & 1/4 & 1/4\\ 1/2 & -1/4 & -1/4 \end{array}\right] . \end{array} $$

In both cases, the spectrum of the matrix \( B_{*}\) is \( \{-1,-1/2,-1/2\}\).    \(\square \)
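The example can be double-checked numerically. In the NumPy sketch below (an illustration, not the authors' code), any unit vector U from the plane orthogonal to the eigenvector of the eigenvalue \(-1\), i.e., from the span of \(\Re (P_1)\) and \(\Im (P_1)\), is admissible in formula (10); the two vectors used above are taken as representatives.

```python
# Sketch: check Example 6 with NumPy for the two choices of U_* used above.
import numpy as np

A = np.array([[-2., -2.,  1.],
              [ 1., -2., -2.],
              [-2.,  1., -2.]]) / 3

for U in [np.array([-2., 1., 1.]) / np.sqrt(6),      # Re(P_1)
          np.array([ 0., 1., -1.]) / np.sqrt(2)]:    # Im(P_1)
    kappa = U @ A @ U                                # = cos(alpha_1) = -1/2
    E = np.outer(U, U) @ (kappa*np.eye(3) - A)       # formula (10)
    print(np.linalg.norm(E, 'fro'))                  # sqrt(3)/2 ~ 0.866025
    print(np.round(np.sort(np.linalg.eigvals(A + E).real), 6))   # [-1, -0.5, -0.5]
```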

Remark. In all the cases where the distance \(d(A,\mathbb {D})\) is achieved at \(\gamma =0\) and the two minimal singular values of the matrix (21) coincide, i.e., \(\sigma _{2n-1}(\lambda ,0)=\sigma _{2n}(\lambda ,0)\), we have found a rank 1 minimal perturbation, whereas in the work [14] it is described as a rank 2 matrix.

6 Conclusions

We have investigated Wilkinson's problem of evaluating the distance from a given matrix to the set of matrices possessing multiple eigenvalues. The structure of the minimal perturbation matrix is clarified, which gives us the opportunity to compute symbolically the distance equation, whose zero set contains the critical values of the squared distance function.

The computational complexity of the proposed solution is, as is traditional for analytical approaches, high. Although this price is justified by the reliability of the computed results, we still hope to reduce it in further investigations.

There exists a definite similarity between the considered problem and that of the Routh–Hurwitz distance to instability computation. For instance, the approach suggested in Sect. 3 has its counterpart in the one developed by Ch. Van Loan for the distance to instability problem [11, 12]. This is also a subject for subsequent investigation.