Distances on Numbers, Polynomials, and Matrices

Deza, Michel Marie; Deza, Elena

doi:10.1007/978-3-662-52844-0_12


Abstract

Here we consider the most important metrics on the classical number systems: the semiring $\mathbb{N}$ of natural numbers, the ring $\mathbb{Z}$ of integers, and the fields $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ of rational, real, complex numbers, respectively. We consider also the algebra $\mathcal{Q}$ of quaternions.


1 Metrics on Numbers

Here we consider the most important metrics on the classical number systems: the semiring $\mathbb{N}$ of natural numbers, the ring $\mathbb{Z}$ of integers, and the fields $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ of rational, real, complex numbers, respectively. We consider also the algebra $\mathcal{Q}$ of quaternions.

Metrics on natural numbers

There are several well-known metrics on the set $\mathbb{N}$ of natural numbers:
1. $\vert n - m\vert$; the restriction of the natural metric (from $\mathbb{R}$) to $\mathbb{N}$;
2. $p^{-\alpha }$, where $p^{\alpha }$ is the highest power of a given prime number p dividing m − n, for m ≠ n (and equal to 0 for m = n); the restriction of the p-adic metric (from $\mathbb{Q}$) to $\mathbb{N}$;
3. $\ln \frac{lcm(m,n)} {gcd(m,n)}$; an example of the lattice valuation metric;
4. $w_{r}(n - m)$, where $w_{r}(n)$ is the arithmetic r-weight of n; the restriction of the arithmetic r-norm metric (from $\mathbb{Z}$) to $\mathbb{N}$;
5. $\frac{\vert n-m\vert } {mn}$ (cf. M-relative metric in Chap. 5);
6. $1 + \frac{1} {m+n}$ for m ≠ n (and equal to 0 for m = n); the Sierpinski metric.
Most of these metrics on $\mathbb{N}$ can be extended to $\mathbb{Z}$. Moreover, any one of the above metrics can be used in the case of an arbitrary countable set X. For example, the Sierpinski metric is defined, in general, on a countable set $X =\{ x_{n}: n \in \mathbb{N}\}$ by $1 + \frac{1} {m+n}$ for all $x_{m},x_{n} \in X$ with m ≠ n (and is equal to 0, otherwise).
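
Three of the metrics listed above can be sketched in a few lines of Python. This is only an illustration, assuming that m and n are positive integers; the function names are hypothetical and not taken from the text.

```python
from math import gcd, log


def lattice_valuation(m: int, n: int) -> float:
    """ln(lcm(m, n) / gcd(m, n)) for positive integers m, n."""
    g = gcd(m, n)
    lcm = m * n // g
    return log(lcm / g)


def relative(m: int, n: int) -> float:
    """|n - m| / (m n) for positive integers m, n."""
    return abs(n - m) / (m * n)


def sierpinski(m: int, n: int) -> float:
    """1 + 1/(m + n) for m != n, and 0 for m = n."""
    return 0.0 if m == n else 1.0 + 1.0 / (m + n)
```

All nonzero values of the Sierpinski metric lie strictly between 1 and 2, which is why the triangle inequality holds for it automatically.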
Arithmetic r-norm metric

Let $r \in \mathbb{N},r \geq 2$. The modified r -ary form of an integer x is a representation
$$\displaystyle{x = e_{n}r^{n} +\ldots +e_{ 1}r + e_{0},}$$
where $e_{i} \in \mathbb{Z}$, and | e _i | < r for all $i = 0,\ldots,n$.

An r-ary form is called minimal if the number of nonzero coefficients is minimal. The minimal form is not unique, in general. But if the coefficients $e_{i}$, 0 ≤ i ≤ n − 1, satisfy the conditions $\vert e_{i} + e_{i+1}\vert < r$, and $\vert e_{i}\vert < \vert e_{i+1}\vert$ if $e_{i}e_{i+1} < 0$, then the above form is unique and minimal; it is called the generalized nonadjacent form.

The arithmetic r -weight w _r(x) of an integer x is the number of nonzero coefficients in a minimal r -ary form of x, in particular, in the generalized nonadjacent form. The arithmetic r -norm metric on $\mathbb{Z}$ (see, for example, [Ernv85]) is defined by
$$\displaystyle{w_{r}(x - y).}$$
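
For r = 2 the coefficients are restricted to {−1, 0, 1}, and the two conditions above reduce to the requirement that no two adjacent coefficients are both nonzero. The following sketch covers only this binary case; the function names are hypothetical, and the digit-selection rule is the standard nonadjacent-form recipe rather than anything stated in the text.

```python
def naf_digits(x: int) -> list:
    """Digits e_0, e_1, ... in {-1, 0, 1} of the nonadjacent form of x (case r = 2)."""
    digits = []
    while x != 0:
        if x % 2:
            # x is odd: pick e = 1 or e = -1 so that x - e is divisible by 4,
            # which forces the next digit to be 0.
            e = 2 - (x % 4)
            x -= e
        else:
            e = 0
        digits.append(e)
        x //= 2
    return digits


def arithmetic_2_weight(x: int) -> int:
    """Number of nonzero digits in the nonadjacent form of x."""
    return sum(1 for e in naf_digits(x) if e != 0)


def arithmetic_2_distance(x: int, y: int) -> int:
    """Arithmetic 2-norm metric w_2(x - y) on the integers."""
    return arithmetic_2_weight(x - y)
```

For instance, 7 = 8 − 1 has two nonzero digits in this form, while its ordinary binary expansion 111 has three.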
Distance between consecutive primes

The distance between consecutive primes (or prime gap, prime difference function) is the difference $g_{n} = p_{n+1} - p_{n}$ between two successive prime numbers.

It holds $g_{n} \leq p_{n}$, $\overline{\lim \nolimits }_{n\rightarrow \infty }g_{n} = \infty $ and (Zhang, 2013) $\underline{\lim \nolimits }_{n\rightarrow \infty }g_{n} < 7 \times 10^{7}$, improved to ≤ 246 (and, conditionally on the generalized Elliott–Halberstam conjecture, to ≤ 6) by Polymath8, 2014. The limit $\lim _{n\rightarrow \infty }g_{n}$ does not exist, but $g_{n} \approx \ln p_{n}$ on average.

Polignac's conjecture (still open): for any k ≥ 1, there are infinitely many n with $g_{n} = 2k$; the case k = 1 (i.e., that $\underline{\lim \nolimits }_{n\rightarrow \infty }g_{n} = 2$ holds) is the twin prime conjecture.
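
The first prime gaps are easy to tabulate. The sketch below uses a plain sieve of Eratosthenes; the helper names and the assumption limit ≥ 2 are ours.

```python
def primes_up_to(limit: int) -> list:
    """All primes p <= limit (limit >= 2 assumed), by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Cross out i*i, i*i + i, ... up to limit.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def prime_gaps(primes: list) -> list:
    """Gaps g_n = p_{n+1} - p_n between successive entries of a sorted prime list."""
    return [q - p for p, q in zip(primes, primes[1:])]
```

Counting the entries equal to 2 in `prime_gaps(primes_up_to(N))` gives the number of twin prime pairs up to N.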
Distance Fibonacci numbers

Fibonacci numbers are defined by the recurrence $F_{n} = F_{n-1} + F_{n-2}$ for n ≥ 2 with initial terms $F_{0} = 0$ and $F_{1} = 1$. Distance Fibonacci numbers are the following three generalizations of them in the distance sense, considered by Wloch et al.

Kwaśnik–Wloch, 2000: $F(k,n) = F(k,n - 1) + F(k,n - k)$ for n > k and $F(k,n) = n + 1$ for n ≤ k.

Bednarz et al., 2012: $Fd(k,n) = Fd(k,n - k + 1) + Fd(k,n - k)$ for n ≥ k > 1 and Fd(k, n) = 1 for 0 ≤ n < k.

Wloch et al., 2013: $F_{2}(k,n) = F_{2}(k,n - 2) + F_{2}(k,n - k)$ for n ≥ k ≥ 1 and F ₂(k, n) = 1 for 0 ≤ n < k.
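
The first two recurrences can be evaluated directly with memoized recursion. This is a minimal sketch with hypothetical function names, assuming n ≥ 0 and small arguments (the recursion depth grows with n).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def F(k: int, n: int) -> int:
    """Kwasnik-Wloch numbers: F(k, n) = n + 1 for n <= k, else F(k, n-1) + F(k, n-k)."""
    if k < 1 or n < 0:
        raise ValueError("expected k >= 1 and n >= 0")
    if n <= k:
        return n + 1
    return F(k, n - 1) + F(k, n - k)


@lru_cache(maxsize=None)
def Fd(k: int, n: int) -> int:
    """Bednarz et al. numbers: Fd(k, n) = 1 for n < k, else Fd(k, n-k+1) + Fd(k, n-k)."""
    if k <= 1 or n < 0:
        raise ValueError("expected k > 1 and n >= 0")
    if n < k:
        return 1
    return Fd(k, n - k + 1) + Fd(k, n - k)
```

For k = 2 both sequences reduce to shifted Fibonacci numbers: F(2, n) runs 1, 2, 3, 5, 8, ... and Fd(2, n) runs 1, 1, 2, 3, 5, ...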
p-adic metric

Let p be a prime number. Any nonzero rational number x can be represented as $x = p^{\alpha }\frac{c} {d}$, where c and d are integers not divisible by p, and α is a unique integer. The p-adic norm of x is defined by $\vert x\vert _{p} = p^{-\alpha }$. Moreover, | 0 | _p = 0 is defined.

The p -adic metric is a norm metric on the set $\mathbb{Q}$ of rational numbers defined by
$$\displaystyle{\vert x - y\vert _{p}.}$$
This metric forms the basis for the algebra of p-adic numbers. The Cauchy completions of the metric spaces $(\mathbb{Q},\vert x - y\vert _{p})$ and $(\mathbb{Q},\vert x - y\vert )$ with the natural metric | x − y | give the fields $\mathbb{Q}_{p}$ of p-adic numbers and $\mathbb{R}$ of real numbers, respectively.
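
The p-adic norm can be computed exactly on rationals with Python's `fractions.Fraction`. The sketch below follows the definition above; the function names are hypothetical, and p is assumed to be prime (this is not checked).

```python
from fractions import Fraction


def p_adic_valuation(x: Fraction, p: int) -> int:
    """Exponent alpha in x = p**alpha * c/d with p dividing neither c nor d (x != 0)."""
    if x == 0:
        raise ValueError("the valuation of 0 is not a finite integer")
    num, den = x.numerator, x.denominator
    alpha = 0
    while num % p == 0:
        num //= p
        alpha += 1
    while den % p == 0:
        den //= p
        alpha -= 1
    return alpha


def p_adic_norm(x, p: int) -> Fraction:
    """|x|_p = p**(-alpha), with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    return Fraction(p) ** (-p_adic_valuation(x, p))


def p_adic_distance(x, y, p: int) -> Fraction:
    """p-adic metric |x - y|_p on the rationals."""
    return p_adic_norm(Fraction(x) - Fraction(y), p)
```

Since 12 = 2² · 3, one gets |12|₂ = 1/4, while |3/8|₂ = 8: numbers highly divisible by p are p-adically small.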

The Gajić metric is an ultrametric on the set $\mathbb{Q}$ of rational numbers defined, for x ≠ y (via the integer part ⌊z⌋ of a real number z), by
$$\displaystyle{\inf \{2^{-n}: n \in \mathbb{Z},\,\,\lfloor 2^{n}(x - e)\rfloor = \lfloor 2^{n}(y - e)\rfloor \},}$$
where e is any fixed irrational number. This metric is equivalent to the natural metric | x − y | on $\mathbb{Q}$.
Continued fraction metric on irrationals

The continued fraction metric on irrationals is a complete metric on the set Irr of irrational numbers defined, for x ≠ y, by
$$\displaystyle{ \frac{1} {n},}$$
where n is the first index for which the continued fraction expansions of x and y differ. This metric is equivalent to the natural metric | x − y | on Irr which is noncomplete and disconnected. Also, the Baire 0-dimensional space $B(\aleph _{0})$ (cf. Baire metric in Chap. 11) is homeomorphic to Irr endowed with this metric.
Natural metric

The natural metric (or absolute value metric , line metric , the distance between numbers) is a metric on $\mathbb{R}$ defined by
$$\displaystyle{\vert x-y\vert = \left \{\begin{array}{ccc} y - x,&\mbox{ if}&x - y < 0,\\ x - y, &\mbox{ if} &x - y \geq 0.\end{array} \right.}$$
On $\mathbb{R}$ all l _p -metrics coincide with the natural metric. The metric space $(\mathbb{R},\vert x - y\vert )$ is called the real line (or Euclidean line).

There exist many other metrics on $\mathbb{R}$ coming from | x − y | by some metric transform (Chap. 4). For example: min{1, | x − y | }, $\frac{\vert x-y\vert } {1+\vert x-y\vert }$, $\vert x\vert + \vert x - y\vert + \vert y\vert $ (for x ≠ y) and, for a given 0 < α < 1, the generalized absolute value metric | x − y | ^α.

Some authors use | x − y | as the Polish notation (parentheses-free and computer-friendly) of the distance function in any metric space.
Zero bias metric

The zero bias metric is a metric on $\mathbb{R}$ defined by
$$\displaystyle{1 + \vert x - y\vert }$$
if one and only one of x and y is strictly positive, and by
$$\displaystyle{\vert x - y\vert,}$$
otherwise, where | x − y | is the natural metric (see, for example, [Gile87]).
Sorgenfrey quasi-metric

The Sorgenfrey quasi-metric is a quasi-metric d on $\mathbb{R}$ defined by
$$\displaystyle{y - x}$$
if y ≥ x, and equal to 1, otherwise. Some similar quasi-metrics on $\mathbb{R}$ are:
1. $d_{1}(x,y) =\max \{ y - x,0\}$ (in general, max{ f(y) − f(x), 0} is a quasi-metric on a set X if $f: X \rightarrow \mathbb{R}_{\geq 0}$ is an injective function);
2. $d_{2}(x,y) =\min \{ y - x,1\}$ if y ≥ x, and equal to 1, otherwise;
3. $d_{3}(x,y) = y - x$ if y ≥ x, and equal to a(x − y) (for fixed a > 0), otherwise;
4. $d_{4}(x,y) = e^{y} - e^{x}$ if y ≥ x, and equal to $e^{-y} - e^{-x}$ otherwise.
Real half-line quasi-semimetric

The real half-line quasi-semimetric is defined on the half-line $\mathbb{R}_{>0}$ by
$$\displaystyle{\max \{0,\ln \frac{y} {x}\}.}$$
Janous–Hametner metric

The Janous–Hametner metric is defined on the half-line $\mathbb{R}_{>0}$ by
$$\displaystyle{ \frac{\vert x - y\vert } {(x + y)^{t}},}$$
where $t = -1$ or 0 ≤ t ≤ 1, and | x − y | is the natural metric.
Extended real line metric

An extended real line metric is a metric on $\mathbb{R} \cup \{ +\infty \}\cup \{-\infty \}$. The main example (see, for example, [Cops68]) of such metric is given by
$$\displaystyle{\vert \,f(x) - f(y)\vert,}$$
where $f(x) = \frac{x} {1+\vert x\vert }$ for $x \in \mathbb{R}$, $f(+\infty ) = 1$, and $f(-\infty ) = -1$.

Another metric, commonly used on $\mathbb{R} \cup \{ +\infty \}\cup \{-\infty \}$, is defined by
$$\displaystyle{\vert \arctan x -\arctan y\vert,}$$
where $-\frac{1} {2}\pi <\arctan x < \frac{1} {2}\pi$ for −∞ < x < ∞, and $\arctan (\pm \infty ) = \pm \frac{1} {2}\pi$.
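
Both metrics are straightforward to evaluate with IEEE infinities. The sketch below represents ±∞ by `math.inf` and `-math.inf`, which is an implementation choice of ours; the function names are hypothetical.

```python
import math


def f(x: float) -> float:
    """f(x) = x / (1 + |x|) on the reals, extended by f(+inf) = 1 and f(-inf) = -1."""
    if x == math.inf:
        return 1.0
    if x == -math.inf:
        return -1.0
    return x / (1.0 + abs(x))


def d_bounded(x: float, y: float) -> float:
    """Extended real line metric |f(x) - f(y)|."""
    return abs(f(x) - f(y))


def d_arctan(x: float, y: float) -> float:
    """Extended real line metric |arctan x - arctan y|; math.atan maps +-inf to +-pi/2."""
    return abs(math.atan(x) - math.atan(y))
```

Under the first metric the whole extended line has diameter 2, under the second it has diameter π.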
Complex modulus metric

The complex modulus metric on the set $\mathbb{C}$ of complex numbers is defined by
$$\displaystyle{\vert z - u\vert,}$$
where, for any $z = z_{1} + z_{2}i \in \mathbb{C}$, the number $\vert z\vert = \sqrt{z\overline{z}} = \sqrt{z_{1 }^{2 } + z_{2 }^{2}}$ is the complex modulus. The complex argument θ is defined by $z = \vert z\vert (\cos (\theta ) + i\sin (\theta ))$.

The metric space $(\mathbb{C},\vert z - u\vert )$ is called the complex (or Wessel–Argand) plane. It is isometric to the Euclidean plane $(\mathbb{R}^{2},\vert \vert x - y\vert \vert _{2})$. So, the metrics on $\mathbb{R}^{2}$, given in Chaps. 19 and 5, can be seen as metrics on $\mathbb{C}$. For example, the British Rail metric on $\mathbb{C}$ is | z | + | u | for z ≠ u. The p -relative (if 1 ≤ p < ∞) and relative metric (if p = ∞) on $\mathbb{C}$ are defined for | z | + | u | ≠ 0 respectively, by
$$\displaystyle{ \frac{\vert z - u\vert } {\root{p}\of{\vert z\vert ^{p} + \vert u\vert ^{p}}}\mbox{ and }\frac{\vert z - u\vert } {\max \{\vert z\vert,\vert u\vert \}}.}$$
ℤ(η _m)-related norm metrics

A Kummer (or cyclotomic) ring $\mathbb{Z}(\eta _{m})$ is a subring of the ring $\mathbb{C}$ (and an extension of the ring $\mathbb{Z}$), such that each of its elements has the form $\sum _{j=0}^{m-1}a_{j}\eta _{m}^{j}$, where η _m is a primitive m-th root $\exp (\frac{2\pi i} {m})$ of unity, and all a _j are integers.

The complex modulus | z | of $z = a + b\eta _{m} \in \mathbb{C}$ is defined by
$$\displaystyle{\vert z\vert ^{2} = z\overline{z} = a^{2} + (\eta _{ m} + \overline{\eta _{m}})ab + b^{2} = a^{2} + 2ab\cos (\frac{2\pi } {m}) + b^{2}.}$$
Then $\vert z\vert ^{2}$ equals $(a + b)^{2}$ for m = 1 and $(a - b)^{2}$ for m = 2 (in both cases $q^{2}$ for the usual integer q = z), $a^{2} + b^{2}$ for m = 4, and $a^{2} + ab + b^{2}$ for m = 6 ($a^{2} - ab + b^{2}$ for m = 3), i.e., for the ring $\mathbb{Z}$ of usual integers, $\mathbb{Z}(i)$ of Gaussian integers and $\mathbb{Z}(\rho )$ of Eisenstein–Jacobi (or EJ) integers.

The set of units of $\mathbb{Z}(\eta _{m})$ contains $\eta _{m}^{j}$, 0 ≤ j ≤ m − 1; for m = 5 and m > 6, units of infinite order appear also, since $\cos (\frac{2\pi } {m})$ is irrational. For m = 2, 4, 6, the set of units is { ± 1}, { ± 1, ±i}, { ± 1, ±ρ, ±ρ ²}, where i = η ₄ and $\rho =\eta _{6} = \frac{1+i\sqrt{3}} {2}$.

The norms $\vert z\vert = \sqrt{a^{2 } + b^{2}}$ and $\vert \vert z\vert \vert _{i} = \vert a\vert + \vert b\vert $ for $z = a + bi \in \mathbb{C}$ give rise to the complex modulus and i -Manhattan metrics on $\mathbb{C}$. They coincide with the Euclidean (l ₂-) and Manhattan (l ₁-) metrics, respectively, on $\mathbb{R}^{2}$ seen as the complex plane. The restriction of the i-Manhattan metric on $\mathbb{Z}(i)$ is the path metric of the square grid $\mathbb{Z}^{2}$ of $\mathbb{R}^{2}$; cf. grid metric in Chap. 19.

The ρ -Manhattan metric on $\mathbb{C}$ is defined by the norm | | z | | _ρ, i.e.,
$$\displaystyle{\min \{\vert a\vert + \vert b\vert + \vert c\vert: z = a + b\rho + c\rho ^{2}\} =\min \{ \vert a\vert + \vert b\vert,\vert a + b\vert + \vert b\vert,\vert a + b\vert + \vert a\vert: z = a + b\rho \}.}$$
The restriction of the ρ-Manhattan metric on $\mathbb{Z}(\rho )$ is the path metric of the triangular grid of $\mathbb{R}^{2}$ (seen as the hexagonal lattice $A_{2} =\{ (a,b,c) \in \mathbb{Z}^{3}: a + b + c = 0\}$), i.e., the hexagonal metric (Chap. 19).
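
On Eisenstein–Jacobi integers the ρ-Manhattan norm is a three-term minimum, as the right-hand side of the formula above shows. The sketch below stores z = a + bρ as the integer pair (a, b); this representation and the function names are assumptions made for illustration.

```python
def rho_manhattan_norm(a: int, b: int) -> int:
    """rho-Manhattan norm of z = a + b*rho, using the closed-form three-term minimum."""
    return min(abs(a) + abs(b), abs(a + b) + abs(b), abs(a + b) + abs(a))


def rho_manhattan_distance(z: tuple, u: tuple) -> int:
    """Hexagonal (triangular-grid path) distance between z = (a, b) and u = (a', b')."""
    return rho_manhattan_norm(z[0] - u[0], z[1] - u[1])
```

Since ρ² = ρ − 1, the six units ±1, ±ρ, ±ρ² correspond to the pairs (±1, 0), (0, ±1), (−1, 1), (1, −1), and each has norm 1, i.e., they are exactly the six neighbours of 0 in the triangular grid.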

Let f denote either i or $\rho = \frac{1+i\sqrt{3}} {2}$. Given a $\pi \in \mathbb{Z}(\,f)\setminus \{0\}$ and $z,z' \in \mathbb{Z}(\,f)$, we write $z \equiv z'\,(\mbox{ mod}\,\pi )$ if $z - z' =\delta \pi$ for some $\delta \in \mathbb{Z}(\,f)$. For the quotient ring $\mathbb{Z}_{\pi }(\,f) =\{ z\,(\mbox{ mod}\,\pi ): z \in \mathbb{Z}(\,f)\}$, it holds $\vert \mathbb{Z}_{\pi }(\,f)\vert = \vert \vert \pi \vert \vert _{f}^{2}$.

Call two congruence classes z (mod π) and z′ (mod π) adjacent if $z - z' \equiv f^{j}\,(\mbox{ mod}\,\pi )$ for some j. The resulting graph on $\mathbb{Z}_{\pi }(\,f)$ is called a Gaussian network or EJ network if, respectively, f = i or f = ρ. The path metrics of these networks coincide with their norm metrics, defined (Fan–Gao, 2004) for z (mod π) and z′ (mod π), by
$$\displaystyle{\min \vert \vert u\vert \vert _{f}: u \in z - z'\,(\mbox{ mod}\,\pi ).}$$

These metrics are different from the previously defined ([Hube94a, Hube94b]) distance on $\mathbb{Z}_{\pi }(\,f)$: | | v | | _f, where v ∈ z − z′ (mod π) is selected by minimizing the complex modulus. For f = i, this is the Mannheim distance (Chap. 16), which is not a metric.
Chordal metric

The chordal metric d _χ is a metric on the set $\overline{\mathbb{C}}$=$\mathbb{C} \cup \{\infty \}$ defined by
$$\displaystyle{d_{\chi }(z,u) = \frac{2\vert z - u\vert } {\sqrt{1 + \vert z\vert ^{2}}\sqrt{1 + \vert u\vert ^{2}}}\mbox{ and }d_{\chi }(z,\infty ) = \frac{2} {\sqrt{1 + \vert z\vert ^{2}}}}$$
for all $u,z \in \mathbb{C}$ (cf. M -relative metric in Chap. 5).

The metric space $(\overline{\mathbb{C}},d_{\chi })$ is called the extended complex plane. It is homeomorphic and conformally equivalent to the Riemann sphere, i.e., the unit sphere $S^{2} =\{ (x_{1},x_{2},x_{3}) \in \mathbb{E}^{3}: x_{1}^{2} + x_{2}^{2} + x_{3}^{2} = 1\}$ (considered as a metric subspace of $\mathbb{E}^{3}$), onto which $(\overline{\mathbb{C}},d_{\chi })$ is one-to-one mapped under stereographic projection.

The plane $\overline{\mathbb{C}}$ can be identified with the plane x ₃ = 0 such that the real and imaginary axes coincide with the x ₁ and x ₂ axes. Under stereographic projection, each point $z \in \mathbb{C}$ corresponds to the point (x ₁, x ₂, x ₃) ∈ S ², where the ray drawn from the “north pole” (0, 0, 1) to the point z meets the sphere S ²; the “north pole” corresponds to the point at ∞. The chordal (spherical) metric between two points p, q ∈ S ² is taken to be the distance between their preimages $z,u \in \overline{\mathbb{C}}$.
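
The relation between the chordal formula and the chord length on the sphere can be checked numerically. The sketch below uses the standard coordinates of the stereographic image of z on the unit sphere; these coordinates and the function names are not taken from the text and should be read as assumptions.

```python
import math


def stereographic(z: complex) -> tuple:
    """Point of S^2 where the ray from the north pole (0, 0, 1) through z meets the sphere."""
    s = abs(z) ** 2
    return (2 * z.real / (1 + s), 2 * z.imag / (1 + s), (s - 1) / (1 + s))


def chordal(z: complex, u: complex) -> float:
    """Chordal metric between two finite points of the extended complex plane."""
    return 2 * abs(z - u) / (math.sqrt(1 + abs(z) ** 2) * math.sqrt(1 + abs(u) ** 2))


def chordal_to_infinity(z: complex) -> float:
    """Chordal metric between a finite point z and the point at infinity."""
    return 2 / math.sqrt(1 + abs(z) ** 2)
```

With these definitions, `math.dist(stereographic(z), stereographic(u))` agrees with `chordal(z, u)` up to rounding, and the Euclidean distance from `stereographic(z)` to (0, 0, 1) agrees with `chordal_to_infinity(z)`.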

The chordal metric is defined equivalently on $\overline{\mathbb{R}}^{n} = \mathbb{R}^{n} \cup \{\infty \}$:
$$\displaystyle{d_{\chi }(x,y) = \frac{2\vert \vert x - y\vert \vert _{2}} {\sqrt{1 + \vert \vert x\vert \vert _{2 }^{2}}\sqrt{1 + \vert \vert y\vert \vert _{2 }^{2}}}\mbox{ and }d_{\chi }(x,\infty ) = \frac{2} {\sqrt{1 + \vert \vert x\vert \vert _{2 }^{2}}}.}$$

The restriction of the metric $d_{\chi }$ on $\mathbb{R}^{n}$ is a Ptolemaic metric; cf. Chap. 1.

Given α > 0, β ≥ 0, p ≥ 1, the generalized chordal metric is a metric on $\mathbb{C}$ (in general, on $(\mathbb{R}^{n},\vert \vert.\vert \vert _{2})$ and even on any Ptolemaic space (V, | | . | | )), defined by
$$\displaystyle{ \frac{\vert z - u\vert } {\root{p}\of{\alpha +\beta \vert z\vert ^{p}} \cdot \root{p}\of{\alpha +\beta \vert u\vert ^{p}}}.}$$
Metrics on quaternions

Quaternions are members of a noncommutative division algebra $\mathcal{Q}$ over the field $\mathbb{R}$, geometrically realizable in $\mathbb{R}^{4}$ ([Hami66]). Formally,
$$\displaystyle{\mathcal{Q} =\{ q = q_{1} + q_{2}i + q_{3}j + q_{4}k: q_{i} \in \mathbb{R}\},}$$
where the basic units $1,i,j,k \in \mathcal{Q}$ satisfy $i^{2} = j^{2} = k^{2} = -1$ and $ij = -ji = k$.

The quaternion norm is defined by $\vert \vert q\vert \vert = \sqrt{q\overline{q}} = \sqrt{q_{1 }^{2 } + q_{2 }^{2 } + q_{3 }^{2 } + q_{4 }^{2}}$, where $\overline{q} = q_{1} - q_{2}i - q_{3}j - q_{4}k$. The quaternion metric is the norm metric | | q − q′ | | on $\mathcal{Q}$.

The sets of all Lipschitz integers and Hurwitz integers are defined, respectively, by
$$\displaystyle{L =\{ q_{1} + q_{2}i + q_{3}j + q_{4}k: q_{i} \in \mathbb{Z}\}\,\,\mbox{ and }}$$

$$\displaystyle{H =\{ q_{1} + q_{2}i + q_{3}j + q_{4}k:\, \mbox{ all }\,q_{i} \in \mathbb{Z}\,\mbox{ or all }q_{i} + \frac{1} {2} \in \mathbb{Z}\}.}$$
A quaternion q ∈ L is irreducible (i.e., q = q′q″ implies {q′, q″} ∩{±1, ±i, ±j, ±k} ≠ ∅) if and only if | | q | | is a prime. Given an irreducible π ∈ L and q, q′ ∈ H, we write $q \equiv q'\,(\mbox{ mod}\,\pi )$ if $q - q' =\delta \pi$ for some δ ∈ L.

For the rings L _π = { q (mod π): q ∈ L} and H _π = { q (mod π): q ∈ H} it holds | L _π | = | | π | | ² and $\vert H_{\pi }\vert = 2\vert \vert \pi \vert \vert ^{2} - 1$.

The quaternion Lipschitz metric on L _π is defined (Martinez et al., 2009) by
$$\displaystyle{d_{L}(\alpha,\beta ) =\min \sum _{1\leq s\leq 4}\vert q_{s}\vert:\alpha -\beta \equiv q_{1} + q_{2}i + q_{3}j + q_{4}k\,(\mbox{ mod}\,\pi ).}$$
The ring H is additively generated by its subring L and $w = \frac{1} {2}(1 + i + j + k)$. The Hurwitz metric on the ring H _π is defined (Güzeltepe, 2013) by
$$\displaystyle{d_{H}(\alpha,\beta ) =\min \sum _{1\leq s\leq 5}\vert q_{s}\vert:\alpha -\beta \equiv q_{1} + q_{2}i + q_{3}j + q_{4}k + q_{5}w\,(\mbox{ mod}\,\pi ).}$$

Cf. the hyper-Kähler and Gibbons–Manton metrics in Sect. 7.3 and the unit quaternions and joint angle metrics in Sect. 18.3.

2 Metrics on Polynomials

A polynomial is a sum of powers in one or more variables multiplied by coefficients. A polynomial in one variable (or univariate polynomial) with constant real (complex) coefficients is given by $P = P(z) =\sum _{ k=0}^{n}a_{k}z^{k}$, $a_{k} \in \mathbb{R}$ ($a_{k} \in \mathbb{C}$). The set $\mathcal{P}$ of all real (complex) polynomials forms a ring $(\mathcal{P},+,\cdot,0)$. It is also a vector space over $\mathbb{R}$ (over $\mathbb{C}$).

Polynomial norm metric

A polynomial norm metric is a norm metric on the vector space $\mathcal{P}$ of all real (complex) polynomials defined by
$$\displaystyle{\vert \vert P - Q\vert \vert,}$$
where | | . | | is a polynomial norm, i.e., a function $\vert \vert.\vert \vert: \mathcal{P}\rightarrow \mathbb{R}$ such that, for all $P,Q \in \mathcal{P}$ and for any scalar k, we have the following properties:
1. $\vert \vert P\vert \vert \geq 0$, with $\vert \vert P\vert \vert = 0$ if and only if P ≡ 0;
2. $\vert \vert kP\vert \vert = \vert k\vert \,\vert \vert P\vert \vert $;
3. $\vert \vert P + Q\vert \vert \leq \vert \vert P\vert \vert + \vert \vert Q\vert \vert $ (triangle inequality).
The l _p -norm and L _p -norm of a polynomial $P(z) =\sum _{ k=0}^{n}a_{k}z^{k}$ are defined by
$$\displaystyle{\vert \vert P\vert \vert _{p} = (\sum _{k=0}^{n}\vert a_{ k}\vert ^{p})^{1/p}\mbox{ and }\vert \vert P\vert \vert _{ L_{p}} = (\int _{0}^{2\pi }\vert P(e^{i\theta })\vert ^{p}\frac{d\theta } {2\pi })^{\frac{1} {p} }\mbox{ for }1 \leq p < \infty,}$$

$$\displaystyle{\vert \vert P\vert \vert _{\infty } =\max _{0\leq k\leq n}\vert a_{k}\vert \mbox{ and }\vert \vert P\vert \vert _{L_{\infty }} =\sup _{\vert z\vert =1}\vert P(z)\vert \mbox{ for }p = \infty.}$$
The values | | P | | ₁ and | | P | | _∞ are called the length and height of polynomial P.
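
These norms can be evaluated from the coefficient list $[a_{0},\ldots,a_{n}]$. In the sketch below the $L_{p}$-norm is only approximated by an equally spaced Riemann sum on the unit circle; the function names and the default number of sample points are arbitrary choices of ours.

```python
import cmath
import math


def lp_norm(coeffs: list, p: float) -> float:
    """l_p-norm of the coefficient sequence; p = math.inf gives the height."""
    if p == math.inf:
        return max(abs(a) for a in coeffs)
    return sum(abs(a) ** p for a in coeffs) ** (1 / p)


def circle_lp_norm(coeffs: list, p: float, samples: int = 4096) -> float:
    """Approximate L_p-norm of P on |z| = 1 from equally spaced sample points."""
    values = []
    for j in range(samples):
        z = cmath.exp(2j * math.pi * j / samples)
        values.append(abs(sum(a * z ** k for k, a in enumerate(coeffs))))
    if p == math.inf:
        return max(values)  # a lower bound for the true supremum
    return (sum(v ** p for v in values) / samples) ** (1 / p)
```

For P(z) = 1 + 2z + 3z² the length is 6 and the height is 3; by Parseval's identity its $L_{2}$-norm coincides with its $l_{2}$-norm $\sqrt{14}$, which the sampled value reproduces up to rounding.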
Distance from irreducible polynomials

For any field $\mathbb{F}$, a polynomial with coefficients in $\mathbb{F}$ is said to be irreducible over $\mathbb{F}$ if it cannot be factored into the product of two nonconstant polynomials with coefficients in $\mathbb{F}$. Given a metric d on the polynomials over $\mathbb{F}$, the distance (of a given polynomial P(z)) from irreducible polynomials is $d_{ir}(P) =\inf d(P,Q)$, where the infimum is taken over all irreducible polynomials Q(z) of the same degree over $\mathbb{F}$.

The polynomial conjecture of Turán, 1967, states that there exists a constant C with $d_{ir}(P) \leq C$ for every polynomial P over $\mathbb{Z}$, where d(P, Q) is the length | | P − Q | | ₁ of P − Q.

Lee–Ruskey–Williams, 2007, conjectured that there exists a constant C with d _ir(P) ≤ C for every polynomial P over the Galois field $\mathbb{F}_{2}$, where d(P, Q) is the Hamming distance between the (0, 1)-sequences of coefficients of P and Q.
Bombieri metric

The Bombieri metric (or polynomial bracket metric ) is a polynomial norm metric on the set $\mathcal{P}$ of all real (complex) polynomials defined by
$$\displaystyle{[P - Q]_{p},}$$
where [. ]_p, 0 ≤ p ≤ ∞, is the Bombieri p -norm.

For a polynomial $P(z) =\sum _{ k=0}^{n}a_{k}z^{k}$ it is defined by
$$\displaystyle{[P]_{p} = (\sum _{k=0}^{n}\binom{n}{k}^{1-p}\vert a_{ k}\vert ^{p})^{\frac{1} {p} }.}$$
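
A direct transcription of this formula for finite p > 0 is sketched below. The function name is hypothetical, and the degree n is taken to be the length of the coefficient list minus one (so the leading coefficient is assumed nonzero).

```python
from math import comb


def bombieri_norm(coeffs: list, p: float) -> float:
    """Bombieri p-norm of P(z) = sum_k coeffs[k] * z**k, for finite p > 0."""
    n = len(coeffs) - 1
    total = sum(comb(n, k) ** (1 - p) * abs(a) ** p for k, a in enumerate(coeffs))
    return total ** (1 / p)
```

For P(z) = (1 + z)² = 1 + 2z + z² this gives $[P]_{2} = \sqrt{1 + 4/2 + 1} = 2$, while for p = 1 the binomial weights disappear and the value is the length 4.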
Metric space of roots

The metric space of roots is (Ćurgus–Mascioni, 2006) the space (X, d) where X is the family of all multisets of complex numbers with n elements and the distance between multisets $U =\{ u_{1},\ldots,u_{n}\}$ and $V =\{ v_{1},\ldots,v_{n}\}$ is defined by the following analog of the Fréchet metric:
$$\displaystyle{\min _{\tau \in Sym_{n}}\max _{1\leq j\leq n}\vert u_{j} - v_{\tau (\,j)}\vert,}$$
where τ is any permutation of $\{1,\ldots,n\}$. Here the set of roots of some monic complex polynomial of degree n is considered as a multiset with n elements. Cf. metrics between multisets in Chap. 1.

The function assigning to each polynomial the multiset of its roots is a homeomorphism between the metric space of all monic complex polynomials of degree n with the polynomial norm metric l _∞ and the metric space of roots.
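
For small n the distance between two root multisets can be computed by brute force over all permutations. This is a sketch only: the cost grows like n!, the function name is hypothetical, and multisets are passed as lists of equal length n ≥ 1.

```python
from itertools import permutations


def roots_distance(U: list, V: list) -> float:
    """min over matchings of the largest |u_j - v_tau(j)| between two n-element multisets."""
    if len(U) != len(V) or not U:
        raise ValueError("expected two nonempty multisets of the same size")
    return min(
        max(abs(u - v) for u, v in zip(U, perm))
        for perm in permutations(V)
    )
```

The order in which the elements are listed does not matter: `roots_distance([1, 1, 2], [2, 1, 1])` is 0, since the two lists describe the same multiset.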

3 Metrics on Matrices

An m × n matrix $A = ((a_{ij}))$ over a field $\mathbb{F}$ is a table consisting of m rows and n columns with the entries $a_{ij}$ from $\mathbb{F}$. The set of all m × n matrices with real (complex) entries is denoted by $M_{m,n}$ or $\mathbb{R}^{m\times n}$ ($\mathbb{C}^{m\times n}$). It forms a group $(M_{m,n},+,0_{m,n})$, where $((a_{ij})) + ((b_{ij})) = ((a_{ij} + b_{ij}))$, and the matrix $0_{m,n} \equiv 0$. It is also an mn-dimensional vector space over $\mathbb{R}$ ($\mathbb{C}$).

The transpose of a matrix $A = ((a_{ij})) \in M_{m,n}$ is the matrix $A^{T} = ((a_{ji})) \in M_{n,m}$. An m × n matrix A is called a square matrix if m = n, and a symmetric matrix if $A = A^{T}$. The conjugate transpose (or adjoint) of a matrix $A = ((a_{ij})) \in M_{m,n}$ is the matrix $A^{{\ast}} = ((\overline{a}_{ji})) \in M_{n,m}$. A Hermitian matrix is a complex square matrix A with $A = A^{{\ast}}$.

The set of all square n × n matrices with real (complex) entries is denoted by M _n. It forms a ring (M _n, +, ⋅ , 0_n), where + and 0_n are defined as above, and $((a_{ij})) \cdot ((b_{ij})) = ((\sum _{k=1}^{n}a_{ik}b_{kj}))$. It is also an n ²-dimensional vector space over $\mathbb{R}$ (over $\mathbb{C}$). The trace of a square n × n matrix A = ((a _ij)) is defined by $\text{Tr}(A) =\sum _{ i=1}^{n}a_{ii}$.

The identity matrix is $1_{n} = ((c_{ij}))$ with $c_{ii} = 1$, and $c_{ij} = 0$, i ≠ j. A unitary matrix $U = ((u_{ij}))$ is a square matrix defined by $U^{-1} = U^{{\ast}}$, where $U^{-1}$ is the inverse matrix of U, i.e., $UU^{-1} = 1_{n}$. A matrix $A \in M_{m,n}$ is orthonormal if $A^{{\ast}}A = 1_{n}$. A matrix $A \in \mathbb{R}^{n\times n}$ is orthogonal if $A^{T} = A^{-1}$, normal if $A^{T}A = AA^{T}$ and singular if its determinant is 0.

If for a matrix $A \in M_{n}$ there is a nonzero vector x such that Ax = λ x for some scalar λ, then λ is called an eigenvalue of A with corresponding eigenvector x. Given a matrix $A \in \mathbb{C}^{m\times n}$, its singular values $s_{i}(A)$ are defined as $\sqrt{\lambda _{i}(A^{{\ast}}A)}$, where $\lambda _{i}(A^{{\ast}}A)$ are the eigenvalues of $A^{{\ast}}A$. A real matrix A is positive-definite if $v^{T}Av > 0$ for all nonzero real vectors v; it holds if and only if all eigenvalues of $A_{H} = \frac{1} {2}(A + A^{T})$ are positive. A Hermitian matrix A is positive-definite if $v^{{\ast}}Av > 0$ for all nonzero complex vectors v; it holds if and only if all eigenvalues λ(A) are positive.

The mixed states of an n-dimensional quantum system are described by their density matrices, i.e., positive-semidefinite Hermitian n × n matrices of trace 1. The set of such matrices is convex, and its extremal points describe the pure states. Cf. monotone metrics in Chap. 7 and distances between quantum states in Chap. 24.

Matrix norm metric

A matrix norm metric is a norm metric on the set M _m, n of all real (complex) m × n matrices defined by
$$\displaystyle{\vert \vert A - B\vert \vert,}$$
where | | . | | is a matrix norm, i.e., a function $\vert \vert.\vert \vert: M_{m,n} \rightarrow \mathbb{R}$ such that, for all A, B ∈ M _m, n, and for any scalar k, we have the following properties:
1. $\vert \vert A\vert \vert \geq 0$, with $\vert \vert A\vert \vert = 0$ if and only if $A = 0_{m,n}$;
2. $\vert \vert kA\vert \vert = \vert k\vert \,\vert \vert A\vert \vert $;
3. $\vert \vert A + B\vert \vert \leq \vert \vert A\vert \vert + \vert \vert B\vert \vert $ (triangle inequality);
4. $\vert \vert AB\vert \vert \leq \vert \vert A\vert \vert \cdot \vert \vert B\vert \vert $ (submultiplicativity).
All matrix norm metrics on $M_{m,n}$ are equivalent. The simplest example of such metric is the Hamming metric on $M_{m,n}$ (in general, on the set $M_{m,n}(\mathbb{F})$ of all m × n matrices with entries from a field $\mathbb{F}$) defined by $\vert \vert A - B\vert \vert _{H}$, where $\vert \vert A\vert \vert _{H}$ is the Hamming norm of $A \in M_{m,n}$, i.e., the number of nonzero entries in A. An example of a generalized (i.e., not submultiplicative) matrix norm is the max element norm $\vert \vert A\vert \vert _{\max } =\max _{i,j}\vert a_{ij}\vert $ of $A = ((a_{ij}))$; but $\sqrt{mn}\vert \vert A\vert \vert _{\max }$ is a matrix norm.
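
A 2 × 2 example shows why the max element norm fails submultiplicativity while its rescaled version does not. The sketch below uses plain nested lists for matrices; all helper names are hypothetical.

```python
import math


def matmul(A: list, B: list) -> list:
    """Product of two matrices given as nested lists (inner dimensions must agree)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def max_norm(A: list) -> float:
    """Max element norm: the largest absolute value of an entry."""
    return max(abs(a) for row in A for a in row)


def scaled_max_norm(A: list) -> float:
    """sqrt(m n) times the max element norm of an m x n matrix."""
    return math.sqrt(len(A) * len(A[0])) * max_norm(A)
```

For the all-ones matrix A of order 2, the product AA has all entries equal to 2, so `max_norm(matmul(A, A))` is 2 while `max_norm(A) ** 2` is 1; with the factor $\sqrt{mn} = 2$ the two sides become 4 and 4.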
Natural norm metric

A natural (or operator, induced) norm metric is a matrix norm metric on the set M _n defined by
$$\displaystyle{\vert \vert A - B\vert \vert _{\text{nat}},}$$
where $\vert \vert.\vert \vert _{\text{nat}}$ is a natural (or operator, induced) norm on $M_{n}$, i.e., the matrix norm induced by a vector norm | | x | | , $x \in \mathbb{R}^{n}$ ($x \in \mathbb{C}^{n}$), and defined by
$$\displaystyle{\vert \vert A\vert \vert _{\text{nat}} =\sup _{\vert \vert x\vert \vert \neq 0}\frac{\vert \vert Ax\vert \vert } {\vert \vert x\vert \vert } =\sup _{\vert \vert x\vert \vert =1}\vert \vert Ax\vert \vert =\sup _{\vert \vert x\vert \vert \leq 1}\vert \vert Ax\vert \vert.}$$

The natural norm metric can be defined in similar way on the set M _m, n of all m × n real (complex) matrices: given vector norms $\vert \vert.\vert \vert _{\mathbb{R}^{m}}$ on $\mathbb{R}^{m}$ and $\vert \vert.\vert \vert _{\mathbb{R}^{n}}$ on $\mathbb{R}^{n}$, the natural norm | | A | | _nat of a matrix A ∈ M _m, n, induced by $\vert \vert.\vert \vert _{\mathbb{R}^{n}}$ and $\vert \vert.\vert \vert _{\mathbb{R}^{m}}$, is a matrix norm defined by $\vert \vert A\vert \vert _{\text{nat}} =\sup _{\vert \vert x\vert \vert _{\mathbb{R}^{n}}=1}\vert \vert Ax\vert \vert _{\mathbb{R}^{m}}$.
Matrix p-norm metric

A matrix p -norm metric is a natural norm metric on M _n defined by
$$\displaystyle{\vert \vert A - B\vert \vert _{\text{nat}}^{p},}$$
where | | . | | _nat ^p is the matrix (or operator) p-norm, i.e., a natural norm, induced by the vector l _p -norm, 1 ≤ p ≤ ∞:
$$\displaystyle{\vert \vert A\vert \vert _{\text{nat}}^{p} =\max _{ \vert \vert x\vert \vert _{p}=1}\vert \vert Ax\vert \vert _{p},\,\,\mbox{ where }\,\,\vert \vert x\vert \vert _{p} = (\sum _{i=1}^{n}\vert x_{ i}\vert ^{p})^{1/p}.}$$
The maximum absolute column and maximum absolute row metric are the matrix 1-norm and matrix ∞ -norm metric on M _n. For a matrix A = ((a _ij)) ∈ M _n, the maximum absolute column and maximum absolute row sum norm are
$$\displaystyle{\vert \vert A\vert \vert _{\text{nat}}^{1} =\max _{ 1\leq j\leq n}\sum _{i=1}^{n}\vert a_{ ij}\vert \,\,\mbox{ and }\,\,\vert \vert A\vert \vert _{\text{nat}}^{\infty } =\max _{ 1\leq i\leq n}\sum _{j=1}^{n}\vert a_{ ij}\vert.}$$
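
Both sum norms are one-liners on nested lists. A minimal sketch, with hypothetical function names:

```python
def norm_1(A: list) -> float:
    """Maximum absolute column sum (matrix 1-norm)."""
    return max(sum(abs(A[i][j]) for i in range(len(A))) for j in range(len(A[0])))


def norm_inf(A: list) -> float:
    """Maximum absolute row sum (matrix infinity-norm)."""
    return max(sum(abs(a) for a in row) for row in A)


def sum_norm_metrics(A: list, B: list) -> tuple:
    """Pair (maximum absolute column metric, maximum absolute row metric) of A and B."""
    D = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
    return norm_1(D), norm_inf(D)
```

For A with rows (1, −2) and (3, 4), the column sums are 4 and 6 and the row sums are 3 and 7, so the two norms are 6 and 7; transposing A swaps them.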

The spectral norm metric is the matrix 2-norm metric $\vert \vert A - B\vert \vert _{\text{nat}}^{2}$ on $M_{n}$. The matrix 2-norm $\vert \vert.\vert \vert _{\text{nat}}^{2}$, induced by the vector l ₂ -norm, is also called the spectral norm and denoted by $\vert \vert.\vert \vert _{sp}$. For a matrix $A = ((a_{ij})) \in M_{n}$, it is
$$\displaystyle{\vert \vert A\vert \vert _{sp} = s_{\max }(A) = \sqrt{\lambda _{\max }(A^{{\ast} } A)},}$$
where $A^{{\ast}} = ((\overline{a}_{ji}))$, while $s_{\max }$ and $\lambda _{\max }$ are the largest singular value and eigenvalue.
Frobenius norm metric

The Frobenius norm metric is a matrix norm metric on M _m, n defined by
$$\displaystyle{\vert \vert A - B\vert \vert _{Fr},}$$
where | | . | | _Fr is the Frobenius (or Hilbert–Schmidt) norm. For A = ((a _ij)), it is
$$\displaystyle{\vert \vert A\vert \vert _{Fr} = \sqrt{\sum _{i,j } \vert a_{ij } \vert ^{2}} = \sqrt{\text{Tr} (A^{{\ast} } A)} = \sqrt{\sum _{1\leq i\leq \text{rank} (A) } \lambda _{i}} = \sqrt{\sum _{1\leq i\leq \text{rank} (A) } s_{i }^{2}},}$$
where $\lambda _{i}$ are the nonzero eigenvalues of $A^{{\ast}}A$ and $s_{i}$ are the nonzero singular values of A.

This norm is strictly convex, is a differentiable function of its elements a _ij and is the only unitarily invariant norm among $\vert \vert A\vert \vert _{p} = (\sum _{i=1}^{m}\sum _{j=1}^{n}\vert a_{ij}\vert ^{p})^{\frac{1} {p} }$, p ≥ 1.

The trace norm metric is a matrix norm metric on M _m, n defined by
$$\displaystyle{\vert \vert A - B\vert \vert _{tr},}$$
where | | . | | _tr is the trace norm (or nuclear norm) on M _m, n defined by
$$\displaystyle{\vert \vert A\vert \vert _{tr} =\sum _{ i=1}^{\min \{m,n\}}s_{ i}(A) = \text{Tr}(\sqrt{A^{{\ast} } A}).}$$
Schatten norm metric

Given 1 ≤ p < ∞, the Schatten norm metric is a matrix norm metric on M _m, n defined by
$$\displaystyle{\vert \vert A - B\vert \vert _{Sch}^{p},}$$
where | | . | | _Sch ^p is the Schatten p -norm on M _m, n. For a matrix A ∈ M _m, n, it is defined as the p-th root of the sum of the p-th powers of all its singular values:
$$\displaystyle{\vert \vert A\vert \vert _{Sch}^{p} = (\sum _{ i=1}^{\min \{m,n\}}s_{ i}^{p}(A))^{\frac{1} {p} }.}$$
For p = ∞ (understood as the limit p → ∞, which gives the largest singular value), 2 and 1, one obtains the spectral norm metric, Frobenius norm metric and trace norm metric, respectively.
(c,p)-norm metric

Let $k \in \mathbb{N}$, k ≤ min{m, n}, $c \in \mathbb{R}^{k}$, $c_{1} \geq c_{2} \geq \ldots \geq c_{k} > 0$, and 1 ≤ p < ∞.

The (c, p)-norm metric is a matrix norm metric on M _m, n defined by
$$\displaystyle{\vert \vert A - B\vert \vert _{(c,p)}^{k},}$$
where | | . | | _(c, p) ^k is the (c, p)-norm on M _m, n. For a matrix A ∈ M _m, n, it is defined by
$$\displaystyle{\vert \vert A\vert \vert _{(c,p)}^{k} = (\sum _{ i=1}^{k}c_{ i}s_{i}^{p}(A))^{\frac{1} {p} },}$$
where $s_{1}(A) \geq s_{2}(A) \geq \ldots \geq s_{k}(A)$ are the first k singular values of A.

If p = 1, it is the c-norm. If, moreover, $c_{1} =\ldots = c_{k} = 1$, it is the Ky Fan k -norm.
Ky Fan k-norm metric

Given $k \in \mathbb{N}$, k ≤ min{m, n}, the Ky Fan k-norm metric is a matrix norm metric on M _m, n defined by
$$\displaystyle{\vert \vert A - B\vert \vert _{KF}^{k},}$$
where | | . | | _KF ^k is the Ky Fan k -norm on M _m, n. For a matrix A ∈ M _m, n, it is defined as the sum of its first k singular values:
$$\displaystyle{\vert \vert A\vert \vert _{KF}^{k} =\sum _{ i=1}^{k}s_{ i}(A).}$$
For k = 1 and k = min{m, n}, one obtains the spectral and trace norm metrics.
Cut norm metric

The cut norm metric is a matrix norm metric on M _m, n defined by
$$\displaystyle{\vert \vert A - B\vert \vert _{cut},}$$

where | | . | | _cut is the cut norm on M _m, n defined, for a matrix A = ((a _ij)) ∈ M _m, n, as:
$$\displaystyle{\vert \vert A\vert \vert _{cut} =\max _{I\subset \{1,\ldots,m\},J\subset \{1,\ldots,n\}}\vert \sum _{i\in I,j\in J}a_{ij}\vert.}$$
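
For very small matrices the cut norm can be computed by enumerating all pairs of index subsets. The sketch below is exponential in m + n and is meant only to make the definition concrete; the function names are hypothetical, and empty subsets are allowed (they contribute the value 0).

```python
from itertools import chain, combinations


def subsets(n: int):
    """All subsets of {0, ..., n-1}, as tuples of indices."""
    return chain.from_iterable(combinations(range(n), r) for r in range(n + 1))


def cut_norm(A: list) -> float:
    """max over row subsets I and column subsets J of |sum of a_ij, i in I, j in J|."""
    m, n = len(A), len(A[0])
    return max(
        abs(sum(A[i][j] for i in I for j in J))
        for I in subsets(m)
        for J in subsets(n)
    )
```

For a matrix with nonnegative entries the maximum is attained by taking all rows and all columns, so the cut norm is simply the sum of all entries; sign cancellations make it smaller in general.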

Cf. in Chap. 15 the rectangle distance on weighted graphs and the cut semimetric, but the weighted cut metric in Chap. 19 is not related.
Matrix nearness problems

A norm | | . | | is unitarily invariant on M _m, n if | | B | | = | | UBV | | for all B ∈ M _m, n and all unitary matrices U, V. All Schatten p -norms are unitarily invariant.

Given a unitarily invariant norm | | . | | on $M_{m,n}$, a matrix property $\mathcal{P}$ defining a subspace or compact subset of $M_{m,n}$ (so that $d_{\vert \vert.\vert \vert }(A,\mathcal{P})$ below is well defined) and a matrix $A \in M_{m,n}$, the distance to $\mathcal{P}$ is the point-set distance on $M_{m,n}$
$$\displaystyle{d(A) = d_{\vert \vert.\vert \vert }(A,\mathcal{P}) =\min \{ \vert \vert E\vert \vert: A + E\mbox{ has property }\mathcal{P}\}.}$$
A matrix nearness problem is ([High89]) to find an explicit formula for d(A), the $\mathcal{P}$ -closest matrix (or matrices) $X_{\vert \vert.\vert \vert }(A) = A + E$, satisfying the above minimum, and efficient algorithms for computing d(A) and X _{| | . | |}(A). The componentwise nearness problem is to find $d'(A) =\min \{\epsilon: \vert E\vert \leq \epsilon \vert A\vert,A + E\mbox{ has property }\mathcal{P}\}$, where | B | = (( | b _ij | )) and the matrix inequality is interpreted componentwise.

The most used norms for B = ((b _ij)) are the Schatten 2- and ∞ -norms (cf. Schatten norm metric): the Frobenius norm $\vert \vert B\vert \vert _{Fr} = \sqrt{\text{Tr} (B^{{\ast} } B)} = \sqrt{\sum _{1\leq i\leq \text{rank} (B) } s_{i }^{2}}$ and the spectral norm $\vert \vert B\vert \vert _{sp} = \sqrt{\lambda _{\max }(B^{{\ast} } B)} = s_{1}(B)$.

Examples of closest matrices $X = X_{\vert \vert.\vert \vert }(A,\mathcal{P})$ follow.

Let $A \in \mathbb{C}^{n\times n}$. Then $A = A_{H} + A_{S}$, where $A_{H} = \frac{1} {2}(A + A^{{\ast}})$ is Hermitian and $A_{S} = \frac{1} {2}(A - A^{{\ast}})$ is skew-Hermitian (i.e., $A_{S}^{{\ast}} = -A_{S}$). Let $A = U\Sigma V ^{{\ast}}$ be a singular value decomposition (SVD) of A, i.e., U ∈ M _m and V ^∗ ∈ M _n are unitary, while $\Sigma = \text{diag}(s_{1},s_{2},\ldots,s_{\min \{m,n\}})$ is an m × n diagonal matrix with $s_{1} \geq s_{2} \geq \ldots \geq s_{\text{rank}(A)} > 0 =\ldots = 0$. Fan and Hoffman, 1955, showed that, for any unitarily invariant norm, A _H, A _S, UV ^∗ are closest Hermitian (symmetric), skew-Hermitian (skew-symmetric) and unitary (orthogonal) matrices, respectively. Such a matrix X _Fr(A) is a unique minimizer in all three cases.
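
The three closest matrices named by Fan and Hoffman are cheap to form. A minimal NumPy sketch follows; the function names are illustrative, and the unitary factor is taken from `numpy.linalg.svd`, which returns V ^∗ directly.

```python
import numpy as np

def hermitian_part(A):
    """Closest Hermitian matrix: (A + A*) / 2."""
    return (A + A.conj().T) / 2

def skew_hermitian_part(A):
    """Closest skew-Hermitian matrix: (A - A*) / 2."""
    return (A - A.conj().T) / 2

def nearest_unitary(A):
    """Closest unitary matrix U V* built from an SVD A = U Sigma V*."""
    U, _, Vh = np.linalg.svd(A)
    return U @ Vh
```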

Let $A \in \mathbb{R}^{n\times n}$. Gabriel, 1979, found the closest normal matrix X _Fr(A). Higham found in 1988 a unique closest symmetric positive-semidefinite matrix X _Fr(A) and, in 2001, the closest matrix of this type with unit diagonal (i.e., a correlation matrix).

Given a SVD $A = U\Sigma V ^{{\ast}}$ of A, let A _k denote $U\Sigma _{k}V ^{{\ast}}$, where $\Sigma _{k}$ is a diagonal matrix $\text{diag}(s_{1},s_{2},\ldots,s_{k},0,\ldots,0)$ containing the largest k singular values of A. Then (Mirsky, 1960) A _k achieves min_{rank(A+E) ≤ k} | | E | | for any unitarily invariant norm. So, $\vert \vert A - A_{k}\vert \vert _{Fr} = \sqrt{\sum _{i=k+1 }^{\text{rank} (A) }s_{i }^{2}}$ (Eckart–Young, 1936) and $\vert \vert A - A_{k}\vert \vert _{sp} = s_{max}(A - A_{k}) = s_{k+1}(A)$. A _k is a unique minimizer X _Fr(A) if s _k > s _k+1.
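
The closest matrix of rank at most k can be obtained by truncating an SVD. The sketch below, with the hypothetical name `best_rank_k`, returns A _k together with the singular values so that both error formulas can be checked numerically.

```python
import numpy as np

def best_rank_k(A, k):
    """Return (A_k, s): the rank-k truncation of A and all singular values of A."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    s_k = s.copy()
    s_k[k:] = 0.0                 # keep only the k largest singular values
    return (U * s_k) @ Vh, s      # U * s_k scales the columns of U

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 4))
A_k, s = best_rank_k(A, 2)
fro_err = np.linalg.norm(A - A_k)       # Frobenius norm of the residual
sp_err = np.linalg.norm(A - A_k, 2)     # spectral norm of the residual
```

Here `fro_err` agrees with the square root of the sum of the discarded squared singular values, and `sp_err` agrees with s _k+1, up to rounding.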

Let $A \in \mathbb{R}^{n\times n}$ be nonsingular. Then its distance to singularity $d(A,Sing) =\min \{ \vert \vert E\vert \vert: A + E\mbox{ is singular}\}$ is, for both above norms, $s_{n}(A) = \frac{1} {s_{1}(A^{-1})} = \frac{1} {\vert \vert A^{-1}\vert \vert _{sp}} =\sup \{\delta:\delta \mathbb{B}_{\mathbb{R}^{n}} \subseteq A\mathbb{B}_{\mathbb{R}^{n}}\}$; here $\mathbb{B}_{\mathbb{R}^{n}} =\{ x \in \mathbb{R}^{n}: \vert \vert x\vert \vert \leq 1\}$.
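
A minimal sketch of this distance, assuming NumPy and the illustrative name `distance_to_singularity`: the smallest singular value gives d(A, Sing), and removing the corresponding rank-one term of the SVD yields a perturbation E of that size for which A + E is singular.

```python
import numpy as np

def distance_to_singularity(A):
    """Return (s_n(A), E) with ||E|| = s_n(A) and A + E singular."""
    U, s, Vh = np.linalg.svd(A)
    # Cancel the last term s_n * u_n * v_n^* of the SVD expansion of A.
    E = -s[-1] * np.outer(U[:, -1], Vh[-1, :])
    return float(s[-1]), E

A = np.array([[2.0, 1.0], [0.0, 0.5]])
d, E = distance_to_singularity(A)
```

Since E has rank one, its Frobenius and spectral norms coincide, which matches the statement that the distance is the same for both norms.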

Given a closed convex cone $C \subseteq \mathbb{R}^{n}$, call a matrix $A \in \mathbb{R}^{m\times n}$ feasible if $\{Ax: x \in C\} = \mathbb{R}^{m}$; so, for m = n and $C = \mathbb{R}^{n}$, feasibility means nonsingularity. Renegar, 1995, showed that, for a feasible matrix A, its distance to infeasibility min{ | | E | | _nat: A + E is not feasible} is $\sup \{\delta:\delta \mathbb{B}_{\mathbb{R}^{m}} \subseteq A(\mathbb{B}_{\mathbb{R}^{n}} \cap C)\}$.

Lewis, 2003, generalized this by showing that, given two real normed spaces X, Y and a surjective convex process (or set valued sublinear mapping) F from X to Y, i.e., a multifunction for which {(x, y): y ∈ F(x)} is a closed convex cone, it holds
$$\displaystyle{\min \{\vert \vert E\vert \vert _{\text{nat}}: E\mbox{ is any linear map }X \rightarrow Y,F+E\mbox{ is not surjective}\} = \frac{1} {\vert \vert \,F^{-1}\vert \vert _{\text{nat}}}.}$$

Donchev et al., 2002, extended this, computing distance to irregularity; cf. metric regularity (Chap. 1). Cf. the above four distances to ill-posedness with distance to uncontrollability (Chap. 18) and distances from symmetry (Chap. 21).
Sym(n, ℝ) ⁺ and Her(n, ℂ) ⁺ metrics

Let $Sym(n, \mathbb{R})^{+}$ and $Her(n, \mathbb{C})^{+}$ be the cones of n × n symmetric real and Hermitian complex positive-definite matrices, respectively. The $Sym(n, \mathbb{R})^{+}$ metric is defined, for any $A,B \in Sym(n, \mathbb{R})^{+}$, as
$$\displaystyle{(\sum _{i=1}^{n}\log ^{2}\lambda _{ i})^{\frac{1} {2} },}$$
where λ ₁, …, λ _n are the eigenvalues of the matrix A ⁻¹ B (the same as those of $A^{-\frac{1} {2} }BA^{-\frac{1} {2} }$). It is the Riemannian distance, arising from the Riemannian metric $ds^{2} = \text{Tr}((A^{-1}(dA))^{2})$. This metric was rediscovered in Förstner–Moonen, 1999, and Pennec et al., 2004, via the generalized eigenvalue problem $det(\lambda A - B) = 0$.

The $Her(n, \mathbb{C})^{+}$ metric is defined, for any $A,B \in Her(n, \mathbb{C})^{+}$, by
$$\displaystyle{d_{R}(A,B) = \vert \vert \log (A^{-\frac{1} {2} }BA^{-\frac{1} {2} })\vert \vert _{Fr},}$$
where $\vert \vert H\vert \vert _{Fr} = (\sum _{i,j}\vert h_{ij}\vert ^{2})^{\frac{1} {2} }$ is the Frobenius norm of the matrix H = ((h _ij)). It is the Riemannian distance arising from the Riemannian metric of nonpositive curvature, defined locally (at H) by $ds = \vert \vert H^{-\frac{1} {2} }\,dH\,H^{-\frac{1} {2} }\vert \vert _{Fr}$. In other words, this distance is the geodesic distance
$$\displaystyle{\inf \{L(\gamma ):\gamma \mbox{ is a (differentiable) path from A to B}\},}$$
where $L(\gamma ) =\int _{ A}^{B}\vert \vert \gamma ^{-\frac{1} {2} }(t)\gamma '(t)\gamma ^{-\frac{1} {2} }(t)\vert \vert _{Fr}dt$ and the geodesic [A, B] is parametrized by $\gamma (t) = A^{\frac{1} {2} }(A^{-\frac{1} {2} }BA^{-\frac{1} {2} })^{t}A^{\frac{1} {2} }$ in the sense that d _R(A, γ(t)) = td _R(A, B) for each t ∈ [0, 1]. In particular, the geodesic midpoint $\gamma (\frac{1} {2})$ of [A, B] can be seen as the geometric mean of two positive-definite matrices A and B.
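
The distance d _R, the geodesic γ(t) and the geometric mean can be sketched with eigendecompositions only. This is a minimal NumPy illustration under the assumption that the inputs are Hermitian positive-definite; the helper names `_powm`, `riemannian_distance` and `geodesic` are not from the text.

```python
import numpy as np

def _powm(A, t):
    """A**t for a Hermitian positive-definite A, via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w ** t) @ V.conj().T

def riemannian_distance(A, B):
    """d_R(A, B) = sqrt(sum_i log^2 lambda_i), lambda_i eigenvalues of A^{-1/2} B A^{-1/2}."""
    A_inv_half = _powm(A, -0.5)
    lam = np.linalg.eigvalsh(A_inv_half @ B @ A_inv_half)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

def geodesic(A, B, t):
    """gamma(t) = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}, for t in [0, 1]."""
    A_half, A_inv_half = _powm(A, 0.5), _powm(A, -0.5)
    return A_half @ _powm(A_inv_half @ B @ A_inv_half, t) @ A_half

A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[3.0, 0.0], [0.0, 1.0]])
mean_AB = geodesic(A, B, 0.5)   # geometric mean of A and B
```

For commuting matrices the midpoint reduces to the ordinary square root of the product; for example, the midpoint of diag(1, 4) and diag(4, 9) is diag(2, 6).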

The space $(Her(n, \mathbb{C})^{+},d_{R})$ is an Hadamard (i.e., complete and CAT(0)) space, cf. Chap. 6. But $Her(n, \mathbb{C})^{+}$ is not complete with respect to matrix norms; it has a boundary consisting of the singular positive-semidefinite matrices.

The above $Sym(n, \mathbb{R})^{+}$ and $Her(n, \mathbb{C})^{+}$ metrics are special cases of the distance d _R(x, y) among invariant distances on symmetric cones in Chap. 9.

Cf. also, in Chap. 24, the trace distance on all Hermitian positive-definite n × n matrices of trace 1 and, in Chap. 7, the Wigner–Yanase–Dyson metrics on all complex positive-definite n × n matrices.

The Bartlett distance between two matrices $A,B \in Her(n, \mathbb{C})^{+}$ is defined (Conradsen et al., 2003, for radar applications) by
$$\displaystyle{\ln \left ( \frac{(det(A + B))^{2}} {4det(A)det(B)}\right ).}$$
Siegel distance

The Siegel half-plane is the set SH _n of n × n matrices $Z = X + iY$, where X, Y are symmetric or Hermitian and Y is positive-definite. The Siegel–Hua metric (Siegel, 1943, and independently, Hua, 1944) on SH _n is defined by
$$\displaystyle{ds^{2} = \text{Tr}(Y ^{-1}(dZ)Y ^{-1}(d\overline{Z})).}$$
It is the unique metric preserved by any automorphism of SH _n. The Siegel–Hua metric on the Siegel disk S D _n = { W = (Z − iI)(Z + iI)⁻¹: Z ∈ SH _n} is defined by
$$\displaystyle{ds^{2} = \text{Tr}((I - WW^{{\ast}})^{-1}dW(I - W^{{\ast}}W)^{-1}dW^{{\ast}}).}$$
For n=1, the Siegel–Hua metric is the Poincaré metric (cf. Chap. 6) on the Poincaré half-plane S H ₁ and the Poincaré disk S D ₁, respectively.

Let A _n = { Z = iY: Y > 0} be the imaginary axis on the Siegel half-plane. The Siegel–Hua metric on A _n is (cf. [Barb12]) the Riemannian trace metric $ds^{2} = \text{Tr}((P^{-1}dP)^{2})$. The corresponding distances are the $Sym(n, \mathbb{R})^{+}$ metric or the $Her(n, \mathbb{C})^{+}$ metric. The Siegel distance on SH _n ∖ A _n is defined by
$$\displaystyle{d_{Siegel}^{2}(Z_{ 1},Z_{2}) =\sum _{ i=1}^{n}\log ^{2}(\frac{1 + \sqrt{\lambda _{i}}} {1 -\sqrt{\lambda _{i}}});}$$
λ ₁, …, λ _n are the eigenvalues of the matrix $(Z_{1} - Z_{2})(Z_{1} -\overline{Z_{2}})^{-1}(\overline{Z_{1}} -\overline{Z_{2}})(\overline{Z_{1}} - Z_{2})^{-1}$.
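
A minimal NumPy sketch of the Siegel distance formula follows. It assumes that X and Y are real symmetric, so that the bar is the entrywise complex conjugate, and it clips tiny negative eigenvalue parts caused by rounding; the name `siegel_distance` is illustrative.

```python
import numpy as np

def siegel_distance(Z1, Z2):
    """Siegel distance between Z1 = X1 + iY1 and Z2 = X2 + iY2 (X, Y real symmetric, Y > 0)."""
    Z1 = np.asarray(Z1, dtype=complex)
    Z2 = np.asarray(Z2, dtype=complex)
    Z1c, Z2c = Z1.conj(), Z2.conj()
    # Matrix whose eigenvalues lambda_i enter the distance formula.
    R = (Z1 - Z2) @ np.linalg.inv(Z1 - Z2c) @ (Z1c - Z2c) @ np.linalg.inv(Z1c - Z2)
    lam = np.clip(np.linalg.eigvals(R).real, 0.0, None)
    r = np.sqrt(lam)
    return float(np.sqrt(np.sum(np.log((1 + r) / (1 - r)) ** 2)))

d = siegel_distance([[1j]], [[1 + 2j]])
```

For n = 1 the value agrees with the Poincaré half-plane distance arcosh(1 + |z ₁ − z ₂|² ∕ (2 y ₁ y ₂)), consistent with the remark on the case n = 1 above.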
Barbaresco metrics

Let z(k) be a complex temporal (discrete time) stationary signal, i.e., its mean value is constant and its covariance function $\mathbb{E}[z(k_{1})z^{{\ast}}(k_{2})]$ is only a function of k ₁ − k ₂. Such a signal can be represented by its covariance n × n matrix R = ((r _ij)), where $r_{ij} = \mathbb{E}[z(i)z^{{\ast}}(j)] = \mathbb{E}[z(n)z^{{\ast}}(n - i + j)]$. It is a positive-definite Toeplitz (i.e. diagonal-constant) Hermitian matrix. In radar applications, such matrices represent the Doppler spectra of the signal. Matrices R admit a parametrization (complex ARM, i.e., m-th order autoregressive model) by partial autocorrelation coefficients defined recursively as the complex correlation between the forward and backward prediction errors of the (m − 1)-th order complex ARM.

Barbaresco ([Barb12]) defined, via this parametrization, a Bergman metric (Chap. 7) on the bounded domain $\mathbb{R} \times D^{n} \subset \mathbb{C}^{n}$ of above matrices R; here D is a Poincaré disk. He also defined a related Kähler metric on M × SD _n, where M is the set of positive-definite Hermitian matrices and SD _n is the Siegel disk (cf. Siegel distance). Such matrices represent spatiotemporal stationary signals, i.e., in radar applications, the Doppler spectra and spatial directions of the signal.

Ben Jeuris, 2015, extended the above metrics to block Toeplitz matrices, i.e., those having blocks that are repeated (as elements of a Toeplitz matrix) down the diagonals of the matrix.

Cf. Ruppeiner metric (Chap. 7) and Martin cepstrum distance (Chap. 21).
Distances between graphs of matrices

The graph G(A) of a complex m × n matrix A is the range (i.e., the span of columns) of the matrix $R(A) = [I\ A^{T}]^{T}$. So, G(A) is the subspace of $\mathbb{C}^{m+n}$ consisting of all vectors v for which the equation R(A)x = v has a solution.

A distance between graphs of matrices A and B is a distance between the subspaces G(A) and G(B). It can be an angle distance between subspaces or, for example, the following distance (cf. also the Kadets distance in Chap. 1 and the gap metric in Chap. 18).

The spherical gap distance between subspaces A and B is defined by
$$\displaystyle{\max \{\max _{x\in S(A)}d_{E}(x,S(B)),\max _{y\in S(B)}d_{E}(y,S(A))\},}$$
where S(A), S(B) are the unit spheres of the subspaces A, B, $d_{E}(z,C) =\inf _{y\in C}d_{E}(z,y)$ is the point-set distance and d _E(z, y) is the Euclidean distance.
Angle distances between subspaces

Consider the Grassmannian space G(m, n) of all n-dimensional subspaces of Euclidean space $\mathbb{E}^{m}$; it is a compact Riemannian manifold of dimension n(m − n).

Given two subspaces A, B ∈ G(m, n), the principal angles $\frac{\pi }{2} \geq \theta _{1} \geq \ldots \geq \theta _{n} \geq 0$ between them are defined, for $k = 1,\ldots,n$, inductively by
$$\displaystyle{\cos \theta _{k} =\max _{x\in A}\max _{y\in B}x^{T}y = (x^{k})^{T}y^{k}}$$
subject to the conditions $\vert \vert x\vert \vert _{2} = \vert \vert y\vert \vert _{2} = 1$, $x^{T}x^{i} = 0$, $y^{T}y^{i} = 0$, for 1 ≤ i ≤ k − 1, where | | . | | ₂ is the Euclidean norm.

The principal angles can also be defined in terms of matrices Q _A and Q _B whose orthonormal columns span the subspaces A and B, respectively: in fact, the n ordered singular values of the matrix $Q_{A}^{T}Q_{B} \in M_{n}$ can be expressed as cosines cosθ ₁, $\ldots$, cosθ _n.
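
The singular-value characterization translates into a few lines of NumPy. The sketch below assumes real matrices with full column rank and obtains the orthonormal bases by QR factorization; the name `principal_angles` is illustrative. The angles are returned in descending order, matching the ordering used above.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians, descending) between the column spans of A and B."""
    QA, _ = np.linalg.qr(A)   # orthonormal basis of span(A)
    QB, _ = np.linalg.qr(B)   # orthonormal basis of span(B)
    cosines = np.linalg.svd(QA.T @ QB, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)   # guard against rounding above 1
    return np.sort(np.arccos(cosines))[::-1]

A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])   # span{e1, e2}
B = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])   # span{e1, e2 + e3}
theta = principal_angles(A, B)
```

In this example the two planes share the direction e ₁, so one angle vanishes and the other equals π ∕ 4.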

The Grassmann distance between subspaces A and B of the same dimension is their geodesic distance defined by
$$\displaystyle{\sqrt{\sum _{i=1 }^{n }\theta _{i }^{2}}.}$$
The Martin distance between subspaces A and B is defined by
$$\displaystyle{\sqrt{\ln \prod _{i=1 }^{n } \frac{1} {\cos ^{2}\theta _{i}}}.}$$
In the case when the subspaces represent ARMs (autoregressive models), the Martin distance can be expressed in terms of the cepstrum of the autocorrelation functions of the models. Cf. the Martin cepstrum distance in Chap. 21.

The Asimov distance between subspaces A and B is defined by θ ₁. The spectral distance (or chordal 2-norm distance) is defined by $2\sin (\frac{\theta _{1}} {2})$.

The containment gap distance (or projection distance) is sinθ ₁. It is the l ₂ -norm of the difference of the orthogonal projectors onto A and B. Many versions of this distance are used in Control Theory (cf. gap metric in Chap. 18).

The Frobenius distance and chordal distance between subspaces A and B are
$$\displaystyle{\sqrt{2\sum _{i=1 }^{n }\sin ^{2 } \theta _{i}\,\,\,}\mbox{ and }\,\,\,\sqrt{\sum _{i=1 }^{n }\sin ^{2 } \theta _{i}},\,\,\,\mbox{ respectively. }}$$
The former is the Frobenius norm of the difference of the above orthogonal projectors onto A and B.

Similar distances $\sqrt{1 -\prod _{ i=1 }^{n }\cos ^{2 } \theta _{i}}$ and $\arccos (\prod _{i=1}^{n}\cos \theta _{i})$ are called the Binet–Cauchy distance and (cf. Chap. 7) Fubini–Study distance, respectively.
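
All distances of this entry are functions of the principal angles, so they can be tabulated together. The following sketch assumes the angles are given in radians in [0, π ∕ 2] and takes θ ₁ to be the largest of them, following the ordering used above; the function name and dictionary keys are illustrative.

```python
import numpy as np

def subspace_distances(theta):
    """Angle-based distances between two subspaces, from their principal angles."""
    theta = np.asarray(theta, dtype=float)
    t1 = float(np.max(theta))                      # theta_1, the largest angle
    sin2 = float(np.sum(np.sin(theta) ** 2))
    cos_prod = float(np.prod(np.cos(theta)))
    with np.errstate(divide="ignore"):             # cos = 0 gives an infinite Martin distance
        martin_sq = float(-2.0 * np.sum(np.log(np.cos(theta))))
    return {
        "grassmann": float(np.sqrt(np.sum(theta ** 2))),
        "martin": float(np.sqrt(max(martin_sq, 0.0))),
        "asimov": t1,
        "spectral": float(2.0 * np.sin(t1 / 2.0)),
        "projection": float(np.sin(t1)),
        "frobenius": float(np.sqrt(2.0 * sin2)),
        "chordal": float(np.sqrt(sin2)),
        "binet_cauchy": float(np.sqrt(max(1.0 - cos_prod ** 2, 0.0))),
        "fubini_study": float(np.arccos(np.clip(cos_prod, -1.0, 1.0))),
    }

dist = subspace_distances([0.3, 0.0])
```

The values can be cross-checked against the orthogonal projectors: the Frobenius and spectral norms of their difference should match the "frobenius" and "projection" entries.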
Larsson–Villani metric

Let A and B be two arbitrary orthonormal m × n matrices of full rank, and let θ _ij be the angle between the i-th column of A and the j-th column of B.

The Larsson–Villani metric is the distance between A and B (used by Larsson and Villani, 2000, for multivariate models) whose square is defined by
$$\displaystyle{n -\sum _{i=1}^{n}\sum _{ j=1}^{n}\cos ^{2}\theta _{ ij}.}$$

The square of the usual Euclidean distance between A and B is $2(n -\sum _{i=1}^{n}\cos \theta _{ ii})$.

For n = 1, the above two distances are sinθ and $\sqrt{2(1-\cos \theta )}$, respectively.
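
Since the columns are orthonormal, the cosines cosθ _ij are the entries of A ^T B, which gives a short NumPy sketch of both distances. The function names are illustrative, real matrices are assumed, and the Euclidean distance is taken to be the Frobenius norm of A − B.

```python
import numpy as np

def larsson_villani(A, B):
    """Larsson-Villani distance between m x n matrices with orthonormal columns."""
    C = A.T @ B                  # C[i, j] = cos(theta_ij)
    n = A.shape[1]
    return float(np.sqrt(max(n - np.sum(C ** 2), 0.0)))

def euclidean(A, B):
    """Euclidean (Frobenius) distance between A and B."""
    return float(np.linalg.norm(A - B))

t = 0.4
a = np.array([[1.0], [0.0]])
b = np.array([[np.cos(t)], [np.sin(t)]])
lv, eu = larsson_villani(a, b), euclidean(a, b)
```

For these single-column matrices, `lv` and `eu` reproduce the two n = 1 values stated above with θ = 0.4.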
Lerman metric

Given a finite set X and real symmetric | X | × | X | matrices ((d ₁(x, y))), ((d ₂(x, y))) with x, y ∈ X, their Lerman semimetric (cf. Kendall τ distance on permutations in Chap. 11) is defined by
$$\displaystyle{\vert \{(\{x,y\},\{u,v\}): (d_{1}(x,y) - d_{1}(u,v))(d_{2}(x,y) - d_{2}(u,v)) < 0\}\vert {\vert X\vert + 1\choose 2}^{-2},}$$
where ({x, y}, {u, v}) is any pair of unordered pairs of elements x, y, u, v from X.

The similar Kaufman semimetric between ((d ₁(x, y))) and ((d ₂(x, y))) is
$$\displaystyle{\frac{\vert \{(\{x,y\},\{u,v\}): (d_{1}(x,y) - d_{1}(u,v))(d_{2}(x,y) - d_{2}(u,v)) < 0\}\vert } {\vert \{(\{x,y\},\{u,v\}): (d_{1}(x,y) - d_{1}(u,v))(d_{2}(x,y) - d_{2}(u,v))\neq 0\}\vert }.}$$
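
Both semimetrics count discordant pairs of pairs and differ only in the normalization. The sketch below rests on two assumptions that the formulas suggest but do not state outright: unordered pairs {x, y} with x = y are included (so that there are ${\vert X\vert + 1\choose 2}$ of them), and pairs of such pairs are counted as ordered. The name `lerman_kaufman` is illustrative, and the Kaufman value is set to 0 when its denominator vanishes.

```python
from itertools import combinations_with_replacement

def _pair_values(d):
    """Values d(x, y) over all unordered pairs {x, y}, x = y allowed (assumption)."""
    n = len(d)
    return [d[i][j] for i, j in combinations_with_replacement(range(n), 2)]

def lerman_kaufman(d1, d2):
    """Return (Lerman, Kaufman) semimetrics between two symmetric matrices."""
    v1, v2 = _pair_values(d1), _pair_values(d2)
    N = len(v1)                      # equals C(|X| + 1, 2)
    discordant = nonzero = 0
    for p in range(N):
        for q in range(N):           # ordered pairs of unordered pairs (assumption)
            prod = (v1[p] - v1[q]) * (v2[p] - v2[q])
            if prod < 0:
                discordant += 1
            if prod != 0:
                nonzero += 1
    lerman = discordant / N ** 2
    kaufman = discordant / nonzero if nonzero else 0.0
    return lerman, kaufman

d1 = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
d2 = [[0, 3, 2], [3, 0, 1], [2, 1, 0]]
lerman, kaufman = lerman_kaufman(d1, d2)
```

The Kaufman ratio does not depend on whether pairs of pairs are counted as ordered or unordered, since both numerator and denominator double.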

References

Barbaresco F. Information Geometry of Covariance Matrix: Cartan-Siegel Homogenous Bounded Domains, Mostow-Berger Fibration and Fréchet Median, in Matrix Information Geometry, Bhatia R. and Nielsen F. (eds.) Springer, 2012.
Copson E.T. Metric Spaces, Cambridge Univ. Press, 1968.
Ernvall S. On the Modular Distance, IEEE Trans. Inf. Theory, Vol. 31-4, pp. 521–522, 1985.
Giles J.R. Introduction to the Analysis of Metric Spaces, Australian Math. Soc. Lecture Series, Cambridge Univ. Press, 1987.
Hamilton W.R. Elements of Quaternions, second edition 1899–1901 enlarged by C.J. Joly, reprinted by Chelsea Publ., New York, 1969.
Higham N.J. Matrix Nearness Problems and Applications, in Applications of Matrix Theory, Gover M.J.C. and Barnett S. (eds.), pp. 1–27. Oxford University Press, 1989.
Huber K. Codes over Gaussian Integers, IEEE Trans. Inf. Theory, Vol. 40-1, pp. 207–216, 1994.
Huber K. Codes over Eisenstein-Jacobi Integers, Contemporary Math., Vol. 168, pp. 165–179, 1994.

Author information

Authors and Affiliations

Ecole Normale Supérieure, Paris, France
Michel Marie Deza
Moscow State Pedagogical University, Moscow, Russia
Elena Deza

About this chapter

Cite this chapter

Deza, M.M., Deza, E. (2016). Distances on Numbers, Polynomials, and Matrices. In: Encyclopedia of Distances. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-52844-0_12

DOI: https://doi.org/10.1007/978-3-662-52844-0_12
Published: 17 August 2016
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-52843-3
Online ISBN: 978-3-662-52844-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
