
A probability space is a measure space \((\Omega,\mathcal{A},P)\), where \(\mathcal{A}\) is a σ-algebra of subsets of \(\Omega \) (the measurable sets), and P is a measure on \(\mathcal{A}\) with \(P(\Omega ) = 1\). The set \(\Omega \) is called a sample space. An element \(a \in \mathcal{A}\) is called an event. P(a) is called the probability of the event a. The measure P on \(\mathcal{A}\) is called a probability measure, or (probability) distribution law, or simply (probability) distribution.

A random variable X is a measurable function from a probability space \((\Omega,\mathcal{A},P)\) into a measurable space, called a state space of possible values of the variable; it is usually taken to be \(\mathbb{R}\) with the Borel σ-algebra, so \(X: \Omega \rightarrow \mathbb{R}\). The range \(\mathcal{X}\) of the variable X is called the support of the distribution P; an element \(x \in \mathcal{X}\) is called a state.

A distribution law can be uniquely described via a cumulative distribution (or simply, distribution) function CDF, which describes the probability that a random value X takes on a value at most x: \(F(x) = P(X \leq x) = P(\omega \in \Omega: X(\omega ) \leq x)\).

So, any random variable X gives rise to a probability distribution which assigns to the interval [a, b] the probability \(P(a \leq X \leq b) = P(\omega \in \Omega: a \leq X(\omega ) \leq b)\), i.e., the probability that the variable X will take a value in the interval [a, b].

A distribution is called discrete if F(x) consists of a sequence of finite jumps at \(x_{i}\); a distribution is called continuous if F(x) is continuous. We consider (as in the majority of applications) only discrete or absolutely continuous distributions, i.e., the CDF function \(F: \mathbb{R} \rightarrow \mathbb{R}\) is absolutely continuous. It means that, for every number ε > 0, there is a number δ > 0 such that, for any sequence of pairwise disjoint intervals \([x_{k},y_{k}]\), 1 ≤ k ≤ n, the inequality \(\sum _{1\leq k\leq n}(y_{k} - x_{k}) <\delta \) implies the inequality \(\sum _{1\leq k\leq n}\vert F(y_{k}) - F(x_{k})\vert <\epsilon \).

A distribution law can also be uniquely defined via a probability density (or density, probability) function PDF of the underlying random variable. For an absolutely continuous distribution, the CDF is almost everywhere differentiable, and the PDF is defined as the derivative \(p(x) = F^{^{{\prime}} }(x)\) of the CDF; so, \(F(x) = P(X \leq x) =\int _{ -\infty }^{x}p(t)\mathit{dt}\), and \(\int _{a}^{b}p(t)\mathit{dt} = P(a \leq X \leq b)\). In the discrete case, the CDF is \(\sum _{x_{i}\leq x}p(x_{i})\), where p(x) = P(X = x) is the probability mass function (which plays the role of the PDF). But p(x) = 0 for each fixed x in any continuous case.

The random variable X is used to “push-forward” the measure P on \(\Omega \) to a measure dF on \(\mathbb{R}\). The underlying probability space is a technical device used to guarantee the existence of random variables and sometimes to construct them.

We usually present the discrete version of probability metrics, but many of them are defined on any measurable space; see [Bass89, Bass13, Cha08]. For a probability distance d on random quantities, the conditions P(X = Y ) = 1 or equality of distributions imply (and characterize) d(X, Y ) = 0; such distances are called [Rach91] compound or simple distances, respectively. Often, some ground distance d is given on the state space \(\mathcal{X}\) and the presented distance is a lifting of it to a distance on distributions. A quasi-distance between distributions is also called divergence or distance statistic.

Below we denote p X  = p(x) = P(X = x), F X  = F(x) = P(X ≤ x), p(x, y) = P(X = x, Y = y). We denote by \(\mathbb{E}[X]\) the expected value (or mean) of the random variable X: in the discrete case \(\mathbb{E}[X] =\sum _{x}\mathit{xp}(x)\), in the continuous case \(\mathbb{E}[X] =\int \mathit{xp}(x)\mathit{dx}\).

The covariance between the random variables X and Y is \(\mathit{Cov}(X,Y ) = \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y ])] = \mathbb{E}[\mathit{XY }] - \mathbb{E}[X]\mathbb{E}[Y ].\) The variance and standard deviation of X are \(\mathit{Var}(X) = \mathit{Cov}(X,X)\) and \(\sigma (X) = \sqrt{\mathit{Var } (X)}\), respectively. The correlation between X and Y is \(\mathit{Corr}(X,Y ) = \frac{\mathit{Cov}(X,Y )} {\sigma (X)\sigma (Y )}\); cf. Chap. 17.
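
A minimal numeric sketch (in Python; the joint law below is illustrative and not taken from the text) of the expected value, variance, covariance and correlation for a discrete joint distribution:

    # expectation, variance, covariance and correlation for a small discrete joint law
    import math

    joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}    # p(x, y), illustrative

    E_x = sum(p * x for (x, y), p in joint.items())
    E_y = sum(p * y for (x, y), p in joint.items())
    E_xy = sum(p * x * y for (x, y), p in joint.items())
    cov = E_xy - E_x * E_y                                          # Cov(X, Y)
    var_x = sum(p * (x - E_x) ** 2 for (x, y), p in joint.items())  # Var(X)
    var_y = sum(p * (y - E_y) ** 2 for (x, y), p in joint.items())  # Var(Y)
    corr = cov / (math.sqrt(var_x) * math.sqrt(var_y))              # Corr(X, Y)
    print(E_x, E_y, cov, corr)                                      # 0.5 0.5 0.15 0.6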

1 Distances on Random Variables

All distances in this section are defined on the set Z of all random variables with the same support \(\mathcal{X}\); here X, Y ∈ Z.

  • p -Average compound metric

    Given p ≥ 1, the p -average compound metric (or L p -metric between variables) is a metric on Z with \(\mathcal{X} \subset \mathbb{R}\) and \(\mathbb{E}[\vert Z\vert ^{p}] < \infty \) for all Z ∈ Z defined by

    $$\displaystyle{(\mathbb{E}[\vert X - Y \vert ^{p}])^{1/p} = (\sum _{ (x,y)\in \mathcal{X}\times \mathcal{X}}\vert x - y\vert ^{p}p(x,y))^{1/p}.}$$

    For p = 2 and p = ∞, it is called, respectively, the mean-square distance and essential supremum distance between variables.
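
    A minimal sketch (Python; the joint distribution p(x, y) is illustrative) of the p-average compound metric for a discrete joint law:

        # p-average compound metric (E[|X - Y|^p])^(1/p) for a small discrete joint law
        joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}   # p(x, y), illustrative

        def p_average(joint, p=1):
            # sum |x - y|^p weighted by p(x, y), then take the 1/p-th power
            return sum(prob * abs(x - y) ** p for (x, y), prob in joint.items()) ** (1 / p)

        print(p_average(joint, p=1))   # 0.3
        print(p_average(joint, p=2))   # sqrt(0.3) ~ 0.548 (mean-square distance)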

  • Łukaszyk–Karmowski metric

    The Łukaszyk–Karmowski metric (2001) on Z with \(\mathcal{X} \subset \mathbb{R}\) is defined by

    $$\displaystyle{\sum _{(x,y)\in \mathcal{X}\times \mathcal{X}}\vert x - y\vert p(x)p(y).}$$

    For continuous random variables, it is defined by \(\int _{-\infty }^{+\infty }\int _{-\infty }^{+\infty }\vert x - y\vert p(x)p(y)\mathit{dxdy}\), where p(x), p(y) are the densities of X and Y. This function can be positive even for X = Y. This possibility is excluded, and so it becomes a metric, if and only if it holds

    $$\displaystyle{\int _{-\infty }^{+\infty }\int _{ -\infty }^{+\infty }\vert x - y\vert \delta (x - \mathbb{E}[X])\delta (y - \mathbb{E}[Y ])\mathit{dxdy} = \vert \mathbb{E}[X] - \mathbb{E}[Y ]\vert.}$$
  • Absolute moment metric

    Given p ≥ 1, the absolute moment metric is a metric on Z with \(\mathcal{X} \subset \mathbb{R}\) and \(\mathbb{E}[\vert Z\vert ^{p}] < \infty \) for all Z ∈ Z defined by

    $$\displaystyle{\vert (\mathbb{E}[\vert X\vert ^{p}])^{1/p} - (\mathbb{E}[\vert Y \vert ^{p}])^{1/p}\vert.}$$

    For p = 1 it is called the engineer metric.

  • Indicator metric

    The indicator metric is a metric on Z defined by

    $$\displaystyle{\mathbb{E}[1_{X\neq Y }] =\sum _{(x,y)\in \mathcal{X}\times \mathcal{X}}1_{x\neq y}p(x,y) =\sum _{(x,y)\in \mathcal{X}\times \mathcal{X},x\neq y}p(x,y).}$$

    (Cf. Hamming metric in Chap. 1.)

  • Ky Fan metric K

    The Ky Fan metric K is a metric K on Z, defined by

    $$\displaystyle{\inf \{\epsilon > 0: P(\vert X - Y \vert >\epsilon ) <\epsilon \}.}$$

    It is the case d(x, y) =  | x − y |  of the probability distance below.
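
    A minimal sketch (Python; the grid search and the joint law are illustrative) approximating the Ky Fan metric K for a discrete joint law:

        # Ky Fan metric K = inf{eps > 0 : P(|X - Y| > eps) < eps}, by grid search over eps
        joint = {(0, 0): 0.4, (0, 2): 0.1, (1, 0): 0.2, (1, 1): 0.3}   # p(x, y), illustrative

        def ky_fan(joint, grid=10000):
            for k in range(1, grid + 1):
                eps = k / grid
                tail = sum(prob for (x, y), prob in joint.items() if abs(x - y) > eps)
                if tail < eps:
                    return eps
            return 1.0   # the metric never exceeds 1

        print(ky_fan(joint))   # ~0.3, since P(|X - Y| > eps) = 0.3 for all eps < 1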

  • Ky Fan metric K*

    The Ky Fan metric K* is a metric on Z defined by

    $$\displaystyle{\mathbb{E}\left [ \frac{\vert X - Y \vert } {1 + \vert X - Y \vert }\right ] =\sum _{(x,y)\in \mathcal{X}\times \mathcal{X}} \frac{\vert x - y\vert } {1 + \vert x - y\vert }p(x,y).}$$
  • Probability distance

    Given a metric space \((\mathcal{X},d)\), the probability distance on Z is defined by

    $$\displaystyle{\inf \{\epsilon > 0: P(d(X,Y ) >\epsilon ) <\epsilon \}.}$$

2 Distances on Distribution Laws

All distances in this section are defined on the set \(\mathcal{P}\) of all distribution laws such that corresponding random variables have the same range \(\mathcal{X}\); here \(P_{1},P_{2} \in \mathcal{P}\).

  • L p -metric between densities

    The L p -metric between densities is a metric on \(\mathcal{P}\) (for a countable \(\mathcal{X}\)) defined, for any p ≥ 1, by

    $$\displaystyle{(\sum _{x}\vert p_{1}(x) - p_{2}(x)\vert ^{p})^{\frac{1} {p} }.}$$

    For p = 1, one half of it is called the variational distance (or total variation distance, Kolmogorov distance). For p = 2, it is the Patrick–Fisher distance. The point metric \(\sup _{x}\vert p_{1}(x) - p_{2}(x)\vert \) corresponds to p = ∞.

    The Lissak–Fu distance with parameter α > 0 is defined as \(\sum _{x}\vert p_{1}(x) - p_{2}(x)\vert ^{\alpha }\).
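
    A minimal sketch (Python; the discrete laws p1, p2 are illustrative) of the L p -metric between densities, the variational (total variation) distance and the Lissak–Fu distance:

        # L_p-metric between densities, total variation (half of L_1), Lissak-Fu distance
        p1 = {'a': 0.5, 'b': 0.3, 'c': 0.2}
        p2 = {'a': 0.4, 'b': 0.4, 'c': 0.2}

        def lp_between_densities(p1, p2, p=1):
            return sum(abs(p1[x] - p2[x]) ** p for x in p1) ** (1 / p)

        def lissak_fu(p1, p2, alpha=0.5):
            return sum(abs(p1[x] - p2[x]) ** alpha for x in p1)

        total_variation = lp_between_densities(p1, p2, p=1) / 2   # 0.1
        patrick_fisher = lp_between_densities(p1, p2, p=2)        # ~0.141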

  • Bayesian distance

    The error probability in classification is the following error probability of the optimal Bayes rule for the classification into two classes with a priori probabilities ϕ, 1 −ϕ and corresponding densities p 1, p 2 of the observations:

    $$\displaystyle{P_{e} =\sum _{x}\min (\phi p_{1}(x),(1-\phi )p_{2}(x)).}$$

    The Bayesian distance on \(\mathcal{P}\) is defined by 1 − P e .

    For the classification into m classes with a priori probabilities ϕ i , 1 ≤ i ≤ m, and corresponding densities p i of the observations, the error probability becomes

    $$\displaystyle{P_{e} = 1 -\sum _{x}p(x)\max _{i}P(C_{i}\vert x),}$$

    where P(C i  | x) is the a posteriori probability of the class C i given the observation x and \(p(x) =\sum _{ i=1}^{m}\phi _{i}P(x\vert C_{i})\). The general mean distance between m classes C i (cf. m-hemimetric in Chap. 3) is defined (Van der Lubbe, 1979) for α > 0, β > 1 by

    $$\displaystyle{\sum _{x}p(x)\left (\sum _{i}P(C_{i}\vert x)^{\beta }\right )^{\alpha }.}$$

    The case α = 1, β = 2 corresponds to the Bayesian distance in Devijver, 1974; the case \(\beta = \frac{1} {\alpha }\) was considered in Trouborst et al., 1974.

  • Mahalanobis semimetric

    The Mahalanobis semimetric is a semimetric on \(\mathcal{P}\) (for \(\mathcal{X} \subset \mathbb{R}^{n}\)) defined by

    $$\displaystyle{\sqrt{(\mathbb{E}_{P_{1 } } [X] - \mathbb{E}_{P_{2 } } [X])^{T } A(\mathbb{E}_{P_{1 } } [X] - \mathbb{E}_{P_{2 } } [X])}}$$

    for a given positive-semidefinite matrix A; its square is a Bregman quasi-distance (cf. Chap. 13). Cf. also the Mahalanobis distance in Chap. 17.

  • Engineer semimetric

    The engineer semimetric is a semimetric on \(\mathcal{P}\) (for \(\mathcal{X} \subset \mathbb{R}\)) defined by

    $$\displaystyle{\vert \mathbb{E}_{P_{1}}[X] - \mathbb{E}_{P_{2}}[X]\vert = \vert \sum _{x}x(p_{1}(x) - p_{2}(x))\vert.}$$
  • Stop-loss metric of order m

    The stop-loss metric of order m is a metric on \(\mathcal{P}\) (for \(\mathcal{X} \subset \mathbb{R}\)) defined by

    $$\displaystyle{\sup _{t\in \mathbb{R}}\sum _{x\geq t}\frac{(x - t)^{m}} {m!} (p_{1}(x) - p_{2}(x)).}$$
  • Kolmogorov–Smirnov metric

    The Kolmogorov–Smirnov metric (or Kolmogorov metric, uniform metric) is a metric on \(\mathcal{P}\) (for \(\mathcal{X} \subset \mathbb{R}\)) defined (1948) by

    $$\displaystyle{\sup _{x\in \mathbb{R}}\vert P_{1}(X \leq x) - P_{2}(X \leq x)\vert.}$$

    This metric is used, for example, in Biology as a measure of sexual dimorphism.

    The Kuiper distance on \(\mathcal{P}\) is defined by

    $$\displaystyle{\sup _{x\in \mathbb{R}}(P_{1}(X \leq x) - P_{2}(X \leq x)) +\sup _{x\in \mathbb{R}}(P_{2}(X \leq x) - P_{1}(X \leq x)).}$$

    (Cf. Pompeiu–Eggleston metric in Chap. 9).

    The Crnkovic–Drachman distance is defined by

    $$\displaystyle{\sup _{x\in \mathbb{R}}(P_{1}(X \leq x) - P_{2}(X \leq x))\ln \frac{1} {\sqrt{P_{1 } (X \leq x)(1 - P_{1 } (X \leq x))}}+}$$
    $$\displaystyle{+\sup _{x\in \mathbb{R}}(P_{2}(X \leq x) - P_{1}(X \leq x))\ln \frac{1} {\sqrt{P_{1 } (X \leq x)(1 - P_{1 } (X \leq x))}}.}$$
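
    A minimal sketch (Python; the support and probabilities are illustrative) of the Kolmogorov–Smirnov and Kuiper distances between the CDFs of two discrete laws on a common support:

        # Kolmogorov-Smirnov and Kuiper distances from the two step-function CDFs
        support = [0, 1, 2, 3]
        p1 = [0.1, 0.4, 0.3, 0.2]
        p2 = [0.3, 0.1, 0.2, 0.4]

        def cdf(p):
            out, acc = [], 0.0
            for prob in p:
                acc += prob
                out.append(acc)
            return out

        F1, F2 = cdf(p1), cdf(p2)
        ks = max(abs(a - b) for a, b in zip(F1, F2))          # sup_x |F1(x) - F2(x)| = 0.2
        kuiper = max(a - b for a, b in zip(F1, F2)) + max(b - a for a, b in zip(F1, F2))   # 0.4

    (Since the CDFs are constant between atoms, the suprema are attained at the support points.)
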
  • Cramér–von Mises distance

    The Cramér–von Mises distance (1928) is defined on \(\mathcal{P}\) (for \(\mathcal{X} \subset \mathbb{R}\)) by

    $$\displaystyle{\omega ^{2} =\int _{ -\infty }^{+\infty }(P_{ 1}(X \leq x) - P_{2}(X \leq x))^{2}\mathit{dP}_{ 2}(x).}$$

    The Anderson–Darling distance (1954) on \(\mathcal{P}\) is defined by

    $$\displaystyle{\int _{-\infty }^{+\infty }\frac{(P_{1}(X \leq x) - P_{2}(X \leq x))^{2}} {P_{2}(X \leq x)(1 - P_{2}(X \leq x))}\mathit{dP}_{2}(x).}$$

    In Statistics, the above distances of Kolmogorov–Smirnov, Cramér–von Mises, Anderson–Darling and, below, the χ 2 -distance are the main measures of goodness of fit between estimated, P 2, and theoretical, P 1, distributions.

    They and other distances were generalized (for example by Kiefer, 1955, and Glick, 1969) to the K-sample setting, i.e., some convenient generalized distances \(d(P_{1},\ldots,P_{K})\) were defined. Cf. m-hemimetric in Chap. 3.

  • Energy distance

    The energy distance (Székely, 2005) between cumulative distribution functions F(X), F(Y ) of two independent random vectors \(X,Y \in \mathbb{R}^{n}\) is defined by

    $$\displaystyle{d(F(X),F(Y )) = 2\mathbb{E}[\vert \vert X - Y \vert \vert ] - \mathbb{E}[\vert \vert X - X^{{\prime}}\vert \vert ] - \mathbb{E}[\vert \vert Y - Y ^{{\prime}}\vert \vert ],}$$

    where \(X,X^{{\prime}}\) are iid (independent and identically distributed), \(Y,Y^{{\prime}}\) are iid and | | . | | is the length of a vector. Cf. distance covariance in Chap. 17.

    It holds d(F(X), F(Y )) = 0 if and only if X and Y are identically distributed.
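
    A minimal sketch (Python; the two samples are illustrative) of an empirical estimate of the energy distance, where each expectation is replaced by an average of pairwise Euclidean distances:

        # empirical energy distance 2 E||X - Y|| - E||X - X'|| - E||Y - Y'|| between two samples
        import math

        X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
        Y = [(2.0, 2.0), (3.0, 2.0)]

        def mean_dist(A, B):
            # average of all pairwise distances (a simple, slightly biased V-statistic estimate)
            return sum(math.dist(a, b) for a in A for b in B) / (len(A) * len(B))

        energy = 2 * mean_dist(X, Y) - mean_dist(X, X) - mean_dist(Y, Y)
        print(energy)   # positive, since the two samples are well separated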

  • Prokhorov metric

    Given a metric space \((\mathcal{X},d)\), the Prokhorov metric on \(\mathcal{P}\) is defined (1956) by

    $$\displaystyle{\inf \{\epsilon > 0: P_{1}(X \in B) \leq P_{2}(X \in B^{\epsilon }) +\epsilon \mbox{ and }P_{2}(X \in B) \leq P_{1}(X \in B^{\epsilon })+\epsilon \},}$$

    where B is any Borel subset of \(\mathcal{X}\), and \(B^{\epsilon } =\{ x: d(x,y) <\epsilon,y \in B\}\).

    It is the smallest (over all joint distributions of pairs (X, Y ) of random variables X, Y such that the marginal distributions of X and Y are P 1 and P 2, respectively) probability distance between random variables X and Y.

  • Lévy–Sibley metric

    The Lévy–Sibley metric is a metric on \(\mathcal{P}\) (for \(\mathcal{X} \subset \mathbb{R}\) only) defined by

    $$\displaystyle{\inf \{\epsilon > 0: P_{1}(X \leq x-\epsilon )-\epsilon \leq P_{2}(X \leq x) \leq P_{1}(X \leq x+\epsilon ) +\epsilon \mbox{ for any }x \in \mathbb{R}\}.}$$

    It is a special case of the Prokhorov metric for \((\mathcal{X}, d) = (\mathbb{R}, \vert x - y\vert )\).

  • Dudley metric

    Given a metric space \((\mathcal{X},d)\), the Dudley metric on \(\mathcal{P}\) is defined by

    $$\displaystyle{\sup _{f\in F}\vert \mathbb{E}_{P_{1}}[f(X)] - \mathbb{E}_{P_{2}}[f(X)]\vert =\sup _{f\in F}\vert \sum _{x\in \mathcal{X}}f(x)(p_{1}(x) - p_{2}(x))\vert,}$$

    where \(F =\{ f: \mathcal{X} \rightarrow \mathbb{R},\vert \vert f\vert \vert _{\infty } + \mathit{Lip}_{d}(f) \leq 1\}\), and \(\mathit{Lip}_{d}(f) =\sup _{x,y\in \mathcal{X},x\neq y}\frac{\vert f(x)-f(y)\vert } {d(x,y)}\).

  • Szulga metric

    Given a metric space \((\mathcal{X},d)\), the Szulga metric (1982) on \(\mathcal{P}\) is defined by

    $$\displaystyle{\sup _{f\in F}\vert (\sum _{x\in \mathcal{X}}\vert f(x)\vert ^{p}p_{ 1}(x))^{1/p} - (\sum _{ x\in \mathcal{X}}\vert f(x)\vert ^{p}p_{ 2}(x))^{1/p}\vert,}$$

    where \(F =\{ f: \mathcal{X} \rightarrow \mathbb{R},\,\,\mathit{Lip}_{d}(f) \leq 1\}\), and \(\mathit{Lip}_{d}(f) =\sup _{x,y\in \mathcal{X},x\neq y}\frac{\vert f(x)-f(y)\vert } {d(x,y)}\).

  • Zolotarev semimetric

    The Zolotarev semimetric is a semimetric on \(\mathcal{P}\), defined (1976) by

    $$\displaystyle{\sup _{f\in F}\vert \sum _{x\in \mathcal{X}}f(x)(p_{1}(x) - p_{2}(x))\vert,}$$

    where F is any set of functions \(f: \mathcal{X} \rightarrow \mathbb{R}\) (in the continuous case, F is any set of such bounded continuous functions); cf. Szulga metric, Dudley metric.

  • Convolution metric

    Let G be a separable locally compact Abelian group, and let C(G) be the set of all real bounded continuous functions on G vanishing at infinity. Fix a function g ∈ C(G) such that | g | is integrable with respect to the Haar measure on G, and \(\{\beta \in G^{{\ast}}:\hat{ g}(\beta ) = 0\}\) has empty interior; here \(G^{{\ast}}\) is the dual group of G, and \(\hat{g}\) is the Fourier transform of g.

    The convolution metric (or smoothing metric) is defined (Yukich, 1985), for any two finite signed Baire measures P 1 and P 2 on G, by

    $$\displaystyle{\sup _{x\in G}\vert \int _{y\in G}g(xy^{-1})(\mathit{dP}_{ 1} -\mathit{dP}_{2})(y)\vert.}$$

    It can also be seen as the difference \(T_{P_{1}}(g) - T_{P_{2}}(g)\) of convolution operators on C(G) where, for any f ∈ C(G), the operator \(T_{P}f(x)\) is \(\int _{y\in G}f(xy^{-1})\mathit{dP}(y)\).

    In particular, this metric can be defined on the space of probability measures on \(\mathbb{R}^{n}\), where g is a PDF satisfying the above conditions.

  • Discrepancy metric

    Given a metric space \((\mathcal{X},d)\), the discrepancy metric on \(\mathcal{P}\) is defined by

    $$\displaystyle{\sup \{\vert P_{1}(X \in B) - P_{2}(X \in B)\vert: B\mbox{ is any closed ball}\}.}$$
  • Bi-discrepancy semimetric

    The bi-discrepancy semimetric (evaluating the proximity of distributions P 1, P 2 over different collections \(\mathcal{A}_{1},\mathcal{A}_{2}\) of measurable sets) is defined by

    $$\displaystyle{D(P_{1},P_{2}) + D(P_{2},P_{1}),}$$

    where \(D(P_{1},P_{2}) =\sup \{\inf \{ P_{2}(C): B \subset C \in \mathcal{A}_{2}\} - P_{1}(B): B \in \mathcal{A}_{1}\}\) (discrepancy).

  • Le Cam distance

    The Le Cam distance (1974) is a semimetric, evaluating the proximity of probability distributions P 1, P 2 (on different spaces \(\mathcal{X}_{1},\mathcal{X}_{2}\)) and defined as follows:

    $$\displaystyle{\max \{\delta (P_{1},P_{2}),\delta (P_{2},P_{1})\},}$$

    where \(\delta (P_{1},P_{2}) =\inf _{B}\sum _{x_{2}\in \mathcal{X}_{2}}\vert BP_{1}(X_{2} = x_{2}) - BP_{2}(X_{2} = x_{2})\vert \) is the Le Cam deficiency. Here \(BP_{1}(X_{2} = x_{2}) =\sum _{x_{1}\in \mathcal{X}_{1}}p_{1}(x_{1})b(x_{2}\vert x_{1})\), where B is a probability distribution over \(\mathcal{X}_{1} \times \mathcal{X}_{2}\), and

    $$\displaystyle{b(x_{2}\vert x_{1}) = \frac{B(X_{1} = x_{1},X_{2} = x_{2})} {B(X_{1} = x_{1})} = \frac{B(X_{1} = x_{1},X_{2} = x_{2})} {\sum _{x\in \mathcal{X}_{2}}B(X_{1} = x_{1},X_{2} = x)}.}$$

    So, BP 2(X 2 = x 2) is a probability distribution over \(\mathcal{X}_{2}\), since \(\sum _{x_{2}\in \mathcal{X}_{2}}b(x_{2}\vert x_{1}) = 1\).

    The Le Cam distance is not a probabilistic distance, since P 1 and P 2 are defined over different spaces; it is a distance between statistical experiments (models).

  • Skorokhod–Billingsley metric

    The Skorokhod–Billingsley metric is a metric on \(\mathcal{P}\), defined by

    $$\displaystyle\begin{array}{rcl} & & \inf _{f}\max \left \{\sup _{x}\vert P_{1}(X \leq x) - P_{2}(X \leq f(x))\vert,\sup _{x}\vert f(x) - x\vert,\right. {}\\ & & \qquad \quad \left.\sup _{x\neq y}\left \vert \ln \frac{f(y) - f(x)} {y - x} \right \vert \right \}, {}\\ \end{array}$$

    where \(f: \mathbb{R} \rightarrow \mathbb{R}\) is any strictly increasing continuous function.

  • Skorokhod metric

    The Skorokhod metric is a metric on \(\mathcal{P}\) defined (1956) by

    $$\displaystyle{\inf \{\epsilon > 0:\max \{\sup _{x}\vert P_{1}(X < x) - P_{2}(X \leq f(x))\vert,\sup _{x}\vert f(x) - x\vert \} <\epsilon \},}$$

    where \(f: \mathbb{R} \rightarrow \mathbb{R}\) is a strictly increasing continuous function.

  • Birnbaum–Orlicz distance

    The Birnbaum–Orlicz distance (1931) is a distance on \(\mathcal{P}\) defined by

    $$\displaystyle{\sup _{x\in \mathbb{R}}f(\vert P_{1}(X \leq x) - P_{2}(X \leq x)\vert ),}$$

    where \(f: \mathbb{R}_{\geq 0} \rightarrow \mathbb{R}_{\geq 0}\) is any nondecreasing continuous function with f(0) = 0, and f(2t) ≤ Cf(t) for any t > 0 and some fixed C ≥ 1. It is a near-metric, since the C -triangle inequality \(d(P_{1},P_{2}) \leq C(d(P_{1},P_{3}) + d(P_{3},P_{2}))\) holds.

    The Birnbaum–Orlicz distance is also used, in Functional Analysis, on the set of all integrable functions on the segment [0, 1], where it is defined by \(\int _{0}^{1}H(\vert f(x) - g(x)\vert )\mathit{dx}\), where H is a nondecreasing continuous function from \([0,\infty )\) onto \([0,\infty )\) which vanishes at the origin and satisfies the Orlicz condition: \(\sup _{t>0}\frac{H(2t)} {H(t)} < \infty \).

  • Kruglov distance

    The Kruglov distance (1973) is a distance on \(\mathcal{P}\), defined by

    $$\displaystyle{\int f(P_{1}(X \leq x) - P_{2}(X \leq x))\mathit{dx},}$$

    where \(f: \mathbb{R}_{\geq 0} \rightarrow \mathbb{R}_{\geq 0}\) is any even strictly increasing function with f(0) = 0, and f(s + t) ≤ C(f(s) + f(t)) for any s, t ≥ 0 and some fixed C ≥ 1. It is a near-metric, since the C -triangle inequality d(P 1, P 2) ≤ C(d(P 1, P 3) + d(P 3, P 2)) holds.

  • Bregman divergence

    Given a differentiable strictly convex function \(\phi (p): \mathbb{R}^{n} \rightarrow \mathbb{R}\) and β ∈ (0, 1), the skew Jensen (or skew Burbea–Rao) divergence on \(\mathcal{P}\) is (Basseville–Cardoso, 1995)

    $$\displaystyle{J_{\phi }^{(\beta )}(P_{ 1},P_{2}) =\beta \phi (p_{1}) + (1-\beta )\phi (p_{2}) -\phi (\beta p_{1} + (1-\beta )p_{2}).}$$

    The Burbea–Rao distance (1982) is the case \(\beta = \frac{1} {2}\) of it, i.e., it is

    $$\displaystyle{\sum _{x}\left (\frac{\phi (p_{1}(x)) +\phi (p_{2}(x))} {2} -\phi \left (\frac{p_{1}(x) + p_{2}(x)} {2} \right )\right ).}$$

    The Bregman divergence (1967) is a quasi-distance on \(\mathcal{P}\) defined by

    $$\displaystyle{\sum _{x}(\phi (p_{1}(x)) -\phi (p_{2}(x)) - (p_{1}(x) - p_{2}(x))\phi ^{{\prime}}(p_{ 2}(x))) =\lim _{\beta \rightarrow 0}\frac{1} {\beta } J_{\phi }^{(\beta )}(P_{ 1},P_{2}).}$$

    The generalised Kullback–Leibler distance \(\sum _{x}p_{1}(x)\ln \frac{p_{1}(x)} {p_{2}(x)} -\sum _{x}(p_{1}(x) - p_{2}(x))\) and Itakura–Saito distance (cf. Chap. 21) \(\sum _{x}\left (\frac{p_{1}(x)} {p_{2}(x)} -\ln \frac{p_{1}(x)} {p_{2}(x)} - 1\right )\) are the cases \(\phi (p) =\sum _{x}p(x)\ln p(x) -\sum _{x}p(x)\) and \(\phi (p) = -\sum _{x}\ln p(x)\) of the Bregman divergence. Cf. Bregman quasi-distance in Chap. 13.

    Csiszár, 1991, proved that the Kullback–Leibler distance is the only Bregman divergence which is an f -divergence.
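
    A minimal sketch (Python; phi is applied coordinatewise and the laws p1, p2 are illustrative) of the Bregman divergence, recovering the generalised Kullback–Leibler and Itakura–Saito distances as the special cases named above:

        # Bregman divergence sum_x (phi(p1) - phi(p2) - (p1 - p2) * phi'(p2)), coordinatewise
        import math

        def bregman(p1, p2, phi, dphi):
            return sum(phi(a) - phi(b) - (a - b) * dphi(b) for a, b in zip(p1, p2))

        p1 = [0.5, 0.3, 0.2]
        p2 = [0.4, 0.4, 0.2]

        gen_kl = bregman(p1, p2, lambda t: t * math.log(t) - t, lambda t: math.log(t))
        itakura_saito = bregman(p1, p2, lambda t: -math.log(t), lambda t: -1.0 / t)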

  • f -divergence

    Given a convex function \(f(t): \mathbb{R}_{\geq 0} \rightarrow \mathbb{R}\) with \(f(1) = 0,f^{{\prime}}(1) = 0,f^{{\prime\prime}}(1) = 1\), the f -divergence (independently, Csiszár, 1963, Morimoto, 1963, Ali–Silvey, 1966, Ziv–Zakai, 1973, and Akaike, 1974) on \(\mathcal{P}\) is defined by

    $$\displaystyle{\sum _{x}p_{2}(x)f\left (\frac{p_{1}(x)} {p_{2}(x)}\right ).}$$

    The cases f(t) = tlnt and f(t) = (t − 1)2 correspond to the Kullback–Leibler distance and to the χ 2 -distance below, respectively. The case f(t) =  | t − 1 | corresponds to the variational distance, and the case \(f(t) = 4(1 -\sqrt{t})\) (as well as \(f(t) = 2(t + 1) - 4\sqrt{t}\)) corresponds to the squared Hellinger metric.

    Semimetrics can also be obtained, as the square root of the f-divergence, in the cases f(t) = (t − 1)2∕(t + 1) (the Vajda–Kus semimetric), f(t) =  | t a − 1 | 1∕a with 0 < a ≤ 1 (the generalized Matusita distance), and \(f(t) = \frac{(t^{a}+1)^{1/a}-2^{(1-a)/a}(t+1)} {1-1/a}\) (the Österreicher semimetric).
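
    A minimal sketch (Python; the laws p1, p2 are illustrative) of the f -divergence, with the choices of f that recover the Kullback–Leibler, χ 2 and variational distances mentioned above:

        # f-divergence sum_x p2(x) * f(p1(x) / p2(x))
        import math

        def f_divergence(p1, p2, f):
            return sum(b * f(a / b) for a, b in zip(p1, p2))

        p1 = [0.5, 0.3, 0.2]
        p2 = [0.4, 0.4, 0.2]

        kl = f_divergence(p1, p2, lambda t: t * math.log(t))       # Kullback-Leibler distance
        chi2 = f_divergence(p1, p2, lambda t: (t - 1) ** 2)        # chi^2-distance
        variational = f_divergence(p1, p2, lambda t: abs(t - 1))   # variational (L_1) distance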

  • α -divergence

    Given \(\alpha \in \mathbb{R}\), the α -divergence (independently, Csiszár, 1967, Havrda–Charvát, 1967, Cressie–Read, 1984, and Amari, 1985) is defined as KL(P 1, P 2) for α = 1, as KL(P 2, P 1) for α = 0 and, for α ≠ 0, 1, it is

    $$\displaystyle{ \frac{1} {\alpha (1-\alpha )}\left (1 -\sum _{x}p_{2}(x)\left (\frac{p_{1}(x)} {p_{2}(x)}\right )^{\alpha }\right ).}$$

    The Amari divergence comes from the above by the transformation \(\alpha = \frac{1+t} {2}\).

  • Harmonic mean similarity

    The harmonic mean similarity is a similarity on \(\mathcal{P}\) defined by

    $$\displaystyle{2\sum _{x} \frac{p_{1}(x)p_{2}(x)} {p_{1}(x) + p_{2}(x)}.}$$
  • Fidelity similarity

    The fidelity similarity (or Bhattacharya coefficient, Hellinger affinity) on \(\mathcal{P}\) is

    $$\displaystyle{\rho (P_{1},P_{2}) =\sum _{x}\sqrt{p_{1 } (x)p_{2 } (x)}.}$$

    Cf. more general quantum fidelity similarity in Chap. 24.

  • Hellinger metric

    In terms of the fidelity similarity ρ, the Hellinger metric (or Matusita distance , Hellinger–Kakutani metric) on \(\mathcal{P}\) is defined by

    $$\displaystyle{(2\sum _{x}(\sqrt{p_{1 } (x)} -\sqrt{p_{2 } (x)})^{2})^{\frac{1} {2} } = 2\sqrt{1 -\rho (P_{1 }, P_{2 } )}.}$$
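
    A minimal sketch (Python; the laws p1, p2 are illustrative) checking that the two expressions above agree:

        # fidelity similarity rho and the Hellinger metric 2 * sqrt(1 - rho)
        import math

        p1 = [0.5, 0.3, 0.2]
        p2 = [0.4, 0.4, 0.2]

        rho = sum(math.sqrt(a * b) for a, b in zip(p1, p2))   # fidelity (Bhattacharya coefficient)
        hellinger = math.sqrt(2 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p1, p2)))
        assert abs(hellinger - 2 * math.sqrt(1 - rho)) < 1e-9
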
  • Bhattacharya distance 1

    In terms of the fidelity similarity ρ, the Bhattacharya distance 1 (1946) is

    $$\displaystyle{(\arccos \rho (P_{1},P_{2}))^{2}}$$

    for \(P_{1},P_{2} \in \mathcal{P}\). Twice this distance is the Rao distance from Chap. 7. It is used also in Statistics and Machine Learning, where it is called the Fisher distance.

    The Bhattacharya distance 2 (1943) on \(\mathcal{P}\) is defined by

    $$\displaystyle{-\ln \rho (P_{1},P_{2}).}$$
  • χ 2 -distance

    The χ 2 -distance (or Pearson χ 2 -distance ) is a quasi-distance on \(\mathcal{P}\), defined by

    $$\displaystyle{\sum _{x}\frac{(p_{1}(x) - p_{2}(x))^{2}} {p_{2}(x)}.}$$

    The Neyman χ 2 -distance is a quasi-distance on \(\mathcal{P}\), defined by

    $$\displaystyle{\sum _{x}\frac{(p_{1}(x) - p_{2}(x))^{2}} {p_{1}(x)}.}$$

    Half of the χ 2 -distance is also called Kagan's divergence.

    The probabilistic symmetric χ 2 -measure is a distance on \(\mathcal{P}\), defined by

    $$\displaystyle{2\sum _{x}\frac{(p_{1}(x) - p_{2}(x))^{2}} {p_{1}(x) + p_{2}(x)}.}$$
  • Separation quasi-distance

    The separation distance is a quasi-distance on \(\mathcal{P}\) (for a countable \(\mathcal{X}\)) defined by

    $$\displaystyle{\max _{x}\left (1 -\frac{p_{1}(x)} {p_{2}(x)}\right ).}$$

    (Not to be confused with separation distance in Chap. 9).

  • Kullback–Leibler distance

    The Kullback–Leibler distance (or relative entropy, information deviation, information gain, KL-distance) is a quasi-distance on \(\mathcal{P}\), defined (1951) by

    $$\displaystyle{\mathit{KL}(P_{1},P_{2}) = \mathbb{E}_{P_{1}}[\ln L] =\sum _{x}p_{1}(x)\ln \frac{p_{1}(x)} {p_{2}(x)},}$$

    where \(L = \frac{p_{1}(x)} {p_{2}(x)}\) is the likelihood ratio. Therefore,

    $$\displaystyle{\mathit{KL}(P_{1},P_{2})\,=\, -\sum _{x}p_{1}(x)\ln \,p_{2}(x) +\sum _{x}p_{1}(x)\ln \,p_{1}(x)\,=\,H(P_{1},P_{2}) - H(P_{1}),}$$

    where H(P 1) is the entropy of P 1, and H(P 1, P 2) is the cross-entropy of P 1 and P 2.

    If P 2 is the product of marginals of P 1 (say, p 2(x, y) = p 1(x)p 1(y)), the KL-distance KL(P 1, P 2) is called the Shannon information quantity and (cf. Shannon distance) is equal to \(\sum _{(x,y)\in \mathcal{X}\times \mathcal{X}}p_{1}(x,y)\ln \frac{p_{1}(x,y)} {p_{1}(x)p_{1}(y)}\).

    The exponential divergence is defined by \(\sum _{x}p_{1}(x)(\ln \frac{p_{1}(x)} {p_{2}(x)})^{2}.\)
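
    A minimal sketch (Python; the laws p1, p2 are illustrative) of the Kullback–Leibler distance, written both directly and as cross-entropy minus entropy:

        # KL(P1, P2) = sum_x p1(x) ln(p1(x)/p2(x)) = H(P1, P2) - H(P1)
        import math

        p1 = [0.5, 0.3, 0.2]
        p2 = [0.4, 0.4, 0.2]

        kl = sum(a * math.log(a / b) for a, b in zip(p1, p2))
        cross_entropy = -sum(a * math.log(b) for a, b in zip(p1, p2))   # H(P1, P2)
        entropy = -sum(a * math.log(a) for a in p1)                     # H(P1)
        assert abs(kl - (cross_entropy - entropy)) < 1e-12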

  • Distance to normality

    For a continuous distribution P on \(\mathbb{R}\), the differential entropy is defined by

    $$\displaystyle{h(P) = -\int _{-\infty }^{\infty }p(x)\ln p(x)\mathit{dx}.}$$

    It is \(\ln (\delta \sqrt{2\pi e})\) for a normal (or Gaussian) distribution \(g_{\delta,\mu }(x) = \frac{1} {\sqrt{2\pi \delta ^{2}}} \exp \left (-\frac{(x-\mu )^{2}} {2\delta ^{2}} \right )\) with variance δ 2 and mean μ.

    The distance to normality (or negentropy) of P is the Kullback–Leibler distance \(\mathit{KL}(P,g) =\int _{ -\infty }^{\infty }p(x)\ln \left (\frac{p(x)} {g(x)}\right )\mathit{dx} = h(g) - h(P)\), where g is a normal distribution with the same mean and variance as P. So, it is nonnegative and equal to 0 if and only if P = g almost everywhere. Cf. Shannon distance.

    Also, \(h(u_{a,b}) =\ln (b - a)\) for a uniform distribution with minimum a and maximum b > a, i.e., \(u_{a,b}(x) = \frac{1} {b-a}\) if x ∈ [a, b], and 0 otherwise. It holds \(h(u_{a,b}) \geq h(P)\) for any distribution P with support contained in [a, b]; so, \(h(u_{a,b}) - h(P)\) can be called the distance to uniformity. Tononi, 2008, used it in his model of consciousness.

  • Jeffrey distance

    The Jeffrey distance (or J-divergence, KL2-distance) is a symmetric version of the Kullback–Leibler distance defined (1946) on \(\mathcal{P}\) by

    $$\displaystyle{\mathit{KL}(P_{1},P_{2}) + \mathit{KL}(P_{2},P_{1}) =\sum _{x}(p_{1}(x) - p_{2}(x))\ln \frac{p_{1}(x)} {p_{2}(x)}.}$$

    The Aitchison distance (1986) is defined by \(\sqrt{\sum _{x }\left (\ln \frac{p_{1 } (x)g(p_{2 } )} {p_{2}(x)g(p_{1})}\right )^{2}}\), where \(g(p) = (\prod _{x}p(x))^{ \frac{1} {n} }\) is the geometric mean of the components p(x) of p.

  • Resistor-average distance

    The resistor-average distance is (Johnson–Sinanović, 2000) a symmetric version of the Kullback–Leibler distance on \(\mathcal{P}\) which is defined by the harmonic sum

    $$\displaystyle{\left ( \frac{1} {\mathit{KL}(P_{1},P_{2})} + \frac{1} {\mathit{KL}(P_{2},P_{1})}\right )^{-1}.}$$
  • Jensen–Shannon divergence

    Given a number β ∈ [0, 1] and \(P_{1},P_{2} \in \mathcal{P}\), let P 3 denote β P 1 + (1 −β)P 2. The skew divergence and the Jensen–Shannon divergence between P 1 and P 2 are defined on \(\mathcal{P}\) as KL(P 1, P 3) and β KL(P 1, P 3) + (1 −β)KL(P 2, P 3), respectively. Here KL is the Kullback–Leibler distance; cf. clarity similarity.

    In terms of the entropy \(H(P) = -\sum _{x}p(x)\ln p(x)\), the Jensen–Shannon divergence is \(H(\beta P_{1} + (1-\beta )P_{2}) -\beta H(P_{1}) - (1-\beta )H(P_{2})\), i.e., the Jensen divergence (cf. Bregman divergence).

    Let \(P_{3} = \frac{1} {2}(P_{1} + P_{2})\), i.e., \(\beta = \frac{1} {2}\). Then the skew divergence and twice the Jensen–Shannon divergence are called K -divergence and Topsøe distance (or information statistics), respectively. The Topsøe distance is a symmetric version of KL(P 1, P 2). It is not a metric, but its square root is a metric.
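
    A minimal sketch (Python; the laws and β are illustrative) of the Jensen–Shannon divergence via two Kullback–Leibler distances to the mixture P 3:

        # Jensen-Shannon divergence beta*KL(P1, P3) + (1 - beta)*KL(P2, P3), P3 = beta*P1 + (1-beta)*P2
        import math

        def kl(p, q):
            return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

        def jensen_shannon(p1, p2, beta=0.5):
            p3 = [beta * a + (1 - beta) * b for a, b in zip(p1, p2)]
            return beta * kl(p1, p3) + (1 - beta) * kl(p2, p3)

        p1 = [0.5, 0.3, 0.2]
        p2 = [0.4, 0.4, 0.2]
        print(2 * jensen_shannon(p1, p2))   # Topsoe distance for beta = 1/2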

  • Clarity similarity

    The clarity similarity is a similarity on \(\mathcal{P}\), defined by

    $$\displaystyle{(\mathit{KL}(P_{1},P_{3}) + \mathit{KL}(P_{2},P_{3})) - (\mathit{KL}(P_{1},P_{2}) + \mathit{KL}(P_{2},P_{1})) =}$$
    $$\displaystyle{=\sum _{x}\left (p_{1}(x)\ln \frac{p_{2}(x)} {p_{3}(x)} + p_{2}(x)\ln \frac{p_{1}(x)} {p_{3}(x)}\right ),}$$

    where KL is the Kullback–Leibler distance, and P 3 is a fixed probability law.

    It was introduced in [CCL01] with P 3 being the probability distribution of English.

  • Ali–Silvey distance

    The Ali–Silvey distance is a quasi-distance on \(\mathcal{P}\) defined by the functional

    $$\displaystyle{f(\mathbb{E}_{P_{1}}[g(L)]),}$$

    where \(L = \frac{p_{1}(x)} {p_{2}(x)}\) is the likelihood ratio, f is a nondecreasing function on \(\mathbb{R}\), and g is a continuous convex function on \(\mathbb{R}_{\geq 0}\) (cf. f -divergence).

    The case f(x) = x, g(x) = xlnx corresponds to the Kullback–Leibler distance; the case f(x) = −lnx, g(x) = x t corresponds to the Chernoff distance.

  • Chernoff distance

    The Chernoff distance (or Rényi cross-entropy) on \(\mathcal{P}\) is defined (1954) by

    $$\displaystyle{\max _{t\in (0,1)}D_{t}(P_{1},P_{2}),}$$

    where 0 < t < 1 and \(D_{t}(P_{1},P_{2}) = -\ln \sum _{x}(p_{1}(x))^{t}(p_{2}(x))^{1-t}\) (called the Chernoff coefficient), which is proportional to the Rényi distance.

  • Rényi distance

    Given \(t \in \mathbb{R}\), the Rényi distance (or Rényi divergence of order t, 1961) is a quasi-distance on \(\mathcal{P}\) defined as the Kullback–Leibler distance KL(P 1, P 2) if t = 1, and, otherwise, by

    $$\displaystyle{ \frac{1} {t - 1}\ln \sum _{x}p_{2}(x)\left (\frac{p_{1}(x)} {p_{2}(x)}\right )^{t}.}$$

    For \(t = \frac{1} {2}\), one half of the Rényi distance is the Bhattacharya distance 2. Cf. f -divergence and Chernoff distance.

  • Shannon distance

    Given a measure space \((\Omega,\mathcal{A},P)\), where the set \(\Omega \) is finite and P is a probability measure, the entropy (or Shannon information entropy) of a function \(f: \Omega \rightarrow X\), where X is a finite set, is defined by

    $$\displaystyle{H(f) = -\sum _{x\in X}P(f = x)\log _{a}(P(f = x)).}$$

    Here a = 2, e, or 10 and the unit of entropy is called a bit, nat, or dit (digit), respectively. The function f can be seen as a partition of the measure space.

    For any two such partitions \(f: \Omega \rightarrow X\) and \(g: \Omega \rightarrow Y\), denote by H(f, g) the entropy of the partition \((f,g): \Omega \rightarrow X \times Y\) (joint entropy), and by H(f | g) the conditional entropy (or equivocation). Then the Shannon distance between f and g is a metric defined by

    $$\displaystyle{H(f\vert g) + H(g\vert f) = 2H(f,g) - H(f) - H(g) = H(f,g) - I(f;g),}$$

    where I(f; g) = H(f) + H(g) − H(f, g) is the Shannon mutual information.

    If P is the uniform probability law, then Goppa showed that the Shannon distance can be obtained as a limiting case of the finite subgroup metric.

    In general, the information metric (or entropy metric ) between two random variables (information sources) X and Y is defined by

    $$\displaystyle{H(X\vert Y ) + H(Y \vert X) = H(X,Y ) - I(X;Y ),}$$

    where the conditional entropy H(X | Y ) is defined by \(-\sum _{x\in X}\sum _{y\in Y }p(x,y)\ln p(x\vert y)\), and p(x | y) = P(X = x | Y = y) is the conditional probability.

    The Rajski distance (or normalized information metric) is defined (Rajski, 1961, for discrete probability distributions X, Y ) by

    $$\displaystyle{\frac{H(X\vert Y ) + H(Y \vert X)} {H(X,Y )} = 1 - \frac{I(X;Y )} {H(X,Y )}.}$$

    It is equal to 1 if X and Y are independent. (Cf., a different one, normalized information distance in Chap. 11).
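
    A minimal sketch (Python; the joint law is illustrative) of the information metric H(X|Y) + H(Y|X) and the Rajski distance computed from a joint distribution:

        # information metric 2H(X,Y) - H(X) - H(Y) and Rajski distance 1 - I(X;Y)/H(X,Y)
        import math

        joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}   # p(x, y), illustrative

        def entropy(probs):
            return -sum(p * math.log2(p) for p in probs if p > 0)

        px, py = {}, {}
        for (x, y), p in joint.items():
            px[x] = px.get(x, 0.0) + p
            py[y] = py.get(y, 0.0) + p

        H_xy = entropy(joint.values())
        H_x, H_y = entropy(px.values()), entropy(py.values())
        info_metric = 2 * H_xy - H_x - H_y          # H(X|Y) + H(Y|X)
        rajski = info_metric / H_xy                 # 1 - I(X;Y)/H(X,Y)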

  • Transportation distance

    Given a metric space \((\mathcal{X},d)\), the transportation distance (and/or, according to Villani, 2009, Monge–Kantorovich–Wasserstein–Rubinstein–Ornstein–Gini–Dall’Aglio–Mallows–Tanaka distance) is the metric defined by

    $$\displaystyle{W_{1}(P_{1},P_{2}) =\inf \, \mathbb{E}_{S}[d(X,Y )] =\inf _{S}\int _{(X,Y )\in \mathcal{X}\times \mathcal{X}}d(X,Y )\mathit{dS}(X,Y ),}$$

    where the infimum is taken over all joint distributions S of pairs (X, Y ) of random variables X, Y such that marginal distributions of X and Y are P 1 and P 2.

    For any separable metric space \((\mathcal{X},d)\), this is equivalent to the Lipschitz distance between measures sup f ∫ f d(P 1P 2), where the supremum is taken over all functions f with | f(x) − f(y) | ≤ d(x, y) for any \(x,y \in \mathcal{X}\). Cf. Dudley metric.

    In general, for a Borel function \(c: \mathcal{X}\times \mathcal{X} \rightarrow \mathbb{R}_{\geq 0}\), the c -transportation distance T c (P 1, P 2) is \(\inf \,\mathbb{E}_{S}[c(X,Y )]\). It is the minimal total transportation cost if c(X, Y ) is the cost of transporting a unit of mass from the location X to the location Y. Cf. the Earth Mover’s distance (Chap. 21), which is a discrete form of it.

    The L p -Wasserstein distance is \(W_{p} = (T_{d^{p}})^{1/p} = (\inf \,\mathbb{E}_{S}[d^{p}(X,Y )])^{1/p}\). For \((\mathcal{X},d) = (\mathbb{R},\vert x - y\vert )\), it is also called the L p -metric between distribution functions (CDF) F i with \(F_{i}^{-1}(x) =\sup \{u: P_{i}(X \leq u) < x\}\), and can be written as

    $$\displaystyle\begin{array}{rcl} (\inf \,\mathbb{E}[\vert X - Y \vert ^{p}])^{1/p}& =& \left (\int _{ \mathbb{R}}\vert F_{1}(x) - F_{2}(x)\vert ^{p}\mathit{dx}\right )^{1/p} {}\\ & =& \left (\int _{0}^{1}\vert F_{ 1}^{-1}(x) - F_{ 2}^{-1}(x)\vert ^{p}\mathit{dx}\right )^{1/p}. {}\\ \end{array}$$

    For p = 1, this metric is called the Monge–Kantorovich metric (or Wasserstein metric, Fortet–Mourier metric, Hutchinson metric, Kantorovich–Rubinstein metric). For p = 2, it is the Lévy–Fréchet metric (Fréchet, 1957).
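
    A minimal sketch (Python; the common support and probabilities are illustrative) of the L 1 -Wasserstein (Monge–Kantorovich) distance on \(\mathbb{R}\) as the integral of \(\vert F_{1}(x) - F_{2}(x)\vert \), which reduces to a finite sum for step-function CDFs:

        # W_1 between two discrete laws on R via the area between their CDFs
        support = [0.0, 1.0, 2.5, 4.0]      # common support, sorted
        p1 = [0.1, 0.4, 0.3, 0.2]
        p2 = [0.3, 0.1, 0.2, 0.4]

        F1 = F2 = 0.0
        w1 = 0.0
        for i in range(len(support) - 1):
            F1 += p1[i]
            F2 += p2[i]
            w1 += abs(F1 - F2) * (support[i + 1] - support[i])   # CDFs are constant between atoms
        print(w1)   # 0.65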

  • Ornstein \(\overline{d}\) -metric

    The Ornstein \(\overline{d}\) -metric is a metric on \(\mathcal{P}\) (for \(\mathcal{X} = \mathbb{R}^{n}\)) defined (1974) by

    $$\displaystyle{ \frac{1} {n}\inf \int _{x,y}\left (\sum _{i=1}^{n}1_{ x_{i}\neq y_{i}}\right )\mathit{dS},}$$

    where the infimum is taken over all joint distributions S of pairs (X, Y ) of random variables X, Y such that marginal distributions of X and Y are P 1 and P 2.

  • Distances between belief assignments

    In Bayesian (or subjective, evidential) interpretation, a probability can be assigned to any statement, even if no random process is involved, as a way to represent its subjective plausibility, or the degree to which it is supported by the available evidence, or, mainly, degree of belief. Within this approach, imprecise probability generalizes Probability Theory to deal with scarce, vague, or conflicting information. The main model is Dempster–Shafer theory, which allows evidence to be combined.

    Given a set X, a (basic) belief assignment is a function m: P(X) → [0, 1] (where P(X) is the set of all subsets of X) with \(m(\varnothing ) = 0\) and \(\sum _{A\in P(X)}m(A) = 1\). Probability measures are a special case in which m(A) > 0 only for singletons.

    Then, for a classical probability P, it holds that Bel(A) ≤ P(A) ≤ Pl(A), where the belief function and plausibility function are defined, respectively, by

    $$\displaystyle{\mathrm{Bel}(A) =\sum _{B:B\subset A}m(B)\,\mbox{ and }\,\mathrm{Pl}(A) =\sum _{B:B\cap A\neq \varnothing }m(B) = 1 -\mathrm{Bel}(\overline{A}).}$$

    The original (Dempster, 1967) conflict factor between two belief assignments m 1 and m 2 was defined as \(c(m_{1},m_{2}) =\sum _{A\cap B=\varnothing }m_{1}(A)m_{2}(B)\). This is not a distance since c(m, m) can be positive. The combination of m 1 and m 2, seen as independent sources of evidence, is defined by \(m_{1} \oplus m_{2}(A) = \frac{1} {1-c(m_{1},m_{2})}\sum _{B\cap C=A}m_{1}(B)m_{2}(C)\).
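
    A minimal sketch (Python; the frame {a, b} and the assignments m1, m2 are illustrative) of the belief and plausibility functions and of Dempster's conflict factor:

        # Bel, Pl and the conflict factor for basic belief assignments on subsets of X = {a, b}
        m1 = {frozenset({'a'}): 0.6, frozenset({'a', 'b'}): 0.4}
        m2 = {frozenset({'b'}): 0.5, frozenset({'a', 'b'}): 0.5}

        def bel(m, A):
            return sum(p for B, p in m.items() if B <= A)       # sum over B contained in A

        def pl(m, A):
            return sum(p for B, p in m.items() if B & A)        # sum over B meeting A

        conflict = sum(p * q for B, p in m1.items() for C, q in m2.items() if not (B & C))
        print(bel(m1, frozenset({'a'})), pl(m1, frozenset({'a'})), conflict)   # 0.6 1.0 0.3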

    Usually, a distance between m 1 and m 2 estimates the difference between these sources in the form d U  =  | U(m 1) − U(m 2) | , where U is an uncertainty measure; see Sarabi-Jamab et al., 2013, for a comparison of their performance. In particular, this distance is called:

    • confusion (Hoehle, 1981) if \(U(m) = -\sum _{A}m(A)\log _{2}\mathrm{Bel}(A)\);

    • dissonance (Yager, 1983) if \(U(m) = E(m) = -\sum _{A}m(A)\log _{2}\mathrm{Pl}(A)\);

    • Yager’s factor (Yager, 1983) if \(U(m) = 1 -\sum _{A\neq \varnothing }\frac{m(A)} {\vert A\vert }\);

    • possibility-based (Smets, 1983) if \(U(m) = -\sum _{A}\log _{2}\sum _{B:A\subset B}m(B)\);

    • U-uncertainty (Dubois–Prade, 1985) if \(U(m)\,=\,I(m)\,=\, -\sum _{A}m(A)\log _{2}\vert A\vert \);

    • Lamata–Moral’s (1988) if \(U(m) =\log _{2}(\sum _{A}m(A)\vert A\vert )\) and U(m) = E(m) + I(m);

    • discord (Klir–Ramer, 1990) if \(U(m) = D(m) = -\sum _{A}m(A)\log _{2}(1 -\sum _{B}m(B)\frac{\vert B\setminus A\vert } {\vert B\vert } )\) and a variant: U(m) = D(m) + I(m);

    • strife (Klir–Parviz, 1992) if \(U(m) = -\sum _{A}m(A)\log _{2}(\sum _{B}m(B)\frac{\vert A\cap B\vert } {\vert A\vert } )\);

    • Pal et al.’s (1993) if \(U(m) = G(m) = -\sum _{A}\log _{2}m(A)\) and U(m) = G(m) + I(m);

    • total conflict (George–Pal, 1996) if \(U(m) =\sum _{A}m(A)\sum _{B}(m(B)(1 -\frac{\vert A\cap B\vert } {\vert A\cup B\vert }))\).

    Among other distances used are the cosine distance \(1 - \frac{m_{1}^{T}m_{ 2}} {\vert \vert m_{1}\vert \vert \,\vert \vert m_{2}\vert \vert }\), the Mahalanobis distance \(\sqrt{(m_{1 } - m_{2 } )^{T } A(m_{1 } - m_{2 } )}\) for some matrix A, and a pignistic-based one (Tessem, 1993): \(\max _{A}\{\vert \sum _{B\neq \varnothing }(m_{1}(B) - m_{2}(B))\frac{\vert A\cap B\vert } {\vert B\vert }\vert \}\).