10.1 Introduction

Optimal mass transport (OMT) is currently a very active area of research with applications to areas both applied and theoretical including control, transportation, econometrics, fluid dynamics, probability theory, statistical physics, shape optimization, expert systems, and meteorology; see [25, 30] for extensive lists of references. The original problem was first formulated by the civil engineer Gaspard Monge in 1781, and concerned finding the optimal way, in the sense of minimal transportation cost, of moving a pile of soil from one site to another. Much later the problem was extensively analyzed by Kantorovich [18], and is now known as the Monge–Kantorovich (MK) or optimal mass transport (OMT) problem.

In this paper, we present certain generalizations of OMT to matrix and vector-valued transportation. Our original motivation for this rather nontraditional viewpoint was provided by problems in Signal Analysis, more specifically, the need for a weakly continuous metric to compare (matrix-valued) power spectra of multivariate time series (see [24]). Soon afterward it became apparent that Quantum Mechanics was another field that would stand to benefit from such an unusual extension of OMT. In fact, it was this latter subject that provided some of the clues of how to properly set up noncommutative OMT.

The basis of the new theory is a suitable extension of the Liouville (continuity) equation that allows flows in matrix or other spaces. To this end, in [8], we first proposed such a continuity equation and a noncommutative counterpart of OMT where probability distributions are replaced by density matrices (i.e., Hermitian positive-definite matrices with unit trace). The appropriate Wasserstein metric now corresponds to the minimal value of an action integral evaluated on flows connecting end-point density matrices. The key insight, namely to use such a dynamic formulation in seeking the needed generality, was provided by the seminal approach of Benamou and Brenier [3]. Indeed, the Benamou–Brenier formulation recasts OMT as a stochastic control problem. The work we are reporting herein takes this idea along several different directions, and in particular to OMT between matrices, matricial distributions, and vectorial distributions [8, 10]. Extensions of these results to end-point distributions of unequal overall mass (the unbalanced case) are reported in [11] (and not included in the current survey).

We note that at about the same time as [8] was originally reported, closely related approaches were formulated independently and simultaneously in [6, 20]. In fact, in our work, we greatly benefited from earlier work by Carlen and Maas [7] on a fermionic Fokker–Planck equation.

10.2 Quantum Continuity Equation

The three papers [6, 8, 20] all begin with the Lindblad equation that describes the evolution of open quantum systems. Open quantum systems are thought of as coupled to a larger system (referred to as the environment or the ancilla) and, thereby, cannot, in general, be described by the Schrödinger equation [17]. In this case, the evolution of density operators \(\rho \) [17] is given by the Lindblad equation

$$\begin{aligned} \dot{\rho }&=-i [H, \rho ]\\&+\sum _{k=1}^N( L_k\rho L_k^*-\frac{1}{2}\rho L_k^*L_k-\frac{1}{2}L_k^*L_k\rho ),\nonumber \end{aligned}$$
(10.1)

where \(^*\) denotes conjugate transpose, and throughout, we assume that \(\hbar =1\). The first term on the right-hand side describes the evolution of the state under the effect of the Hamiltonian H (Schrödinger unitary evolution). The other terms on the right-hand side model diffusion and, thereby, capture the dissipation of energy; they constitute the quantum analogue of Laplace’s operator \(\varDelta \) and are referred to as the Lindblad terms.

Our approach [8, 9] relies on a suitable continuity equation in the space of Hermitian matrices \(\mathscr {H}\) (of a given dimension). To this end, we invoke suitable definitions for the gradient \(\nabla _L\) and divergence \(\nabla ^*_L\) operators on spaces of matrices that are explained below and express the continuity equation in the familiar form

$$\begin{aligned} \dot{\rho }=\nabla ^*_L J, \end{aligned}$$
(10.2)

where J is a matricial flux, in complete analogy with the continuity equation on scalar densities.

Throughout, \(\rho (t)\in \mathscr {H}\) is a positive-semidefinite matrix of trace one, i.e., a density matrix of quantum mechanics. Regarding notation, we let \({\mathscr {H}}_+\) and \({\mathscr {H}}_{++}\) denote the cones of nonnegative and positive-definite matrices, respectively,

$$ \mathscr {D}_+ :=\{\rho \in {\mathscr {H}}_{++} \mid \text {tr}(\rho )=1\} $$

the space of density matrices, and \({\mathscr {S}}\) the space of skew-Hermitian matrices (of the same dimension as \({\mathscr {H}}\)). The flux J is taken in \({\mathscr {S}}^N\), i.e., a vector with matrix entries. Flux typically arises in the form

$$ J=\rho \circ v \; \text{ or } \text{ in } \text{ the } \text{ form } \;J=\nabla _L\rho . $$

The symbol \(\rho \circ v\) denotes one of several possible choices of noncommutative multiplication. We have considered specifically the following two choices, referred to as the anticommutator multiplication (i) and Kubo-Mori product (ii), respectively:

$$ \mathrm{(i)}\;\;\rho \circ v = \frac{1}{2}(\rho v+v\rho ) \text{ and } \mathrm{(ii)}\;\;\rho \circ v=\int _0^1 \rho ^s v\rho ^{1-s} ds, $$

where, for \(\rho \in {\mathscr {H}}\) and \(v\in {\mathscr {S}}^N\),

$$ v\rho := \left[ \begin{array}{c} v_1\rho \\ \vdots \\ v_N \rho \end{array}\right] , \text{ and } \rho v := \left[ \begin{array}{c} \rho v_1\\ \vdots \\ \rho v_N \end{array}\right] . $$

On the other hand, we define the gradient operator with respect to \(L\in {\mathscr {H}}^N\) to be

$$ \nabla _L: {\mathscr {H}}\rightarrow {{\mathscr {S}}}^N, ~~X \mapsto \left[ \begin{array}{c} L_1 X-XL_1\\ \vdots \\ L_N X-X L_N \end{array}\right] . $$

With respect to the standard Hilbert–Schmidt inner product \(\langle X, Y\rangle =\text {tr}(X^*Y)\) (and, for the case when X, Y are in \({\mathscr {H}}^N\) or \({\mathscr {S}}^N\), the inner product \(\langle X, Y\rangle =\sum _{k=1}^N \text {tr}(X_k^*Y_k)\)), the divergence operator turns out to be

$$ \nabla _L^*: {{\mathscr {S}}}^N \rightarrow {\mathscr {H}},~Y= \left[ \begin{array}{c} Y_1\\ \vdots \\ Y_N \end{array}\right] \mapsto \sum _{k=1}^N (L_k Y_k-Y_k L_k), $$

and this is what is used in (10.2). We note that, for technical reasons, the definitions of gradient and divergence require that \(L_k=L_k^*\), i.e., \(L\in {\mathscr {H}}^N\), as above. Also, one can easily verify that \(\nabla _L\) is a derivation, in that

$$\begin{aligned}&\nabla _L(XY+YX)=(\nabla _L X) Y+ X (\nabla _L Y)\\&\,\,+(\nabla _L Y) X+ Y (\nabla _L X),~~\forall X, Y\in {\mathscr {H}}.\nonumber \end{aligned}$$
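The definitions above lend themselves to a quick numerical sanity check. The following is a minimal NumPy sketch, not taken from [8, 9]: the helper names (`random_hermitian`, `grad`, `div`, `inner`) and the random test matrices are illustrative assumptions. It checks that \(\nabla _L^*\) is the Hilbert–Schmidt adjoint of \(\nabla _L\), that \(\nabla _L X\) has skew-Hermitian blocks, and that the derivation identity holds.

```python
import numpy as np

def random_hermitian(n, rng):
    """Return a random n x n Hermitian matrix (illustrative test data)."""
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (a + a.conj().T) / 2

def grad(Ls, X):
    """Gradient: X -> [L_k X - X L_k], one block per L_k."""
    return [L @ X - X @ L for L in Ls]

def div(Ls, Ys):
    """Divergence: Y -> sum_k (L_k Y_k - Y_k L_k)."""
    return sum(L @ Y - Y @ L for L, Y in zip(Ls, Ys))

def inner(X, Y):
    """Hilbert-Schmidt inner product for matrices or lists of matrices."""
    if isinstance(X, list):
        return sum(np.trace(x.conj().T @ y) for x, y in zip(X, Y))
    return np.trace(X.conj().T @ Y)

rng = np.random.default_rng(0)
n, N = 3, 2
Ls = [random_hermitian(n, rng) for _ in range(N)]
X, Z = random_hermitian(n, rng), random_hermitian(n, rng)
Y = [1j * random_hermitian(n, rng) for _ in range(N)]  # skew-Hermitian blocks

# <grad X, Y> should equal <X, div Y>.
adjoint_gap = abs(inner(grad(Ls, X), Y) - inner(X, div(Ls, Y)))

# Each block of grad X should be skew-Hermitian.
skew_gap = max(np.linalg.norm(G + G.conj().T) for G in grad(Ls, X))

# Derivation identity applied to the symmetrized product XZ + ZX.
lhs = grad(Ls, X @ Z + Z @ X)
rhs = [gx @ Z + X @ gz + gz @ X + Z @ gx
       for gx, gz in zip(grad(Ls, X), grad(Ls, Z))]
leibniz_gap = max(np.linalg.norm(a - b) for a, b in zip(lhs, rhs))
```

All three gaps should vanish up to floating-point round-off.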

With these definitions in place, we define the (matricial) Laplacian as

$$\begin{aligned} \varDelta _L X&:=-\nabla _L^*\nabla _L X\\&=\sum _{k=1}^N( 2L_k X L_k-X L_kL_k-L_kL_k X),\,X\in {\mathscr {H}}, \end{aligned}$$

which is exactly (after scaling by 1 / 2) the diffusion term in the Lindblad equation (10.1). Hence, Lindblad’s equation can be rewritten as

$$\begin{aligned} \dot{\rho }&= -\nabla _H^*(\rho i)-\frac{1}{2} \nabla ^*_L (\nabla _L \rho )\\&= -\nabla _H^*(\rho i)+\frac{1}{2} \varDelta _L \rho . \end{aligned}$$
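As an illustration, the agreement between the Lindblad right-hand side (10.1) and its expression through the matricial Laplacian can be checked numerically. This is a minimal sketch under the standing assumption \(L_k=L_k^*\); the function names and the random test data are hypothetical and serve only this example.

```python
import numpy as np

def random_hermitian(n, rng):
    """Return a random n x n Hermitian matrix (illustrative test data)."""
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (a + a.conj().T) / 2

def comm(A, B):
    """Matrix commutator [A, B]."""
    return A @ B - B @ A

def laplacian(Ls, X):
    """Matricial Laplacian: -div(grad X) = -sum_k [L_k, [L_k, X]]."""
    return -sum(comm(L, comm(L, X)) for L in Ls)

def lindblad_rhs(H, Ls, rho):
    """Right-hand side of the Lindblad equation (10.1), with hbar = 1."""
    out = -1j * comm(H, rho)
    for L in Ls:
        Ld = L.conj().T
        out = out + L @ rho @ Ld - 0.5 * (rho @ Ld @ L + Ld @ L @ rho)
    return out

rng = np.random.default_rng(0)
n, N = 3, 2
H = random_hermitian(n, rng)
Ls = [random_hermitian(n, rng) for _ in range(N)]

# Random density matrix: positive definite with unit trace.
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
rho = B @ B.conj().T
rho = rho / np.trace(rho).real

# Hamiltonian part written as a divergence of i*rho, plus half the Laplacian.
via_operators = -comm(H, 1j * rho) + 0.5 * laplacian(Ls, rho)
gap = np.linalg.norm(lindblad_rhs(H, Ls, rho) - via_operators)
rate_of_trace = np.trace(lindblad_rhs(H, Ls, rho))
```

Here `gap` should be at round-off level, and `rate_of_trace` should vanish, reflecting that the Lindblad flow preserves the trace.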

10.3 Matricial Wasserstein 2-Metric

From here on we consider the continuity equation,

$$\begin{aligned} \dot{\rho }=\nabla _L^* (\rho \circ v), \end{aligned}$$
(10.3)

without the diffusion term, but for a general velocity field \(v\in {\mathscr {S}}^N\). A tacit assumption throughout is that the identity matrix I spans the null space of \(\nabla _L\); this can be ensured if one chooses \(L_1,\ldots ,L_N\) to form a basis of \({\mathscr {H}}\) (which is a sufficient, but not a necessary condition).

A Wasserstein distance between two density matrices can now be defined as the least action (minimum control problem) to steer one density matrix to another,

$$\begin{aligned} W_{2,a}(\rho _0, \rho _1)^2&:=\min _{\rho , v} \int _0^1 \langle v, \rho \circ v\rangle dt,\end{aligned}$$
(10.4a)
$$\begin{aligned}&\dot{\rho }=\nabla _L^* (\rho \circ v),\end{aligned}$$
(10.4b)
$$\begin{aligned}&\rho (0)=\rho _0, ~~\rho (1)=\rho _1. \end{aligned}$$
(10.4c)

In this, \(\rho _0\) and \(\rho _1\) are in \({\mathscr {D}}_+\) and the optimization is over \(\rho (\cdot )\in {\mathscr {D}}_+\) and \(v\in {\mathscr {S}}^N\). In fact, for \(v\in {\mathscr {S}}^N\), (10.3) already preserves positive definiteness and trace of \(\rho (\cdot )\).
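As a brief check of the trace claim, the following sketch uses only the definition of \(\nabla _L^*\) and the cyclic property of the trace; here J is shorthand for the flux \(\rho \circ v\in {\mathscr {S}}^N\) in (10.3):

$$ \frac{d}{dt}\text {tr}(\rho )=\text {tr}(\nabla _L^* J)=\sum _{k=1}^N \text {tr}(L_k J_k-J_k L_k)=0. $$

Similarly, \((L_kJ_k-J_kL_k)^*=L_kJ_k-J_kL_k\) whenever \(L_k\in {\mathscr {H}}\) and \(J_k\in {\mathscr {S}}\), so that \(\dot{\rho }\) remains Hermitian along the flow.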

The choice of the anticommutator product \(\rho \circ v = \frac{1}{2}(\rho v+v\rho )\) is especially appealing since, in this case, the matricial Wasserstein metric (10.4) is readily computable. Indeed, (10.4) can be cast as a convex optimization problem in a manner analogous to that in the scalar case [3]. To see this, let \(u:=\rho v=[u_1^*, \ldots ,u_N^*]^*\) and \(u_*:=[u_1, \ldots ,u_N]^*\), and observe that

$$\begin{aligned}&\text {tr}(\rho v^*v)=\sum _{k=1}^N \text {tr}(\rho v_k^*v_k)\\&=\sum _{k=1}^N \text {tr}((\rho v_k)^* \rho ^{-1} \rho v_k) =\text {tr}(u^*\rho ^{-1}u). \end{aligned}$$

Thus, (10.4) can be equivalently expressed as

$$\begin{aligned} W_{2,a}(\rho _0, \rho _1)^2=&\min _{\rho , u} \int _0^1 \text {tr}(u^*\rho ^{-1}u) dt,\end{aligned}$$
(10.5a)
$$\begin{aligned}&\dot{\rho }=\frac{1}{2} \nabla _L^* (u-u_*), \end{aligned}$$
(10.5b)
$$\begin{aligned}&\rho (0)=\rho _0, ~~\rho (1)=\rho _1 . \end{aligned}$$
(10.5c)

In this, it turns out that although we do not require any structural constraint on u, the optimal u satisfies \(u=\rho v\) for some \(v\in {\mathscr {S}}^N\).
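The two identities behind this reformulation, namely \(\text {tr}(\rho v^*v)=\text {tr}(u^*\rho ^{-1}u)\) and \(\rho v+v\rho =u-u_*\) for \(u=\rho v\) with skew-Hermitian v, can be checked numerically. The snippet below is a minimal sketch on randomly generated data; all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 3, 2

def random_hermitian():
    """Return a random n x n Hermitian matrix (illustrative test data)."""
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (a + a.conj().T) / 2

# Random positive-definite density matrix with unit trace.
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
rho = B @ B.conj().T + 0.1 * np.eye(n)
rho = rho / np.trace(rho).real
rho_inv = np.linalg.inv(rho)

v = [1j * random_hermitian() for _ in range(N)]  # skew-Hermitian velocity blocks
u = [rho @ vk for vk in v]                       # momentum blocks u_k = rho v_k

# Action density expressed in v and in u.
cost_v = sum(np.trace(rho @ vk.conj().T @ vk) for vk in v).real
cost_u = sum(np.trace(uk.conj().T @ rho_inv @ uk) for uk in u).real

# Flux identity rho v_k + v_k rho = u_k - u_k^*, block by block.
flux_gap = max(np.linalg.norm((rho @ vk + vk @ rho) - (uk - uk.conj().T))
               for vk, uk in zip(v, u))
```

The quantities `cost_v` and `cost_u` should agree to round-off, and `flux_gap` should vanish; the point of the change of variables is that \(\text {tr}(u^*\rho ^{-1}u)\) is jointly convex in \((\rho , u)\) while the constraint (10.5b) is linear.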

The choice of the Kubo-Mori product, on the other hand, provides a matricial version of the Wasserstein metric for which the gradient flow of the von Neumann entropy \(\text {tr}(\rho \log (\rho ))\) is the Lindblad equation [6, 8, 20]. Thus, it generalizes to the noncommutative setting the famous result by Jordan, Kinderlehrer, and Otto [16] for the ordinary Wasserstein-2 metric on probability densities, where the heat equation is the gradient flow of the entropy. However, it is interesting to note that computation of the Wasserstein metric for the Kubo-Mori product appears challenging as compared to the one based on the anticommutator product above.

To characterize the form of the minimizer one can proceed to consider the dual problem, which for the case of the anticommutator product goes as follows. With \(\lambda (\cdot ) \in {\mathscr {H}}\) a smooth Lagrange multiplier for the constraints, we construct the Lagrangian

$$\begin{aligned}&{\mathscr {L}}(\rho ,v,\lambda )=\int _0^1 \left\{ \frac{1}{2}\text {tr}(\rho v^*v)-\text {tr}(\lambda (\dot{\rho }-\frac{1}{2} \nabla _L^* (\rho v+v\rho )))\right\} dt \\&= \int _0^1\!\! \left\{ \!\frac{1}{2}\text {tr}(\rho v^*v)+\frac{1}{2}\text {tr}((\nabla _L\lambda )^* (\rho v+v\rho ))+ \text {tr}(\dot{\lambda }\rho )\!\right\} dt-\text {tr}(\lambda (1)\rho _1)+\text {tr}(\lambda (0)\rho _0). \end{aligned}$$

Point-wise minimization over v yields

$$ v_{opt}(t)=-\nabla _L \lambda (t) $$

and the expression for the corresponding minimum

$$\begin{aligned} \int _0^1 \left\{ -\frac{1}{2}\text {tr}(\rho (\nabla _L \lambda )^* (\nabla _L \lambda ))+ \text {tr}(\dot{\lambda }\rho )\right\} dt\\ - \text {tr}(\lambda (1)\rho _1)+\text {tr}(\lambda (0)\rho _0), \end{aligned}$$

from which we conclude the following sufficient condition for optimality: Suppose there exists \(\lambda (\cdot )\in {\mathscr {H}}\) satisfying

$$\begin{aligned} \dot{\lambda }=\frac{1}{2}(\nabla _L \lambda )^* (\nabla _L \lambda ) =\frac{1}{2}\sum _{k=1}^N(\nabla _L \lambda )_k^* (\nabla _L \lambda )_k \end{aligned}$$
(10.6a)

such that the solution of

$$\begin{aligned} \dot{\rho }=-\frac{1}{2} \nabla _L^* (\rho \nabla _L \lambda +\nabla _L \lambda \rho ) \end{aligned}$$
(10.6b)

matches the marginals \(\rho (0)=\rho _0, \rho (1)=\rho _1\). Then the pair \((\rho , v)\) with \(v=-\nabla _L \lambda \) solves (10.4).
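The point-wise minimization step in this argument can be illustrated numerically. The sketch below, on hypothetical random data, evaluates the v-dependent part of the Lagrangian integrand, \(\frac{1}{2}\text {tr}(\rho v^*v)+\frac{1}{2}\text {tr}((\nabla _L\lambda )^*(\rho v+v\rho ))\), and checks that \(v=-\nabla _L\lambda \) attains the closed-form minimum and is not improved upon by skew-Hermitian perturbations.

```python
import numpy as np

rng = np.random.default_rng(2)
n, N = 3, 2

def random_hermitian():
    """Return a random n x n Hermitian matrix (illustrative test data)."""
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (a + a.conj().T) / 2

def pointwise(rho, g, v):
    """v-dependent part of the Lagrangian integrand, summed over blocks."""
    total = 0.0
    for gk, vk in zip(g, v):
        total += 0.5 * np.trace(rho @ vk.conj().T @ vk).real
        total += 0.5 * np.trace(gk.conj().T @ (rho @ vk + vk @ rho)).real
    return total

Ls = [random_hermitian() for _ in range(N)]
lam = random_hermitian()
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
rho = B @ B.conj().T + 0.1 * np.eye(n)
rho = rho / np.trace(rho).real

g = [L @ lam - lam @ L for L in Ls]   # blocks of grad_L(lambda)
v_opt = [-gk for gk in g]             # candidate minimizer
f_opt = pointwise(rho, g, v_opt)
closed_form = -0.5 * sum(np.trace(rho @ gk.conj().T @ gk).real for gk in g)

# Value increase under random skew-Hermitian perturbations of v_opt.
excess = []
for _ in range(20):
    d = [1j * random_hermitian() for _ in range(N)]
    perturbed = [vk + dk for vk, dk in zip(v_opt, d)]
    excess.append(pointwise(rho, g, perturbed) - f_opt)
```

Every entry of `excess` should be positive, and `f_opt` should match `closed_form`, which is the first term of the expression for the minimum given above.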

The Wasserstein metric induces a Riemannian structure

$$ \langle \delta _1, \delta _2\rangle _\rho =\frac{1}{2}\text {tr}(\rho (\nabla _L\lambda _1)^*\nabla _L\lambda _2 +\rho (\nabla _L\lambda _2)^*\nabla _L\lambda _1) $$

on the tangent space of Hermitian matrices with a specified trace,

$$ T_\rho =\{\delta \in {\mathscr {H}}\mid \text {tr}(\delta )=0\}. $$

Here \(\lambda _j,~j=1, 2\) is the solution to the Poisson equation

$$\begin{aligned} \delta _j=-\frac{1}{2}\nabla _L^*(\rho \nabla _L\lambda _j+\nabla _L\lambda _j \rho ). \end{aligned}$$
(10.7)

The proof of existence and uniqueness of the solution of (10.7) follows exactly along the same lines as in [7]; details are given in [9]. In fact, given a tangent direction \(\delta \), \(-\nabla _L \lambda \) is the unique minimizer of \(\text {tr}(\rho v^*v)\) over all \(v\in {\mathscr {S}}^N\) satisfying

$$ \delta =\frac{1}{2}\nabla _L^*(\rho v+v \rho ). $$

With the above definition of inner product, \(W_{2,a}(\cdot , \cdot )\) indeed defines a metric on \({\mathscr {D}}_+\) for which \({\mathscr {D}}_+\) is a geodesic space, i.e., the distance between two given \(\rho _0,\rho _1\in {\mathscr {D}}_+\) can be rewritten as

$$ W_{2,a}(\rho _0, \rho _1)=\min _\rho \int _0^1 \sqrt{\langle \dot{\rho }(t),\dot{\rho }(t) \rangle _{\rho (t)}}dt, $$

where the minimum is taken over all piecewise smooth paths on the manifold \({\mathscr {D}}_+\).
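To make the Riemannian structure concrete, the Poisson equation (10.7) can be solved numerically for a given tangent direction. The following is a rough sketch rather than the method of [9]: it assembles the linear map \(\lambda \mapsto -\frac{1}{2}\nabla _L^*(\rho \nabla _L\lambda +\nabla _L\lambda \rho )\) as an \(n^2\times n^2\) matrix and uses a least-squares solve to handle the null space spanned by I. The random data, the helper names, and the cutoff `rcond=1e-10` are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(3)
n, N = 3, 3

def random_hermitian():
    """Return a random n x n Hermitian matrix (illustrative test data)."""
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (a + a.conj().T) / 2

def grad(Ls, X):
    return [L @ X - X @ L for L in Ls]

def div(Ls, Ys):
    return sum(L @ Y - Y @ L for L, Y in zip(Ls, Ys))

def poisson_op(Ls, rho, lam):
    """lam -> -1/2 div(rho grad(lam) + grad(lam) rho), as in (10.7)."""
    g = grad(Ls, lam)
    return -0.5 * div(Ls, [rho @ gk + gk @ rho for gk in g])

Ls = [random_hermitian() for _ in range(N)]
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
rho = B @ B.conj().T + 0.1 * np.eye(n)
rho = rho / np.trace(rho).real

# Tangent direction: Hermitian and traceless.
delta = random_hermitian()
delta = delta - (np.trace(delta) / n) * np.eye(n)

# Matrix of the (complex-linear) operator in the basis of elementary matrices.
M = np.zeros((n * n, n * n), dtype=complex)
for j in range(n * n):
    E = np.zeros(n * n, dtype=complex)
    E[j] = 1.0
    M[:, j] = poisson_op(Ls, rho, E.reshape(n, n)).ravel()

# Minimum-norm least-squares solution; the cutoff discards the null direction I.
lam = np.linalg.lstsq(M, delta.ravel(), rcond=1e-10)[0].reshape(n, n)
residual = np.linalg.norm(poisson_op(Ls, rho, lam) - delta)

# Squared length of delta in the metric: tr(rho (grad lam)^* (grad lam)).
norm_sq = sum(np.trace(rho @ gk.conj().T @ gk).real for gk in grad(Ls, lam))
```

For generic \(L_k\) the residual should be at round-off level and the computed \(\lambda \) Hermitian; `norm_sq` then gives \(\langle \delta ,\delta \rangle _\rho \), which should also equal \(-\text {tr}(\lambda \delta )\).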

We finally note that, more generally, OMT can be formulated on the space of matrix-valued distributions. In this case, the mass constraint becomes \(\int \text {tr}\rho (x)dx=1\), where x represents a vector of spatial coordinates and dx the volume element. Transport along spatial coordinates, e.g., with \( x\in {\mathbb R}^m\), is effected by a term \( \nabla _x\cdot (\rho \circ w)\) in the continuity equation, with \(w\in {\mathscr {H}}^m\), i.e.,

$$ \dot{\rho }= \nabla _L^* (\rho \circ v) - \nabla _x \cdot (\rho \circ w). $$

Likewise, the cost of transport is duly penalized in a corresponding problem to minimize a suitable action integral; see [8] for details.

10.4 Vector-Valued Transport on \({\mathbb R}^N\)

A vector-valued density \(\rho =[\rho _1,\rho _2,\cdots ,\rho _\ell ]^T\), on \({\mathbb R}^N\) or on a discrete space, may represent power reflected off a surface at different frequencies/colors. The “mass” of these components may transfer between different entries of the density vector (e.g., due to different angles of reflection) along time flows of the vectorial density. Thus, while the total power may be invariant (under some lighting conditions), the proportion of power at different frequencies or polarization may smoothly vary with viewing angle. As another example consider the case where the entries of \(\rho \) represent densities of different species, or particles, and allow for the possibility that mass transfers from one species to another (“mutate”), i.e., between entries of \(\rho \). Thus, in general, we postulate that transport of vector-valued quantities captures flow across space as well as between entries of the density vector.

In this context, an OMT-inspired geometry aims to express a suitable continuity equation and to quantify transport cost for such vectorial distributions. We highlight some of the key elements in [10] for such a theory. It follows a line which is analogous to the development of quantum transport that was discussed above.

We begin by considering a vector-valued density \(\rho \) on \({\mathbb R}^N\), i.e., a map from \({\mathbb R}^N\) to \({\mathbb R}^\ell _+\) such that

$$ \sum _{i=1}^\ell \int _{{\mathbb R}^N} \rho _i(x)dx=1, $$

and consider the entries of \(\rho \) as representing density or mass of species/particles that can mutate between one another while maintaining total mass. We denote the set of all vector-valued densities and its interior by \({\mathscr {D}}\) and \({\mathscr {D}}_+\), respectively. The dynamics are described by the following continuity equation:

$$\begin{aligned} \frac{\partial \rho _i}{\partial t}+\nabla _x\cdot (\rho _i v_i)-\sum _{j\ne i} (\rho _j w_{ji}-\rho _i w_{ij})=0, ~~\forall i=1,\ldots ,\ell . \end{aligned}$$
(10.8)

Here \(v_i\) is the velocity field of particles i and \(w_{ij}\ge 0\) is the transfer rate from i to j. Equation (10.8) allows for the possibility to mutate between each pair of entries. More generally, mass transfer may only be permissible between specific types of particles and can be modeled by a graph \({\mathscr {G}}=(\mathcal{V},\mathcal{E})\) (where the entries denote nodes and edges, respectively), in which case, for a subset of indices, the transfer rates \(w_{ji}\) may be restricted to be zero.
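That (10.8) maintains total mass can be seen by a short computation, sketched here under the assumption that the spatial fluxes \(\rho _i v_i\) decay fast enough at infinity for their divergence to integrate to zero. Summing (10.8) over i and integrating over \({\mathbb R}^N\) gives

$$ \frac{d}{dt}\sum _{i=1}^\ell \int _{{\mathbb R}^N} \rho _i\, dx=\int _{{\mathbb R}^N}\sum _{i=1}^\ell \sum _{j\ne i} (\rho _j w_{ji}-\rho _i w_{ij})\, dx=0, $$

since each transfer term \(\rho _j w_{ji}\) enters once with a positive sign (in the equation for \(\rho _i\)) and once with a negative sign (in the equation for \(\rho _j\)).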

Given \(\mu ,\nu \in {\mathscr {D}}_+\), we formulate the optimal mass transport

$$\begin{aligned} W_{2}(\mu ,\nu )^2 :=&\inf _{ \rho ,v,w} \int _0^1\int _{{\mathbb R}^N}\left\{ \sum _{i=1}^\ell \rho _i(t,x) \Vert v_i(t,x)\Vert ^2+\gamma \sum _{i,j=1}^\ell \rho _iw_{ij}^2(t,x)\right\} dxdt \qquad \qquad \\&\frac{\partial \rho _i}{\partial t}+\nabla _x\cdot (\rho _i v_i)-\sum _{j\ne i} (\rho _j w_{ji}-\rho _i w_{ij})=0, ~~\forall i=1,\ldots ,\ell \qquad \qquad \nonumber \\&w_{ij}(t,x) \ge 0, ~~\forall i, j, t, x\qquad \qquad \nonumber \\&\rho (0,\cdot )=\mu (\cdot ), ~ \rho (1,\cdot )=\nu (\cdot ).\qquad \qquad \nonumber \end{aligned}$$
(10.9)

The coefficient \(\gamma >0\) specifies the relative cost between transporting mass in space and trading mass between different types of particles. When \(\gamma \) is large, the solution reduces to independent OMT problems for the different entries to the degree possible. In general, \(W_{2}\) is a quasi-metric in that it satisfies the triangle inequality and positivity, but may not be symmetric.

Setting \(p_{ij}=\rho _i w_{ij}\ge 0\) and \(u_i=\rho _i v_i\), we have \( \rho _iw_{ij}^2=\rho _i^{-1} p_{ij}^2, \) and \( \rho _i\Vert v_i\Vert ^2=\rho _i^{-1}\Vert u_i\Vert ^2. \) It follows that

$$\begin{aligned} W_{2}(\mu ,\nu )^2 \!=\!&\inf _{\rho , u,p} \int _0^1\int _{{\mathbb R}^N}\left\{ \sum _{i=1}^\ell \rho _i(t,x)^{-1} \Vert u_i(t,x)\Vert ^2+\gamma \sum _{i,j=1}^\ell \rho _i^{-1}p_{ij}^2(t,x)\right\} dxdt \\&\frac{\partial \rho _i}{\partial t}+\nabla _x\cdot u_i-\sum _{j\ne i} (p_{ji}-p_{ij})=0, ~~\forall i=1,\ldots ,\ell \nonumber \\&p_{ij}(t,x) \ge 0, ~~\forall i, j, t, x\nonumber \\&\rho (0,\cdot )=\mu (\cdot ), ~ \rho (1,\cdot )=\nu (\cdot ),\nonumber \end{aligned}$$

which is a convex problem.
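Convexity rests on the fact that \((\rho , p)\mapsto p^2/\rho \) is jointly convex for \(\rho >0\) (and likewise \(\Vert u\Vert ^2/\rho \)), while the constraints are linear in \((\rho , u, p)\). The snippet below is a small numerical illustration on random scalar samples; the name `transfer_cost` and the sampling ranges are assumptions made for the example.

```python
import numpy as np

def transfer_cost(rho, p):
    """Cost density p^2 / rho, equal to rho * w^2 when p = rho * w."""
    return p ** 2 / rho

rng = np.random.default_rng(4)
rho_a, rho_b = rng.uniform(0.1, 2.0, size=(2, 1000))
p_a, p_b = rng.uniform(0.0, 2.0, size=(2, 1000))
theta = rng.uniform(0.0, 1.0, size=1000)

# Cost of the convex combination versus convex combination of the costs.
mix = transfer_cost(theta * rho_a + (1 - theta) * rho_b,
                    theta * p_a + (1 - theta) * p_b)
chord = (theta * transfer_cost(rho_a, p_a)
         + (1 - theta) * transfer_cost(rho_b, p_b))
jensen_gap = chord - mix  # nonnegative for a jointly convex function

# The change of variables p = rho * w leaves the cost density unchanged.
w = rng.uniform(0.0, 1.0, size=1000)
change_of_vars_gap = np.max(np.abs(rho_a * w ** 2
                                   - transfer_cost(rho_a, rho_a * w)))
```

In the original variables the term \(\rho _i w_{ij}^2\) is not jointly convex in \((\rho _i, w_{ij})\) and the continuity equation is bilinear, which is why the momentum variables are preferred for computation.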

Finally, a Riemannian-like metric on \({\mathscr {D}}_+\) can be obtained by symmetrizing the above expression [10]. That is,

$$\begin{aligned} W_{2, \mathrm {sym}}(\mu ,\nu )^2&=\inf _{\rho ,u,p} \int _0^1\int _{{\mathbb R}^N}\left\{ \sum _{i=1}^\ell \rho _i(t,x)^{-1} \Vert u_i(t,x)\Vert ^2\right. \\\nonumber&\left. +\frac{\gamma }{2} \sum _{i,j=1}^\ell (\rho _i^{-1}+\rho _j^{-1})p_{ij}^2(t,x)\right\} dxdt \end{aligned}$$
(10.10)

under the same constraints. This vector-valued OMT structure is further explored and developed in [10].

10.5 Vector-Valued Transport on Graphs

We conclude by highlighting elements of an OMT theory solely on graphs, cast in the setting of vector-valued densities [10]. As explained earlier, such densities may represent the distribution of multiple species/resources that are allowed to mutate between each other as they transition from node to node. The theory aims to capture the cost of transport in such a setting.

A vector-valued mass distribution on the graph \({\mathscr {G}}=(\mathcal{V},\mathcal{E})\) (with n nodes and m edges) is an \(\ell \)-tuple \(\rho =(\rho _1,\cdots ,\rho _\ell )\) with each \(\rho _i=(\rho _{i,1},\cdots ,\rho _{i,n})^T\) being a vector in \({\mathbb R}_+^n\) such that

$$ \sum _{i=1}^\ell \sum _{k=1}^n \rho _{i,k}=1. $$

That is, each entry \(\rho _i\), for \(i\in \{1,\ldots ,\ell \}\), is a vector with n nonnegative entries representing, e.g., color intensity for the i-th color, at the node corresponding to the respective entry. We denote the set of all nonnegative vector-valued mass distributions by \({\mathscr {D}}\) and its interior by \({\mathscr {D}}_+\). Equation (10.8) is now replaced by the following continuity equation:

$$\begin{aligned} \dot{\rho }_i-\nabla _{\mathscr {G}}^*((D_2^T\rho _i)\circ v_i-(D_1^T\rho _i)\circ \bar{v}_i) -\sum _{j\ne i} (\rho _j\circ w_{ji}-\rho _i\circ w_{ij})=0, ~~\forall i=1,\ldots ,\ell , \end{aligned}$$
(10.11)

since the spatial domain is now also discrete (i.e., it is \({\mathscr {G}}\) instead of \({\mathbb R}^N\)). Here, \(D=D_1-D_2\) is the incidence matrix of the graph, where \(D_1\) and \(D_2\) are matrices with nonnegative entries that mark the position of the sources (\(D_1\)) and sinks (\(D_2\)) of the edges by an entry equal to 1 in the corresponding place. Thus, the vector \(D_1^T\rho \) represents density at the sources of the edges and, likewise, \(D_2^T\rho \) represents density at the sinks. Also, \(\nabla _{\mathscr {G}}\) represents differencing between neighboring nodes and \(\nabla ^*_{\mathscr {G}}\) represents its dual (i.e., negative divergence) [10]. Finally, \(\circ \) represents entry-wise (Schur) multiplication between two vectors.
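For concreteness, the matrices \(D_1\), \(D_2\), and D can be assembled for a small directed graph as follows. This is an illustrative sketch only: the three-node graph is hypothetical, and both the \(n\times m\) (nodes by edges) layout and the identification of \(\nabla _{\mathscr {G}}\) with \(D^T\) are conventions assumed here that may differ from those in [10].

```python
import numpy as np

# Hypothetical example graph: edges 0 -> 1, 1 -> 2, and 0 -> 2.
edges = [(0, 1), (1, 2), (0, 2)]
n, m = 3, len(edges)

D1 = np.zeros((n, m))  # D1[k, e] = 1 when node k is the source of edge e
D2 = np.zeros((n, m))  # D2[k, e] = 1 when node k is the sink of edge e
for e, (src, dst) in enumerate(edges):
    D1[src, e] = 1.0
    D2[dst, e] = 1.0
D = D1 - D2            # incidence matrix

rho_i = np.array([0.5, 0.3, 0.2])  # one entry of the vector-valued density
at_sources = D1.T @ rho_i          # density at the source of each edge
at_sinks = D2.T @ rho_i            # density at the sink of each edge

f = np.array([1.0, 4.0, 9.0])
edge_diff = D.T @ f                # f(source) - f(sink) on each edge

flux = np.array([0.1, 0.2, 0.3])   # an arbitrary edge flux
node_change = D @ flux             # net change induced at each node
```

Because every column of D sums to zero, the entries of `node_change` add up to zero: transport along edges redistributes mass between nodes without creating or destroying it.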

Now following the Benamou–Brenier [3] philosophy once again, given two marginal densities \(\mu , \nu \in {\mathscr {D}}_+\), we define their Wasserstein distance as

$$\begin{aligned} W_{2}(\mu ,\nu )^2 :=\!\inf _{\rho , v,w} \int _0^1\!\!\left\{ \sum _{i=1}^\ell [v_i^T ((D_2^T\rho _i)\circ v_i)+ \bar{v}_i^T((D_1^T\rho _i)\circ \bar{v}_i)]+\gamma \sum _{i,j=1}^\ell \sum _{k=1}^n \rho _{i,k}w_{ij,k}^2\!\right\} dt \end{aligned}$$

subject to (10.11) as well as \(w_{ij} \ge 0, ~v_i \ge 0, ~\bar{v}_i \ge 0\) for all (or a subset) of the pairs (i, j), and \(\rho (0)=\mu , ~ \rho (1)=\nu .\)

We should note that the problem of transporting vector-valued mass on a graph is considerably simpler than in the case where the underlying space is continuous, since it reduces essentially to a scalar mass situation on a suitably larger graph. Indeed, we can view the vector-valued mass as a scalar mass distribution on \(\ell \) identical layers of the graph \({\mathscr {G}}\) where the same nodes at different layers are connected through a complete graph. The two velocity fields v, w represent mass transfer within the same layer and between different layers, respectively.
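The layered construction can be sketched in a few lines. The function below is a hypothetical helper written for this illustration (it is not from [10]): it lists the nodes of the enlarged graph as pairs (layer, node), copies the edges of \({\mathscr {G}}\) into every layer, and joins the copies of each node across layers by a complete graph.

```python
from itertools import combinations

def layered_graph(n_nodes, edges, n_layers):
    """Build the enlarged graph carrying a scalar mass distribution.

    Node (i, k) is node k of the original graph in layer i.
    Intra-layer edges copy `edges` into each layer; inter-layer edges
    join every pair of layers at the same original node k.
    """
    nodes = [(i, k) for i in range(n_layers) for k in range(n_nodes)]
    intra = [((i, a), (i, b)) for i in range(n_layers) for a, b in edges]
    inter = [((i, k), (j, k))
             for k in range(n_nodes)
             for i, j in combinations(range(n_layers), 2)]
    return nodes, intra, inter

# Example: a triangle graph with three layers (e.g., three colors).
nodes, intra, inter = layered_graph(3, [(0, 1), (1, 2), (0, 2)], n_layers=3)
```

With n nodes, m edges, and \(\ell \) layers, the enlarged graph has \(n\ell \) nodes, \(m\ell \) intra-layer edges, and \(n\ell (\ell -1)/2\) inter-layer connections; restricting which layers may exchange mass amounts to dropping some of the inter-layer connections.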

Once again, the computation of the metric has a convex formulation, obtained by changing the optimization variables from \((\rho ,v_i,\bar{v}_i, w_{ij})\) to the “mass” \(\rho \) together with the momenta \(u_i=\rho _iv_i\) and \(p_{ij}=\rho _i w_{ij}\).

10.6 Conclusion

The basic fluid dynamical formulation of OMT can be generalized to flows on the space of matrices or vectors that belong to a simplex of a suitable positive cone. A Wasserstein metric in these spaces can then be defined as a minimal quadratic cost for transferring between two end points. Such metrics appear natural since, in particular, for the space of quantum density matrices they render the Lindblad equation as the gradient flow of the von Neumann entropy. Our interest stems from problems in signal analysis and, more specifically, spectral and image analysis. In both of these application areas, the relevance of a weakly continuous metric that can be used to quantify distances between, e.g., matrix-valued power spectra or multicolor images, is self-evident. In particular, geodesics in such spaces naturally model flows and allow morphing between spectra and images, respectively.