Fig. 1

Left RGB image of size \(4 \times 2\). Right Image as three-dimensional \(4 \times 2 \times 3\) array. The “mass” is the sum of the RGB values of all pixels (Color figure online)

Fig. 2

Left RGB color cube. Middle/Right Color transfer between (1, 0, 0) (red) and (0, 0, 1) (blue) for mirror (middle) and periodic (right) boundary conditions visualized in the RGB cube (Color figure online)

1 Introduction

Color image processing is much more challenging than gray-value image processing, and approaches for gray-value images usually do not generalize in a straightforward way to color images. For example, one-dimensional histograms are a simple but powerful tool in gray-value image processing, while it is in general difficult to exploit histograms of color images. Moreover, there exist several possibilities to represent color images [10]. In this paper we consider the interpolation between two color images in the RGB space (see Fig. 2 (left)), motivated by the fluid mechanics formulation of dynamic optimal transport by Benamou and Brenier [8] and the recent approaches of Papadakis et al. [35] and of [33] for gray-value images. In these works, gray-value images are interpreted as two-dimensional, finitely supported density functions \(f_0\) and \(f_1\) of absolutely continuous probability measures \(\mu _0\) and \(\mu _1 \) (i.e., \(\mu _i(A)=\int _A f_i \,\mathrm {d}x\), \(i=0,1\)). In particular, we have \(\int _{\mathbb R^2} f_0 \,\mathrm {d}x= \int _{\mathbb R^2} f_1 \,\mathrm {d}x= 1\). Therewith, intermediate images are obtained as the densities \(f_t\) of the geodesic path \(\mathrm {d}\mu _t=f_t \,\mathrm {d}x\) with respect to the Wasserstein distance between \(\mu _0\) and \(\mu _1\).

In this paper, we extend the transport model to discrete RGB color images. The incorporation of color into the above approach appears to be a non-trivial task, and this paper is a first proposal in this direction. We consider \(N_1 \times N_2\) RGB images as three-dimensional arrays in \(\mathbb R^{N_1,N_2,3}\), where the third direction is the “RGB direction” that contains the color information; for an illustration see Fig. 1. Particular attention has to be paid to this direction and its boundary conditions. We propose to use periodic boundary conditions, which is motivated as follows: assume we are given two color pixels \(f_0\) and \(f_1\in \mathbb {R}^{1\times 1\times 3}\). Using mirror (Neumann) boundary conditions in the third dimension, the transport of, e.g., a red pixel \(f_0 = (1,0,0)\) into a blue one \(f_1 = (0,0,1)\) passes through (0, 1, 0) (green), see Fig. 2 (middle), which is not intuitive from the viewpoint of human color perception. Furthermore, it implies that the transport depends on the order of the three color channels, which is clearly not desirable. As a remedy, we suggest using periodic boundary conditions in the color dimension. For a red and a blue pixel, this yields a transport via violet, which is also what one would expect; compare Fig. 2 (right) and see also Figs. 5 and 6.
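The effect of the two boundary conditions can already be seen on the channel index alone. The following small sketch (our own illustration, not part of the model above) compares the distance between the red and blue channel positions measured along the interval (mirror case) and on a cycle of length three (periodic case), where the shorter wrap-around route corresponds to the violet side of the RGB cube:

```python
# Illustrative sketch: distance between channel positions R=0, G=1, B=2
# under mirror (interval) vs. periodic (cyclic) boundary conditions.

def mirror_dist(i, j):
    # transport along the interval: red -> blue must pass the green position
    return abs(i - j)

def periodic_dist(i, j, n=3):
    # transport on a cycle of length n: take the shorter way around
    d = abs(i - j) % n
    return min(d, n - d)

R, G, B = 0, 1, 2
print(mirror_dist(R, B))    # 2: the path passes green
print(periodic_dist(R, B))  # 1: the wrap-around route (via violet) is shorter
```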

In the following, we propose two variational models for the transport of color images. To handle also the case of unequal masses, the mass conservation constraint is relaxed. Our first model contains the continuity equation as a constraint. It turns out that it generalizes the technique proposed in [35] for gray-value images. The second model penalizes the continuity equation, similarly to what was considered for (continuous) gray-value images in [33]. Other approaches to unbalanced optimal transport can be found, e.g., in [7, 18, 26, 28].

The interpolation based on an optimal transport model for a special class of images, namely so-called microtextures, has been addressed by Rabin et al. in [37, 41]. The authors show that microtextures can be well modeled as a realization of a Gaussian random field. In this case, theoretical results guarantee that the intermediate measures \(\mu _t\) are Gaussian as well and they can be stated explicitly in terms of the means and covariance matrices learnt from the given images \(f_0\) and \(f_1\). The idea can be generalized for interpolating between more than two microtextures using barycentric coordinates. The approach fails for some special microtexture settings which usually do not appear in practice.

Color interpolation between images of the same shape can be realized by switching to the HSV or HSI space. Then only the hue component has to be transferred, using, e.g., a dynamic extension of the model for the transfer of cyclic measures (periodic histograms) of Delon et al. [25, 40]. For an example we refer to [27]. This idea is closely related to the affine model for color image enhancement in [34]. However, these approaches transfer only the color and leave the original edge structure of the image untouched.

Finally we mention that the interpolation between images can also be tackled by other sophisticated techniques such as metamorphoses, see [49]. These approaches are beyond the scope of this paper. A combination of the optimal dynamic transport model with the metamorphosis approach was proposed in [33].

The outline of our paper is as follows: in Sect. 2 we recall basic results from the theory of optimal transport. At this point, we deal with general \(p \in (1,2]\) instead of just \(p=2\) as in [35]. We propose discrete dynamic transport models in Sect. 3. Here we prefer to give a matrix–vector notation of the problem to make it more intuitive from a linear algebra point of view. We prove the existence of a minimizer and show that there are special settings where the minimizer is not unique. In Sect. 4 we solve the resulting minimization problems by primal-dual minimization algorithms. It turns out that one step of the algorithm requires the solution of a four-dimensional Poisson equation which includes various boundary conditions and can be handled by fast trigonometric transforms. Another step involves finding the positive root of a polynomial of degree \(2q-1\), where \(\frac{1}{p} + \frac{1}{q} =1\) and \(p \in (1,2]\). For this task we propose to use Newton’s algorithm and determine an appropriate starting point to ensure its quadratic convergence. Section 5 shows numerical results, some of which were also reported at the SampTA conference 2015 [27]. More examples can be found on our website http://www.mathematik.uni-kl.de/imagepro/members/laus/color-OT. Finally, Sect. 6 contains conclusions and ideas for future work. In particular, additional priors may be used to improve the dynamic transport, e.g., a total variation prior to avoid smearing effects. The Appendix reviews the diagonalization of certain discrete Laplace operators, and provides basic rules for tensor product computations. Further it contains some technical proofs.

2 Dynamic Optimal Transport

In this section we briefly review some basic facts on the theory of optimal transport. For further details we refer to, e.g.,  [1, 43, 50].

Let \(\mathcal {P}(\mathbb {R}^d)\) be the space of probability measures on \(\mathbb {R}^d\) and \(\mathcal {P}_p(\mathbb {R}^d)\), \(p \in [1,\infty )\), the Wasserstein space of measures having finite p-th moments

$$\begin{aligned} \mathcal{P}_p(\mathbb R^{d}) \,:=\, \left\{ \mu \in \mathcal{P}(\mathbb R^{d}): \int _{\mathbb R^{d}} |x|^p \,\mathrm {d}\mu (x) < +\infty \right\} . \end{aligned}$$

For \(\mu _0, \mu _1\in \mathcal {P}(\mathbb {R}^d)\) let \(\varPi (\mu _0,\mu _1)\) be the set of all probability measures on \(\mathbb {R}^d\times \mathbb {R}^d\) whose marginals are equal to \(\mu _0\) and \(\mu _1\). Therewith, the optimal transport problem (Kantorovich problem) reads as

$$\begin{aligned} \mathop {\mathrm{argmin}}_{\pi \in \varPi (\mu _0,\mu _1) } \int _{\mathbb R^{d}} |x-y|^p \,\mathrm {d}\pi (x,y). \end{aligned}$$

One can show that for \(p \in [1,\infty )\) a minimizer exists, which is uniquely determined for \(p>1\) and is also called an optimal transport plan. In the special case of the one-dimensional optimal transport problem, if the measure \(\mu _0\) is non-atomic, the optimal transport plan is the same for all \(p \in (1,\infty )\) and can be stated explicitly in terms of the cumulative distribution functions of the involved measures. The minimal value
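For discrete measures the one-dimensional solution is easy to sketch. The following illustrative implementation (ours, not from the paper; it assumes two empirical measures with equally many, equally weighted support points) uses the fact that in one dimension the optimal plan matches sorted samples, for every \(p \in (1,\infty )\):

```python
import numpy as np

def wasserstein_1d(x, y, p=2.0):
    """W_p between two empirical measures with the same number of
    equally weighted support points: match sorted samples (the
    one-dimensional optimal plan) and average the p-th powers."""
    xs, ys = np.sort(x), np.sort(y)
    return float(np.mean(np.abs(xs - ys) ** p) ** (1.0 / p))

def displacement_1d(x, y, t):
    """Intermediate measure mu_t: linearly interpolate matched samples."""
    return (1 - t) * np.sort(x) + t * np.sort(y)

x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 2.0, 3.0])
print(wasserstein_1d(x, y))        # 1.0: every support point moves by 1
print(displacement_1d(x, y, 0.5))  # [0.5 1.5 2.5]
```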

$$\begin{aligned} W_p(\mu _0,\mu _1) \,:=\, \left( \min _{\pi \in \varPi (\mu _0,\mu _1) } \int _{\mathbb R^{d}} |x-y|^p \,\mathrm {d}\pi (x,y)\right) ^{\frac{1}{p}} \end{aligned}$$

defines a distance on \(\mathcal {P}_p(\mathbb {R}^d)\), the so-called Wasserstein distance.

Wasserstein spaces \(\left( \mathcal{P}_p(\mathbb {R}^d), W_p\right) \) are geodesic spaces. In particular, for any \(\mu _0,\mu _1 \in \mathcal{P}_p(\mathbb R^{d})\) there exists a geodesic \(\gamma :[0,1] \rightarrow \mathcal{P}_p(\mathbb R^{d})\) with \(\gamma (0) = \mu _0\) and \(\gamma (1) = \mu _1\). For interpolating our images we ask for \(\mu _t = \gamma (t)\), \(t \in [0,1]\).

At least theoretically there are several ways to compute \(\mu _t\). If the optimal transport plan \(\pi \) is known, then \(\mu _t = \mathcal{L}_t {}_\# \pi \, {:=} \pi \circ \mathcal{L}_t^{-1}\) yields the geodesic path, where \(\mathcal{L}_t:\mathbb R^{d} \times \mathbb R^{d} \rightarrow \mathbb R^{d}\), \(\mathcal{L}_t(x,y) = (1-t)x + ty\) is the linear interpolation map, see [43]. This requires the knowledge of the optimal transport plan \(\pi \) and of \(\mathcal{L}_t^{-1}\). At the moment, efficient ways of computing the optimal transport plan \(\pi \) exist only in special cases, in particular in the one-dimensional case via an ordering procedure and for Gaussian distributions in the case \(p=2\) using the means and covariance matrices. For \(p=2\) one can also use the fact that \(\pi \) is indeed induced by a transport map \(T:\mathbb R^d \rightarrow \mathbb R^d\), i.e., \(\pi = (\mathrm{id}, T)_{\#}\mu _0\), which can be written as \(T = \nabla \psi \), where \(\psi \) fulfills the Monge–Ampère equation, see [15]. However, this second-order nonlinear elliptic PDE is numerically hard to solve and so far only some special cases have been considered [6, 19, 20, 32]. Other numerical techniques to compute optimal transport plans have been proposed, e.g., in [2, 30, 44, 45]. Another approach relaxes the problem by minimizing an entropy-regularized Wasserstein distance instead; such distances can be computed more efficiently by the Sinkhorn algorithm and were applied within a barycentric approach by Cuturi et al. [21, 22].

The approach in this paper was inspired by the one of Benamou and Brenier in [8]. It involves the velocity field \(\mathrm {v}:[0,1]\times \mathbb {R}^d\rightarrow \mathbb {R}^d\) of the geodesic curve joining \(\mu _0\) and \(\mu _1\). This velocity field \(\mathrm{v}(t, \cdot )\) has constant speed \(\Vert \mathrm{v}(t, \cdot )\Vert _{L^p(\mu _t)} = W_p(\mu _0,\mu _1)\). It can be shown that it minimizes the energy functional

$$\begin{aligned} \mathcal{E}_p(\mathrm{v},\mu ) \,{:=}\, \int _{0}^{1} \int _{\mathbb R^{d}} \frac{1}{p} |\mathrm{v}(x,t)|^p \,\mathrm {d}\mu _t(x)\,\mathrm {d}t \end{aligned}$$

and fulfills the continuity equation

$$\begin{aligned} \partial _t \mu _t + \nabla _x \cdot (\mu _t \mathrm{v}(t, \cdot )) = 0, \end{aligned}$$

where we say that \(t \mapsto \mu _t\) is a measure-valued solution of the continuity equation if for all compactly supported \(\phi \in C^{1}\bigl ((0,1)\times \mathbb R^d\bigr )\) and \(T\in (0,1)\) the relation

$$\begin{aligned} \int _{0}^{T} \int _{\mathbb R^d} \partial _t \phi (x,t) + \langle \mathrm{v}(x,t),\nabla _x \phi (x,t)\rangle \,\mathrm {d}\mu _t(x)\,\mathrm {d}t = 0 \end{aligned}$$

holds true. For more details we refer to [43].

Assuming the measures \(\mu _0\) and \(\mu _1\) to be absolutely continuous with respect to the Lebesgue measure, i.e., \(\mathrm {d}\mu _i = f_i \,\mathrm {d}x\), \(i=0,1\), theoretical results (see for instance [50, Theorem 8.7]) guarantee that the same holds true for \(\mathrm {d}\mu _t = f_t \,\mathrm {d}x\), where \(f_t\) can be obtained as the minimizer over \(\mathrm{v},f\) of

$$\begin{aligned} \mathcal{E}_p(\mathrm{v},f)= \int _0^1\int _{\mathbb R^d} \frac{1}{p} |\mathrm{v}(x,t)|^p f(x,t) \, \,\mathrm {d}x\,\mathrm {d}t \end{aligned}$$
(1)

subject to the continuity equation

$$\begin{aligned}&\partial _t f(x,t) + \nabla _x \cdot (\mathrm{v}(x,t) f(x,t)) = 0,\\&f(0,\cdot )=f_0, \; f(1,\cdot )=f_1, \end{aligned}$$

where we suppose \(\cup _{t \in [0,1]}{{\mathrm{supp}}}f(\cdot ,t) \subseteq [0,1]^d\) with appropriate (spatial) boundary conditions. Unfortunately, the energy functional (1) is not convex in f and \(\mathrm{v}\). As a remedy, Benamou and Brenier suggested in the case \(p=2\) the change of variables \( (f,\mathrm{v})\mapsto (f,f\mathrm{v}) = (f,m). \) This idea can be generalized to \(p \in (1,\infty )\), see [43], which results in the functional

$$\begin{aligned} \int _{0}^{1}\int _{\mathbb {R}^d} J_p\bigl (m(x,t),f(x,t)\bigl )\,\mathrm {d}x\,\mathrm {d}t, \end{aligned}$$
(2)

where \(J_p:\mathbb R^d \times \mathbb R \rightarrow \mathbb R \cup \{+ \infty \}\) is defined as

$$\begin{aligned} J_p (x,y) \, {:=} \, {\left\{ \begin{array}{ll} \frac{1}{p} \frac{|x|^{p}}{y^{p-1}} &{} \mathrm{if } \; y>0,\\ 0 &{} \mathrm{if } \; (x,y)=(0,0),\\ +\infty &{}\mathrm{otherwise} \end{array}\right. } \end{aligned}$$
(3)

and \(|\cdot |\) denotes the Euclidean norm. This functional has to be minimized subject to the continuity equation

$$\begin{aligned} \partial _t f(x,t) + \nabla _x \cdot m(x,t) = 0, \end{aligned}$$
(4)
$$\begin{aligned} f(0,\cdot )=f_0, \; f(1,\cdot )=f_1, \end{aligned}$$
(5)

equipped with appropriate spatial boundary conditions.

Remark 1

The function \(J_p:\mathbb R^d \times \mathbb R \rightarrow \mathbb R \cup \{+ \infty \}\) defined in (3) is the perspective function of \(\psi (s) = \frac{1}{p} |s|^p\), i.e., \(J_p (x,y) = y \psi \left( \frac{x}{y} \right) \). For properties of perspective functions see, e.g., [23]. In particular, since \(\psi \) is convex for \(p \in (1,\infty )\), its perspective \(J_p (x,y)\) is also convex. Further, \(J_p (x,y)\) is lower semi-continuous and positively homogeneous, i.e., \(J_p (\lambda x, \lambda y) = \lambda J_p (x,y)\) for all \(\lambda > 0\).
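A direct scalar implementation of (3) makes these properties easy to check numerically. The following sketch is our own illustration; it evaluates \(J_p\) componentwise and verifies positive homogeneity and midpoint convexity at sample points:

```python
def J_p(x, y, p=2.0):
    """Scalar version of the function J_p from (3), the perspective
    function of psi(s) = |s|^p / p."""
    if y > 0:
        return abs(x) ** p / (p * y ** (p - 1))
    if x == 0 and y == 0:
        return 0.0
    return float("inf")

# positive homogeneity: J_p(l*x, l*y) = l * J_p(x, y) for l > 0
assert abs(5.0 * J_p(3.0, 2.0) - J_p(15.0, 10.0)) < 1e-12

# convexity, checked at a midpoint: J_p((a+b)/2) <= (J_p(a) + J_p(b)) / 2
a, b = (1.0, 1.0), (3.0, 2.0)
mid = J_p(0.5 * (a[0] + b[0]), 0.5 * (a[1] + b[1]))
assert mid <= 0.5 * (J_p(*a) + J_p(*b))
```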

3 Discrete Transport Models

In practice we are dealing with discrete images whose pixel values are given on a rectangular grid. To get a discrete version of the minimization problem we have to discretize both the integration operator in (2) by a quadrature rule and the differential operators in the continuity equation (4). The discretization of the continuity equation requires the evaluation of discrete “partial derivatives” in time as well as in space. In order to avoid solutions suffering from the well-known checkerboard effect (see for instance [36]) we adopt the idea of a staggered grid as in [35], see Fig. 3.

Fig. 3

Staggered grid for the discretization of the dynamic optimal transport problem, where \(N=P=4\). For periodic boundary conditions, the boundary values for m are equal, while they are zero for Neumann boundary conditions

The differential operators in space and time are discretized by forward differences, and depending on the boundary conditions this results in the use of difference matrices of the form

$$\begin{aligned}&D_n \,:=\, n \begin{pmatrix} -1 &{} 1 &{} &{} &{} \\ &{} -1 &{} 1 &{} &{} \\ &{} &{} \ddots &{} \ddots &{} \\ &{} &{} &{} -1 &{} 1 \end{pmatrix} \in \mathbb {R}^{n-1,n}, \\&D_n^\mathrm{per} \,:=\, n \begin{pmatrix} 1 &{} &{} &{} &{} -1 \\ -1 &{} 1 &{} &{} &{} \\ &{} \ddots &{} \ddots &{} &{} \\ &{} &{} &{} -1 &{} 1 \end{pmatrix} \in \mathbb {R}^{n,n}. \end{aligned}$$

For the integration we apply a simple midpoint rule. To handle this part, we introduce the averaging/interpolation matrices

$$\begin{aligned}&S_n \,:=\, \frac{1}{2} \begin{pmatrix} 1 &{} 1 &{} &{} &{} \\ &{} 1 &{} 1 &{} &{} \\ &{} &{} \ddots &{} \ddots &{} \\ &{} &{} &{} 1 &{} 1 \end{pmatrix} \in \mathbb {R}^{n-1,n}, \\&S_n^{\mathrm{per}} \,:=\, \frac{1}{2} \begin{pmatrix} 1 &{} &{} &{} &{} 1 \\ 1 &{} 1 &{} &{} &{} \\ &{} \ddots &{} \ddots &{} &{} \\ &{} &{} &{} 1 &{} 1 \end{pmatrix} \in \mathbb {R}^{n,n}. \end{aligned}$$
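These four matrices are straightforward to assemble in code. The following `numpy` sketch (ours, for illustration) builds them; the row sums reflect that differences annihilate constants while averages preserve them:

```python
import numpy as np

def D(n):
    """Forward difference matrix D_n in R^{(n-1) x n}."""
    return n * (np.eye(n - 1, n, k=1) - np.eye(n - 1, n))

def D_per(n):
    """Periodic difference matrix D_n^per in R^{n x n}: diagonal 1,
    subdiagonal -1, and a -1 in the top-right corner."""
    M = np.eye(n) - np.eye(n, k=-1)
    M[0, -1] = -1.0
    return n * M

def S(n):
    """Averaging matrix S_n in R^{(n-1) x n}."""
    return 0.5 * (np.eye(n - 1, n) + np.eye(n - 1, n, k=1))

def S_per(n):
    """Periodic averaging matrix S_n^per in R^{n x n}."""
    M = np.eye(n) + np.eye(n, k=-1)
    M[0, -1] = 1.0
    return 0.5 * M

# rows of a difference matrix sum to 0, rows of an averaging matrix to 1
assert np.allclose(D(5).sum(axis=1), 0) and np.allclose(D_per(5).sum(axis=1), 0)
assert np.allclose(S(5).sum(axis=1), 1) and np.allclose(S_per(5).sum(axis=1), 1)
```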

Discretization for one spatial dimension + time. In the following, we derive the discretization of (2) for one spatial direction, i.e., for the transport of signals. The problem can be formulated in a simple matrix–vector form using tensor products of matrices, which makes it rather intuitive from the linear algebra point of view. Moreover, it will be helpful for deriving the fast trigonometric transforms which play a role within our algorithm. The generalization of our approach to higher dimensions is straightforward and can be found in Appendix 3.

We want to organize the transport between two given one-dimensional, non-negative discrete signals

$$\begin{aligned} f_0 :=\left( f_0\Big ( \tfrac{j-1/2}{N} \Big ) \right) _{j=1}^N \quad \mathrm{and} \quad f_1 :=\left( f_1 \Big (\tfrac{j-1/2}{N} \Big ) \right) _{j=1}^N. \end{aligned}$$

We are looking for the intermediate signals \(f_t\) for \(t = \frac{k}{P}\), \(k=1, \ldots , P-1\). Using the notation \(f_t(x) = f(x,t)\), we want to find \(f(\frac{j-1/2}{N},\frac{k}{P})_{j=1,k=1}^{N,P-1} \in \mathbb {R}^{N,P-1}\). For m we have to take the boundary conditions into account. In the case of mirror boundary conditions, there is no flow over the boundary and since \(m = f\mathrm v\) the value of m is zero at the boundary at each time. In the periodic case both boundaries of m coincide. The values of m are taken at the cell faces \(\frac{j}{N}\), \(j = \kappa ,\ldots ,N-1\) and time \(\frac{k-\tfrac{1}{2}}{P}\), \(k=1,\ldots ,P\), i.e., we are looking for \(\left( m(\frac{j}{N},\frac{k-\tfrac{1}{2}}{P}) \right) _{j=\kappa ,k=1}^{N-1,P} \in \mathbb {R}^{N-\kappa ,P}\), where

$$\begin{aligned} \kappa = \left\{ \begin{array} {ll} 1 &{} \text{ mirror } \text{ boundary },\\ 0 &{} \text{ periodic } \text{ boundary }. \end{array} \right. \end{aligned}$$

The midpoints for the quadrature rule are computed by averaging the neighboring two values of m and f, respectively. To give a sound matrix–vector notation of the discrete minimization problem we reorder m and f columnwise into vectors \(\mathrm{vec}(f) \in \mathbb {R}^{N(P-1)}\) and \(\mathrm{vec} (m) \in \mathbb {R}^{(N-\kappa )P}\), which we again denote by f and m. For the \(\mathrm{vec}\) operator in connection with the tensor product \(\otimes \) of matrices we refer to Appendix 2. More specifically, let \(I_n\in \mathbb {R}^{n,n}\) be the identity matrix and set

$$\begin{aligned}&S_\mathrm{f}\,\, :=\, \,S_P^ {\scriptscriptstyle {\text {T}}}\otimes I_N, \qquad D_\mathrm{f} \,:=\, D_P^{\scriptscriptstyle {\text {T}}}\otimes I_N, \\&S_\mathrm{m} \,:=\, \left\{ \begin{array}{ll} I_P \otimes S_N^{\scriptscriptstyle {\text {T}}}&{} \, \text{ mirror } \text{ boundary },\\ I_P \otimes \left( S_{N}^{\mathrm{per}} \right) ^{\scriptscriptstyle {\text {T}}}&{} \, \text{ periodic } \text{ boundary }, \end{array} \right. \\&D_\mathrm{m} \,:=\, \left\{ \begin{array}{ll} I_P \otimes D_N^{\scriptscriptstyle {\text {T}}}&{} \text{ mirror } \text{ boundary },\\ I_P \otimes \left( D_{N}^{\mathrm{per}} \right) ^{\scriptscriptstyle {\text {T}}}&{} \text{ periodic } \text{ boundary }. \end{array} \right. \end{aligned}$$

Finally, we introduce the vectors

$$\begin{aligned} f^+&\,\, := \,\, \frac{1}{2} \bigl ( f_0^{\scriptscriptstyle {\text {T}}}, \varvec{0}, f_1^{\scriptscriptstyle {\text {T}}}\bigr )^{\scriptscriptstyle {\text {T}}}, \\ f^-&\, \, := \,\, P \bigl (- f_0^{\scriptscriptstyle {\text {T}}},\varvec{0}, f_1^{\scriptscriptstyle {\text {T}}}\bigr )^{\scriptscriptstyle {\text {T}}}, \end{aligned}$$

where we denote by \(\mathbf{0}\) (and \(\mathbf{1}\)) arrays of appropriate size with entries 0 (and 1). They are used to guarantee that the boundary conditions are fulfilled. Now the continuity equation (4) together with the boundary conditions (5) for f can be reformulated as the requirement that (m, f) has to lie within the hyperplane

$$\begin{aligned} \mathcal {C}_0 :=\left\{ \begin{pmatrix} m\\ f \end{pmatrix}:\underbrace{\big ( D_\mathrm{m} | D_\mathrm{f} \big )}_{A} \begin{pmatrix} m\\ f \end{pmatrix} = f^- \right\} . \end{aligned}$$
(6)

We will see in Proposition 3 that \(A A^{\scriptscriptstyle {\text {T}}}\) is rank one deficient. Since further \(\varvec{1}^{\scriptscriptstyle {\text {T}}}A = \varvec{0}\), we conclude that the under-determined linear system in (6) has a solution if and only if \(\varvec{1}^{\scriptscriptstyle {\text {T}}}f^- = 0\), i.e., if and only if \(f_0\) and \(f_1\) have the same mass

$$\begin{aligned} \varvec{1}^{\scriptscriptstyle {\text {T}}}f_0 = \varvec{1}^{\scriptscriptstyle {\text {T}}}f_1. \end{aligned}$$
(7)
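As a plausibility check for these claims (a small sketch of our own, with illustrative sizes N and P and periodic boundary conditions in space), the blocks of \(A = ( D_\mathrm{m} | D_\mathrm{f} )\) can be assembled with `np.kron`; numerically one observes \(\varvec{1}^{\scriptscriptstyle {\text {T}}}A = \varvec{0}\) and \(\mathrm{rank}(A) = NP - 1\), so the system is solvable exactly when \(f^-\) has zero mean, i.e., when (7) holds:

```python
import numpy as np

N, P = 4, 3  # small illustrative grid sizes

# time differences D_P in R^{(P-1) x P}, periodic space differences D_N^per
D_P = P * (np.eye(P - 1, P, k=1) - np.eye(P - 1, P))
M = np.eye(N) - np.eye(N, k=-1)
M[0, -1] = -1.0
D_N_per = N * M

D_f = np.kron(D_P.T, np.eye(N))       # D_f = D_P^T (x) I_N
D_m = np.kron(np.eye(P), D_N_per.T)   # D_m = I_P (x) (D_N^per)^T
A = np.hstack([D_m, D_f])             # A = (D_m | D_f)

assert np.allclose(np.ones(A.shape[0]) @ A, 0.0)    # 1^T A = 0
assert np.linalg.matrix_rank(A) == A.shape[0] - 1   # A A^T rank one deficient
```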

This reflects the fact that dynamic optimal transport is performed between probability measures. The interpretation of a color image as a probability density function has a major drawback: to represent a valid density, the sum of all RGB pixel values of the given images, i.e., the sum of the image intensity values, has to be one (or at least equal). Therefore, we consider the more general set

$$\begin{aligned} \mathcal {C} :=\mathop {\mathrm{argmin}}_{(m,f)} \Vert ( D_\mathrm{m} | D_\mathrm{f} ) \begin{pmatrix} m\\ f \end{pmatrix} -f^-\Vert _2^2. \end{aligned}$$
(8)

Note that the boundary conditions (5) for f are preserved, while the mass conservation (7) is no longer required. Clearly, if (7) holds true, then \(\mathcal{C}\) coincides with \(\mathcal{C}_0\). Let \(\iota _{\mathcal {C}}\) denote the indicator function of \(\mathcal{C}\) defined by

$$\begin{aligned} \iota _{\mathcal {C}} (x) :=\left\{ \begin{array}{ll} 0 &{}\mathrm{if} \; x \in \mathcal {C},\\ +\infty &{}\mathrm{if} \; x \not \in \mathcal {C}. \end{array} \right. \end{aligned}$$

For \(p \in (1,2]\), we consider the following transport problem:

Constrained Transport Problem

$$\begin{aligned} \mathop {\mathrm{argmin}}_{ (m,f) } E(m,f) :=\Vert J_p \big (S_\mathrm{m} m, S_\mathrm{f} f + f^+ \big ) \Vert _1 + \iota _\mathcal{C}(m,f). \end{aligned}$$
(9)

Here, the application of \(J_p\) is meant componentwise and the summation over its (non-negative) components is addressed by the \(\ell _1\)-norm. The interpolation operators \(S_\mathrm{m}\) and \(S_\mathrm{f}\) arise from the midpoint rule for computing the integral.

We can further relax the constraint \((m,f) \in \mathcal{C}\) by replacing it with

$$\begin{aligned} \Big \Vert \big ( D_\mathrm{m} | D_\mathrm{f} \big ) \begin{pmatrix} m\\ f \end{pmatrix} -f^- \Big \Vert _2^2 \le \tau , \end{aligned}$$

where \(\tau \ge \tau _0 :=\min _{(m,f)} \Vert ( D_\mathrm{m} | D_\mathrm{f} ) ( m^{\scriptscriptstyle {\text {T}}},f^{\scriptscriptstyle {\text {T}}})^{\scriptscriptstyle {\text {T}}}-f^-\Vert _2^2\). For \(\tau = \tau _0\) we have again problem (9). Since there is a correspondence between the solutions of such constrained problems with parameter \(\tau \) and the penalized problem with a corresponding parameter \(\lambda \), see [3, 42, 48], we prefer to consider the following penalized problem with regularization parameter \(\lambda >0\):

Penalized Transport Problem

$$\begin{aligned} \mathop {\mathrm{argmin}}_{ (m,f) } E_\lambda (m,f) :=\Vert J_p \big (S_\mathrm{m} m, S_\mathrm{f} f + f^+ \big ) \Vert _1 + \lambda \Big \Vert \big ( D_\mathrm{m} | D_\mathrm{f} \big ) \begin{pmatrix} m\\ f \end{pmatrix} - f^- \Big \Vert _2^2. \end{aligned}$$
(10)

Note that also for our penalized model the boundary conditions (5) for f still hold true. In general, both models (9) and (10) do not guarantee that the values of f stay within the RGB cube during the transport. This is not specific to color images but can appear for gray-value images as well (gamut problem). A usual way out is a final backprojection onto the image range. An alternative in the constrained model is a simple modification of the constraining set in (8) towards \( \mathcal {C} :=\mathop {\mathrm{argmin}}_{m,f \in [0,1]^3} \Vert ( D_\mathrm{m} | D_\mathrm{f} ) \begin{pmatrix} m\\ f \end{pmatrix} -f^-\Vert _2^2\). This leads to inner iterations of the Poisson solver and a projection onto the cube in the subsequent Algorithm 1. In the penalized problem, the term \(\iota _{[0,1]^3}\) could be added. However, in all our numerical experiments we observed only very small violations of the range constraint, which are most likely due to numerical inaccuracies.

A penalized model for the continuous setting and gray-value images was examined in [33]. For recent papers on unbalanced transport we refer to [7, 18, 26, 28].

To show the existence of a solution of the discrete transport problems we use the concept of asymptotically level stable functions. As usual, for a function \(F:\mathbb R^n \rightarrow \mathbb R \cup \{+\infty \}\) and \(\mu >\inf _x F(x)\), the level sets are defined by

$$\begin{aligned} {{\mathrm{lev}}}(F,\mu ) :=\{x \in \mathbb {R}^n: F(x) \le \mu \}. \end{aligned}$$

By \(F_\infty \) we denote the asymptotic (or recession) function of F which according to [24], see also [4, Theorem 2.5.1], can be computed by

$$\begin{aligned} F_\infty (x) = \liminf _{\genfrac{}{}{0.0pt}{}{x'\rightarrow x}{t\rightarrow \infty }}\frac{F \big (tx' \big )}{t}. \end{aligned}$$
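As a simple worked example (added here for illustration), consider a quadratic function of the form appearing in the penalized model, \(F(x) :=\Vert Bx - b\Vert _2^2\) with a matrix B and a vector b. Since \(\frac{F(tx')}{t} = t \Vert Bx'\Vert _2^2 - 2 \langle Bx', b\rangle + \frac{1}{t}\Vert b\Vert _2^2\), one finds

$$\begin{aligned} F_\infty (x) = \left\{ \begin{array}{ll} 0 &{}\mathrm{if} \; Bx = 0,\\ +\infty &{}\mathrm{otherwise}, \end{array} \right. \end{aligned}$$

i.e., \(F_\infty \) is the indicator function of \(\ker B\). Applied with \(B = ( D_\mathrm{m} | D_\mathrm{f} )\), this explains how the first condition in (12) arises from the penalty term.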

The following definition of asymptotically level stable functions is taken from [4, p. 94]: a proper and lower semi-continuous function \(F:\mathbb {R}^{n} \rightarrow \mathbb {R}\cup \{+\infty \}\) is said to be asymptotically level stable if for each \(\rho >0\), each real-valued, bounded sequence \(\{\mu _k\}_k\) and each sequence \(\{x_k\}_k\) satisfying

$$\begin{aligned}&x_k \in {{\mathrm{lev}}}(F,\mu _k), \quad \Vert x_k\Vert _2 \rightarrow +\infty , \nonumber \\&\quad \frac{x_k}{\Vert x_k\Vert _2} \rightarrow \tilde{x} \in \ker (F_\infty ), \end{aligned}$$
(11)

there exists \(k_0\) such that

$$\begin{aligned} x_k- \rho \tilde{x} \in {{\mathrm{lev}}}(F,\mu _k)\qquad \text {for all } k\ge k_0. \end{aligned}$$

If for each real-valued, bounded sequence \(\{\mu _k\}_k\) there exists no sequence \(\{x_k\}_k\) satisfying (11), then F is automatically asymptotically level stable. In particular, coercive functions are asymptotically level stable. It was originally shown in [5] (without the notion of asymptotically level stable functions) that any asymptotically level stable function F with \(\inf F>-\infty \) has a global minimizer. A proof was also given in [4, Corollary 3.4.2]. With these preliminaries we can prove the existence of minimizers of our transport models.

Proposition 1

The discretized dynamic transport models (9) and  (10) have a solution.

Proof

We show that the proper, lower semi-continuous functions E and \(E_\lambda \) are asymptotically level stable which implies the existence of a minimizer. For the penalized problem, the asymptotic function \(E_{\lambda ,\infty }\) reads

$$\begin{aligned} E_{\lambda , \infty }(m,f)=\liminf _{\begin{array}{c} (m',f')\rightarrow (m,f),\\ t\rightarrow \infty \end{array}}\frac{E_\lambda \bigl (t \big (m',f' \big )\bigr )}{t}. \end{aligned}$$

We obtain

$$\begin{aligned} \frac{E_\lambda \bigl (t \big (m',f' \big )\bigr )}{t}&= \frac{1}{t}\left( \Bigg \Vert J_p\Bigg (t\Bigg (S_\mathrm{m} m' , S_\mathrm{f} f'+ \frac{1}{t} f^+\Bigg )\Bigg )\Bigg \Vert _1 \right. \\&\ \ \ \ \left. +\,\, \lambda \Bigg \Vert \big ( D_\mathrm{m} | D_\mathrm{f} \big )\begin{pmatrix} tm'\\ tf' \end{pmatrix} - f^- \Bigg \Vert _2^2\right) \\&= \Bigg \Vert J_p\Bigg (S_\mathrm{m} m' , S_\mathrm{f} f'+ \frac{1}{t}f^+\Bigg )\Bigg \Vert _1 \\&\ \ \ \ +\, \lambda t \Bigg \Vert \big ( D_\mathrm{m} | D_\mathrm{f} \big )\begin{pmatrix} m'\\ f' \end{pmatrix} - \frac{1}{t}f^- \Bigg \Vert _2^2. \end{aligned}$$

Thus, \((\tilde{m},\tilde{f})\in \ker (E_{\lambda ,\infty })\) implies

$$\begin{aligned} (\tilde{m},\tilde{f})\in \ker \big (D_{\mathrm {m}}|D_{\mathrm {f}}\big ), \quad \tilde{m} \in \ker (S_{\mathrm {m}}), \quad S_{\mathrm {f}}\tilde{f}\ge 0. \end{aligned}$$
(12)

For the constrained problem we have the same implications so that we can restrict our attention to the penalized one. By the definition of \(S_{\mathrm{m}}\) we obtain \(\mathrm{ker } (S_{\mathrm{m}}) = \left\{ w \otimes \tilde{\varvec{1}}: w \in \mathbb R^P\right\} \) for periodic boundary conditions and even N and \(\mathrm{ker } (S_{\mathrm{m}}) = \{\varvec{0}\}\) otherwise, where \(\tilde{\mathbf{1}}\) is defined as \(\tilde{\mathbf{1}} = (1,-1,\ldots ,1,-1)^{\scriptscriptstyle {\text {T}}}\in \mathbb R^N\). In the case \(\mathrm{ker} (S_{\mathrm{m}}) = \{\varvec{0}\}\), the first and second condition in (12) imply \(D_{\mathrm{f}} \tilde{f} = 0\) so that by the definition of \(D_{\mathrm{f}}\) it holds \(\tilde{f} = 0\). In the other case, we obtain by the first condition in (12) that \(\tilde{f} = -D_{\mathrm{f}}^\dagger D_{\mathrm{m}} \tilde{m}\), where \( D_{\mathrm{f}}^\dagger = (D_{\mathrm{f}}^{\scriptscriptstyle {\text {T}}}D_{\mathrm{f}})^{-1}D_{\mathrm{f}}^{\scriptscriptstyle {\text {T}}}\) denotes the Moore-Penrose inverse of \(D_{\mathrm{f}}\). Then

$$\begin{aligned} S_{\mathrm{f}} \tilde{f} =- S_{\mathrm{f}} D_{\mathrm{f}}^\dagger D_{\mathrm{m}} \tilde{m} = -S_{\mathrm{f}} D_{\mathrm{f}}^\dagger D_{\mathrm{m}} (w \otimes \tilde{\varvec{1}}) \end{aligned}$$

for some \(w \in \mathbb R^P\). Straightforward computation shows

$$\begin{aligned} S_{\mathrm{f}} \tilde{f} =- S_\mathrm{f} D^\dagger _\mathrm{f} D_\mathrm{m} \left( w \otimes \tilde{\varvec{1}}_N \right) =- \tilde{w} \otimes \tilde{\varvec{1}}_N \end{aligned}$$

for some \(\widetilde{w}\in \mathbb R^P\). Now the third condition in (12) can only be fulfilled if \(\tilde{w} = \varvec{0}\). Consequently we have in both cases

$$\begin{aligned} S_{\mathrm{f}} \tilde{f} = 0. \end{aligned}$$
(13)

Let \(\rho >0\), \(\{\mu _k\}_k\) be a bounded sequence and \(\{(m_k,f_k)\}_k\) be a sequence fulfilling (11). By (12) and (13) we conclude

$$\begin{aligned} E_\lambda \Bigg (\big (m_k,f_k \big )-\rho \big (\tilde{m},\tilde{f} \big )\Bigg )&=\Bigg \Vert J_p \Big (S_\mathrm{m} m_k , S_\mathrm{f} f_k + f^+ \Big ) \Bigg \Vert _1\\&\ \ \ \ + \lambda \Bigg \Vert \big ( D_\mathrm{m} | D_\mathrm{f} \big ) \begin{pmatrix} m_k\\ f_k \end{pmatrix} - f^- \Bigg \Vert _2^2\\&= E_\lambda \bigl ((m_k,f_k)\bigr ). \end{aligned}$$

Since \((m_k,f_k)\in {{\mathrm{lev}}}(E_\lambda ,\mu _k)\), this shows that \((m_k,f_k)-\rho (\tilde{m},\tilde{f})\in {{\mathrm{lev}}}(E_\lambda ,\mu _k)\) as well and finishes the proof. \(\square \)

Unfortunately, \(J_p(u,v)\) is not strictly convex on its domain, as can be deduced from the following proposition.

[Algorithm 1]

Proposition 2

For any two minimizers \((m_i,f_i)\), \(i=1,2\) of (9) the relation

$$\begin{aligned} \frac{S_\mathrm{m} m_1}{S_\mathrm{f} f_1 + f^+} = \frac{ S_\mathrm{m}m_2}{S_\mathrm{f} f_2 + f^+} \end{aligned}$$

holds true.

Proof

We use the perspective function notation from Remark 1. For \(\lambda \in (0,1)\) and \((u_i,v_i)\) with \(v_i >0\), \(i=1,2\), we have (componentwise)

$$\begin{aligned}&J_p \left( \lambda (u_1,v_1) + (1-\lambda ) (u_2,v_2) \right) \\&\quad = \left( \lambda v_1 + (1-\lambda )v_2 \right) \psi \left( \tfrac{\lambda u_1 + (1-\lambda )u_2}{\lambda v_1 + (1-\lambda )v_2} \right) \\&\quad = (\lambda v_1 + (1-\lambda ) v_2) \psi \left( \tfrac{\lambda v_1}{\lambda v_1 + (1-\lambda )v_2} \tfrac{u_1}{v_1}\right. \left. + \tfrac{(1-\lambda ) v_2}{\lambda v_1 + (1-\lambda )v_2} \tfrac{u_2}{v_2} \right) \end{aligned}$$

and if \(\frac{u_1}{v_1} \not = \frac{u_2}{v_2}\) by the strict convexity of \(\psi \) that

$$\begin{aligned}&J_p \left( \lambda (u_1,v_1) + (1-\lambda ) (u_2,v_2) \right) \\&\quad <\lambda J_p(u_1,v_1) + (1-\lambda )J_p(u_2,v_2). \end{aligned}$$

Setting \(u_i :=S_\mathrm{m} m_i\) and \(v_i:=S_\mathrm{f} f_i + f^+\), \(i=1,2\), we obtain the assertion. \(\square \)

Remark 2

For periodic boundary conditions, even N and \(f_1 = f_0 + \gamma \tilde{\varvec{1}}\), \(\gamma \in [0,\min f_0 )\), the minimizer of (9) is not unique. This can be seen as follows: obviously, we would have a minimizer (m, f) if \(m = w \otimes \tilde{\varvec{1}} \in \mathrm{ker}(S_\mathrm{m})\) for some \(w \in \mathbb R^P\) and there exists \(f\ge 0\) which fulfills the constraints. Setting \(f^{k/P} := \big ( f\big (\tfrac{j-1/2}{N},\tfrac{k}{P}\big ) \big )_{j=1}^N\), \(k=0,\ldots ,P\), these constraints read \(- 2 P w \otimes \tilde{\varvec{1}} = P (f^{(k-1)/P} - f^{k/P})_{k=1}^P\). Thus, any \(w \in \mathbb R^P\) such that

$$\begin{aligned} f^{1/P}&= f_0 + 2w_1 \tilde{\varvec{1}}, \; f^{2/P} = f_0 + 2(w_1 + w_2) \tilde{\varvec{1}}, \ldots \, ,\\ f^{1}&= f_0 + 2(w_1 + w_2+ \ldots +w_P) \tilde{\varvec{1}} \end{aligned}$$

are non-negative vectors provides a minimizer of (9). We conjecture that the solution is unique in all other cases, but have no proof so far.

4 Primal-Dual Minimization Algorithm

4.1 Algorithms

For the minimization of our functionals we apply the primal-dual algorithm of Chambolle and Pock [17, 38] in the form of Algorithm 8 in [14]. We use the following reformulation of the problems:

Constrained Transport Problem

$$\begin{aligned} \mathop {\mathrm{argmin}}_{ (m,f) }&\Vert J_p(u,v) \Vert _1 + \iota _\mathcal{C}(m,f)\nonumber \\&\text{ subject } \text{ to } \quad S_\mathrm{m} m = u,\; S_\mathrm{f} f + f^+ = v. \end{aligned}$$
(14)

Penalized Transport Problem

$$\begin{aligned} \mathop {\mathrm{argmin}}_{ (m,f) }&\Vert J_p(u,v) \Vert _1 + \lambda \left\| \underbrace{( D_\mathrm{m} | D_\mathrm{f} )}_{A} \begin{pmatrix} m\\ f \end{pmatrix} - f^-\right\| _2^2\nonumber \\&\text{ subject } \text{ to } \quad S_\mathrm{m} m = u,\; S_\mathrm{f} f + f^+ = v. \end{aligned}$$
(15)
(Algorithm 2 is displayed as an image in the original article.)

In the following we detail the first two steps of Algorithms 1 and 2:

  • Step 1 of Algorithm 1 requires the projection onto \(\mathcal{C}\),

  • Step 1 of Algorithm 2 results in the solution of a linear system of equations with coefficient matrix \(\lambda A^{\scriptscriptstyle {\text {T}}}A + \frac{1}{\tau } I\) whose Schur complement can be computed via fast trigonometric transforms,

  • Step 2 of both algorithms is the proximal map of \(J_p\).

4.2 Projection onto \(\mathcal{C}\)

Step 1 of Algorithm 1 requires finding the orthogonal projection of \(a :=\begin{pmatrix} m^{(r)} \\ f^{(r)} \end{pmatrix} - \tau \sigma \begin{pmatrix} S_\mathrm{m}^{\scriptscriptstyle {\text {T}}}\bar{b}_u^{(r)}\\ S_\mathrm{f}^{\scriptscriptstyle {\text {T}}}\bar{b}_v^{(r)} \end{pmatrix}\) onto \(\mathcal{C}\). This means that we have to find a minimizer of \(\Vert A x - f^-\Vert _2\) for which \(\Vert a-x\Vert _2\) attains its smallest value. Substituting \(y :=a-x\), we are looking for a minimizer y of \(\Vert A y - A a + f^-\Vert _2\) with smallest norm \(\Vert y\Vert _2\). By [11, Theorem 1.2.10], this minimizer is uniquely determined by \(A^\dagger (A a - f^-)\). Therefore, the projection of a onto \(\mathcal{C}\) is given by

$$\begin{aligned} \varPi _{\mathcal{C}} (a)&= a - A^\dagger \left( A a- f^-\right) \nonumber \\&= a - A^{\scriptscriptstyle {\text {T}}}(A A^{\scriptscriptstyle {\text {T}}})^\dagger \left( A a- f^-\right) . \end{aligned}$$
(16)
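Formula (16) translates directly into code. The sketch below uses a small random, rank-deficient matrix A as a hypothetical stand-in for the actual difference operator and checks that the result satisfies the constraint and is indeed the orthogonal projection (the residual lies in the row space of A):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 12))       # stand-in for A = (D_m | D_f)
A[4] = A[0] + A[1]                     # make A rank-deficient to exercise the pseudo-inverse
f_minus = A @ rng.standard_normal(12)  # consistent right-hand side, so C is nonempty
a = rng.standard_normal(12)

def project_onto_C(a, A, f_minus):
    """Orthogonal projection onto C = {x : A x = f_minus}, formula (16):
    Pi_C(a) = a - A^T (A A^T)^+ (A a - f_minus)."""
    return a - A.T @ (np.linalg.pinv(A @ A.T) @ (A @ a - f_minus))

x = project_onto_C(a, A, f_minus)
print(np.allclose(A @ x, f_minus))     # x satisfies the constraint
# a - x lies in range(A^T), hence x is the *orthogonal* projection:
residual = a - x
coeffs, *_ = np.linalg.lstsq(A.T, residual, rcond=None)
print(np.allclose(A.T @ coeffs, residual))
```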

Note that the projection onto \(\mathcal{C}\) coincides with the one onto \(\mathcal{C}_0\) if the given images \(f_0\) and \(f_1\) have the same mass. The Moore-Penrose inverse of the square matrix \(A A^{\scriptscriptstyle {\text {T}}}\) is defined as follows: let \( A A^{\scriptscriptstyle {\text {T}}}\) have the spectral decomposition

$$\begin{aligned} A A^{\scriptscriptstyle {\text {T}}}= Q \, \mathrm{diag}(\lambda _j) \, Q^{\scriptscriptstyle {\text {T}}}. \end{aligned}$$

Then it holds

$$\begin{aligned} (A A^{\scriptscriptstyle {\text {T}}})^\dagger&= Q \, \mathrm{diag}(\tilde{\lambda }_j) \, Q^{\scriptscriptstyle {\text {T}}}, \ \text {with } \tilde{\lambda }_j \,:=\, \left\{ \begin{array}{ll} \tfrac{1}{\lambda _j} &{}\mathrm{if} \; \lambda _j >0,\\ 0&{}\mathrm{otherwise.} \end{array} \right. \end{aligned}$$

The following proposition shows the form of \((A A^{\scriptscriptstyle {\text {T}}})^\dagger \) in the one-dimensional spatial case. It turns out that the projection onto \(\mathcal{C}\) amounts to solving a two-dimensional Poisson equation, which, depending on the boundary conditions, can be realized by fast cosine and Fourier transforms in \(\mathcal {O}(NP\log (NP))\) operations.

Proposition 3

Let \( C_N := \sqrt{\tfrac{2}{N}} \left( \epsilon _j \cos \frac{j(2k+1)\pi }{2N} \right) _{j,k=0}^{N-1} \) with \(\epsilon _0 := 1/\sqrt{2}\) and \(\epsilon _j := 1\), \(j=1,\ldots ,N-1\) be the N-th cosine matrix and \( F_N := \sqrt{\tfrac{1}{N}} \left( \mathrm {e}^{\frac{-2\pi \mathrm {i}jk}{N}} \right) _{j,k=0}^{N-1} \) be the N-th Fourier matrix. Set \(\mathrm{d}_N^{\mathrm{mirr}} := \left( 4 \sin ^2 \frac{k\pi }{2N}\right) _{k=0}^{N-1}\) and \(\mathrm{d}^{\mathrm{per}}_N := \left( 4 \sin ^2 \frac{k\pi }{N}\right) _{k=0}^{N-1}\). Then the Moore-Penrose inverse \((A A^{\scriptscriptstyle {\text {T}}})^\dagger \) in (16) is given by

$$\begin{aligned} (A A^{\scriptscriptstyle {\text {T}}})^\dagger = {\left\{ \begin{array}{ll} (C_P^{\scriptscriptstyle {\text {T}}}\otimes C_{N-1}^{\scriptscriptstyle {\text {T}}}) \, \mathrm{diag}(\tilde{\mathrm{d}}) \, (C_P \otimes C_{N-1})&{} \\ \qquad \qquad \qquad \qquad \qquad \mathrm{mirror \; boundary},\\ (C_P^{\scriptscriptstyle {\text {T}}}\otimes \bar{F}_N ) \, \mathrm{diag}(\tilde{\mathrm{d}}) \, (C_P \otimes F_N)&{}\\ \qquad \qquad \qquad \qquad \qquad \mathrm{periodic \; boundary}, \end{array}\right. } \end{aligned}$$

where

$$\begin{aligned} \mathrm{d}:= {\left\{ \begin{array}{ll} I_P \otimes N^2\mathrm{diag}(\mathrm{d}_{N-1}^{\mathrm{mirr}} ) + P^2 \mathrm{diag}(\mathrm{d}_P^{\mathrm{mirr}} ) \otimes I_{N-1}&{}\\ \qquad \qquad \qquad \qquad \qquad \qquad \mathrm{mirror \; boundary},\\ I_P \otimes N^2\mathrm{diag}(\mathrm{d}_N^{\mathrm{per}} ) + P^2 \mathrm{diag}(\mathrm{d}_P^{\mathrm{mirr}} ) \otimes I_N&{}\\ \qquad \qquad \qquad \qquad \qquad \qquad \mathrm{periodic \; boundary} \end{array}\right. } \end{aligned}$$

and \(\tilde{\mathrm{d}}_j := \frac{1}{\mathrm{d}_j}\) if \(\mathrm{d}_j >0\) and \(\tilde{\mathrm{d}}_j := 0\) otherwise.

The proof is given in Appendix 3.
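The fast-transform structure in Proposition 3 rests on the classical fact that the one-dimensional second-difference (Laplacian) matrix with mirror (Neumann) boundary conditions is diagonalized by the cosine matrix with eigenvalues \(4\sin ^2\frac{k\pi }{2N}\), and its periodic counterpart by the Fourier matrix with eigenvalues \(4\sin ^2\frac{k\pi }{N}\). The following sketch verifies this numerically for a small N (an illustration; sizes and scalings of the paper's actual operators may differ):

```python
import numpy as np
from scipy.fft import dct, fft, ifft

N = 8
k = np.arange(N)

# 1D Laplacian with mirror (Neumann) boundary conditions
L_mirr = 2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
L_mirr[0, 0] = L_mirr[-1, -1] = 1
C = dct(np.eye(N), type=2, norm='ortho', axis=0)   # orthogonal DCT-II matrix C_N
print(np.allclose(C @ L_mirr @ C.T,
                  np.diag(4 * np.sin(k * np.pi / (2 * N)) ** 2)))

# 1D Laplacian with periodic boundary conditions (a circulant matrix)
L_per = L_mirr.copy()
L_per[0, 0] = L_per[-1, -1] = 2
L_per[0, -1] = L_per[-1, 0] = -1
# Conjugation by the DFT: F_N L_per F_N^{-1} is diagonal with 4 sin^2(k pi / N)
eigs = fft(L_per @ ifft(np.eye(N), axis=0), axis=0)
print(np.allclose(eigs, np.diag(4 * np.sin(k * np.pi / N) ** 2)))
```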

4.3 Schur Complement of \(\lambda A^{\scriptscriptstyle {\text {T}}}A + \frac{1}{\tau } I\)

To find the minimizer in Step 1 of Algorithm 2, we set the gradient of the functional to zero, which leads to the linear system of equations

$$\begin{aligned}&\left( \lambda A^{\scriptscriptstyle {\text {T}}}A + \tfrac{1}{\tau } I\right) \begin{pmatrix} m \\ f \end{pmatrix}\\&\quad = \lambda A^{\scriptscriptstyle {\text {T}}}f^- + \begin{pmatrix} m^{(r)} \\ f^{(r)} \end{pmatrix} - \tau \sigma \begin{pmatrix} S_\mathrm{m}^{\scriptscriptstyle {\text {T}}}\bar{b}_u^{(r)}\\ S_\mathrm{f}^{\scriptscriptstyle {\text {T}}}\bar{b}_v^{(r)} \end{pmatrix} . \end{aligned}$$

Noting that \(\lambda A^{\scriptscriptstyle {\text {T}}}A + \frac{1}{\tau } I\) is a symmetric and positive definite matrix, this linear system can be solved using standard conjugate gradient methods. Alternatively, the next proposition shows how the inverse \((\lambda A^{\scriptscriptstyle {\text {T}}}A + \frac{1}{\tau } I)^{-1}\) can be computed explicitly with the help of the Schur complement and fast sine, cosine, and Fourier transforms. The proposition refers to the one-dimensional spatial setting but can be generalized to the three-dimensional case in a straightforward way using the results of Appendix 3.
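Both solution routes are easy to sketch. Below, a plain matrix-free conjugate gradient solve of \((\lambda A^{\scriptscriptstyle {\text {T}}}A + \frac{1}{\tau } I)x = b\) is checked against a dense direct solve; the small random matrix is a hypothetical stand-in for the structured difference matrix \(A = ( D_\mathrm{m} | D_\mathrm{f} )\):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50                                   # total number of unknowns (m, f stacked)
A = rng.standard_normal((30, n))         # stand-in for A = (D_m | D_f)
lam, tau = 10.0, 0.02
b = rng.standard_normal(n)

# Only products with A and A^T are needed -- no matrix ever has to be formed.
matvec = lambda x: lam * A.T @ (A @ x) + x / tau

def cg(matvec, b, tol=1e-12, maxiter=1000):
    """Plain conjugate gradient for a symmetric positive definite operator."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(maxiter):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

x_cg = cg(matvec, b)
x_direct = np.linalg.solve(lam * A.T @ A + np.eye(n) / tau, b)
print(np.allclose(x_cg, x_direct))
```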

Proposition 4

Let \( S_{N-1} := \sqrt{\tfrac{2}{N}} \left( \sin \frac{jk\pi }{N} \right) _{j,k=1}^{N-1} \) and \( \mathrm{d}^{\mathrm{zero}}_{N-1} := \left( 4 \sin ^2 \frac{k\pi }{2N}\right) _{k=1}^{N-1} \). Then the inverse of the matrix \( \lambda A^{\scriptscriptstyle {\text {T}}}A + \frac{1}{\tau } I \) is given by

$$\begin{aligned} \left( \begin{array}{ll} I &{} - X^{-1}Y\\ 0 &{} I \end{array} \right) \left( \begin{array}{cc} X^{-1} &{} 0\\ 0 &{} S^{-1} \end{array} \right) \left( \begin{array}{cc} I &{} 0\\ -Y^{\scriptscriptstyle {\text {T}}}X^{-1}&{} I \end{array} \right) , \end{aligned}$$

where

(i) for mirror boundary conditions

$$\begin{aligned} Y&= D_P^{\scriptscriptstyle {\text {T}}}\otimes D_N, \\ X^{-1}&= I_P \otimes S_{N-1} \mathrm{diag}(\lambda N^2 \mathrm{d}_{N-1}^{\mathrm{zero}} + \tfrac{1}{\tau } )^{-1} S_{N-1},\\ S^{-1}&= (S_{P-1} \otimes C_N^{\scriptscriptstyle {\text {T}}}) \mathrm{diag}\Bigl ( \lambda P^2 \mathrm{d}^{\mathrm{zero}}_{P-1} \Bigr . \\&\ \ \ \ \otimes \Bigl . (1+ \tau \lambda N^2 \mathrm{d}^{\mathrm{mirr}}_N )^{-1} + \tfrac{1}{\tau } \Bigr )^{-1} (S_{P-1} \otimes C_N), \end{aligned}$$

(ii) for periodic boundary conditions

$$\begin{aligned} Y&= D_P^{\scriptscriptstyle {\text {T}}}\otimes D_N^{\mathrm{per}}, \\ X^{-1}&= I_P \otimes F_N \mathrm{diag}(\lambda N^2 \mathrm{d}_N^{\mathrm{per}} + \tfrac{1}{\tau }) ^{-1} \bar{F}_N,\\ S^{-1}&= (S_{P-1} \otimes F_N) \mathrm{diag}\Bigl ( \lambda P^2 \mathrm{d}^{\mathrm{zero}}_{P-1} \Bigr .\\&\ \ \ \ \Bigl .\otimes (1+ \tau \lambda N^2 \mathrm{d}^{\mathrm{per}}_N )^{-1} + \tfrac{1}{\tau } \Bigr )^{-1}(S_{P-1} \otimes \bar{F}_N). \end{aligned}$$

The proof is given in Appendix 3.
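The matrix product in Proposition 4 is the standard block LDU factorization of the inverse of a symmetric positive definite \(2\times 2\) block matrix \(\begin{pmatrix} X &{} Y\\ Y^{\scriptscriptstyle {\text {T}}} &{} Z\end{pmatrix}\) with Schur complement \(S = Z - Y^{\scriptscriptstyle {\text {T}}}X^{-1}Y\). A quick numerical check of that identity on random data (a sketch; in the proposition X, Y, and S are of course the structured matrices stated above):

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2 = 4, 3
# Random symmetric positive definite block matrix M = [[X, Y], [Y^T, Z]]
G = rng.standard_normal((n1 + n2, n1 + n2))
M = G @ G.T + (n1 + n2) * np.eye(n1 + n2)
X, Y, Z = M[:n1, :n1], M[:n1, n1:], M[n1:, n1:]

Xinv = np.linalg.inv(X)
S = Z - Y.T @ Xinv @ Y                  # Schur complement of X in M
Sinv = np.linalg.inv(S)

I1, I2 = np.eye(n1), np.eye(n2)
Zero = np.zeros((n2, n1))
left  = np.block([[I1, -Xinv @ Y], [Zero, I2]])
mid   = np.block([[Xinv, Zero.T], [Zero, Sinv]])
right = np.block([[I1, Zero.T], [-Y.T @ Xinv, I2]])
print(np.allclose(left @ mid @ right, np.linalg.inv(M)))
```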

4.4 Proximal Map of \({ J}_p\)

Step 2 of Algorithms 1 and 2 consists of an evaluation of the proximal map \(\mathrm{prox}_{\frac{1}{\sigma } J_p}\) of \(J_p\). This can be done using the proximal map \(\mathrm{prox}_{ J^{*}_p}\) of the conjugate function \(J^{*}_p\) and Moreau's identity \(\mathrm{prox}_\phi (t) + \mathrm{prox}_{\phi ^*}(t) = t\). Therefore, we first state the conjugate function \(J^{*}_p\).

Lemma 1

For \(p \in (1,+\infty )\) and \(\frac{1}{p} + \frac{1}{q} = 1\) we have

$$\begin{aligned} J_p^*(a,b) = \left\{ \begin{array}{ll} 0&{}\mathrm{if} \; (a,b) \in \mathcal{K}_p,\\ +\infty &{}\mathrm{otherwise}, \end{array} \right. \end{aligned}$$

where

$$\begin{aligned} \mathcal{K}_p := \left\{ (a,b) \in \mathbb R^d \times \mathbb R:\tfrac{1}{q} |a|^q + b \le 0 \right\} . \end{aligned}$$

For the proof we refer to [31] or [43, Lemma 5.17]. After this preparation we are now able to compute \(\mathrm{prox}_{\frac{1}{\sigma } J_p}\).

Proposition 5

Let \(p \in (1,2]\) and \(\frac{1}{p} + \frac{1}{q} = 1\).

  1. (i)

    Then for \(x^* \in \mathbb R^d\), \(y^* \in \mathbb R\) and \(\sigma > 0\) it holds

    $$\begin{aligned}&\mathrm{prox}_{\frac{1}{\sigma } J_p}(x^*,y^*)\\&\quad =\left\{ \begin{array}{ll} (0,0) &{} \mathrm{if} \; (\sigma x^*,\sigma y^*) \in \mathcal{K}_p,\\ \left( x^* \frac{h(\hat{z})}{1+h(\hat{z})} , y^* + \frac{1}{\sigma q} \hat{z}^q \right) &{} \mathrm{otherwise,} \end{array} \right. \end{aligned}$$

    where

    $$\begin{aligned} h(z) := (\sigma y^* + \tfrac{1}{q} z^q) z^{q-2} \end{aligned}$$

    and \(\hat{z} \in \mathbb R_{\ge 0}\) is the unique solution of the equation

    $$\begin{aligned} z \left( 1+ h(z) \right) - \sigma | x^*| = 0 \end{aligned}$$
    (17)

    in the interval \(\big [\max (0, z_0)^{\tfrac{1}{q}},+\infty \big )\), where \(z_0:= - q \sigma y^*\).

  2. (ii)

    The Newton method converges quadratically to the largest zero of (17) for any starting point \(z \ge \max (0, z_0)^{\tfrac{1}{q}}\).

Proof

(i) By Moreau’s identity it holds

$$\begin{aligned} \mathrm{prox}_\phi (t) + \mathrm{prox}_{\phi ^*}(t) = t \end{aligned}$$

and since \((\tfrac{1}{\sigma } \phi )^*(t) = \tfrac{1}{\sigma } \phi ^*(\sigma t)\) we conclude

$$\begin{aligned} (\hat{x}, \hat{y})&= \mathrm{prox}_{\frac{1}{\sigma } J_p}(x^*,y^*)\nonumber \\&= (x^*,y^*) - \frac{1}{\sigma } \mathrm{prox}_{\sigma J_p^*}(\sigma x^*,\sigma y^*), \end{aligned}$$
(18)

where \(\mathrm{prox}_{\sigma J_p^*} = \mathrm{prox}_{J_p^*}\) since \(J_p^*\) is an indicator function. Note that by definition of \(J_p\) we have \(\hat{y} \ge 0\), and \(\hat{y} = 0\) only if \(|\hat{x}| = 0\). Now, \(\mathrm{prox}_{J_p^*}(\sigma x^*,\sigma y^*)\) is the orthogonal projection of \((\sigma x^*,\sigma y^*)\) onto the set \(\mathcal{K}_p\) (we could also compute the epigraphical projection of \((\sigma x^*,\sigma y^*)\) onto the epigraph of \(\phi (x) = \tfrac{1}{q} |x|^q\) and reflect y, see also Fig. 4).

Fig. 4
figure 4

Projection onto the graph of the function \(\phi (x) = -\tfrac{1}{q} |x|^q\) for \(q=3\) (Color figure online)

Fig. 5
figure 5

Dynamic optimal transport between a red Gaussian and a blue one by the constrained model (9) with different boundary conditions for the third (RGB) dimension. The initial and final images have the same mass. Top mirror boundary conditions, bottom periodic boundary conditions (Color figure online)

If \((\sigma x^*, \sigma y^*) \in \mathcal{K}_p\), then \(\mathrm{prox}_{J_p^*}(\sigma x^*,\sigma y^*) = (\sigma x^*,\sigma y^*)\) and \((\hat{x}, \hat{y}) = (0,0)\). So let \((\sigma x^*, \sigma y^*) \not \in \mathcal{K}_p\), that means

$$\begin{aligned} \sigma y^* + \tfrac{1}{q} |\sigma x^{*}|^q >0. \end{aligned}$$

The tangent plane of the boundary of \(\mathcal{K}_p\) in \((x,y) = (x, -\tfrac{1}{q} |x|^q)\) is spanned by the vectors \((e_i^{\scriptscriptstyle {\text {T}}}, - |x|^{q-2} x_i)^{\scriptscriptstyle {\text {T}}}\), \(i=1,\ldots ,d\), where \(e_i \in \mathbb R^d\) denotes the i-th canonical unit vector. Hence, the projection (xy) is determined by \(y = -\tfrac{1}{q} |x|^q\) and

$$\begin{aligned} 0&= \left\langle \begin{pmatrix} \sigma x^* \\ \sigma y^* \end{pmatrix} - \begin{pmatrix} x \\ y\end{pmatrix}, \begin{pmatrix} e_i \\ - |x|^{q-2} x_i \end{pmatrix} \right\rangle \\&= \sigma x^*_i - x_i - (\sigma y^* - y) |x|^{q-2} x_i, \qquad i=1,\ldots ,d \end{aligned}$$

so that

$$\begin{aligned} x_i = \frac{\sigma x^*_i}{1+h(|x|)}, \quad i=1,\ldots ,d. \end{aligned}$$

Summing over the squares of the last equations gives

$$\begin{aligned} |x|^2 = |x^*|^2 \frac{\sigma ^2}{\left( 1+h(|x|)\right) ^2}. \end{aligned}$$

Since a solution has to fulfill

$$\begin{aligned} \hat{y} = y^* -\tfrac{1}{\sigma } y = y^* + \tfrac{1}{q \sigma } |x|^q = \tfrac{1}{\sigma } h(|x|) |x|^{2-q} > 0, \end{aligned}$$

it remains to search for the solutions with \(h(|x|) > 0\). Now, \(h(z) >0\) is fulfilled for \(z >0\) if and only if

$$\begin{aligned} \sigma y^* + \tfrac{1}{q} z^q > 0, \end{aligned}$$
(19)

which is the case if and only if \(z > z_0 := \max (0, - q \sigma y^*)^{\tfrac{1}{q}}\). Then \(z := |x|\) has to satisfy the equation

$$\begin{aligned} z \left( 1+h(z) \right) = \sigma |x^*|. \end{aligned}$$

The function

$$\begin{aligned} \varphi (z) \,\, := \,\, z \left( 1+h(z) \right) - \sigma |x^*| \end{aligned}$$

has exactly one zero in \([z_0,+\infty )\). Indeed, by definition of \(z_0\) and (19) we see that \( \varphi (z_0) \le 0 \), while on the other hand \(\varphi (z) \rightarrow +\infty \) as \(z \rightarrow +\infty \), so that \(\varphi \) has at least one zero in \([z_0,+\infty )\). Since \(q \ge 2\) for \(p \le 2\), we have

$$\begin{aligned} h'(z) = z^{2q-3} + (q-2) \Big (\sigma y^* + \frac{1}{q} z^q \Big ) z^{q-3} > 0, \quad z > z_0. \end{aligned}$$

Hence h, and consequently also \(\varphi \), is strictly monotonically increasing for \(z > z_0\). Therefore, \(\varphi \) has at most one zero in \([z_0,+\infty )\). Finally, the assertion follows by plugging (x, y) into (18).

Fig. 6
figure 6

Dynamic optimal transport between two polar lights by the constrained model (9) with different boundary conditions for the third (RGB) dimension. The initial and final images have the same mass. Top mirror boundary conditions, bottom periodic boundary conditions (Color figure online)

Fig. 7
figure 7

Example for different color transitions obtained with the constrained model (9) and periodic boundary conditions (first and fourth row), where the initial and final color images have the same mass. The second and fifth row show the corresponding intensity images, while the third and sixth row give the results obtained using two-dimensional transport of the initial and the final intensity images (Color figure online)

(ii) Straightforward computation gives

$$\begin{aligned} h''(z)&= (3q-5)z^{2q-4} + (q-2)(q-3)\big ( \sigma y^* + \tfrac{1}{q} z^q \big ) z^{q-4},\\ \varphi '(z)&= 1 + h(z) + zh'(z),\\ \varphi ''(z)&= 2h'(z) + z h''(z)\\&= 3(q-1) z^{2q-3} + (q-1)(q-2)\big (\sigma y^* + \tfrac{1}{q} z^q\big ) z^{q-3} >0, \quad z>z_0. \end{aligned}$$

Since \(\varphi \) is monotonically increasing and strictly convex for \(z \ge z_0\), the Newton method converges quadratically for any starting point \(z \ge z_0\). \(\square \)
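Proposition 5 translates into a short numerical routine. The sketch below assumes the scalar model \(J_p(u,v) = |u|^p/(p\,v^{p-1})\), whose conjugate is the indicator of \(\mathcal{K}_p\); it is an illustration, not the authors' Matlab implementation. For \(p=2\), the result can be checked against the explicit gradient of \(J_2(x,y) = |x|^2/(2y)\):

```python
import numpy as np

def prox_Jp(x_star, y_star, sigma, p, tol=1e-12, maxiter=50):
    """Evaluate prox_{(1/sigma) J_p}(x*, y*) following Proposition 5 (a sketch)."""
    q = p / (p - 1.0)                        # conjugate exponent, q >= 2 for p <= 2
    x_star = np.asarray(x_star, dtype=float)
    a, b = sigma * x_star, sigma * y_star
    if np.linalg.norm(a) ** q / q + b <= 0:  # (sigma x*, sigma y*) lies in K_p
        return np.zeros_like(x_star), 0.0

    h  = lambda z: (b + z ** q / q) * z ** (q - 2)
    dh = lambda z: z ** (2 * q - 3) + (q - 2) * (b + z ** q / q) * z ** (q - 3)

    # Newton iteration for phi(z) = z (1 + h(z)) - sigma |x*| = 0, eq. (17),
    # started above z0 = max(0, -q sigma y*)^(1/q)
    z = max(0.0, -q * b) ** (1.0 / q) + 1.0
    for _ in range(maxiter):
        step = (z * (1.0 + h(z)) - np.linalg.norm(a)) / (1.0 + h(z) + z * dh(z))
        z -= step
        if abs(step) < tol:
            break

    return x_star * h(z) / (1.0 + h(z)), y_star + z ** q / (sigma * q)

# Optimality check for p = 2, where grad J_2(x, y) = (x / y, -|x|^2 / (2 y^2)):
x_star, y_star, sigma = np.array([1.0, 2.0]), -0.5, 1.0
xh, yh = prox_Jp(x_star, y_star, sigma, p=2.0)
print(np.allclose(sigma * (x_star - xh), xh / yh))
print(np.isclose(sigma * (y_star - yh), -xh @ xh / (2 * yh ** 2)))
```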

5 Numerical Results

In the following we provide several numerical examples. In all cases we used \(P=32\) time steps and 2000 iterations of Algorithms 1 and 2, respectively. The parameters \(\sigma \) and \(\tau \) were set to \(\sigma = 50\) and \(\tau = \frac{0.99}{\sigma }\), so that \(\sigma \tau <1\), which guarantees the convergence of the algorithms. Of course, the algorithms do not use tensor products, but relations such as the one stated in (26) and their higher dimensional versions. The algorithms were implemented in Matlab (Version 2014b) and the computations were performed on a Dell computer with an Intel Core i7, 2.93 GHz, and 8 GB of RAM running Ubuntu 14.04 LTS. As an example, the run time for \(100\times 100\) color images and 32 time steps varies between 10 and 15 minutes for the constrained and the penalized method, depending on the parameter choices for \(p,\sigma ,\tau \), and \(\lambda \). In our current implementation of the penalized method, the fast transform approach is nearly as time consuming as the iterative solution of the linear system of equations with the CG method and an adequate initialization. If not explicitly stated otherwise, the results are displayed for \(p=2\), in which case the computation of the zeros in (17) slightly simplifies.

With our first experiments we illustrate the difference between mirror and periodic boundary conditions in the color dimension; here, the initial and the final images have the same mass. The images are displayed at intermediate timepoints \(t = \frac{i}{8},\) where \(i = 0,\dots , 8\). In Fig. 5, the transport of a red Gaussian into a blue one is shown, either with mirror or with periodic boundary conditions in the color dimension. Figure 6 depicts the transport between two real images of polar lights. In order to fulfill the equal mass constraint, we first normalized both images to mass 1 and afterwards multiplied them by a common factor such that both images have realistic colors. Of course, this procedure works only if the initial and the final image share approximately the same mass. In both cases the use of periodic boundary conditions yields more realistic results.

Fig. 8
figure 8

Dynamic optimal transport between RGB images by the constrained model (9) with periodic boundary conditions. The initial and final images have the same mass (Color figure online)

Fig. 9
figure 9

Example for different color and shape transitions by the constrained model (9) with periodic boundary conditions. The initial and final images do not have the same mass (Color figure online)

Further examples of the constrained model (9) for several Gaussians and real images are given in Figs. 7 and 8. The first row in Fig. 7 shows the transport of a red and a yellow Gaussian into a cyan and a blue Gaussian. The red and the blue Gaussian are spatially more extended than the yellow and the cyan one, but since yellow and cyan have higher intensities, the masses of the red and blue Gaussians are approximately the same as those of the yellow and cyan ones, respectively. This results in a very low interaction between the Gaussians during the transport, which is slightly visible in the background. Mainly, the red Gaussian is transported via a light red into cyan, and the yellow Gaussian is transported over violet to blue. The next row displays the intensity \(\frac{1}{3}(R+G+B)\) of the transported color images, in contrast to the (two-dimensional) transported intensity images in the third row. One sees slight differences, which arise from the fact that the small mass difference can be transported only spatially and not through the color channels.

The experiment is repeated in the fourth through sixth rows, but this time the yellow and the cyan Gaussian are spatially more extended, thus having a significantly higher mass than the red and the blue Gaussian, respectively. As a consequence, the interaction during the transport is higher, which is also clearly visible in the corresponding intensity images (fifth row). The color of the red Gaussian is transported similarly as before, while the yellow Gaussian splits into two parts, one of them changing (as before) over violet to blue, while the other one goes over a light green to cyan. Further, as the Gaussians do not only travel in space but also in color direction, the results are slightly smoother compared to the two-dimensional intensity transport shown in the last row.

Fig. 10
figure 10

Comparison of our constrained model (9) with periodic boundary conditions with the approach in [41] for microtextures and the polar lights. In both cases, our RGB model is at the top of the series and the microtexture model at the bottom (Color figure online)

Also for the real images we assume that they have the same mass. Indeed, the initial and final images had approximately the same overall sum of values, so that our normalization had no significant effect. In the first row of Fig. 8, a topographic map of Europe is transported into a satellite image of Europe at night. The second row displays the transport between two images of the Köhlbrandbrücke in Hamburg. In both cases one nicely sees a continuous change of color and shape during the transport.

In Fig. 9 we give further examples for the transport of several Gaussians, which may have different shapes, in order to illustrate the transport of color and shape. Here, the initial and final images have different masses. The first row shows the transport of a yellow and a red Gaussian placed at the top of the image into a green and a blue Gaussian placed at the bottom. In this example, the yellow and the blue Gaussian are spatially slightly more extended than the red and the green one, respectively. The red Gaussian changes over violet to blue. The yellow Gaussian, however, splits into two parts. While the main part is transported to green, a small part separates and changes over orange to blue. In the second row, again a red and a yellow Gaussian are transported from the top into a green and a blue Gaussian at the bottom, but this time the yellow and the green Gaussian are spatially more extended. Additionally, the Gaussians are no longer isotropic but have an ellipsoidal shape. In this case, in addition to the color, also the shape changes continuously over time. Finally, the third row displays the transport of a white Gaussian into a yellow one. It shows that no artificial colors appear during the transport; the color transition proceeds as one would expect when looking at the RGB cube.

Next we compare our approach with the approach of Rabin et al. [41] for microtextures, which is, to the best of our knowledge, the only approach that extends the dynamic optimal transport problem to a special class of color images. Note, however, that their approach is completely different from ours and works only for microtextures. Here, microtextures are textures that are robust toward phase randomization, in contrast to macrotextures, which usually contain periodic patterns with big visible elements (such as brick walls) or, more generally, texture elements that are spatially arranged, see e.g., [29]. Based on the fact that microtextures can be well modeled as multivariate Gaussian distributions, the authors of [41] propose to compute geodesics with respect to the Wasserstein distance \(W_2\) between the Gaussian distributions that are estimated from the input textures \(f_0\) and \(f_1\). This approach has the advantage that there exist closed-form solutions for the dynamic optimal transport between Gaussian measures. However, it is limited to the special class of microtextures, as natural images are not robust toward a randomization of their Fourier phase. In Fig. 10 we compare the results of our approach with the one for microtextures. In the case of microtextures, both approaches yield similar results. Note that the approach of Rabin et al. [41] may fail for images which are orthogonal at some frequencies in the Fourier domain. The second example demonstrates that the microtexture technique [41] fails for natural images which possess contours and edges.

Next, we turn to the penalized model (10). Figure 11 shows the influence of the regularization parameter \(\lambda \) when transporting a red Gaussian into a yellow one. Here, the initial and the final image have significantly different mass. The images are displayed at intermediate timepoints \(t = \frac{i}{8}, i = 0,\dots , 8\). The results change for increasing \(\lambda \) from a nearly linear interpolation of the images to a transport of the mass. Further, for large \(\lambda \) the results approach the one obtained with the constrained model (9), which is reasonable.

Fig. 11
figure 11

Comparison of penalized and constrained color optimal transport (from top to bottom) penalized optimal transport for different regularization parameters \(\lambda \in \{0.1,1,10,100\}\) and constrained optimal transport (Color figure online)

Finally, we consider the influence of the parameter \(p \in (1,2]\). The corresponding results for our penalized model (10) are given in Figs. 12 and 13. In all our experiments we observed only rather small differences. Note, however, that in [16] Wasserstein barycenters were considered for \(p=1,2,3\), which show significant differences.

Fig. 12
figure 12

Dynamic optimal transport of RGB images using the penalized model (10) with periodic boundary conditions. Comparison of \(p=2\) (top) and \(p=1.5\) (bottom) for \(\lambda = 1\). The images are displayed at intermediate timepoints \(t = \frac{i}{4}, i = 0,\dots , 4\) (Color figure online)

Fig. 13
figure 13

Dynamic optimal transport of RGB images using the penalized model (10) with periodic boundary conditions and \(\lambda =1\) for \(p=2\) and \(p=1.5\). From left to right initial images \(f_0\), \(f_1\), result for \(p=2\) and \(p=1.5\) at time \(t=0.5\) and the absolute difference between the two results (Color figure online)

Further examples and videos, in particular for real images, can be found on our website http://www.mathematik.uni-kl.de/imagepro/members/laus/color-OT.

Fig. 14
figure 14

Transport of a one-dimensional signal with sharp edges using a TV-penalized functional (20) (top), where \(\gamma = 0.03\) and the result without TV regularization (bottom)

Fig. 15
figure 15

Transport of a two-dimensional square with sharp edges using a TV-penalized functional, where \(\gamma = 0.05\). First row isotropic TV, second row anisotropic TV, third row no TV regularization (Color figure online)

6 Summary and Conclusions

Our contribution can be summarized as follows:

  1. (i)

    We propose two discrete variational models for the interpolation of RGB color images based on the dynamic optimal transport approach. To this end, we consider color images as three-dimensional objects, where the “RGB direction” is handled in a periodic way. We focus on a discrete matrix–vector approach.

  2. (ii)

    Our first model relaxes the continuity constraint so that a transport between images of different mass is possible, while the second model allows even more flexibility by just penalizing the continuity constraint with different regularization parameters.

  3. (iii)

    We provide an existence proof and briefly discuss the uniqueness of the minimizer.

  4. (iv)

    Interestingly, the step in the chosen primal-dual algorithm which takes the continuity constraint into account requires the solution of four-dimensional Poisson equations with simultaneous mirror/periodic (constrained model) or zero/mirror/periodic boundary conditions (penalized model). Here, fast sine, cosine, and Fourier transforms come into play.

  5. (v)

    We give a careful analysis of the proximal mapping of \(J_p\), \(p \in (1,2]\). This includes the determination of a starting point for the Newton method that ensures its quadratic convergence and a stable performance of the overall algorithm.

  6. (vi)

    We show numerous numerical examples.

There are several directions for future work.

  1. (1)

    One possibility is to add further priors. So far, the present model has difficulties transferring sharp contours. A remedy could be the addition of a total variation (TV) penalty with respect to f, which results for some \(\gamma >0\) in the functional

    $$\begin{aligned} \mathop {\mathrm{argmin}}_{ (m,f) \in \mathcal{C}} \big \{ \Vert J_p(S_\mathrm{m} m, S_\mathrm{f} f + f^+) \Vert _1 + \gamma \mathrm{TV}(f) \big \}. \end{aligned}$$
    (20)

    In the following we use a spatial TV term, summed over time, i.e., in one dimension

    $$\begin{aligned} \mathrm{TV}(f) = \Vert (I_P \otimes D_N^{\scriptscriptstyle {\text {T}}}) f \Vert _1. \end{aligned}$$

    Figure 14 shows the performance of such a model in the one-dimensional case. Here, the approach without the TV term leads to some overshooting and blurring at the edges. With the TV term, the transport takes place in a more natural way; in particular, the sharp edges are preserved. For the one-dimensional example this works well. However, in higher dimensions one has to be more careful. In Fig. 15, a comparison of an isotropic and an anisotropic TV regularizer is shown. The isotropic one leads, as expected, to a rounding of the corners. The anisotropic regularizer prefers horizontal and vertical edges. In this way, the shape of the object is preserved during the transport. Note that due to the smeared boundary the square appears to be smaller. To preserve the shape of arbitrary transported objects, one would have to adjust the regularizer according to the direction of the edges. The idea of penalizing TV terms for the transport can be found for gray-value images, e.g., in [12, 33]. For image denoising, a Wasserstein-TV model was successfully applied in [9, 13, 47].

  2. (2)

    Using a barycentric approach, the interpolation of microtextures in [41] also works with more than two images. So far, this task cannot be handled via the dynamic optimal transport approach. One idea might be to formulate a dynamic barycenter optimal transport problem or to use a multimarginal model, see e.g., Section 1.4.7 in [43] and the references therein.
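As a closing illustration of the TV prior in (20) from item (1) above: the one-dimensional spatial TV term \(\mathrm{TV}(f) = \Vert (I_P \otimes D_N^{\scriptscriptstyle {\text {T}}}) f \Vert _1\) has a direct matrix realization via the Kronecker product. A small sketch with illustrative sizes and a forward-difference matrix as a stand-in for \(D_N^{\scriptscriptstyle {\text {T}}}\) (the paper's scaling and transposition conventions may differ):

```python
import numpy as np

N, P = 6, 4                                     # illustrative spatial / time sizes
D = np.eye(N - 1, N, k=1) - np.eye(N - 1, N)    # forward differences, (N-1) x N
f = np.arange(N * P, dtype=float)               # f stacked time step by time step

# Spatial TV summed over time: TV(f) = || (I_P (x) D) f ||_1
tv_matrix = np.abs(np.kron(np.eye(P), D) @ f).sum()

# The same value computed slice by slice over the P time steps
tv_slices = np.abs(np.diff(f.reshape(P, N), axis=1)).sum()
print(np.isclose(tv_matrix, tv_slices))
```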