Abstract
We propose two models for the interpolation between RGB images based on the dynamic optimal transport model of Benamou and Brenier (Numer Math 84:375–393, 2000). While the application of dynamic optimal transport and its extensions to unbalanced transport has been examined for gray-value images in various papers, this is the first attempt to generalize the idea to color images. The non-trivial task of incorporating color into the model is tackled by considering RGB images as three-dimensional arrays, where the transport in the RGB direction is performed in a periodic way. Following the approach of Papadakis et al. (SIAM J Imaging Sci 7:212–238, 2014) for gray-value images, we propose two discrete variational models, a constrained one and a penalized one, which can also handle unbalanced transport. We show that a minimizer of our discrete model exists, but that it is not unique for certain initial/final images. To minimize the resulting functionals we apply a primal-dual algorithm. One step of this algorithm requires the solution of a four-dimensional discretized Poisson equation with various boundary conditions in each dimension. For instance, for the penalized approach we have simultaneously zero, mirror, and periodic boundary conditions. The solution can be computed efficiently using fast sine-I, cosine-II, and Fourier transforms. Numerical examples demonstrate the meaningfulness of our model.
1 Introduction
Color image processing is much more challenging than gray-value image processing, and usually approaches for gray-value images do not generalize in a straightforward way to color images. For example, one-dimensional histograms are a very simple but powerful tool in gray-value image processing, while it is in general difficult to exploit histograms of color images. In particular, there exist several possibilities to represent color images [10]. In this paper we consider the interpolation between two color images in the RGB space (see Fig. 2 (left)), motivated by the fluid mechanics formulation of dynamic optimal transport by Benamou and Brenier [8] and the recent approaches of Papadakis et al. [35] and [33] for gray-value images. In these works gray-value images are interpreted as two-dimensional, finitely supported density functions \(f_0\) and \(f_1\) of absolutely continuous probability measures \(\mu _0\) and \(\mu _1 \) (i.e., \(\mu _i(A)=\int _A f_i \,\mathrm {d}x\), \(i=0,1\)). In particular, we have \(\int _{\mathbb R^2} f_0 \,\mathrm {d}x= \int _{\mathbb R^2} f_1 \,\mathrm {d}x= 1\). Therewith, intermediate images are obtained as the densities \(f_t\) of the geodesic path \(\mathrm {d}\mu _t=f_t \,\mathrm {d}x\) with respect to the Wasserstein distance between \(\mu _0\) and \(\mu _1\).
In this paper, we extend the transport model to discrete RGB color images. The incorporation of color into the above approach appears to be a non-trivial task and this paper is a first proposal in this direction. We consider \(N_1 \times N_2\) RGB images as three-dimensional arrays in \(\mathbb R^{N_1,N_2,3}\), where the third direction is the “RGB direction” that contains the color information; for an illustration see Fig. 1. Particular attention has to be paid to this direction and its boundary conditions. We propose to use periodic boundary conditions, which is motivated as follows: assume we are given two color pixels \(f_0\) and \(f_1\in \mathbb {R}^{1\times 1\times 3}\). Using mirror (Neumann) boundary conditions in the third dimension, the transport of, e.g., a red pixel \(f_0 = (1,0,0)\) into a blue one \(f_1 = (0,0,1)\) goes over (0, 1, 0) (green), see Fig. 2 (middle), which is not intuitive from the viewpoint of human color perception. Furthermore, it implies that the transport depends on the order of the three color channels, which is clearly not desirable. As a remedy, we suggest to use periodic boundary conditions in the color dimension. In case of a red and a blue pixel, it yields a transport via violet, which is also what one would expect, compare Fig. 2 (right) and see also Figs. 5 and 6.
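The effect of the boundary conditions in the RGB direction can be illustrated with a toy one-pixel example that simply moves mass linearly along the channel axis (this sketch only illustrates the boundary handling, not the full dynamic model; the function name is ours):

```python
import numpy as np

def interpolate_pixel(i, j, t, periodic=True):
    """Move a unit of mass from color channel i to channel j (R=0, G=1, B=2)
    by linear displacement along the channel axis, treated either as a
    cycle (periodic) or as a plain interval (mirror)."""
    if periodic:
        d = (j - i + 1) % 3 - 1          # signed step along the shortest cyclic path
        pos = (i + t * d) % 3
    else:
        pos = i + t * (j - i)            # straight path through the interior channels
    lo = int(np.floor(pos)) % 3
    frac = pos - np.floor(pos)
    out = np.zeros(3)
    out[lo] += 1.0 - frac                # split the mass between the two
    out[(lo + 1) % 3] += frac            # channels nearest to the current position
    return out

# red -> blue at half time: the periodic path passes through violet (an R/B mix),
# while the mirror path passes through green
```

For \(f_0 = (1,0,0)\) and \(f_1 = (0,0,1)\) the periodic midpoint is \((0.5, 0, 0.5)\) (violet), while the mirror midpoint is \((0,1,0)\) (green), matching Fig. 2.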
In the following, we propose two variational models for the transport of color images. To handle also the case of unequal masses, the mass conservation constraint is relaxed. Our first model contains the continuity equation as a constraint. It turns out that it generalizes the technique which was proposed in [35] for gray-value images. The second model penalizes the continuity equation, similarly to what was considered for (continuous) gray-value images in [33]. Other approaches to unbalanced optimal transport can be found, e.g., in [7, 18, 26, 28].
The interpolation based on an optimal transport model for a special class of images, namely so-called microtextures, has been addressed by Rabin et al. in [37, 41]. The authors show that microtextures can be well modeled as a realization of a Gaussian random field. In this case, theoretical results guarantee that the intermediate measures \(\mu _t\) are Gaussian as well and they can be stated explicitly in terms of the means and covariance matrices learnt from the given images \(f_0\) and \(f_1\). The idea can be generalized for interpolating between more than two microtextures using barycentric coordinates. The approach fails for some special microtexture settings which usually do not appear in practice.
Color interpolation between images of the same shape can be realized by switching to the HSV or HSI space. Then only the hue component has to be transferred, e.g., by a dynamic extension of the model for the transfer of cyclic measures (periodic histograms) of Delon et al. [25, 40]. For an example we refer to [27]. This idea is closely related to the affine model for color image enhancement in [34]. However, these approaches transfer only the color and leave the original edge structure of the image untouched.
Finally we mention that the interpolation between images can also be tackled by other sophisticated techniques such as metamorphoses, see [49]. These approaches are beyond the scope of this paper. A combination of the optimal dynamic transport model with the metamorphosis approach was proposed in [33].
The outline of our paper is as follows: in Sect. 2 we recall basic results from the theory of optimal transport. At this point, we deal with general \(p \in (1,2]\) instead of just \(p=2\) as in [35]. We propose discrete dynamic transport models in Sect. 3. Here we prefer to give a matrix–vector notation of the problem to make it more intuitive from a linear algebra point of view. We prove the existence of a minimizer and show that there are special settings where the minimizer is not unique. In Sect. 4 we solve the resulting minimization problems by primal-dual minimization algorithms. It turns out that one step of the algorithm requires the solution of a four-dimensional Poisson equation which includes various boundary conditions and can be handled by fast trigonometric transforms. Another step involves finding the positive root of a polynomial of degree \(2q-1\), where \(\frac{1}{p} + \frac{1}{q} =1\) and \(p \in (1,2]\). For this task we propose to use Newton's algorithm and determine an appropriate starting point to ensure its quadratic convergence. Section 5 shows numerical results, some of which were also reported at the SampTA conference 2015 [27]. More examples can be found on our website http://www.mathematik.uni-kl.de/imagepro/members/laus/color-OT. Finally, Sect. 6 contains conclusions and ideas for future work. In particular, additional priors may be used to improve the dynamic transport, e.g., a total variation prior to avoid smearing effects. The Appendix reviews the diagonalization of certain discrete Laplace operators and provides basic rules for tensor product computations. Further, it contains some technical proofs.
2 Dynamic Optimal Transport
In this section we briefly review some basic facts on the theory of optimal transport. For further details we refer to, e.g., [1, 43, 50].
Let \(\mathcal {P}(\mathbb {R}^d)\) be the space of probability measures on \(\mathbb {R}^d\) and \(\mathcal {P}_p(\mathbb {R}^d)\), \(p \in [1,\infty )\) the Wasserstein space of measures having finite p-th moments,
$$\begin{aligned} \mathcal {P}_p(\mathbb {R}^d) := \Big \{ \mu \in \mathcal {P}(\mathbb {R}^d) : \int _{\mathbb {R}^d} |x|^p \,\mathrm {d}\mu (x) < \infty \Big \}. \end{aligned}$$
For \(\mu _0, \mu _1\in \mathcal {P}(\mathbb {R}^d)\) let \(\varPi (\mu _0,\mu _1)\) be the set of all probability measures on \(\mathbb {R}^d\times \mathbb {R}^d\) whose marginals are equal to \(\mu _0\) and \(\mu _1\). Therewith, the optimal transport problem (Kantorovich problem) reads as
$$\begin{aligned} \min _{\pi \in \varPi (\mu _0,\mu _1)} \int _{\mathbb {R}^d \times \mathbb {R}^d} |x-y|^p \,\mathrm {d}\pi (x,y). \end{aligned}$$
One can show that for \(p \in [1,\infty )\) a minimizer exists, which is uniquely determined for \(p>1\) and is also called the optimal transport plan. In the special case of the one-dimensional optimal transport problem, if the measure \(\mu _0\) is non-atomic, the optimal transport plan is the same for all \(p \in (1,\infty )\) and can be stated explicitly in terms of the cumulative distribution functions of the involved measures. The minimal value
$$\begin{aligned} W_p(\mu _0,\mu _1) := \Big ( \min _{\pi \in \varPi (\mu _0,\mu _1)} \int _{\mathbb {R}^d \times \mathbb {R}^d} |x-y|^p \,\mathrm {d}\pi (x,y) \Big )^{\frac{1}{p}} \end{aligned}$$
defines a distance on \(\mathcal {P}_p(\mathbb {R}^d)\), the so-called Wasserstein distance.
Wasserstein spaces \(\left( \mathcal{P}_p(\mathbb {R}^d), W_p(\mathbb {R}^d)\right) \) are geodesic spaces. In particular, there exists for any \(\mu _0,\mu _1 \in \mathcal{P}_p(\mathbb R^{d})\) a geodesic \(\gamma :[0,1] \rightarrow \mathcal{P}_p(\mathbb R^{d})\) with \(\gamma (0) = \mu _0\) and \(\gamma (1) = \mu _1\). For interpolating our images we ask for \(\mu _t = \gamma (t)\), \(t \in [0,1]\).
At least theoretically there are several ways to compute \(\mu _t\). If the optimal transport plan \(\pi \) is known, then \(\mu _t = \mathcal{L}_t {}_\# \pi \, {:=} \pi \circ \mathcal{L}_t^{-1}\) yields the geodesic path, where \(\mathcal{L}_t:\mathbb R^{d} \times \mathbb R^{d} \rightarrow \mathbb R^{d}\), \(\mathcal{L}_t(x,y) = (1-t)x + ty\) is the linear interpolation map, see further [43]. This requires the knowledge of the optimal transport plan \(\pi \) and of \(\mathcal{L}_t^{-1}\). At the moment there are efficient ways for computing the optimal transport plan \(\pi \) only in special cases, in particular in the one-dimensional case by an ordering procedure and for Gaussian distributions in the case \(p=2\) using their means and covariance matrices. For \(p=2\) one can also use the fact that \(\pi \) is indeed induced by a transport map \(T:\mathbb R^d \rightarrow \mathbb R^d\), i.e., \(\pi = (\mathrm{id}, T)_{\#}\mu _0\), which can be written as \(T = \nabla \psi \), where \(\psi \) fulfills the Monge–Ampère equation, see [15]. However, this second-order nonlinear elliptic PDE is numerically hard to solve and so far, only some special cases were considered [6, 19, 20, 32]. Other numerical techniques to compute optimal transport plans have been proposed, e.g., in [2, 30, 44, 45]. Another approach relaxes the problem by minimizing an entropy-regularized Wasserstein distance instead. Such distances can be computed more efficiently by the Sinkhorn algorithm and were applied within a barycentric approach by Cuturi et al. [21, 22].
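For the one-dimensional case mentioned above, the optimal plan is given by the monotone (sorting) rearrangement, and the geodesic moves each matched pair linearly. A minimal sketch for empirical measures with equally many, uniformly weighted samples (function names are ours):

```python
import numpy as np

def displacement_interpolation(x0, x1, t):
    """Geodesic interpolation between two 1D point clouds of equal size and
    uniform weights: sort both and move each matched pair linearly; this
    realizes mu_t = (L_t)_# pi for the monotone optimal plan pi."""
    a, b = np.sort(x0), np.sort(x1)
    return (1.0 - t) * a + t * b

def wasserstein_p(x0, x1, p=2):
    """W_p distance between the corresponding empirical measures."""
    a, b = np.sort(x0), np.sort(x1)
    return np.mean(np.abs(a - b) ** p) ** (1.0 / p)
```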
The approach in this paper was inspired by the one of Benamou and Brenier in [8]. It involves the velocity field \(\mathrm {v}:[0,1]\times \mathbb {R}^d\rightarrow \mathbb {R}^d\) of the geodesic curve joining \(\mu _0\) and \(\mu _1\). This velocity field \(\mathrm{v}(t, \cdot )\) has constant speed \(\Vert \mathrm{v}(t, \cdot )\Vert _{L^p(\mu _t)} = W_p(\mu _0,\mu _1)\). It can be shown that it minimizes the energy functional
and fulfills the continuity equation
$$\begin{aligned} \partial _t \mu _t + \nabla _x \cdot (\mathrm{v} \mu _t) = 0, \end{aligned}$$
where we say that \(t \mapsto \mu _t\) is a measure-valued solution of the continuity equation if for all compactly supported \(\phi \in C^{1}\bigl ((0,1)\times \mathbb R^d\bigr )\) and \(T\in (0,1)\) the relation
holds true. For more details we refer to [43].
Assuming the measures \(\mu _0\) and \(\mu _1\) to be absolutely continuous with respect to the Lebesgue measure, i.e., \(\mathrm {d}\mu _i = f_i \,\mathrm {d}x\), \(i=0,1\), theoretical results (see for instance [50, Theorem 8.7]) guarantee that the same holds true for \(\mathrm {d}\mu _t = f_t \,\mathrm {d}x\), where \(f_t\) can be obtained as the minimizer over \(\mathrm{v},f\) of
$$\begin{aligned} \int _0^1 \int _{\mathbb {R}^d} \frac{1}{p} |\mathrm{v}(t,x)|^p f(t,x) \,\mathrm {d}x \,\mathrm {d}t \end{aligned}$$(1)
subject to the continuity equation
$$\begin{aligned} \partial _t f + \nabla _x \cdot (f \mathrm{v}) = 0, \end{aligned}$$
where we suppose \(\cup _{t \in [0,1]}{{\mathrm{supp}}}f(t,\cdot ) \subseteq [0,1]^d\) with appropriate (spatial) boundary conditions. Unfortunately, the energy functional (1) is not convex in f and \(\mathrm{v}\). As a remedy, Benamou and Brenier suggested in the case \(p=2\) a change of variables \( (f,\mathrm{v})\mapsto (f,f\mathrm{v}) = (f,m). \) This idea can be generalized to \(p \in (1,\infty )\), see [43], which results in the functional
$$\begin{aligned} \int _0^1 \int _{\mathbb {R}^d} J_p\big (m(t,x), f(t,x)\big ) \,\mathrm {d}x \,\mathrm {d}t, \end{aligned}$$(2)
where \(J_p:\mathbb R^d \times \mathbb R \rightarrow \mathbb R \cup \{+ \infty \}\) is defined as
$$\begin{aligned} J_p(x,y) := \left\{ \begin{array}{ll} \frac{|x|^p}{p \, y^{p-1}} &{} \mathrm{if} \; y > 0,\\ 0 &{} \mathrm{if} \; (x,y) = (0,0),\\ +\infty &{} \mathrm{otherwise,} \end{array} \right. \end{aligned}$$(3)
and \(|\cdot |\) denotes the Euclidean norm. This functional has to be minimized subject to the continuity equation
$$\begin{aligned} \partial _t f + \nabla _x \cdot m = 0, \end{aligned}$$(4)
equipped with appropriate spatial boundary conditions.
Remark 1
The function \(J_p:\mathbb R^d \times \mathbb R \rightarrow \mathbb R \cup \{+ \infty \}\) defined in (3) is the perspective function of \(\psi (s) = \frac{1}{p} |s|^p\), i.e., \(J_p (x,y) = y \psi \left( \frac{x}{y} \right) \). For properties of perspective functions see, e.g., [23]. In particular, since \(\psi \) is convex for \(p \in (1,\infty )\), its perspective \(J_p (x,y)\) is also convex. Further, \(J_p (x,y)\) is lower semi-continuous and positively homogeneous, i.e., \(J_p (\lambda x, \lambda y) = \lambda J_p (x,y)\) for all \(\lambda > 0\).
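The properties collected in Remark 1 are easy to verify numerically; a small sketch assuming the usual case-wise formula for the perspective function (the function name is ours):

```python
import numpy as np

def J_p(x, y, p=2):
    """Perspective function of psi(s) = |s|^p / p:
    J_p(x, y) = y * psi(x / y) for y > 0, 0 at (0, 0), +inf otherwise."""
    r = np.linalg.norm(np.atleast_1d(x))
    if y > 0:
        return r ** p / (p * y ** (p - 1))
    return 0.0 if r == 0.0 and y == 0.0 else np.inf
```

Positive homogeneity and convexity then show up as \(J_p(\lambda x, \lambda y) = \lambda J_p(x,y)\) and as midpoint inequalities on sample points.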
3 Discrete Transport Models
In practice we are dealing with discrete images whose pixel values are given on a rectangular grid. To get a discrete version of the minimization problem we have to discretize both the integration operator in (2) by a quadrature rule and the differential operators in the continuity equation (4). The discretization of the continuity equation requires the evaluation of discrete “partial derivatives” in time as well as in space. In order to avoid solutions suffering from the well-known checkerboard effect (see for instance [36]) we adopt the idea of a staggered grid as in [35], see Fig. 3.
The differential operators in space and time are discretized by forward differences, and depending on the boundary conditions this results in the use of difference matrices of the form
For the integration we apply a simple midpoint rule. To handle this part, we introduce the averaging/interpolation matrices
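For one spatial dimension these difference and averaging matrices can be sketched as follows (we omit the grid scaling by N and fix one sign convention; helper names are ours):

```python
import numpy as np

def diff_mirror(N):
    """Forward differences with mirror (zero-flux) boundaries: maps the N-1
    interior face values of m to N cellwise differences; the two boundary
    faces are fixed to zero since there is no flow over the boundary."""
    D = np.zeros((N, N - 1))
    for k in range(1, N):        # interior face k lies between cells k-1 and k
        D[k - 1, k - 1] = 1.0
        D[k, k - 1] = -1.0
    return D

def diff_periodic(N):
    """Forward differences with periodic boundaries (a circulant matrix)."""
    D = -np.eye(N)
    for j in range(N):
        D[j, (j + 1) % N] = 1.0
    return D

def avg_mirror(N):
    """Midpoint averaging of the N-1 interior face values onto the N cells;
    the zero boundary faces contribute nothing."""
    return 0.5 * np.abs(diff_mirror(N))
```

In both cases the columns of the difference matrices sum to zero, so summing the discrete continuity equation over all cells telescopes, which is the discrete analogue of mass conservation.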
Discretization for one spatial dimension + time In the following, we derive the discretization of (2) for one spatial direction, i.e., for the transport of signals. The problem can be formulated in a simple matrix–vector form using tensor products of matrices which makes it rather intuitive from the linear algebra point of view. Moreover, it will be helpful for deriving the fast trigonometric transforms which will play a role within our algorithm. The generalization of our approach to higher dimensions is straightforward and can be found in Appendix 3.
We want to organize the transport between two given one-dimensional, non-negative discrete signals
We are looking for the intermediate signals \(f_t\) for \(t = \frac{k}{P}\), \(k=1, \ldots , P-1\). Using the notation \(f_t(x) = f(x,t)\), we want to find \(f(\frac{j-1/2}{N},\frac{k}{P})_{j=1,k=1}^{N,P-1} \in \mathbb {R}^{N,P-1}\). For m we have to take the boundary conditions into account. In the case of mirror boundary conditions, there is no flow over the boundary, and since \(m = f\mathrm v\) the value of m is zero at the boundary at all times. In the periodic case both boundaries of m coincide. The values of m are taken at the cell faces \(\frac{j}{N}\), \(j = \kappa ,\ldots ,N-1\) and time \(\frac{k-\tfrac{1}{2}}{P}\), \(k=1,\ldots ,P\), i.e., we are looking for \(\left( m(\frac{j}{N},\frac{k-\tfrac{1}{2}}{P}) \right) _{j=\kappa ,k=1}^{N-1,P} \in \mathbb {R}^{N-\kappa ,P}\), where \(\kappa = 1\) for mirror and \(\kappa = 0\) for periodic boundary conditions.
The midpoints for the quadrature rule are computed by averaging the neighboring two values of m and f, respectively. To give a sound matrix–vector notation of the discrete minimization problem we reorder m and f columnwise into vectors \(\mathrm{vec}(f) \in \mathbb {R}^{N(P-1)}\) and \(\mathrm{vec} (m) \in \mathbb {R}^{(N-\kappa )P}\), which we again denote by f and m. For the \(\mathrm{vec}\) operator in connection with the tensor product \(\otimes \) of matrices we refer to Appendix 2. More specifically, let \(I_n\in \mathbb {R}^{n,n}\) be the identity matrix and set
Finally, we introduce the vectors
where we denote by \(\mathbf{0}\) (and \(\mathbf{1}\)) arrays of appropriate size with entries 0 (and 1). They are used to guarantee that the boundary conditions are fulfilled. Now the continuity equation (4) together with the boundary conditions (5) for f can be reformulated as requirement that (m, f) has to lie within the hyperplane
We will see in Proposition 3 that \(A A^{\scriptscriptstyle {\text {T}}}\) is rank one deficient. Since further \(\varvec{1}^{\scriptscriptstyle {\text {T}}}A = \varvec{0}\), we conclude that the under-determined linear system in (6) has a solution if and only if \(\varvec{1}^{\scriptscriptstyle {\text {T}}}f^- = 0\), i.e., if and only if \(f_0\) and \(f_1\) have the same mass
This reflects the fact that dynamic optimal transport is performed between probability measures. The interpretation of a color image as a probability density function has a major drawback: to represent a valid density, the sum of all RGB pixel values of the given images, i.e., the sum of the image intensity values, has to be one (or at least equal). Therefore, we consider more generally the set
Note that the boundary conditions (5) for f are preserved, while the mass conservation (7) is no longer required. Clearly, if (7) holds true, then \(\mathcal{C}\) coincides with \(\mathcal{C}_0\). Let \(\iota _{\mathcal {C}}\) denote the indicator function of \(\mathcal{C}\) defined by
For \(p \in (1,2]\), we consider the following transport problem:
Constrained Transport Problem
Here, the application of \(J_p\) is meant componentwise and the summation over its (non-negative) components is addressed by the \(\ell _1\)-norm. The interpolation operators \(S_\mathrm{m}\) and \(S_\mathrm{f}\) arise from the midpoint rule for computing the integral.
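The structural claims around (6), namely \(\varvec{1}^{\scriptscriptstyle {\text {T}}}A = \varvec{0}\) and the rank one deficiency of \(A A^{\scriptscriptstyle {\text {T}}}\), can be checked on a small instance. The following sketch assembles A for one spatial dimension with mirror boundary conditions, omitting the grid scalings N and P, which affect neither the column sums nor the rank (helper names are ours):

```python
import numpy as np

def diff_interior(n):
    """n x (n-1) forward-difference matrix acting on interior values; the
    two boundary values are fixed (zero flux for m, the given images f_0
    and f_1 for f) and hence moved to the right-hand side."""
    D = np.zeros((n, n - 1))
    for k in range(1, n):
        D[k - 1, k - 1] = 1.0
        D[k, k - 1] = -1.0
    return D

N, P = 4, 3                                   # spatial cells, time steps
D_m = np.kron(np.eye(P), diff_interior(N))    # spatial differences per time slice
D_f = np.kron(diff_interior(P), np.eye(N))    # time differences per spatial cell
A = np.hstack([D_m, D_f])                     # continuity-equation operator (unscaled)
```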
We can relax the continuity assumption \((m,f) \in \mathcal{C}\) further by replacing it by
where \(\tau \ge \tau _0 :=\min _{(m,f)} \Vert ( D_\mathrm{m} | D_\mathrm{f} ) ( m^{\scriptscriptstyle {\text {T}}},f^{\scriptscriptstyle {\text {T}}})^{\scriptscriptstyle {\text {T}}}-f^-\Vert _2^2\). For \(\tau = \tau _0\) we have again problem (9). Since there is a correspondence between the solutions of such constrained problems with parameter \(\tau \) and the penalized problem with a corresponding parameter \(\lambda \), see [3, 42, 48], we prefer to consider the following penalized problem with regularization parameter \(\lambda >0\):
Penalized Transport Problem
Note that also for our penalized model the boundary conditions (5) for f still hold true. In general, both models (9) and (10) do not guarantee that the values of f stay within the RGB cube during the transport. This is not a specific problem for color images but can appear for gray-value images as well (gamut problem). A usual way out is a final backprojection onto the image range. An alternative in the constrained model is a simple modification of the constraining set in (8) towards \( \mathcal {C} :=\mathop {\mathrm{argmin}}_{m,f \in [0,1]^3} \Vert ( D_\mathrm{m} | D_\mathrm{f} ) \begin{pmatrix} m\\ f \end{pmatrix} -f^-\Vert _2^2\). This leads to inner iterations of the Poisson solver and a projection onto the cube in the subsequent Algorithm 1. In the penalized problem, the term \(\iota _{[0,1]^3}\) could be added. However, in all our numerical experiments we observed only very small violations of the range constraint, which are most likely due to numerical inaccuracies.
A penalized model for the continuous setting and gray-value images was examined in [33]. For recent papers on unbalanced transport we refer to [7, 18, 26, 28].
To show the existence of a solution of the discrete transport problems we use the concept of asymptotically level stable functions. As usual, for a function \(F:\mathbb R^n \rightarrow \mathbb R \cup \{+\infty \}\) and \(\mu >\inf _x F(x)\), the level sets are defined by
By \(F_\infty \) we denote the asymptotic (or recession) function of F which according to [24], see also [4, Theorem 2.5.1], can be computed by
The following definition of asymptotically level stable functions is taken from [4, p. 94]: a proper and lower semi-continuous function \(F:\mathbb {R}^{n} \rightarrow \mathbb {R}\cup \{+\infty \}\) is said to be asymptotically level stable if for each \(\rho >0\), each real-valued, bounded sequence \(\{\mu _k\}_k\) and each sequence \(\{x_k\}_k\) satisfying
there exists \(k_0\) such that
If for each real-valued, bounded sequence \(\{\mu _k\}_k\) there exists no sequence \(\{x_k\}_k\) satisfying (11), then F is automatically asymptotically level stable. In particular, coercive functions are asymptotically level stable. It was originally shown in [5] (without the notion of asymptotically level stable functions) that any asymptotically level stable function F with \(\inf F>-\infty \) has a global minimizer. A proof was also given in [4, Corollary 3.4.2]. With these preliminaries we can prove the existence of minimizers of our transport models.
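As a simple illustration (our example, not taken from the paper), the asymptotic function of a quadratic data term of the kind appearing in \(E_\lambda\) can be computed directly from the definition \(F_\infty(x) = \liminf_{x' \to x,\, t \to \infty} F(t x')/t\):

```latex
% Asymptotic function of F(x) = \lambda \| A x - b \|_2^2:
F_\infty(x)
  = \lim_{t \to \infty}
    \frac{\lambda \bigl( t^2 \| A x \|_2^2 - 2 t \langle A x, b \rangle + \| b \|_2^2 \bigr)}{t}
  = \begin{cases} 0 & \text{if } A x = 0, \\ +\infty & \text{otherwise.} \end{cases}
```

Since \(J_p\) is positively homogeneous (Remark 1), it coincides with its own asymptotic function; combining both observations leads to kernel conditions of the type collected in (12).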
Proposition 1
The discretized dynamic transport models (9) and (10) have a solution.
Proof
We show that the proper, lower semi-continuous functions E and \(E_\lambda \) are asymptotically level stable which implies the existence of a minimizer. For the penalized problem, the asymptotic function \(E_{\lambda ,\infty }\) reads
We obtain
Thus, \((\tilde{m},\tilde{f})\in \ker (E_{\lambda ,\infty })\) implies
For the constrained problem we have the same implications so that we can restrict our attention to the penalized one. By the definition of \(S_{\mathrm{m}}\) we obtain \(\mathrm{ker } (S_{\mathrm{m}}) = \left\{ w \otimes \tilde{\varvec{1}}: w \in \mathbb R^P\right\} \) for periodic boundary conditions and even N and \(\mathrm{ker } (S_{\mathrm{m}}) = \{\varvec{0}\}\) otherwise, where \(\tilde{\mathbf{1}}\) is defined as \(\tilde{\mathbf{1}} = (1,-1,\ldots ,1,-1)^{\scriptscriptstyle {\text {T}}}\in \mathbb R^N\). In the case \(\mathrm{ker} (S_{\mathrm{m}}) = \{\varvec{0}\}\), the first and second condition in (12) imply \(D_{\mathrm{f}} \tilde{f} = 0\) so that by the definition of \(D_{\mathrm{f}}\) it holds \(\tilde{f} = 0\). In the other case, we obtain by the first condition in (12) that \(\tilde{f} = -D_{\mathrm{f}}^\dagger D_{\mathrm{m}} \tilde{m}\), where \( D_{\mathrm{f}}^\dagger = (D_{\mathrm{f}}^{\scriptscriptstyle {\text {T}}}D_{\mathrm{f}})^{-1}D_{\mathrm{f}}^{\scriptscriptstyle {\text {T}}}\) denotes the Moore-Penrose inverse of \(D_{\mathrm{f}}\). Then
for some \(w \in \mathbb R^P\). Straightforward computation shows
for some \(\widetilde{w}\in \mathbb R^P\). Now the third condition in (12) can only be fulfilled if \(\tilde{w} = \varvec{0}\). Consequently we have in both cases
Let \(\rho >0\), \(\{\mu _k\}_k\) be a bounded sequence and \(\{(m_k,f_k)\}_k\) be a sequence fulfilling (11). By (12) and (13) we conclude
Since \((m_k,f_k)\in {{\mathrm{lev}}}(E_\lambda ,\mu _k)\), this shows that \((m_k,f_k)-\rho (\tilde{m},\tilde{f})\in {{\mathrm{lev}}}(E_\lambda ,\mu _k)\) as well and finishes the proof. \(\square \)
Unfortunately, \(J_p(u,v)\) is not strictly convex on its domain, as can be deduced from the following proposition.
Proposition 2
For any two minimizers \((m_i,f_i)\), \(i=1,2\) of (9) the relation
holds true.
Proof
We use the perspective function notation from Remark 1. For \(\lambda \in (0,1)\) and \((u_i,v_i)\) with \(v_i >0\), \(i=1,2\), we have (componentwise)
and if \(\frac{u_1}{v_1} \not = \frac{u_2}{v_2}\) by the strict convexity of \(\psi \) that
Setting \(u_i :=S_\mathrm{m} m_i\) and \(v_i:=S_\mathrm{f} f_i + f^-\), \(i=1,2\), we obtain the assertion. \(\square \)
Remark 2
For periodic boundary conditions, even N and \(f_1 = f_0 + \gamma \tilde{\varvec{1}}\), \(\gamma \in [0,\min f_0 )\) the minimizer of (9) is not unique. This can be seen as follows: obviously, we would have a minimizer (m, f) if \(m = w \otimes \tilde{\varvec{1}} \in \mathrm{ker}(S_\mathrm{m})\) for some \(w \in \mathbb R^P\) and there exists \(f\ge 0\) which fulfills the constraints. Setting \(f^{k/P} := f(j-1/2,k)_{j=1}^N\), \(k=0,\ldots ,P\), these constraints read \(- 2 P w \otimes \tilde{\varvec{1}} = P (f^{(k-1)/P} - f^{k/P})_{k=1}^P\). Thus, any \(w \in \mathbb R^P\) such that
are non-negative vectors provides a minimizer of (9). We conjecture that the solution is unique in all other cases, but have no proof so far.
4 Primal-Dual Minimization Algorithm
4.1 Algorithms
For the minimization of our functionals we apply the primal-dual algorithm known as Chambolle-Pock algorithm [17, 38] in the form of Algorithm 8 in [14]. We use the following reformulation of the problems:
Constrained Transport Problem
Penalized Transport Problem
In the following we detail the first two steps of Algorithms 1 and 2:
-
Step 1 of Algorithm 1 requires the projection onto \(\mathcal{C}\),
-
Step 1 of Algorithm 2 results in the solution of a linear system of equations with coefficient matrix \(\lambda A^{\scriptscriptstyle {\text {T}}}A + \frac{1}{\tau } I\) whose Schur complement can be computed via fast trigonometric transforms,
-
Step 2 of both algorithms is the proximal map of \(J_p\).
4.2 Projection onto \(\mathcal{C}\)
Step 1 of Algorithm 1 requires finding the orthogonal projection of \(a :=\begin{pmatrix} m^{(r)} \\ f^{(r)} \end{pmatrix} - \tau \sigma \begin{pmatrix} S_\mathrm{m}^{\scriptscriptstyle {\text {T}}}\bar{b}_u^{(r)}\\ S_\mathrm{f}^{\scriptscriptstyle {\text {T}}}\bar{b}_v^{(r)} \end{pmatrix}\) onto \(\mathcal{C}\). This means that we have to find a minimizer of \(\Vert A x - f^-\Vert _2\) for which \(\Vert a-x\Vert _2\) attains its smallest value. Substituting \(y :=a-x\) we are looking for a minimizer y of \(\Vert A y - A a + f^-\Vert _2\) with smallest norm \(\Vert y\Vert _2\). By [11, Theorem 1.2.10], this minimizer is uniquely determined by \(A^\dagger (A a - f^-)\). Therefore the projection of a onto \(\mathcal{C}\) is given by
$$\begin{aligned} P_{\mathcal {C}}(a) = a - A^\dagger (A a - f^-) = a - A^{\scriptscriptstyle {\text {T}}}(A A^{\scriptscriptstyle {\text {T}}})^\dagger (A a - f^-). \end{aligned}$$
Note that the projection onto \(\mathcal{C}\) coincides with the one onto \(\mathcal{C}_0\) if the given images \(f_0\) and \(f_1\) have the same mass. The Moore-Penrose inverse of the square matrix \(A A^{\scriptscriptstyle {\text {T}}}\) is defined as follows: let \( A A^{\scriptscriptstyle {\text {T}}}\) have the spectral decomposition
Then it holds
The following proposition shows the form of \((A A^{\scriptscriptstyle {\text {T}}})^\dagger \) in the one-dimensional spatial case. It turns out that the projection onto \(\mathcal{C}\) amounts to solving a two-dimensional Poisson equation, which can be realized, depending on the boundary conditions, by fast cosine and Fourier transforms in \(\mathcal {O}(NP\log (NP))\) operations.
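For small problem sizes, the projection \(P_{\mathcal {C}}(a) = a - A^\dagger (A a - f^-)\) derived above can be sketched directly with a dense pseudoinverse (in practice the fast-transform solver of Proposition 3 replaces this; helper names are ours):

```python
import numpy as np

def project_onto_C(A, f_minus, a):
    """Orthogonal projection of a onto C = argmin_x ||A x - f^-||_2:
    P_C(a) = a - A^dagger (A a - f^-). A dense sketch of Step 1 of
    Algorithm 1 using numpy's pseudoinverse in place of the fast
    Poisson solver."""
    return a - np.linalg.pinv(A) @ (A @ a - f_minus)

# small rank-deficient example (A A^T is singular, as in Proposition 3)
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))
A[3] = A[0] - A[1]                 # force a rank drop
a, f_minus = rng.standard_normal(6), rng.standard_normal(4)
x = project_onto_C(A, f_minus, a)
```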
Proposition 3
Let \( C_N := \sqrt{\tfrac{2}{N}} \left( \epsilon _j \cos \frac{j(2k+1)\pi }{2N} \right) _{j,k=0}^{N-1} \) with \(\epsilon _0 := 1/\sqrt{2}\) and \(\epsilon _j := 1\), \(j=1,\ldots ,N-1\) be the N-th cosine matrix and \( F_N := \sqrt{\tfrac{1}{N}} \left( \mathrm {e}^{\frac{-2\pi \mathrm {i}jk}{N}} \right) _{j,k=0}^{N-1} \) be the N-th Fourier matrix. Set \(\mathrm{d}_N^{\mathrm{mirr}} := \left( 4 \sin ^2 \frac{k\pi }{2N}\right) _{k=0}^{N-1}\) and \(\mathrm{d}^{\mathrm{per}}_N := \left( 4 \sin ^2 \frac{k\pi }{N}\right) _{k=0}^{N-1}\). Then the Moore-Penrose inverse \((A A^{\scriptscriptstyle {\text {T}}})^\dagger \) in (16) is given by
where
and \(\tilde{\mathrm{d}}_j := \frac{1}{\mathrm{d}_j}\) if \(\mathrm{d}_j >0\) and \(\tilde{\mathrm{d}}_j := 0\) otherwise.
The proof is given in Appendix 3.
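The diagonalization underlying Proposition 3 can be verified numerically: the cosine matrix \(C_N\) is orthogonal and diagonalizes the one-dimensional mirror-boundary Laplacian \(D D^{\scriptscriptstyle {\text {T}}}\) with eigenvalues \(\mathrm{d}_N^{\mathrm{mirr}}\). A small sketch that builds \(C_N\) densely instead of calling a fast DCT (helper names are ours):

```python
import numpy as np

def cosine_matrix(N):
    """The orthogonal cosine-II matrix C_N from Proposition 3."""
    C = np.zeros((N, N))
    for j in range(N):
        eps = 1.0 / np.sqrt(2.0) if j == 0 else 1.0
        for k in range(N):
            C[j, k] = np.sqrt(2.0 / N) * eps * np.cos(j * (2 * k + 1) * np.pi / (2 * N))
    return C

def laplacian_mirror(N):
    """D D^T for forward differences with zero-flux boundaries: tridiagonal
    with diagonal (1, 2, ..., 2, 1) and off-diagonals -1."""
    L = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
    L[0, 0] = L[-1, -1] = 1.0
    return L

N = 8
C = cosine_matrix(N)
d = 4.0 * np.sin(np.arange(N) * np.pi / (2 * N)) ** 2   # d_N^mirr, contains one zero
```

In the algorithm the multiplication by \(C_N\) is of course replaced by a fast cosine transform, giving the \(\mathcal {O}(NP\log (NP))\) complexity mentioned above.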
4.3 Schur Complement of \(\lambda A^{\scriptscriptstyle {\text {T}}}A + \frac{1}{\tau } I\)
To find the minimizer in Step 1 of Algorithm 2 we set the gradient of the functional to zero which results in the solution of the linear system of equations
Noting that \(\lambda A^{\scriptscriptstyle {\text {T}}}A + \frac{1}{\tau } I\) is a symmetric and positive definite matrix, this linear system can be solved using standard conjugate gradient methods. Alternatively, the next proposition shows how the inverse \((\lambda A^{\scriptscriptstyle {\text {T}}}A + \frac{1}{\tau } I)^{-1}\) can be computed explicitly with the help of the Schur complement and fast sine, cosine, and Fourier transforms. The proposition refers to the one-dimensional spatial setting but can be generalized to the three-dimensional case in a straightforward way using the results of Appendix 3.
Proposition 4
Let \( S_{N-1} := \sqrt{\tfrac{2}{N}} \left( \sin \frac{jk\pi }{N} \right) _{j,k=1}^{N-1} \) and \( \mathrm{d}^{\mathrm{zero}}_{N-1} := \left( 4 \sin ^2 \frac{k\pi }{2N}\right) _{k=1}^{N-1} \). Then the inverse of the matrix \( \lambda A^{\scriptscriptstyle {\text {T}}}A + \frac{1}{\tau } I \) is given by
where
(i) for mirror boundary conditions
(ii) for periodic boundary conditions
The proof is given in Appendix 3.
4.4 Proximal Map of \({ J}_p\)
Step 2 of Algorithm 1 consists of an evaluation of the proximal map \(\mathrm{prox}_{\frac{1}{\sigma } J_p}\) of \(J_p\). This can be done using the proximal map \(\mathrm{prox}_{ J^{*}_p}\) of the conjugate function \(J^{*}_p\) and Moreau's identity \(\mathrm{prox}_\phi (t) + \mathrm{prox}_{\phi ^*}(t) = t\). Therefore, we first state the conjugate function \(J^{*}_p\).
Lemma 1
For \(p \in (1,+\infty )\) and \(\frac{1}{p} + \frac{1}{q} = 1\) we have
$$\begin{aligned} J_p^* = \iota _{\mathcal {K}_p}, \end{aligned}$$
where
$$\begin{aligned} \mathcal {K}_p := \left\{ (x,y) \in \mathbb R^d \times \mathbb R : y + \tfrac{1}{q} |x|^q \le 0 \right\} . \end{aligned}$$
For the proof we refer to [31] or [43, Lemma 5.17]. After this preparation we are now able to compute \(\mathrm{prox}_{\frac{1}{\sigma } J_p}\).
Proposition 5
Let \(p \in (1,2]\) and \(\frac{1}{p} + \frac{1}{q} = 1\).
-
(i)
Then for \(x^* \in \mathbb R^d\), \(y^* \in \mathbb R\) and \(\sigma > 0\) it holds
$$\begin{aligned}&\mathrm{prox}_{\frac{1}{\sigma } J_p}(x^*,y^*)\\&\quad =\left\{ \begin{array}{ll} (0,0) &{} \mathrm{if} \; (\sigma x^*,\sigma y^*) \in \mathcal{K}_p,\\ \left( x^* \frac{h(\hat{z})}{1+h(\hat{z})} , y^* + \frac{1}{\sigma q} \hat{z}^q \right) &{} \mathrm{otherwise,} \end{array} \right. \end{aligned}$$
where
$$\begin{aligned} h(z) := (\sigma y^* + \tfrac{1}{q} z^q) z^{q-2} \end{aligned}$$
and \(\hat{z} \in \mathbb R_{\ge 0}\) is the unique solution of the equation
$$\begin{aligned} z \left( 1+ h(z) \right) - \sigma | x^*| = 0 \end{aligned}$$(17)
in the interval \(\big [z_0,+\infty \big )\), where \(z_0:= \max (0, - q \sigma y^*)^{\tfrac{1}{q}}\).
-
(ii)
The Newton method converges quadratically to the largest zero of (17) for any starting point \(z \ge z_0\).
Proof
(i) By Moreau’s identity it holds
and since \((\tfrac{1}{\sigma } \phi )^*(t) = \tfrac{1}{\sigma } \phi ^*(\sigma t)\) we conclude
where \(\mathrm{prox}_{\frac{1}{\sigma } J_p^*} = \mathrm{prox}_{J_p^*}\) since \(J_p^*\) is an indicator function. Note that by definition of \(J_p\) we have that \(\hat{y} \ge 0\) and \(\hat{y} = 0\) only if \(|\hat{x}| = 0\). Now, \(\mathrm{prox}_{J_p^*}(\sigma x^*,\sigma y^*)\) is the orthogonal projection of \((\sigma x^*,\sigma y^*)\) onto the set \(\mathcal{K}_p\) (we could also compute the epigraphical projection of \((\sigma x^*,\sigma y^*)\) onto the epigraph of \(\phi (x) = \tfrac{1}{q} |x|^q\) and reflect y, see also Fig. 4).
If \((\sigma x^*, \sigma y^*) \in \mathcal{K}_p\), then \(\mathrm{prox}_{J_p^*}(\sigma x^*,\sigma y^*) = (\sigma x^*,\sigma y^*)\) and \((\hat{x}, \hat{y}) = (0,0)\). So let \((\sigma x^*, \sigma y^*) \not \in \mathcal{K}_p\), that means
$$\begin{aligned} \sigma y^* + \tfrac{1}{q} |\sigma x^*|^q > 0. \end{aligned}$$
The tangent plane of the boundary of \(\mathcal{K}_p\) in \((x,y) = (x, -\tfrac{1}{q} |x|^q)\) is spanned by the vectors \((e_i^{\scriptscriptstyle {\text {T}}}, - |x|^{q-2} x_i)^{\scriptscriptstyle {\text {T}}}\), \(i=1,\ldots ,d\), where \(e_i \in \mathbb R^d\) denotes the i-th canonical unit vector. Hence, the projection (x, y) is determined by \(y = -\tfrac{1}{q} |x|^q\) and
so that
Summing over the squares of the last equations gives
Since a solution has to fulfill
it remains to search for the solutions with \(h(|x|) > 0\). Now, \(h(z) >0\) is fulfilled for \(z >0\) if and only if
which is the case if and only if \(z > z_0 := \max (0, - q \sigma y^*)^{\tfrac{1}{q}}\). Then \(z := |x|\) has to satisfy the equation
The function
has exactly one zero in \([z_0,+\infty )\). Indeed, by definition of \(z_0\) and (19) we see that \( \varphi (z_0) \le 0 \), but on the other hand we have \(\varphi (z) \rightarrow +\infty \) as \(z \rightarrow +\infty \), so that \(\varphi \) has at least one zero in \([z_0,+\infty )\). Since \(q \ge 2\) for \(p \le 2\) we have for
Hence h, and thus also \(\varphi \), is strictly monotonically increasing for \(z > z_0\). Therefore \(\varphi \) has at most one zero in \([z_0,+\infty )\). Finally, the assertion follows by plugging (x, y) into (18).
(ii) Straightforward computation gives
Since \(\varphi \) is monotonically increasing and strictly convex for \(z \ge z_0\), the Newton method converges quadratically for any starting point \(z \ge z_0\). \(\square \)
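As an illustration of part (ii), the Newton iteration for (17) with the safe starting point \(z_0 = \max (0, - q \sigma y^*)^{1/q}\) can be sketched as follows. This is our own Python sketch (the paper's implementation is in Matlab), valid for \(q \ge 2\); the function name is ours.

```python
import numpy as np

def solve_eq17(x_star, y_star, sigma, q=2.0, tol=1e-12, max_iter=50):
    """Newton's method for phi(z) = z*(1 + h(z)) - sigma*|x*| = 0 with
    h(z) = (sigma*y* + z^q/q) * z^(q-2), started at
    z0 = max(0, -q*sigma*y*)**(1/q), from which convergence is quadratic."""
    a = sigma * abs(x_star)
    z = max(0.0, -q * sigma * y_star) ** (1.0 / q)
    for _ in range(max_iter):
        # phi and phi' with h written out explicitly
        phi = z + sigma * y_star * z ** (q - 1) + z ** (2 * q - 1) / q - a
        dphi = 1 + (q - 1) * sigma * y_star * z ** (q - 2) \
               + (2 * q - 1) / q * z ** (2 * q - 2)
        step = phi / dphi
        z -= step
        if abs(step) < tol:
            break
    return z
```

Since \(\varphi (z_0) \le 0\) and \(\varphi \) is convex and increasing on \([z_0,+\infty )\), the iterates stay in this interval and approach the largest zero monotonically after the first step.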
5 Numerical Results
In the following we provide several numerical examples. In all cases we used \(P=32\) time steps and 2000 iterations in Algorithms 1 and 2, respectively. The parameters \(\sigma \) and \(\tau \) were set to \(\sigma = 50\) and \(\tau = \frac{0.99}{\sigma }\), so that \(\sigma \tau <1\), which guarantees the convergence of the algorithms. Of course, the algorithms do not use tensor products explicitly, but relations such as the one stated in (26) and their higher-dimensional versions. The algorithms were implemented in Matlab and the computations were performed on a Dell computer with an Intel Core i7, 2.93 GHz, and 8 GB of RAM using Matlab 2014b on Ubuntu 14.04 LTS. As an example, the run time for \(100\times 100\) color images and 32 time steps varies between 10 and 15 minutes for the constrained and the penalized method, depending on the parameter choices for \(p,\sigma ,\tau \), and \(\lambda \). In our current implementation of the penalized method the fast transform approach is nearly as time consuming as the iterative solution of the linear system of equations with the CG method and an adequate initialization. If not explicitly stated otherwise, the results are displayed for \(p=2\), in which case the computation of the zeros in (17) slightly simplifies.
With our first experiments we illustrate the difference between mirror and periodic boundary conditions in the color dimension; here the initial and the final images have the same mass. The images are displayed at intermediate time points \(t = \frac{i}{8}\), \(i = 0,\dots , 8\). In Fig. 5, the transport of a red Gaussian into a blue one is shown, either with mirror or with periodic boundary conditions in the color dimension. Figure 6 (see Note 1) depicts the transport between two real images of polar lights. In order to fulfill the equal mass constraint, we first normalized both images to mass 1 and afterwards multiplied them by a common factor such that both images have realistic colors. Of course, this procedure works only if the initial and the final image share approximately the same mass. In both cases the use of periodic boundary conditions yields more realistic results.
Further examples of the constrained model (9) for several Gaussians and real images are given in Figs. 7 and 8 (see Note 2). The first row in Fig. 7 shows the transport of a red and a yellow Gaussian into a cyan and a blue Gaussian. The red and the blue Gaussian are spatially more extended than the yellow and the cyan one, but since yellow and cyan have higher intensities, the masses of the red and blue Gaussians are approximately the same as those of the yellow and the cyan ones, respectively. This results in a very low interaction between the Gaussians during the transport, which is slightly visible in the background. Mainly, the red Gaussian is transported via a light red into cyan, and the yellow Gaussian is transported via violet into blue. The next row displays the intensity \(\frac{1}{3}(R+G+B)\) of the transported color images, in contrast to the (two-dimensional) transported intensity images in the third row. One sees slight differences, which arise from the fact that the small mass difference can be transported only spatially and not through the color channels.
The experiment is repeated in the fourth to sixth rows, but this time the yellow and the cyan Gaussian are spatially more extended, thus having a significantly higher mass than the red and the blue Gaussian, respectively. As a consequence, the interaction during the transport is higher, which is also clearly visible in the corresponding intensity images (fifth row). The color of the red Gaussian is transported similarly as before, while the yellow Gaussian splits into two parts, one of them changing (as before) via violet into blue, while the other one passes via a light green into cyan. Further, as the Gaussians do not only travel in space, but also in the color direction, the results are slightly smoother than for the two-dimensional intensity transport shown in the last row.
Also for the real images, we assume that the images have the same mass. Indeed, the initial and final images had approximately the same overall sum of values, so that our normalization had no significant effect. In the first row of Fig. 8, a topographic map of Europe is transported into a satellite image of Europe at night. The second row displays the transport between two images of the Köhlbrandbrücke in Hamburg. In both cases one nicely sees a continuous change of color and shape during the transport.
In Fig. 9 we give further examples for the transport of several Gaussians which may have different shapes in order to illustrate the transport of color and shape. Here, the initial and final images have different masses. The first row shows the transport of a yellow and a red Gaussian placed at the top of the image into a green and a blue Gaussian placed at the bottom. In this example, the yellow and the blue Gaussian are spatially slightly more extended than the red and the green one, respectively. The red Gaussian changes via violet into blue. The yellow Gaussian, however, splits into two parts. While the main part is transported to green, a small part is separated and changes via orange into blue. In the second row, again a red and a yellow Gaussian are transported from the top into a green and a blue Gaussian at the bottom, but this time the yellow and the green Gaussian are spatially more extended. Additionally, the Gaussians are no longer isotropic but have an ellipsoidal shape. In this case, not only the color but also the shape changes continuously over time. Finally, the third row displays the transport of a white Gaussian into a yellow one. It shows that no artificial colors appear during the transport; the color transition proceeds as one would expect when looking at the RGB cube.
Next we compare our approach with the approach of Rabin et al. [41] for microtextures, which is, to the best of our knowledge, the only approach that extends the dynamic optimal transport problem to a special class of color images. Note, however, that their approach is completely different from ours and works only for microtextures. Here, microtextures are textures that are robust towards phase randomization, in contrast to macrotextures, which usually contain periodic patterns with big visible elements (such as brick walls) or, more generally, texture elements that are spatially arranged, see, e.g., [29]. Based on the fact that microtextures can be well modeled as multivariate Gaussian distributions, the authors of [41] propose to compute geodesics with respect to the Wasserstein distance \(W_2\) between the Gaussian distributions that are estimated from the input textures \(f_0\) and \(f_1\). This approach has the advantage that there exist closed-form solutions for the dynamic optimal transport between Gaussian measures. However, it is limited to the special class of microtextures, as natural images are not robust towards a randomization of their Fourier phase. In Fig. 10 we compare the results of our approach with the one for microtextures. In the case of microtextures both approaches yield similar results. Note that the approach of Rabin et al. [41] may fail for images which are orthogonal at some frequencies in the Fourier domain. The second example demonstrates that the microtexture technique [41] fails for natural images, which possess contours and edges.
Next, we turn to the penalized model (10). Figure 11 shows the influence of the regularization parameter \(\lambda \) when transporting a red Gaussian into a yellow one. Here, the initial and the final image have significantly different masses. The images are displayed at intermediate time points \(t = \frac{i}{8}\), \(i = 0,\dots , 8\). With increasing \(\lambda \) the results change from a nearly linear interpolation of the images to a transport of the mass. Further, for large \(\lambda \) the results approach those obtained with the constrained model (9), as expected.
Finally, we consider the influence of the parameter \(p \in (1,2]\). The corresponding results for our penalized model (10) are given in Figs. 12 and 13. In all our experiments we observed only rather small differences. Note, however, that in [16] Wasserstein barycenters were considered for \(p=1,2,3\), which show significant differences.
Further examples and videos, in particular for real images, can be found on our website http://www.mathematik.uni-kl.de/imagepro/members/laus/color-OT.
6 Summary and Conclusions
Our contribution can be summarized as follows:
-
(i)
We propose two discrete variational models for the interpolation of RGB color images based on the dynamic optimal transport approach. To this end, we consider color images as three-dimensional objects, where the “RGB direction” is handled in a periodic way. We focus on a discrete matrix–vector approach.
-
(ii)
Our first model relaxes the continuity constraint so that a transport between images of different mass is possible, while the second model allows even more flexibility by just penalizing the continuity constraint with different regularization parameters.
-
(iii)
We provide an existence proof and a brief discussion of the uniqueness of the minimizer.
-
(iv)
Interestingly, the step in the chosen primal-dual algorithm which takes the continuity constraint into account requires the solution of four-dimensional Poisson equations with simultaneous mirror/periodic (constrained model) or zero/mirror/periodic boundary conditions (penalized model). Here, fast sine, cosine, and Fourier transforms come into play.
-
(v)
We give a careful analysis of the proximal mapping of \(J_p\), \(p \in (1,2]\). This includes the determination of a starting point for the Newton algorithm that ensures its quadratic convergence and a stable performance of the overall algorithm.
-
(vi)
We show numerous numerical examples.
There are several directions for future work.
-
(1)
One possibility is to add further priors. So far, the present model has difficulties in transferring sharp contours. A remedy could be to penalize a total variation (TV) term with respect to f, which results, for some \(\gamma >0\), in the functional
$$\begin{aligned} \mathop {\mathrm{argmin}}_{ (m,f) \in \mathcal{C}} \big \{ \Vert J_p(S_\mathrm{m} m, S_\mathrm{f} f + f^+) \Vert _1 + \gamma \mathrm{TV}(f) \big \}. \end{aligned}$$
(20)
In the following we use a spatial TV term, summed over time, i.e., in one dimension
$$\begin{aligned} \mathrm{TV}(f) = \Vert (I_P \otimes D_N^{\scriptscriptstyle {\text {T}}}) f \Vert _1. \end{aligned}$$
Figure 14 shows the performance of such a model in the one-dimensional case. Here, the approach without the TV term leads to some overshooting and blurring at the edges. With the TV term the transport takes place in a more natural way; in particular, the sharp edges are preserved. For the one-dimensional example this works well. However, in higher dimensions one has to be more careful. In Fig. 15 a comparison of an isotropic and an anisotropic TV regularizer is shown. The isotropic one leads, as could be expected, to a rounding of the corners. The anisotropic regularizer prefers horizontal and vertical edges. In this way, the shape of the object is preserved during the transport. Note that due to the smeared boundary the square appears to be smaller. To preserve the shape of arbitrary transported objects, one would have to adjust the regularizer according to the direction of the edges. The idea of penalizing TV terms for the transport can be found for gray-value images, e.g., in [12, 33]. For image denoising a Wasserstein-TV model was successfully applied in [9, 13, 47].
-
(2)
Using a barycentric approach, the interpolation of microtextures in [41] also works between more than two images. So far this task cannot be handled via the dynamic optimal transport approach. One idea might be to formulate a dynamic barycenter optimal transport problem or to use a multimarginal model, see, e.g., Section 1.4.7 in [43] and the references therein.
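To make the Kronecker structure of the TV term in direction (1) concrete, the following small sketch (our own, with a plain forward-difference matrix standing in for \(D_N\); the transpose convention may differ from the one in the text) evaluates the anisotropic spatial TV of a one-dimensional space-time array, summed over the P time steps:

```python
import numpy as np

def forward_diff(n):
    # (n-1) x n forward-difference matrix: (D f)_j = f_{j+1} - f_j
    D = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0
    return D

def tv_space_time(f):
    """Anisotropic spatial TV of a (P, N) space-time array f, summed
    over the P time steps, via the Kronecker operator I_P (x) D acting
    on the row-major vectorization of f."""
    P, N = f.shape
    op = np.kron(np.eye(P), forward_diff(N))
    return np.abs(op @ f.reshape(-1)).sum()
```

In practice one would of course apply the differences slice by slice rather than forming the Kronecker matrix; the explicit product only illustrates the block structure.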
Notes
1. Images from Wikimedia Commons: AGOModra_aurora.jpg by Comenius University under CC BY SA 3.0, Aurora-borealis_andoya.jpg by M. Buschmann under CC BY 3.0.
2. Images from Wikimedia Commons: Europe_satellite_orthographic.jpg and Earthlights_2002.jpg by NASA, Köhlbrandbrücke5478.jpg by G. Ries under CC BY SA 2.5, Köhlbrandbrücke.jpg by HafenCity1 under CC BY 3.0.
References
Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows in Metric Spaces and in the Space of Probability Measures. Springer, Heidelberg (2006)
Angenent, S., Haker, S., Tannenbaum, A.: Minimizing flows for the Monge-Kantorovich problem. SIAM J. Math. Anal. 35, 61–97 (2003)
Aravkin, A.Y., Burke, J.V., Friedlander, M.P.: Variational properties of value functions. SIAM J. Optim. 23(3), 1689–1717 (2013)
Auslender, A., Teboulle, M.: Asymptotic Cones and Functions in Optimization and Variational Inequalities. Springer, New York (2003)
Baiocchi, C., Buttazzo, G., Gastaldi, F., Tomarelli, F.: General existence theorems for unilateral problems in continuum mechanics. Arch. Ration. Mech. Anal. 100(2), 149–189 (1988)
Benamou, J.D.: A domain decomposition method for the polar factorization of vector-valued mappings. SIAM J. Numer. Anal. 32(6), 1808–1838 (1995)
Benamou, J.-D.: Numerical resolution of an ‘unbalanced’ mass transport problem. ESAIM Math. Model. Numer. Anal. 37(5), 851–868 (2003)
Benamou, J.-D., Brenier, Y.: A computational fluid mechanics solution to the Monge-Kantorovich mass transfer problem. Numer. Math. 84(3), 375–393 (2000)
Benning, M., Calatroni, L., Düring, B., Schönlieb, C.-B.: A primal-dual approach for a total variation Wasserstein flow. Geometric Science of Information. LNCS, pp. 413–421. Springer, Berlin (2013)
Bertalmio, M.: Image Processing for Cinema. CRC Press, Boca Raton (2014)
Björck, A.: Numerical Methods for Least Squares Problems. SIAM, Philadelphia (1996)
Brune, C.: 4D imaging in tomography and optical nanoscopy. Dissertation, PhD Thesis, Münster (Westfalen) University, (2010)
Burger, M., Franek, M., Schönlieb, C.-B.: Regularized regression and density estimation based on optimal transport. Appl. Math. Res. eXpress 2012(2), 209–253 (2012)
Burger, M., Sawatzky, A., Steidl, G.: First order algorithms in variational image processing. ArXiv:1412.4237 (2014)
Caffarelli, L.A.: A localization property of viscosity solutions to the Monge–Ampère equation and their strict convexity. Ann. Math. 131(1), 129–134 (1990)
Carlier, G., Oberman, A., Oudet, E.: Numerical methods for matching for teams and Wasserstein barycenters. ArXiv:1411.3602 (2014)
Chambolle, A., Pock, T.: A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vision 40(1), 120–145 (2011)
Chizat, L., Schmitzer, B., Peyré, G., Vialard, F.-X.: An interpolating distance between optimal transport and Fisher-Rao. ArXiv:1506.06430 (2015)
Cullen, M.J.: Implicit finite difference methods for modelling discontinuous atmospheric flows. J. Comput. Phys. 81, 319–348 (1989)
Cullen, M.J., Purser, R.J.: An extended Lagrangian theory of semigeostrophic frontogenesis. J. Atmos. Sci. 41, 1477–1497 (1989)
Cuturi, M.: Sinkhorn distances: lightspeed computation of optimal transport. Adv. Neural Inf. Process. Syst. 26, 2292–2300 (2013)
Cuturi, M., Doucet, A.: Fast computation of Wasserstein barycenters. In: Proceedings of the 31st International Conference on Machine Learning, pp. 685–693 (2014)
Dacorogna, B., Maréchal, P.: The role of perspective functions in convexity, polyconvexity, rank-one convexity and separate convexity. J. Convex Anal. 15(2), 271–284 (2008)
Dedieu, J.-P.: Cônes asymptotes d’un ensemble non convexe. Application à l’optimisation. C. R. Acad. Sci. 287, 91–103 (1977)
Delon, J.: Midway image equalization. J. Math. Imaging Vision 21(2), 119–134 (2004)
Ferradans, S., Papadakis, N., Peyré, G., Aujol, J.-F.: Regularized discrete optimal transport. SIAM J. Imaging Sci. 7(3), 1853–1882 (2014)
Fitschen, J.H., Laus, F., Steidl, G.: Dynamic optimal transport with mixed boundary condition for color image processing. In: International Conference on Sampling Theory and Applications (SampTA). ArXiv:1501.04840, pp. 558–562 (2015)
Frogner, C., Zhang, C., Mobahi, H., Araya, M., Poggio, T.A.: Learning with a Wasserstein loss. Adv. Neural Inf. Process. Syst. 28, 2044–2052 (2015)
Galerne, B., Gousseau, Y., Morel, J.-M.: Random phase textures: theory and synthesis. IEEE Trans. Image Process. 20(1), 257–267 (2011)
Haber, E., Rehman, T., Tannenbaum, A.: An efficient numerical method for the solution of the \(l_2\) optimal mass transfer problem. SIAM J. Sci. Comput. 32(1), 197–211 (2010)
Jimenez, C.: Dynamic formulation of optimal transport problems. J. Convex Anal. 15(3), 593–622 (2008)
Kochengin, S.A., Oliker, V.I.: Determination of reflector surfaces from near-field scattering data. Inverse Probl. 13(2), 363–373 (1997)
Maas, J., Rumpf, M., Schönlieb, C., Simon, S.: A generalized model for optimal transport of images including dissipation and density modulation. ArXiv:1504.01988 (2015)
Nikolova, M., Steidl, G.: Fast hue and range preserving histogram specification: theory and new algorithms for color image enhancement. IEEE Trans. Image Process. 23(9), 4087–4100 (2014)
Papadakis, N., Peyré, G., Oudet, E.: Optimal transport with proximal splitting. SIAM J. Imaging Sci. 7(1), 212–238 (2014)
Patankar, S.: Numerical Heat Transfer and Fluid Flow. CRC Press, New York (1980)
Peyré, G., Fadili, J., Rabin, J.: Wasserstein active contours. In: 19th IEEE ICIP, pp. 2541–2544 (2012)
Pock, T., Chambolle, A., Cremers, D., Bischof, H.: A convex relaxation approach for computing minimal partitions. In: IEEE Conference Computer Vision and Pattern Recognition, pp. 810–817 (2009)
Potts, D., Steidl, G.: Optimal trigonometric preconditioners for nonsymmetric Toeplitz systems. Linear Algebra Appl. 281, 265–292 (1998)
Rabin, J., Delon, J., Gousseau, Y.: Transportation distances on the circle. J. Math. Imaging Vision 41(1–2), 147–167 (2011)
Rabin, J., Peyré, G., Delon, J., Bernot, M.: Wasserstein barycenter and its application to texture mixing. In: SSVM, pp. 435–446. Springer (2012)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Santambrogio, F.: Optimal Transport for Applied Mathematicians. Springer, New York (2015)
Schmitzer, B.: A sparse multi-scale algorithm for dense optimal transport. ArXiv:1510.05466 (2015)
Schmitzer, B., Schnörr, C.: A hierarchical approach to optimal transport. In: SSVM, pp. 452–464. Springer (2013)
Strang, G., MacNamara, S.: Functions of difference matrices are Toeplitz plus Hankel. SIAM Rev. 56, 525–546 (2014)
Swoboda, P., Schnörr, C.: Convex variational image restoration with histogram priors. SIAM J. Imaging Sci. 6(3), 1719–1735 (2013)
Teuber, T., Steidl, G., Chan, R.H.: Minimization and parameter estimation for seminorm regularization models with I-divergence constraints. Inverse Probl. 29, 1–28 (2013)
Trouvé, A., Younes, L.: Metamorphoses through Lie group action. Found. Comput. Math. 5(2), 173–198 (2005)
Villani, C.: Optimal Transport: Old and New. Springer, Berlin (2008)
Acknowledgments
Funding by the DFG within the Research Training Group 1932 is gratefully acknowledged.
Appendices
Appendix 1: Diagonalization of Structured Matrices
In the following we collect known facts on the eigenvalue decomposition of various difference matrices. For further information we refer, e.g., to [39, 46]. The following matrices \(F_n\), \(C_n\) and \(S_n\) are unitary, resp., orthogonal matrices. The Fourier matrix
diagonalizes circulant matrices, i.e., for \(a := (a_j)_{j=0}^{n-1} \in \mathbb R^n\) we have
In particular it holds
with \(\mathrm{d}^{\mathrm{per}}_n := \left( 4 \sin ^2 \frac{k\pi }{n}\right) _{k=0}^{n-1}\). The operator \(\varDelta _n^{\mathrm{per}}\) typically appears when solving the one-dimensional Poisson equation with periodic boundary conditions by finite difference methods.
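For illustration, the periodic Poisson solve can be sketched as follows (our own Python sketch; the paper's implementation is in Matlab). Since the circulant second-difference matrix has the eigenvalues \(\mathrm{d}^{\mathrm{per}}_n\) under the Fourier transform, a single division in the Fourier domain inverts it on the mean-zero subspace:

```python
import numpy as np

def poisson_periodic(b):
    """Solve L u = b for the circulant second-difference matrix
    L = circ(2, -1, 0, ..., 0, -1) via the FFT. L is singular with
    eigenvalues 4 sin^2(k pi / n); b must have zero mean, and the
    mean-zero solution u is returned."""
    n = b.size
    d = 4 * np.sin(np.pi * np.arange(n) / n) ** 2
    bh = np.fft.fft(b)
    uh = np.zeros_like(bh)
    uh[1:] = bh[1:] / d[1:]          # skip the zero eigenvalue at k = 0
    return np.real(np.fft.ifft(uh))
```

The cost is that of two FFTs, \(\mathcal {O}(n \log n)\), instead of a direct solve.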
The DST-I matrix
and the DCT-II matrix
with \(\epsilon _0 := 1/\sqrt{2}\) and \(\epsilon _j := 1\), \(j=1,\ldots ,n-1\) are related by
where \( \mathrm{d}^{\mathrm{zero}}_{n-1} := \left( 4 \sin ^2 \frac{k\pi }{2n}\right) _{k=1}^{n-1} \). Further they diagonalize sums of certain symmetric Toeplitz and persymmetric Hankel matrices. In particular it holds
and
with \( \mathrm{d}^{\mathrm{mirr}}_n := \begin{pmatrix} 0\\ \mathrm{d}_{n-1}^{\mathrm{zero}} \end{pmatrix} = \left( 4\sin ^2 \frac{j \pi }{2n} \right) _{j=0}^{n-1} \). The operators \(\varDelta _{n-1}^{\mathrm{zero}}\) and \(\varDelta _{n}^{\mathrm{mirr}}\) are related to the Poisson equation with zero boundary conditions and mirror boundary conditions, respectively.
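As a quick numerical check (our own sketch), the mirror-boundary Laplacian \(\varDelta _{n}^{\mathrm{mirr}} = D^{\scriptscriptstyle {\text {T}}} D\) for the forward difference \(D\) indeed has the eigenvalues \(\mathrm{d}^{\mathrm{mirr}}_n = \big (4\sin ^2 \frac{j \pi }{2n}\big )_{j=0}^{n-1}\):

```python
import numpy as np

def laplacian_mirror(n):
    # tridiag(-1, 2, -1) with corner entries 1: the mirror/Neumann Laplacian
    D = np.diff(np.eye(n), axis=0)   # (n-1) x n forward difference
    return D.T @ D

def d_mirr(n):
    return 4 * np.sin(np.pi * np.arange(n) / (2 * n)) ** 2

n = 7
ev = np.linalg.eigvalsh(laplacian_mirror(n))   # ascending order
print(np.allclose(ev, d_mirr(n)))              # True
```

The corresponding eigenvectors are the DCT-II basis vectors, which is why a fast cosine transform diagonalizes this operator.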
Appendix 2: Computation with Tensor Products
The tensor product (Kronecker product) of matrices
and
is defined by
The tensor product is associative and distributive with respect to the addition of matrices.
Lemma 2
(Properties of Tensor Products)
-
(i)
\(( A \otimes B)^{\scriptscriptstyle {\text {T}}}= A^{\scriptscriptstyle {\text {T}}}\otimes B^{\scriptscriptstyle {\text {T}}}\) for \( A \in \mathbb {C}^{m, n}\), \( B \in \mathbb {C}^{s , t}\).
-
(ii)
\(( A \otimes B)( C \otimes D) = A C \otimes B D\) for \(A, C \in \mathbb {C}^{m, m}\) and \( B, D \in \mathbb {C}^{n , n}\).
-
(iii)
If A and B are invertible, then \( A \otimes B\) is also invertible and
$$\begin{aligned} ( A \otimes B)^{-1} = A^{-1} \otimes B^{-1} \, . \end{aligned}$$
The tensor product is needed to establish the connection between images and their vectorized versions, i.e., we consider images \(F\in \mathbb {R}^{n_1\times n_2}\) columnwise reshaped as
Then the following relation holds true:
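The connection rests on the standard identity \(\mathrm{vec}(A X B) = (B^{\scriptscriptstyle {\text {T}}} \otimes A)\,\mathrm{vec}(X)\) for the columnwise reshape \(\mathrm{vec}\). It can be checked numerically (our own sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
X = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 4))

vec = lambda M: M.reshape(-1, order="F")   # columnwise reshape, as in the text
lhs = np.kron(B.T, A) @ vec(X)
rhs = vec(A @ X @ B)
print(np.allclose(lhs, rhs))               # True
```

This is exactly the mechanism by which one-dimensional difference and averaging matrices acting on the image axes turn into Kronecker products acting on the vectorized image.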
Appendix 3: Proofs and Generalization of the Tensor Product Approach to 3D
Proof of Proposition 3
By definition of A and using (25), (22), we obtain for periodic boundary conditions
Similarly we get with (25) for mirror boundary conditions
which finishes the proof. \(\square \)
Proof of Proposition 4
By definition of A we obtain
so that the inverse can be written by the help of the Schur complement
as
By (22) and (24) we have with \(D \in \{D_N^{\mathrm{per}},D_N\}\) that
The Schur complement reads as
By (21) we have
and
so that we obtain for periodic boundary conditions
Therewith it follows with (24)
which yields the assertion for \(S^{-1}\) in the periodic case.
For mirror boundary conditions we compute using (23)
and inverting this matrix finishes the proof. \(\square \)
Discretization for three spatial dimensions + time For RGB images of size \(N_1 \times N_2 \times N_3\), where \(N_3 = 3\), we have to work in three spatial dimensions. Setting \(N :=(N_1,N_2,N_3)\), \(j :=(j_1,j_2,j_3)\) and defining the quotient \(\tfrac{j}{N}\) componentwise we obtain
-
\(f_i = \bigg (f_i \Big ( \tfrac{j-1/2}{N}\Big ) \bigg )_{j=(1,1,1)}^{N} \in \mathbb R^{N_1,N_2,N_3}\), \(i= 0,1\),
-
\(f = \bigg ( f \Big ( \tfrac{j-1/2}{N}, \tfrac{k}{P} \Big ) \bigg )_{j={(1,1,1)},k=1}^{N,P-1} \in \mathbb R^{N_1,N_2,3,P-1}\),
-
\(m = (m_1, m_2, m_3)\), with
$$\begin{aligned}&\bigg (m_1 \Big (\tfrac{j_1}{N_1},\tfrac{j_2-1/2}{N_2},\tfrac{j_3-1/2}{3},\tfrac{k-1/2}{P} \Big )\bigg )_{j_1=1,j_2=1,j_3=1,k=1}^{N_1-1,N_2,3,P}\\&\in \mathbb R^{N_1-1,N_2,3,P},\\&\bigg (m_2\Big (\tfrac{j_1-1/2}{N_1},\tfrac{j_2}{N_2},\tfrac{j_3-1/2}{3},\tfrac{k-1/2}{P}\Big )\bigg )_{j_1=1,j_2=1,j_3=1,k=1}^{N_1,N_2-1,3,P}\\&\in \mathbb R^{N_1,N_2-1,3,P},\\&\bigg (m_3\Big (\tfrac{j_1-1/2}{N_1},\tfrac{j_2-1/2}{N_2},\tfrac{j_3}{3},\tfrac{k-1/2}{P}\Big )\bigg )_{j_1=1,j_2=1,j_3=0,k=1}^{N_1,N_2,2,P}\\&\in \mathbb R^{N_1,N_2,3,P}. \end{aligned}$$
In the definition of m we take the periodic boundary for the third spatial direction into account. Analogously as in the one-dimensional case, when reshaping m and f into long vectors, the interpolation and differentiation operators can be written using tensor products. For the interpolation operator we have
and
which means, that \(S_{\text {m}}m\) computes the average of \(m_i\) with respect to the i-th coordinate, \(i=1,2,3\), and \(S_{\text {f}}f\) computes the average of f with respect to the time variable. Similarly we generalize the difference operator. Then, reordering f and m into large vectors, the matrix form of the operator A is
so that \(AA^{\scriptscriptstyle {\text {T}}}\) reads as
where
For the three-dimensional spatial setting we have to solve a four-dimensional Poisson equation, which can be handled separately in each dimension. For the constrained problem, this can be computed using fast cosine and Fourier transforms with a complexity of \(\mathcal {O}(N_1 N_2 P\log (N_1 N_2 P))\).
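The dimension-by-dimension diagonalization can be illustrated in two dimensions (our own Python sketch, not the paper's Matlab code): the FFT diagonalizes the periodic axis, while a dense eigendecomposition of the mirror Laplacian stands in for the fast cosine transform on the other axis.

```python
import numpy as np

def solve_separable(b, eps=1e-10):
    """Solve L_per U + U L_mirr = b (L_per along axis 0, L_mirr along
    axis 1) by diagonalizing each dimension separately. Modes in the
    one-dimensional joint null space (both eigenvalues zero) are set
    to zero."""
    n1, n2 = b.shape
    d1 = 4 * np.sin(np.pi * np.arange(n1) / n1) ** 2   # periodic eigenvalues
    D = np.diff(np.eye(n2), axis=0)
    w, V = np.linalg.eigh(D.T @ D)                     # mirror Laplacian
    bh = np.fft.fft(b, axis=0) @ V                     # transform both axes
    denom = d1[:, None] + w[None, :]
    uh = np.zeros_like(bh)
    mask = np.abs(denom) > eps
    uh[mask] = bh[mask] / denom[mask]
    return np.real(np.fft.ifft(uh, axis=0) @ V.T)
```

In the transformed domain the separable operator acts entrywise as \(d_{1,k} + w_j\), so inversion is a single pointwise division; the four-dimensional case works in exactly the same way, one transform per dimension.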
Fitschen, J.H., Laus, F. & Steidl, G. Transport Between RGB Images Motivated by Dynamic Optimal Transport. J Math Imaging Vis 56, 409–429 (2016). https://doi.org/10.1007/s10851-016-0644-x