
1 Introduction

Many practical problems in image processing and computer vision can be modeled as multilabel problems, where the task is to optimally assign the unknown variable \(l\), chosen from some finite set \(\{l_1,\ldots ,l_n\}\), at each point of the image domain \(\varOmega \). It has become an important paradigm to formulate such labeling problems as the optimization of an energy function/functional which mathematically encodes all the information needed for the specified imaging and vision task. Such optimization problems can be formulated by either regarding the image domain as discrete or continuous.

In the spatially discrete setting, graph cut has become one of the most important and efficient techniques to tackle such problems, by computing max-flow or min-cut on appropriately constructed graphs. Applications of max-flow/min-cut in computer vision range from image segmentation and labeling [1, 2] to stereo [3, 4] and 3D reconstruction [5]. Unfortunately, most minimization problems involving more than two labels are NP-hard, so only approximate algorithms are available [1, 4]. However, for a particular set of multilabeling problems with convex interaction penalty, Ishikawa [6] showed that exact solutions can be computed by max-flow and min-cut. Such energy functions are important in e.g. stereo reconstruction. Despite the efficiency of graph-based methods, their computation results are often comparatively rough and biased by the discrete graph setting, i.e. metrication errors occur. Reducing such visual artifacts requires either considering more neighboring nodes, which greatly increases the memory burden, or applying more complex schemes such as high-order potentials [7].

Recently, the variational approach has become more and more popular for obtaining optimal labelings in the spatially continuous setting, where the problem is formulated as the minimization of a continuous energy functional. In contrast to graph-based models, the variational approach has many advantages: it avoids metrication errors thanks to its rotation invariance; its continuous numerical schemes are reliable, tractable, and can easily be implemented and accelerated in many different ways, e.g. in parallel, with multigrid methods, or on GPU hardware; last but not least, continuous models require far less memory in computation.

Variational methods for optimal labeling typically relax the combinatorial constraints to a proper convex set. This leads to a constrained convex minimization problem such that global and exact optima are available in some special cases. For example, Chan et al. [8] showed that global and exact binary optimizers can be obtained by thresholding the computation result of the convex relaxed model; therefore, a sequence of so-called continuous min-cuts can be obtained. [9, 10] generalized Ishikawa’s work [6] to the spatially continuous setting, where both the image domain and label values are continuous, by representing the optimal labeling function as the discontinuity set of a binary function in a one-dimension-higher space, i.e. a spatially continuous min-cut. Such a lifting approach is related to earlier mathematical theories of calibrations and Cartesian currents [11, 12]. Optimal labeling functions can then be obtained by applying the result of Chan et al. in the higher dimensional space, i.e. first solving the relaxed binary problem and then thresholding the result.

Motivations and Contributions

For discrete graphs, it is well known that the minimum cut problem is dual to the maximum flow problem by the max-flow and min-cut theorem [13]. Actually, the fastest graph cut algorithms are based on maximizing flow instead of computing the min-cut directly, e.g. the Ford-Fulkerson algorithm [14] and the push-relabel algorithm [15]. The minimal ‘cut’ is finally recovered along edges with ‘saturated’ flows, i.e. cuts appear at the flow-bottlenecked edges [4, 16]. In contrast, max-flow models and algorithms in the spatially continuous setting have been much less studied. Some work has appeared that deals with partitioning problems involving two regions: Strang [17] was the first to formulate max-flow and min-cut problems over a continuous domain; in [18], an edge-based max-flow and min-cut was formulated in which certain interior and exterior points must be specified in advance; Yuan et al. [19, 20] proposed a direct continuous analogue of the typical discrete max-flow and min-cut models that are used for solving binary labeling problems in image processing and computer vision. Max-flow and min-cut interpretations of recent convex relaxations for the Potts model have been made in [21]. However, in these cases there is generally a duality gap and the original problems can only be solved approximately.

To our knowledge, this is the first work to address continuous max-flow and min-cut duality for problems where the labeling function can take several discrete values. Motivated by Ishikawa [6] and Yuan et al. [19], we interpret the problem as a continuous min-cut problem over a mixed continuous/discrete domain and build up a novel continuous max-flow model in analogy with Ishikawa’s discrete graph construction. The max-flow model can be used to produce global solutions of the original non-convex problem with discrete label values. In particular, it is shown that the max-flow model is dual to an exact convex relaxation of the original problem. Strict duality is also established between the max-flow model and the original problem, by extending the thresholding scheme of [8] from two to several regions.

A new continuous max-flow based algorithm is proposed. Its efficiency and convergence can be verified by standard convex optimization theories. In each iteration, the labeling function is updated as an unconstrained Lagrange multiplier and does not need to be projected back onto any feasible set. Numerical experiments show a significantly faster convergence rate than the primal-dual algorithm in Pock et al. [9, 10] and later [22], especially at high precisions.

A significantly extended version of this paper is available at [23], which contains extensions to other regularizers and more experiments. This conference paper contains some novelties which are not in [23], such as the discussion on saturated/unsaturated edges in Sect. 3.5.

2 Preliminaries: Ishikawa’s Work

Ishikawa [6] studied image labeling problems over an image graph which can be generally formulated as:

$$\begin{aligned} \min _{u \in U} \sum _{v \in \mathcal {P}} \rho (u_v,v) + \alpha \sum _{(v,w) \in \mathcal {N}}g(u_v - u_w) \, , \end{aligned}$$
(1)

where \(\mathcal {P}\) denotes a discrete image grid in 2-D or N-D; \(\mathcal {N} \subset \mathcal {P} \times \mathcal {P}\) is a neighborhood system on \(\mathcal {P}\); \(U = \{u :\; \mathcal {P} \mapsto L\}\) is the set of all feasible labeling functions, where \(L = \{\ell _1,...,\ell _n\}\). The interaction potential \(g(x)\) in (1) is assumed to be convex, and \(\rho \) is any bounded function, but not necessarily convex. It was shown in [6] that problems of the form (1) can be exactly optimized by finding the minimal cut over a specially constructed multi-layered graph \(G = (\mathcal {V}, \mathcal {E})\), where each layer corresponds to one label.
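As a concrete illustration (our own sketch, not code from the paper), the energy (1) can be evaluated and minimized by brute force on a tiny 1-D grid; the grid size, the data costs `rho`, and the weight `alpha` below are made-up values, and the convex interaction is \(g(x)=|x|\):

```python
# Illustrative sketch: evaluate energy (1) on a tiny 1-D grid and find its
# minimizer by exhaustive search. All numeric values are made up.
from itertools import product

m, n, alpha = 2, 2, 0.5                  # 2 pixels, labels {0, 1}
rho = [[0.9, 0.1],                       # rho[l][v]: cost of label l at pixel v
       [0.2, 0.8]]

def energy(u, g=abs):                    # g is the convex interaction penalty
    data = sum(rho[u[v]][v] for v in range(m))
    reg = alpha * sum(g(u[v] - u[v + 1]) for v in range(m - 1))
    return data + reg

best = min(product(range(n), repeat=m), key=energy)
print(best, energy(best))
```

Exhaustive search is of course only viable for tiny problems; the point of Ishikawa's construction below is that (1) can be solved exactly by a single max-flow computation.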

We adopt Ishikawa’s notations [6] in this work and study the simplified graph which uses \(n-1\) layers instead of \(n\) and avoids infinite capacities on the source edges [24] (see Fig. 1 for a 1-D example). The vertex set \(\mathcal {V}\) and the edge set \(\mathcal {E}\) are defined as follows:

$$\begin{aligned} \mathcal {V} \, = \, \{u_{v,i} \, | \, v \in \mathcal {P} \, ; \; i=1,...,n-1 \} \cup \{s,t\} \end{aligned}$$
(2a)
$$\begin{aligned} \mathcal {E} \, = \, \mathcal {E}_D \cup \mathcal {E}_C \cup \mathcal {E}_R \end{aligned}$$
(2b)

where the edge set \(\mathcal {E}\) is composed of three types of edges

  • Data edges \(\mathcal {E}_D= \bigcup _{v \in \mathcal {P}} \mathcal {E}_D^v\), where

    $$\begin{aligned} \mathcal {E}_D^v \,= \, (s,u_{v,1}) \cup \{(u_{v,i}, u_{v,i+1})\, | \, i=1, \ldots ,n-2 \} \cup (u_{v,n-1},t) \, . \end{aligned}$$
    (3)
  • Penalty edges \(\mathcal {E}_C = \bigcup _{v \in \mathcal {P}} \mathcal {E}_C^v\), where

    $$\begin{aligned} \mathcal {E}_C^v \, = \, (u_{v,1},s) \cup \{(u_{v,i+1}, u_{v,i}) \,| \, i=1, \ldots ,n-2 \} \cup (t,u_{v,n-1}) \, . \end{aligned}$$
    (4)
  • Regularization edges \(\mathcal {E}_R\):

    $$\begin{aligned} \mathcal {E}_R \, = \, \{(u_{v,i}, u_{w,j})\, | \, (v,w) \in \mathcal {N} \, , \; i,j = 1,...,n-1 \} \, . \end{aligned}$$
    (5)
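To make the construction concrete, the sketch below (our own illustration, not code from the paper) enumerates the vertices and the three edge sets for a small 1-D grid; for simplicity the neighborhood \(\mathcal {N}\) is the 1-D nearest-neighbor system, and the regularization edges connect equal layers only, as in the total-variation case of Sect. 2.1:

```python
# Enumerate the simplified Ishikawa graph (2a)-(2b) for m pixels and n labels,
# hence n-1 layers. Tuples (v, i) stand for u_{v,i}; "s"/"t" are the terminals.
m, n = 3, 4
P = range(m)

V = ["s", "t"] + [(v, i) for v in P for i in range(1, n)]

# data edges (3): source -> layer 1 -> ... -> layer n-1 -> sink
E_D = ([("s", (v, 1)) for v in P]
       + [((v, i), (v, i + 1)) for v in P for i in range(1, n - 1)]
       + [((v, n - 1), "t") for v in P])

# penalty edges (4): the same chains traversed backwards (infinite capacity)
E_C = ([((v, 1), "s") for v in P]
       + [((v, i + 1), (v, i)) for v in P for i in range(1, n - 1)]
       + [("t", (v, n - 1)) for v in P])

# regularization edges (5), restricted here to equal layers (TV case)
E_R = [((v, i), (w, i)) for v in P for w in P if abs(v - w) == 1
       for i in range(1, n)]

print(len(V), len(E_D), len(E_C), len(E_R))
```

Counting confirms \(|\mathcal {V}| = 2 + m(n-1)\), one data and one penalty chain of \(n\) edges per pixel, and one regularization edge per ordered neighbor pair and layer.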

2.1 Anisotropic Total-Variation Regularization

When a pairwise prior \(g(u_v - u_w) = C(v,w) \left|u_v - u_w\right|\) is given, (1) corresponds to an anisotropic total-variation regularized image labeling problem, i.e.

$$\begin{aligned} \min _{u \in U} \sum _{v \in \mathcal {P}} \rho (u_v,v) + \alpha \sum _{(v,w) \in \mathcal {N}} C(v,w)\left|u_v - u_w\right| \, \end{aligned}$$
(6)

whose regularization term is a discrete counterpart of the total-variation regularizer.

Fig. 1.

1D illustration: (a) Legal cut, (b) Illegal cut. Severed edges are depicted as dotted arrows. The gray curve visualizes the cut. Vertices interior to the curve belong to \(V_s\), while vertices exterior to the curve belong to \(V_t\).

Now we define flow configurations over the graph (2a) and (2b) such that its max-flow corresponds to the minimizer of (6):

  • Capacity of source flows: the directed flow \(p_1(v)\) along each edge from the source \(s\) to the node \(u_{v,1}\) of the first layer, i.e. the edge \((s,u_{v,1})\), is constrained by

    $$\begin{aligned} p_1(v) \le \rho (\ell _1,v) \, , \quad \forall v \in \mathcal {P} \, ; \end{aligned}$$
    (7)
  • Capacity of flows between layers: the directed flow \(p_i(v)\) along each edge \((u_{v,i-1},u_{v,i})\) from the node \(u_{v,i-1}\) of the \((i-1)\)-th layer to the node \(u_{v,i}\) of the \(i\)-th layer is constrained by

    $$\begin{aligned} p_i(v) \le \rho (\ell _i,v) \, , \quad \forall v \in \mathcal {P} \, , \;\;\; i = 2,...,n-1 \, ; \end{aligned}$$
    (8)
  • Capacity of sink flows: the directed flow \(p_n(v)\) along each edge from the node \(u_{v,n-1}\) of the last layer to the sink \(t\) is constrained by

    $$\begin{aligned} p_n(v) \le \rho (\ell _n,v) \, , \quad \forall v \in \mathcal {P} \, ; \end{aligned}$$
    (9)
  • Capacity of spatial flows at each layer: the undirected flow \(q_i(v,w)\) of each edge \((v,w) \in \mathcal {N}\) at the layer \(i\), \(i=1, \ldots , n-1\), is constrained by

    $$\begin{aligned} |q_i(v,w)| \le C(v,w) \, ; \end{aligned}$$
    (10)

    this corresponds to the well-known anisotropic total-variation regularizer in the case of a 4-nearest-neighbor system \(\mathcal {N}\);

  • Conservation of flows: flow conservation means that in-coming flows should be balanced by out-going flows at any node \(v \in \mathcal {P}\) of each layer \(i=1, ..., n-1\), i.e.

    $$\begin{aligned} \big ( \sum _{w:(v,w)\in \mathcal {N}} q_i(v,w) - \sum _{w:(w,v)\in \mathcal {N}} q_i(w,v) \big ) - p_i(v) + p_{i+1}(v)\, = \, 0 \, . \end{aligned}$$
    (11)

Since there is no lower bound on the flows (7)–(9), the flow on the penalty edges (4) can become arbitrarily large. This implies that each edge in the set \(\mathcal {E}_D^v\) which links the source and sink can only be cut once, i.e. illegal cuts as shown in Fig. 1(b) have infinite cost and are not allowed.

Therefore, the max-flow problem over the graph is to find the largest amount of flow allowed to pass from the source \(s\) to sink \(t\) through the \(n-1\) graph layers, i.e.

$$\begin{aligned} \max _{p,q} \; \sum _{v \in \mathcal {P}} \, p_1(v) \, \end{aligned}$$
(12)

subject to the flow constraints (7), (8), (9), (10) and (11).

Due to duality between the max-flow and min-cut problem [13], one can solve the max-flow problem and then extract a solution to the min-cut problem (6).
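This equivalence can be checked numerically on a toy problem. The sketch below is our own illustration (all costs and sizes are made-up values): it builds the simplified layered graph for a 1-D image with integer labels \(\{0,1,2\}\) (so that \(\ell _{i+1}-\ell _i = 1\)), computes the max-flow with a plain Edmonds-Karp routine, and compares the flow value with a brute-force minimization of (6):

```python
# Sketch: verify max-flow = min of (6) on a tiny 1-D instance of Ishikawa's
# simplified graph. Edmonds-Karp on a dense capacity matrix; made-up data.
from collections import deque
from itertools import product

m, n, alpha = 4, 3, 0.5          # 4 pixels, labels {0, 1, 2}
rho = [[0.9, 0.1, 0.4, 0.7],     # rho[i][v]: data cost of label i at pixel v
       [0.2, 0.8, 0.3, 0.6],
       [0.5, 0.4, 0.9, 0.1]]
INF = 1e9                        # stands in for infinite penalty-edge capacity

S, T = 0, 1                      # node ids: source, sink, then (v, layer)
node = lambda v, i: 2 + v * (n - 1) + i   # layers i = 0..n-2
N = 2 + m * (n - 1)
cap = [[0.0] * N for _ in range(N)]

def add(a, b, c):                # directed edge a -> b with capacity c
    cap[a][b] += c

for v in range(m):
    add(S, node(v, 0), rho[0][v])            # source edge, cap rho(l_1, v)
    add(node(v, 0), S, INF)                  # penalty edge (reverse, infinite)
    for i in range(n - 2):                   # data edges between layers
        add(node(v, i), node(v, i + 1), rho[i + 1][v])
        add(node(v, i + 1), node(v, i), INF)
    add(node(v, n - 2), T, rho[n - 1][v])    # sink edge, cap rho(l_n, v)
    add(T, node(v, n - 2), INF)
for v in range(m - 1):                       # spatial edges, cap alpha
    for i in range(n - 1):
        add(node(v, i), node(v + 1, i), alpha)
        add(node(v + 1, i), node(v, i), alpha)

def max_flow():                  # Edmonds-Karp on the residual capacities
    total = 0.0
    while True:
        parent = [-1] * N; parent[S] = S
        dq = deque([S])
        while dq and parent[T] < 0:          # BFS for an augmenting path
            a = dq.popleft()
            for b in range(N):
                if parent[b] < 0 and cap[a][b] > 1e-12:
                    parent[b] = a; dq.append(b)
        if parent[T] < 0:
            return total
        b, bottleneck = T, INF               # bottleneck along the path
        while b != S:
            bottleneck = min(bottleneck, cap[parent[b]][b]); b = parent[b]
        b = T
        while b != S:                        # update residual capacities
            cap[parent[b]][b] -= bottleneck; cap[b][parent[b]] += bottleneck
            b = parent[b]
        total += bottleneck

flow = max_flow()
brute = min(sum(rho[u[v]][v] for v in range(m)) +
            alpha * sum(abs(u[v] - u[v + 1]) for v in range(m - 1))
            for u in product(range(n), repeat=m))
print(flow, brute)               # the two values agree (Ishikawa's theorem)
```

The infinite-capacity penalty edges are what force each pixel's data chain to be severed exactly once, so every finite cut encodes a valid labeling.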

3 Multilabeling by Continuous Max-Flow and Min-Cut

Define the feasible set of labeling functions as \(U = \{u \, : \, \varOmega \mapsto \{\ell _1,...,\ell _n\} \; \text {s.t.} \; \int _\varOmega |\nabla u| < \infty \}\), where \(\ell _1 < ... < \ell _n\) are real numbers. The continuous counterpart of (1) can be formulated as

$$\begin{aligned} \min _{u \in U} \int _\varOmega \rho (u(x),x) \, dx + \int _\varOmega C(x) |\nabla u(x)| \, dx \, , \end{aligned}$$
(13)

where \(\rho \, : \mathbb {R} \times \varOmega \mapsto \mathbb {R}\) is any bounded function, not necessarily convex. The set \(U\) is a non-convex set of discrete-valued labeling functions. This is in contrast to Pock et al. [9, 10], who considered a convex feasible set of continuous-valued labeling functions. We show that this problem can be regarded as a continuous min-cut problem by following the ideas of Ishikawa.

We start by rewriting (13) in terms of the upper level sets of \(u \in U\)

$$\lambda _i(x) = {\left\{ \begin{array}{ll} 1 \, , &{} \; \text{ if } u(x) > \ell _i \\ 0 \, , &{} \; \text{ if } u(x) \le \ell _i \end{array}\right. } \, , \forall x \in \varOmega \; \quad i = 1, \ldots , n-1 \, . $$

Let \(\lambda _0(x) = 1\) and \(\lambda _n(x) = 0\), \(\text {a.e.} \; x \in \varOmega \). Clearly, we have

$$\begin{aligned} 1 = \lambda _0(x) \ge \lambda _1(x) \ge \lambda _2(x) \ge ... \ge \lambda _{n-1}(x) \ge \lambda _n(x) = 0 \; \text {a.e.} \; x \in \varOmega . \end{aligned}$$
(14)

By the coarea formula, we have for any function \(u \in U\) that

$$ \int _\varOmega C(x)|\nabla u| \, dx \, = \, \sum _{i=1}^{n-1} \int _{\varOmega } C_i(x) |\nabla \lambda _i| \, dx \, , $$

where \(C_i(x) = (\ell _{i+1} - \ell _i)C(x), \; i=1,...,n-1\). In this work, we will mostly focus on the case where \(C(x) = \alpha \) is constant for simplicity.

Therefore, (13) can be equivalently rewritten as

$$\begin{aligned} \min _{\{\lambda _i\}_{i=1}^{n-1} \in \mathcal {B} } \,&\sum _{i=1}^{n} \int _\varOmega (\lambda _{i-1} - \lambda _i)\, \rho (\ell _i,x)\, dx \, + \, \alpha \sum _{i=1}^{n-1} (\ell _{i+1} - \ell _i)\int _\varOmega |\nabla \lambda _i| \, dx \end{aligned}$$
(15)

subject to the constraint (14), where the binary constraint \(\mathcal {B}\) is defined as

$$\begin{aligned} \mathcal {B} = \{\phi \, : \; \varOmega \mapsto \{0,1\}, \; \text {s.t.} \; \int _\varOmega |\nabla \phi | < \infty \} \end{aligned}$$
(16)

The problem (15) is obviously a nonconvex optimization problem due to the binary constraints (16).

After solving (15), the labeling function \(u\) can be recovered from \(\lambda _i\) by \( u\,=\,\sum _{i=1}^{n} (\lambda _{i-1} - \lambda _i) \ell _i \, \).
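The level-set decomposition and the reconstruction formula can be verified numerically. The sketch below is our own illustration with made-up label values; it also checks the discrete analogue of the coarea identity used above:

```python
# Sketch: upper level sets lambda_i of a labeling u, the reconstruction
# u = sum_i (lambda_{i-1} - lambda_i) l_i, and a discrete coarea check.
import numpy as np

levels = np.array([0.0, 1.0, 2.5])            # l_1 < l_2 < l_3 (illustrative)
u = np.array([0.0, 2.5, 1.0, 1.0, 2.5])       # labeling on a 1-D "image"
n = len(levels)

# lambda_0 = 1, lambda_i = 1{u > l_i} for i = 1..n-1, lambda_n = 0
lam = np.vstack([np.ones_like(u)]
                + [(u > levels[i]).astype(float) for i in range(n - 1)]
                + [np.zeros_like(u)])

u_rec = sum((lam[i - 1] - lam[i]) * levels[i - 1] for i in range(1, n + 1))

# discrete coarea identity: TV(u) = sum_i (l_{i+1} - l_i) * TV(lambda_i)
tv = np.abs(np.diff(u)).sum()
tv_coarea = sum((levels[i] - levels[i - 1]) * np.abs(np.diff(lam[i])).sum()
                for i in range(1, n))
print(u_rec, tv, tv_coarea)
```

The reconstruction returns `u` exactly, and the total variation of `u` equals the weighted sum of the layer variations, as the coarea formula states.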

3.1 Primal Model: Continuous Max-Flow

In this section, we build up a max-flow model in continuous settings, which simulates Ishikawa’s graph configuration. It will be shown that solutions of (15) and (13) can be obtained by exploring the dual of this maximization problem.

To this end, we place \(n-1\) image domains \(\varOmega _i\), \(i=1,\ldots ,n-1\), with \(\varOmega _i = \varOmega \), layered in sequential order between two terminals: the source \(s\) and the sink \(t\). The source \(s\) is linked to each image pixel \(x\) of the first layer \(\varOmega _1\) by an edge \(e_1(x)\); the same image pixel \(x\) between two sequential image layers \(\varOmega _{i-1}\) and \(\varOmega _{i}\), \(i=2,\ldots ,n-1\), is linked by the edge \(e_i(x)\); and the pixel \(x\) at the last layer \(\varOmega _{n-1}\) is linked to the sink \(t\) by the edge \(e_{n}(x)\). Define flow functions \(p_i \, : \varOmega \mapsto \mathbb {R}\) corresponding to each edge \(e_i\), \(i=1,\ldots ,n\). Within each image layer \(\varOmega _i\), \(i=1,\ldots ,n-1\), the spatial flow functions are given by \(q_i \in C^ \infty (\varOmega )^N\), where \(N\) is the dimension of the image domain.

As a generalization of the discrete constraints (7)–(11), we now give constraints on flow functions \(p_i \in L^1(\varOmega )\), \(i=1,\ldots ,n\), and \(q_i \in {C^\infty }(\varOmega )^N\), \(i=1,\ldots ,n-1\)

$$\begin{aligned} \left|q_i(x)\right| \, \le \, C_i(x) \,&\text {for} \; x \in \varOmega \, , \;\;&i=1,\ldots , n-1 \, \end{aligned}$$
(17)
$$\begin{aligned} p_i(x) \, \le \, \rho (\ell _i,x) \,&\text {for} \; x \in \varOmega \, , \;\;&i = 1, \ldots , n \,\end{aligned}$$
(18)
$$\begin{aligned} \big ( {{\mathrm{div}}}\,q_i - p_{i} + p_{i+1}\big )(x) \, = \, 0 \,&\text {for} \; x \in \varOmega \, , \;\;&i=1,\ldots ,n-1 \, \end{aligned}$$
(19)
$$\begin{aligned} q_i \cdot n = 0 \;&\text {on} \; \partial \varOmega \, , \;\;&i=1,\ldots ,n-1 \, . \end{aligned}$$
(20)

Therefore, the continuous max-flow model, in analogy with Ishikawa’s discrete model (12), can be formulated as

$$\begin{aligned} \sup _{p,q} E^P(p) = \, \int _{\varOmega } \, p_1(x) \, dx \end{aligned}$$
(21)

subject to the constraints (17)–(20). In this work, we call (21) the primal model. Observe that the maximization problem (21) is bounded above by \(\int _{\varOmega } \rho (\ell _1,x) \, dx\) due to the constraint (18).

3.2 Primal-Dual Model

By introducing multiplier functions \(\lambda _i(x)\), \(i=1,\ldots ,n-1\), to the linear equality constraints of flow conservation (19), we have the equivalent primal-dual model of (21):

$$\begin{aligned} \inf _{\lambda } \sup _{p, q} \quad&E(p,q;\lambda ) \, = \,\int _{\varOmega } \big \{ p_1 + \sum _{i=1}^{n-1} \lambda _i\big ( {{\mathrm{div}}}q_i - p_{i} + p_{i+1} \big )\big \}\, dx \end{aligned}$$
(22)

subject to (17), (18) and (20).

After rearrangement, the above primal-dual formulation (22) can be equivalently written as

$$\begin{aligned} \inf _{\lambda } \sup _{p, q} \quad&E(p,q;\lambda ) \, = \, \sum _{i=1}^{n} \int _{\varOmega } \, (\lambda _{i-1} - \lambda _{i}) p_{i} \, dx \, + \, \sum _{i=1}^{n-1} \int _{\varOmega } \, \lambda _i \,{{\mathrm{div}}}\, q_i \, dx \end{aligned}$$
(23)

subject to (17), (18) and (20).

3.3 Dual Model: Continuous Min-Cut

Now we show that optimizing the primal-dual model (23) over all the flow functions \(p\) and \(q\) leads to the equivalent dual model, i.e. the continuous min-cut model:

$$\begin{aligned} \inf _{\lambda } E^D(\lambda ) = \quad&\sum _{i=1}^{n} \int _{\varOmega } (\lambda _{i-1} - \lambda _{i}) \rho (\ell _i,x) \, dx \, + \, \sum _{i=1}^{n-1} \int _{\varOmega } C_i(x)\left|\nabla \lambda _i\right| \, dx \\ \text {s.t.} \quad&1 \,= \, \lambda _0(x) \, \ge \, \lambda _1(x) \, \ge \, \ldots \, \ge \, \lambda _{n-1}(x) \, \ge \,\lambda _n(x) \, = \, 0 \, , \quad \forall x \in \varOmega \, . \nonumber \end{aligned}$$
(24)

Optimization of Flow Functions. In this regard, we consider the optimization problem

$$\begin{aligned} f(v) \, = \, \sup _{w \le C} \, v \cdot w \, , \end{aligned}$$
(25)

where \(v\), \(w\) and \(C\) are scalars. When \(v < 0\), taking \(w\) negative with arbitrarily large magnitude makes the value \(v \cdot w\) arbitrarily large, i.e. \(f(v) = + \infty \). Therefore, we must have \(v \ge 0\) for the function \(f(v)\) to be meaningful, and

$$ \left\{ \begin{array}{ll} \text {if} \; v = 0 \, , &{} \text {then any} \; w \le C \; \text {is optimal and} \; f(v) \; \text {attains its maximum} \; 0 \, ; \\ \text {if} \; v > 0 \, , &{} \text {then} \; w = C \; \text {and} \; f(v) \; \text {attains its maximum} \; v \cdot C \, . \end{array} \right. $$

Therefore, we have

$$\begin{aligned} f(v) \, = \, \left\{ \begin{array}{ll} v \cdot C \, , &{} v \ge 0 \, , \\ \infty &{} v < 0 \, \, \end{array} \right. \, . \end{aligned}$$
(26)

The function \(f(v)\) given in (25) provides us with a prototype to maximize the flow functions \(p_i(x)\), \(i=1,\ldots ,n\), in the primal-dual model (23).
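The case analysis behind (26) can be checked numerically; the sketch below (our own illustration) approximates the supremum in (25) on a finite grid of \(w\) values:

```python
# Numeric check of (25)-(26): f(v) = sup_{w <= C} v*w equals v*C for v >= 0
# and is unbounded for v < 0 (here it only grows with the grid's lower end).
import numpy as np

C = 2.0
ws = np.linspace(-50.0, C, 100001)    # finite stand-in for (-inf, C]

def f(v):
    return np.max(v * ws)

print(f(1.5), f(0.0), f(-1.0))        # f(-1.0) = 50 only because the grid
                                      # is truncated at w = -50
```

Widening the grid makes `f(-1.0)` grow without bound, mirroring \(f(v) = +\infty \) for \(v < 0\).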

For each \(x \in \varOmega \), consider

$$\begin{aligned} f_i(x) \,= \, \sup _{p_i(x) \le \rho (\ell _i,x)} (\lambda _{i-1}(x) -\lambda _i(x) )\, p_i(x) \, , \quad i = 1, \ldots ,n \, . \nonumber \end{aligned}$$

In view of (26), we have

$$\begin{aligned} f_i(x) \, = \left\{ \begin{array}{ll} (\lambda _{i-1}(x) - \lambda _i(x)) \, \rho (\ell _i,x) \, , &{} \lambda _{i-1}(x) \ge \lambda _{i}(x) \, \\ \infty &{} \lambda _{i-1}(x) < \lambda _{i}(x) \, \end{array} \right. \, , \; i = 1, \ldots ,n \, . \end{aligned}$$
(27)

On the other hand, it is well known that for any \(\lambda _i \in BV(\varOmega )\)

$$\begin{aligned} \sup _{q_i} \int _\varOmega \lambda _i(x){{\mathrm{div}}}q_i(x) \, dx \,= \, \int _\varOmega C_i(x) |\nabla \lambda _i(x)| \, dx \, , \end{aligned}$$
(28)

when \(q_i\) is optimized over the set (17) and (20). In view of (27) and (28), maximizing (23) over all the flow functions \(p\) and \(q\) leads directly to the equivalent dual model (24). The constraints (14) must be satisfied for an optimal \(\lambda \), otherwise the energy would be infinite, contradicting boundedness of the max-flow problem from above.

Note that a solution to the problem (24) exists, since the energy (24) is convex, lower semi-continuous and bounded from below, and the constraints are convex. Regarding existence of a solution to the max-flow problem (21): due to boundedness from above, a maximizing sequence \(\{p^i,q^i\}_{i=1}^\infty \) exists for (21). However, it may not admit a subsequence in \(q^i\) converging to a smooth limit, because the supremum may be attained by a discontinuous \(q^*\) which lies in the closure of the set of smooth vector fields \(C^\infty (\varOmega )^N\) and not in the set itself. To ease readability, we still speak of \((p^*,q^*)\) as a primal-dual solution even though \(q^*\) may be discontinuous. A more formal presentation can be given by replacing arguments involving \((p^*,q^*)\) with limits of the maximizing sequence \(\{p^i,q^i\}_{i=1}^\infty \).

3.4 Exact and Global Optimums

The functions \(\lambda _i\), \(i=1 \ldots n-1\), of the convex model (24) are relaxed to take values in the convex set \([0,1]\), which is in contrast to the binary constraints of the original nonconvex formulation (15). The following proposition establishes a primal-dual relationship between the max-flow problem (21) and the original non-convex problem (15). By solving the max-flow problem (21) a set of optimizers to the original binary constrained problem (15) can be obtained by thresholding each layer function \(\lambda _i^*\).
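As a small illustration (with a made-up relaxed solution, not output of the actual algorithm), thresholding all layer functions at a common level \(t\) produces binary functions that still satisfy the monotonicity constraint (14):

```python
# Sketch: threshold a hypothetical relaxed solution lambda* at a common t.
import numpy as np

lam_star = np.array([[0.9, 0.2, 0.6],    # lambda_1*(x) at three points x
                     [0.7, 0.1, 0.3]])   # lambda_2*(x) <= lambda_1*(x)
t = 0.5
lam_t = (lam_star >= t).astype(float)    # indicators of the level sets (29)
print(lam_t)
```

Because all thresholds are equal, \(\lambda _1^* \ge \lambda _2^*\) implies \(\lambda _1^{t} \ge \lambda _2^{t}\) pointwise, so the thresholded functions remain a feasible, now binary, labeling.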

Proposition 1

Assume \(\lambda ^*\) is a minimizer of (24) and let \(\{t_i\}_{i=1}^{n-1}\) be a sequence of thresholds such that \(0 < t_1 = t_2 = ... = t_{n-1} \le 1\). Define the level sets

$$\begin{aligned} S^{t_i}_i = \{x \, : \; \lambda _i^*(x) \ge t_i\} \, , \quad i \, = \, 1 \ldots n-1 \, \end{aligned}$$
(29)

and let \(\lambda _i^{t_i}(x)\) be the characteristic function of \(S^{t_i}_i\), i.e.

$$ \lambda _i^{t_i}(x) \, := \, \left\{ \begin{array}{ll} 1 \, , \,&{} {\lambda _i^{*}(x) \ge t_i} \\ 0 \, , \,&{} {\lambda _i^{*}(x) < t_i} \end{array} \right. \, . $$

then the set of binary functions \(\lambda _i^{t_i}(x)\), \(i=1,\ldots ,n-1\), is a global optimum of the original nonconvex multi-labeling problem (15). Furthermore, if \((p^*,q^*; \lambda ^*)\) is any optimal primal-dual solution of (22), the cut given by \(\lambda _i^{t_i}(x)\), \(i=1,\ldots ,n-1\), has an energy equal to the max flow energy in (21), i.e.

$$E^D(\lambda ^t) = \int _{\varOmega } \, p_1^{*}(x) \, dx = E^P(p^*) . $$

Proof

Since \(p_i^{*}\), \(i=1,...,n\), and \(q_i^{*}, \lambda _i^{*}\), \(i=1,...,n-1\), constitute a global optimum of the primal-dual problem (22), \(p_i^{*}\), \(q_i^{*}\) optimize the max-flow problem (21) and \(\lambda ^{*}_i\) optimizes (24).

For simplicity, define \(t_0 = 0\), so that \(S^{t_0}_0 = \varOmega \). Since \(\lambda ^*_{i-1} \ge \lambda ^*_i\) by (14) and all thresholds \(t_i\) are equal, we must have

$$S^{t_0}_0 \supseteq S^{t_1}_1 \supseteq S^{t_2}_2 \supseteq ... \supseteq S^{t_{n-1}}_{n-1}$$

Since the variables are optimal, the flow conservation condition (19) must hold, i.e.

$$ {{\mathrm{div}}}q^{*}_i(x) - p_i^{*}(x) + p_{i+1}^{*}(x) \, = \, 0 \, , \;\;\; \text {a.e.} \; x \in \varOmega , \; i=1,...,n-1. $$

The proof is given by induction in \(S^{t_i}_i\). For any \(k \in \{1,...,n-1\}\) define the function

$$\begin{aligned} E^k \, =&\sum _{i=1}^{k}\int _{S^{t_{i-1}}_{i-1} \backslash S^{t_i}_i} \, \rho (\ell _i,x) \, dx + \int _{S^{t_{k}}_{k}} p^*_{k+1}(x) \, dx + \alpha \sum _{i=1}^{k} L_{S^{t_i}_i} \, \end{aligned}$$

where \(L_{S^{t_i}_i}\) denotes the boundary length \(|\partial S^{t_i}_i \backslash (\partial S^{t_i}_i \cap \partial \varOmega )|\). We will prove that \(E^k = E^P(p^*)\) for any \(k \in \{1,...,n-1\}\), starting with \(k=1\). By the formula (27), it follows that

$$p^*_1(x) = \rho (\ell _1,x), \;\;\; \text {for any point} \;\; x \in \varOmega \backslash S^{t_1}_1 = S^{t_0}_0 \backslash S^{t_1}_1$$

This, together with the fact that

$$p^*_1(x) = p^*_2(x) + {{\mathrm{div}}}q^*_1(x), \;\;\; \text {a.e.} \; x \in S^{t_1}_1$$

implies that the total max-flow energy defined in (21) can be written

$$\begin{aligned} E^P(p^*) \, =&\int _{\varOmega \backslash S^{t_1}_1} \, \rho (\ell _1,x) \, dx + \int _{S^{t_1}_1} \, \big (p^*_2(x) + {{\mathrm{div}}}q^{*}_1(x)\big ) \, dx\\ =&\int _{\varOmega \backslash S^{t_1}_1} \, \rho (\ell _1,x) \, dx + \int _{S^{t_1}_1} \, p^*_2(x) \, dx + \int _{S^{t_1}_1} \, {{\mathrm{div}}}q^{*}_1(x) \, dx \\ =&\int _{S^{t_0}_0 \backslash S^{t_1}_1} \, \rho (\ell _1,x) \, dx + \int _{S^{t_1}_1} \, p^*_2(x) \, dx + \alpha L_{S^{t_1}_1} \, = E^1 \end{aligned}$$

The last term follows because

$$\begin{aligned} \int _{S^{t_i}_i} \, {{\mathrm{div}}}q^{*}_i(x) \, dx \, = \, \int _{\varOmega } \, \lambda _i^{t_i} {{\mathrm{div}}}q^*_i \, dx \, = \, \alpha \int _\varOmega |\nabla \lambda _i^{t_i}|\, dx = \alpha \left|\partial S^{t_i}_i \backslash (\partial S^{t_i}_i \cap \partial \varOmega )\right| \, . \end{aligned}$$
(30)

where the second equality is due to Prop. 4 of [25]. Note that the boundary length \(L_{S^{t_1}_1}\) is necessarily finite, otherwise the energy would be infinite, contradicting boundedness from above.

Assume now that \(E^{k-1} = E^P(p^*)\) for some \(k \in \{2,...,n-1\}\); we will show this implies \(E^{k} = E^P(p^*)\). By definition,

$$\begin{aligned} E^P(p^*) = E^{k-1} \, =&\sum _{i=1}^{k-1}\int _{S^{t_{i-1}}_{i-1} \backslash S^{t_i}_i} \, \rho (\ell _i,x) \, dx + \int _{S^{t_{k-1}}_{k-1}} p^*_k(x) \, dx + \alpha \sum _{i=1}^{k-1} L_{S^{t_i}_i} \, . \end{aligned}$$

By the definition (29), it follows that \(\lambda ^*_{k-1}(x) - \lambda ^*_k(x) > t_{k-1} - t_k = 0\) for all \(x \in S^{t_{k-1}}_{k-1} \backslash S^{t_k}_{k}\). Therefore, by formula (27), for any point \(x \in S^{t_{k-1}}_{k-1} \backslash S^{t_k}_{k}\) we must have \(p^*_{k}(x) = \rho (\ell _{k},x)\). Combining this with the fact that

$$p^*_k(x) = p^*_{k+1}(x) + {{\mathrm{div}}}q^*_k(x), \; \text {a.e.} \; x \in \varOmega $$

and with (30), the above expression can be written

$$\begin{aligned} E^P(p^*) \, =&\sum _{i=1}^{k-1}\int _{S^{t_{i-1}}_{i-1} \backslash S^{t_i}_i} \, \rho (\ell _i,x) \, dx + \int _{S^{t_{k-1}}_{k-1} \backslash S^{t_{k}}_{k}} \rho (\ell _{k},x) \, dx \\&+ \int _{S^{t_{k}}_{k}} p^*_{k+1}(x) \, dx + \alpha L_{S^{t_k}_k} + \alpha \sum _{i=1}^{k-1} L_{S^{t_i}_i} \, = E^{k}. \nonumber \end{aligned}$$
(31)

Hence, we can conclude that \(E^{n-1} = E^P(p^*)\). By noting from (27) that \(p^*_n(x) = \rho (\ell _n,x)\) for all \(x \in S^{t_{n-1}}_{n-1}\), the total max-flow energy defined in (21) can be written

$$\begin{aligned} E^P(p^*) = E^{n-1} \, =&\int _{\varOmega \backslash S^{t_1}_1} \, \rho (\ell _1,x) \, dx + \sum _{i=2}^{n-1}\int _{S^{t_{i-1}}_{i-1} \backslash S^{t_i}_i} \, \rho (\ell _i,x) \, dx\\ +&\int _{S^{t_{n-1}}_{n-1}} \rho (\ell _n,x) \, dx + \alpha \sum _{i=1}^{n-1} L_{S^{t_i}_i} \, \nonumber \end{aligned}$$
(32)

By writing this expression in terms of the characteristic functions \(\lambda ^{t_i}_i\) of each region \(S^{t_i}_i\), we get

$$\begin{aligned} E^P(p^*) = \sum _{i=1}^{n} \, \int _\varOmega (\lambda _{i-1}^{t_{i-1}}(x) - \lambda _i^{t_i}(x))\, \rho (\ell _i,x)\, dx \, + \, \alpha \sum _{i=1}^{n-1} \, \int _\varOmega |\nabla \lambda _i^{t_i}| \, dx = E^D(\lambda ^t) \end{aligned}$$

which is exactly the dual model energy (24) evaluated at the set of binary functions \(\lambda ^{t_i}_i\). Therefore, by duality between the max-flow problem (21) and the convex relaxed problem (24), \(\lambda ^{t_i}_i\) must be a global minimum of the min-cut problem (24) and therefore also a global minimum of the original problem (15).

3.5 ‘Saturated’/‘Unsaturated’ Edges

In the discrete setting, it is well known that the minimum cut severs edges that are saturated by the maximum flow. This section gives a variational explanation of this phenomenon for the continuous max-flow and min-cut problems studied in this work. Let \(\lambda ^*_1,...,\lambda ^*_{n-1}\) be optimal to the dual problem (24), and set \(\lambda ^*_0 = 1\), \(\lambda ^*_n = 0\). Assume that for some \(x \in \varOmega \), some \(i \in \{0,...,n-1\}\) and some \(t \in (0,1)\) we have \(\lambda ^*_{i}(x) > t > \lambda ^*_{i+1}(x)\). Thresholding at \(t\) then generates the binary solution \(\lambda ^t_0(x) = ... = \lambda ^t_i(x) = 1\) and \(\lambda ^t_{i+1}(x) = ... = \lambda ^t_{n}(x) = 0\), so the cut generated by the binary functions ‘severs’ the edge \(e_{i+1}(x)\) between layers \(i\) and \(i+1\). Since \(\lambda ^*_{i}(x) > \lambda ^*_{i+1}(x)\), it follows by (27) that the optimal flow must satisfy \(p^*_{i+1}(x) = \rho (\ell _{i+1},x)\), i.e. the edge \(e_{i+1}(x)\) is saturated. Assume on the contrary that \(p^*_i(x) < \rho (\ell _i,x)\) for some \(x \in \varOmega \) and \(i \in \{1,...,n\}\). In this case \(\lambda ^*_{i-1}(x) = \lambda ^*_{i}(x)\); otherwise \(p^*_i(x)\) would not be optimal, since increasing \(p^*_i(x)\) would increase the energy. Consequently, for any threshold level \(t \in (0,1]\), \(\lambda ^ t_{i-1}(x) = \lambda ^ t_{i}(x)\), i.e. the edge \(e_i(x)\) is not severed by the cut.

Similar interpretations of the spatial flows can be made by using the identity

$$\begin{aligned} \int _\varOmega \lambda \,{{\mathrm{div}}}\, q \, dx = - \int _\varOmega q \cdot \nabla \lambda \, dx \, , \end{aligned}$$
(33)

which holds by integration by parts together with the boundary condition (20).

If for some \(x \in \varOmega \) and a neighborhood \(\mathcal {N}_\epsilon (x) = \{y \in \varOmega \, : \; ||y-x|| < \epsilon \}\) we have \(|q^ *_i(y) | < \alpha \) for all \(y \in \mathcal {N}_\epsilon (x)\), we say the spatial flow is unsaturated in \(\mathcal {N}_\epsilon (x)\). Then \(\lambda ^*_i\) is constant in \(\mathcal {N}_\epsilon (x)\). Consequently, for any threshold \(t \in (0,1]\), \(\lambda ^ t_i(y)\) is identically \(0\) or identically \(1\) in \(\mathcal {N}_\epsilon (x)\), and the cut does not sever the spatial domain \(\mathcal {N}_\epsilon (x)\) at the \(i\)-th layer. Conversely, assume \(\nabla \lambda ^*_i \ne 0\) in some domain \(S \subset \varOmega \); then by (33), \(|q^*_i(x)| = |\alpha \nabla \lambda _i^*/|\nabla \lambda _i^*|| = \alpha \) a.e. \(x \in S\). Consequently, for any threshold \(t \in (0,1]\), \(|q^*_i| = \alpha \) wherever \(\nabla \lambda ^ t_i \ne 0\) in the distributional sense.

3.6 Extension to Continuous Labelings

Assume now that the feasible label values are constrained to the continuous interval \([\ell _{\min }, \ell _{\max }]\). In the limit of infinitely many labels, the max-flow problem (21) with the flow constraints (17)–(19) becomes

$$\begin{aligned} \sup _{p,q} \quad&\,\int _{\varOmega } p(\ell _{\min },x) \, dx \end{aligned}$$
(34)
$$\begin{aligned} \text {s.t.} \quad&\, p(\ell , x) \, \le \, \rho (\ell , x) \, , \quad \left|q(\ell , x)\right| \, \le \, \alpha , \quad&\forall x \in \varOmega , \;\;\; \forall \ell \in [\ell _{\min }, \ell _{\max }] \end{aligned}$$
(35)
$$\begin{aligned}&{{\mathrm{div}}}_x q(\ell , x) + \partial _{\ell } \, p(\ell , x) \, = \, 0 \, , \quad&\text {a.e.} \; x \in \varOmega , \;\;\; \ell \in [\ell _{\min }, \ell _{\max }].\, \end{aligned}$$
(36)

where \([\ell _{\min }, \ell _{\max }]\) is the set of all feasible continuous-valued labels \(\ell \). The flow functions \(p(\ell ,x)\) and \(q(\ell ,x)\) are defined on the one-dimension-higher space \([\ell _{\min }, \ell _{\max }] \times \varOmega \). By carrying out similar steps as in the last section, the following dual problem can be derived.

Proposition 2

The max-flow model (34) with continuous label values is dual, i.e. equivalent, to the following min-cut model over \([\ell _{\min }, \ell _{\max }] \times \varOmega \):

$$\begin{aligned} \min _{\lambda (\ell , x) \in [0,1]} \; \int _{\ell _{\min }}^{\ell _{\max }} \int _{\varOmega } \big \{ \alpha \left|\nabla _x \lambda \right| \, - \, \rho (\ell , x)\partial _{\ell }\, \lambda (\ell , x)\big \} \, dx d\ell \,\nonumber \\ \!\!+ \int _{\varOmega } (1-\lambda (\ell _{\min },x))\rho (\ell _{\min },x) + \lambda (\ell _{\max },x)\rho (\ell _{\max },x)\, dx \end{aligned}$$
(37)

subject to

$$\begin{aligned} \partial _{\ell }\, \lambda (\ell , x) \, \le \, 0 \, , \quad \lambda (\ell _{\min }, x) \le 1 \, , \quad \lambda (\ell _{\max }, x) \ge 0 \,, \quad \forall x \in \varOmega , \;\;\; \forall \ell \in [\ell _{\min }, \ell _{\max }] . \end{aligned}$$
(38)

The proof can be found in [23].

The labeling function \(u(x)\) can finally be reconstructed from \(\lambda (\ell , x)\) by \( u(x) \, = \, \ell _{\min } + \int _{\ell _{\min }}^{\ell _{\max }} \lambda (\ell , x) \, d \ell \,\).
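On a uniform discretization of the label axis, this reconstruction reduces to a Riemann sum. The following sketch assumes such a uniform grid; the function name is our own:

```python
import numpy as np

def reconstruct_labeling(lam, l_min, l_max):
    """Recover u(x) = l_min + integral of lam(l, x) over [l_min, l_max]
    by a Riemann sum; lam has shape (n_levels, H, W) and is
    non-increasing along the label axis."""
    dl = (l_max - l_min) / lam.shape[0]  # uniform label step
    return l_min + dl * lam.sum(axis=0)
```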

In [9], Pock et al. gave a similar formulation of continuous labeling problems, as the search for a binary function defined over \([\ell _{\min }, \ell _{\max }] \times \varOmega \), which minimizes

$$\begin{aligned} \min _{\lambda (\ell , x) \in \{0,1\}} \, \int _{\ell _{\min }}^{\ell _{\max }} \int _{\varOmega } \big \{\alpha \left|\nabla _x \lambda \right| + \rho (\ell , x)\left|\partial _{\ell }\lambda (\ell , x)\right| \big \}\, dx d\ell \end{aligned}$$
(39)

subject to

$$\begin{aligned} \lambda (\ell _{\min }, x) = 1 \, , \quad \lambda (\ell _{\max }, x) = 0 \, , \quad x \in \varOmega \end{aligned}$$
(40)

In order to solve this non-convex binary problem, the convex relaxation of [8] was adopted, minimizing over \(\lambda (\ell , x) \in [0,1]\). By the thresholding result of [8], binary optimizers can then be obtained by thresholding the computed relaxed solution.

Some differences can be observed between our formulation (37), (38) and the formulation (39), (40): the constraint \(\partial _{\ell }\lambda (\ell , x) \le 0\) is not enforced explicitly in [9]. However, it turns out that the absolute value in the term \(\rho (\ell ,x)\left|\partial _{\ell }\lambda (\ell , x)\right|\) forces this constraint to hold. Observe that if \(\rho (\ell ,x) < 0\), the formulation (39) is non-convex and can therefore not be minimized globally. This is in contrast to our formulation (37), which is convex also in this case. The functional (39) could be made convex by adding a sufficiently large constant to the data term at every \(x \in \varOmega \). In the more recent work of Pock et al. [10], a more rigorous derivation resulted in a slightly different formulation, in which the integrand of the energy functional is infinite whenever \(\partial _{\ell }\lambda (\ell , x) > 0\); hence the monotonicity constraint is forced to hold. Their derivations rely heavily on results from the theory of calibrations [12] and Cartesian currents [26, 27]. Label values ranging over the whole real line \(\mathbb {R}\) were assumed, which required imposing limits at infinity: \(\lim _{\ell \rightarrow + \infty } \lambda (\ell ,x) = 0\) and \(\lim _{\ell \rightarrow - \infty } \lambda (\ell ,x) = 1\).


In practice we eventually stick to a finite set of label values. After discretization, the label space also becomes discrete in [10]. However, it has not been proven whether all properties, such as the thresholding scheme and the monotonicity constraint, hold exactly after discretization. In contrast, these properties were proved to hold exactly for our model with discrete label values developed in Sect. 3.

Last but not least, a primal-dual algorithm was proposed in [10], which iteratively takes ascent steps over the dual variables \(p\) and \(q\) and a descent step over the primal variable \(\lambda \), followed by projections of all variables onto their feasible sets, until convergence.

4 Algorithms

4.1 Multiplier-Based Max-Flow Algorithm

In this section, it is assumed that the image domain \(\varOmega \) is discrete and that the differential operators are discretized, so that the optimization problems become finite dimensional. We stick to the continuous notation, using \(\int \), \(\nabla \) and \({{\mathrm{div}}}\) to ease readability. As stated in the previous section, the energy formulation (22) is just the Lagrangian function of (21), with \(\lambda _i\), \(i=1,\ldots ,n-1\), as the multiplier functions. Accordingly, we define the respective augmented Lagrangian function as

$$\begin{aligned} L_c(p,q, \lambda ) \, := \, \int _\varOmega p_1 \, + \sum _{i=1}^{n-1} \lambda _i( {{\mathrm{div}}}q_i + p_{i+1} - p_i) - \frac{c}{2} |{{\mathrm{div}}}q_i + p_{i+1} - p_i|^2 \, dx , \end{aligned}$$
(44)

where \(c > 0\).

We propose an algorithm for the continuous max-flow problem (21) based on the augmented Lagrangian method [29]; see Algorithm 1. It is an example of an alternating direction method of multipliers, where (44) is maximized alternately with respect to each variable \(p_i\), \(q_i\), followed by an update of the Lagrange multipliers \(\lambda _i\), \(i=1,\ldots ,n-1\), at each iteration. For the two-label case, a similar flow-maximization scheme for the continuous min-cut problem was proposed in [19, 20].
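The multiplier-update step of such a scheme can be sketched as follows. The backward-difference divergence is a simple assumed discretization (not the mimetic one used later), all names are our own, and the maximization steps over \(p_i\) and \(q_i\) are assumed to have been carried out beforehand:

```python
import numpy as np

def divergence(qx, qy):
    """Backward-difference divergence of a spatial flow q = (qx, qy)
    on a 2D grid; an assumed discretization, for illustration only."""
    dx = qx.copy(); dx[:, 1:] -= qx[:, :-1]
    dy = qy.copy(); dy[1:, :] -= qy[:-1, :]
    return dx + dy

def multiplier_update(lam, q, p, c):
    """Lagrange multiplier step of the augmented Lagrangian scheme:
    lam_i <- lam_i - c * (div q_i + p_{i+1} - p_i), i = 1, ..., n-1.

    lam: list of multiplier fields; q: list of (qx, qy) flow pairs;
    p: list of flows p_1, ..., p_n; c: penalty parameter."""
    for i in range(len(lam)):
        lam[i] = lam[i] - c * (divergence(*q[i]) + p[i + 1] - p[i])
    return lam
```

Note that the same residual \({{\mathrm{div}}}\, q_i + p_{i+1} - p_i\) that is penalized in (44) drives the multiplier update, so at convergence the flow-conservation constraint holds.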

Instead of solving the sub-problem (41) iteratively by the projected-gradient algorithm [28], an inexact solution can be obtained by the linearization:

$$\begin{aligned} q_i^{k+1} = \varPi _\alpha \bigg ( q_i^k + c\nabla ({{\mathrm{div}}}q_i^k + p_{i+1}^k - p_{i}^{k+1} - \lambda _i^k/c) \bigg ) \end{aligned}$$
(45)

where \(\varPi _\alpha \) is the projection onto the convex set \(C_\alpha =\{ q \ | \ \Vert q\Vert _\infty \le \alpha \}\). Extended convergence results exist for such a linearization of closely related problems [30].
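The projection \(\varPi _\alpha \) and the linearized update (45) amount to only a few lines. The forward-difference gradient and the function names below are our own assumptions:

```python
import numpy as np

def grad(u):
    """Forward-difference gradient with Neumann boundary (assumed scheme)."""
    gx = np.zeros_like(u); gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy = np.zeros_like(u); gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy

def project_ball(qx, qy, alpha):
    """Pointwise projection onto the set {q : |q(x)| <= alpha}."""
    scale = np.maximum(np.sqrt(qx ** 2 + qy ** 2) / alpha, 1.0)
    return qx / scale, qy / scale

def q_step(qx, qy, residual, c, alpha):
    """One linearized update in the spirit of (45): a gradient step on the
    residual div q + p_{i+1} - p_i - lam/c, then projection onto the ball."""
    gx, gy = grad(residual)
    return project_ball(qx + c * gx, qy + c * gy, alpha)
```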

5 Numerical Experiments

In this work, we focus on applications to image segmentation and stereo reconstruction. Comparisons are made to the discrete approach [6] and the primal-dual algorithm of [9].

Fig. 2.
figure 2

(a) Ground truth, (b) input, (c) rescaled labeling function before thresholding, (d) rescaled labeling function after thresholding each \(\lambda _i\) at \(0.5\).

Fig. 3.
figure 3

(a) Input image damaged by impulse noise; (b) reconstructed labeling function with non-convex data term (47) before thresholding; (c) labeling function after thresholding each \(\lambda _i\) at \(0.5\); (d) reconstructed labeling function with convex data term (46) and \(\beta = 1\).

In the case of image segmentation we assume \(\ell _i = i\), \(i=1,\ldots,n\), where \(n\) is the number of regions, and \(\rho (i,x)\) is the cost of assigning pixel \(x\) to region \(i\). One possibility is

$$\begin{aligned} \rho (i,x) = |I(x) - c_i|^\beta , \;\;\; i=1,...,n \end{aligned}$$
(46)
Fig. 4.
figure 4

(a) Input, (b) labeling function before thresholding, (c) labeling function after thresholding each \(\lambda _i\) at \(0.5\).

where \(I\) is the input image and \(c_i\) is the average intensity value of region \(i\); both are assumed fixed in this work. Such a data term is convex for \(\beta \ge 1\) and non-convex for \(\beta < 1\). Results with \(\beta = 2\) are shown in Figs. 2 and 4. We also demonstrate image segmentation with a non-convex data term in Fig. 3. The ground truth image from Fig. 2(a) has been damaged by impulse noise in Fig. 3(a): \(70\,\%\) of the pixels have been randomly selected and assigned a random value between \(0\) and \(255\) (the maximal gray value). For this type of noise the convex data term does not perform well, as shown in Fig. 3(d), where we have selected (46) with \(\beta = 1\). Instead, the following non-convex data term can be used

$$\begin{aligned} \rho (i,x) \, := \, \left\{ \begin{array}{ll} 0 \, , &{} \text {if} \; i = \text {argmin}_k \, |I(x) - c_k|\\ 1 \, , &{} \text {else} \end{array} \right. \, . \end{aligned}$$
(47)
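Both data terms are straightforward to evaluate on a discretized gray-value image; the following sketch uses our own (hypothetical) function names:

```python
import numpy as np

def data_cost_convex(I, c, beta):
    """rho(i, x) = |I(x) - c_i|^beta, cf. (46); convex in I for beta >= 1.
    I has shape (H, W); returns an array of shape (n, H, W)."""
    return np.abs(I[None, :, :] - np.asarray(c, float)[:, None, None]) ** beta

def data_cost_nonconvex(I, c):
    """rho(i, x) = 0 if i minimizes |I(x) - c_k| and 1 otherwise, cf. (47);
    more robust against impulse noise than the convex cost."""
    dist = np.abs(I[None, :, :] - np.asarray(c, float)[:, None, None])
    best = dist.argmin(axis=0)                  # region with closest mean
    idx = np.arange(len(c))[:, None, None]
    return np.where(best[None, :, :] == idx, 0.0, 1.0)
```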

In the stereo application we are given two color images \(I_L\) and \(I_R\) of a scene taken from slightly different, horizontally displaced viewpoints, and would like to reconstruct the depth map \(u\). The quality of the matching between \(I_L\) and \(I_R\) for a depth value \(u\) is measured by using the following \(\rho \) in the data term of (13):

$$\begin{aligned} \rho (u,x) = \sum _{j=1}^3 |I_L^{j}(x) - I^j_R(x + (u,0)^T)|. \end{aligned}$$
(48)

Here \(I^j(x)\) denotes the \(j\)th component of the color vector \(I(x)\). The data term (48) is obviously highly non-convex. Results on a standard example are shown in Fig. 5, where comparisons are also given with [10] and with graph cut using neighborhood systems of 4 and 8. Graph cut produces a single (non-unique) solution, shown in Fig. 5(f) and (g) for 4 and 8 neighbors, respectively. As we see, such solutions suffer from metrication artifacts because of the discrete grid bias.
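For integer disparities on a discrete image, (48) can be evaluated as below; clamping shifted-out pixels to the border is our own assumed boundary handling:

```python
import numpy as np

def stereo_cost(I_left, I_right, d):
    """rho(d, x) = sum_j |I_L^j(x) - I_R^j(x + (d, 0))|, cf. (48), for an
    integer disparity d; pixels shifted past the image border are clamped
    (an assumed boundary handling). Images have shape (H, W, 3)."""
    H, W, _ = I_left.shape
    cols = np.clip(np.arange(W) + d, 0, W - 1)   # x + (d, 0)^T, clamped
    return np.abs(I_left - I_right[:, cols, :]).sum(axis=2)
```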

Fig. 5.
figure 5

Stereo depth estimation.

Table 1. Iteration counts for each experiment. Number of iterations to reach an energy precision of \(10^{-3}\) and \(10^{-4}\) are shown. PD = Primal-dual. Proposed 1 stands for Algorithm 1 where the subproblem is solved by 5 iterations of Chambolle’s algorithm each outer iteration (indicated by the number in the parenthesis). Proposed 2 stands for Algorithm 1 with the subproblems solved inexactly in one step through the linearization (45).
Table 2. CPU time in seconds for each experiment for reaching an energy precision of \(10^{-3}\) and \(10^{-4}\). PD = Primal-dual. Proposed 1 stands for Algorithm 1 where the subproblem is solved by 5 iterations of Chambolle’s algorithm each outer iteration (indicated by the number in the parenthesis). Proposed 2 stands for Algorithm 1 with the subproblems solved inexactly in one step through the linearization (45).
Table 3. Iteration counts for stereo experiment. Number of iterations to reach an energy precision of \(10^{-4}\), \(10^{-5}\) and \(10^{-6}\) are shown. PD = Primal-dual.

Iteration counts for all experiments are presented in Table 1 and CPU times are shown in Table 2. The two variants of Algorithm 1 are evaluated against the primal-dual method of Pock et al. [10]. The relative energy precision at iteration \(i\) is given by

$$\begin{aligned} \varepsilon = \frac{E^i - E^*}{E^*}, \end{aligned}$$
(49)

where \(E^i\) is the energy at iteration \(i\) and \(E^*\) is the final energy. A good estimate of \(E^*\) is obtained by running a very large number of iterations of each method for each experiment. The tables show how many iterations are required to reach an energy precision of \(10^{-3}\) and \(10^{-4}\). Our algorithms are implemented with a mimetic finite-difference spatial discretization [31, 32]. In order to make the comparison as accurate as possible, the primal-dual algorithm [10] is also implemented with this mimetic finite-difference discretization, although a slightly different forward scheme for the gradient and backward scheme for the divergence was used in [10].

The first variant of Algorithm 1 solves the subproblem (41) iteratively by Chambolle's algorithm [28]. Since the previous solution is available as a good initialization, not many iterations of this algorithm are required. In our experiments, 5 inner iterations were used for each step; increasing the number of inner iterations beyond 5 did not seem to have any impact on the convergence rate.

The primal-dual method of [10] avoids the inner problem but, as we see, requires significantly more iterations to reach the same energy precision. Our algorithm also requires a smaller total number of iterations (inner times outer). The difference becomes progressively clearer with higher energy precision. For the stereo example, which is by far the most difficult computationally, our approach reached an energy precision of \(\epsilon < 10^{-5}\) after \(1310\) iterations, \(\epsilon < 10^{-6}\) after \(1635\) iterations and \(\epsilon < 10^{-7}\) after \(2340\) iterations. The primal-dual algorithm [10] failed to reach an energy precision of \(10^{-5}\) or lower within our predetermined maximum number of iterations (30000). We believe this difference is due to the fact that our approach avoids the iterative projections of the labeling function and hence progresses in the exact steepest descent direction at every iteration.

The second variant of Algorithm 1 instead computes an inexact solution to (41) through the linearization (45) and hence avoids the inner iterations. However, the penalty parameter \(c\) must be set lower to maintain convergence, so more outer iterations are required. Overall it converges a little faster than the first variant and outperforms the primal-dual algorithm [10] in all experiments.

Comparison to discrete graph cut [33] is more complicated. Our algorithms are implemented in Matlab, in contrast to the optimized C++ discrete max-flow implementation of [33]. Our algorithm consists mainly of floating-point matrix and vector arithmetic and is therefore highly suited for massively parallel implementation on GPUs. Traditional max-flow algorithms have a much more serial nature, which makes them more dependent on an efficient serial CPU. In the near future, hardware improvements are also expected to come largely on the parallel side. Hence, we see our work as better suited for the current and future generations of hardware.

6 Conclusions

In this paper we proposed and investigated a novel max-flow formulation of multilabeling problems in the continuous setting. It is a direct mapping of Ishikawa's graph-based configuration to the continuous setting. We proved by variational analysis that the maximization problem is dual to an equivalent min-cut formulation. In addition, we proposed a new and reliable multiplier-based max-flow algorithm whose convergence can be verified by standard optimization theory, and demonstrated that it significantly outperforms earlier approaches. Due to its continuous formulation, the algorithm can easily be sped up by a multigrid or parallel implementation, in contrast to graph-based methods. Its memory requirement is also much less demanding.