1 Introduction

Geometric quantities are particularly useful as regularizers in low-level image analysis, especially when little prior information is known about the shape of interest. Length penalization is a well-behaved, general-purpose regularizer, and many models in the literature make use of it, from active contours [8] to level-set formulations [26, 27]. Discrete graph-based variational models have been particularly successful at incorporating length penalization while keeping the ability to extract a global optimum [2, 5].

However, length regularization shows limitations when segmenting small or thin and elongated objects, as it tends to shrink solutions or yield disconnected ones. Such drawbacks can be problematic in image segmentation, restoration, or inpainting. It is thus rather natural to consider curvature (especially squared curvature) as a potential regularizer. Energies involving both length and squared curvature are often called the Elastica model, which dates back to Euler; the Willmore energy is its n-dimensional extension. These were brought to the attention of the computer vision community by Mumford [31]. A similar concept is also present in the second-order regularizer of the original snake model [20]. Indeed, an explicit curvature term was often employed in early level-set methods, such as [3, 26, 27], for segmentation or inpainting. However, Caselles et al. [8] observed that these terms, as employed, were not geometric, in the sense that they depend on the discretization parameters.

1.1 Existing works

The use of curvature for surface regularization in the case of the Willmore energy was studied in [4]. Although the results look promising, the authors used an inexact linearized version of the squared curvature at every step.

One of the first successful uses of curvature in image processing is the inpainting algorithm described in [30], where the authors evaluate the absolute curvature along the level lines of a simply connected image to reconstruct its occluded parts. The non-intersection property of level lines enables the construction of an efficient dynamic programming algorithm. The curvature energy is nevertheless only coarsely approximated. We may quote [9] as another geometric inpainting method involving Elastica, which is optimized with level sets.

In image segmentation, El-Zehiry and Grady have shown that injecting curvature as a regularizer can help recover thin and elongated objects [15]. Similarly, Schoenemann et al. [38] have considered the Elastica energy in their ratio-based image segmentation approach. In [39], Schoenemann et al. extend ratio-cut approaches to segmentation with a curvature term. Their global optimization framework shows the considerable advantages of such a regularizer in common binary image segmentation tasks. However, the time complexity of their algorithm is prohibitive, even though curvature is again coarsely approximated. In a more recent work, Lim et al. propose an edge-weighted Willmore-like flow to segment vertebrae [25] using a level-set formulation. They observe improved results compared to [8] and other perimeter-based priors, but note that a shape prior and a close-enough initialization must also be provided to achieve good results.

In fact, efficiently handling curvature in the context of image segmentation remains a very challenging task. State-of-the-art methods are in practice difficult to optimize and do not scale easily [15, 32, 38, 40]. In order to achieve reasonable running times, such approaches make use of coarse curvature estimations for which the approximation error is often unknown. In the segmentation method of [38], for example, the curvature is estimated by a simple formula involving angles and edge lengths at polygon vertices. The quality of the estimation is highly influenced by the number of vertices and their location along the curve. More importantly, in this estimator, the curve representation lies in the continuous domain, and the properties of the estimator are not immediately valid when the curve is translated to the digital domain. In particular, it is not straightforward to include an image data term in this model without adding extra assumptions about the curve representation. One could instead use a spline approximation with exact curvature information, but this kind of approach would have the same pros and cons. Improving the quality of the curvature estimator has an important impact on the accuracy of the results, but we should use the appropriate tools to evaluate its quality. In the context of measurements on digital data, this tool is multigrid convergence.

Segmentation methods using Elastica or Willmore energies can be formulated via the computation of the corresponding flow, i.e., by following its gradient. Well-founded level-set methods involving the Willmore energy [14] have been used in medical image segmentation [25]. Another level-set formulation similar to the Chan–Vese model was proposed in [42]. Willmore flow can also be carried out by phase-field models (see [7]), but they are less suited to image segmentation since interfaces are blurred in such models. Threshold dynamics can also be considered for such energies [18] and has been proposed for image disocclusion [17]. However, flow-based methods always deliver a local minimum, which depends heavily on initialization. A similar approach is to formulate the Elastica as a regularization term in a variational framework and use operator splitting to optimize it. Tai et al. [41] and, more recently, Deng et al. [13] have proposed this approach, which is well suited to inverse problem solving. The problem remains non-convex, however, and the nature of the converged iterates, i.e., how close they are to a global optimum, is not studied.

In contrast, other researchers have sought convex relaxations of curvature-related formulations. In [19], Goldluecke and Cremers introduce the Total Curvature, based on the Menger–Melnikov curvature of a measure. The formulation is non-convex but can be relaxed into a well-correlated, though non-tight, convex one. The relaxed formulation can be used to solve inverse problems, including segmentation. However, computing a solution is still expensive, requiring significant parallelization efforts on a GPU to achieve acceptable computing times. In [6], Bredies et al. study a convex approximation of the Elastica energy using a lifting scheme. Thanks to the lifting, it is not restricted to binary sets. It shows promising results and can be used for image restoration in addition to inpainting and segmentation. The scheme is, however, neither tight nor exact, and requires four dimensions for a 2D approximation. In addition, it does not carry over to nD, and it remains computationally expensive.

Discrete approaches have also been used to tackle the difficulties of Elastica. As noted before, [38] linearized a mesh-based approach, which was expanded upon in [40], while [15] used a binary solver for a non-submodular formulation on a regular square grid of pixels, which was extended in [16]. In [32], a specific solver is developed to yield a better solution to an extension of the formulation in [15].

1.2 Motivation

In spite of the multiplicity of existing approaches, none seems currently completely satisfactory. While the sought solution is expected to be very smooth based on geometric expectations, the mathematical properties of the related operators make this smoothness elusive. Non-convexity and the presence of high-order terms make Elastica difficult to optimize and to approximate, let alone compute efficiently. The computed solutions in the literature are rarely evaluated, and the complexity of the proposed solutions is often high.

Consequently, we are interested in studying purely discrete variational segmentation models involving an Elastica energy. Recently, new estimators of curvature along digital contours have been proposed [11, 36, 37] with the desirable multigrid convergence property. This motivates us to propose models in which they can be used successfully.

In this paper, we propose to investigate the use of the digital integral invariant curvature estimator [11] in a discrete variational segmentation framework. More precisely, we show how to incorporate it into a digital flow minimizing the squared curvature. This model is expressed as a discrete combinatorial optimization problem, which we solve using a classical variant of the QPBO algorithm [35]. Our method does not compute a global optimum over the whole domain, but rather a global optimum within a band surrounding the current shape. Our approach thus leads to a digital flow similar to the continuous Elastica flow, as shown by our experiments. We present an application of this model as a post-processing step in a segmentation framework and demonstrate how it can improve standard results. This paper is an extended version of a previous work that appeared in [1]. We have included a deeper discussion of our optimization method, and we present a more thorough comparison of our segmentation method with two other related works. Moreover, we have included a new local combinatorial optimization approach for Elastica minimization, which shows that our discrete model is well founded. Finally, the code related to this paper is freely available on GitHub.

1.3 Outline

Section 2 reviews the concept of multigrid convergence and highlights its importance for the definition of digital estimators. We describe the two convergent estimators used in this paper: one approaches the length of elementary discrete contour elements, and the other approaches the curvature of a discrete contour. They are used in the optimization model and in the definition of the digital Elastica. Section 3 describes a local combinatorial optimization model suitable for several curvature estimators, which presents interesting results but is too costly in practice. Section 4 describes the proposed Elastica-driven curvature evolution model along with several illustrations of digital flows. Section 5 explains how to use this evolution model as a post-processing step in an image segmentation framework and compares it to two related approaches. We conclude and draw some perspectives for this work in Sect. 6.

2 Multigrid Convergent Estimators

Our objective is to delineate shapes within digital images with some priors related to continuous geometry. The goal of this section is to introduce the concept of multigrid convergence and its potential when analyzing digital images with variational models involving geometric quantities, with the constraint that only digitized shapes are observed.

A digital shape is the result of some quantization process applied to an object X lying in a continuous space (here, of dimension 2). For example, the Gauss digitization of a continuous subset X of the Euclidean plane \(\mathbb {R}^2\) with grid step \(h>0\) is defined as

$$\begin{aligned} D_h(X) = X \cap (h\mathbb {Z})^2. \end{aligned}$$
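To make the definition concrete, here is a minimal Python sketch (illustrative code, not part of the paper's implementation; the function name is ours) that Gauss-digitizes a disk and shows the rescaled pixel count approaching the area as the grid is refined:

```python
import math

def gauss_digitize_disk(R, h):
    # Gauss digitization of the disk of radius R centered at the origin:
    # keep the grid points (i*h, j*h) of (h*Z)^2 that belong to the disk.
    n = int(R / h) + 1
    return {(i, j) for i in range(-n, n + 1) for j in range(-n, n + 1)
            if (i * h) ** 2 + (j * h) ** 2 <= R ** 2}

# Refining the grid step improves the (rescaled) pixel-count area estimate.
for h in (0.5, 0.1, 0.02):
    D = gauss_digitize_disk(1.0, h)
    print(h, len(D) * h * h)   # tends to pi as h -> 0
```

This anticipates the discussion below: counting pixels, rescaled by \(h^2\), is a convergent estimator for area.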

Given a shape X and its digitization \(D_h(X)\), a digital estimator \(\hat{u}\) for some geometric quantity u is intended to compute u(X) by using only the digitization. This problem is not well-posed, as the same digital object could be the digitization of infinitely many objects very different from X. Therefore, a characterization of what constitutes a good estimator is necessary.

Let u be some geometric quantity of X (e.g., tangent, curvature). We wish to devise a digital estimator \(\hat{u}\) for u. It is reasonable to state that \(\hat{u}\) is a good estimator if \(\hat{u}(D_h(X))\) converges to u(X) as we refine the grid. For example, counting pixels (rescaled by \(h^2\)) is a convergent estimator for area, but counting boundary pixels (rescaled by h) is not a convergent estimator for perimeter. Multigrid convergence is the mathematical tool that makes this notion precise. Given any subset Z of \((h\mathbb {Z})^2\), we can represent it as a union of axis-aligned squares with edge length h centered on the points of Z. The topological boundary of this union of squares is called the h-frontier of Z. When \(Z=D_h(X)\), we call it the h-boundary of X and denote it by \(\partial _h X\).

Definition 1

(Multigrid convergence for local geometric quantities) A local discrete geometric estimator \(\hat{u}\) of some geometric quantity u is (uniformly) multigrid convergent for the family \(\mathbb {X}\) if and only if, for any \(X \in \mathbb {X}\), there exists a grid step \(h_X>0\) such that the estimate \(\hat{u}(D_h(X), {p},h)\) is defined for all \({p} \in \partial _hX\) with \( 0< h < h_X\), and for any \(x \in \partial X\),

$$\begin{aligned} \forall {p} \in \partial _hX \text { with } {\Vert {p} - x \Vert }_{\infty } \le h, \quad {\Vert \hat{u}(D_h(X),{p},h) - u(X,x)\Vert } \le \tau _{X}(h), \end{aligned}$$

where \(\tau _{X}:\mathbb {R}^{+}\setminus \{0\} \rightarrow \mathbb {R}^{+}\) has null limit at 0. This function defines the speed of convergence of \(\hat{u}\) toward u for X.

For a global geometric quantity (e.g., perimeter, area, volume), the definition remains the same, except that the mapping between \(\partial X\) and \(\partial _h X\) is no longer necessary.

Multigrid convergent estimators provide a quality guarantee and should be preferred over non-multigrid convergent ones. In the next section, we describe two estimators that are important for our purpose.

2.1 Tangent and Perimeter Estimators

The literature presents several perimeter estimators that are multigrid convergent (see [10, 12] for a review). However, in order to define the digital Elastica, we need a local estimation of length such that integration over these local length elements gives a multigrid convergent estimator for the perimeter.

Definition 2

(Elementary Length) Let a digital curve C be represented as a sequence of oriented grid edges in a grid cell representation of digital objects (in a grid with step h). Further, let \(\hat{\mathbf {v}}\) be a multigrid convergent estimator for the unit tangent vector. The elementary length \(\hat{s}(\mathbf {e})\) at some oriented grid edge \( \mathbf {e} \in C\) is defined as

$$\begin{aligned} \hat{s}(\mathbf {e}) = h ~{\hat{\mathbf {v}}}(\mathbf {e}) \cdot \mathbf {e}. \end{aligned}$$

The integration of the elementary length along the digital curve is a multigrid convergent estimator for perimeter if one uses the \(\lambda \)-MST [24] tangent estimator (see [23]).

2.2 Integral Invariant Curvature Estimator

Generally speaking, an invariant is a function whose value is unaffected by the action of some group on the elements of the domain.

Perimeter and curvature are examples of invariants for shapes in \(\mathbb {R}^2\) with respect to the Euclidean group of rigid transformations. The definition of the integral area invariant and its link with curvature are established in [28].

Definition 3

(Integral area invariant) Let \(X \subset \mathbb {R}^2\) and \(B_r(p)\) the ball of radius r centered at point p. Further, let \(\mathbb {1}_X(\cdot )\) be the characteristic function of X. The integral area invariant \(\sigma _{X,r}(\cdot )\) is defined as

$$\begin{aligned} \forall p \in \partial X, \quad \sigma _{X,r}(p) = \int _{B_r(p)}{ \mathbb {1}_X(x) \mathrm{d}x}. \end{aligned}$$

The value \(\sigma _{X,r}(p)\) is the area of the intersection of the ball \(B_r(p)\) with shape X. By a Taylor expansion of the shape boundary at point \(p \in \partial X\), one can rewrite the intersection area \(\sigma _{X,r}(p)\) as [33]:

$$\begin{aligned} \sigma _{X,r}(p) = \frac{\pi }{2}r^2 - \frac{\kappa (X,p)}{3}r^3 + O(r^4), \end{aligned}$$

where \(\kappa (X,p)\) is the curvature of X at point p. By isolating \(\kappa \), we can define a curvature estimator

$$\begin{aligned} \tilde{\kappa }(p) :=\frac{3}{r^3}\left( \frac{\pi r^2}{2} - \sigma _{X,r}(p) \right) , \end{aligned}$$
(1)
Fig. 1

Flower shape in (a) and the cellular-grid model representation in (b) of the rectangle-bounded region. In (b), pixels are colored in gray, linels in green and pointels in blue

Fig. 2

Blue pixels in (a) illustrate a 3-ring set. In (b), we highlight the contour of an element of the set \(\mathcal {N}_{11}\)

Such an approximation is convenient, as one can easily devise a multigrid convergent estimator for the area.

Definition 4

Given a digital shape \(D \subset (h \mathbb {Z})^2\), the area estimator \(\widehat{\text {Area}}(D,h)\) is defined as

$$\begin{aligned} \widehat{\text {Area}}(D,h) :=h^2\text {Card}\left( D \right) . \end{aligned}$$
(2)

It is well known that this area estimator is multigrid convergent (e.g., see [21]). In [11], the authors combine the approximation (1) and the digital estimator (2) to define a multigrid convergent estimator for the curvature.

Definition 5

(Integral Invariant Curvature Estimator) Let \(D \subset (h \mathbb {Z})^2\) be a digital shape. The integral invariant curvature estimator is defined as

$$\begin{aligned} \hat{\kappa }_{r}(D,{p},h) :=\frac{3}{r^3} \left( \frac{\pi r^2}{2} - \widehat{\text {Area}} \left( B_{r} ( {p} ) \cap D, h \right) \right) . \end{aligned}$$

This estimator is multigrid convergent with speed \(O(h^\frac{1}{3})\) for radii chosen as \(r=\varTheta (h^\frac{1}{3})\). It is also robust to noise and can be extended to estimate the mean curvature of three-dimensional shapes.
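As a sanity check, Definition 5 can be implemented directly by pixel counting (Definition 4). The sketch below (illustrative code, not the paper's implementation) evaluates the estimator on a digitized unit disk, whose boundary has curvature 1/R = 1 everywhere:

```python
import math

def ii_curvature(D, p, r, h):
    # Integral invariant curvature estimator (Definition 5): compare half the
    # ball area with the pixel-counting area estimate (Definition 4) of the
    # intersection of the ball B_r(p) with the digital shape D.
    cx, cy = p
    count = sum(1 for (i, j) in D
                if (i * h - cx) ** 2 + (j * h - cy) ** 2 <= r * r)
    area_hat = h * h * count
    return (3.0 / r ** 3) * (math.pi * r * r / 2.0 - area_hat)

# Digitized disk of radius 1 (true curvature is 1 at every boundary point).
h = 0.01
n = int(1.0 / h) + 1
D = {(i, j) for i in range(-n, n + 1) for j in range(-n, n + 1)
     if (i * h) ** 2 + (j * h) ** 2 <= 1.0}
k = ii_curvature(D, (1.0, 0.0), r=0.5, h=h)   # estimate at boundary point (1, 0)
```

With these (arbitrary) parameters the estimate is close to 1; the Taylor remainder \(O(r^4)\) and the pixel-counting error both vanish in the multigrid regime \(r=\varTheta (h^{1/3})\).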

2.3 Elastica Energy Estimator

In the remainder of this article, we consider the minimization of the Elastica energy. Given a Euclidean shape X with sufficiently smooth boundary, the Elastica is defined as

$$\begin{aligned} E(X) = \int _{\partial X}{(\alpha + \beta \kappa ^2) \mathrm{d}s}, \quad \text {for } \alpha \ge 0, \beta \ge 0. \end{aligned}$$
(3)

The digital Elastica energy, which approaches the continuous Elastica energy through multigrid convergent estimators, is defined as follows:

$$\begin{aligned} \hat{E}( D_h(X) ) = \sum _{\mathbf {e} \in \partial D_h(X)}{ \hat{s}(\mathbf {e})\left( \; \alpha + \beta \hat{\kappa }_{r}^2(D_h(X),\dot{\mathbf {e}},h) \; \right) }, \end{aligned}$$
(4)

where \(\dot{\mathbf {e}}\) denotes the center of the edge \(\mathbf {e}\). In the expression above, we will substitute an arbitrary subset S of \(\mathbb {Z}^2\) for \(D_h(X)\), since the continuous shape X is unknown. In the following, we omit the grid step h to simplify expressions (or, equivalently, we assume that the shape of interest is rescaled by 1/h and we set \(h=1\)).
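Given per-element length and curvature estimates, the digital Elastica (4) reduces to a weighted sum. A minimal sketch (illustrative names; in practice the estimates would come from the \(\lambda \)-MST and integral invariant estimators):

```python
import math

def digital_elastica(lengths, curvatures, alpha, beta):
    # Digital Elastica (4): sum over boundary elements e of
    # s_hat(e) * (alpha + beta * k_hat(e)^2).
    return sum(s * (alpha + beta * k * k) for s, k in zip(lengths, curvatures))

# Idealized check on a circle of radius 5 with exact estimates:
# E = 2*pi*R * (alpha + beta / R^2); for alpha = 0, beta = 1 this is 2*pi/5,
# the optimal value referred to in Fig. 4.
n, R = 400, 5.0
E = digital_elastica([2 * math.pi * R / n] * n, [1.0 / R] * n, alpha=0.0, beta=1.0)
```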

3 Local Combinatorial Optimization

The objectives of this section are to show the potential of minimizing a purely digital Elastica energy, but also to underline the difficulties of using it in a global combinatorial optimization framework.

Given a digital shape \(S^{(0)}\), we describe a process that generates a sequence \(S^{(i)}\) of shapes with non-increasing Elastica energy. The idea is to define a neighborhood of shapes \(\mathcal {N}^{(i)}\) around the shape \(S^{(i)}\) and choose the element of \(\mathcal {N}^{(i)}\) with the lowest energy. The process is suited to the integral invariant estimator, but also to other curvature estimators, for example MDCA [36]. As a matter of fact, our experiments have shown that both estimators induce similar results.

Let S be a two-dimensional digital shape embedded in a domain \(\varOmega \subset \mathbb {Z}^2\). We adopt the cellular-grid model to represent S, i.e., pixels and their lower-dimensional counterparts, linels and pointels, are part of S (see Fig. 1). In particular, we denote by \(\partial S\) the topological boundary of S, i.e., the connected sequence of linels such that each linel has one of its incident pixels in S and the other not in S.

Let \(d_{S}:\varOmega \rightarrow \mathbb {R}\) be the signed Euclidean distance transformation with respect to shape S. The value \(d_S(p)\) gives the Euclidean distance between \(p \notin S\) and the closest pixel in S. For points \(p \in S\), \(d_S(p)\) gives the negative distance between p and the closest pixel not in S.

Definition 6

(m-Ring Set) Given a digital shape \(S \subset \varOmega \), its signed distance transformation \(d_S\) and a natural number \(m \ne 0\), the m-ring set of S is defined as

$$\begin{aligned} R_m(S)&:= L_m(S) \cup L_{-m}(S), \end{aligned}$$

where

$$\begin{aligned} L_m(S)&:= \left\{ \quad \begin{array}{cc} \left\{ p \in \varOmega \; | \; m-1< d_S(p) \le m \right\} &{} , \quad m>0\\ \left\{ p \in \varOmega \; | \; m+1 > d_S(p) \ge m \right\} &{} , \quad m<0 \end{array} \right. \end{aligned}$$
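Definition 6 can be realized with a brute-force signed Euclidean distance (a quadratic-time illustration with names of our choosing; a real implementation would use a fast distance transform):

```python
import math

def signed_distance(p, S, domain):
    # d_S(p): for p in S, minus the distance to the closest pixel not in S;
    # otherwise, the distance to the closest pixel of S (brute force).
    if p in S:
        return -min(math.dist(p, q) for q in domain if q not in S)
    return min(math.dist(p, q) for q in S)

def ring(S, domain, m):
    # m-ring R_m(S) = L_m(S) U L_{-m}(S): pixels whose signed distance falls
    # in (m-1, m] (outside the shape) or [-m, -m+1) (inside the shape).
    return {p for p in domain
            if m - 1 < signed_distance(p, S, domain) <= m
            or -m <= signed_distance(p, S, domain) < -m + 1}

square = {(i, j) for i in range(5) for j in range(5)}
dom = {(i, j) for i in range(-3, 8) for j in range(-3, 8)}
r1 = ring(square, dom, 1)   # inner border (16 px) plus outer 4-neighbors (20 px)
```

For the 5x5 square, the 1-ring consists of its 16 border pixels (signed distance -1) and the 20 outside pixels at Euclidean distance 1; diagonal outside corners (distance \(\sqrt{2}\)) are excluded.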
Fig. 3
figure 3

Local combinatorial optimization process results using the II estimator with radius 5 for the square and flower shapes. Shapes are displayed every 5 iterations

Consider the following set of neighbor candidates to S:

$$\begin{aligned} \mathcal {U}(S) = \{ D \; | D \subset R_1(S) \cup S \; \text {and} \; D\text { is connected} \}. \end{aligned}$$

Such a set can be extremely large, and its complete exploration is prohibitively expensive. Instead, we are going to use a subset of it (Fig. 2).

Definition 7

(n-neighborhood) Given a digital shape \(S \subset \varOmega \), its n-neighborhood \(\mathcal {N}_n(S)\) is defined as the set of digital shapes that can be built from S by adding or removing a sequence of \(k \in [0,n]\) connected pixels in \(R_1(S)\).

Algorithm 1 describes the local combinatorial process, and Fig. 3 presents the digital curve evolution when executing this algorithm for two different shapes with \(n=50\).


The running time of Algorithm 1 is summarized in Table 1. All the experiments in this paper were executed on a 32-core 2.4 GHz CPU; the square and flower shapes contain 841 and 1641 pixels, respectively, for grid step \(h=1.0\). Although its use in practical applications is limited, this experiment demonstrates that the digital estimators are effective in their measurements and that the flows evolve as expected. In particular, the digital Elastica energy for the experiments in Fig. 3 approaches the global optimum (Fig. 4). We observe that this is a completely digital approach, and we do not suffer from the discretization and rounding problems that are a common issue in continuous models. Furthermore, we have checked that this approach works indifferently with the integral invariant curvature estimator and the maximal digital circular arc (MDCA) curvature estimator. The convergence of the digital curvature estimator thus seems to be the cornerstone for obtaining a digital curve that behaves like a continuous Elastica. In the next section, we explore a more efficient approach.
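Since Algorithm 1 is only outlined in the text, the greedy scheme can be sketched as follows (hedged reconstruction; the actual algorithm enumerates the n-neighborhood of Definition 7 and evaluates the digital Elastica (4), whereas the toy energy and neighborhood below are ours):

```python
def local_search(S0, energy, neighborhood):
    # Move to the lowest-energy neighbor shape until the energy stops decreasing.
    S, e = S0, energy(S0)
    while True:
        best_S, best_e = S, e
        for T in neighborhood(S):
            et = energy(T)
            if et < best_e:
                best_S, best_e = T, et
        if best_S is S:          # no neighbor improves: local minimum reached
            return S
        S, e = best_S, best_e

# Toy usage: shapes are subsets of {0..4}, energy favors shapes of size 3.
energy = lambda S: abs(len(S) - 3)
def neighborhood(S):
    return [S | {k} for k in range(5)] + [S - {k} for k in S]

result = local_search(frozenset(), energy, neighborhood)
```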

Table 1 Running times for the local combinatorial optimization algorithm with \(n=50\)
Fig. 4
figure 4

Digital Elastica computation along iterations of algorithm 1. The final shape has energy value close to the optimal value \(2\pi /5\) (thin horizontal dashed line)

4 Digital Curvature Flow

Given a digital shape \(S \subset \varOmega \subset \mathbb {Z}^2\), the goal is to compute a flow \(S^{(i)}\) where \(S^{(i+1)}\) has lower digital Elastica energy than \(S^{(i)}\).

We observe that the integral invariant estimator (1) is originally designed to be evaluated on the digital boundary of the shape, which is the unknown of our problem. Injecting contour information would lead to a third-order model, which we prefer to avoid. Instead, we exploit the II estimator definition to compute a flow that evolves proportionally to the squared curvature of the current shape.

4.1 Discrete Variational Model for Discrete Curve Evolution

We assume an ordering in \(\varOmega \), i.e., there exists a bijective function \(\omega : \varOmega \rightarrow \{1 \cdots |\varOmega | \}\). Moreover, we associate with any subset \(P \subset \varOmega \) the set of binary variables X(P) defined as

$$\begin{aligned} {X(P) := \left\{ x_{\omega (p)} \in \{0,1\} \; | \; p \in P \right\} .} \end{aligned}$$

In order to guarantee connectivity, we limit the optimization region to a subset of \(\varOmega \), namely the inner pixel boundary of \(S^{(i)}\). We also use the notation |S| to denote the cardinality of a digital set S.

Definition 8

(Inner pixel boundary) Given a digital shape S embedded in a domain \(\varOmega \), we define its inner pixel boundary set I(S) as

$$\begin{aligned} {I(S)} := \left\{ \, p \; | \; p \in S, |\mathcal {N}_4(p) \cap S|<4 \, \right\} , \end{aligned}$$

where \(\mathcal {N}_4(p)\) denotes the 4-adjacent neighbor set of p (without p).

To simplify notation, the inner pixel boundary of \(S^{(i)}\) is simply denoted \(I^{(i)}\). At each iteration, the set \(X^{(i)}\) of optimization variables is defined as

$$\begin{aligned} { X^{(i)} := X(I^{(i)}). } \end{aligned}$$

In other words, each flow iteration decides which pixels in the inner boundary are to be removed from \(S^{(i)}\) and which are to be kept in \(S^{(i)}\). We recall the definition of the II digital curvature estimator:

$$\begin{aligned} \hat{\kappa }^2(p)&= c_1\Big ( c_2 - | B_r(p) \cap S^{{(i)}} | \Big )^2, \end{aligned}$$
(5)

where \(c_1=9/r^6\) and \(c_2=\pi r^2/2\).

The following sets are important in the expansion of (5).

$$\begin{aligned} F^{(i)}&:= S^{(i)} \setminus I^{(i)} \\ F_r^{(i)}(p)&:= F^{(i)} \cap B_r(p)\\ I_r^{(i)}(p)&:= I^{(i)} \cap B_r(p) \\ X_r^{(i)}(p)&:= X\big ( I_r^{(i)}(p) \big ). \end{aligned}$$

Expanding (5), we get

$$\begin{aligned} \hat{\kappa }^2(p)&= c_1\Big ( c_2 - |F_{r}^{(i)}(p)| - \sum _{x_j \in X_r^{(i)}(p)} {x_j} \Big )^2 \\&= c_1 \Big ( C + 2\left( |F_{r}^{(i)}(p)| - c_2 \right) \sum _{x_j \in X_{r}^{(i)}(p)}{x_j} \\&\quad + \sum _{x_j \in X_{r}^{(i)}(p)}{x_j^2} + \sum _{ \begin{array}{c} x_j,x_k \in X_{r}^{(i)}(p) \\ j<k \end{array} }{2x_jx_k} \Big ), \end{aligned}$$

where \(C=c_2^2 - 2c_2 \cdot |F_{r}^{(i)}(p)| + |F_{r}^{(i)}(p)|^2\) is a constant. By ignoring constants and multiplicative factors, and using the binary character of the variables (\(x_j^2 = x_j\)), we define the following family of energies for given parameters \(\alpha ,\beta , \gamma \ge 0\):

$$\begin{aligned} E_m(X^{(i)},S^{(i)})&= \sum _{x \in X^{(i)}}{\alpha s(x)} + \sum _{p \in R_m(S^{(i)})} 2 c_1 \beta \Big ( \big ( 1/2 + |F_{r}^{(i)}(p)| - c_2 \big ) \sum _{x_j \in X_{r}^{(i)}(p)}{x_j} \nonumber \\&\quad + \sum _{ \begin{array}{c} x_j,x_k \in X_{r}^{(i)}(p) \\ j<k \end{array} }{x_j x_k} \Big ) + \sum _{x \in X^{(i)}}{\gamma \, g(S^{(i)},x)}, \end{aligned}$$
(6)

where g(S, x) denotes a data term (e.g., distance to the initial shape S or fidelity to data) and s(x) denotes a length penalization term. Each choice of m generates a different flow, as described in general form in the digital curve evolution (DCE) Algorithm 2.
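The pseudo-Boolean expansion of the squared curvature used above can be sanity-checked by brute force over all binary assignments (illustrative script, with arbitrary values for \(|F_{r}^{(i)}(p)|\) and \(c_2\)):

```python
from itertools import product

def expanded(xs, F, c2):
    # Expansion of (c2 - |F| - sum x_j)^2 without the c1 factor:
    # C + 2(|F| - c2) sum x_j + sum x_j^2 + 2 sum_{j<k} x_j x_k.
    C = c2 ** 2 - 2 * c2 * F + F ** 2
    s = sum(xs)
    cross = sum(2 * xs[j] * xs[k]
                for j in range(len(xs)) for k in range(j + 1, len(xs)))
    return C + 2 * (F - c2) * s + sum(x * x for x in xs) + cross

F, c2 = 7, 4.5   # stand-ins for |F_r(p)| and pi*r^2/2
ok = all(abs((c2 - F - sum(xs)) ** 2 - expanded(list(xs), F, c2)) < 1e-9
         for xs in product((0, 1), repeat=5))
```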

Fig. 5

Directly using the optimization result of (6) does not decrease squared curvature because contour information is not present in the energy


We emphasize that the minimization of (6) for \(S^{(i)}\) is not sufficient to derive the next shape \(S^{(i+1)}\), because contour information is not included in the model (see Fig. 5). Recall that the integral invariant estimator approaches curvature by computing the difference between half of the area of a chosen ball and the area of the intersection of this ball with the shape. In particular, regions of positive curvature have fewer pixels in their intersection set than in its complement with respect to the estimation ball. This implies that variables in such regions are labeled 1, as the imbalance grows otherwise. We attenuate curvature if we shift the center of the estimation ball toward the interior of the shape, which means removing the 1-labeled pixels. That is why we take the complement of the optimization solution.

The same reasoning applies to non-convex parts. Indeed, concave regions are convex in the shape complement. In the expansion mode, we apply the same reasoning to the image complement, and by doing so we are able to handle concavities. It is called expansion mode because the optimization region, in this case, is the outer pixel boundary of the original shape. Table 2 sums up these arguments.

Length and data terms should be properly defined in order to comply with the complement step of the DCE algorithm. The length penalization is defined as

$$\begin{aligned} s(x_{\omega (p)})&=\sum _{q \in \mathcal {N}_4(p)}{ t(q) }, \quad \text {where } t(q) \nonumber \\&= \left\{ \begin{array}{ll} (x_{\omega (p)}-x_{\omega (q)})^2, &{} \text {if } q \in I^{(i)}\\ (x_{\omega (p)}-0)^2, &{} \text {if } q \in F^{(i)}\\ (x_{\omega (p)}-1)^2, &{} \text {otherwise. } \end{array}\right. \end{aligned}$$
(7)

We do not use data terms in this section; thus, we postpone the definition of such terms until later in this article, where we describe how DCE can be used in an image segmentation framework.

Table 2 Since the curvature is negated when reversing the curve (i.e., \(\bar{\kappa }=-\kappa \)), this process can only shrink convex parts in shrink mode and expand concave parts in expansion mode

In Fig. 6, we show some results of the DCE algorithm with \(m=1\). We observe a global evolution toward rounder shapes, but with several artifacts. We reduce the effects of a jagged boundary by setting \(\alpha > 0\). Nonetheless, a larger radius of the estimation ball creates unstable shapes. In fact, the estimator is very sensitive in regions of low squared curvature, and it is precisely in those regions that spurious pixels are created.

Fig. 6

The algorithm is very sensitive to small variations in the estimator, which are particularly important in regions of low squared curvature. Artifacts are somewhat reduced with a length penalization but increase if we use a larger ball radius

Fig. 7

By positioning the estimation ball on farther rings, we reduce artifact creation

4.2 A More Stable Model

In the previous section, we noticed that the algorithm produces shapes with many artifacts due to the small uncertainties of the estimator along regions of low squared curvature. We argue that, by evaluating the estimation ball along outer ring sets, we avoid those sensitive areas and focus the optimization process only on regions with the highest squared curvature values.

In our experiments, the best results are obtained by executing the DCE algorithm with m equal to r, where r is the estimation ball radius (see Fig. 7). We observe that the digital Elastica may increase after some iterations if the chosen radius is too large, as in the case of the triangle in Fig. 8, where the flow converges to a single point. We conjecture that an appropriate value for the radius is given by the reach of the shape. The produced flow has no difficulty in handling changes in topology, and it presents different speeds in regions with low and high curvature values, as illustrated in Fig. 9.

4.3 Optimization method

Let f be a function of n binary variables with unary and pairwise terms, i.e.,

$$\begin{aligned} f(y_1,\cdots , y_n) = \sum _{j}{f_j(y_j)} + \sum _{j < k}{f_{j,k}(y_j,y_k)}. \end{aligned}$$

The function f is submodular if and only if the following inequality holds for each pairwise term \(f_{j,k}\) [22]:

$$\begin{aligned} \quad f_{j,k}(0,0) + f_{j,k}(1,1) \le f_{j,k}(0,1) + f_{j,k}(1,0). \end{aligned}$$
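For instance, the quadratic cross terms \(x_j x_k\) in (6) appear with a positive coefficient, and a positive \(x_j x_k\) term violates this inequality; the check is immediate (illustrative snippet):

```python
def is_submodular(pairwise):
    # Submodularity condition for a single pairwise term of two binary variables:
    # f(0,0) + f(1,1) <= f(0,1) + f(1,0).
    return pairwise(0, 0) + pairwise(1, 1) <= pairwise(0, 1) + pairwise(1, 0)

length_term = lambda a, b: (a - b) ** 2   # Potts-like length term: submodular
cross_term = lambda a, b: a * b           # x_j * x_k with positive weight: not
```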

The energy \(E_m\) is non-submodular, and optimizing it is a difficult problem, which constrains us to use heuristics and approximation algorithms. The QPBO method [35] transforms the original problem into a max-flow/min-cut formulation and yields a full optimal labeling for submodular energies. For non-submodular energies, the method is guaranteed to return a partial labeling with the property that the set of labeled variables is part of an optimal solution. This property is called partial optimality.

In practice, QPBO can leave many pixels unlabeled. Two extensions of QPBO alleviate this limitation: QPBOI (improve) and QPBOP (probe). The first is an approximation method that is guaranteed not to increase the energy, but it loses the property of partial optimality. The second is an exact method that is reported to label more variables than QPBO.

The percentage of pixels left unlabeled by QPBOP for \(E_1\) is quite high, but it decreases to zero as we set m equal to r. Therefore, we are more confident in the solution for values of m close to r. However, the way this percentage varies with m differs from shape to shape, as illustrated in Fig. 11. We also noticed that, for \(m=r\), all the pixels were labeled, which may indicate that \(E_r\) is an easy instance of the general non-submodular energy \(E_m\), but this remains to be proved. The number of pairwise terms in \(E_r\) is roughly half of that in \(E_1\) (see Fig. 10).

Fig. 8
figure 8

The choice of radius impacts the flow. The flow ceases to evolve for all shapes when \(r=3\) (a). In (b), for \(r=5\), the triangle evolves to a single point, while the other shapes stop in an intermediate configuration, as in (a). In (c), we observe that, for a given choice of radius, the digital Elastica may increase after a certain number of iterations

Fig. 9
figure 9

High curvature regions evolve faster than low curvature ones (a). The flow can handle topological changes (b)

We have used QPBOI to solve \(E_m\). Naturally, in the case where all pixels are labeled by QPBOP, QPBOI returns the same labeling as QPBOP.
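For tiny instances, an energy of the unary-plus-pairwise form above can be minimized exactly by enumerating all \(2^n\) labelings, which gives a ground-truth reference when experimenting with heuristics such as QPBOI on toy problems. A minimal Python sketch, with made-up unary and pairwise terms:

```python
from itertools import product

# Brute-force minimizer for a small binary energy of the form
#   f(y) = sum_j f_j(y_j) + sum_{j<k} f_{j,k}(y_j, y_k).
# Only feasible for tiny n (2^n labelings); the paper uses QPBO variants
# instead. The terms below are made-up illustrative values.

def energy(y, unary, pairwise):
    e = sum(unary[j][y[j]] for j in range(len(y)))
    e += sum(t[(y[j], y[k])] for (j, k), t in pairwise.items())
    return e

def brute_force_min(n, unary, pairwise):
    return min(product((0, 1), repeat=n),
               key=lambda y: energy(y, unary, pairwise))

# Three variables; the term between variables 0 and 1 is non-submodular.
unary = [{0: 0.0, 1: 1.0}, {0: 1.0, 1: 0.0}, {0: 0.5, 1: 0.0}]
pairwise = {(0, 1): {(0, 0): 1.0, (1, 1): 1.0, (0, 1): 0.0, (1, 0): 0.0},
            (1, 2): {(0, 0): 0.0, (1, 1): 0.0, (0, 1): 1.0, (1, 0): 1.0}}

y_star = brute_force_min(3, unary, pairwise)
print(y_star, energy(y_star, unary, pairwise))  # (0, 1, 1) 0.0
```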

Fig. 10
figure 10

We plot the ratio of pairwise terms among all \(\left( {\begin{array}{c}|X^{(i)}|\\ 2\end{array}}\right) \) combinations. The highest ring has roughly half as many pairwise terms as the lowest ring

5 Application to Image Segmentation

We present an application of our digital curve evolution algorithm to supervised image segmentation, where the DCE acts as a contour correction method. We use a data fidelity term to characterize the object of interest. Given foreground and background seeds selected by the user, we derive Gaussian mixture distributions of color intensities \(G_f\), \(G_b\), and we define the data fidelity term as the cross-entropy, i.e.,

$$\begin{aligned} g({x_{w(p)}}) = -(1-{x_{w(p)}})\log {G_f(p)} - {x_{w(p)}}\log {G_b(p)}. \end{aligned}$$
(8)

We use the DCE algorithm to regularize an initial contour output by some segmentation algorithm or delineated by the user. In this application, the data term of the DCE is set to the data fidelity term (8).
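To make the data term concrete, the following Python sketch evaluates (8) with single Gaussians standing in for the fitted mixtures \(G_f\), \(G_b\); the means and standard deviations, and the scalar intensity model, are illustrative assumptions:

```python
import math

def gauss(v, mu, sigma):
    """Univariate Gaussian density; a stand-in for the fitted mixtures."""
    return math.exp(-0.5 * ((v - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def data_term(x, intensity, fg=(0.8, 0.1), bg=(0.2, 0.1), eps=1e-12):
    """Cross-entropy data term (8); fg/bg are illustrative (mean, std) pairs."""
    Gf = gauss(intensity, *fg)
    Gb = gauss(intensity, *bg)
    return -(1.0 - x) * math.log(Gf + eps) - x * math.log(Gb + eps)

# With the sign convention of (8), the label x = 0 is cheap where the
# foreground model fits the pixel intensity well, and x = 1 is cheap
# where the background model fits well.
bright = 0.8
print(data_term(0, bright) < data_term(1, bright))  # True
```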

figure f

The algorithm can be initialized with a collection of compact sets or with the result of a third-party segmentation algorithm, such as GrabCut [34]. We include an additional parameter d that dilates the initial sets using a square of side one before executing the flow.
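The dilation step can be sketched in pure NumPy. Here we read the square of side one as a 3×3 (unit-radius) structuring element applied d times; the exact element tied to the parameter d is an assumption of this sketch:

```python
import numpy as np

def dilate_square(mask, d=1):
    """Dilate a boolean mask d times with a 3x3 square structuring element.

    Pure-NumPy sketch; the exact structuring element behind the paper's
    parameter d is an assumption here.
    """
    m = mask.astype(bool)
    for _ in range(d):
        p = np.pad(m, 1)                       # pad with False
        out = np.zeros_like(m)
        h, w = m.shape
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):              # union of the 9 shifted copies
                out |= p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        m = out
    return m

seed = np.zeros((7, 7), dtype=bool)
seed[3, 3] = True
print(dilate_square(seed, 1).sum())  # 9: one pixel grows to a 3x3 block
```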

We evaluate our method on the BSD300 database [29]. All images have the same number of pixels, the resolution being \(321\times 481\) (\(481\times 321\)) in portrait (landscape) mode. We compare the results of our method with segmentations given by GrabCut and Schoenemann’s method [38]. We report an average of 3 s per flow iteration and an average of 30 iterations per image. While GrabCut executes in less than one second, Schoenemann’s method may take several hours to complete.

In Fig. 12, we compare curvature regularization with pure length regularization. Curvature regularization can fill gaps and produces smoother contours than length regularization alone, resulting in more pleasing segmentations. In Figs. 13 and 14, we list several results of our method and compare them with the segmentations produced by GrabCut and Schoenemann’s method. Our method succeeds in producing curvature-regularized segmentations and demonstrates the completion property of curvature. Moreover, it does not suffer from over-segmentation, it is much faster than Schoenemann’s method, and in several cases it produces better segmentations than GrabCut.

6 Conclusion

We have studied in depth several digital curve evolution models based on a digital version of the Elastica energy, and we have presented an application to image segmentation. The processes we have described are completely digital and do not suffer from issues, such as rounding, that typically arise in models that pass through a discretization stage. Moreover, the model can handle changes in topology, and its results are competitive with similar approaches while achieving reasonable running times.

Fig. 11
figure 11

For each plot, we first produce shapes \(\left\{ S^{(i)} \right\} \) by executing DCE with \(m=r\). Then, for each shape in \(\left\{ S^{(i)} \right\} \), we execute one iteration of DCE for different values of m and count the unlabeled pixels. The number of pixels left unlabeled by QPBOP remains high for lower values of m and goes to zero when \(m=r\). We observe the same behavior for varying radius values

Fig. 12
figure 12

Comparison of squared curvature regularization (first row) and length regularization (second row)

Fig. 13
figure 13

The proposed method regularizes GrabCut [34] contours and returns meaningful results. We can observe the completion feature of curvature in the second row, and we do not suffer from the over-segmentation issues of Schoenemann’s method [38]. However, our flow may stop in a local optimum, as in the fourth row, while Schoenemann’s method is able to extrapolate such solutions

Future developments of this work include extending the DCE algorithm to 3D, as the integral invariant estimator is available in higher dimensions. A denoising application could then be derived by lifting the flow to 3D, with a data fidelity term defined by image intensities or by the distance to the initial position.

Fig. 14
figure 14

Additional comparison between the GrabCut [34], Schoenemann [38] and contour correction segmentation methods