
1 Introduction

We focus on the image segmentation problem, one of the challenging problems in Computer Vision. Over the last three to four decades numerous approaches have been proposed and developed to attack it. In 1989, Mumford and Shah [18] computed the segmentation by minimizing a certain energy functional. The functional contains three terms: a regularity term on the length of the inter-phase contours, a regularity term on the smoothness of the intensity function v, and a data fidelity term which measures the \(L^2\) distance between the input intensity u and the output intensity v. Since the Mumford-Shah functional is non-convex and non-smooth, the optimization problem is difficult to solve. Simplifications of the model, however, have been proposed. Among these is the piecewise smooth convex relaxation by functional lifting [19]. Another frequently applied strategy is to restrict v to the class of piece-wise constant functions, so that the second regularity term in the original functional is omitted. This piecewise constant model, combined with classical gradient-based active contour models, leads to the Chan-Vese model ([8] for 2-phase and [25] for multi-phase segmentation). There are many other approaches to 2-phase image segmentation based on [8] and its convex relaxation [7], e.g., [4, 9, 29]. In [6], the regularity term in the Mumford-Shah functional is replaced by the Rudin-Osher-Fatemi (ROF) functional [21]. A new multiphase segmentation model, based on iteratively thresholding the minimizer of the ROF functional of a convex relaxation of the Mumford-Shah functional, is presented, and a relation between the solution in the 2-phase case and the one from the Chan-Vese model is established.

Graph-based image segmentation has been another active research field for the past 40 years. While the works mentioned above use a continuous setup, in the graph-based approaches one studies a suitably constructed graph whose vertices are the voxels of the image. In [28], Zahn uses the minimum spanning tree (MST) of a weighted graph to obtain the phases. The edge weights are defined as the differences of the intensities of neighboring voxels. "Heavy" MST edges are then cut, and the resulting connected components become the phases. Some shortcomings of the model have been overcome in [24], where the weights are normalized. A segmentation method based on finding minimum cuts in a graph is developed in [27]. The method turned out to be biased towards finding small components. A normalized cut criterion, which takes into account the self-similarity of regions, is developed in [22]. The latter leads to an NP-hard problem, and in [22] the authors propose several polynomial-time approximation algorithms. The work of Weiss [26] relates such eigenvector-based approximations to more standard (spectral) partitioning methods on graphs. An efficient greedy algorithm for multiphase segmentation, based on edge detection, is developed in [10]. Although the algorithm uses local optimization procedures, it runs in almost linear time with respect to the number of edges, and the output segmentation satisfies global properties.

Other graph-Laplacian-based segmentation models [13, 16, 17, 23] can be viewed as lying at the intersection of the two general approaches described above. In these works, the image voxels again form a graph, and the corresponding 2-Laplacian functional [1, 5] is used as the data fidelity term in an optimization problem. Data smoothing is not addressed via additional regularity terms in the functional, but by a careful choice of the edge weights.

The variety of segmentation techniques is huge, and we cannot cover it all. There are other approaches, some of them considered classical (e.g., the K-means method and its modifications). We refer to [3, 12] for the corresponding techniques and a review of the literature.

In this paper we consider a constrained 2-phase segmentation problem, where one of the phases is simply connected and of fixed volume. The motivation comes from industry and, more precisely, from Computed Tomography (CT). Porous materials are of current interest within a wide range of applications, and their properties strongly depend on quantities such as absolute porosity, average pore size, and the size and shape of individual pores. Therefore, accurate segmentation of the 3D industrial CT reconstruction of the corresponding specimen is crucial for further numerical simulations. Due to the highly irregular structure of the segmentation phases and the presence of noise in the image, the methods described earlier are not reliable, and in some cases the results of different algorithms may differ drastically (even in \(50\,\%\) of the voxels). To say the least, such a task is nontrivial.

An important constraint is the volume constraint. It is practical to assume that: (1) the volume (i.e., the number of “solid” voxels) of the solid phase is fixed (determined from the density and the weight of the material); and (2) the specimen consists of a single piece of material (one connected component).

In our work, we aim to design algorithms that give accurate segmentations and also respect the constraints (1) and (2). Recently, a promising step in this direction was made in [11], where, based on the techniques from [13], algorithms for accurate segmentation (with the volume constraint) were reported. Here, we introduce a different approach, based on MST properties. We propose a new class of algorithms which give promising results and provide a framework for future research on constrained image segmentation. Our mathematical model requires the minimization of functionals measuring the approximation error with piece-wise constant functions. For the theory, we consider a discrete version of the fidelity term of the Chan-Vese model (called the fitting energy), defined on the characteristic functions \(\chi _S\) of simply connected subsets \(S\subset \varOmega \) of cardinality \(|S|=M\). Then, in the experimental part, we add a regularity term to it for data smoothing.

The rest of the paper is organized as follows. We give preliminary notation and definitions in Sect. 2. Next, in Sect. 3 we formulate the problem relevant to the 2-phase constrained image segmentation. In Sect. 4 we describe the three-stage algorithm in more detail, and in Sect. 5 we test its performance.

2 Preliminaries

We introduce the notation needed to formulate a problem which we refer to in what follows as the 2-phase image segmentation problem. We are given a volume \(\varOmega \) in 3D (2D) which is split into \(n_x\times n_y\times n_z\) cubes (squares in the 2D case, when \(n_z=1\)). The cubes are called voxels. The total number of voxels is \(N=n_xn_yn_z\), and the set of voxels is \(\mathcal {V}\). We thus have

$$ \overline{\varOmega }= \bigcup _{K \in \mathcal {V}}\overline{K}. $$

We assume that we are given a piece-wise constant (with respect to the partition of \(\varOmega \) in voxels) function called intensity u. Denoting by \(\chi _{S}\) the characteristic function of a set S, we have that

$$ u = \sum _{K\in \mathcal {V}} u_K \chi _K. $$

Here, \(u_K=u\big |_K\). Since the space of such piece-wise constant functions is isomorphic to \(\mathbb {R}^N\), we also denote by u the corresponding vector \(\{u_K\}_{K\in \mathcal {V}}\) in \(\mathbb {R}^N\), hoping that there is no ambiguity in such notation.

The formulation of the 2-phase image segmentation problem involves topological connectivity with respect to various graphs, so we introduce the relevant notation next. Let \(G_{1}(\mathcal {V},\mathcal {E}_1)\) and \(G_{\infty }(\mathcal {V},\mathcal {E}_\infty )\) be the undirected graphs whose vertex set is the set of voxels and whose edge sets \(\mathcal {E}_p\), \(p=1,\infty \), are defined as follows:

$$ \mathcal {E}_p = \left\{ (i,j)\in \mathcal {V}\times \mathcal {V}\;\big |\; \Vert i-j\Vert _{\ell ^p} = 1\right\} . $$

Since we only consider undirected graphs, \((i,j)\in \mathcal {E}_p\) implies that \((j,i)\in \mathcal {E}_p\). The neighborhood \(\mathcal {N}_p(i)\) of a voxel is defined as

$$ \mathcal {N}_p(i) = \left\{ j\in \mathcal {V}\;\big |\;(i,j)\in \mathcal {E}_p\right\} . $$

For example, \(\mathcal {N}_1(i)\) consists of the 6 voxels that share a common face with i, while \(\mathcal N_{\infty }(i)\) consists of the 26 voxels that, together with i, form the \(3\times 3\times 3\) cube centered at i.
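As a concrete illustration of these neighborhoods, the following sketch enumerates \(\mathcal {N}_1(i)\) and \(\mathcal {N}_{\infty }(i)\) on a regular grid (the function name and calling convention are ours and only illustrative):

```python
import itertools

def neighbors(i, shape, p):
    """Enumerate the grid neighbors of voxel i (a tuple of indices).

    p = 1 yields the face neighbors (6 in 3D), p = 'inf' the full
    3x3x3 neighborhood minus the center voxel (26 in 3D)."""
    for d in itertools.product((-1, 0, 1), repeat=len(shape)):
        if all(o == 0 for o in d):
            continue                                   # skip the voxel itself
        if p == 1 and sum(abs(o) for o in d) != 1:
            continue                                   # keep only l^1-distance-1 offsets
        j = tuple(a + b for a, b in zip(i, d))
        if all(0 <= c < n for c, n in zip(j, shape)):  # stay inside the volume
            yield j

# For an interior voxel of a 64x64x64 volume:
assert len(list(neighbors((5, 5, 5), (64, 64, 64), 1))) == 6
assert len(list(neighbors((5, 5, 5), (64, 64, 64), 'inf'))) == 26
```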

The graph G is called connected if and only if for every pair of voxels i and j, there is a path formed by elements of \(\mathcal {E}\) connecting them.

In the following definitions, we assume that we have fixed a connected graph \(G=(\mathcal {V},\mathcal {E})\) whose set of vertices is the set of voxels.

Definition 1

Let \(S\subset \mathcal {V}\) be a set of voxels. We call \(G_S=(S,\mathcal {E}_S)\) the graph induced by S if \(G_S\) has as vertices the voxels in S and as edges all edges in \(\mathcal {E}\) for which both ends are in S.

Definition 2

Let \(S\subset \mathcal {V}\) be a set of voxels. We call S a G-connected set if the graph \(G_S\) induced by S is connected.

When each edge \(e=(i,j)\in \mathcal E\) carries a weight \(\omega _{ij}\ge 0\), the graph G is called weighted. The weights may have various meanings when the graph is related to real-life problems (e.g., gain, cost, penalty). In this paper, they measure the dissimilarity between the edge endpoints (e.g., the difference in intensities and/or gradient values of the corresponding voxels), so \(\omega _{ij}\sim 0\) means that i and j are similar in the given sense. Every connected weighted graph possesses a minimum spanning tree. This tree is a computationally efficient way to store both connectivity and similarity information, and it plays a central role in our algorithm. Therefore, we briefly cover the MST theory used in the paper. Let \(G=G_{\infty }(\mathcal {V},\mathcal {E})\).

Definition 3

We say that the graph \(T(\mathcal {V}_T,\mathcal {E}_T)\) is a minimum spanning tree (MST) of G if \(\mathcal {V}_T=\mathcal {V}\), T contains no cycles, and the total weight \(\displaystyle \sum _{(i,j)\in \mathcal {E}_T}\omega _{ij}\) is minimal.

Important properties of T are stated below:

  • Cycle property: For any cycle C in G, if \(\bar{e}={\mathrm {argmax}}_{e\in C}\omega _e\) is unique, then \(\bar{e}\notin T\).

  • Cut property: For any cut C in G, if \(\bar{e}={\mathrm {argmin}}_{e\in C}\omega _e\) is unique, then \(\bar{e}\in T\).

  • Contraction: If \(\mathcal {T}\subset T\) is a tree, then we can contract it to a single vertex and maintain the MST property for the factor graph.

Definition 4

We call \(\mathcal {L}_T:=\{i\in \mathcal {V}:\exists ! j\in \mathcal {V}\; s.t.\; (i,j)\in \mathcal {E}_T\}\) the set of leaves of T. The heaviest leaf is \(l_T:={\mathrm {argmax}}_{i\in \mathcal {L}_T}\{\omega _{ij}\;:\;(i,j)\in \mathcal E_T\}\).

To construct the MST, we apply Kruskal's algorithm [15], which runs in \(\mathrm {O}(N\log N)\) time (almost linear in the number of edges). Possible accelerations using local techniques and parallel realizations (the so-called approximate Kruskal algorithm can be found on pp. 600–602 of Kraus [14]) are available, but since the purpose of this paper is mainly to address the constrained segmentation problem, we do not pursue this avenue here and use the classical algorithm from [15].
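For completeness, here is a minimal sketch of the classical Kruskal algorithm with a union-find structure (illustrative code in our own notation, not the approximate variant of [14]):

```python
def kruskal_mst(num_vertices, edges):
    """Classical Kruskal algorithm. `edges` is a list of (weight, i, j) tuples with
    vertex ids 0..num_vertices-1; returns the list of MST edges."""
    parent = list(range(num_vertices))

    def find(x):                        # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, i, j in sorted(edges):       # process edges by nondecreasing weight
        ri, rj = find(i), find(j)
        if ri != rj:                    # the edge joins two different components
            parent[ri] = rj
            mst.append((w, i, j))
            if len(mst) == num_vertices - 1:
                break
    return mst
```

With \(|\mathcal {E}|=\mathrm {O}(N)\) for our grid graphs, the sort dominates the cost and gives the \(\mathrm {O}(N\log N)\) bound mentioned above.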

3 Problem Formulation

Here we state in a precise fashion the mathematical problem for determining a 2-phase segmentation of an image with intensity \(u\in [0,1]^N\).

We begin by giving the definition of an admissible 2-phase (image) segmentation.

Definition 5

Given an intensity u and a graph G, we say that S is an admissible 2-phase segmentation if the following properties are satisfied:

  • Connectedness Property (CP): S is G-connected.

  • Approximate Dominating Property (ADP): There exists an intensity \(v\approx u\) which yields the same solution S and satisfies the Dominating Property (DP):

    $$\min _{K\in S} v_K > \max _{K\in \bar{S}} v_K,\qquad \bar{S}=\mathcal {V}\setminus S.$$

When u itself satisfies (DP), we call S (DP)-admissible. In such a case, the image phases are well-separated and even direct segmentation methods such as hard thresholding will do the job. In this paper, we consider noisy and blurry images, where the boundary between S and \(\bar{S}\) is not that sharp. We note that the notion of \(v\approx u\) is a bit vague here. Intuitively, one may think of v as the original (denoised and deblurred version of u) image intensity, while the \(\approx \) sign implies certain constraints on the magnitude of both the noise and blur levels of the image.

We consider a 2-phase segmentation problem, where the solid phase is connected and of fixed volume.

Problem 1

Given u and G, find an admissible 2-phase segmentation S with a given cardinality \(|S|=M>1\).

We introduce the following family of functionals \(J: 2^{\mathcal {V}}\mapsto \mathbb {R}\), where \(2^{\mathcal {V}}\) is the set of all subsets of the set of vertices \(\mathcal {V}\).

$$\begin{aligned} J(S) = \Vert u-\chi _S\Vert ^2, \quad \text{ where } \chi _S \text{ is the characteristic function of } S. \end{aligned}$$
(1)

The values of the functional depend on the choice of norm, and we consider two such choices, thus two different functionals:

$$\begin{aligned} J_0(S) = \Vert u-\chi _S\Vert ^2_{\ell ^2(\mathcal {V})}, \quad J_{1}(S) =\Vert u-\chi _S\Vert ^2_{\ell ^2(\mathcal {V})} + \lambda \Vert \nabla \left( u-\chi _S\right) \Vert ^2_{\omega }. \end{aligned}$$
(2)

Here, \(\lambda >0\) is a parameter and \(\nabla :\mathbb {R}^{N}\mapsto \mathbb {R}^{|\mathcal {E}|}\) is the discrete gradient defined as

$$\begin{aligned} (\nabla v)_e = \delta _e v = v_i-v_j, \quad i<j, \quad (i,j)=e\in \mathcal {E}. \end{aligned}$$
(3)
$$\begin{aligned} (\nabla v, \nabla w)_{\omega } = \sum _{e\in \mathcal {E}} \omega _e \delta _e v\delta _e w. \end{aligned}$$
(4)

The non-negative weights \(\{\omega _e\}_{e\in \mathcal {E}}\) may depend on the intensity u, and will be dealt with in Sect. 4. Finally, the weighted norm of the gradient is defined as

$$\begin{aligned} \Vert \nabla v\Vert ^2_{\omega } = (\nabla v,\nabla v)_{\omega }. \end{aligned}$$
(5)

Note that \(J_0\) can be seen as the discrete and simplified version of the 2D Chan-Vese fitting energy [8]

$$F_1(\mathcal C)+F_2(\mathcal C)=\int _{inside(\mathcal C)}|u-c_0|^2 dxdy+\int _{outside(\mathcal C)}|u-c_1|^2 dxdy,$$

where \(\mathcal C\) is a 2D curve. The 2-phase segmentation there is obtained by minimizing the fitting energy with respect to \(\mathcal C\), \(c_0\), and \(c_1\), together with regularity terms on the length of \(\mathcal C\) and the area of its interior. The interior and the exterior of \(\mathcal C\) are then the two phases. Here, we set \(c_0=\min _i u_i=0\) and \(c_1=\max _i u_i=1\). We have no regularity terms, but impose two additional constraints: the interior must be connected and of cardinality M. If no constraints are imposed, it is straightforward to check that the minimizer of \(J_0\) corresponds to direct hard-threshold segmentation (e.g., \(i\in S\;\Leftrightarrow \;u_i\ge 0.5\)). The functional \(J_1\) is \(J_0\) penalized by a regularity term; the regularization depends on the choice of the weights \(\omega \).
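As a small numerical illustration of this last observation (assuming NumPy; the array u and the flipped index are synthetic), the hard threshold at 0.5 minimizes \(J_0\) when no constraints are imposed, and flipping any single voxel cannot decrease the energy:

```python
import numpy as np

def J0(u, S):
    """Discrete fitting energy J_0(S) = ||u - chi_S||^2 (constant 1 on S, 0 outside)."""
    chi = np.zeros_like(u)
    chi[list(S)] = 1.0
    return float(np.sum((u - chi) ** 2))

u = np.random.rand(1000)                        # synthetic intensity in [0, 1]^N
S_threshold = set(np.flatnonzero(u >= 0.5))     # unconstrained minimizer: hard threshold at 0.5
i = 0                                           # flip an arbitrary voxel
assert J0(u, S_threshold) <= J0(u, S_threshold ^ {i})
```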

We now have the following definition.

Definition 6

We say that the set S provides an optimal 2-phase segmentation for u if and only if S is an admissible 2-phase segmentation of cardinality M and it minimizes the functional J(S), namely,

$$ S=\arg \min \left\{ J(S) \;\big |\; |S| = M, \; S \;\text{ is } \text{ connected }\right\} \!. $$

This definition leads to a simple characterization of the minimizer for the norm choices (2). Indeed, for all \(S\in 2^{\mathcal {V}}\) we have

$$\begin{aligned} J_0(S) &= \sum _{j\in S} (u_j-1)^2 +\sum _{j\notin S} u_j^2 = \Vert u\Vert _{\ell ^2}^2 -2\sum _{j\in S} u_j + \Vert \chi _S\Vert ^2_{\ell ^2}\\ &= -2\sum _{j\in S} u_j + \Vert u\Vert _{\ell ^2}^2 + M. \end{aligned}$$

Thus, minimizing \(J_0(S)\) is equivalent to finding a G-connected S, such that

$$ S=\arg \max J_*(S), \quad J_*(S) := \sum _{j\in S} u_j. $$

Note that \(J_*(\cdot )\) is a linear functional in u. Next, we look at the other norm. Denote by \(\mathcal {E}_{c}\) (c stands for “cut”) the set of edges connecting S with its complement \(\overline{S}\). We note that \((\nabla \chi _S)_e = 0\) for all edges e interior to S or \(\overline{S}\). We then compute

$$\begin{aligned} \Vert \nabla u -\nabla \chi _S\Vert _{\omega }^2 = \Vert \nabla u\Vert _{\omega }^2 - 2\sum _{e\in \mathcal {E}_c} \omega _e\delta _e u +\sum _{e\in \mathcal {E}_c} \omega _e = \Vert \nabla u\Vert _{\omega }^2 + \sum _{e\in \mathcal {E}_c} (1-2\delta _e u)\omega _e. \end{aligned}$$

Note that, by the definition of the gradient, the above formula is correct provided the vertices in S are ordered first, so that \(\delta _e\chi _S\ge 0\). Thus, we can obtain an optimal solution by minimizing

$$ J_{**}(S) = -2\sum _{j\in S} u_j + \lambda \sum _{e\in \mathcal {E}_c} (1-2\delta _e u)\omega _e. $$
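As a sanity check of this reduction, the following self-contained snippet (a random chain graph with illustrative names and parameters) verifies numerically that, for subsets of equal cardinality, \(J_1\) and \(J_{**}\) differ only by a constant, so they share the same constrained minimizers:

```python
import itertools, random

random.seed(0)
N, M, lam = 8, 3, 0.7
u = [random.random() for _ in range(N)]
edges = [(i, i + 1, random.random()) for i in range(N - 1)]   # a small chain graph

def J1(S):
    """Full functional J_1(S) = ||u - chi_S||^2 + lam * ||grad(u - chi_S)||^2_omega."""
    chi = [1.0 if i in S else 0.0 for i in range(N)]
    fit = sum((u[i] - chi[i]) ** 2 for i in range(N))
    grad = sum(w * ((u[i] - u[j]) - (chi[i] - chi[j])) ** 2 for i, j, w in edges)
    return fit + lam * grad

def Jss(S):
    """Reduced functional J_** (cut edges oriented so the S-endpoint comes first)."""
    s = -2.0 * sum(u[i] for i in S)
    for i, j, w in edges:
        if (i in S) != (j in S):
            a, b = (i, j) if i in S else (j, i)
            s += lam * (1.0 - 2.0 * (u[a] - u[b])) * w
    return s

# For all subsets of the same cardinality M, J_1 - J_** is one and the same constant:
subsets = [set(c) for c in itertools.combinations(range(N), M)]
const = J1(subsets[0]) - Jss(subsets[0])
assert all(abs(J1(S) - Jss(S) - const) < 1e-9 for S in subsets)
```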

4 Constrained Segmentation

In this section, we propose a three-stage segmentation algorithm. The steps are as follows: (1) a smoothing step which removes the local extrema in the intensity vector; (2) selecting M voxels and constructing a connected component of the graph that contains all of these voxels; (3) trimming the connected component so that the approximation to the “solid” part of the image has exactly M voxels.

4.1 Stage 1: Removing Local Maxima

For a fixed G, we say that u has a strict local maximum at \(K\in \mathcal {V}\) if and only if

$$ u_K > \max \{u_J \;:\; J\in \mathcal {N}(K)\}. $$

Since, by assumption, we are looking for a segmentation with \(|S|>1\), all strict local maxima are due to image artefacts (e.g., noise). We can therefore modify the intensity and remove them, while still having an admissible solution S. We use the following algorithm (Algorithm 1) for the removal of local maxima:

[Algorithm 1: removal of local maxima]
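Since the listing of Algorithm 1 is not reproduced here, the following is only a plausible sketch of such a smoothing step, under the assumption that every strict local maximum is lowered to the largest intensity among its neighbors and that the sweep is repeated until no strict local maximum remains; the actual Algorithm 1 may differ in its details.

```python
def remove_local_maxima(u, neighborhoods):
    """Plausible smoothing step (an assumed reading of Algorithm 1): lower every
    strict local maximum of the intensity vector u to the largest intensity among
    its neighbors, and repeat until no strict local maximum is left.
    `neighborhoods[i]` is the list of neighbor indices of voxel i."""
    v = list(u)
    changed = True
    while changed:
        changed = False
        for i, nbrs in enumerate(neighborhoods):
            m = max(v[j] for j in nbrs)
            if v[i] > m:          # strict local maximum, an isolated spike
                v[i] = m          # lower it to the neighborhood maximum
                changed = True
    return v
```

Each modification replaces a value by a strictly smaller value already present in the vector, so the procedure terminates.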

We have the following equivalence result about the segmentations corresponding to the intensities u and v; its proof is straightforward.

Lemma 1

Let v be obtained from u via Algorithm 1. Then

  • If u satisfies (DP) then v also satisfies (DP).

  • If S(u) is a 2-phase admissible segmentation solving Problem 1, so is S(v).

4.2 Defining Edge Weights

From now on, we assume that the underlying graph G is fixed and in what follows, we take \(G=G_{\infty }\). Furthermore, we assume \(u\in [0,1]^N\).

We aim at solving constrained segmentation problems, where the cardinality of the admissible 2-phase segmentation S is known a priori (i.e., \(|S|=M\)). For this purpose, we split \(\mathcal {V}\) into three subsets:

$$ \begin{aligned}&\mathcal {V}_1:=\left\{ i\in \mathcal {V}\;\big |\;u_i\sim 1\; \& \;f(\nabla u_i)\le \varepsilon \right\} \!,\\&\mathcal {V}_0:=\left\{ i\in \mathcal {V}\;\big |\;u_i\sim 0\; \& \;f(\nabla u_i)\le \varepsilon \right\} \!,\\&\mathcal {V}_U:=\mathcal {V}\setminus \left( \mathcal {V}_0\cup \mathcal {V}_1\right) . \end{aligned}$$

Here, the maximal image intensity is 1 and the minimal is 0; \(\nabla u_i\) is the G-gradient at i (i.e., \((\nabla u_i)_j=u_j-u_i\) for all \(j\in \mathcal {N}(i)\)), \(f:\mathbb {R}^{26}\rightarrow [0,+\infty )\), and \(\varepsilon \) is a small parameter chosen by the user. The similarity relations \(u_i\sim 1\) and \(u_i\sim 0\) also need to be specified for the image; they should depend on the noise and blur levels. Typically, one uses \(u_i\ge 1-\eta \) and \(u_i\le \eta \) for a suitable \(\eta \in (0,1/2)\).

The idea is that \(\mathcal {V}_1\subset S\) and \(\mathcal {V}_0\subset \bar{S}\), while the assignment of the voxels in \(\mathcal {V}_U\) remains unclear; depending on M, they have to be distributed between the two phases S and \(\bar{S}\).

For the weights of the edges, we propose to add an “uncertainty penalizer”, e.g.,

$$\begin{aligned} \omega _{ij}:=|u_i-u_j|+\delta g\left( f(\nabla u_i),f(\nabla u_j)\right) ,\quad \forall (i,j)\in \mathcal {E}. \end{aligned}$$
(6)

Here \(g:[0,+\infty )^2\rightarrow [0,+\infty )\) and \(\delta \) is a small, positive parameter. Such weights should favor edges between \(\mathcal {V}_U\) and \(\mathcal {V}_0\cup \mathcal {V}_1\) and penalize edges within \(\mathcal {V}_U\). The latter helps us to “clarify” the origin of the uncertain voxels, while at the same time decoupling them so that they do not cluster. Hence, the elements of \(\mathcal {V}_U\) can be treated individually, which is very important for our constrained problem.

To achieve this, we need to impose some assumptions on f and g:

Definition 7

We say that f and g are admissible if they satisfy the following:

  • Symmetry: \(f\left( x_1,\dots ,x_{26}\right) =f\left( x_{\sigma (1)},\dots ,x_{\sigma (26)}\right) \), resp. \(g(x_1,x_2)=g(x_2,x_1)\), where \(\sigma :\{1,\dots ,26\}\rightarrow \{1,\dots ,26\}\) is an arbitrary permutation.

  • Positivity: \(f(x)=0,\;g(x)=0\quad \Leftrightarrow \quad x=\mathbf {0}\).

  • 1-Homogeneity: \(f(\lambda x)=\lambda f(x),\;g(\lambda x)=\lambda g(x),\qquad \forall \lambda \ge 0\).

  • Monotonicity: \(g(x_1+\alpha ,x_2)\ge g(x_1,x_2),\quad \forall \alpha >0\).

Examples: \(\Vert \cdot \Vert _{\ell ^p},\;\forall p\ge 1\), \(\min (\cdot )\), and many others.
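To make the construction concrete, here is a sketch of the splitting \(\mathcal {V}_1,\mathcal {V}_0,\mathcal {V}_U\) and of the weights (6), using \(f=\Vert \cdot \Vert _{\ell ^2}\) and \(g(a,b)=a+b\) as one admissible pair; the thresholds \(\eta \), \(\varepsilon \), \(\delta \), the helper names, and the default values are illustrative choices on our part, not prescribed by the model.

```python
import numpy as np

def split_and_weights(u, neighborhoods, eta=0.1, eps=0.05, delta=0.1):
    """Sketch of the vertex splitting V_1 / V_0 / V_U and the edge weights (6)
    with f = l2 norm of the local gradient and g(a, b) = a + b.
    u is a 1D array of intensities in [0, 1]; neighborhoods[i] lists the
    G_infinity neighbors of voxel i; eta, eps, delta are user-chosen parameters."""
    N = len(u)
    # f(grad u_i): l2 norm of the differences to all neighbors of i
    f = np.array([np.sqrt(sum((u[j] - u[i]) ** 2 for j in neighborhoods[i]))
                  for i in range(N)])
    V1 = {i for i in range(N) if u[i] >= 1 - eta and f[i] <= eps}
    V0 = {i for i in range(N) if u[i] <= eta and f[i] <= eps}
    VU = set(range(N)) - V0 - V1

    weights = {}
    for i in range(N):
        for j in neighborhoods[i]:
            if i < j:  # store each undirected edge once
                weights[(i, j)] = abs(u[i] - u[j]) + delta * (f[i] + f[j])
    return V0, V1, VU, weights
```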

Lemma 2

(Properties of \(\omega _{ij}\) from (6)). Let \(G=G_{\infty }(\mathcal {V},\mathcal {E})\), \(i\in \mathcal {V}\), \(j\in \mathcal {N}(i)\), let f and g be admissible, and let \(\omega _{ij}\) be given by (6). Then

  (i) G is an undirected graph (\(\omega _{ij}=\omega _{ji}\)) and \(\omega _{ij}\) is invariant under rotations of \(\varOmega \).

  (ii) \(\omega _{ij}=0\quad \Leftrightarrow \quad u|_{\mathcal {N}(i)\cup \mathcal {N}(j)}=const\). In particular, \(\omega _{ij}\sim 0\) if \(\{i,j\}\subset \mathcal {V}_0\cup \mathcal {V}_1\).

  (iii) Let \(j\in \mathcal {V}_0\cup \mathcal {V}_1\), \(k\in \mathcal {V}_U\cap \mathcal {N}(i)\), and \(u_i=u_j=u_k\). Then \(\omega _{ij}\le \omega _{ik}\). If f and g are strictly monotone, then \(\omega _{ij}<\omega _{ik}\).

Proof

Since f is symmetric, \(\omega _{ij}\) is invariant under rotations of \(\varOmega \); since g is symmetric, \(\omega _{ij}=\omega _{ji}\). Hence (i) is verified. Property (ii) follows from the positivity of f and g. For (iii), w.l.o.g., let \(j\in \mathcal {V}_1\). Then \(u_j\sim 1\), thus \(u_k\sim 1\), and \(k\in \mathcal {V}_U\) if and only if \(f(\nabla u_k)>\varepsilon \ge f(\nabla u_j)\). Finally,

$$\omega _{ik}=\delta g\left( f(\nabla u_i),f(\nabla u_k)\right) \ge \delta g\left( f(\nabla u_i),f(\nabla u_j)\right) =\omega _{ij},$$

due to monotonicity of g. If g is strictly monotone, the inequality is also strict.

4.3 Stage 2: Connecting Different Components

We now focus on the second stage of the image segmentation algorithm. Let \(S\subset \mathcal {V}\) be the set of voxels whose intensities are the M largest components of the intensity vector. Since such hard thresholding does not guarantee any connectivity of S, we may obtain several connected components \(S_1,\dots ,S_k\). Without loss of generality we assume that \(S_1\) has the largest cardinality. Let \(\widetilde{G}\) be the factor graph in which two vertices are considered equivalent if they lie in the same connected component \(S_j\). We perform a lexicographical breadth first search (LBFS) [20] in \(\widetilde{G}\) and construct the corresponding lexicographical BFS tree rooted at \(S_1\). The lexicographical BFS requires an ordering of the vertices, which in our case is by intensity value. This is aimed at minimizing \(J_0\) (or, alternatively, maximizing \(J_*\)), because the BFS tree contains edges between vertices with high intensity. The final step of this stage connects each \(S_j\), \(j=2,\dots ,k\), with \(S_1\) via the tree branches.

Other algorithms for choosing paths between \(S_1\) and the remaining \(S_j\), \(j=2,\dots ,k\), which maximize \(J_*\) can also be used in place of what we propose here.
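For illustration, the following sketch implements one such alternative: a Dijkstra-type search started from the largest component, with per-voxel cost \(1-u_i\), so that the connecting paths prefer high-intensity voxels (in the spirit of maximizing \(J_*\)). It is not the LBFS procedure described above, and all names are ours.

```python
import heapq

def connect_components(u, neighborhoods, components):
    """One simple alternative to the LBFS-based stage 2 (a hedged sketch, not the
    procedure of the paper): connect every component to the largest one through
    low-cost paths, where stepping onto voxel i costs 1 - u[i], so that
    high-intensity paths are preferred. `components` is a list of voxel sets;
    the return value is the union of all components plus the connecting voxels."""
    components = sorted(components, key=len, reverse=True)
    S1 = components[0]
    S = set().union(*components)
    comp_of = {i: k for k, comp in enumerate(components) for i in comp}

    dist = {i: 0.0 for i in S1}                 # multi-source Dijkstra from S1
    parent = {}
    heap = [(0.0, i) for i in S1]
    heapq.heapify(heap)
    reached = {0}                               # indices of already connected components

    while heap and len(reached) < len(components):
        d, i = heapq.heappop(heap)
        if d > dist.get(i, float("inf")):
            continue                            # stale heap entry
        k = comp_of.get(i)
        if k is not None and k not in reached:  # first voxel of a new component reached
            reached.add(k)
            j = i
            while j in parent:                  # trace the connecting path back to S1
                S.add(j)
                j = parent[j]
        for j in neighborhoods[i]:
            nd = d + (1.0 - u[j])               # cost of stepping onto voxel j
            if nd < dist.get(j, float("inf")):
                dist[j] = nd
                parent[j] = i
                heapq.heappush(heap, (nd, j))
    return S
```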

4.4 Stage 3: Cutting Heavy Leaves

We now discuss some theoretical aspects of the third phase of the algorithm aimed at trimming the connected component from the previous section in order to obtain an image segmentation that satisfies the volume constraint.

Let \(u\in \{0,1\}^N\) be binary, and let S be the (DP)-admissible 2-phase segmentation with respect to u; the latter means that \(u|_S=1\) and \(u|_{\bar{S}}=0\). Let f and g be admissible and strictly monotone, and let the graph \(G=G_{\infty }\) be built with the weights (6). Let S be G-connected, let T be an MST of \(G_S\), let \(l_T\in \mathcal {L}_T\) with \((l_T,j_T)\in \mathcal {E}_T\), and let \(\omega _{l_Tj_T}>0\). Then

$$\begin{aligned} l_T\in \partial S:=\left\{ i\in S\;:\;\begin{array}{c}\mathcal {N}(i)\cap S\ne \emptyset \\ \mathcal {N}(i)\cap \bar{S}\ne \emptyset \end{array}\right\} =S\cap \mathcal {V}_U. \end{aligned}$$
(7)

Indeed, first of all, \(\mathcal {V}_0=\mathop {\mathrm {int}}\bar{S}\), \(\mathcal {V}_1=\mathop {\mathrm {int}}S\), thus the set equality in (7) holds true.

Now, assume the contrary, i.e., \(l_T\in \mathop {\mathrm {int}}S\). Since \(l_T\in \mathcal {L}_T\cap \mathop {\mathrm {int}}S\), the MST cut property, applied to \((l_T,S\setminus \{l_T\})\), gives \(\min _{j\in \mathcal {N}(l_T)}\omega _{l_Tj}=\omega _{l_Tj_T}>0\), and from Lemma 2(ii) it follows that \(\mathcal {N}(l_T)\subset \partial S\). For every \(j\in \mathcal {N}(l_T)\setminus \{j_T\}\) we have \((l_T,j)\notin \mathcal {E}_T\), because \(l_T\in \mathcal {L}_T\). Due to the strict monotonicity of g and the positivity of f, we have for each \(k\in \mathcal {N}(j)\cap (\mathop {\mathrm {int}}S)^c\)

$$\omega _{jk}=|u_j-u_k|+\delta g\left( f(\nabla u_j),f(\nabla u_k)\right) >\delta g\left( f(\nabla u_j),0\right) =\omega _{jl_T}.$$

Applying the MST cut property again, this time to \((j,S\setminus \{j\})\), we derive that there exists \(k_j\in \mathcal {N}(j)\cap \mathop {\mathrm {int}}S\), \(k_j\ne l_T\), with \((j,k_j)\in \mathcal {E}_T\). Thus, we have a \(3\times 3\times 3\) cube \(\mathcal {N}(l_T)\), centered at \(l_T\), all 26 boundary voxels of which belong to the G-boundary of the G-connected set S, while at least 25 of them are also G-connected to interior points of S (different from \(l_T\)!) within the \(5\times 5\times 5\) cube \(\mathcal {N}^2(l_T)\) centered at \(l_T\). It is straightforward to show that there should then be at least 6 different external points (one for the \(3\times 3\) interior of each face of the cube) and at least 5 internal points (on the side of \(j_T\) there may be none), every two of them at distance at least 2 in \(\Vert \cdot \Vert _{\ell ^{\infty }}\). This is impossible. The rigorous proof of (7) is rather elaborate and beyond the scope of this paper. What we need is a corollary of this result, which we state now.

Proposition 1

Let \(u\in \{0,1\}^N\) be binary, let S be the admissible 2-phase segmentation with respect to u, let f and g be admissible and strictly monotone, and let the graph \(G=G_{\infty }\) be built with the weights (6). If S is G-connected, then for every MST T of \(G_S\) its heaviest leaf \(l_T\) belongs to \(\mathcal {V}_U\).

Proof

Let \(j_T\) be as before. If \(\omega _{l_Tj_T}>0\), the result follows from the arguments above. Assume the contrary, i.e., \(l_T\in \mathop {\mathrm {int}}S\). Thus \(\omega _{l_Tj_T}=0\), and Lemma 2(ii) implies \(j_T\in \mathop {\mathrm {int}}S\) and \(\mathcal {L}_T\subset \mathop {\mathrm {int}}S\). Now we aggregate \(l_T\) and \(j_T\) into a new (super) vertex/voxel \(l^1_T\). We obtain \(S^1=S\cup \{l^1_T\}\setminus \{l_T,j_T\}\), and \(\mathcal {E}_{S^1}\) can be straightforwardly derived from \(\mathcal {E}_S\), since \(l_T\) and \(j_T\) agree on all the “doubled” edges (i.e., \(\omega _{l_Tk}=\omega _{j_Tk}\) for all \(k\in \mathcal {N}(l_T)\cap \mathcal {N}(j_T)\)). Due to the MST contraction property, the graph \(G_{S^1}\) is connected with MST \(T^1(S^1,\mathcal {E}_T\setminus \{(l_T,j_T)\})\). Thus, all the leaves in \(\mathcal {L}_T\setminus \{l_T\}\) remain leaves in \(T^1\) and their weights remain zero, because \(\omega _{l_Tj_T}=0\) was the heaviest leaf weight in T. The vertex \(l^1_T\in \mathop {\mathrm {int}}S^1\) may or may not be a leaf in \(T^1\), but since

$$|\mathop {\mathrm {int}}S^1|=|\mathop {\mathrm {int}}S|-1<|\mathop {\mathrm {int}}S|,$$

after finitely many contractions, say m (\(m\le |\mathop {\mathrm {int}}S|\)), we end up with a factor graph \(G_{S^m}\), where \(\mathcal {L}_{T^m}\subset \mathop {\mathrm {int}}S^m\) and the heaviest leaf weight is strictly positive. Such a leaf can appear only as a result of aggregation, thus it belongs to \(\mathop {\mathrm {int}}S^m\), which contradicts the arguments above.

4.5 An Algorithm for Constrained Image Segmentation

The steps of the algorithm described above are formally written as follows.

[Algorithm 2: the three-stage constrained segmentation algorithm]
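Since the listing of Algorithm 2 is not reproduced here, we only sketch the stage 3 trimming loop, under the assumption that the MST of \(G_S\) is computed once and its heaviest remaining leaf is cut repeatedly until \(|S|=M\); the actual Algorithm 2 may maintain the tree differently.

```python
import heapq

def trim_heaviest_leaves(mst_adj, S, M):
    """Stage 3 sketch (one plausible reading of the trimming step, not the
    verbatim Algorithm 2): given the MST of G_S as an adjacency dict
    mst_adj[i] = {j: w_ij, ...}, repeatedly cut the heaviest current leaf
    until exactly M voxels remain."""
    S = set(S)
    adj = {i: dict(mst_adj[i]) for i in S}
    # max-heap (negated weights) of the current leaves, keyed by their unique tree edge
    heap = [(-w, i) for i in S if len(adj[i]) == 1 for w in adj[i].values()]
    heapq.heapify(heap)

    while len(S) > M and heap:
        _, i = heapq.heappop(heap)
        if i not in S or len(adj[i]) != 1:
            continue                      # stale entry: already cut, or no longer a leaf
        (j, _), = adj[i].items()
        S.remove(i)                       # cut the heaviest leaf
        del adj[i]
        del adj[j][i]
        if len(adj[j]) == 1:              # the tree neighbor may have become a leaf
            (w2,) = adj[j].values()
            heapq.heappush(heap, (-w2, j))
    return S
```

Removing a leaf keeps the remaining edge set a tree on the remaining vertices, so the loop never needs to recompute a spanning tree.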

4.6 Properties of the Algorithm

Let u be the input intensity (input image) and compute \(\mathcal {V}_0,\mathcal {V}_1,\mathcal {V}_U\) for it. Note that removing local maxima and local minima is just denoising, so for the sets \(\bar{\mathcal {V}}_0,\bar{\mathcal {V}}_1,\bar{\mathcal {V}}_U\), recomputed after stage 1, the inclusions \(\mathcal {V}_0\subseteq \bar{\mathcal {V}}_0\) and \(\mathcal {V}_1\subseteq \bar{\mathcal {V}}_1\) hold. Denote by \(C_1\) the minimal \(G_{\infty }\)-connected set that contains \(\bar{\mathcal {V}}_1\), and let \(S_M\) be the set from step 3. We say that u is admissible if \(\mathcal {N}(\bar{\mathcal {V}}_1)\subseteq S_M\), \(|\mathcal {N}(\bar{\mathcal {V}}_1)\cup C_1|\le M\), and \(\mathcal {N}(\bar{\mathcal {V}}_0)\subseteq \bar{S}_M\). If u is not admissible, then either the parameter choices for \(\mathcal {V}_1\) or \(\mathcal {V}_0\) were poor, or the constraint parameter M approximates the solid-phase volume badly.

For admissible u, Proposition 1 implies that only \(\mathcal {V}_U\) voxels are cut in stage 3, thus \(\bar{\mathcal {V}}_1\subseteq S\) at every moment. Moreover, due to Lemma 2(ii), the heaviest leaf weight is strictly positive. No \(i\in \bar{\mathcal {V}}_0\) belongs to any shortest path between the components of S, thus after stage 2 we have \(\bar{\mathcal {V}}_0\subseteq \bar{S}\). Since in step 6 we only cut, the inclusion remains true for the output image as well. Finally, since we always cut leaves from \(T_S\), the set S remains \(G_{\infty }\)-connected from stage 2 until the end. To summarize:

Theorem 3

For any admissible input image u, Algorithm 2 terminates and produces an output 2-phase segmentation S that is \(G_{\infty }\)-connected, has cardinality M, fully contains \(\mathcal {V}_1\), and does not intersect \(\mathcal {V}_0\). The complexity of the algorithm is \(\mathcal {O}(N\log N)\).

5 A Numerical Test

In this section we assess the performance of Algorithm 2 on part of an image of a trabecular bone. The image is taken from [2], convolved with a Gaussian kernel with \(\sigma =2\) (i.e., blurred), and \(10\,\%\) white (Gaussian) noise is added to obtain the input image u. The bone part has size \(64\times 64\times 64\). 50604 of its voxels are bone material (porosity \(80.7\,\%\)), thus \(M=50604\). Figure 1 summarizes the results. The leftmost image shows the original discretized bone. The second is the result of direct segmentation, where the M voxels of highest intensity are taken as the solid phase. The third is the output of our Algorithm 2. The last is the output of the segmentation in [11], based on fully constrained convex \(\ell ^2\)-norm minimization.

Fig. 1.

From left to right: Segmented bone part (binary image), direct M-segmentation of the noisy and blurry version u, connected M-segmentation via Algorithm 2, segmentation from [11].

The direct M-segmentation is quite noisy; it consists of many 1-element components as well as other, larger ones. In contrast, our segmentation is G-connected. There are still some 1-voxel-wide branches, due to small noisy components in the set \(S_3\) at step 3, which have been aggregated to the main component \(C_0\) in stage 2 (see Fig. 2). The result of the segmentation in [11] is free of noise, because of the smoothing role of the edge weights there, but it is not G-connected and consists of three different components; thus it is not admissible with respect to Definition 5.

Fig. 2.

From left to right: Segmented bone part (binary image), the set S after stage 2, the set of cut leaves in stage 3, and the final result of Algorithm 2.

Note that none of the connected components of \(S_3\) has cardinality 1, due to the removal of local maxima. During the leaf cutting, some of those noisy branches were erased, but some remain in the result. The reason is that only the voxels' intensity values are used throughout steps 1–4 of the algorithm, so no regularization is applied in the process, and the connected components of \(S_3\) are not as “homogeneous” as they would be if the gradient were taken into account in the expansion process. Minimizing the functional \(J_1\) instead of \(J_0\) in step 4 should improve the quality of the result and is a subject of future work. We point out that, out of the 234 leaves cut in stage 3, only 5 belong to the actual bone, and the remaining 229 are indeed noise. The \(\ell ^1\) difference of our result from the original bone is 20 844, which is larger than the 15 524 difference of the result in [11], but almost a thousand better than the difference of the direct M-segmentation (21 796, as computed in [11]). The former means that there is plenty of room for improvement (e.g., a possible combination of the two constrained algorithms, “thickening” the minimal paths, “homogenizing” the connected components, etc.), while the latter indicates that simply by removing local extrema and replacing 234 candidate voxels with another, better group of 234 voxels, we already gain a lot.

6 Conclusions and Future Work

We proposed and tested a class of algorithms for constrained image segmentation. The algorithms are based on the minimization of suitable functionals measuring the best approximation of the input image within the space of step functions. The approximate segmentation produced by the algorithm has a connected solid phase of fixed volume. Algorithms of this type, and especially their multilevel versions, show potential to become robust tools in image analysis.