
1 Introduction

Image segmentation is the process of assigning meaningful labels to pixels (2D) or voxels (3D). In medical imaging, the set of labels corresponds to a number of biologically relevant regions of interest (ROIs), such as different organs, types of cells, etc., as well as a background, in most cases. Segmentation is a key preprocessing step for a wide array of subsequent analyses, such as volumetry or shape analysis, and is therefore one of the core problems in medical imaging.

As useful as traditional manual segmentation is, its use is encumbered by the need for an expert to delineate the images, which leads to three main issues: (a) it is not reproducible; (b) it often requires expertise that might not be present at every research center; and (c) it is typically very time consuming – and thus expensive. This last limitation is particularly problematic in 3D data (in which multiple 2D slices are traced to create a 3D segmentation) for two reasons. First, the increasing resolution of medical images requires segmenting a larger number of 2D slices. And second, several iterations are often needed to ensure that segmentations made in a particular orientation are consistent and smooth when displayed in the orthogonal views. Moreover, 3D segmentation is typically a very inefficient process, since much time is spent labeling 2D slices that are very similar to their neighbors.

Some of the limitations of manual segmentation can be addressed with automated techniques. The most popular families of automatic segmentation methods are supervised and rely on training data, e.g., shape models [1], multi-atlas segmentation [2, 3], probabilistic atlases [4], or voxel classifiers based on learning models such as support vector machines [5], random forests [6] or, more recently, deep neural networks (e.g., [7]). However, creating databases of labeled training data for supervised techniques still requires manually segmenting images. And even for unsupervised methods, manual delineations are needed to produce a gold standard for validation purposes.

A faster, more reproducible way of generating gold standards is through semi-automated segmentation, which represents a compromise between automated and manual methods: the user is required to provide a relatively small amount of input, which an automated algorithm subsequently uses to produce a dense segmentation of the whole image. In 2D, the popular live wire method [8, 9] uses a shortest path algorithm to produce a continuous contour out of a set of points placed along the (possibly ill-defined) boundary of an object. The GrabCut algorithm [10] iteratively uses graph cuts [11] to create a segmentation of an object from a rectangular, user-provided bounding box. The Random Walker algorithm [12] uses a set of user-provided scribbles to compute a dense segmentation based on the probability that a random walker starting at each unlabeled pixel first reaches one of the prelabeled pixels. GeoS [13] also relies on scribbles to produce a geodesic distance map that, in the context of a conditional random field, yields the final segmentation. Many of these methods can be made interactive by allowing the user to modify their input while the output is updated in real time – if this is computationally feasible.

Extending the semi-automated methods described above to 3D is straightforward in most cases. A notable exception is live wire, which requires finding a minimal surface joining user-specified contours. Such an extension has a direct, very useful application to semi-automated segmentation: a user can manually delineate one slice out of every n in a 3D volume (possibly with 2D semi-automated techniques) and then use this method to produce a smart “interpolation” of the segmentation for the unlabeled slices. This strategy effectively exploits the similarity between adjacent slices and can save large amounts of manual labeling effort.

After some attempts in the literature to reconstruct 2D surfaces from sets of 1D curves (e.g., [14]), Grady proposed an algorithm [15] to compute a globally minimal surface given its boundary. The solution is efficiently found by solving a minimum-cost flow problem [16]. Compared with other globally optimal surface algorithms, such as LOGISMOS [17] and its variants (all based on [18]), Grady’s method has the advantages of being able to handle topological changes in the optimal surface; being agnostic to the type and modality of the input images; and not requiring a preliminary volumetric segmentation.

Generalizing Grady’s method to multi-class segmentation is not straightforward. One can apply the algorithm one label at a time and combine the binary outputs into a multi-class segmentation, but this approach has three disadvantages. First, it requires handcrafting rules for handling conflicts (overlaps, holes) when merging the segmentations. Second, it is suboptimal in terms of the cost function. And third, the quality of the segmentation can in some cases be considerably worse than that of an algorithm that jointly computes the surfaces: e.g., high-contrast ROIs can help the segmentation of neighboring lower-contrast ROIs.

Here we present a generalization of Grady’s algorithm to multiple ROIs. The method computes surfaces simultaneously for all labels while ensuring that no holes or overlaps are produced, and also inherits the ability to handle topological changes in the surfaces. We show that, despite the coupling between the surfaces, the globally optimal solution can still be obtained by solving a linear (rather than integer) program, which can be done in practical times with existing techniques.

2 Methods

2.1 Continuous Formulation of Joint Surface Fitting

In the continuous domain, the problem of joint surface fitting can be framed as a constrained optimization problem, illustrated in Fig. 1. Given a cuboid-shaped 3D image domain \(\varOmega \), let \(\{\mathcal {U}_l: l=1,\ldots ,L\}\) represent user-provided segmentations for L different ROIs (including the background), defined on one or more planes. In our target application, there would be two parallel planes, but this does not need to be the case in our formulation. Let \(\{\mathcal {R}_l = \delta \mathcal {U}_l: l=1,\ldots ,L\}\) represent the boundaries of the segmentations; note that each \(\mathcal {R}_l\) can represent more than one contour, e.g., contours on different planes, as in Fig. 1. Finally, let \(\{\mathcal {Z}_l: l=1,\ldots ,L\}\) represent the L surfaces to fit; note that \(\mathcal {Z}_l\) is necessarily an open surface, since it has a non-empty boundary \(\mathcal {R}_l\). The problem is then:

$$\begin{aligned} \min _{\{\mathcal {Z}_l\}} \quad&\sum _{l=1}^L \left( \alpha \int _{\mathcal {Z}_l} dS + \int _{\mathcal {Z}_l} g_l(I,\hat{\varvec{n}}; \beta )\, dS \right) \\ \text {s.t.} \quad&\mathcal {B}(\mathcal {Z}_l) = \mathcal {R}_l, \quad \bigcup _{l=1}^L \mathrm {Vol}(\mathcal {Z}_l \cup \mathcal {U}_l) = \varOmega , \quad \mathrm {Vol}(\mathcal {Z}_{l_1} \cup \mathcal {U}_{l_1}) \cap \mathrm {Vol}(\mathcal {Z}_{l_2} \cup \mathcal {U}_{l_2}) = \emptyset \quad \forall l_1 \ne l_2, \nonumber \end{aligned}$$
(1)

where dS is the area element; \(\mathcal {B}\) is the boundary operator; \(\mathrm {Vol}(\mathcal {Z}_l \cup \mathcal {U}_l)\) is the 3D segmentation enclosed by \(\mathcal {Z}_l \cup \mathcal {U}_l\); \(\alpha ,\beta \) are scalar constants; I represents the image intensities; and \(\hat{\varvec{n}}\) is the surface normal at each point (pointing towards the inside of the volume). The first term in Eq. 1 is a regularizer that penalizes the total surface area with relative weight \(\alpha \). The second term encourages the surfaces to follow perpendicular image gradients, by means of a function \(g_l\) of the image intensities and the normal \(\hat{\varvec{n}}\), parametrized by \(\beta \). A frequent choice is:

$$\begin{aligned} g_l(I,\hat{\varvec{n}}; \beta ) = \exp (-\beta \Vert \nabla _{\hat{\varvec{n}}} I \Vert ^2 ), \end{aligned}$$
(2)

which we use in this study. Note that the cost of a surface is independent of its label l in Eq. 2, but this does not need to be the case in general.

Fig. 1.

We aim to find the open surfaces \(\{\mathcal {Z}_l\}\) that: (a) globally minimize the cost function in Eq. 1; (b) are constrained to have boundaries \(\{\mathcal {R}_l\}\) given by the contours of user-specified segmentations \(\{\mathcal {U}_l\}\); and (c) do not intersect or leave holes in between.

2.2 Discrete Formulation

We follow [15] to discretize the problem in Eq. 1, which enables us to compute a globally optimal solution. We assume that our image intensities are defined at the center of cuboid voxels. These centers correspond to the nodes of a 6-connected 3D lattice, which defines our primal graph (see Fig. 2). The (primal) nodes are connected by K primal edges \(\{e^p_k: k=1,\ldots ,K\}\), which have corresponding (label-dependent) weights \(\varvec{w}_l=[w_{l1}, \ldots , w_{lK}]^T\). In addition, we also consider the dual graph, in which nodes correspond to volumes, edges to facets, and vice versa. The volumes of the dual graph correspond to the cuboids of the image voxels, such that the dual facets \(\{f^d_k: k=1,\ldots ,K\}\) correspond to the faces of these cuboids (where our surfaces will be defined). The dual facets have a direct correspondence with the primal edges, and hence share the same weights \(w_{lk}\). Note that it is crucial to include each primal edge/dual facet twice in \(\{e^p_k\}\), \(\{f^d_k\}\) and \(\varvec{w}_l\), with opposite orientations. This is because we need to discern whether the segmented ROI is on one side of the dual facet or the other, following the convention that the surface normal points toward the inside of the enclosed volume. In a similar fashion, we also consider the M dual edges \(\{e^d_m: m=1,\ldots ,M\}\) with corresponding primal facets \(\{f^p_m: m=1,\ldots ,M\}\).

Fig. 2.

Duality in 3D, 6-connected lattice. The image intensities are defined on the primal nodes. The surfaces we consider in this paper are given by sets of dual facets, which are equivalent to the faces of the cuboids representing image voxels.

To represent each surface, we use a discrete function \(\varvec{z}_l=[z_{l1}, \ldots , z_{lK}]^T\) that encodes whether dual facet \(f^d_k\) is part of the \(l^\mathrm{{th}}\) surface (or, equivalently, whether primal edge \(e^p_k\) intersects the surface), and with what multiplicity. Hence, we are replacing each continuous surface \(\mathcal {Z}_l\) by a set of voxel faces represented by \(\varvec{z}_l\).

Cost Function: We discretize the cost function by replacing integrals by sums in Eq. 1. If dual facet \(f^d_k\) is part of the \(l^\mathrm{{th}}\) surface, its contribution to the cost is:

$$ w_{lk} = \text {Area}(f^d_k) \left( \alpha + \exp \left[ -\beta (I_{j_1}-I_{j_2})^2\right] \right) , $$

where \(I_{j_1}\) and \(I_{j_2}\) are the image intensities at the primal nodes defining the corresponding primal edge \(e^p_k\); and \(\text {Area}(f^d_k)\) is the area of facet \(f^d_k\), given by the voxel dimensions. In our case, due to the choice of \(g_l\) in Eq. 2, the weight is the same for both orientations of the edge, but this does not need to be the case in general. We have kept the weights \(\{\varvec{w}_l\}\) separated in the notation (even if they are the same for all ROIs) for practical reasons in the implementation of the algorithm (Sect. 3.2 below). The total cost to minimize is then given by:

$$\begin{aligned} \min _{\{\varvec{z}_l\}} \quad \sum _l \varvec{w}_{l}^T \varvec{z}_l = \sum _l \sum _k w_{lk} z_{lk}. \end{aligned}$$
(3)
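To make the weight computation concrete, the following numpy sketch (our own illustration; function and variable names are hypothetical, not from a released implementation) evaluates these facet weights for all dual facets perpendicular to one lattice axis:

```python
import numpy as np

def facet_weights(img, axis, voxel_size, alpha=1e-5, beta=0.005):
    """Weights for all dual facets perpendicular to `axis`.
    Each facet corresponds to a primal edge joining two adjacent
    voxels with intensities I_j1, I_j2, and costs
    Area * (alpha + exp(-beta * (I_j1 - I_j2)**2)).
    With the g_l of Eq. 2, the weight is label- and
    orientation-independent, so it is computed once."""
    i1 = np.take(img, range(img.shape[axis] - 1), axis=axis)
    i2 = np.take(img, range(1, img.shape[axis]), axis=axis)
    # facet area = product of the voxel dimensions along the other two axes
    area = np.prod(np.delete(voxel_size, axis))
    return area * (alpha + np.exp(-beta * (i1 - i2) ** 2))
```

In practice, each weight would then be duplicated for the two orientations of the facet, as required by the construction above.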

Constraints: We need to consider two sets of constraints. First, we must ensure that the boundaries of the surfaces correspond to the contours of the specified 2D segmentations. This can be achieved through the edge-facet incidence matrix of the dual graph \(\varvec{B}\), which yields the discrete 3D boundary operator:

$$\begin{aligned} B_{mk} = {\left\{ \begin{array}{ll} 1, &{} \text {if } e^d_m \text { borders } f^d_k \text { with coherent orientation,}\\ -1, &{} \text {if } e^d_m \text { borders } f^d_k \text { without coherent orientation,}\\ 0, &{} \text {otherwise},\\ \end{array}\right. } \end{aligned}$$
(4)

where orientation coherence is given by the right-hand rule. Next, we define the contour vectors \(\{\varvec{r}_l: l=1,\ldots ,L\}\), also on the dual graph, which represent the boundaries of the user-defined segmentations on one or more slices:

$$\begin{aligned} r_{lm} = {\left\{ \begin{array}{ll} 1, &{} \text {if } e^d_m \text { is on the l}^\mathrm{{th}}\text { contour with coherent orientation,}\\ -1, &{} \text {if } e^d_m \text { is on the l}^\mathrm{{th}}\text { contour without coherent orientation,}\\ 0, &{} \text {otherwise}.\\ \end{array}\right. } \end{aligned}$$
(5)

With these definitions of \(\varvec{B}\) and \(\varvec{r}_l\), we can write the constraints simply as:

$$\begin{aligned} \varvec{B} \varvec{z}_l = \varvec{r}_l, \quad \forall l. \end{aligned}$$
(6)

The second set of constraints specifies that the union of the enclosed volumes needs to be equal to the image domain \(\varOmega \), while their pairwise intersections must be empty. We define the matrix \(\varvec{C}\) and vector \(\varvec{t}\) as:

$$\begin{aligned} C_{kk'} = {\left\{ \begin{array}{ll} 1, &{} \text {if } k' = k, \\ -1, &{} \text {if } k' = \tilde{k}(k),\\ 0, &{} \text {otherwise},\\ \end{array}\right. } ,\quad t_k = {\left\{ \begin{array}{ll} 1, &{} \text {if } f^d_k \in \delta \varOmega _{[\mathrm {inwards}]}, \\ -1, &{} \text {if } f^d_k \in \delta \varOmega _{[\mathrm {outwards}]},\\ 0, &{} \text {otherwise},\\ \end{array}\right. } \end{aligned}$$
(7)

where \(\tilde{k}(k)\) is the index of facet \(f^d_{\tilde{k}}\) with opposite orientation to \(f^d_k\), whereas \(\delta \varOmega _{[\mathrm {inwards}]}\) and \(\delta \varOmega _{[\mathrm {outwards}]}\) are the sets of dual facets on the external boundary of \(\varOmega \) whose associated normals point towards the inside and outside of the image domain, respectively. We also define \(\varvec{C}^*=\left[ \varvec{C} \varvec{C} \cdots \varvec{C}\right] \), and \(\varvec{z}^* = [\varvec{z}_1^T \cdots \varvec{z}_L^T]^T\). Then, the constraints can be encoded in the following system of linear equations:

$$\begin{aligned} \varvec{C}^* \varvec{z}^* = \varvec{t}. \end{aligned}$$
(8)

Forcing \(\sum _{l=1}^L \left( z_{lk} - z_{l\tilde{k}} \right) \) in Eq. 8 to be zero inside \(\varOmega \) ensures that, if a facet is part of a surface on one side, it must also be part of another on the other side, thus avoiding holes and overlaps in the segmentation. Forcing the sum to be \(\pm 1\) (depending on the orientation) on the walls ensures that the whole image domain will be covered by the surfaces. In practice, including one of the orientations in \(\varvec{C}\) and \(\varvec{t}\) is sufficient, since swapping k and \(\tilde{k}\) yields the same constraints.
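The effect of Eq. 8 is easy to check numerically on small examples. In the sketch below (illustrative code of ours, with a hypothetical pairing array mapping each facet index to its oppositely oriented twin), a zero residual means no holes and no overlaps:

```python
import numpy as np

def coupling_residual(z, pair, t):
    """Residual of Eq. 8.
    z:    (L, K) array of facet multiplicities, one row per label
    pair: pair[k] = index of the facet with orientation opposite to k
    t:    +1/-1 on the domain walls (by orientation), 0 inside
    Returns sum_l (z[l, k] - z[l, pair[k]]) - t[k], which must vanish
    for a hole-free, overlap-free segmentation."""
    return (z - z[:, pair]).sum(axis=0) - t
```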

2.3 Integer Program and Linear Programming Relaxation

We can combine the cost function in Eq. 3 with the constraints defined in Eqs. 6 and 8 to define the following integer program (IP):

$$\begin{aligned} \min _{\varvec{z}^*} \quad&\varvec{W}^T \varvec{z}^* \nonumber \\ \text {s.t.} \quad&\varvec{A} \varvec{z}^* = \varvec{v} \\&\varvec{z}^* \succeq 0, \nonumber \\&\varvec{z}^* \in \mathbb {Z}^{L\times K}, \nonumber \end{aligned}$$
(9)

where we have defined:

$$\begin{aligned} \varvec{W}=\left[ \begin{array}{c} \varvec{w}_1\\ \varvec{w}_2\\ \vdots \\ \varvec{w}_L\\ \end{array}\right] ,\quad \varvec{A}=\left[ \begin{array}{cclc} \varvec{B}&{}0&{}\cdots &{}0\\ 0&{}\varvec{B}&{}\cdots &{}0\\ \vdots &{} \vdots &{}\ddots &{}\vdots \\ 0&{}0&{}\cdots &{}\varvec{B}\\ \hline &{}&{}\varvec{C}^*&{}\\ \end{array}\right] ,\quad \varvec{v}=\left[ \begin{array}{c} \varvec{r}_1\\ \varvec{r}_2\\ \vdots \\ \varvec{r}_L\\ \hline \varvec{t}\\ \end{array}\right] . \end{aligned}$$
(10)

Integer programming is notoriously NP-hard [19]. Therefore, solving (9) directly is impractical. However, it can be shown [20] that, if the equality constraint matrix \(\varvec{A}\) is totally unimodular (TU) and the equality constraint vector \(\varvec{v}\) is integer valued, then the linear programming (LP) relaxation of the IP is guaranteed to produce an integer solution, which is equal to the solution of the original IP (note that this condition is sufficient, but not necessary).

Lemma

The equality constraint matrix \(\varvec{A}\) in IP (9) is TU.

Proof

It can be shown [21] that a sufficient condition for TU is that the columns of the matrix add up to zero while all its elements are in the set \(\{-1,0,1\}\). Here, we first observe that the matrix \(\varvec{B}\) can be partitioned into \(\varvec{B}=[\varvec{B}_1 | \varvec{B}_2]\), such that \(\varvec{B}_1\) includes each dual facet only once – no matter with which of the two possible orientations. Then, if we sort the columns of \(\varvec{B}_2\) such that they follow the same facet order as \(\varvec{B}_1\), we have that \(\varvec{B}_2 = -\varvec{B}_1\), since the columns correspond to the same facets but with opposite edge orientations. Hence, the columns of \(\varvec{B}\) add up to the zero vector. The columns of \(\varvec{C}^*\) also add up to the zero vector, since each row contains L elements equal to 1 and L equal to \(-1\). Therefore, the columns of \(\varvec{A}\) also add up to zero (Eq. 10). Since all elements of \(\varvec{A}\) are in the set \(\{-1,0,1\}\) (from the definitions of \(\varvec{B}\) and \(\varvec{C}^*\) in Eqs. 4 and 7), then \(\varvec{A}\) is TU.
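For small matrices, total unimodularity can also be checked directly against its definition (every square submatrix has determinant in \(\{-1,0,1\}\)); a brute-force sketch of ours, exponential in the matrix size and intended only for illustration:

```python
import numpy as np
from itertools import combinations

def is_totally_unimodular(A):
    """Brute-force check of total unimodularity from the definition:
    every square submatrix has determinant in {-1, 0, 1}.
    Exponential cost -- only for small illustrative matrices."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    for size in range(1, min(m, n) + 1):
        for rows in combinations(range(m), size):
            for cols in combinations(range(n), size):
                d = round(np.linalg.det(A[np.ix_(rows, cols)]))
                if d not in (-1, 0, 1):
                    return False
    return True
```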

Since \(\varvec{A}\) is TU and \(\varvec{v}\) is integer (see the definitions of \(\varvec{r}_l\) and \(\varvec{t}\) in Eqs. 5 and 7), the following LP relaxation yields the same solution as (9):

$$\begin{aligned} \min _{\varvec{z}^*} \quad&\varvec{W}^T \varvec{z}^* \nonumber \\ \text {s.t.} \quad&\varvec{A} \varvec{z}^* = \varvec{v} \\&\varvec{z}^* \succeq 0, \nonumber \\&\varvec{z}^* \in \mathbb {R}^{L \times K}. \nonumber \end{aligned}$$
(11)

Linear programming has polynomial complexity, and efficient solvers exist to compute the globally optimal solution of (11) in practical times.
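The integrality guarantee can be observed on a toy problem of the same form as (11). The sketch below (a hypothetical three-node minimum-cost flow instance, not the paper's actual system) uses the TU node-arc incidence matrix of a small directed graph and SciPy's linprog; the LP relaxation returns an exactly integral solution:

```python
import numpy as np
from scipy.optimize import linprog

# Toy analogue of LP (11): min W^T z  s.t.  A z = v, z >= 0.
# A is the node-arc incidence matrix of a 3-node directed graph
# (arcs: a->b, a->c, b->c), which is totally unimodular.
A = np.array([[ 1,  1,  0],   # node a: supply +1 (source)
              [-1,  0,  1],   # node b: flow conservation
              [ 0, -1, -1]])  # node c: demand -1 (sink)
v = np.array([1, 0, -1])
W = np.array([1.0, 3.0, 1.0])  # detour a->b->c (cost 2) beats direct a->c (cost 3)

res = linprog(W, A_eq=A, b_eq=v, bounds=(0, None), method="highs")
# TU constraint matrix + integer right-hand side => integral LP optimum
```

Here the optimum routes one unit of flow along a->b->c, and `res.x` is the integer vector [1, 0, 1] even though integrality was never imposed.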

3 Experiments and Results

3.1 Data

We used two publicly available datasets in our experiments. The first dataset consists of 35 T1-weighted brain MRI scans from the 2013 MICCAI SATA challenge. The images were acquired on a 3 T scanner with an MP-RAGE sequence at 1 mm isotropic resolution. Fourteen structures were labeled by experts in the coronal plane: the left and right amygdala, caudate, accumbens, hippocampus, putamen, thalamus and pallidum. We refer to these data as the “brain dataset”.

The second dataset consists of five ex vivo MRI scans of individual human hippocampi (3 right, 3 left) [22]. The scans were acquired on a 9.4 T scanner with a multi-spin echo sequence at \(0.2\times 0.2\times 0.3\) mm resolution (coronal). Five hippocampal subfields were manually delineated on the images by an expert: CA1; CA2 and CA3 (CA23); the hilus of the dentate gyrus (DG:H); the stratum radiatum, stratum lacunosum-moleculare and vestigial hippocampal sulcus (SR+SLM+VHS); and the stratum moleculare of the dentate gyrus (DG:SM). The manual segmentations were mostly made in the coronal plane, with verification in the sagittal plane. We refer to these images as the “hippocampal dataset”.

3.2 Experimental Setup

We evaluated the performance of our algorithm as a function of \(n_\mathrm{{skip}}\), the number of unlabeled slices between each pair of labeled slices. As a baseline approach, we applied Grady’s algorithm to each label independently, and then merged the resulting L binary segmentations into a multi-label volume – if two segmentations overlapped in a given region, we selected the label that gave the lowest cost (Eq. 3).
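The baseline's overlap rule can be sketched as follows (a minimal numpy illustration of ours with hypothetical inputs; the per-label costs are the totals from Eq. 3, and the actual implementation may differ):

```python
import numpy as np

def merge_binary_segmentations(masks, costs):
    """Merge L binary outputs into one multi-label volume.
    masks: (L, ...) boolean outputs of the per-label runs
    costs: (L,) total surface cost of each binary solution (Eq. 3)
    Overlaps go to the label whose surface had the lowest cost;
    voxels claimed by no label stay 0 (background)."""
    order = np.argsort(costs)[::-1]      # paint cheapest label last, so it wins
    merged = np.zeros(masks.shape[1:], dtype=np.int32)
    for l in order:
        merged[masks[l]] = l + 1         # labels are 1..L; 0 = unclaimed
    return merged
```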

For each value of \(n_\mathrm{{skip}}\), we randomly sampled 10 and 50 stacks of \(n_\mathrm{{skip}}+2\) coronal slices from each scan of the brain and hippocampal datasets, respectively. This yielded 600 test volumes (350 brain, 250 hippocampal), in which the segmentation was assumed to be known for the first and last slices. We then used the two competing methods to segment the rest of the slices in each volume, and measured the overlap with the ground truth using Dice scores. For simplicity of presentation, we merged the results for the left and right sides of each ROI.
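For reference, the Dice score between a predicted and a ground-truth binary mask is \(2|A \cap B| / (|A| + |B|)\); a minimal implementation:

```python
import numpy as np

def dice_score(pred, truth):
    """Dice overlap between two binary masks: 2|A∩B| / (|A| + |B|).
    Returns 1.0 by convention when both masks are empty."""
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    denom = pred.sum() + truth.sum()
    return 2.0 * np.logical_and(pred, truth).sum() / denom if denom else 1.0
```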

Implementation Details: We min-max normalized the images to [0, 255], and manually tuned the parameters (on a separate brain MRI dataset) to \(\beta =0.005\) and \(\alpha =10^{-5}/n_\mathrm{{skip}}\). We solved the LP (11) with Gurobi 7.0 (www.gurobi.com). To handle the boundaries \(\delta \varOmega \) for which no segmentation was specified by the user (e.g., the “walls” completing the surface of the cuboid between two labeled slices), we assumed that the background ROI surrounded all others (which was the case in all our experiments) and set the costs \(w_{lk}\) of the corresponding surface facets (with the normal facing inwards) to a large negative constant:

$$ \varvec{W}=\left[ \begin{array}{c} \tilde{\varvec{w}}_1\\ \vdots \\ \tilde{\varvec{w}}_L\\ \end{array}\right] , \text { where } \tilde{w}_{lk} = {\left\{ \begin{array}{ll} -\infty , &{} \text {if } l = \text {background, } f^d_k \in \delta \varOmega _{[\mathrm {inwards}]} \text { and } f^d_k \\ &{} \text {does not correspond to a segmented voxel}\\ w_{lk}, &{} \text {otherwise}. \end{array}\right. } $$

This setup also enabled us to drop the rows corresponding to the dual facets on \(\delta \varOmega \) from the \(\varvec{C}\) matrix (Eq. 7), speeding up convergence of the solver in practice.

Fig. 3.

Brain dataset: Dice score as a function of the number of unlabeled slices between manually segmented slices for each brain structure, as well as average across structures.

Fig. 4.

Hippocampal dataset: Dice score as a function of the number of unlabeled slices between manually segmented slices for each subfield, as well as average across subfields.

3.3 Results

Figures 3 and 4 show the results for the brain and hippocampal datasets, respectively, whereas Figs. 5 and 6 display sample outputs for the two competing algorithms. In the brain dataset, the proposed method exploits the neighboring relationships between the structures in order to outperform the baseline algorithm for every ROI. The differences between the two methods become larger as the separation \(n_\mathrm{{skip}}\) grows, averaging 5% Dice at \(n_\mathrm{{skip}}=5\). The gap is particularly large (>11% at \(n_\mathrm{{skip}}=5\)) for the pallidum; this structure is often heavily undersegmented when processed on its own, but frequently emerges when segmented jointly with the neighboring, high-contrast putamen (as in Fig. 5).

Fig. 5.

Sample segmentations from brain dataset with \(n_\mathrm{{skip}}=4\). Green is hippocampus, orange is putamen, blue is pallidum, dark red is thalamus, brown is caudate, and purple is amygdala. Note the missing pallidum in the baseline approach. (Color figure online)

Fig. 6.

Sample segmentations from the hippocampal dataset with \(n_\mathrm{{skip}}=4\). Blue is CA1, white is CA23, red is DG:H, violet is DG:SM and pink is SR+SLM+VHS. Note the missing DG:SM in the baseline method. (Color figure online)

The results are similar in the hippocampal dataset. For the larger, high-contrast CA1, both methods produce almost identical results. However, for the internal, lower-contrast subfields, the proposed method outperforms the baseline, averaging \(4\%\) higher Dice by \(n_\mathrm{{skip}}=5\). The difference is largest for the stratum moleculare (DG:SM) and hilus (DG:H) of the dentate gyrus, which share a boundary that is practically invisible in the images (see example in Fig. 6).

In absolute terms, the proposed algorithm yields satisfactory results (Dice above 85–90%) for many structures all the way to \(n_\mathrm{{skip}}=5\) (hippocampus, putamen, thalamus, CA1, CA23, DG:H), and for most structures at \(n_\mathrm{{skip}} \le 3\). Exceptions are ROIs with faint boundaries and poor reliability (e.g., amygdala, SR+SLM+VHS). However, even low values of \(n_\mathrm{{skip}}\) can be very useful in practice, since they can save \(100 \times n_\mathrm{{skip}}/(1+n_\mathrm{{skip}})\) percent of manual labeling effort.
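The quoted savings follow directly from labeling one slice out of every \(n_\mathrm{{skip}}+1\); for instance:

```python
def labeling_savings_percent(n_skip):
    """Percentage of slices that need no manual tracing when one
    slice in every (n_skip + 1) is labeled: 100 * n_skip / (1 + n_skip)."""
    return 100.0 * n_skip / (1.0 + n_skip)
```

Even \(n_\mathrm{{skip}}=3\) already eliminates 75% of the slices to be traced.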

4 Conclusion

We have presented a semi-automated segmentation method that can compute a globally optimal set of discrete coupled surfaces, whose boundaries are specified by the contours of user-provided delineations on one or more (typically parallel) slices. The results have shown that the method outperforms the application of the binary version to each ROI independently.

The proposed method can easily be made interactive: if the segmentation is not satisfactory, feedback can be provided to correct the output. More specifically, the user can mark an oriented boundary between two voxels, specifying which label l should be found on a given side of it. Then, the weight of the facet at hand \(w_{lk}\) (with the appropriate orientation) can be set to a large, negative constant. This procedure effectively forces the surfaces to go through the specified points when the LP in (11) is solved to update the segmentation – since the solver is guaranteed to yield the global optimum.
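In terms of the stacked weight vector of (9), such a correction is a one-entry edit; a hypothetical sketch (our own code, using a large finite penalty in place of \(-\infty \) and assuming, for illustration, that the weights are laid out as L contiguous blocks of K entries):

```python
import numpy as np

def pin_boundary(W, k, l, K, penalty=-1e6):
    """Interactive correction (sketch): attract the surface of label l
    to the oriented dual facet k by making that facet's cost strongly
    negative. W is the stacked weight vector, L blocks of K entries."""
    W = W.copy()
    W[l * K + k] = penalty   # large negative cost forces the surface through
    return W
```

Re-solving the LP with the edited weights then updates the segmentation around the correction.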

In addition to the difficulty in handling surfaces with high curvature (inherited from [15]), the main limitation of the method is its computational requirements. The null-space of \(\varvec{A}\) in Eq. 10 does not yield a concatenation of volume-facet incidence matrices, due to the coupling terms introduced by Eq. 8. This prevents rewriting the optimization as a minimum-cost flow problem, which can be solved more efficiently [15]. We were able to solve the LP relaxation relatively quickly in our experiments (minutes in the worst cases), but computation times grow quickly with image size. This is not a problem when “interpolating” segmentations across slices (a task that can be carried out offline), but it is a limiting factor for the interactive extension of the algorithm discussed above.

Future work will follow four directions. First, we will investigate ways of further simplifying the LP relaxation (Eq. 11). If we identify the graph for which \(\varvec{A}\) is the incidence matrix, we can devise a faster optimization algorithm based on its dual graph. Second, we will implement and validate the interactive version of the algorithm. Third, we will evaluate the algorithm more extensively, including experiments with multiple orientations (e.g., 2.5D) and comparisons with other methods (e.g., alpha-expansion [11]). And fourth, we will explore other possible definitions of \(g_l\) (Eq. 2); while a simple exponential captures most visible edges (relying on the regularizer around ill-defined boundaries), ROI-specific weights computed with supervised edge detectors should yield higher performance.

As the amounts of available imaging data and their resolution grow, and as the amount of labeled data required to train state-of-the-art supervised algorithms increases, we believe that semi-automated approaches like the method proposed in this paper will become increasingly important in medical image analysis.