
1 Introduction

Detection and segmentation of individual cell nuclei is a standard procedure in many biomedical applications, as it is often essential for the quantitative analysis of biological processes. Due to the increasing amount of accumulated image data, manual labeling is usually not a viable option in practice. The problem is further aggravated by the difficulty of segmenting three-dimensional data as well as by closely juxtaposed or touching cell nuclei, which may form densely packed clusters that can be difficult to separate even for experts. Despite the large body of work and vast methodological diversity (see Sect. 2), no general approach for tackling the segmentation of cell nuclei seems to be available, which is likely due to the variation among different cell types regarding shape and size, as well as the diverse characteristics of the available imaging techniques.

Our images have been obtained using ultramicroscopy, which is based on the principle of light sheet fluorescence microscopy. In this imaging technique the specimen, in our case optically cleared murine fetal tissue, is illuminated by a thin sheet of light, which may be generated by a scanned laser beam or cylinder lenses [12]. This light sheet is projected into the focal plane of an orthogonally oriented detection objective which collects fluorescence signals from the specimen. Moving the sample through the light sheet in a stepwise fashion makes it possible to generate image stacks from which high-resolution 3D data sets can be reconstructed [17]. In contrast to confocal laser scanning microscopes, the light sheet setup uncouples the illumination from the detection light path, which allows nearly isotropic resolution while keeping the illumination intensity of the sample minimal [22]. Light sheet microscopes can be realized with low magnification lenses, allowing the inspection of large samples with volumes of up to 1 cm\(^3\) while still providing subcellular resolution. In our samples, mouse embryo wholemount preparations were immunostained using an antiserum against a homeobox transcription factor to identify expressing cells. For subsequent light sheet imaging the specimens were optically cleared using the BABB protocol [26].

Several challenges arise in the segmentation of cell nuclei in ultramicroscopy image data which are due to the properties of the imaging modality and the biological samples. The images often suffer from a large variation of fluorescence intensity both in the labeled objects as well as the background (see Fig. 1(a)), which prohibits the application of global thresholding approaches, especially since the background intensity in dense regions is higher than foreground intensities in other regions. Moreover, foreground objects usually have fuzzy or blurry contours. One of the largest challenges, however, is the separation of touching cell nuclei of seemingly arbitrary orientation which form dense clusters extending in all directions within the three-dimensional data (see Fig. 1(b)).

Fig. 1. Example images of immunostained cell nuclei in ultramicroscopy images.

Our main contribution is a pipeline specifically designed for ultramicroscopy images which combines machine learning and watershed segmentation and integrates a geometric marker extension into the cluster splitting step.

In the remainder of this paper, we first give an overview of existing methods for three-dimensional segmentation of clustered cell nuclei, followed by the workflow of our method. Next, we focus on the main part of our algorithm, the cluster splitting step, and subsequently evaluate the performance of our algorithm.

2 Related Work

Cell nuclei segmentation and quantification is one of the fundamental problems in biomedical image processing. The large diversity of imaging modalities and cell types as well as the complexity of the task have led to a large body of work and a tremendous variety of approaches for more than ten years.

One of the most basic methods for segmentation is thresholding which is usually based on the intensity histogram of the image. An overview of thresholding techniques is given by Sezgin and Sankur [19]. Often, thresholding is applied to obtain an initial foreground segmentation which is followed by further refinement and cluster splitting computations. Lin et al. [10] have proposed an algorithm which relies on thresholding followed by several post-processing steps. Afterwards, they apply a 3D watershed segmentation on the gradient-weighted distance transform of the thresholded image. To compensate for over-segmentation of the cell nuclei clusters, this is followed by a model-based merging step relying on a priori knowledge about the size and shape of the cell nuclei. Wählby et al. [24] have also proposed a method which is based on watershed segmentation. Initially, they detect foreground seeds in the original image by applying a h-maxima transform. The watershed segmentation is then performed on the gradient magnitude image of the data. To improve the cluster separation results, they perform an additional watershed segmentation step on the distance transform of the previous result. Gertych et al. [7] have recently proposed an algorithm which combines a 3D radial symmetry transform for seed detection with a watershed-based segmentation after an initial background removal step.

Another approach for cell nuclei segmentation is the application of machine learning techniques. Fehr et al. [5] have proposed a method for segmentation and classification of cell nuclei in 3D fluorescence microscopy data which performs a voxel-wise classification on gray scale invariant features using a support vector machine (SVM). Tek et al. [23] have applied an interactively trained random forest classifier to perform a voxel-wise foreground segmentation, which is afterwards improved by morphological post-processing and the application of a size filter for the connected components.

Another class of cell nuclei segmentation methods is based on graph cuts. Daněk et al. [4] have proposed a two-stage graph cut method which performs the foreground segmentation using a two-terminal graph cut where initial seed labels are determined using histogram analysis. Afterwards, each connected component of the binary image is processed separately using a multi-terminal graph cut where seeds are identified by computing local maxima in the distance transform. Al-Kofahi et al. [2] have proposed an approach which utilizes a graph cut relying on initial foreground class probabilities computed by bimodal histogram analysis to perform the foreground segmentation. Afterwards, for each connected component in the binary image a multi-scale Laplacian-of-Gaussian (LoG) filter is applied to obtain seed points for the individual nuclei and perform a clustering step. The result of this step is then refined using a multi-terminal graph cut.

There are also several methods for cell nuclei segmentation that are based on level set active contours. Harder et al. [8] have proposed an approach which initializes the contours using local adaptive thresholding followed by a distance transform and a 3D watershed segmentation. Bergeest and Rohr [3] have proposed a method based on minimizing a convex energy functional using a level set representation and applying the 3D Laplacian operator for the cluster splitting.

The problem of separating densely packed cell nuclei has also been addressed using geometric approaches, usually based on a convexity-concavity analysis of the binary image obtained by applying one of the various foreground segmentation methods. Such approaches rely on the fact that cell nuclei, although not necessarily round, can be considered approximately convex. However, often these methods are based on analyzing two-dimensional contours to find concave splitting points (e.g., [16, 25]) and are not directly applicable to three-dimensional images. Indhumathi et al. [9] have applied this principle to 3D images by performing the concavity analysis and detection of splitting paths slice-wise while taking into account the relationship with the previous slice as a reference. Mathew et al. [11] have proposed a different approach for three-dimensional data which performs an approximate convex decomposition of a cluster of cell nuclei. Their method detects local maxima in the distance transform to establish the number k of components in the cluster. Afterwards, a set of line segments between sampled surface points is defined and Lines of Sight (LoS), i.e., line segments which do not leave the foreground component, are selected from this set. The LoS are then clustered into k clusters which are used to label the surface points. Afterwards, the label of each voxel is chosen by majority voting of the nearest surface points.

In our proposed method, we combine machine learning, in the form of an interactively trained voxel classifier that yields a binary image, with a multi-step watershed approach consisting of a seed detection step, a geometric component based on the distance transform of the binary image, and a marker-based watershed segmentation step that computes the final label for each individual voxel.

3 Workflow Overview

Our approach uses a well-established workflow consisting of three basic steps. First, foreground segmentation is performed to obtain a binary image. Next, a connected component labeling is computed on the binary image. In the last step, the cell nuclei cluster splitting is applied separately for each component.

As outlined above, global thresholding does not perform well on ultramicroscopy images due to the intensity inhomogeneities in the fluorescence signal. Moreover, the histograms of our images do not show a bimodal distribution. This prohibits the use of several methods which model the histogram as a mixture of Gaussian or Poisson distributions, such as [4] or [2], to conveniently initialize more sophisticated segmentation algorithms like graph cuts. Instead, we perform the foreground segmentation step using voxel classification by interactively training a random forest classifier, which has been shown to be among the classifiers with the best overall performance [6]. For the interactive training, we use the open source tool Ilastik [21], which has already been applied in a similar way (although on different data) by Tek et al. [23]. We set up a binary (foreground vs. background) classification problem and interactively label voxels in a training data set while inspecting the intermediate results. Since Ilastik provides the option to visualize regions of high uncertainty in the classification, an expert performing the labeling can directly locate such regions of interest in the training data. After finishing the interactive training phase, the classifier can be used for batch processing of the actual data sets. We use a set of rotation-invariant features (smoothed raw data, gradient magnitude, and Hessian eigenvalues) at three different Gaussian scales with \(\sigma _1 = 1.0\), \(\sigma _2 = 1.6\), and \(\sigma _3 = 3.5\). An example of the foreground segmentation is depicted in Fig. 2.
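The following sketch illustrates how such a voxel-wise foreground classification could be set up outside of Ilastik, using an approximately equivalent feature set (smoothed intensity, gradient magnitude, and Hessian eigenvalues at the three Gaussian scales) and a random forest from scikit-learn. The function names and the sparse-annotation interface are illustrative assumptions, not part of the original pipeline.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import hessian_matrix, hessian_matrix_eigvals
from sklearn.ensemble import RandomForestClassifier

SIGMAS = (1.0, 1.6, 3.5)  # Gaussian scales used for all features

def voxel_features(volume):
    """Stack rotation-invariant per-voxel features for a 3D volume."""
    vol = volume.astype(np.float32)
    feats = []
    for sigma in SIGMAS:
        feats.append(ndi.gaussian_filter(vol, sigma))                   # smoothed raw data
        feats.append(ndi.gaussian_gradient_magnitude(vol, sigma))       # gradient magnitude
        H = hessian_matrix(vol, sigma=sigma, use_gaussian_derivatives=True)
        feats.extend(hessian_matrix_eigvals(H))                         # three Hessian eigenvalues
    return np.stack(feats, axis=-1).reshape(-1, len(feats))             # (n_voxels, n_features)

def train_and_predict(train_volume, annotations, test_volume):
    """annotations: 0 = unlabeled, 1 = background, 2 = foreground (sparse user labels)."""
    X, y = voxel_features(train_volume), annotations.ravel()
    clf = RandomForestClassifier(n_estimators=100, n_jobs=-1)
    clf.fit(X[y > 0], y[y > 0])                                          # train only on annotated voxels
    prediction = clf.predict(voxel_features(test_volume))
    return (prediction == 2).reshape(test_volume.shape)                  # binary foreground mask
```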

Fig. 2. 3D rendering example of the foreground segmentation.

After the foreground segmentation, we apply a median filter to the binary image as a post-processing step. Then, we perform a connected component analysis on the result. We discard components below a size threshold of 200 voxels, which might originate from staining artifacts or false positives output by the voxel classifier. The choice of this threshold is based on the size distribution of cell nuclei derived from a test data set which has been labeled voxel-wise by a domain expert (see Sect. 5). Each of the resulting connected components corresponds to either a single cell nucleus or a cluster of touching cell nuclei. The cluster splitting is then computed separately for each of the connected components, as outlined in detail in the following section.
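A minimal sketch of this post-processing with SciPy, assuming a single median filtering pass and 26-connectivity for the component labeling (the concrete function and variable names are illustrative):

```python
import numpy as np
from scipy import ndimage as ndi

MIN_COMPONENT_SIZE = 200  # voxels, derived from the expert-labeled size distribution

def extract_components(foreground_mask, min_size=MIN_COMPONENT_SIZE):
    """Median-filter a 3D binary mask, label its connected components
    (26-connectivity), and discard components smaller than min_size voxels."""
    cleaned = ndi.median_filter(foreground_mask.astype(np.uint8), size=3)
    labels, n = ndi.label(cleaned, structure=np.ones((3, 3, 3)))
    sizes = ndi.sum(cleaned, labels, index=np.arange(1, n + 1))   # component volumes
    keep = np.flatnonzero(sizes >= min_size) + 1                  # component ids to keep
    labels[~np.isin(labels, keep)] = 0
    return labels  # 0 = background, >0 = single nucleus or cluster of nuclei
```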

4 Cluster Splitting

Our cluster splitting consists of three basic steps. First, we detect seeds which correspond to local maxima in the fluorescence intensities within the cluster. In the next step, the seeds are extended geometrically to spheres before labeling the remaining voxels of the cluster using a marker-based watershed segmentation. The complete workflow of our proposed cluster splitting method is depicted in Fig. 3. In the following, the three key steps will be discussed in detail.

Fig. 3. Workflow of our proposed cluster splitting approach. The original image and the binary image obtained from foreground segmentation are depicted using the stippled boxes. The key steps of our method are indicated by the double borders of the boxes.

4.1 Seed Detection

In the first step, seeds for the individual nuclei in the cluster are detected. Due to the (approximate) convexity of cell nuclei, this is often done by finding local maxima in the distance transform of the binary image. Unfortunately, this method did not perform well on our images, due to the relatively small radius of the nuclei as well as the blurry contours of the foreground, which may significantly influence the result of the foreground segmentation and thus alter the convexity and concavity properties of the cluster. Moreover, the clusters are often so dense that convex subsets cannot be identified anymore.

Since fluorescence intensity in our images increases towards the centers of cell nuclei, we detect local maxima in the original image intensities instead of applying the distance transform. After smoothing the image with a 3D Gaussian filter kernel, we apply watershed by rainfall simulation [15] to the three-dimensional image. This method is based on the idea of rain falling on terrain and flowing along the path of steepest descent until reaching a catchment basin corresponding to a local minimum. In our case, we invert the flow direction of the algorithm. For each voxel in the smoothed image masked by the foreground segmentation, we start a path which will always flow to the neighbor voxel with the highest intensity value of its three-dimensional 26-neighborhood until reaching a local maximum (see Fig. 4(a)). Each of the detected local maxima is then considered a seed point. In contrast to the common application of this procedure, we do not label any of the start voxels to create a segmentation, but only use the method for detecting the local maxima. Moreover, for each maximum m we can easily compute the number \(s_m\) of start voxels with a flow path ending in m. This information can then be used to remove m from the list of seeds if \(s_m\) is below a threshold \(t_s\), e.g., if the local maximum results from an artifact in the foreground segmentation. Since the local maxima correspond to the centers of cell nuclei, the number k of seed points corresponds to the number of detected cell nuclei within the cluster.
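The seed detection can be sketched as a straightforward hill-climbing procedure, written here for clarity rather than speed (in practice the paths can be memoized so that each voxel is visited only once). The function and variable names are illustrative:

```python
import numpy as np
from itertools import product

# All 26 neighbor offsets in 3D
OFFSETS = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]

def detect_seeds(smoothed, mask, t_s=1):
    """Hill-climb from every foreground voxel to its local intensity maximum and
    return the maxima reached by at least t_s start voxels as seed points."""
    counts = {}                                   # maximum voxel -> number of paths ending there
    shape = mask.shape
    for z, y, x in zip(*np.nonzero(mask)):
        while True:
            best, best_val = (z, y, x), smoothed[z, y, x]
            for dz, dy, dx in OFFSETS:
                nz, ny, nx = z + dz, y + dy, x + dx
                if (0 <= nz < shape[0] and 0 <= ny < shape[1] and 0 <= nx < shape[2]
                        and mask[nz, ny, nx] and smoothed[nz, ny, nx] > best_val):
                    best, best_val = (nz, ny, nx), smoothed[nz, ny, nx]
            if best == (z, y, x):                 # local maximum reached
                counts[best] = counts.get(best, 0) + 1
                break
            z, y, x = best                        # continue along the path of steepest ascent
    return [seed for seed, s_m in counts.items() if s_m >= t_s]
```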

Fig. 4. Schematic examples of the seed detection and marker extension steps.

4.2 Marker Extension

For each seed point \(v_i, 0 \le i < k\) detected in the previous step, the label i should be assigned to all of the voxels within the cluster which belong to the corresponding cell nucleus. Before assigning all of the voxels to a seed, we perform a geometric marker extension which is based on the fact that cell nuclei are approximately convex. For each seed \(v_i\), we compute a sphere \(c_i\) with center \(v_i\) which contains voxels that can unquestionably be assigned the label i. The radius \(r_i\) of the sphere \(c_i\) is given by

$$\begin{aligned} r_i = \min \left\{ \delta (v_i),\ \frac{1}{2} \min _{\substack{0 \le j < k \\ j \ne i}} d(v_i,v_j) \right\} , \qquad 0 \le i < k, \end{aligned}$$
(1)

where \(\delta (v_i)\) is the Euclidean distance of \(v_i\) to the background, and \(d(v_i,v_j)\) is the Euclidean distance between \(v_i\) and \(v_j\). For the first term, we compute the distance transform of the binary image obtained in the foreground segmentation and sample it at \(v_i\) to obtain \(\delta (v_i)\). The second term is a constraint which accounts for the fact that, depending on the density of the cluster and the orientation of the nuclei, spheres whose radius is given solely by the distance to the background might overlap; in such cases, the radius is restricted to half of the distance of \(v_i\) to its nearest neighbor in the set of seed points. Figure 4(b) illustrates an example of the marker extension computation. All of the voxels within the sphere \(c_i\) are assigned the label i and act as an initialization for the next step.
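A minimal sketch of this marker extension, evaluating Eq. (1) with the Euclidean distance transform and pairwise seed distances (function names are illustrative):

```python
import numpy as np
from scipy import ndimage as ndi
from scipy.spatial.distance import cdist

def extend_markers(seeds, cluster_mask):
    """Return a label volume where the sphere around seed i carries label i + 1."""
    seeds = np.asarray(seeds, dtype=float)                     # (k, 3) voxel coordinates
    delta = ndi.distance_transform_edt(cluster_mask)           # distance to the background
    pair = cdist(seeds, seeds)
    np.fill_diagonal(pair, np.inf)                             # ignore d(v_i, v_i)
    radii = np.minimum(delta[tuple(seeds.astype(int).T)],      # delta(v_i)
                       0.5 * pair.min(axis=1))                 # half the nearest-seed distance
    markers = np.zeros(cluster_mask.shape, dtype=np.int32)
    zz, yy, xx = np.indices(cluster_mask.shape)
    for i, ((sz, sy, sx), r) in enumerate(zip(seeds, radii), start=1):
        sphere = (zz - sz) ** 2 + (yy - sy) ** 2 + (xx - sx) ** 2 <= r ** 2
        markers[sphere & cluster_mask] = i
    return markers
```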

4.3 Marker-Based Watershed

Using the spheres computed in the previous step, a marker-controlled watershed segmentation [20] is applied to label all of the remaining voxels in the cluster. This method floods the image from the markers (i.e., the spheres which have already been labeled) using the intensity values of the smoothed image. In each step, we select the voxel with the highest intensity value from the set of unlabeled voxels and assign a neighboring label to it. When viewed topologically, the markers derived from the local maxima correspond to hills from which the labels flow down, while the boundaries between the labels (i.e., between cell nuclei) correspond to basins. Some examples of the cluster splitting results are shown in Fig. 5.
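The final flooding step can be realized with the marker-controlled watershed from scikit-image, applied to the inverted smoothed intensities and restricted to the cluster mask; this is a stand-in sketch for the flooding procedure described above, not the original implementation.

```python
from skimage.segmentation import watershed

def label_cluster(smoothed, cluster_mask, markers):
    # The watershed floods from low to high values, so the intensities are inverted:
    # the bright nucleus centers act as the "hills" from which the labels flow down,
    # with the dim boundaries between nuclei as basins.
    return watershed(-smoothed, markers=markers, mask=cluster_mask)

# Putting the sketches together for one connected component (illustrative):
# seeds   = detect_seeds(smoothed, cluster_mask, t_s=1)
# markers = extend_markers(seeds, cluster_mask)
# labels  = label_cluster(smoothed, cluster_mask, markers)
```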

Fig. 5. 3D examples of cluster splitting results. Left: original image with silhouettes of the foreground segmentation, right: resulting segmentation.

5 Performance Evaluation

We have evaluated the performance of our segmentation approach using two reference data sets which have been manually labeled by domain experts, an effort that took several weeks. The first data set with a size of \(195\times 168 \times 129\) voxels containing 139 cell nuclei has been labeled voxel-wise, while in the second data set with a size of \(425 \times 276 \times 129\) voxels containing 190 cell nuclei the centroids of the nuclei have been marked. All of the data sets are 16-bit images and have a resolution of 1 \(\mu \)m in the \(x\)-, \(y\)-, and \(z\)-directions. In all tests, the Gaussian smoothing of the original data in the cluster splitting step has been performed using a \(5\times 5 \times 5\) filter kernel with \(\sigma = 0.6\). For the post-processing step of the foreground segmentation, two successive median filter operations using a \(3\times 3\times 3\) kernel have been used. The threshold \(t_s\) in the seed detection has been set to 1 so that no filtering of local maxima was applied.

From the voxel-wise labeled cell nuclei we have calculated the mean volume in voxels as well as the standard deviation. We have only incorporated nuclei which do not contain any border voxels to avoid cases where nuclei are only partially visible. Additionally, we have excluded one outlier with a volume of 3339 voxels which had been identified to include a lot of background at its boundary. From the remaining 105 nuclei we have calculated the mean volume of 1094 voxels and the standard deviation of 533 voxels. Since the smallest cell nucleus in the manual labeling had a volume of 266 voxels, we conservatively chose a size threshold of 200 voxels for the connected components in the foreground segmentation.

Unfortunately, no definitive ground truth can be generated for the data, since even manual segmentation is a very time-consuming and error-prone task due to the three-dimensional nature of the images, a certain amount of ambiguity, and the overall complexity of the cluster splitting task, especially for a voxel-wise labeling. Outliers in the size distribution of the nuclei might thus indicate cases of over- and under-segmentation in the manual labeling.

For the quantitative performance evaluation we adopt the method of [11], which is based on the well-established metrics Recall, Precision, Accuracy, and F-measure. To calculate the metrics, we match the centroids of the cell nuclei in the manual reference segmentation (RS) to the centroids computed by the automated segmentation (AS). For each centroid in the RS, a spherical neighborhood with a radius of 9 voxels is considered. If a single centroid of the AS is found within this radius, it is counted as a true positive (TP) and removed from the list. For multiple matches, the closest one is counted as the TP. Based on these matches, the false negatives (FN), i.e., centroids in the RS for which no match was found, and the false positives (FP), i.e., centroids in the AS which have not been selected as a TP, are computed. The four metrics can then be calculated as follows:

$$\begin{aligned} \text{Recall}&= \frac{\text{TP}}{\text{TP} + \text{FN}},&\text{Precision}&= \frac{\text{TP}}{\text{TP} + \text{FP}},\\ \text{F-measure}&= 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}},&\text{Accuracy}&= \frac{\text{TP}}{\text{TP} + \text{FN} + \text{FP}}. \end{aligned}$$
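A minimal sketch of this centroid matching and metric computation, using a greedy closest-match strategy within the 9-voxel radius (function names and the exact tie-breaking are illustrative assumptions):

```python
import numpy as np
from scipy.spatial.distance import cdist

def evaluate(ref_centroids, auto_centroids, radius=9.0):
    """Match each reference centroid to the closest unmatched automated centroid
    within `radius` voxels and compute Recall, Precision, F-measure, Accuracy."""
    dist = cdist(np.asarray(ref_centroids, float), np.asarray(auto_centroids, float))
    matched, tp = set(), 0
    for i in range(dist.shape[0]):
        for j in np.argsort(dist[i]):             # candidates from closest to farthest
            if dist[i, j] > radius:
                break
            if j not in matched:                  # closest unmatched AS centroid within radius
                matched.add(j)
                tp += 1
                break
    fn = dist.shape[0] - tp                       # unmatched reference centroids
    fp = dist.shape[1] - tp                       # automated centroids never selected as a TP
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = tp / (tp + fn + fp)
    return recall, precision, f_measure, accuracy
```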

We compare our method to three different state-of-the-art approaches: a 3D watershed algorithm proposed by Ollion et al. [14] which is available as a Fiji plugin [18], the Lines-of-Sight (LoS) decomposition method by Mathew et al. [11], and the graph-cut based method proposed by Al-Kofahi et al. [2] which has been implemented in the FARSight toolkit [1]. Where required, the best parameter sets for the methods have been determined by parameter scanning. For the FARSight implementation, the images had to be converted to 8-bit. For the first data set, the voxel-wise manual segmentation was used as the foreground segmentation (except for the FARSight implementation of the method of Al-Kofahi) in order to evaluate the cluster splitting in isolation. For the LoS method, this foreground segmentation was additionally processed using a \(3\times 3 \times 3\) median filter to improve the results. On this data set, our cluster splitting method achieved an accuracy of 0.77 and an F-measure of 0.87 and considerably outperformed all of the other algorithms. On the second data set, we have computed the foreground segmentation using our voxel classification method. For both the LoS method and the 3D watershed, we have used the foreground segmentation computed by our random forest classifier, since the originally proposed thresholding approaches further diminished their performance. Again, our method outperformed all of the other approaches, achieving an accuracy of 0.75 and an F-measure of 0.85. All of the results of our quantitative evaluation are listed in Table 1. It should be noted that, in contrast to the other approaches, our method generally produces almost no false positives.

Table 1. Quantitative evaluation of our proposed method in comparison to three state-of-the-art methods on two data sets (DS). For DS 1, the manual labeling was used as a foreground segmentation except for FARSight (*). The second column shows the number of cell nuclei in the reference segmentation (RS). The metrics Recall, Precision, Accuracy, and F-measure range from 0 (worst performance) to 1 (best performance).

We have also tested the influence of the marker extension in the cluster splitting workflow on the second data set. When leaving out the marker extension, the accuracy slightly decreases to 0.73 and the F-measure decreases to 0.84. Moreover, while the mean volume of the segmented cell nuclei in both variants of our method is 1205 voxels, without the marker extension step the standard deviation increases from 848 to 876 voxels.

6 Conclusion

We have proposed a method for the automated detection and segmentation of clustered immunostained cell nuclei in three-dimensional ultramicroscopy images. Our approach yields good overall results and outperforms several other state-of-the-art algorithms on our images, as shown in an evaluation on reference data that has been manually labeled by domain experts. Our method produces almost no false positives, so that errors in the segmentation are almost exclusively due to cases of under-segmentation. To improve the applicability of the algorithm in quantification applications, an additional uncertainty estimate could be derived from the size distribution of the cell nuclei by analyzing the number of detected cell nuclei in relation to the total volume of a cluster. This information could then be integrated into the workflow in the form of a guided manual correction.