1 Introduction

During the last two decades, time–frequency localized and variational methods have gained popularity for image segmentation. The time–frequency localized techniques first apply a transformation, such as the wavelet transform (WT) (Mallat 1987; Yinhui and Zifen 2011), the contourlet transform (CT) (Do and Vetterli 2002; Liu and Zheng 2013) or the nonsubsampled contourlet transform (NSCT) (Cunha et al. 2006; Khalighi et al. 2015; Chen et al. 2017), to the input images and then perform a statistical analysis of the resulting coefficients, while the variational techniques (Tony et al. 2000; Chan and Vese 2001; Sabeena et al. 2016; Bhadauria and Dewal 2014; Ali and Madabhushi 2012; Huaming et al. 2010) extract statistical characteristics directly from the original images.

Among the time–frequency localized techniques, the CT (Do and Vetterli 2002) is an efficient image representation that employs Laplacian pyramids and directional filter banks (DFBs) to achieve multiresolution and multidirection decomposition, respectively. The CT is a classical time–frequency localized technique and serves as the basis of several later methods. For example, Liu and Zheng (2013) present a multiscale image segmentation based on a CT-based hidden Markov tree (HMT) model, in which adaptive context structures of the original image are acquired by the HMT over the multiscale CT. This method achieves good segmentation while preserving the regional integrity of the original image. However, because of its downsampling and upsampling operations, the CT is shift-variant, which is undesirable in some image analysis applications such as the vector-valued active contour model (Tony et al. 2000). A shift-invariant version of the CT, the NSCT (Cunha et al. 2006), is built upon iterated nonsubsampled filter banks to obtain a shift-invariant directional multiresolution image representation. Owing to its nonsubsampled structure, the NSCT can be applied in many settings. Khalighi et al. (2015) compute the magnitude of the NSCT coefficients of all directional subbands at a specific level and compare it with an adaptive multilevel threshold; the resulting algorithm segments images accurately but does not preserve their integrity. Chen et al. (2017) combine the HMT model with the NSCT, exploiting its shift invariance and multidirectional expansion properties, and their simulation results demonstrate the generality of the method. As in Chen et al. (2017), the NSCT fully captures detailed information, but it does not account for the integrity of the image.

The Chan–Vese (CV) model (Chan and Vese 2001) is a region-based segmentation algorithm built on the idea of contour evolution. The CV model minimizes an energy functional over a level set to obtain the image contour while discarding unimportant information, so, compared with the NSCT, it preserves the integrity of the image. Bhadauria and Dewal (2014) propose an automatic segmentation method for brain images combining the CV model and fuzzy C-means, which exploits regional information; it segments images accurately and requires no manual intervention, but it does not capture detailed information well. Ali and Madabhushi (2012) present a synergistic CV active contour that combines boundary and shape cues in a level-set framework to resolve multiple overlapping objects in histological imagery. Both Bhadauria and Dewal (2014) and Ali and Madabhushi (2012) may lose important directional information and cannot fully capture the intrinsic geometric structures. In addition, the CV model can be extended to vector-valued images (Tony et al. 2000), which is the formulation adopted in the proposed model. The vector-valued CV model is appropriate when objects appear in different channels or with different intensities. Huaming et al. (2010) present a segmentation framework based on the vector-valued CV model for delineating typhoons to improve the accuracy of typhoon forecasting; by using multichannel image information to avoid unnecessary iterative computation, the method accelerates curve evolution and improves segmentation accuracy. However, owing to the spatial support of the structure employed in the feature vector, that CV model still contains much unrelated information, which leads to high computational complexity.

To address this problem, Linghui et al. (2011) present a fast method for segmenting cracked bodies that incorporates the WT into the CV model. The method locates rough regions using wavelet modulus maxima, which not only reduces the amount of data but also provides an initial contour that accelerates the convergence of the CV model. However, this method is a simple combination of the WT and the CV model: it considers neither the detailed information of the image nor the relationships between coefficients. Turgay and KaiKuang (2011) propose an unsupervised change-detection method for satellite images that combines the undecimated discrete WT with the CV model. This algorithm performs quite well, particularly in detecting changes under strong noise interference. However, the WT provides only horizontal, vertical and diagonal directional subbands at each resolution, so it cannot capture high-dimensional singularities effectively.

In this paper, an image segmentation method is proposed that applies the vector-valued CV model to a multiresolution representation of the image. The NSCT is chosen over other time–frequency localized methods because it involves no upsampling or downsampling operations and is therefore free from aliasing and shift invariant. The vector-valued CV model is then applied to the NSCT-based multiresolution representation to segment the images.

2 Multiresolution analysis using NSCT

The NSCT (Cunha et al. 2006) combines nonsubsampled pyramids for multiscale decomposition with nonsubsampled DFBs for multidirection decomposition. The building block of the nonsubsampled pyramid is a two-channel nonsubsampled filter bank, which is shift invariant. A two-dimensional filter is represented by its z-transform \( H\left( z \right) \), and the perfect reconstruction condition is given by:

$$ H_{0} \left( z \right)G_{0} \left( z \right) + H_{1} \left( z \right)G_{1} \left( z \right) = 1 $$
(1)

where \( H_{0} \left( z \right) \) is the low-pass filter, \( H_{1} \left( z \right) = 1 - H_{0} \left( z \right) \) is the corresponding high-pass filter, and the synthesis filters are \( G_{0} \left( z \right) = G_{1} \left( z \right) = 1 \).

To achieve the multiscale decomposition, the nonsubsampled pyramid is iterated on its own low-pass subband output. At each subsequent level, all filters are upsampled by 2 in both dimensions. These upsampled filters achieve the multiresolution analysis shown in Fig. 1.
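One simple way to realize the "upsample by 2 in both dimensions" step is the à trous scheme, i.e., inserting zeros between the taps of the previous filter. The sketch below illustrates this for a generic NumPy kernel; the kernel values and function name are our own illustration, not taken from the NSCT toolbox.

```python
import numpy as np

def upsample_filter_2x(h):
    """Insert zeros between the taps of a 2-D filter so that it is upsampled
    by 2 in both dimensions (the 'a trous' scheme used when iterating the
    nonsubsampled pyramid to the next level)."""
    rows, cols = h.shape
    h_up = np.zeros((2 * rows - 1, 2 * cols - 1), dtype=h.dtype)
    h_up[::2, ::2] = h
    return h_up

# Example: a 3x3 low-pass kernel becomes a 5x5 kernel with zeros interleaved;
# convolving with it covers a larger spatial support without subsampling.
h0 = np.ones((3, 3)) / 9.0
print(upsample_filter_2x(h0))
```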

Fig. 1

Diagram of the NSCT. First, the input is split into a low-pass subband and a high-pass subband by the nonsubsampled multiscale decomposition. A nonsubsampled directional decomposition then decomposes the high-pass subband into several directional subbands. This process is iterated repeatedly. The reconstruction is obtained by the inverse NSCT: a low-pass subband and the corresponding high-pass subband are combined into a realization \( I_{i} \)

Next, the marginal statistics of the NSCT coefficients are studied, as shown in Fig. 2. The histograms of the NSCT coefficients in the finest subbands are plotted in Fig. 2b. These distributions exhibit a sharp peak at zero amplitude with long tails on both sides. Moreover, the majority of coefficients are close to zero, which implies that the NSCT representation is sparse. Similar distributions are observed in all subbands of the other test images. Based on these characteristics, we adopt a simple zero-mean, two-state Gaussian mixture model (GMM) to describe the NSCT coefficients, as shown in Fig. 2c. The GMM is given by:

$$ p(x|\theta ) = \sum\limits_{m = 1,2} {\pi_{m} p_{m} } (x|\mu_{m} ,\sigma_{m} ) $$
(2)

where \( m = 1 \) denotes the big state and \( m = 2 \) the small state; \( \pi_{m} \) is the prior probability with \( \sum\nolimits_{m = 1,2} \pi_{m} = 1 \); \( p_{m} (x|\mu_{m} ,\sigma_{m} ) \) is the probability density function (PDF) \( p_{m} (x|\mu_{m} ,\sigma_{m} ) = \frac{1}{\sqrt{2\pi \sigma_{m}^{2}}} \exp \left[ - \frac{(x - \mu_{m} )^{2}}{2\sigma_{m}^{2}} \right] \); and \( \mu_{m} \) and \( \sigma_{m}^{2} \) are the mean and variance, respectively. Here, \( \left\{ \pi_{m} ,\mu_{m} ,\sigma_{m} \right\} \) are the parameters of the GMM to be estimated, and the expectation–maximization (EM) algorithm (Luo and Hancock 2015) is used to compute them by alternating E-steps and M-steps. The EM algorithm is only outlined here; for details, we refer readers to Luo and Hancock (2015).
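To make the two-state model concrete, the following NumPy sketch fits a zero-mean, two-component GMM to a vector of subband coefficients by EM and returns the most likely state of each coefficient. It is a minimal illustration under the zero-mean assumption stated above (\( \mu_{1} = \mu_{2} = 0 \)); the function name, initialization and iteration count are our own choices, not taken from the paper.

```python
import numpy as np

def fit_two_state_gmm(x, n_iter=50, eps=1e-12):
    """EM for a two-state, zero-mean Gaussian mixture (Eq. (2) with
    mu_1 = mu_2 = 0): state 0 ('big', large variance) vs state 1 ('small',
    small variance).  x holds the NSCT coefficients of one subband.
    Returns the mixture weights, variances and the most likely state of
    every coefficient."""
    x = np.asarray(x, dtype=float).ravel()
    # Initialize the two variances far apart so the states separate.
    var = np.array([4.0 * x.var() + eps, 0.25 * x.var() + eps])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities under each zero-mean Gaussian.
        lik = pi[None, :] / np.sqrt(2 * np.pi * var[None, :]) \
              * np.exp(-x[:, None] ** 2 / (2 * var[None, :]))
        resp = lik / (lik.sum(axis=1, keepdims=True) + eps)
        # M-step: update the mixture weights and variances.
        nk = resp.sum(axis=0) + eps
        pi = nk / x.size
        var = (resp * x[:, None] ** 2).sum(axis=0) / nk + eps
    states = resp.argmax(axis=1)  # 0 -> big state, 1 -> small state
    return pi, var, states
```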

Fig. 2

Test image, the statistical histogram and the GMM of NSCT coefficients. a Test image. b The corresponding statistical histogram in the finest scale of NSCT coefficients. c The corresponding GMM in the finest scale of NSCT coefficients

Further, the coefficients classified into the big state are kept, while those in the small state are set to 0. The input image \( I \) is then reconstructed by the inverse NSCT, which applies the previously described steps in reverse order, as shown in Fig. 1. By applying the inverse NSCT, a multiresolution representation of the image is obtained, denoted as \( I = \left\{ {I_{1} ,I_{2} , \ldots ,I_{i} , \ldots ,I_{n} } \right\} \), where \( i \) is the resolution index and \( n \) is the maximum number of resolution levels. Here, resolution 1 corresponds to the input image itself, i.e., \( I_{1} = I \). A lower value of \( i \) preserves more image details but is more affected by unrelated information; conversely, a higher value of \( i \) contains fewer image details while yielding more noise reduction.
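The following sketch illustrates this reconstruction: for each number of levels, the coefficients classified into the small state are zeroed and a realization is recovered by the inverse NSCT. Since no standard Python NSCT implementation is assumed, `nsct_decompose` and `nsct_reconstruct` are passed in as placeholder callables wrapping whatever NSCT implementation is available, and `fit_two_state_gmm` is the EM routine sketched above; the whole function is an assumed illustration, not the authors' code.

```python
import numpy as np

def multiresolution_representation(image, n_levels, nsct_decompose,
                                   nsct_reconstruct, fit_two_state_gmm):
    """Build I = (I_1, ..., I_n).  I_1 is the input image itself; each further
    realization keeps only the NSCT coefficients classified into the big
    state and is recovered by the inverse NSCT.

    nsct_decompose(image, levels) -> (lowpass, [subband arrays]) and
    nsct_reconstruct(lowpass, [subband arrays]) -> image are placeholders
    for an available NSCT implementation."""
    realizations = [image]
    for level in range(2, n_levels + 1):
        lowpass, subbands = nsct_decompose(image, levels=level)
        cleaned = []
        for band in subbands:
            _, _, states = fit_two_state_gmm(band)
            # State 0 is the big state; small-state coefficients are zeroed.
            cleaned.append(np.where(states.reshape(band.shape) == 0, band, 0.0))
        realizations.append(nsct_reconstruct(lowpass, cleaned))
    return realizations
```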

3 Image segmentation based on the vector-valued CV model

3.1 The vector-valued CV model

The CV model performs better on images with weak object boundaries and is less sensitive to the location of the initial contour. Let \( R \) be the image domain; the CV model finds an evolving curve \( C \) that partitions the original image \( I\left( {x,y} \right) \) into two regions \( R = \left\{ {R_{1} ,R_{2} } \right\} \) by minimizing the functional:

$$ E\left( C,I_{\text{avg1}} ,I_{\text{avg2}} \right) = \lambda_{1} \int_{R_{1} } \left| I\left( x,y \right) - I_{\text{avg1}} \right|^{2} {\text{d}}x\,{\text{d}}y + \lambda_{2} \int_{R_{2} } \left| I\left( x,y \right) - I_{\text{avg2}} \right|^{2} {\text{d}}x\,{\text{d}}y + \mu \,{\text{Length}}\left( C \right) $$
(3)

where \( I_{\text{avg1}} \) and \( I_{\text{avg2}} \) are two constants that approximate the image intensity in \( R_{1} \) and \( R_{2} \), respectively, and \( \lambda_{1} \), \( \lambda_{2} \) and \( \mu \) are weighting parameters. \( {\text{Length}}\left( C \right) \) is the regularizing term, which denotes the length of the evolving curve \( C \).

For the evolution, the curve \( C \) is represented as the zero level set of a function \( \phi \) (Bouin 2017) over the domain \( R \), which satisfies the following conditions:

$$ \left\{ \begin{aligned} & C = \left\{ \left( x,y \right) \in R:\phi \left( x,y \right) = 0 \right\} \\ & {\text{inside}}\left( C \right) = \left\{ \left( x,y \right) \in R:\phi \left( x,y \right) > 0 \right\} \\ & {\text{outside}}\left( C \right) = \left\{ \left( x,y \right) \in R:\phi \left( x,y \right) < 0 \right\} \end{aligned} \right. $$
(4)

Then, the energy functional can be written as:

$$ \begin{aligned} & E\left( C,I_{\text{avg1}} ,I_{\text{avg2}} \right) \\ &\quad = \lambda_{1} \int_{R} \left| I\left( x,y \right) - I_{\text{avg1}} \right|^{2} H_{\varepsilon } \left( \phi \left( x,y \right) \right){\text{d}}x\,{\text{d}}y \\ &\quad \quad + \lambda_{2} \int_{R} \left| I\left( x,y \right) - I_{\text{avg2}} \right|^{2} \left( 1 - H_{\varepsilon } \left( \phi \left( x,y \right) \right) \right){\text{d}}x\,{\text{d}}y \\ &\quad \quad + \mu \int_{R} \delta_{\varepsilon } \left( \phi \left( x,y \right) \right) \left| \nabla \phi \left( x,y \right) \right|{\text{d}}x\,{\text{d}}y \\ \end{aligned} $$
(5)

where \( H_{\varepsilon } \) is the approximated Heaviside function given by

$$ H_{\varepsilon } \left( x \right) = \frac{1}{2}\left[ 1 + \frac{2}{\pi }\arctan \left( \frac{x}{\varepsilon } \right) \right] $$
(6)

and

$$ \delta_{\varepsilon } \left( x \right) = \frac{1}{\pi } \cdot \frac{\varepsilon }{\varepsilon^{2} + x^{2} } $$
(7)
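For later use, Eqs. (6) and (7) translate directly into NumPy; the default value ε = 1.0 below is only a placeholder, not a value taken from the paper.

```python
import numpy as np

def heaviside_eps(x, eps=1.0):
    """Smoothed Heaviside function of Eq. (6)."""
    return 0.5 * (1.0 + (2.0 / np.pi) * np.arctan(x / eps))

def delta_eps(x, eps=1.0):
    """Smoothed Dirac delta of Eq. (7), the derivative of heaviside_eps."""
    return (1.0 / np.pi) * eps / (eps ** 2 + x ** 2)
```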

In the following, we describe the extension of the CV model to vector-valued images:

$$ \begin{aligned} & E\left( C,\overrightarrow {I_{\text{avg1}} } ,\overrightarrow {I_{\text{avg2}} } \right) \\ &\quad = \frac{1}{n}\sum\limits_{i = 1}^{n} \lambda_{1}^{i} \int_{R_{1} } \left| I_{i} \left( x,y \right) - I_{\text{avg1}}^{i} \right|^{2} {\text{d}}x\,{\text{d}}y \\ &\quad \quad + \frac{1}{n}\sum\limits_{i = 1}^{n} \lambda_{2}^{i} \int_{R_{2} } \left| I_{i} \left( x,y \right) - I_{\text{avg2}}^{i} \right|^{2} {\text{d}}x\,{\text{d}}y + \mu \,{\text{Length}}\left( C \right) \\ \end{aligned} $$
(8)

where \( I_{i} \) \( \left( {i = 1,2, \ldots ,n} \right) \) is the ith channel of the vector-valued image on \( R \); \( \overrightarrow {I_{\text{avg1}} } \) and \( \overrightarrow {I_{\text{avg2}} } \) are two vectors that approximate the image intensity in \( R_{1} \) and \( R_{2} \), respectively: \( \overrightarrow {I_{\text{avg1}} } = \left( I_{\text{avg1}}^{1} ,I_{\text{avg1}}^{2} , \ldots ,I_{\text{avg1}}^{n} \right) \) and \( \overrightarrow {I_{\text{avg2}} } = \left( I_{\text{avg2}}^{1} ,I_{\text{avg2}}^{2} , \ldots ,I_{\text{avg2}}^{n} \right) \).

Finally, we derive the following Euler–Lagrange equation to minimize the energy functional \( E \) with respect to \( \phi \):

$$ \begin{aligned} \frac{\partial \phi }{\partial t} & = - \,\delta_{\varepsilon } \left( \phi \right)\left[ \frac{1}{n}\sum\limits_{i = 1}^{n} \lambda_{1}^{i} \left| I_{i} \left( x,y \right) - I_{\text{avg1}}^{i} \right|^{2} - \frac{1}{n}\sum\limits_{i = 1}^{n} \lambda_{2}^{i} \left| I_{i} \left( x,y \right) - I_{\text{avg2}}^{i} \right|^{2} \right] \\ & \quad + \mu \delta_{\varepsilon } \left( \phi \right){\text{div}}\left( \frac{\nabla \phi }{\left| \nabla \phi \right|} \right) \\ \end{aligned} $$
(9)

where \( t \) is an artificial time variable and \( {\text{div}} \) is the divergence operator.
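As an illustration of how Eq. (9) can be discretized, the sketch below performs one explicit gradient-descent step on \( \phi \) given the list of realizations. The time step `dt`, the small constant added to \( \left| \nabla \phi \right| \) and all names are our own assumptions, not the authors' implementation.

```python
import numpy as np

def evolve_phi(phi, channels, lam1, lam2, mu, dt=0.1, eps=1.0):
    """One explicit gradient-descent step of Eq. (9).  `channels` is the
    list (I_1, ..., I_n) of the multiresolution representation; lam1 and
    lam2 are per-channel weights.  A minimal sketch, not an optimized
    level-set solver."""
    n = len(channels)
    inside = phi > 0
    # Data term: averaged per-channel fitting forces for the two regions.
    data = 0.0
    for i, I in enumerate(channels):
        avg1 = I[inside].mean() if inside.any() else 0.0    # I_avg1^i
        avg2 = I[~inside].mean() if (~inside).any() else 0.0  # I_avg2^i
        data += lam1[i] * (I - avg1) ** 2 - lam2[i] * (I - avg2) ** 2
    data /= n
    # Curvature term div(grad(phi) / |grad(phi)|).
    gy, gx = np.gradient(phi)
    norm = np.sqrt(gx ** 2 + gy ** 2) + 1e-8
    div = np.gradient(gx / norm, axis=1) + np.gradient(gy / norm, axis=0)
    # Smoothed delta of Eq. (7).
    delta = (1.0 / np.pi) * eps / (eps ** 2 + phi ** 2)
    return phi + dt * delta * (-data + mu * div)
```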

3.2 The proposed image segmentation method

In this paper, the proposed image segmentation method combines the vector-valued CV model with the multiresolution representation obtained by the NSCT, as shown in Fig. 1. The proposed segmentation model consists of two stages, using the NSCT and the CV model, and proceeds as in the following algorithm:

  • Step 1. Initialization: the original image \( I \), the level of the NSCT \( n \), \( \lambda_{1} \), \( \lambda_{2} \) and \( \mu \);

  • Step 2. Compute the NSCT of the image \( I \) for \( n \) levels;

  • Step 3. For each level \( i \) (\( i = 1,2, \ldots ,n \)) of the pyramid:

    • Step 3.1 Compute the PDF according to Eq. (2) for each coefficient;

  • Step 3.2 Train the parameters of the PDF by the EM algorithm, and classify each coefficient into the big or the small state;

  • Step 3.3 Set the coefficients in the small state to 0;

  • Step 4. Get a multiresolution representation of the image \( I = \left( {I_{1} , \ldots ,I_{i} , \ldots ,I_{n} } \right) \) by inverse NSCT;

  • Step 5. Initialize the level set over the entire image with a set of uniformly distributed circles (see the sketch after this list);

  • Step 6. Minimize the energy functional according to Eqs. (8) and (9) for the multiresolution representation of the image \( I \) from Step 4;

  • Step 7. Output the final segmentation result.
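Step 5 initializes the level set with uniformly distributed circles. A minimal NumPy sketch of such an initialization is given below; the spacing and radius are illustrative values, not those used in the paper.

```python
import numpy as np

def initial_phi_circles(shape, spacing=40, radius=15):
    """Initial level set for Step 5: phi > 0 inside a grid of uniformly
    distributed circles and phi < 0 outside (union of signed distances)."""
    rows, cols = shape
    yy, xx = np.mgrid[0:rows, 0:cols]
    phi = np.full(shape, -np.inf)
    for cy in range(spacing // 2, rows, spacing):
        for cx in range(spacing // 2, cols, spacing):
            dist = radius - np.sqrt((yy - cy) ** 2 + (xx - cx) ** 2)
            phi = np.maximum(phi, dist)
    return phi
```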

The concrete process is detailed as follows. First, a multiresolution representation of the image \( I \) is built as \( I = \left( {I_{1} , \ldots ,I_{i} , \ldots ,I_{n} } \right) \), where the subscript \( i \) indicates the resolution and provides a trade-off between image detail preservation and noise reduction. The multiscale filter and the multidirection filter are employed in the NSCT implementation. To demonstrate, a multiresolution representation of the image with \( n = 4 \) is shown in Fig. 3 (the scales from coarse to fine are decomposed into 4, 4 and 8 directional subbands, the multiscale filter is "Pyr" and the multidirection filter is "Haar").

Fig. 3

Multiresolution representation of the original image \( I \) with three multiresolution levels (\( n = 3 \)). a Original image \( I \). b \( I_{1} \) (directional subbands: 4). c \( I_{2} \) (directional subbands: 4, 4). d \( I_{3} \) (directional subbands: 4, 4, 8)

The second stage of the proposed model generates the final segmentation: the CV segmentation algorithm for vector-valued images is applied to the NSCT-based representation. We set the weighting parameters \( \lambda_{1}^{i} = \lambda_{2}^{i} = 1 \) so that each realization of the multiresolution representation is weighted equally. The parameter \( \mu \) controls the sharpness of the boundaries of the segmented regions and the smoothness of the contour; here, we set \( \mu = 5 \times 10^{ - 4} \times 255^{2} \) for the input images, as suggested empirically by the CV model (Chan and Vese 2001). The stopping criterion for the iterative algorithm is as follows: if the zero level set of \( \phi \) remains unchanged between consecutive iterations, the algorithm is declared "converged."
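The stopping criterion can be operationalized as follows, assuming the zero level set is tracked through the sign pattern of \( \phi \); the optional tolerance is our own addition, not part of the paper.

```python
import numpy as np

def has_converged(phi_prev, phi_curr, tol=0):
    """Declare convergence when the zero level set (the sign pattern of phi)
    is unchanged between consecutive iterations; tol allows a small number
    of differing pixels if desired."""
    changed = np.count_nonzero((phi_prev > 0) != (phi_curr > 0))
    return changed <= tol
```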

The vector-valued CV model and the NSCT account for the spatial information and the detail information, respectively. Compared with either technique used alone, the proposed method can yield more noise reduction while preserving more image details, although the trade-off between detail preservation and noise reduction is constrained by the number of decomposition levels. Realizations with a higher resolution index mitigate noise interference but sacrifice image details; conversely, realizations with a lower resolution index preserve image details but are affected by noise.

4 Experiment results

To assess the effectiveness of applying the vector-valued CV model to the multiresolution representation, we conduct both qualitative and quantitative evaluations. The former is presented in the figures, where the segmentation results are depicted by red and green lines. For the latter, once the segmentation has been obtained by the proposed model, quantities are computed to compare it against the ground-truth segmentation. Note that the ground-truth segmentation is created manually and reflects, to some extent, both overall integrity and consistency of detail. Here, we adopt the "false-positive ratio (FPR)," "false-negative ratio (FNR)" and "error ratio (ER)" as objective evaluation criteria (Casciaro et al. 2012; Davanzo et al. 2011), defined as follows:

$$ {\text{FPR}} = \frac{\text{FP}}{N} \times 100\% $$
$$ {\text{FNR}} = \frac{\text{FN}}{N} \times 100\% $$
$$ {\text{ER}} = \frac{{\text{FP}} + {\text{FN}}}{N} \times 100\% $$

where \( {\text{FP}} \) and \( {\text{FN}} \) are the numbers of pixels incorrectly segmented and missed, respectively, and \( N \) is the total number of pixels counted in the ground-truth segmentation. Note that the smaller these three values are, the better the segmentation result is.
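For reproducibility, the three criteria can be computed from binary masks as below; here \( N \) is interpreted as the total number of pixels in the ground-truth image, which is one possible reading of the definition above.

```python
import numpy as np

def segmentation_errors(result, ground_truth):
    """FPR, FNR and ER (in %) for binary masks, following the definitions
    above: FP = pixels segmented but absent from the ground truth,
    FN = ground-truth pixels that were missed, N = total pixel count of
    the ground-truth image."""
    result = result.astype(bool)
    ground_truth = ground_truth.astype(bool)
    fp = np.count_nonzero(result & ~ground_truth)
    fn = np.count_nonzero(~result & ground_truth)
    n = ground_truth.size
    return 100.0 * fp / n, 100.0 * fn / n, 100.0 * (fp + fn) / n
```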

4.1 Implementation of the proposed model

The proposed model consists mainly of two stages: the multiresolution representation of the original image and the vector-valued CV model. For the former, the representation obtained with different values of \( n \) on a complex image is shown in Fig. 3.

In Fig. 3, the image with a coarser decomposition is more subject to noise interference while preserving more detail information; conversely, the image with a finer decomposition yields more noise reduction but sacrifices more image details.

Based on the multiresolution representation of the image, the final segmentation result is obtained with the vector-valued CV active contour model. First, the initialization is achieved with a set of circles uniformly distributed over the entire image, as shown in Fig. 4a. The model then iteratively refines the initial contour toward the final segmentation; intermediate results of the curve evolution at different numbers of iterations are shown in Fig. 4b–e.

Fig. 4

Curve evolution of the proposed model. a Initialization. b Iteration 200. c Iteration 400. d Iteration 600. e Iteration 800

4.2 Analysis of the proposed model

4.2.1 Effect of the maximum number of scales in the NSCT

The proposed model exploits a multiresolution representation of the image \( I \), and the maximum number of scales \( n \) affects the segmentation result. Segmentation results of the proposed model with different values of \( n \) on the original image are shown in Fig. 5.

Fig. 5

Segmentation results with different values of \( n \). a Original image. b \( n = 0 \). c \( n = 1 \) (directional subbands: 4). d \( n = 2 \) (directional subbands: 4, 4). e \( n = 3 \) (directional subbands: 4, 4, 8). f \( n = 4 \) (directional subbands: 4, 4, 8, 8). g Ground truth

From the qualitative results shown in Fig. 5 and the quantitative results tabulated in Table 1, it can be observed that the segmentation performance improves as \( n \) increases. However, when the original image becomes oversmoothed (\( n \) is too large), the performance starts to degrade. According to Fig. 5 and Table 1, \( n = 3 \) yields the best performance in our experiments.

Table 1 \( {\text{FPR}} \), \( {\text{FNR}} \) and \( {\text{ER}} \) with different values of \( n \) shown in Fig. 5

4.2.2 Effect of various pyramid and direction filters

To study the segmentation performance resulting from different types of pyramid and direction filters, we use several typical filters: the pyramid filters Maxflat, 9-7 and Pyr, and the direction filters Dmaxflat7, Pkva and Haar. The segmentation results are shown in Fig. 6 and tabulated in Table 2, respectively. The experimental results show that the proposed model delivers almost the same performance regardless of the pyramid and direction filters used.

Fig. 6

Segmentation results by the proposed model with different filters. a Original image. b Maxflat/Dmaxflat filter. c 9-7/Pkva filter. d Pyr/Haar filter. e Ground truth

Table 2 \( {\text{FPR}} \), \( {\text{FNR}} \) and \( {\text{ER}} \) with different filters shown in Fig. 6

4.3 The performance of the proposed model

In this section, we analyze the performance of the proposed model by comparing it with the method of Chen et al. (2017), which consists of three major steps: (1) obtain a multiresolution representation of the original image by the NSCT; (2) capture the statistical properties of the coefficients of the multiresolution representation; and (3) produce the final results with the HMT model. Segmentation using the NSCT-based HMT approach requires manual settings for the filters used to decompose and reconstruct the image: "9-7" and DFB (three-level decomposition: 4, 4, 8). Here, we implement our method using the same set of parameters. In addition, we implement the CV image segmentation method based on the undecimated discrete WT (Turgay and KaiKuang 2011): a multiresolution representation of the original image is first obtained by the undecimated discrete WT, and the CV model is then employed to segment the images.

The segmentation results obtained with the different methods are shown in Fig. 7, and the corresponding FPR, FNR and ER statistics are reported in Table 3. We can observe that the segmentation results obtained by our model are better than those of the HMT model based on the NSCT and of the CV model based on the undecimated discrete WT. This is mainly because the method of Chen et al. (2017) relies on the NSCT, whose coefficient distribution carries rich image information, but it does not preserve the integrity of the segmentation. It is worth mentioning that the CV model based on the undecimated discrete WT (Turgay and KaiKuang 2011) can tackle this problem and is also highly robust against noise; however, it may ignore some detailed information.

Fig. 7

Segmentation results by using different models: a original images. b HMT model based on NSCT (Chen et al. 2017). c CV model based on undecimated discrete WT (Turgay and KaiKuang 2011). d Proposed model. e Ground truth

Table 3 FPR, FNR and ER using different models shown in Fig. 7

In addition, the iteration times (IT) and the corresponding running times of the different methods in Fig. 7 are listed in Table 4. Compared with the other methods, our method arrives at more accurate segmentation results at a faster speed.

Table 4 Iteration times (IT) and running time using different models shown in Fig. 7

5 Summary

In this paper, we propose an image segmentation method that combines a multiresolution analysis of the image with the vector-valued CV model. The NSCT is exploited to obtain the multiresolution representation, and its different decomposition scales yield different trade-offs between detail preservation and noise reduction. In addition, applying the vector-valued CV model to the multiresolution representation of the images achieves the integrity of the image segmentation. The proposed method is evaluated with both qualitative and quantitative measurements, and the experimental results demonstrate its effectiveness and accuracy.