1 Introduction

Medical image segmentation is a critical step towards content analysis and image understanding, for tasks such as quantification of tissue volumes, study of anatomical structure and computer-integrated surgery [5]. Due to the presence of noise, intrinsic tissue variation, partial volume effects, unclear tissue boundaries and intensity non-uniformity, medical image segmentation remains challenging.

There are many different segmentation techniques, which can be broadly classified into histogram-based [30], clustering-based [7], edge-based [12], region-based [17], and combinations of these techniques [18]. The histogram is a popular tool for real-time image processing due to its simple implementation. It serves as an important basis for statistical approaches in image processing by providing a global description of the image’s information [13]. Histogram thresholding is a popular medical image segmentation technique, which assumes that homogeneous objects in the image manifest themselves as clusters.

For a well-defined image, the histogram has a deep valley between two peaks, around which the object and background gray levels are concentrated. In that case, the aim is to find a critical value, or threshold: pixels whose gray levels exceed the threshold are assigned to one set and the rest to the other [24]. In general, all histogram thresholding techniques work very well when the histogram is bimodal or nearly bimodal. On the other hand, many images corrupted by noise and/or irregular illumination are ill defined, leading to a multimodal histogram; in these cases, ordinary histogram thresholding techniques perform poorly or even fail [26]. Thus, to segment such images using histogram thresholding, the optimum threshold must be located in the valley region. Many works have proposed methods to obtain the right threshold automatically. Ridler [26] used an iterative scheme to achieve pixel separation. An entropy-based algorithm was proposed in [6]. Orlando [24] proposed that the histogram could be thresholded based on a criterion of similarity between gray levels. Bonnet [3] proposed a no-threshold histogram-based image segmentation method through defuzzification of relaxed grades of membership.

These methods are based only on image intensity information, without taking into account the spatial correlation of same- or similar-valued elements, even though adjacent pixels in an object are generally not independent of each other. To overcome this drawback, Cheng [8] proposed the concept of the homogram. Mohabey [22, 23] introduced the concept of the histon, which is a contour plotted on top of the histogram, obtained by considering a sphere of similar colors of a predefined radius around each pixel. For image segmentation, only the upper approximation was considered, and the histogram-based segmentation technique was applied to the histon to find the different regions. This kind of method did not take the lower approximation into account and thus failed to utilize the properties of the boundary region between the two approximations. Milind [21] proposed a segmentation scheme that used the rough measure of the rough set as the basis for segmentation to overcome this drawback. In the rough set theoretic sense, the histogram correlates with the lower approximation and the histon with the upper approximation. The rough measure at every intensity level is calculated, and the thresholding method is then applied for image segmentation.

However, it is difficult to obtain the significant peaks. Milind [21] proposed two criteria: (a) a peak is significant if its height is greater than 20 % of the average value of the rough index over all pixel intensities; (b) a peak is significant if the distance between two peaks is greater than 10. Another problem is that obtaining clusters on the basis of peaks and valleys usually results in over-segmentation. Milind [21] proposed two further criteria: (a) clusters with fewer pixels than a threshold are merged with the nearest clusters, and the process is repeated until the number of pixels in each cluster is greater than the threshold; experimentally, the author found that 0.1 % of the total number of pixels in the image is an appropriate threshold; (b) the two closest regions are combined to form a single region based on a predefined distance between two clusters; experimentally, they found that 20 is an appropriate distance for region merging.

In fact, those criteria use the same constants for all images, which is not always appropriate, and it is difficult to find suitable constants for some given images. The problem arises because the histogram of a medical image is not smooth: it contains many local peaks and valleys. Another problem is that the histogram-based method, the histon-based method and Milind’s method are all sensitive to noise in medical images.

In this paper, we propose a segmentation scheme that uses local polynomial regression of the histogram and histon to obtain a smooth rough measure of the rough set as the basis for segmentation, overcoming these drawbacks. Our method does not need to find significant peaks or merge clusters, and it is insensitive to noise.

2 Rough set for medical image processing

Rough set theory, as proposed by Pawlak [25], has proven to be an effective tool for feature selection, uncertainty handling, and knowledge discovery from categorical data. The roughness of knowledge is represented using three types of regions: the positive, negative, and boundary regions, which are often associated with the spatial relationships among the partitions. In the medical domain, such an approximation-based rough representation of a region of interest under limited knowledge may provide a new and useful way of understanding images.

An increasing number of works have been published in this regard. Wojcik [34, 35] used rough sets derived from an equivalence relation for edge enhancement. Beaubouef [2] introduced rough sets to deal with spatial uncertainty, classifying the different kinds of spatial uncertainty. Hirano [15] introduced the rough representation of a region of interest in medical images. Divyendu [10] considered the problem of detecting binary objects using rough sets. Wojcik [35] demonstrated how rough sets can be used more accurately in context-based image processing than statistical means, and presented a neural network to uncover causal relationships between images using a rough sets approach. Wu [36] introduced the notion of using “rough neural nets” for image classification. A new method for image segmentation based on rough set theory and neural networks was proposed by Jiang [16].

Sen and Maji have done notable work on fuzzy and rough sets for image segmentation. Sen [28] presented a novel histogram thresholding methodology using fuzzy and rough set theories on the basis of the index of fuzziness and rough entropy: a bilevel thresholding method was proposed first, and multilevel thresholding was then carried out using the proposed bilevel method in a tree-structured algorithm. This method does not require the histogram to be “well defined”. Sen [29] proposed classes of entropy measures based on rough set theory and certain generalizations of it for quantifying ambiguities in images; this work used entropy measures instead of the rough measure and performed histogram threshold selection for image segmentation. Sen [27] proposed criterion-optimization-based image thresholding techniques to perform segmentation using global and local fuzzy statistics. Maji [20] proposed a generalized hybrid unsupervised learning algorithm, termed rough-fuzzy possibilistic c-means (RFPCM), and applied it to brain MR image segmentation. Maji [19] proposed a robust segmentation technique based on fuzzy set theory for brain MR images, in which the histogram of the given image is thresholded according to the similarity between gray levels. Hassanien [14] introduced a hybrid scheme that combines the advantages of fuzzy sets and rough sets in conjunction with statistical feature extraction techniques.

3 Preliminary

Let U ≠ ∅ be a universe of discourse and X be a subset of U. An equivalence relation R classifies U into a set of subsets U/R = {X 1, X 2, …, X n } in which the following conditions are satisfied:

  1. \( {X}_i\subseteq U,\ {X}_i\ne \varnothing \) for any i;

  2. \( {X}_i\cap {X}_j=\varnothing \) for any i ≠ j;

  3. $$ {X}_1\cup {X}_2\cup \dots \cup {X}_n=U. $$

Any subset X i , called a category, represents an equivalence class of R. A category in R containing an object x 1 ∈ U is denoted by [x 1] R . An indiscernibility relation IND(R) is defined as follows:

$$ IND(R)=\left\{\left({x}_1,{x}_2\right)\in {U}^2\mid \exists P\in U/R,\ \left({x}_1,{x}_2\right)\in P\right\} $$
(1)

For a family of equivalence relations P ⊆ R, IND(P) is defined as follows:

$$ IND(P)=\underset{R\in P}{\cap } IND(R) $$
(2)

Approximation is used to represent the roughness of knowledge. Suppose we are given an equivalence relation R and a set of objects X ⊆ U. The R-lower and R-upper approximations of X are defined as

$$ \underline{R}X=\cup \left\{Z\in U/R\mid Z\subseteq X\right\} $$
(3)
$$ \overline{R}X=\cup \left\{Z\in U/R\mid Z\cap X\ne \varnothing \right\} $$
(4)

The lower approximation \( \underline{R}X \) is the union of the equivalence classes that are certainly included in X, and the upper approximation \( \overline{R}X \) is the union of the classes that possibly intersect X.
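As an illustration (ours, not the paper’s), when the partition U/R is given explicitly, Eqs. (3) and (4) can be computed directly; the following Python sketch does so for a toy universe.

```python
def lower_approximation(partition, X):
    """R-lower approximation (Eq. 3): union of classes contained in X."""
    out = set()
    for Z in partition:
        if Z <= X:      # Z is certainly included in X
            out |= Z
    return out

def upper_approximation(partition, X):
    """R-upper approximation (Eq. 4): union of classes meeting X."""
    out = set()
    for Z in partition:
        if Z & X:       # Z possibly overlaps X
            out |= Z
    return out

# Toy universe U = {1,...,6} with partition U/R and a target set X.
U_R = [{1, 2}, {3, 4}, {5, 6}]
X = {2, 3, 4}
print(lower_approximation(U_R, X))  # {3, 4}
print(upper_approximation(U_R, X))  # {1, 2, 3, 4}
```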

Consider I to be a medical image of size M × N; its histogram is

$$ {h}_1(g)=\sum_{m=1}^M\sum_{n=1}^N\delta \left(I\left(m,n\right)-g\right),\quad 1\le g\le L $$
(5)

where \( \delta (g)=\left\{\begin{array}{l}1,\ g=0\\ {}0,\ g\ne 0\end{array}\right. \) is the discrete unit impulse (Kronecker delta) function and L is the total number of gray levels. For a P × Q neighborhood around a pixel I(m,n), the total distance between all the pixels in the neighborhood and the pixel I(m,n) is given by

$$ {d}_T\left(m,n\right)=\sum_{p\in P}\sum_{q\in Q}d\left(I\left(m,n\right),I\left(p,q\right)\right) $$
(6)

The pixels in the neighborhood are taken to fall in the sphere of similar gray values if the total distance d T (m,n) is less than a threshold T 0. Define a matrix I′, of size M × N, such that an element I′(m,n) is given by

$$ {I}^{\prime}\left(m,n\right)=\left\{\begin{array}{l}1,{d}_T\left(m,n\right)<{T}_0\\ {}0, otherwise\end{array}\right. $$
(7)

The histon is defined as

$$ {h}_2(g)=\sum_{m=1}^M\sum_{n=1}^N\left(1+{I}^{\prime}\left(m,n\right)\right)\delta \left(I\left(m,n\right)-g\right) $$
(8)

The histogram and histon of a medical image can be correlated with the concept of approximation space in rough set theory. The histogram value at the gth intensity counts the pixels that definitely belong to the class of intensity g and can therefore be considered the lower approximation, while the histon value at the gth intensity counts all the pixels that belong to the class of similar color and may therefore be considered the upper approximation. The rough measure can then be defined as:

$$ \rho (g)=1-\frac{h_1(g)}{h_2(g)},\quad 1\le g\le L $$
(9)
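To make Eqs. (5)–(9) concrete, the following Python sketch (ours) computes the histogram, histon and rough measure of a grayscale image. It assumes the distance d in Eq. (6) is the absolute intensity difference and treats the similarity threshold T 0 and the neighborhood size as free parameters; the default T 0 below is a hypothetical choice, not taken from the paper.

```python
import numpy as np

def histogram_histon_rough(I, L=256, T0=None, radius=1):
    """Histogram h1 (Eq. 5), histon h2 (Eq. 8) and rough measure rho (Eq. 9).

    Assumptions of this sketch: I is a grayscale image with integer values
    in [0, L-1] (the paper indexes gray levels 1..L); the distance d in
    Eq. (6) is the absolute intensity difference; the neighborhood is a
    (2*radius+1) x (2*radius+1) window with replicated borders.
    """
    I = np.asarray(I, dtype=np.int64)
    M, N = I.shape
    h1 = np.bincount(I.ravel(), minlength=L).astype(float)  # Eq. (5)

    # Total distance d_T(m,n) over the neighborhood, Eq. (6).
    pad = np.pad(I, radius, mode='edge')
    dT = np.zeros((M, N))
    for dm in range(-radius, radius + 1):
        for dn in range(-radius, radius + 1):
            shifted = pad[radius + dm:radius + dm + M,
                          radius + dn:radius + dn + N]
            dT += np.abs(shifted - I)

    if T0 is None:
        T0 = dT.mean()  # hypothetical default; the paper leaves T0 as a parameter
    Iprime = (dT < T0).astype(int)  # Eq. (7)

    # Histon, Eq. (8): each pixel contributes 1 + I'(m,n) to its gray-level bin.
    h2 = np.bincount(I.ravel(), weights=(1 + Iprime).ravel(), minlength=L)

    # Rough measure, Eq. (9); empty bins (h2 = 0) are assigned roughness 0.
    rho = np.zeros(L)
    nz = h2 > 0
    rho[nz] = 1.0 - h1[nz] / h2[nz]
    return h1, h2, rho
```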

4 Our method

The segmentation process of our method is divided into three stages, as shown in Fig. 1. First, compute the histogram and histon according to the formulas (5) and (8). Second, apply local polynomial regression to smooth the histogram and histon. Last, find all thresholds of the rough measure to segment the medical image.

Fig. 1 Flowchart of our method

4.1 Smoothness with local polynomial regression

Suppose the histogram value at the gth gray level is h(g); then for one medical image we have L pairs of observations {(1, h(1)), (2, h(2)), ⋯, (L, h(L))}. Assume a model of the form

$$ h(g)=\mu (g)+{\varepsilon}_g\left(1\le g\le L\right) $$
(10)

where μ(g) is an unknown function and ε g is an error term. The errors ε g (1 ≤ g ≤ L) are assumed to be independent and identically distributed with mean 0.

According to Taylor’s theorem, any differentiable function can be approximated locally by a straight line, and a twice-differentiable function can be approximated locally by a quadratic polynomial. Locally around a point x, we assume that μ can be well approximated by a member of a simple class of parametric functions. For a fitting point x, define a bandwidth function b(x) and a smoothing window (x − b(x), x + b(x)). To estimate μ(x), only observations within this window are used. The observations are weighted according to the following formula

$$ {\omega}_g(x)=K\left(\frac{g-x}{b(x)}\right) $$
(11)

where ω g (x) is a weight function that assigns the largest weights to observations close to x. Here, we use the Gaussian kernel weight function

$$ K(u)=\frac{1}{\sqrt{2\pi }}{e}^{\frac{-{u}^2}{2}} $$
(12)

Within the smoothing window, μ(u) is approximated by a polynomial

$$ \mu (u)\approx {a}_0+{a}_1\left(u-x\right)+\frac{1}{2}{a}_2{\left(u-x\right)}^2+\cdots +\frac{a_r}{r!}{\left(u-x\right)}^r $$
(13)

whenever |u − x| < b(x). A compact vector notation for the polynomial is

$$ {a}_0+{a}_1\left(u-x\right)+\frac{1}{2}{a}_2{\left(u-x\right)}^2+\cdots +\frac{a_r}{r!}{\left(u-x\right)}^r=\left\langle a,A\left(u-x\right)\right\rangle $$
(14)

where a is the vector of coefficients and A(·) is the vector of fitting functions. The coefficient vector a can be estimated by minimizing the locally weighted sum of squares:

$$ \sum_{g=1}^L{\omega}_g(x){\left(h(g)-\left\langle a,A\left(g-x\right)\right\rangle \right)}^2 $$
(15)

The local regression estimate of μ(x) is the first component of \( \widehat{a} \). According to standard weighted least squares theory [33], the solution can be written as

$$ \widehat{a}={\left({\Delta}_x^T{W}_x{\Delta}_x\right)}^{-1}{\Delta}_x^T{W}_xH $$
(16)

where \( H={\left(h(1),h(2),\cdots, h(L)\right)}^T \), and

$$ {\Delta}_x=\left(\begin{array}{cccc} 1 & 1-x & \cdots & \frac{{\left(1-x\right)}^r}{r!} \\ {} 1 & 2-x & \cdots & \frac{{\left(2-x\right)}^r}{r!} \\ {} \vdots & \vdots & & \vdots \\ {} 1 & L-x & \cdots & \frac{{\left(L-x\right)}^r}{r!} \end{array}\right) $$
(17)

and W x is an L × L diagonal matrix with the weights along the diagonal, given by w gg (x) = ω g (x). The estimator \( \widehat{\mu}(x) \) is the intercept coefficient \( {\widehat{a}}_0 \) of the local fit, so we can obtain its value from

$$ \widehat{\mu}(x)={\mathrm{e}}_1^T{\left({\Delta}_x^T{W}_x{\Delta}_x\right)}^{-1}{\Delta}_x^T{W}_xH $$
(18)

where \( {\mathrm{e}}_1={\left(1,0,\cdots, 0\right)}^T \) is a vector of dimension (r + 1) × 1. Therefore, we have

$$ \widehat{\mu}(x)=\sum_{i=1}^L{l}_i(x)h(i) $$
(19)

where \( l{(x)}^T={\mathrm{e}}_1^T{\left({\Delta}_x^T{W}_x{\Delta}_x\right)}^{-1}{\Delta}_x^T{W}_x \) and l i (x) denotes the ith component of l(x).

The bandwidth b(x) has a critical effect on the local regression fit. If the bandwidth is too small, insufficient data fall within the smoothing window, and a noisy fit, i.e. large variance, will result. On the other hand, if the bandwidth is too large, the local polynomial may not fit the data well within the smoothing window, and important features of the mean function μ(x) may be distorted or lost completely; that is, the fit will have large bias. The bandwidth must be chosen to balance this bias-variance trade-off. We use Bowman and Azzalini’s bandwidth selection method [4]:

$$ {b}^{*}={\left(\frac{4}{3L}\right)}^{1/5}\sigma $$
(20)

where σ is the standard deviation of H.

The degree of the local polynomial used in the formula (13) also affects the bias-variance trade-off. A high polynomial degree can always provide a better approximation to the underlying mean than a low polynomial degree, so fitting a high-degree polynomial will usually lead to an estimate \( \widehat{\mu}(x) \) with less bias. But high-order polynomials have large numbers of coefficients to estimate, and the result is greater variability in the estimate. It often suffices to choose a low-degree polynomial and concentrate on choosing the bandwidth to obtain a satisfactory fit. The most common choices are local linear and local quadratic [9]. In this paper, we choose the local quadratic, r = 2.

In summary, the local polynomial regression of the histogram is computed by the following algorithm.

  • Algorithm 1: Local Polynomial Regression of the Histogram

  • Input: Medical image I;

  • Output: Smoothed histogram \( {\widehat{h}}_1(g) \);

    1. According to the formula (5), compute the histogram h 1 and set H = (h 1(1), h 1(2), ⋯, h 1(L));

    2. According to the formula (20), compute the bandwidth b* for the data H;

    3. Set the polynomial degree r = 2;

    4. For (g = 1; g ≤ L; g + +) {

    5. With the fitting point x = g, compute W x according to the formula (11) and Δ x according to the formula (17);

    6. According to the formula (19), compute the estimator \( \widehat{\mu}(g) \) with b* and H;

    7. Output the smoothed histogram value \( {\widehat{h}}_1(g)=\widehat{\mu}(g) \); }

After computing the histon according to the formula (8), we also smooth it with Algorithm 1 and denote the result by \( {\widehat{h}}_2(g) \).
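The following Python sketch (ours) implements Algorithm 1 directly from Eqs. (11)–(20); the same routine can be applied to the histon to obtain \( {\widehat{h}}_2(g) \). Function and variable names are illustrative only.

```python
import math
import numpy as np

def lpr_smooth(h, degree=2):
    """Local polynomial regression smoothing of a histogram (Algorithm 1).

    A sketch of Eqs. (11)-(20): Gaussian kernel weights, a local quadratic
    fit (r = 2) by default, and the bandwidth b* = (4/(3L))^(1/5) * sigma
    of Eq. (20), with sigma taken as the standard deviation of the data H.
    """
    H = np.asarray(h, dtype=float)
    L = H.size
    g = np.arange(1, L + 1, dtype=float)    # gray levels 1..L as in the paper
    b = (4.0 / (3.0 * L)) ** 0.2 * H.std()  # Eq. (20)

    mu_hat = np.empty(L)
    for i, x in enumerate(g):
        w = np.exp(-0.5 * ((g - x) / b) ** 2)  # Eqs. (11)-(12)
        # Design matrix Delta_x of Eq. (17): columns (g - x)^j / j!
        Delta = np.column_stack([(g - x) ** j / math.factorial(j)
                                 for j in range(degree + 1)])
        W = np.diag(w)
        # Weighted least squares solution of Eq. (16); the intercept is
        # the local estimate mu_hat(x) of Eq. (18).
        a = np.linalg.solve(Delta.T @ W @ Delta, Delta.T @ W @ H)
        mu_hat[i] = a[0]
    return mu_hat
```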

4.2 Medical image segmentation with rough measure

With the smoothed histogram and histon, we can compute the smoothed rough measure:

$$ \widehat{\rho}(g)=1-\frac{{\widehat{h}}_1(g)}{{\widehat{h}}_2(g)},\quad 1\le g\le L $$
(21)

The value of the roughness is large (i.e., close to 1) when the value of the smoothed histon is large in comparison with the value of the smoothed histogram. This situation typically occurs inside an object region, where there is very little variation in the pixel intensities. The variation in pixel intensities is always greater near the boundary between two objects, which leads to a small (i.e., close to 0) value of roughness. Thus, the peaks and valleys in \( \widehat{\rho}(g) \) can be used to segment the medical image. Because all peaks in the smoothed rough measure are significant, we need not examine a peak’s sharpness or area to identify the dominating peaks. The local polynomial regression of the histogram and histon reduces the effect of image noise and radical variation. Our segmentation algorithm can be described as follows:

  • Algorithm 2: Medical Image Segmentation with Rough Measure

  • Input: Smoothed histogram \( {\widehat{h}}_1(g) \) and histon \( {\widehat{h}}_2(g) \);

  • Output: Segmented regions O 1, O 2, ⋯, O |V|+1;

    1. According to the formula (21), compute the smoothed rough measure \( \widehat{\rho}(g) \);

    2. Identify all peaks using the following formula:

      $$ P=\left\{g\mid \left(\widehat{\rho}(g)>\widehat{\rho}\left(g-1\right)\right)\ \mathrm{and}\ \left(\widehat{\rho}(g)>\widehat{\rho}\left(g+1\right)\right)\right\} $$
      (22)

      where P is the set of peaks identified over the gray levels g;

    3. Identify all valleys using the following formula:

      $$ V=\left\{g\mid \left(\widehat{\rho}(g)<\widehat{\rho}\left(g-1\right)\right)\ \mathrm{and}\ \left(\widehat{\rho}(g)<\widehat{\rho}\left(g+1\right)\right)\right\} $$
      (23)

      where V is the set of valleys identified over the gray levels g;

    4. Remove spurious peaks and valleys based on the following rule: if g is a peak and \( \widehat{\rho}\left(g+1\right)\ne \widehat{\rho}\left(g-1\right) \), then P = P − {g}; if g is a valley and \( \widehat{\rho}\left(g+1\right)\ne \widehat{\rho}\left(g-1\right) \), then V = V − {g};

    5. Sort the valley set V in ascending order as {v 1, v 2, ⋯, v |V|};

    6. Segment the medical image I according to the gray intervals constructed from neighbouring thresholds in the sorted V, that is, [1, v 1], [v 1, v 2], ⋯, [v |V|, L];

    7. Output the segmentation results O 1, O 2, ⋯, O |V|+1.
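A minimal Python sketch (ours) of Algorithm 2 follows. It implements the valley detection of Eq. (23) and the interval labelling of step (6); step (4)’s removal rule is omitted, and the half-open intervals are a design choice of this sketch, not of the paper.

```python
import numpy as np

def segment_by_rough_measure(I, rho_hat):
    """Sketch of Algorithm 2: threshold the image at the valleys of the
    smoothed rough measure.

    Assumptions: gray levels are 0..L-1 here (the paper uses 1..L);
    half-open intervals ensure each gray level belongs to exactly one
    region.
    """
    r = np.asarray(rho_hat, dtype=float)
    L = r.size
    # Valleys, Eq. (23): strictly lower than both neighbours.
    V = [g for g in range(1, L - 1) if r[g] < r[g - 1] and r[g] < r[g + 1]]

    # Step (6): split the gray range at the sorted valleys and label pixels.
    edges = [-1] + sorted(V) + [L - 1]
    labels = np.zeros(I.shape, dtype=int)
    for k in range(len(edges) - 1):
        labels[(I > edges[k]) & (I <= edges[k + 1])] = k + 1  # region O_{k+1}
    return labels, V
```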

5 Experiment validation

In this section, we present experimental results comparing the segmentation performance of the HistoGram (HG) method, the HisTon (HT) method, the Rough Set with Histogram and Histon (Milind) (RSHH) method, the RSHH method with the Moving Average smoothing of Tan [32] (RSHH-MA), and our Rough Set with Local Polynomial Regression (RSLPR) method. The algorithms are implemented in MATLAB 2009 and tested on simulated brain MRI images, whose ground truths are known, and on real abdomen CT images. All experiments were performed on a PC with a 1.73 GHz Intel processor and 1024 MB of RAM. The Dice Similarity Measure (DSM) [37] is used as the performance index; it is defined as:

$$ DSM(r)=2{N}_{p\cap g}(r)/\left({N}_p(r)+{N}_g(r)\right) $$
(24)

where N p ∩ g (r) denotes the number of pixels classified by both the proposed method and the ground truth as class r, and N p (r) and N g (r) represent the number of pixels classified as class r by the proposed method alone and by the ground truth alone, respectively. The DSM index attains the value 1 if the proposed method coincides with the ground truth, and decreases towards 0 as the quality of the segmentation deteriorates. Typically, a value of DSM > 0.7 indicates excellent agreement between the two segmentations [11].
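For reference, Eq. (24) translates directly into a few lines of Python (assuming integer label images):

```python
import numpy as np

def dsm(pred, gt, r):
    """Dice Similarity Measure of Eq. (24) for class label r.

    pred and gt are integer label images of the same shape; returns 1.0
    when both segmentations are empty for class r (a convention of this
    sketch).
    """
    p = (pred == r)
    g = (gt == r)
    denom = p.sum() + g.sum()
    return 2.0 * (p & g).sum() / denom if denom else 1.0
```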

5.1 Simulated brain MRI image segmentation

The brain images from BrainWeb [1] are used to evaluate our algorithm. The Simulated Brain Database (SBD) contains a set of realistic MRI data volumes produced by an MRI simulator. Currently, the SBD contains simulated brain MRI data based on two anatomical models: normal and multiple sclerosis (MS). For both of these, full 3-dimensional data volumes have been simulated using three sequences (T1-, T2-, and proton-density (PD)-weighted) and a variety of slice thicknesses, noise levels, and levels of intensity non-uniformity.

The digital brain phantom has a spatial resolution of 1 mm³, dimensions of 181 × 217 × 181, and start coordinates (x, y, z) = (−90, −126, −72). The discrete anatomical model used in this paper consists of a class label (integer) at each voxel, representing the tissue which contributes the most to that voxel (0 = Background, 1 = CerebroSpinal Fluid (CSF), 2 = Grey Matter (GM), 3 = White Matter (WM), 4 = Fat, 5 = Muscle/Skin, 6 = Skin, 7 = Skull, 8 = Glial Matter, 9 = Connective).

In this paper, we use the simulated normal images with modality = T1, noise = 0 %, 3 %, 5 %, 7 % and 9 %, and intensity non-uniformity = 20 %. Unsigned byte data is used for each voxel, and the data is scaled so that it spans the entire [0, 1, …, 255] range of values.

We choose the 90 images Nos. 2, 4, 6, …, 180 from the 181 images for the experiments. Some example images are shown in Fig. 2. For comparative purposes, the segmentation objects are restricted to three components, CSF, WM and GM, for all methods. The ground truths of CSF, WM and GM for Fig. 2(d) and (g) are shown in Fig. 3. The noised versions of Fig. 2(a) with different noise levels are shown in Fig. 4.

Fig. 2 The BrainWeb images for different axial slices

Fig. 3 The ground truth of GM, WM and CSF of slices No. 60 and No. 140

Fig. 4 No. 60 with different noise levels

The histograms and histons of slice No. 60 with 0 %, 3 %, 5 %, 7 % and 9 % noise are shown in Figs. 5 and 6, respectively. From Figs. 5 and 6, we can see that the shapes and trends of the histogram and histon are very similar; the difference between them is that the values of the histon are larger than those of the histogram in the same region. As the noise level increases, the number of peaks gradually decreases. Neither the histogram nor the histon is smooth, and both have many local peaks and valleys, so it is difficult to find the real thresholds for image segmentation.

Fig. 5 Histograms of Fig. 2(a) with different noise levels: a 0 % noise b 3 % noise c 5 % noise d 7 % noise e 9 % noise

Fig. 6 Histons of Fig. 2(a) with different noise levels: a 0 % noise b 3 % noise c 5 % noise d 7 % noise e 9 % noise

The rough measure computed from the histogram and histon of Fig. 2(a) is shown in Fig. 7. Compared with the histogram and histon in Figs. 5 and 6, the rough measure has better discriminating ability at the different noise levels. However, the rough measures in Fig. 7 also have many local peaks and valleys, so it is still difficult to find the real thresholds for image segmentation.

Fig. 7 Rough measures with histogram and histon of Fig. 2(a) with different noise levels: a 0 % noise b 3 % noise c 5 % noise d 7 % noise e 9 % noise

The smoothed histogram and histon for No. 60 obtained with Local Polynomial Regression (LPR) are shown in Fig. 8(a) and (b), respectively. Compared with Figs. 5 and 6, the histogram and histon in Fig. 8 are very smooth, yet they preserve the shapes and trends of Figs. 5 and 6, which shows that the LPR is effective. The rough measure computed from the LPR-smoothed histogram and histon is shown in Fig. 8(c); from it, the thresholds for image segmentation are easy to find.

Fig. 8 Smoothing with LPR and MA for No. 60 with 0 % noise: a LPR smoothed histogram b LPR smoothed histon c rough measure with (a) and (b) d MA smoothed histogram e MA smoothed histon f rough measure with (d) and (e)

It is well known that Moving Average (MA) smoothing requires an appropriate window size to obtain good results. For image No. 60, we found the best window size, 3 × 3, by experiment. The MA-smoothed histogram and histon for No. 60 are shown in Fig. 8(d) and (e), respectively. Compared with Fig. 8(a) and (b), the MA-smoothed histogram and histon are not smooth. The rough measure computed from the MA-smoothed histogram and histon is shown in Fig. 8(f); from it, the thresholds for image segmentation are still difficult to find.

The numbers of thresholds found by the HG, HT, RSHH, RSHH-MA and RSLPR methods are shown in Table 1.

Table 1 The primitive and final numbers of thresholds

From Table 1, we know that the HG, HT and RSHH methods produce too many thresholds, which leads to the over-segmentation problem. Therefore, we use the strategy of Milind [21] to reduce the spurious thresholds of the histogram, histon and rough measure. That is:
  (a) If the height of a peak is greater than 20 % of the average value of all peaks, the peak is significant;

  (b) If the distance between two peaks is greater than 10, the peaks are significant;

  (c) Clusters with fewer pixels than 0.1 % of the total number of pixels in the image are merged with the nearest clusters;

  (d) The two closest regions are combined to form a single region based on a predefined distance of 20.

The segmentation results of CSF, WM and GM for Fig. 2(a) by the HG method are shown in Fig. 9(a), (b) and (c), respectively. From Fig. 9, we can see that the HG method suffers from over-segmentation, especially for the WM. On the one hand, the histogram is not smooth and some thresholds are found wrongly; on the other hand, the histogram lacks spatial information. The segmentation results of CSF, WM and GM for Fig. 2(a) by the HT method are shown in Fig. 10(a), (b) and (c), respectively. Compared with the ground truth in Fig. 3 and the segmentation results in Fig. 9, the HT method gives better results than the HG method, especially for the WM, mainly because the HT method uses both intensity and spatial information. The segmentation results of CSF, WM and GM for Fig. 2(a) by the RSHH method are shown in Fig. 11(a), (b) and (c), respectively. Compared with Figs. 3, 9 and 10, the RSHH method gives better results than the HG and HT methods. The main reason is that the rough measure is more suitable than the histogram and histon for handling the uncertain information in the image. The segmentation results of CSF, WM and GM for Fig. 2(a) by the RSHH-MA method are shown in Fig. 12(a), (b) and (c), respectively, and those by our RSLPR method in Fig. 13(a), (b) and (c), respectively. Our method finds the thresholds of the smoothed rough measure adaptively. Compared with Figs. 3, 9, 10, 11 and 12, our RSLPR method gives better results than the RSHH and RSHH-MA methods, especially for the CSF and GM.

Fig. 9 Segmentation CSF, WM and GM results for No. 60 by the HG method

Fig. 10 Segmentation CSF, WM and GM results for No. 60 by the HT method

Fig. 11 Segmentation CSF, WM and GM results for No. 60 by the RSHH method

Fig. 12 Segmentation CSF, WM and GM results for No. 60 without noise by the RSHH-MA method

Fig. 13 Segmentation CSF, WM and GM results for No. 60 without noise by the RSLPR method

For Fig. 2(a) with 3 %, 5 %, 7 % and 9 % noise, the rough measures computed from the LPR-smoothed histogram and histon are shown in Fig. 14(a), (b), (c) and (d), respectively. For Fig. 2(a) with 3 %, 5 %, 7 % and 9 % noise, the rough measures computed from the MA-smoothed histogram and histon are shown in Fig. 15(a), (b), (c) and (d), respectively.

Fig. 14 Rough measure with LPR smoothed histogram and histon of Fig. 2(a) with different noise levels: a 3 % noise b 5 % noise c 7 % noise d 9 % noise

Fig. 15 Rough measure with MA smoothed histogram and histon of Fig. 2(a) with different noise levels: a 3 % noise b 5 % noise c 7 % noise d 9 % noise

The segmentation results of CSF, WM and GM for Fig. 4(e) by the HG method are shown in Fig. 16(a). It is obvious that the HG method is sensitive to noise, especially for the GM. The segmentation results by the HT method are shown in Fig. 16(b); the HT method is also sensitive to noise. The segmentation results by the RSHH method are shown in Fig. 16(c). Comparing Fig. 16(a), (b) and (c), we can see that the RSHH method gives better results than the HG and HT methods. The segmentation results by the RSHH-MA method are shown in Fig. 16(d), and those by the RSLPR method in Fig. 16(e). From Fig. 16, we can see that our RSLPR method is the least affected by noise.

Fig. 16 Segmentation results for No. 60 with 9 % noise by the five methods

For MRI brain image segmentation, atlas-based segmentation is one of the most popular approaches. We therefore also compare our method with Souplet’s [31] atlas-based segmentation method, which is made available as part of the SepINRIA (http://www-sop.inria.fr/asclepios/software/SepINRIA/) application. Some atlases of CSF, WM and GM from SepINRIA are shown in Fig. 17.

Fig. 17 CSF, WM and GM atlases (from left to right) in the SepINRIA method

For the 90 images with 0 %, 3 %, 5 %, 7 % and 9 % noise levels, the average DSM indices of the CSF, WM and GM obtained by the HG, HT, RSHH, RSHH-MA, RSLPR and SepINRIA methods are shown in Fig. 18(a), (b) and (c), respectively. For the 90 images, we found the best window size, 3 × 3, by experiment for the RSHH-MA. The RSHH has better results than the HT and HG; the RSHH-MA and RSLPR have better results than the RSHH; and the RSLPR has better results than the RSHH-MA and SepINRIA. This shows that the rough measure is more suitable for the uncertain information of medical images than the histogram and histon. The LPR and MA methods can both effectively suppress the disturbance caused by noise, but our LPR gives better smoothing results than the MA.

Fig. 18 Average DSM indices of CSF, WM and GM by the 6 methods for 90 images with different noise levels: a indices for CSF b indices for WM c indices for GM

5.2 Real abdomen CT image segmentation

Real CT abdomen images from the Affiliated Hospital of Jiangsu University in China were also tested. We collected about 16 GB of real CT images from the hospital, and in this paper we choose 20 real CT abdomen images to segment. Some of these images, with size of 512 × 512, are shown in Fig. 19. The ground truths of the liver, spleen and right kidney in those 20 images were manually segmented by Tian L.Y., a radiologist of the Affiliated Hospital of Jiangsu University.

Fig. 19 Some real CT abdomen images

The histogram, histon and rough measure of Fig. 19(c) are shown in Fig. 20(a), (b) and (c), respectively. The LPR-smoothed histogram, histon and rough measure are shown in Fig. 20(d), (e) and (f), respectively, and the MA-smoothed histogram, histon and rough measure in Fig. 20(g), (h) and (i), respectively. From Fig. 20, we can see that both the LPR and the MA estimate the histogram and histon correctly. However, the thresholds of the LPR-smoothed rough measure are found easily, whereas it is difficult to find the thresholds of the MA-smoothed rough measure. Comparing Fig. 20(c), (f) and (i), we can see that the LPR is effective.

Fig. 20 Histogram, histon and rough measure of Fig. 19(c), and their LPR- and MA-smoothed versions: a histogram b histon c rough measure d LPR smoothed histogram e LPR smoothed histon f rough measure with (d) and (e) g MA smoothed histogram h MA smoothed histon i rough measure with (g) and (h)

The ground truth and the segmentation results of the right kidney in Fig. 19(c) by the HG, HT, RSHH, RSHH-MA and RSLPR methods are shown in Fig. 21(a), (b), (c), (d), (e) and (f), respectively. From Fig. 21, we can see that the right kidney is over-segmented by the HG and HT methods, and that the RSHH-MA and RSLPR give better results than the others. The ground truth and the segmentation results of the liver in Fig. 19(c) by the HG, HT, RSHH, RSHH-MA and RSLPR methods are shown in Fig. 22(a), (b), (c), (d), (e) and (f), respectively. The average DSM indices of the 5 methods for the 20 images are shown in Table 2. From Table 2, we can see that the DSM index of the right kidney is the lowest, due to its complicated shape. The segmentation results show that our RSLPR method is better than the other methods. The main reason is that the local polynomial regression reduces the effect of noise and smooths the rough measure.

Fig. 21 Ground truth and segmentation results of the right kidney in Fig. 19(c) by the 5 methods

Fig. 22 Ground truth and segmentation results of the liver in Fig. 19(c) by the 5 methods

Table 2 Average segmentation performance indices of the 5 methods for the 20 images

6 Conclusions

In this paper, we proposed a new medical image segmentation method based on rough set theory and a local polynomial regression model. The proposed method is a variant of histogram-based thresholding segmentation. Our method uses local polynomial regression to smooth the histogram, taken as the lower approximation of the rough set, and the histon, taken as the upper approximation, and then applies multimodal thresholding to the resulting rough measure to segment the medical image. Experimental results show that our method obtains more realistic multimodal threshold values and thus achieves better segmentation results than the HG, HT and RSHH methods. First, our local polynomial regression model yields a smoothed rough measure from which the real peaks and valleys can be found to segment the medical image. Second, the local polynomial regression reduces the effect of noise, so the thresholds can be found correctly. Third, the rough measure uses both intensity and spatial information, which makes it more suitable for medical image processing than the histogram and histon alone.

Because the rough set framework is well suited to medical images, in future work we will study edge extraction and denoising using rough sets. We will also consider medical image segmentation with the rough set and local polynomial regression model on the 2D histogram and histon.