1 Introduction

Cancer is a disease characterized by uncontrolled growth and division of cells. Among the different types of cancer, brain cancer is one of the deadliest and puts people of all ages and genders at risk (Brain Tumor 2019; Central Brain Tumor Registry of the United States (CBTRUS) Fact Sheet 2020). Some brain tumors are slow-growing and benign, but malignant tumors grow fast and therefore cause rapid deterioration. That is why it is important to obtain a thorough and accurate diagnosis of a brain tumor as early as possible. The choice of treatment for brain tumors depends on a variety of factors such as the type, location, and size of the tumor, along with the patient’s age and general health. The effectiveness of the various treatments largely depends on the time of detection and the accuracy of the information acquired about the tumor under investigation. Early detection has been found to increase the chances of complete recovery. Moreover, a detailed description of the regions of interest and their accurate interpretation enable physicians to properly diagnose the disease and determine the treatment procedure. Various studies have found that malignant and benign tumors in the brain are morphologically different (Das and Das 2020). Figure 1 shows that malignant tumors tend to show more irregularity along the edges than benign ones. Many medical practitioners consider these changes in tumor shape to be evidence of malignancy.

Fig. 1 Changes in tumor shape

As the quality of an image affects its interpretation, medical images should be enhanced to give the clearest possible visualization. Image enhancement techniques have been widely adopted in applications of image processing where the quality of the images must be improved. For example, one can remove noise from, sharpen, or brighten an image, making it easier to identify key features like edges, corners, and curvatures. The more general methods of image enhancement are filtering with morphological operators (Kimori 2011), histogram equalization (Singh and Dixit 2015), noise removal using a Wiener filter (King et al. 1983), linear contrast adjustment (Tsai and Yeh 2008), median filtering (Tong and Neuvo 1994), unsharp mask filtering (Levi 1974), contrast-limited adaptive histogram equalization (CLAHE) (Stark 2000), and decorrelation stretch. Beyond these methods, over the last decade, enhancement using the Gabor filter (Mehrotra et al. 1992; Rangayyan et al. 2008) and the log-Gabor filter (Wang et al. 2008; Yao et al. 2006) has yielded the most promising results. Enhancement is followed by image segmentation, which aims at separating the suspected region from its imprecise surroundings. Researchers have introduced many effective segmentation techniques, among which K-means clustering (Zalik 2008) and the fuzzy c-means (FCM) algorithm (Cai et al. 2007; Gueorguieva et al. 2017) are widely used. Since our primary concern is to determine the class of the tumor after detection, the segmented image of the brain tumor is fed to a classifier that uses feature values extracted from the segmented image to classify it as benign or malignant.

In this regard, the present work describes a way to integrate Gabor filtering with the FCM algorithm to generate a highly enhanced and precisely segmented image of a brain tumor or lesion from an input MRI or CT scan. The tumor is then classified with the help of shape-based features extracted from the segmented image.

In the rest of the work, the proposed approach is discussed in Sect. 2. Section 3 shows some experimental results. Finally, Sect. 4 draws some conclusions.

2 Proposed Approach

We have applied Gabor filters to MRI (T1, T2, GAD, PD) and CT images of the brain to enhance features such as the boundaries of the affected regions. This is followed by complete separation of the region of interest (ROI) from its ambiguous surroundings using the FCM algorithm. The segmented images thus obtained show precisely the contour of the tumor, which is studied further through feature extraction. The resulting feature values are then used to train a classifier that assigns the tumor to the benign or malignant class.

2.1 Image Enhancement

For decades, research has sought to discover and improve ways of enhancing the affected area, since better enhancement ultimately increases the accuracy of its segmentation. In this effort, attention was drawn to the significance of the different frequency components of an image. In an image, the low frequencies correspond to slowly varying intensity components, and the high frequencies are caused by sharp transitions in intensity, such as edges and noise (Gonzalez and Woods 2008). Attempts at smoothing or sharpening an image therefore introduced the mechanism of filtering, which has been a preliminary part of image processing ever since. A lowpass filter, which passes only the low frequencies, blurs the image, whereas a highpass filter, which passes only the high frequencies, sharpens the image at the cost of admitting more noise. Extensive studies have indicated that an efficient way of acquiring an image with sharp, distinct edges (suppressed low frequencies) and reduced noise (suppressed high frequencies) is to apply a bandpass filter. Gabor filters are orientation- and frequency-sensitive bandpass filters used for edge and texture analysis. One of the reasons Gabor filters (Mehrotra et al. 1992; Rangayyan et al. 2008) have gained popularity is their ability to detect edges of various orientations. The response of a Gabor filter is strong if the orientation of the filter matches the orientation of the edges to be detected in the image. The impulse response function (IRF) of a real-valued 2-D Gabor filter is given by:

$$g_{\lambda \theta \psi \sigma \gamma } (x,y) = \exp \left( -\frac{x^{\prime 2} + \gamma^{2} y^{\prime 2}}{2\sigma^{2}} \right)\cos \left( {\frac{{2\pi x^{\prime } }}{\lambda} + \psi } \right)$$
(1)
$$x^{\prime } = x\cos (\theta ) + y\sin (\theta )$$
(2)
$$y^{\prime } = y\cos (\theta ) - x\sin (\theta )$$
(3)

where the arguments x and y specify the position of a light impulse in the visual field and σ, γ, λ, θ and ψ are the parameters defined below:

  • σ specifies the standard deviation of the Gaussian function, which controls its width.

  • γ is the aspect ratio, which specifies the ellipticity of the Gaussian factor. Typical values lie between 0.2 and 1. The kernel is circular when the value is 1.

  • λ specifies the wavelength of the cosine factor of the Gabor function. The wavelength is given in pixels. Valid values are real numbers between 2 and 256.

  • θ specifies the orientation of the normal to the parallel stripes of the Gabor function. The orientation is given in degrees. Valid values are real numbers between 0 and 180.

  • ψ specifies the phase offset of the cosine factor of the Gabor function. It is given in degrees. Valid values are real numbers between −180 and 180.

Gabor filters are applied in the same manner as other conventional filters. The filter is represented by an array of values (usually a 2-D array, since 2-D images are involved) known as a ‘mask’ or ‘convolution kernel’. Each element of this array is assigned a value, and a convolution is performed between the kernel and the image as the kernel slides over every pixel of the image. The output is an image whose edges and boundaries are more distinct, which helps in analyzing the region of interest. In practice, to analyze texture or obtain features from an image, a bank of Gabor filters with a number of different orientations can be used. In the present work, we have implemented a bank of 16 Gabor filters to generate highly enhanced images of the brain tumors by capturing their outlines precisely from 16 different orientations. Figure 2 shows the image of a brain tumor when seen through a Gabor filter from 16 different orientations (θ = 0°, 11.25°, 22.5°, 33.75°, 45°, 56.25°, 67.5°, 78.75°, 90°, 101.25°, 112.5°, 123.75°, 135°, 146.25°, 157.5°, 168.75°).
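As a minimal sketch of how such a bank could be built, the following Python code samples Eqs. (1)–(3) on a square grid and convolves an image with kernels at 16 orientations spaced 11.25° apart. It is kept to the standard library for self-containment. The kernel size, the values of σ, γ and λ, the zero padding at the border, and the rule used to combine the 16 responses (per-pixel maximum of the absolute response) are all assumptions made for illustration and are not taken from the present work.

```python
import math


def gabor_kernel(size, sigma, gamma, lam, theta_deg, psi_deg=0.0):
    """Sample the real Gabor function of Eqs. (1)-(3) on a size x size grid (size odd)."""
    theta, psi = math.radians(theta_deg), math.radians(psi_deg)
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            x_r = x * math.cos(theta) + y * math.sin(theta)   # Eq. (2)
            y_r = y * math.cos(theta) - x * math.sin(theta)   # Eq. (3)
            envelope = math.exp(-(x_r ** 2 + gamma ** 2 * y_r ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * x_r / lam + psi))
        kernel.append(row)
    return kernel


def convolve2d(image, kernel):
    """Plain 2-D convolution with zero padding; the output has the image's size."""
    h, w, half = len(image), len(image[0]), len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    ii, jj = i - di, j - dj
                    if 0 <= ii < h and 0 <= jj < w:
                        total += image[ii][jj] * kernel[di + half][dj + half]
            out[i][j] = total
    return out


# 16 orientations spaced 180/16 = 11.25 degrees apart: 0, 11.25, ..., 168.75
ORIENTATIONS = [k * 180.0 / 16 for k in range(16)]


def gabor_bank_response(image, size=9, sigma=2.0, gamma=0.5, lam=4.0):
    """Per-pixel maximum absolute response over the bank (combination rule assumed)."""
    best = [[0.0] * len(image[0]) for _ in image]
    for theta in ORIENTATIONS:
        resp = convolve2d(image, gabor_kernel(size, sigma, gamma, lam, theta))
        best = [[max(b, abs(r)) for b, r in zip(best_row, resp_row)]
                for best_row, resp_row in zip(best, resp)]
    return best
```

With these illustrative parameters, a one-pixel-wide vertical line produces a much stronger response from the θ = 0° kernel than from the θ = 90° kernel, which is the orientation selectivity described above.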

Fig. 2 Output of filtering through 16 Gabor filters

2.2 Image Segmentation

In the second stage of our proposed methodology, we have applied the widely used fuzzy c-means (FCM) algorithm to separate the suspected region from its imprecise background. FCM is an unsupervised soft-clustering technique that replaces the concept of a crisp boundary with a degree of membership (Gueorguieva et al. 2017). The membership value (varying between 0 and 1) is found from the distance between each data point and the cluster centres. The membership value is higher for a data point that is closer to a cluster centre, indicating a higher likelihood that it belongs to that cluster. After a suitable number of clusters is selected empirically, the FCM algorithm preserves the target cluster (representing the suspected region) and suppresses the others, separating the target from the imprecise surrounding regions.
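The clustering step can be illustrated with a minimal sketch of the standard FCM update rules applied to a flat list of pixel intensities. The fuzzifier m = 2, the evenly spread initial centres, the stopping tolerance, and the choice of the brightest cluster as the target are assumptions made for illustration; the present work does not state these settings.

```python
def fuzzy_c_means(values, n_clusters=2, m=2.0, max_iter=100, tol=1e-6):
    """Cluster 1-D intensities; returns (centres, memberships[i][j])."""
    lo, hi = min(values), max(values)
    # Deterministic start (assumed): centres spread evenly over the intensity range.
    centres = [lo + (hi - lo) * (j + 0.5) / n_clusters for j in range(n_clusters)]
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(max_iter):
        memberships = []
        for v in values:
            # A small offset avoids division by zero when v coincides with a centre.
            dists = [abs(v - c) + 1e-12 for c in centres]
            # Closer centres receive higher membership; each row sums to 1.
            row = [1.0 / sum((d_j / d_k) ** power for d_k in dists) for d_j in dists]
            memberships.append(row)
        new_centres = []
        for j in range(n_clusters):
            weights = [row[j] ** m for row in memberships]
            new_centres.append(sum(w * v for w, v in zip(weights, values)) / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift < tol:
            break
    return centres, memberships


def target_mask(centres, memberships):
    """Keep pixels whose strongest membership is the brightest cluster (assumed target)."""
    target = centres.index(max(centres))
    return [row.index(max(row)) == target for row in memberships]
```

For a 2-D image, the pixel intensities would first be flattened into a list and the resulting mask reshaped back to the image dimensions.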

2.3 Feature Extraction

The complete segmentation of the tumor is followed by the extraction of various features of the suspected region, on the basis of which the tumor can be classified as either benign or malignant. Feature extraction is a critical step in image processing, since different features of an image provide a better description of the ROI and thereby make the identification of the affected region more accurate. The morphological changes seen in a tumor as it tends toward malignancy have encouraged researchers to study various shape-based features like compactness, eccentricity, perimeter, area, solidity, convex area, etc. Besides these conventional features, one of the most widely used approaches is extrema-based characterization (Das and Das 2020). In the present study, we have emphasized the use of extrema-based features for the characterization of brain tumors. As mentioned earlier, the contours of malignant tumors have more irregularities than those of benign ones, and the extrema (e) determine the extent of concavity/convexity of the tumor profile. Figure 3 shows that in a benign tumor the length of the radius vectors varies negligibly because of its smooth boundary. In a malignant tumor, however, the variation in the lengths of the radius vectors is quite significant owing to the spiculations present along its boundary. An extremum is a radius vector of maximum or minimum length. The concept of extrema (e) is applied in the development of various shape-dependent features, which have led to successful detection of malignancy in a tumor.

Fig. 3 Variation in radius vector lengths in (a) benign and (b) malignant tumor

Given below are various characterizations of e, namely the extrema count, extrema difference, extrema entropy, extrema variance, extrema acutance, and r.m.s. value of extrema (Das and Das 2020), that enable us to distinguish between a benign and a malignant tumor.

$${\text{Total extrema count}}\,\left( {{\varvec{e}}_{{{\varvec{count}}}} } \right) = K$$
(4)
$${\text{Extrema difference}}\,\left( {{\varvec{e}}_{{{\varvec{diff}}}} } \right) = e_{\text{max}} - e_{\text{min}}$$
(5)
$${\text{Extrema entropy}}\,\left( {{\varvec{e}}_{{{\varvec{ent}}}} } \right) = - \sum\limits_{k = 1}^{K} {e(k)\log e(k)}$$
(6)

where K is the total number of extrema.

$${\text{Extrema variance}}\,\left( {{\varvec{e}}_{{{\varvec{var}}}} } \right) = \frac{1}{K}\sum\limits_{k = 1}^{K} {[e(k) - e_{m}]^{2} }$$
(7)

where

$$e_{m} = {\text{extrema mean}} = \frac{1}{K}\sum\limits_{k = 1}^{K} {[e(k)]}$$
$${\text{Extrema acutance}}\,\left( {{\varvec{e}}_{{{\varvec{ac}}}} } \right) = \frac{{e_{\max } - e_{\text{min}} }}{K}\sum\limits_{k = 1}^{K} {[e(k)]}$$
(8)
$${\text{Extrema r.m.s. value}}\,\left( {{\varvec{e}}_{{{\varvec{r.m.s.}}} }} \right) = \sqrt {\frac{1}{K}\sum\limits_{k = 1}^{K - 1} {[e(k) - e(k + 1)]^{2} } }$$
(9)

Unlike benign tumors, which have smooth boundaries, malignant tumors have a higher number of marginal spiculations and therefore tend to yield higher values of the features applied here.
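Eqs. (4)–(9) can be evaluated directly once the extrema lengths e(1), …, e(K) are available. The sketch below assumes they are supplied as a list of positive numbers, uses the natural logarithm in the entropy term, and applies Eq. (6) to the raw values exactly as written; the logarithm base and any normalization of e are not stated in the text, so these are assumptions. The function name and dictionary keys are hypothetical.

```python
import math


def extrema_features(e):
    """Shape features of Eqs. (4)-(9) from a list of positive extrema lengths e(k)."""
    K = len(e)
    e_max, e_min = max(e), min(e)
    e_mean = sum(e) / K
    return {
        "count": K,                                                 # Eq. (4)
        "diff": e_max - e_min,                                      # Eq. (5)
        "entropy": -sum(v * math.log(v) for v in e),                # Eq. (6), natural log assumed
        "variance": sum((v - e_mean) ** 2 for v in e) / K,          # Eq. (7)
        "acutance": (e_max - e_min) / K * sum(e),                   # Eq. (8)
        "rms": math.sqrt(
            sum((e[k] - e[k + 1]) ** 2 for k in range(K - 1)) / K   # Eq. (9)
        ),
    }


# Illustrative (made-up) contours: equal radii versus alternating radii.
smooth = extrema_features([1.0, 1.0, 1.0, 1.0])
spiky = extrema_features([0.5, 1.5, 0.5, 1.5])
```

For the made-up smooth contour, the difference, variance, acutance, and r.m.s. value are all zero, whereas the alternating radii of the second contour raise each of them, in line with the behaviour described above.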

2.4 Classification

Classification of the segmented tumor into the benign or malignant category is the essential task that concludes our proposed methodology. In the present work, we have implemented the K-NN algorithm (Zhang 2018) for classification. It uses ‘feature similarity’ to predict the class (benign/malignant) of a new test sample according to how closely the sample matches the feature values in the training set. The selection of the number of nearest neighbors (K) is crucial, as it determines the accuracy of classification. The classifier measures the distance between the test sample and each training sample and retains the K nearest ones. The commonly used distance functions are the Euclidean, Hamming, and Manhattan distances; the present study employs the Euclidean distance metric to locate the nearest neighbors. The neighbors at the least distance from the test sample decide, by majority rule, the class to which the test sample is assigned. For example, if K = 4, the test sample is assigned to the class to which the majority of the 4 nearest neighbors belong.
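The decision rule can be sketched in a few lines of Python. The sketch below uses the Euclidean distance and majority voting as described; because an even K such as 4 can produce a tied vote, it breaks ties in favour of the tied class containing the nearest neighbor, which is an assumption, since the text does not specify tie handling. The function name and the toy feature vectors are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, sample, k=4):
    """Predict the class of `sample` by majority vote among its k nearest neighbors."""
    # Sort training samples by Euclidean distance to the test sample.
    neighbors = sorted(
        (math.dist(x, sample), label) for x, label in zip(train_X, train_y)
    )[:k]
    votes = Counter(label for _, label in neighbors)
    top = max(votes.values())
    tied = {label for label, n in votes.items() if n == top}
    # Tie-break (assumed): the nearest neighbor whose class is among the winners.
    return next(label for _, label in neighbors if label in tied)


# Toy 2-D feature vectors, made up purely for illustration.
train_X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
train_y = ["benign"] * 4 + ["malignant"] * 4
prediction = knn_predict(train_X, train_y, (0.5, 0.5), k=4)
```

In practice the feature vectors would hold the extrema-based feature values, which may need rescaling so that no single feature dominates the distance.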

One approach to assessing the efficiency of a classification algorithm is k-fold cross-validation (Wong 2015). In this approach, the dataset is first randomly divided into k subsets, or folds, of equal size. The classification model is run k times, and each time one of the k groups is used as the test set (validation set) while the other (k − 1) groups form the training set. The error estimate is averaged over all k trials to obtain the overall effectiveness of the model. Although there is no fixed rule for choosing the value of k, k = 5 or k = 10 is very common in applied machine learning, as these values have been found to give an estimate of model skill with low bias and modest variance.
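The splitting and averaging procedure can be sketched as follows. The sketch shuffles the sample indices with a fixed seed, deals them into k folds, and averages a per-fold error supplied by the caller; the function names, the seed, and the callback interface are hypothetical choices made for illustration.

```python
import random


def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # random assignment to folds
    folds = [indices[i::k] for i in range(k)]   # fold sizes differ by at most one
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(n_samples, fold_error, k=5, seed=0):
    """Average the error returned by fold_error(train, test) over the k trials."""
    errors = [fold_error(train, test)
              for train, test in k_fold_splits(n_samples, k, seed)]
    return sum(errors) / len(errors)
```

Here `fold_error` would train the classifier on the `train` indices and return its error on the `test` indices, so each sample is used for validation exactly once.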

In the present study, we have divided the entire dataset into 5 groups (k = 5). In each of the 5 trials, one of the 5 groups becomes the validation set, whereas the other 4 groups are used to train the classifier model. The performance of the classifier is evaluated in terms of Sensitivity (Sen), Specificity (Sp), and Accuracy (Acc), which are described below:

  • Sensitivity (Sen): It estimates how correctly the classifier can predict the benign tumors.

    $${\text{Sen}} (\% ) = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}} \times 100$$
    (10)
  • Specificity (Sp): It estimates how correctly the classifier can predict the malignant tumors.

    $${\text{Sp}} \,(\% ) = \frac{{{\text{TN}}}}{{{\text{TN}} + {\text{FP}}}} \times 100$$
    (11)
  • Accuracy (Acc): It estimates the overall correct prediction of benign and malignant tumors.

    $${\text{Acc}}\,(\% ) = \frac{{{\text{TP}} + {\text{TN}}}}{{{\text{TP}} + {\text{FP}} + {\text{FN}} + {\text{TN}}}} \times 100$$
    (12)

where TP is the number of known benign tumors correctly predicted as benign; TN is the number of known malignant tumors correctly predicted as malignant; FN is the number of known benign tumors incorrectly predicted as malignant; and FP is the number of known malignant tumors incorrectly predicted as benign.
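As a worked illustration of Eqs. (10)–(12), the sketch below counts TP, FN, TN, and FP from lists of true and predicted labels, treating benign as the positive class as defined above, and then evaluates the three measures. The function names are hypothetical, and the counts in the example are invented for illustration; they are not results of the present study.

```python
def confusion_counts(true_labels, predicted_labels, positive="benign"):
    """Return (TP, FN, TN, FP) with `positive` (benign here) as the positive class."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy in percent, per Eqs. (10)-(12)."""
    sen = 100.0 * tp / (tp + fn)
    sp = 100.0 * tn / (tn + fp)
    acc = 100.0 * (tp + tn) / (tp + fp + fn + tn)
    return sen, sp, acc


# Invented counts: 10 benign cases (9 found) and 10 malignant cases (8 found).
sen, sp, acc = classification_metrics(tp=9, fn=1, tn=8, fp=2)
```

With these invented counts, the sensitivity is 9/10 = 90%, the specificity is 8/10 = 80%, and the accuracy is 17/20 = 85%.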

3 Experimental Results

The proposed enhancement and segmentation techniques are applied to brain MRI and CT images from the benchmark databases “The Whole Brain Atlas—Harvard Medical School” (Johnson and Becker 1999) and “The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS)” (Menze 2015).

Some of the results of enhancement and segmentation are shown in Fig. 4.

Fig. 4 Enhancement and segmentation results using the proposed techniques

In the proposed approach, we have used fivefold cross-validation and varied the number of neighbors (K) to observe the corresponding variation in Sensitivity, Specificity, and Accuracy. We obtained the best results for K = 4, in which case the test sample is assigned to the class to which the majority of the 4 nearest neighbors belong.

Using a K-NN classifier with 4 nearest neighbors and fivefold cross-validation to estimate the classifier efficiency, we obtained a Sensitivity of 100%, a Specificity of 94.11%, and an Accuracy of 96.67%.

4 Conclusion

The present work proposes an efficient model for tumor detection in which the brain tumor is enhanced and then segmented before it is classified as benign or malignant. To this end, a bank of 16 Gabor filters is implemented to precisely identify the tumor boundary. The tumor is then segmented from its inhomogeneous background by the fuzzy c-means clustering algorithm. The segmented image of the tumor is studied further through feature extraction, where various shape-based features help the classifier to determine the class of the tumor accurately.

In future, some modifications can be introduced into the proposed approach with the aim of improving the performance of the model. In this respect, the Gabor filter can be replaced with the log-Gabor filter to overcome its limitations. The maximum bandwidth of a Gabor filter used in image enhancement is limited to approximately one octave, so Gabor filters are not optimal if one is seeking broad spectral information with maximal spatial localization. Log-Gabor filters, on the other hand, can be constructed with arbitrary bandwidth, and the bandwidth can be optimized to produce a filter with minimal spatial extent. These filters also have an extended tail at high frequencies, which helps preserve image details. Apart from that, the number and type of features employed strongly influence the classifier performance; the present work, with the features already discussed, achieved an accuracy of 96.67%. The introduction of more complex features may increase the classification accuracy further, as they would enable the classifier to analyze the tumor more comprehensively.