Abstract
The size of medical image repositories is growing with the increasing number of digital imaging data sources. Most of the image content descriptors proposed in the literature are not suitable for retrieval from large medical image datasets. The ability to extract features from an image is a vital criterion for evaluating retrieval efficacy. This paper proposes an efficient image retrieval system for medical applications based on the new Canny steerable texture filter (CSTF) feature descriptor and Brownian motion weighting deep learning neural network (BMWDLNN) classifier. Initially, the Modified Kuan Filter (MKF) is used to reduce the noise in images. Then, the image contrast is enhanced using the Gaussian Linear Contrast Stretching Model (GLCSM) method, and the image features are extracted using the CSTF method. Later, the dimensionality of the extracted features is reduced by means of the Mean Correlation Coefficient Component Analysis (MCCCA) method and then the BMWDLNN classifier is applied. For the classified images, the score values are calculated using the Harmonic Mean-based Fisher Score (HMFS) method. Thereafter, various distance values are calculated for the score value of the image and averaged. The retrieval outcome is determined by the minimum distance between database images and the query image. The proposed method obtained an average precision rate of 0.9981, 0.9992, 0.9951, and 0.9940 for EXACT-09, TCIA, NEMA-CT, and OASIS databases, respectively. The experimental results revealed that the proposed methodology outperforms the existing methods.
1 Introduction
Content-based image retrieval (CBIR) is a major task in image processing. It is commonly used in numerous applications such as security, medical, and environmental monitoring [1, 2]. CBIR can efficiently describe the visual information of images, which can then be used to classify and retrieve medical and natural images [3]. A CBIR system is a powerful tool that finds the images in the database most similar to a given query image using distance metrics [4]. It performs two essential tasks: indexing and searching [5]. The first involves extracting suitable feature vectors (FVs) of an image and saving them in the database. The second computes the FVs of the query image and compares them with the FVs in the database [6]. The CBIR system represents each image in its repository as an FV [7]. Feature extraction transforms the key points and regions of an image, which contain the raw pixel values, into a lower-dimensional representation of single values [8]. The features are the visual representations of an image. In general, texture, color, and shape are the low-level features that represent the various perceptions of the image [9, 10]. Texture is an important feature of an image, and the shape descriptor describes the shape of a specific region of an image [11]. The search is carried out to find the images that are related to the extracted features [12]. An image consists of an immense number of features that are stored in digital devices.
Currently, a huge quantity of medical images is being produced in hospitals all across the world to diagnose diseases. These medical images are stored in Digital Imaging and Communications in Medicine (DICOM) format. The number of images produced by medical imaging devices has increased vastly [13, 14]. Due to their importance, all images should be examined to acquire a better interpretation of the human body [15]. Managing, indexing, and retrieving such a large collection of images using manual methods is both expensive and time-consuming [16]. The CBIR system has helped physicians to identify disease early and take appropriate steps to treat it [17]. It is very important that the right images are found in the right databases to enable the proper diagnosis and treatment of disease [18]. It is challenging to extract the appropriate and needed information from a huge medical imaging dataset automatically [19]. With the increasing amount of data related to medical images, the need for robust and efficient image retrieval and search systems has become more critical [20]. This paper presents an efficient content-based medical image retrieval (CBMIR) system based on a new Canny steerable texture filter (CSTF) and Brownian motion weighting deep learning neural network (BMWDLNN).
The rest of the paper is structured as follows: Sect. 2 discusses various existing methods related to the CBMIR system; Sect. 3 elaborates the process of the proposed methodology; the experimental results for the proposed system are discussed in Sect. 4; the paper is concluded in Sect. 5.
2 Literature survey
2.1 Image retrieval
Swati et al. [21] presented a CBMIR system for brain tumors using T1-weighted contrast-enhanced magnetic resonance images (CE-MRI). The system was developed using pre-trained VGG19 on a large ImageNet dataset (more than 1.2 million labeled images). The Closed-Form Metric Learning (CFML) distance measurement technique was carried out to determine the similarity between the extracted features of a database and the test/query images. This method performed better than the state-of-the-art methods on the CE-MRI dataset. The distance learning task results in an optimization problem, which makes it difficult to provide a closed-form solution.
Sundararajan et al. [22] proposed retrieving Avascular Necrosis images using a Deep Belief-Convolutional neural network (DB-CNN) for feature description. Initially, a Median Filter (MF) was used to remove the image noise, and the images were then resized. For the retrieval task, the modified Hamming distance (MHD) was evaluated to determine the similarity between the database and query images. The test results showed that the work was superior to existing techniques; however, this method is limited to small datasets.
Cai et al. [23] presented a CBMIR framework based on CNN and hash coding. The Siamese network (SN) was considered with image pairs as inputs. Then, the compact binary hash codes of the query and database images were computed. Those hash codes were compared for the retrieval task. Two experiments were conducted on the cancer imaging archive-computed tomography (TCIA-CT) and the vision and image analysis group/international early lung cancer action program (VIA/I-ELCAP) datasets. According to the results, the method outperformed conventional hash algorithms and CNN methods. Because of the slower learning process, the Siamese network requires more training time.
Shinde et al. [24] proposed a series of local neighborhood wavelet feature descriptors (LNWFD) for CBMIR. The main components of the system are wavelet decomposition, feature extraction, and similarity measurement. A triplet half-band filter bank (THFB) was used to obtain the ‘4’ sub-bands of the wavelet decomposition. The relationships among the wavelet coefficients were then computed at each sub-band to form LNWFD. The Manhattan distance was calculated to determine the similarity between the query and the database feature vectors. The retrieval tests were performed over OASIS-MRI and NEMA-CT for top ten matches. The average retrieval precision (ARP) of these databases was 74.57% and 99.51%, respectively. Similarly, this method was tested on the Emphysema-CT database for top 50 matches and achieved 55.51% ARP. The computation errors in THFB affect the regularity of the wavelets.
Owais et al. [25] developed a classification-based system that uses an enhanced residual network (ResNet) for the retrieval of multimodal medical images. The resulting feature vector was extracted from the last convolutional layer and returned as a deep FV. The Euclidean distance was then computed and compared one by one with the generated FVs. The test phase demonstrated that the deep-feature-based variable node classification framework could retrieve classes with better accuracy than previous methods. The increased number of layers in the network reduces the efficacy of the system.
Karthik et al. [26] proposed an approach to classify medical images using a CNN, the results of which were used to support content-based medical image retrieval. For experimental evaluation, the ImageCLEF 2009 dataset was considered and a classification task was performed based on body orientation.
2.2 Image denoising
Image denoising is an important step in medical image analysis. The acquired medical images may be corrupted or contain artifacts introduced during acquisition, and this leads to wrong analysis. Gai [27] presented a color image denoising technique via monogenic matrix-based sparse representation. The technique considers the color image as a monogenic matrix, which can transform independent color channels into a whole. Then, a dictionary learning method was designed using the monogenic matrix. In the sparse coding stage, monogenic-based orthogonal matching was considered. Jia et al. [28] presented a novel cascading U-Nets architecture with multiscale dense processing for image denoising. The technique was good at edge recovery and structure preservation when denoising real noisy images. Jia et al. [29] proposed a color image denoising technique based on a Pixel-Attention CNN with color correlation loss. The pixel-attention mechanism generates pixel-wise attention maps, which help remove random noise. The color correlation loss exploits color correlation to further improve denoising performance on noisy color images. The experimental results on several standard datasets demonstrate the state-of-the-art (SOTA) performance and the superiority of that method.
2.3 Feature dimension reduction
A large number of input features can make a predictive modeling task more difficult. The difficulty can be reduced by supplying the model with a smaller number of input features derived from the original data. Zhang et al. [30] developed a novel algorithm named dimension reduction window principal component analysis (DRWPCA). It achieves dimension reduction by analyzing the correlation between the dimensions, so the physical meaning of the original data set is retained. It utilizes mathematical statistics to obtain the correlation coefficient or the degree of correlation between attributes. Through statistical analysis of the degree of correlation between attributes, features with high correlation are removed so as to reduce the dimension. The original data need not be mapped onto a space of other dimensions for processing. Sanchez et al. [31] proposed a new feature relevance measure for star coordinates plots associated with the class of linear dimensionality reduction mappings defined through the solutions of eigenvalue problems, such as linear discriminant analysis or principal component analysis. This approach leads to enhanced feature subsets for class separation or variance maximization in the plots for numerous data sets of the UCI repository.
3 Main contribution of the work
In the domains of imaging research, clinical surgery, and pattern recognition, medical image retrieval has become highly important. The retrieval and classification performance depends on image features such as texture, shape, color, visual, and local features. Many algorithms have been developed to improve the retrieval performance of medical images. Because medical images consist of texture-like regions, traditional texture features are used to represent the images in various medical image retrieval systems. The local feature descriptors presented in the published literature have utilized the relationship of a reference pixel with its neighboring pixel or among the surrounding neighbors, but at the expense of high dimensionality [38]. To overcome these issues in the literature, a CSTF feature descriptor is proposed for efficient medical image retrieval and classification.
The key features of the proposed method are as follows: (1) noise reduction using the Modified Kuan Filter (MKF), which reduces the speckle noise in medical images; (2) a novel feature descriptor based on the steerable texture filter, proposed because the existing feature descriptors are sensitive to the image noise and their semantic representation; and (3) the Mean Coefficient Correlation Component Analysis (MCCCA) dimensionality reduction technique, proposed to reduce the complexity and computational cost of retrieval and classification tasks. The efficiency of the proposed approach is validated through experimentation over four medical image databases.
4 Proposed medical image retrieval system
Content-based image retrieval (CBIR) refers to the process of retrieving similar images from a database for a given query image. Due to the increasing number of medical images, it has been challenging to improve the image retrieval process in the medical field. The existing methods developed for medical image retrieval have several drawbacks, such as medical image corruption, performance degradation, and larger retrieval time. To overcome these issues, this paper proposes an efficient CBMIR system based on a new CSTF and BMWDLNN. The proposed work contains two phases, namely (1) training and (2) testing. In the training phase, the database images undergo the following processes: noise reduction, contrast enhancement, feature extraction, dimensionality reduction, classification, and score value calculation. In the testing phase, the same processes are carried out for the query image. First, the noise present in the image is reduced using the Modified Kuan Filter (MKF). Next, the contrast of the image is enhanced using the Gaussian Linear Contrast Stretching Model (GLCSM). Thereafter, the important features are extracted from the contrast-enhanced image by means of the Canny steerable texture filter (CSTF), and the dimensions of the extracted features are reduced with the help of Mean Coefficient Correlation Component Analysis (MCCCA). Then, the dimensionality-reduced features are given to the Brownian motion weighting deep learning neural network (BMWDLNN) classifier. For the classified images, the score value is calculated using the Harmonic Mean-based Fisher Score (HMFS). After calculating the score value, different distance values are measured and the average of all the distance values is determined. The database image with the minimum average distance is then retrieved as the most similar image. The proposed CBMIR system is shown in Fig. 1.
4.1 Noise reduction
Initially, the input medical image \(I\) is taken from the dataset. Then, the noise in the image is reduced using the Modified Kuan Filter (MKF) to attain a better-retrieved image. The Kuan filter is a well-known image despeckling filter used in image processing systems. Several iterations of the Kuan filter can greatly reduce the noise. However, small details may be lost due to the repetitive smoothing operation. Therefore, the Kuan filter is modified in two ways: an exponential weight factor and a geometric mean calculation. The exponential weight acts as a scale factor, so the repetitive smoothing operation is avoided. In addition, the Kuan filter considers the local mean of the image window. The local mean usually does not cover all the directions of the pixels in the image, which leads to repeated iterations. To avoid this problem, the geometric mean is calculated here instead, which covers all the pixel values during the calculation. The weighting function in the Kuan filter is defined as
where \(P_{{\text{c}}}\) and \( P_{{\text{I}}}\) are the variation coefficients of speckle \({\text{c}}\) and input image \({\text{I}}\). The final despeckled image \(I_{{\text{d}}}\) is obtained as
where \(m_{{{\text{fw}}}} \) is the geometric mean of pixel values in the filter window \({\text{fw}}\), and \(g_{{{\text{fw}}}}\) is the center pixel in \({\text{fw}}\).
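The despeckling step can be illustrated with a minimal Python sketch. It is not the authors' MKF: it assumes the textbook Kuan weight \(W = (1 - P_{\text{c}}^{2}/P_{\text{I}}^{2})/(1 + P_{\text{c}}^{2})\), replaces the local mean with the geometric mean of the filter window as described above, and omits the exponential weight factor because its exact form is not given here. The function name, the `speckle_cv` default, and the window radius are hypothetical.

```python
import math


def kuan_geometric_sketch(image, speckle_cv=0.25, radius=1):
    """Kuan-style despeckling that blends each pixel with the geometric
    mean of its window (assumed form; not the authors' exact MKF)."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            count = len(window)
            # Geometric mean; the +1 offset keeps log() defined for zero pixels.
            geo_mean = math.exp(sum(math.log(p + 1.0) for p in window) / count) - 1.0
            mean = sum(window) / count
            variance = sum((p - mean) ** 2 for p in window) / count
            # Squared variation coefficient of the image inside the window.
            image_cv2 = variance / (mean * mean) if mean > 0 else 0.0
            if image_cv2 <= 0.0:
                weight = 0.0  # flat region: keep only the window mean
            else:
                weight = (1.0 - speckle_cv ** 2 / image_cv2) / (1.0 + speckle_cv ** 2)
                weight = min(1.0, max(0.0, weight))
            # Despeckled value: centre pixel blended with the geometric mean.
            out[r][c] = image[r][c] * weight + geo_mean * (1.0 - weight)
    return out
```

In this sketch a flat region is replaced by its window mean, while a pixel in a high-variance window keeps most of its original value, which is the edge-preserving behaviour expected of a Kuan-type filter.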
4.2 Contrast enhancement
The contrast of the despeckled image \(I_{{\text{d}}}\) is enhanced using the Gaussian Linear Contrast Stretching Model (GLCSM). The Gaussian model enhances image contrast better than the other algorithms. The existing Gaussian model uses histogram equalization to improve the contrast of the image, which is indiscriminate and therefore also increases the contrast of the background noise. To address this issue, the proposed methodology uses linear contrast stretching in the Gaussian model. It mainly uses point operations to correct pixel gray values, linearly stretches the gray values of the image, and enhances the image’s gray areas of interest while suppressing the gray areas of no interest. The Gaussian model includes three steps: modeling, partitioning, and mapping. In modeling, each pixel in the input image \(I_{{\text{d}}}\) is modeled with a Gaussian distribution. The gray level distribution function \( G(I_{{\text{d}}} |p)\) is expressed as
where \({\text{PDF}}\left( \bullet \right) \) denotes the probability density function, \(c_{i} \) is the weights associated with the \(i{\text{th}} \) Gaussian distribution, and \(M_{{{\text{C}}_{i} }} ,v_{{{\text{c}}_{i} }}\) are the mean and variance of the \(i{\text{th}}\) component. \(a,b\) are the pixel values of the image, \(N \) is the number of mixture components, and \(p \) is a parameter estimated using an expectation–maximization algorithm. The probability of components is chosen to satisfy the following constraints:
Then, the number of training set \(I_{{\text{d}}}\) is drawn independently to estimate parameters \(M_{{{\text{C}}_{i} }} ,v_{{{\text{c}}_{i} }}\) with the mixture of components \(c_{i}\). The parameters \( M_{{{\text{C}}_{i} }} ,v_{{{\text{c}}_{i} }} ,c_{i}\) are estimated by maximizing the log-likelihood function \(L\left( p \right)\) of the expectation–maximization method as
The estimated parameters are denoted as
Then, the expectation and maximization steps are involved to estimate the membership probabilities of the parameters and to update the new values of the parameters. After that, the image partitioning is done to represent the image in a way that is easier to analyze. For partitioning, all the intersection points within the dynamic range of the image are detected. Thereafter, the quality of the image is highlighted using the Linear Contrast Stretching (LCS) model, which provides the output gray level needed for further processing. The output gray levels are obtained as
where the input gray level intervals \(I_{{{\text{gin}}}} \left( {a,b} \right)\) are converted to the output gray level intervals \(I_{{{\text{gout}}}} \left( {a,b} \right)\), \(255\) is the dynamic range of the image, and \({\text{min}}\) and \({\text{max}}\) are the minimum and the maximum intensity values of the image. After the output intervals are mapped to the corresponding input intervals, the contrast-enhanced image \(I_{{{\text{ce}}}} \left( {a,b} \right)\) is obtained.
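The mapping step can be sketched in Python as a plain min-max stretch to the range 0 to 255. This is a simplified illustration under the assumption that a single global interval is stretched; the Gaussian mixture modeling and the interval partitioning described above are omitted, and the function name is hypothetical.

```python
def linear_contrast_stretch(image, dynamic_range=255):
    """Stretch gray levels so that min maps to 0 and max to dynamic_range.

    Simplified assumption: one global interval instead of the
    partition-wise intervals used in the full GLCSM method.
    """
    flat = [pixel for row in image for pixel in row]
    low, high = min(flat), max(flat)
    if high == low:
        # A constant image has no contrast to stretch.
        return [[0 for _ in row] for row in image]
    return [
        [round((pixel - low) * dynamic_range / (high - low)) for pixel in row]
        for row in image
    ]
```

For example, an image whose gray levels span 50 to 200 is mapped so that 50 becomes 0 and 200 becomes 255, with intermediate values spaced linearly between them.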
4.3 Feature extraction
After contrast enhancement, the features of the image \(I_{{{\text{ce}}}} \) are extracted using the Canny steerable texture filter (CSTF) feature descriptor. The proposed method uses this novel feature descriptor since the existing feature descriptor is sensitive to the image noise and their semantic representation also depends on the shapes of the objects in the image. For this reason, the proposed feature descriptor uses the steerable texture filter. The proposed feature descriptor extracts the texture, edge, shape, wavelet features, etc.
The CSTF feature descriptor includes five steps: noise reduction, gradients calculation, non-maximum suppression, double thresholding, and edge tracking by hysteresis. In the first step, the noise present in the image is reduced using the steerable texture filter. This step is to avoid the issue of assuming the noise as edges and also to extract the texture features in addition to the edges. The image \(I_{{{\text{ce}}}} \left( {a,b} \right)\) is applied to the steerable filter and the smoothened image \(I_{{{\text{SF}}}} \left( {a,b} \right)\) is obtained as
where \({\text{SF}}\left( {a,b} \right)\) is the steerable texture filter response, which can be expressed as
where \(U_{z} \left( \gamma \right)\) is the interpolation function with respect to the orientation function \(\gamma\), and \(\delta_{z} \left( {a,b} \right)\) is the impulse response at \(\gamma\). In gradient calculation, the magnitude and angle are calculated for the horizontal and vertical gradients as follows:
where \(\left| \hbar \right| \) denotes the magnitude of the horizontal and vertical gradients \(\tau_{{{\text{hor}}}} ,\tau_{{{\text{ver}}}}\). In the non-maxima suppression step, two neighboring pixels \(a_{n} ,b_{n} \) are selected in the positive and negative directions. Then, the duplicate merging pixels are reduced by
Then, in double thresholding, the magnitudes are compared with the lower and higher threshold values to suppress the smaller gradients and obtain the stronger gradients and weaker gradients. The double thresholding \(I_{{{\text{DT}}}} \left( {a,b} \right)\) is done as
where smaller gradients are denoted as \(\tau_{{{\text{smaller}}}}\), stronger gradients are denoted as \(\tau_{{{\text{stronger}}}}\), weaker ones are denoted as \(\tau_{{{\text{weaker}}}}\), and the higher and lower thresholds are denoted as, \(\wp_{{{\text{high}}}} ,\;\wp_{{{\text{low}}}}\). Finally, the different features are identified and expressed as
where \(f\) denotes the number of features, and \(\psi_{s} \) is the final feature set. The procedure of the proposed CSTF method is given in Algorithm 1.
Algorithm 1 explains the steps involved in extracting the features of the images. The feature set \(\psi_{s} \) extracted by the CSTF method is passed on to the dimensionality reduction step.
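Two of the Canny-style steps above, gradient calculation and double thresholding, can be sketched in Python as follows. The sketch assumes simple central differences in place of the steerable texture filter response, and the threshold values are hypothetical inputs; non-maximum suppression and hysteresis tracking are left out.

```python
import math


def gradient_magnitude_angle(image):
    """Return (magnitude, angle) grids from horizontal/vertical gradients.

    Central differences are an assumption of this sketch; border
    pixels are left at zero.
    """
    rows, cols = len(image), len(image[0])
    magnitude = [[0.0] * cols for _ in range(rows)]
    angle = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            grad_hor = (image[r][c + 1] - image[r][c - 1]) / 2.0
            grad_ver = (image[r + 1][c] - image[r - 1][c]) / 2.0
            magnitude[r][c] = math.hypot(grad_hor, grad_ver)
            angle[r][c] = math.atan2(grad_ver, grad_hor)
    return magnitude, angle


def double_threshold(magnitude, low, high):
    """Label each pixel: 2 = stronger, 1 = weaker, 0 = smaller (suppressed)."""
    return [
        [2 if m >= high else 1 if m >= low else 0 for m in row]
        for row in magnitude
    ]
```

On an image with a vertical step edge, the interior pixels next to the step receive a non-zero magnitude with angle 0 and are labeled as stronger gradients once the magnitude exceeds the higher threshold.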
4.4 Dimensionality reduction
The dimensionality of a dataset refers to the number of input variables or features. The process of reducing the number of input variables in a dataset is referred to as dimensionality reduction. The CSTF descriptor extracts features such as texture, edge, shape, and wavelet features. The extracted features have high dimensionality, which makes retrieval and classification tasks more difficult and incurs a high computational cost. Therefore, the dimensionality of the features \(\psi_{s} \) is reduced using the Mean Coefficient Correlation Component Analysis (MCCCA) method. The MCCCA algorithm is based on the basic idea of the correlation coefficient. The normal Principal Component Analysis (PCA) algorithm calculates the covariance matrix, but the covariance can only measure the directional relationship between two pixels and cannot show the strength of that relationship. For this reason, the proposed methodology uses the mean coefficient correlation, which measures the strength in a better way. In the MCCCA method, the mean of each feature dimension \(\overline{{\psi_{{\text{s}}} }}\) is calculated as
where \(S\) is the number of input features. Then, the mean correlation coefficient \({\text{ Cc}}_{\psi }\) is calculated as
where \(\varsigma\) denotes the covariance of the input vectors, and \(D \) is the standard deviation of \(\psi_{s} ,\overline{\psi }_{s}\). For the correlation coefficient \({\text{Cc}}_{\psi }\), the eigenvalues are calculated as
where \(\ell_{m}\) represents the eigenvectors, and \(E_{m} \) denotes the eigenvalues. Then, the eigenvalues are sorted in descending order and the features are selected based on the \(m \) largest eigenvalues. After dimensionality reduction, the selected feature set is denoted as \(\psi_{r}\).
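The idea of replacing the covariance matrix with a correlation matrix before the eigen-decomposition can be sketched in Python as follows. This is an illustrative assumption about the overall shape of the computation rather than the exact MCCCA formulation: it builds the Pearson correlation matrix of the feature columns and finds only the leading eigenpair by power iteration. The function names are hypothetical, and every feature is assumed to have non-zero variance.

```python
import math


def correlation_matrix(samples):
    """Pearson correlation matrix of the feature columns of `samples`."""
    n, dims = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(dims)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in samples) / n)
        for j in range(dims)
    ]
    corr = [[0.0] * dims for _ in range(dims)]
    for i in range(dims):
        for j in range(dims):
            cov = sum(
                (row[i] - means[i]) * (row[j] - means[j]) for row in samples
            ) / n
            corr[i][j] = cov / (stds[i] * stds[j])  # assumes non-zero stds
    return corr


def leading_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and its eigenvector by power iteration."""
    dims = len(matrix)
    # Non-uniform start vector to avoid starting orthogonal to the answer.
    vector = [1.0 / (i + 1) for i in range(dims)]
    value = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break
        vector = [x / norm for x in product]
        # Rayleigh quotient of the normalized vector.
        value = sum(
            vector[i] * sum(matrix[i][j] * vector[j] for j in range(dims))
            for i in range(dims)
        )
    return value, vector
```

Repeating the power iteration after deflating the matrix would yield the next eigenvalues in descending order, from which the \(m\) largest could be kept.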
4.5 Classification of image category
Next, the Brownian motion weighting deep learning neural network (BMWDLNN) classifier is applied to the reduced features \(\psi_{r}\). In general deep learning neural networks, the weight values are selected randomly. Random weight selection increases the execution time and causes the over-classification problem. To avoid this problem, the proposed methodology uses the Brownian motion weighting factor. Initially, the features \(\psi_{r}\) are given to the neurons of the input layer. Once the inputs are received, the weight values for the corresponding input vectors are randomly generated as follows:
where \(\lambda_{r} \) is the randomly generated weight value. To avoid the existing problems of larger execution time and over-classification, the weight values are initialized using the BMW method as
where \(B_{M}\) is the Brownian motion function, \(\chi\) is a constant, \(\varepsilon\) is known as diffusion parameter, \(\varphi\) is the number of sudden motions, and \(\mathop \lambda \limits^{ \Rightarrow }_{r}\) is the new set of weight values initialized by BMW method. Then, the input features \(\psi_{r} \) and initialized weight values \(\mathop \lambda \limits^{ \Rightarrow }_{r}\) are mapped to the hidden layer where the product of these two values is summed up. After the values are inputted, the activation function is determined as
The output of the hidden layer \(\xi_{{{\text{hid}}}}^{r}\) is computed as
where \(\delta \) is the bias value, \( \eta_{r} \) is the Gaussian activation function, and \(\mathop \lambda \limits^{ \Rightarrow }_{{r,{\text{hid}}}} \) denotes the weight values between the input and hidden layers. Finally, all the weight values are added at the output layer and the output values are attained as
where \(\mathop \lambda \limits^{ \Rightarrow }_{{r,{\text{out}}}} \) denotes the weight values between the hidden and output layers, and \( \xi_{{{\text{out}}}}^{r}\) is the output unit of the classifier, which contains the category of the input image. After classification, the categorized image sets \(W_{n}\) under the different classes are obtained.
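A possible reading of the weighting scheme is sketched below in Python. It is only an illustration under stated assumptions: the weights are taken as successive positions of a one-dimensional Brownian path whose increments are zero-mean Gaussian with variance \(2\varepsilon \Delta t\), and the hidden unit applies a Gaussian activation \(e^{-x^{2}}\) to the weighted sum plus bias. The exact BMW formula, the constants, and all function names here are hypothetical.

```python
import math
import random


def brownian_weights(count, diffusion=0.1, dt=1.0, seed=0):
    """Weights taken as positions along a 1-D Brownian path (assumed form)."""
    rng = random.Random(seed)
    position, weights = 0.0, []
    for _ in range(count):
        # Each increment is Gaussian with variance 2 * diffusion * dt.
        position += rng.gauss(0.0, math.sqrt(2.0 * diffusion * dt))
        weights.append(position)
    return weights


def gaussian_activation(x):
    """Gaussian activation, bounded in (0, 1] for moderate inputs."""
    return math.exp(-x * x)


def hidden_layer_output(features, weight_rows, bias=0.0):
    """One hidden unit per weight row: activation(sum(w * x) + bias)."""
    return [
        gaussian_activation(sum(w * f for w, f in zip(row, features)) + bias)
        for row in weight_rows
    ]
```

Because consecutive Brownian positions are correlated, the initial weights in this sketch vary smoothly rather than independently, which is one plausible motivation for preferring them over purely random initialization.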
4.6 Score value calculation
Here, the score value is calculated for the categorized image set \({W}_{n}\) using the Harmonic Mean-based Fisher Score (HMFS). The general Fisher score calculation uses the mean vector. The proposed methodology instead uses the harmonic mean in the Fisher score. The harmonic mean does not ignore any item of a series and is rigidly defined. The Fisher score of the image set \(W_{{{\text{fs}}}} \) is computed as
where \(H_{l}\) is the harmonic mean and \(J_{l}\) is the standard deviation (SD) of the \(n{\text{th}}\) image category in the \(l{\text{th}} \) class, and \(M_{l}\) is the number of instances in the \(l{\text{th}}\) class. The harmonic mean \(H_{l}\) is expressed as
where \(n\) is the number of categories.
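The score can be illustrated with a short Python sketch. It assumes the standard Fisher score structure (between-class scatter divided by within-class scatter, each weighted by the class size \(M_{l}\)) with the arithmetic mean swapped for the harmonic mean; the exact HMFS expression may differ, and the function name is hypothetical. All values are assumed to be positive, as the harmonic mean requires.

```python
from statistics import harmonic_mean, pstdev


def harmonic_fisher_score(classes):
    """Fisher-style score of one feature using harmonic means (assumed form).

    `classes` is a list of lists, one list of positive feature values
    per class.
    """
    all_values = [value for values in classes for value in values]
    overall = harmonic_mean(all_values)
    # Between-class scatter: class size times squared harmonic-mean offset.
    between = sum(
        len(values) * (harmonic_mean(values) - overall) ** 2 for values in classes
    )
    # Within-class scatter: class size times class variance.
    within = sum(len(values) * pstdev(values) ** 2 for values in classes)
    return between / within if within > 0 else float("inf")
```

Under this assumed form, well-separated classes give a larger score than overlapping ones, which is the property the retrieval stage relies on.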
All the aforementioned procedures, such as noise reduction, contrast enhancement, feature extraction, dimensionality reduction, classification, and score value calculation are also done for the query image in the testing phase. After that, the Manhattan distance, Euclidean distance, Jaccard distance, Hamming distance, and the relative standard deviation are calculated between the score value of the input database image and query image.
4.7 Image retrieval
In this section, the nearest neighbor of the query image is identified and retrieved based on the average of various distance values. The average value determines the similarity between the classified input and query image. The Fisher scores calculated for the input image and query image are denoted as \(W_{{{\text{fs}}}}\) and \( Q_{{{\text{fs}}}}\). Then, the distance values for the Fisher scores of the input image and query image are initialized as follows:
where \(Y_{n}\) is the total distance value, \(Y_{{\text{M}}}\) is the Manhattan distance, \(Y_{{\text{E}}}\) is the Euclidean distance, \(Y_{{\text{J}}}\) is the Jaccard distance, \(Y_{{\text{H}}} \) is the Hamming distance, and \(Y_{{{\text{rsd}}}}\) is the relative standard deviation,
where
wherein \(R \) denotes the number of dimensions, \(\alpha\) is the Jaccard coefficient, \(\beta_{n}\) is the number of ones after the XOR operation \(\oplus\) of \(W_{{{\text{fs}}}}\) and \(Q_{{{\text{fs}}}}\), \(\sigma\) is the SD, and \(\mu\) is the mean of the score values. Then, the average of all distance values is calculated as
where \(n\) denotes the number of distance values. The database image with the minimum average distance is retrieved as the most similar image.
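The retrieval rule can be sketched in Python as follows. Several details are assumptions made only for illustration: the score values are treated as non-negative vectors, the Jaccard distance is computed in its weighted (min/max) form, the Hamming distance counts the ones after XOR of scores quantized to integers with a hypothetical `scale`, and the relative standard deviation is taken over the element-wise absolute differences. The function names are hypothetical as well.

```python
import math
from statistics import mean, pstdev


def manhattan(w, q):
    return sum(abs(a - b) for a, b in zip(w, q))


def euclidean(w, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(w, q)))


def jaccard(w, q):
    # Weighted Jaccard distance; assumes non-negative score values.
    denominator = sum(max(a, b) for a, b in zip(w, q))
    if denominator == 0:
        return 0.0
    return 1.0 - sum(min(a, b) for a, b in zip(w, q)) / denominator


def hamming(w, q, scale=100):
    # Number of ones after XOR of the integer-quantized scores.
    return sum(
        bin(round(a * scale) ^ round(b * scale)).count("1") for a, b in zip(w, q)
    )


def relative_sd(w, q):
    # Assumed definition: SD / mean of the absolute differences.
    diffs = [abs(a - b) for a, b in zip(w, q)]
    centre = mean(diffs)
    return 0.0 if centre == 0 else pstdev(diffs) / centre


def average_distance(w, q):
    values = [f(w, q) for f in (manhattan, euclidean, jaccard, hamming, relative_sd)]
    return sum(values) / len(values)


def retrieve(database, query):
    """Return the key of the database entry with the minimum average distance."""
    return min(database, key=lambda name: average_distance(database[name], query))
```

With these definitions a database entry identical to the query has an average distance of zero and is therefore always retrieved first.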
5 Results and discussion
In this section, the retrieval performance of the proposed method is evaluated by conducting several experiments using MATLAB.
5.1 Database description
In this work, the Extraction of Airways from CT 2009 (EXACT-09) [32], The Cancer Imaging Archive (TCIA) [33], National Electrical Manufacturers Association (NEMA-CT) [34], and Open Access Series of Imaging Studies (OASIS) [35] databases are used for the experiments. Each database is described as follows. EXACT-09 and TCIA are publicly available databases. In this work, the images in EXACT-09 are grouped under 19 categories, whereas in the TCIA database, the images are grouped under 8 categories. All the images in both databases have the dimension of 512 × 512. The NEMA-CT database contains 315 CT images, which are grouped into 9 categories. OASIS is a magnetic resonance imaging (MRI) dataset that contains scans of 421 subjects ranging in age from 18 to 96 years old. These 421 subjects are grouped into 4 categories. Detailed descriptions of all databases can be found in [38] and [45].
5.2 Performance analysis of classification
In this section, the proposed BMWDLNN classifier is compared against the existing Adaptive Neuro-Fuzzy Inference System (ANFIS), Artificial Neural Network (ANN), and Naive Bayes classifiers. The performance analysis is done in terms of sensitivity, specificity, accuracy, Negative Predictive Value (NPV), False Positive Rate (FPR), False Negative Rate (FNR), Matthews Correlation Coefficient (MCC), False Detection Rate (FDR), and False Rejection Rate (FRR) using the above-mentioned datasets.
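For reference, the listed measures can be computed from the confusion counts with the standard definitions, as in the Python sketch below. The function name is hypothetical, FRR is treated here as identical to FNR (an assumption), and the example counts used afterwards are illustrative rather than taken from the experiments.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary measures from true/false positive/negative counts."""
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "npv": tn / (tn + fn),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
        "fdr": fp / (fp + tp),
        # Matthews correlation coefficient; 0 when undefined.
        "mcc": (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0,
    }
```

As an illustration, the hypothetical counts `tp=16, fp=1, tn=305, fn=1` give a sensitivity of 0.941176, a specificity of 0.996732, an accuracy of 0.993808, and an MCC of 0.937908, which agree with the EXACT-09 values reported for the proposed classifier in this section.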
Figure 2 illustrates the comparative analysis of the proposed BMWDLNN classifier with the existing classifiers based on sensitivity, specificity, and accuracy for the EXACT-09, TCIA, NEMA-CT, and OASIS databases. For the EXACT-09 database, the proposed method attains a sensitivity of 0.941176, a specificity of 0.996732, and an accuracy of 0.993808. Compared with the existing ANN and Naive Bayes, the proposed method improves sensitivity by 70.58% and 83.82%, specificity by 3.43% and 4.65%, and accuracy by 6.96% and 8.82%, respectively. For the TCIA database, the sensitivity, specificity, and accuracy of the proposed method are 0.88785, 0.983979, and 0.971963, which are improvements of 5.14%, 6.14%, and 11.79% over the existing ANN. For the NEMA-CT database, the proposed method has a sensitivity of 0.964286, a specificity of 0.995536, and an accuracy of 0.992063, an improvement of 60.71% in sensitivity, 4.68% in specificity, and 10.91% in accuracy compared to the existing ANN. For the OASIS database, the proposed method yields a sensitivity of 0.952381, a specificity of 0.984127, and an accuracy of 0.97619, improving on the existing Naive Bayes by 69.04% in sensitivity, 23.01% in specificity, and 34.52% in accuracy. The values of the existing approaches are lower than those of the proposed method for all four databases. The analysis above demonstrates that the proposed method is superior to the existing methods.
The NPV, FPR, and FNR of the proposed and existing methods are analyzed for the EXACT-09, TCIA, NEMA-CT, and OASIS databases in Fig. 3. The NPV achieved by the proposed method is 0.996732 for EXACT-09, 0.983979 for TCIA, 0.995536 for NEMA-CT, and 0.984127 for OASIS, which are higher than those of the existing methods. In terms of FPR, the proposed method has lower values than the existing methods, namely 0.003268, 0.016021, 0.004464, and 0.015873 for the EXACT-09, TCIA, NEMA-CT, and OASIS databases, respectively. The FNR values of the proposed method for EXACT-09, TCIA, NEMA-CT, and OASIS are 0.058824, 0.11215, 0.035714, and 0.047619, respectively. Compared with the existing ANFIS and ANN, the NPV of the proposed method improves by 4.92% and 3.90% for EXACT-09, 10.72% and 7.23% for TCIA, 10.12% and 7.36% for NEMA-CT, and 23.31% for OASIS, respectively. Compared with the existing ANFIS method, the proposed method lowers the FPR by 4.61%, 7.20%, 8.92%, and 23.01% for the EXACT-09, TCIA, NEMA-CT, and OASIS databases, respectively. Likewise, the proposed method lowers the FNR by 88.97%, 78.50%, 82.14%, and 70.23% relative to the existing ANFIS method for EXACT-09, TCIA, NEMA-CT, and OASIS, respectively. The higher NPV and the lower FPR and FNR values demonstrate that the proposed classifier is more efficient.
Figure 4 compares the proposed classifier with the existing classifiers in terms of MCC, FRR, and FDR on the EXACT-09, TCIA, NEMA-CT, and OASIS databases. For an efficient classifier, the MCC should be high and the FRR and FDR should be low. For these databases, the proposed method achieves MCC values of 0.937908, 0.871829, 0.959821, and 0.936508, FRR values of 0.058824, 0.11215, 0.035714, and 0.047619, and FDR values of 0.058824, 0.11215, 0.035714, and 0.047619, respectively. The MCC of the proposed method is 73.13% higher than that of ANFIS on EXACT-09, 56.39% higher than that of ANN on TCIA, 61.57% higher than that of ANN on NEMA-CT, and 92.06% higher than that of Naive Bayes on OASIS. Compared with the existing ANFIS, the FRR and FDR of the proposed method are reduced by 88.97% and 88.64%, 78.50% and 74.49%, 82.14% and 80.42%, and 70.23% and 70.21% for the EXACT-09, TCIA, NEMA-CT, and OASIS databases, respectively. This analysis indicates that the proposed BMWDLNN classifier performs better than the existing methods.
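The following Python sketch computes the MCC and FDR from confusion counts using their standard definitions. Two assumptions are made: FRR is taken to be the false rejection rate FN / (TP + FN), which is consistent with its reported values matching the FNR, and the counts (TP = 16, FN = 1, TN = 305, FP = 1) are hypothetical values chosen to reproduce the EXACT-09 figures, not the authors' data.

```python
import math


def mcc(tp: int, fn: int, tn: int, fp: int) -> float:
    """Matthews correlation coefficient; returns 0.0 if the denominator is 0."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def fdr(tp: int, fp: int) -> float:
    """False discovery rate: share of positive predictions that are wrong."""
    return fp / (tp + fp)


def frr(tp: int, fn: int) -> float:
    """False rejection rate, assumed here to be FN / (TP + FN)."""
    return fn / (tp + fn)


# Hypothetical counts (assumption): chosen to match the reported EXACT-09 values.
tp, fn, tn, fp = 16, 1, 305, 1
print(f"MCC={mcc(tp, fn, tn, fp):.6f} FRR={frr(tp, fn):.6f} FDR={fdr(tp, fp):.6f}")
```

When FP equals FN, as in these counts, the FDR equals the FRR and the MCC reduces to sensitivity + specificity - 1. The reported values follow this pattern, for example 0.941176 + 0.996732 - 1 = 0.937908 for EXACT-09.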
5.3 Retrieval performance analysis
In this section, the proposed method’s performance is assessed in terms of the Average Recall Rate (ARR), Average Precision Rate (APR), and F-score. The \(\text{ARR}\), \(\text{APR}\), and \(F_{\text{score}}\) are calculated as in the following equations:
where \(N_{I}\) denotes the total number of database images. Figs. 5, 6, 7, and 8 show the top 10 images retrieved for a query image from the EXACT-09, TCIA, NEMA-CT, and OASIS databases, respectively; each is one of the best results, with 100% precision.
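As a hedged illustration of these measures, the Python sketch below uses the usual top-n definitions from the CBIR literature, which are assumed here rather than taken from the paper: the precision of a query is the share of its top-n results that belong to the query's category, the recall is the share of that category's images that were retrieved, and APR and ARR average these over all \(N_{I}\) queries. The function names and the toy labels are illustrative only.

```python
from typing import List, Sequence, Tuple


def precision_recall_at_n(query_label: str, ranked_labels: Sequence[str],
                          n: int, n_relevant: int) -> Tuple[float, float]:
    """Precision and recall of one query over its top-n retrieved images."""
    hits = sum(1 for label in ranked_labels[:n] if label == query_label)
    return hits / n, hits / n_relevant


def average_rates(per_query: List[Tuple[float, float]]) -> Tuple[float, float, float]:
    """APR, ARR, and their harmonic mean (F-score) over all queries."""
    n_i = len(per_query)  # number of queries, one per database image
    apr = sum(p for p, _ in per_query) / n_i
    arr = sum(r for _, r in per_query) / n_i
    f_score = 2 * apr * arr / (apr + arr) if (apr + arr) else 0.0
    return apr, arr, f_score


# Toy data (assumption): two queries, top-4 retrieval.
results = [
    precision_recall_at_n("a", ["a", "a", "b", "a"], n=4, n_relevant=6),
    precision_recall_at_n("b", ["b", "b", "b", "b"], n=4, n_relevant=8),
]
apr, arr, f_score = average_rates(results)
print(apr, arr, round(f_score, 4))  # prints: 0.875 0.5 0.6364
```

Because n is fixed while the category size can be much larger, the recall is bounded by n divided by the category size, which is why a method can have an APR close to 1 and a much smaller ARR.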
A quantitative comparison of the retrieval results on the EXACT-09, TCIA, NEMA-CT, and OASIS databases is given in Table 1a–c. The performance of the proposed technique is compared, in terms of ARR, APR, and F-score, with the following existing features from the literature: Histogram of Compressed Scattering Coefficients (HCSC) [36], Scattering Transform-Canonical Correlation Analysis vertical projection (ST-CCA-v) [37], Local Directional Frequency Encoded Pattern (LDFEP) [38], Local Ternary Pattern (LTP) [39], Local Derivative Pattern (LDP) [40], Local Tetra Pattern (LTrP) [41], Local Ternary Co-occurrence Patterns (LTCoP) [42], Local Mesh Patterns (LMeP) [43], Spherical Symmetric 3D-LTP (SS-3D-LTP) [44], Local Wavelet Pattern (LWP) [45], and Local Bit-plane Decoded AlexNet Descriptor (LBpDAD) [46].
Table 1 reports the APR, ARR, and F-score values of the various feature descriptor methods for the (a) EXACT-09 and TCIA, (b) NEMA-CT, and (c) OASIS databases. The proposed method obtains an APR of 0.9981 for EXACT-09, 0.99929 for TCIA, 0.99512 for NEMA-CT, and 99.40% for OASIS. The corresponding ARR and F-score values are 0.30608 and 0.4685 for EXACT-09, 0.1500 and 0.2608 for TCIA, 0.3364 and 0.5013 for NEMA-CT, and 10.80% and 19.483% for OASIS. The retrieval performance of the proposed and existing methods in terms of APR, ARR, and F-score on these databases is shown in Fig. 9a–d.
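As a worked check, assuming the F-score is the harmonic mean of APR and ARR (the standard definition, with which the EXACT-09 values reported above agree), the EXACT-09 figure can be reproduced as follows:

\[
F_{\text{score}} = \frac{2 \times \text{APR} \times \text{ARR}}{\text{APR} + \text{ARR}} = \frac{2 \times 0.9981 \times 0.30608}{0.9981 + 0.30608} \approx 0.4685
\]

The same relation applied to the OASIS values on the percentage scale gives \(2 \times 99.40 \times 10.80 / (99.40 + 10.80) \approx 19.483\).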
For the EXACT-09 database, the improvement of the proposed method in terms of {APR, ARR, and F-score} is {9.08%, 6.16%, and 6.86%}, {6.92%, 3.96%, and 4.66%}, and {12.15%, 59.74%, and 48.58%} with respect to HCSC, ST-CCA-v, and LDFEP, respectively. Similarly, for the TCIA database, improvements of {5.05%, 3.30%, and 3.49%}, {3.60%, 1.97%, and 2.19%}, and {2.93%, 1.01%, and 1.28%} are observed over the HCSC, ST-CCA-v, and LBpDAD methods, respectively. On the NEMA-CT database, an improvement of {4.39%, 4.21%, and 4.27%} is observed over LWP, and the APR improves by 1.2% over HCSC. On the OASIS database, the improvement is {13.76%, 53.40%, and 49.52%} and {55.15%, 77.19%, and 75.03%} in comparison with the LDFEP and LBpDAD descriptors, respectively. Apart from the ARR and F-score comparison with HCSC on NEMA-CT, the proposed method obtains higher rates than the existing methods.
The category-wise retrieval results of the proposed method on these databases are shown in Fig. 10a–d. On the NEMA-CT database (Table 1b), the proposed method shows a higher APR but a lower ARR and F-score than the HCSC method. Among the nine NEMA-CT categories, the 3rd and 6th have the lowest precision, 0.9889 and 0.97, respectively, as shown in Fig. 10c.
6 Conclusions
Due to the increasing number of images in hospitals, the need for image retrieval systems has become more critical. This paper proposes a new CSTF feature descriptor that, together with the BMWDLNN classifier, is used to retrieve medical images. The proposed feature descriptor extracts texture, edge, shape, and wavelet features, among others. The proposed method achieves a classification accuracy of 99.38% for EXACT-09, 97.19% for TCIA, 99.20% for NEMA-CT, and 97.61% for OASIS, which are higher than those of the existing methods. The retrieval performance is evaluated in terms of APR, ARR, and F-score. The proposed method's APR, ARR, and F-score are 0.9981, 0.30608, and 0.4685 for EXACT-09; 0.9992, 0.15, and 0.2608 for TCIA; 0.9951, 0.3265, and 0.4917 for NEMA-CT; and 99.40%, 10.80%, and 19.483% for OASIS. The experimental results show that the proposed method outperforms the existing descriptors on the CT and MRI image databases.
References
Sezavar, A., Farsi, H., Mohamadzadeh, S.: Content-based image retrieval by combining convolutional neural networks and sparse representation. Multimedia Tools Appl. 78(6), 1–18 (2019)
Agarwal, M., Singhal, A., Lall, B.: Multi-channel local ternary pattern for content-based image retrieval. Pattern Anal. Appl. 22(8), 1–12 (2019)
Hussain, C.A., Rao, D.V., Mastani, A.S.: RetrieveNet a novel deep network for medical image retrieval. Evol. Intel. (2020). https://doi.org/10.1007/s12065-020-00401-z
Ashraf, R., Ahmed, M., Ahmad, U., AsifHabib, M., Jabbar, S., Naseer, K.: MDCBIR-MF multimedia data for content-based image retrieval by using multiple features. Multimed Tools Appl. (2018). https://doi.org/10.1007/s11042-018-5961-1
Biji, K.R., Marikkannu, P.: An efficient content based image retrieval using an optimized neural network for medical application. Multimedia Tools Appl. 79(19), 1–16 (2020)
Mezzoudj, S., Behloul, A., Seghir, R., Saadna, Y.: A parallel content-based image retrieval system using spark and tachyon frameworks. J. King Saud Univ. Comput. Inf. Sci. (2019). https://doi.org/10.1016/j.jksuci.2019.01.003
Sathiamoorthy, S., Natarajan, M.: An efficient content based image retrieval using enhanced multi-trend structure descriptor. SN Appl. Sci. 2(217), 1–19 (2019)
Ghrabat, M.J.J., Ma, G., Alresheedi, I.Y.M.S.S., Abduljabbar, Z.A.: An effective image retrieval based on optimized genetic algorithm utilized a novel SVM-based convolutional neural network classifier. HCIS 9(31), 1–29 (2019)
Bressana, R.S., Bugatti, P.H., Saito, P.T.M.: Breast cancer diagnosis through active learning in content-based image retrieval. Neurocomputing 357, 1–10 (2019)
Mistry, Y.D.: Textural and color descriptor fusion for efficient content-based image retrieval algorithm. Iran J. Comput. Sci. 3(16), 1–15 (2020)
Garg, M., Dhiman, G.: A novel content-based image retrieval approach for classification using GLCM features and texture fused LBP variants. Neural Comput. Appl. (2020). https://doi.org/10.1007/s00521-020-05017-z
Alsmad, M.K.: Content-based image retrieval using color, shape and texture descriptors and features. Arab. J. Sci. Eng. (2020). https://doi.org/10.1007/s13369-020-04384-y
Nair, L.R., Subramaniam, K., Venkatesan, P.G.K.D.: An effective image retrieval system using machine learning and fuzzy c- means clustering approach. Multimedia Tools Appl. (2019). https://doi.org/10.1007/s11042-019-08090-2
Jeyakumar, V., Kanagaraj, B.: A medical image retrieval system in PACS environment for clinical decision making. Intell. Data Anal. Biomed. Appl. (2019). https://doi.org/10.1016/B978-0-12-815553-0.00006-9
Öztürk, Ş.: Stacked auto-encoder based tagging with deep features for content-based medical image retrieval. Expert Syst. Appl. (2020). https://doi.org/10.1016/j.eswa.2020.113693
Mirasadi, M.S., Foruzan, A.H.: Content-based medical image retrieval of CT images of liver lesions using manifold learning. Int. J. Multimedia Inf. Retriev. 8(9), 1–8 (2019). https://doi.org/10.1007/s13735-019-00179-6
Biswas, R., Roy, S., Purkayastha, D.: An efficient content-based medical image indexing and retrieval using local texture feature descriptors. Int. J. Multimedia Inf. Retriev. 8(6), 1–15 (2019). https://doi.org/10.1007/s13735-019-00176-9
Kasban, H., Salama, D.H.: A robust medical image retrieval system based on wavelet optimization and adaptive block truncation coding. Multimedia Tools Appl. 78(2), 1–26 (2019)
Shamna, P., Govindan, V.K., Abdul-Nazeer, K.A.: Content based medical image retrieval using topic and location model. J. Biomed. Inf. 91, 1–16 (2019)
Mandal, M., Chaudhary, M., Vipparthi, S.K., Murala, S., Gonde, A.B., Nagar, S.K.: ANTIC antithetic isomeric cluster patterns for medical image retrieval and change detection. IET Comput. Vis. 13(1), 31–43 (2019)
Swati, Z.N.K., et al.: Content-based brain tumor retrieval for MR images using transfer learning. IEEE Access 7, 17809–17822 (2019). https://doi.org/10.1109/ACCESS.2019.2892455
Sundararajan, S.K., Sankaragomathi, B., Saravana-Priya, D.: Deep belief CNN feature representation based content based image retrieval for medical images. J. Med. Syst. 43, 1–9 (2019)
Cai, Y., Li, Y., Qiu, C., Ma, J., Gao, X.: Medical image retrieval based on convolutional neural network and supervised hashing. IEEE Access 7, 51877–51885 (2019)
Shinde, A., Rahulkar, A., Patil, C.: Content based medical image retrieval based on new efficient local neighborhood wavelet feature descriptor. Biomed. Eng. Lett. (2019). https://doi.org/10.1007/s13534-019-00112-0
Owais, M., Arsalan, M., Choi, J., Park, K.R.: Effective diagnosis and treatment through content-based medical image retrieval (CBMIR) by using artificial intelligence. J. Clin. Med. 8(4), 462 (2019). https://doi.org/10.3390/jcm8040462
Karthik, K., Kamath, S.S.: A deep neural network model for content-based medical image retrieval with multi-view classification. Vis. Comput. 37, 1837–1850 (2021). https://doi.org/10.1007/s00371-020-01941-2
Gai, S.: Color image denoising via monogenic matrix-based sparse representation. Vis. Comput. 35, 109–122 (2019). https://doi.org/10.1007/s00371-017-1456-8
Jia, F., Wong, W.H., Zeng, T.: DDUNet: dense dense U-net with applications in image denoising. IEEE/CVF Int. Conf. Comput. Vis. Workshops (ICCVW) 2021, 354–364 (2021). https://doi.org/10.1109/ICCVW54120.2021.00044
Jia, F., Ma, L., Yang, Y., Zeng, T.: Pixel-attention CNN with color correlation loss for color image denoising. IEEE Signal Process. Lett. 28, 1600–1604 (2021)
Zhang, R., Du, T., Qu, S.: A principal component analysis algorithm based on dimension reduction window. IEEE Access 6, 63737–63747 (2018). https://doi.org/10.1109/ACCESS.2018.2875270
Sanchez, A., Raya, L., Mohedano-Munoz, M.A., et al.: Feature selection based on star coordinates plots associated with eigenvalue problems. Vis. Comput. 37, 203–216 (2021). https://doi.org/10.1007/s00371-020-01793-w
Lo, P., Van Ginneken, B., Reinhardt, J.M., Yavarna, T., De Jong, P.A., Irving, B., De Bruijne, M.: Extraction of airways from CT (EXACT’09). IEEE Trans. Med. Imaging 31(11), 2093–2107 (2012)
Das, P., Neelima, A.: An overview of approaches for content-based medical image retrieval. Int. J. Multimedia Inf. Retrieval 6(4), 271–280 (2017)
NEMA-CT image database: ftp://medical.nema.org/medical/Dicom/Multiframe (2020)
Shinde, A.A., Rahulkar, A.D., Patil, C.Y.: Fast discrete curvelet transform-based anisotropic feature extraction for biomedical image indexing and retrieval. Int. J. Multimedia Inf. Retrieval 6(4), 281–288 (2017)
Lan, R., Zhou, Y.: Medical image retrieval via histogram of compressed scattering coefficients. IEEE J. Biomed. Health Inf. 21, 1338–1346 (2017)
Lan, R., Wang, H., Zhong, S., Liu, Z., Luo, X.: An integrated scattering feature with application to medical image retrieval. Comput. Electr. Eng. 69, 669–675 (2018)
Shinde, A., Rahulkar, A., Patil, C.: Biomedical image indexing and retrieval based on new efficient hybrid approach using directional decomposition and a novel local directional frequency encoded pattern: a post feature descriptor. Multimed. Tools Appl. 78, 23489–23519 (2019). https://doi.org/10.1007/s11042-019-7697-y
Tan, X., Triggs, B.: Enhanced local texture feature sets for face recognition under difficult lighting conditions. IEEE Trans. Image Process. 19(6), 1635–1650 (2010). https://doi.org/10.1109/TIP.2010.2042645
Zhang, B., Gao, Y., Zhao, S., Liu, J.: Local derivative pattern versus local binary pattern: face recognition with high-order local pattern descriptor. IEEE Trans. Image Process. 19(2), 533–544 (2010)
Murala, S., Maheshwari, R., Balasubramanian, R.: Local tetra patterns: a new feature descriptor for content-based image retrieval. IEEE Trans. Image Process. 21(5), 2874–2886 (2012)
Murala, S., Wu, Q.J.: Local ternary co-occurrence patterns: a new feature descriptor for MRI and CT image retrieval. Neurocomputing 119, 399–412 (2013)
Murala, S., Wu, Q.: Local mesh patterns versus local binary patterns: biomedical image indexing and retrieval. IEEE J. Biomed. Health Inf. 18(3), 929–938 (2014)
Murala, S., Wu, Q.J.: Spherical symmetric 3D local ternary patterns for natural, texture and biomedical image indexing and retrieval. Neurocomputing 149, 1502–1514 (2015)
Dubey, S.R., Singh, S.K., Singh, R.K.: Local wavelet pattern: a new feature descriptor for image retrieval in medical CT databases. IEEE Trans. Image Process. 24(12), 5892–5903 (2015)
Dubey, S.R., Roy, S.K., Chakraborty, S., et al.: Local bit-plane decoded convolutional neural network features for biomedical image retrieval. Neural Comput. Appl. 32, 7539–7551 (2020)
Murala, S., Maheshwari, R.P., Balasubramanian, R.: Directional binary wavelet patterns for biomedical image indexing and retrieval. J. Med. Syst. 36, 2865–2879 (2012)
Deep, G., Kaur, L., Gupta, S.: Directional local ternary quantized extrema pattern: a new descriptor for biomedical image indexing and retrieval. Eng. Sci. Technol. Int. J. 19(4), 1895–1909 (2016)
Shinde, A.A., Rahulkar, A.D., Patil, C.Y.: Local neighboring binary pattern: a new feature descriptor for biomedical image indexing and retrieval. Signal Image Process. 2017, 154–159 (2017). https://doi.org/10.1109/SIPROCESS.2017.8124524
Ethics declarations
Conflict of interests
The authors declare no conflict of interest in the submission of this article for publication.
Cite this article
Rao, R.V., Prasad, T.J.C. An efficient content-based medical image retrieval based on a new Canny steerable texture filter and Brownian motion weighted deep learning neural network. Vis Comput 39, 1797–1813 (2023). https://doi.org/10.1007/s00371-022-02446-w
DOI: https://doi.org/10.1007/s00371-022-02446-w