1 Introduction

Breast cancer identification and classification systems have recently gained significant attention. According to the World Health Report, breast cancer is considered the second most common cause of death in women [1,2,3,4]. Hence, it is essential to predict the disease at an early stage so that appropriate treatment can be provided in time. Different imaging modalities are used to detect and accurately locate the tumor-affected region. Among them, mammograms are widely used by medical experts for accurate detection and early treatment. Various medical image processing approaches have been used in existing works to detect breast cancer from mammograms [5]. A general tumor detection system comprises image preprocessing, segmentation, optimization, and label prediction modules [6]. The most commonly used preprocessing techniques are mean, median, adaptive median, Gaussian, and hybrid filtering. Similarly, pattern extraction techniques are used to extract features from the preprocessed image [7, 8], including texture patterns, binary patterns, shape, and density patterns. The major goal of optimization in this setting is to find the best features for predicting the class label with increased accuracy and reduced error [9].

Optimization also supports improved classification performance [10] with reduced training and testing time. Conventionally, various optimization techniques are used to compute the best fitness value and thereby enhance classification accuracy, because a reduced number of features shortens the time required to train the models. Recently used feature selection techniques include Genetic Algorithm (GA), Greedy Optimization, Grasshopper Optimization, Artificial Bee Colony (ABC) optimization, and other bio-inspired optimization techniques [11,12,13,14,15]. Similarly, various segmentation approaches are used to segment the tumor-affected regions based on a threshold value. Image segmentation is a vital process for enhancing detection accuracy and reducing misclassification in tumor classification systems [16]. Finally, machine learning [17] or deep learning [18] classification algorithms are employed to label the image as normal or tumor affected. Machine learning classifiers [2, 19, 20] include Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), Fuzzy Logic (FL), Ensemble Learning (EL), Gradient Boost (GB), Neural Network (NN), and Naïve Bayes (NB). Similarly, deep learning models [20,21,22] include Convolutional Neural Network (CNN), Deep Neural Network (DNN), Deep Belief Network (DBN), Long Short-Term Memory (LSTM), and Recurrent Neural Network (RNN). The major contributions of the proposed work are as follows:

  • To accurately classify breast cancer from the given mammogram image set, an enhanced Convolutional Neural Network (CNN) based classification technique is deployed.

  • To obtain the patterns from the input image, an efficient feature extraction mechanism is applied.

  • To optimally select the best features from the extracted output based on the global fitness function, the meta-heuristic Genetic Algorithm (GA) has been utilized, which also helps to reduce the misclassification rate with increased detection performance outcomes.

  • To evaluate the detection performance of the suggested model, various evaluation metrics have been utilized during the analysis. Also, the obtained output values are compared with other recent detection methodologies.

The results of GA-CNN classification are also compared with existing classification methods such as Sparse AutoEncoder (SAE), Deep Belief Network (DBN), CNN, and Decision Tree (DT)-CNN. In addition, the existing deep learning technique AHEE-CDLS-CNN is compared with the proposed deep learning technique GA-CNN. The goal is to contrast the results of the existing deep learning technique with those of GA-CNN and to analyze the impact of reduced feature set selection on the classification accuracy and error rate of the proposed model. It is observed that GA-CNN shows a considerable improvement in classification accuracy and a reduced error rate with the optimal feature set.

For complex data models, the existing AHEE-CDLS-CNN system is costly to train and consumes significant computing time when the number of variables is high. Optimal feature set selection is the process of picking the most relevant subset of features. An effective feature selection strategy eliminates irrelevant data, potentially improving system performance, and lowers the system’s computing cost, resulting in greater computational efficiency. The suggested method aims to find the best feature set with the best fitness value while consuming the minimum possible time for both training and testing, in addition to increasing classification accuracy and lowering complexity. This work also gives a brief description of the DL-based breast cancer classification system that uses the GA-CNN technique to diagnose mammography images. The use of DL approaches in clinical analysis has considerable promise for improving the diagnostic capability of present CAD systems. Compared with calcifications, masses are more difficult to detect and categorize when the density of the breast varies. Traditional machine learning approaches rely on constrained procedures that are restricted to certain dataset density categories, whereas the GA-CNN approach shows promise in the diagnosis of breast cancer.

The remainder of this paper is organized as follows: Sect. 2 presents a comprehensive review of the conventional medical image processing techniques with their advantages and disadvantages. The working flow of the research method is illustrated with its flow and algorithm explanations in Sect. 3. The performance and comparative analysis of both conventional and proposed tumor detection methodologies are validated by using various measures in Sect. 4. Section 5 discusses the study’s findings. Finally, the overall paper is summarized with the future scope in Sect. 6.

2 Related Works

This section examines some of the existing research works associated with the breast cancer detection and classification framework. It also covers the benefits and drawbacks of each technique based on its features and characteristics.

Chowdhary et al. [23] implemented the Fuzzy C-Means (FCM) clustering and incorporated the Support Vector Machine (SVM) classification mechanism for accurately detecting breast cancer from the mammograms. Here, the Otsu based thresholding technique was deployed for accurately segmenting cancer affected regions. The stages involved in this system were preprocessing, background segmentation, clustering, and classification. In order to enhance segmentation accuracy, this work extracted a set of features like shape, circularity, gradient, mean, and standard deviation from the segmented region. Still, this work is limited by the key challenges of increased testing and training duration for the models and high misclassification results. Oliveira et al. [24] suggested a lightweight deep learning model for accurately predicting breast cancer with increased accuracy. Here, the data augmentation has been performed initially to enhance the robustness based on affine transformations. Consequently, image segmentation and classification processes have been performed for training the models by fine tuning the parameters.

Khoulqi et al. [25] suggested a CAD model that uses mammograms to diagnose breast cancer at an early stage. The median filtering technique was applied first to enhance the quality of the input images with increased contrast. Then, watershed and thresholding-based segmentation methodologies were applied to segment the tumor-affected regions from the preprocessed results, which increases the accuracy of classification. Moreover, three different classification approaches, namely k-nearest neighbor, SVM, and C4.5, were applied to predict the output label. The performance of these classification techniques was analyzed and compared based on detection accuracy. The major disadvantages of this work were inefficient classification, reduced performance outcomes, and the inability to handle high-dimensional data. Antari et al. [26] suggested an enhanced regional deep learning model, named the Full resolution CNN (FrCN) algorithm, for classifying breast cancer from mammograms, which comprises the working steps of mass detection, segmentation, and classification. Here, the YOLO framework was deployed for mass detection, and the FrCN model was utilized for segmenting the mass regions. Finally, the CNN architecture was deployed for accurately classifying breast cancer. Jouni et al. [27] employed an Artificial Neural Network (ANN) for the detection and classification of breast cancer at an early stage. This method intends to find the most suitable optimal solution for reducing the classification error with a reduced number of blocks. The advantages of this mechanism were a reduced error rate, minimal computational complexity, and a better optimal solution for classification. However, it still needs to reduce the dimensionality of the features in order to increase the efficiency and reduce the time consumption of the classification system.

Muduli et al. [28] implemented a Moth Flame Optimization based Ensemble Learning Machine (MFO-ELM) technique for designing an automated breast cancer detection and classification system. It mainly intends to obtain the set of features from the given mammogram images by using the wavelet transformation technique. The working stages involved in this classification were ROI extraction, wavelet transformation-based feature extraction, feature reduction using PCA with LDA, and classification based on MFO-ELM. Yet, this work is limited by high complexity in algorithm design, increased effort for training and testing data, and reduced accuracy. Wang et al. [20] deployed an Extreme Learning Machine (ELM) based CNN methodology for accurately detecting breast cancer based on the set of features. The major stages involved in this work were image preprocessing, mass detection, feature extraction, and classification. Here, an adaptive mass region detection mechanism was employed to extract the ROI from the breast image, which helps to exactly spot the abnormal region. In addition, ELM based training was performed to estimate the output vector for the random weights and bias function. Increased accuracy, excellent detection efficiency, and reduced design complexity were the main benefits of this effort.

In the paper [29], the Naïve Bayes (NB) classification approach was utilized for detecting breast cancer from mammograms with increased performance results. The main factor of this work was to categorize the benign and malignant tumors based on the weight attributes calculation of the NB classifier. Here, different types of feature vectors have been extracted from the input image to improve the detection accuracy.

Khuriwal and Mishs [30] utilized an enhanced deep learning model for detecting breast cancer from pathological images. Here, the ROI was extracted initially from the input image, and preprocessing was performed on the filtered image to remove the noise that exists in the input image. Subsequently, the CNN model was used to extract the features from the preprocessed outcome, which helps to detect the cancer-affected region with increased accuracy. Thawkar et al. [31] looked at effective techniques for mass classification and extraction of features in digital mammograms. The genetic ensemble method combined with tenfold cross-validation was used to choose the subsets of significant features. AdaBoost, RF, and single DT classifiers were trained and tested on the most significant feature subset that had the best classification accuracy. The results of the suggested strategy demonstrate that ensemble classifiers are more effective than single classifiers.

Though the literature has supplied sufficient information on existing models for breast cancer diagnosis, we believe that a less complex feature selection method is required to improve accuracy and diagnostic efficiency. To overcome this challenge, a GA-CNN technique for assisting breast cancer diagnosis is proposed in this study. With a reduced image feature set, the GA-CNN technique is used to identify masses as malignant or benign. The automated detection of breast cancer is still a challenge. In the proposed breast CAD technique, a genetic algorithm is utilized to select relevant and significant features, and a CNN is employed to increase breast cancer detection accuracy. Deep learning approaches have outperformed general approaches in a variety of applications [32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70]. This highlights the power of deep learning methods in advancing breast cancer research.

3 Materials and Methods

A clear description of the working approach, including its flow and algorithmic phases, is offered here. This work mainly aims to process medical images by using advanced segmentation and classification mechanisms. The main intention of this paper is to accurately predict breast cancer from the given mammogram image dataset with increased detection efficiency and reduced time consumption.

Typically, different types of medical image segmentation and classification techniques have been utilized for segmenting and classifying the breast cancer affected region from the given input. Still, it is limited by the problems of increased complexity in algorithm design, reduced level of accuracy, and high misclassification results. Therefore, the proposed research intends to create a hybrid methodology for breast cancer detection and classification that is both efficient and effective. The proposed breast cancer detection system’s overall flow is shown in Fig. 1 and it comprises the following modules:

  1. Data collection

  2. Preprocessing and contrast enhancement

  3. Segmentation

  4. Feature mapping

  5. CNN based Classification

Fig. 1 The proposed breast cancer detection system’s overall flow

Also, the overall steps involved in the research methodology are illustrated as follows:

Step 1 Get the input mammogram image from the given Dataset;

Step 2 Apply filtering technique for noise removal;

Step 3 Enhance the contrast by using adaptive histogram equalization;

Step 4 Segment the image based on color dense level;

Step 5 Extract the features from the segmented image with respect to the higher dense tissue;

Step 6 Apply Genetic Algorithm (GA) for optimally selecting the set of features;

Step 7 The CNN based classification approach is applied to predict the classified label;

The active or pertinent features for the classification of breast cancer are chosen using an optimal feature selection genetic algorithm and the CNN classifier was used to predict masses. The method of selecting the optimal feature set from enormous data sets reduces computational complexity, increases algorithm performance, and allows for the evaluation of correct and effective outcomes.

3.1 Image Preprocessing and Contrast Enhancement

After getting the input image from the given dataset, the preprocessing is executed for noise elimination and contrast enhancement. Conventionally, various filtering approaches such as mean filter, median filter, adaptive median, and other hybrid filters are used for preprocessing the medical image data. Among the other filtering models, the Gaussian filtering technique is most suitable for medical imaging applications. Also, it provides the following advantages:

  • Simple implementation

  • Effective in noise removal

  • Increased ability to handle both the salt and pepper noise

  • Symmetric in nature

The overall detection efficiency of the classification technique highly depends on the quality of the input image, so image enhancement is one of the most important parts of any medical imaging application. In this work, both the Gaussian filtering and Adaptive Histogram Equalization (AHE) techniques are utilized for noise removal and quality enhancement. The main benefit of this combination is that breast masses can be extracted efficiently with high contrast and quality. The steps involved in the AHE are illustrated as follows:

Step 1 Obtain the input gray image;

Step 2 Split the image into 3 \(\times\) 3 local matrices;

Step 3 Calculate the brightness of each local matrix;

Step 4 Enhance the local matrix by using the nearest local matrix;
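The preprocessing stage can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact implementation used in this work: the kernel size and sigma of the Gaussian filter are assumed values, "3 × 3 local matrices" is read as a 3 × 3 grid of tiles, and Step 4 (blending with neighbouring tiles) is omitted for brevity, so each tile is equalized independently. The function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Return a normalized 2-D Gaussian kernel (size and sigma are assumed values)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def gaussian_filter(img, size=5, sigma=1.0):
    """Smooth a grayscale image; borders are handled by replicating edge pixels."""
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    h, w = img.shape
    for di in range(size):
        for dj in range(size):
            out += k[di, dj] * padded[di:di + h, dj:dj + w]
    return out

def equalize_tile(tile):
    """Histogram-equalize one tile of 8-bit (uint8) intensities."""
    hist = np.bincount(tile.ravel(), minlength=256)
    cdf = hist.cumsum() / tile.size
    return np.round(255 * cdf[tile]).astype(np.uint8)

def tiled_equalization(img, grid=3):
    """Equalize each cell of a grid x grid partition of a uint8 image independently."""
    out = np.empty_like(img)
    row_edges = np.linspace(0, img.shape[0], grid + 1).astype(int)
    col_edges = np.linspace(0, img.shape[1], grid + 1).astype(int)
    for r0, r1 in zip(row_edges[:-1], row_edges[1:]):
        for c0, c1 in zip(col_edges[:-1], col_edges[1:]):
            out[r0:r1, c0:c1] = equalize_tile(img[r0:r1, c0:c1])
    return out

# Example on a synthetic low-contrast image (a stand-in for a mammogram).
rng = np.random.default_rng(0)
image = rng.integers(100, 110, size=(96, 96)).astype(np.uint8)
smoothed = np.clip(np.round(gaussian_filter(image)), 0, 255).astype(np.uint8)
enhanced = tiled_equalization(smoothed)
```

Equalizing tiles independently can leave visible seams at tile borders; interpolating between the mappings of neighbouring tiles, as Step 4 suggests, is the usual remedy.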

3.2 Markov Random Adaptive Segmentation (MRAS)

In this stage, the preprocessed image is taken as the input for segmentation to accurately detect the tumor-affected region. For this purpose, the Markov Random Adaptive Segmentation (MRAS) methodology is utilized in this work, which segments the tumor region by selecting a random value for each group. Normally, segmentation is mainly used to make the image analysis part easier. In existing detection methodologies, various segmentation approaches have been used to improve the performance of medical imaging applications, but they are limited by over-segmentation, difficulties in handling textures, and complex ROI extraction. To address these issues, this research model implements a random value-based image segmentation method, which efficiently extracts the ROI with high accuracy. After obtaining the contrast-enhanced image as the input, a random value is selected for each group. Then, the nearest value is computed for each group of pixels, and its mean value is estimated correspondingly. This estimation is repeated until the closest point of each group is reached. If there is any change in the random value, the above processes are repeated. Consequently, the Highest Brightness Pixel (HBP) group is selected and its region is extracted from the grayscale image. Moreover, the segmented region is utilized for further processes like optimization and classification. The detailed processes involved in this algorithm are illustrated as follows:

Step 1 Obtain the preprocessed image as input (Enhanced image, number group);

Step 2 Select the random value of each group;

Step 3 Estimate the nearest value of each group by using the following model:

$$f\left( {C,n} \right) = \sum\limits_{i = 1}^{M} {\sum\limits_{j = 1}^{N} {|x_{ij} - x_{tn} |} }$$

Step 4 Compute the group mean value for each group;

Step 5 Repeat Step 3 until the closest point of each group is identified separately;

Step 6 If there is any change in the random value, repeat from Step 4; else, finish;

Step 7 Select the Highest Brightness Pixel (HBP) group;

Step 8 Extract the area of HBP.

Step 9 Filter and extract the HBP values in Gray image;
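One possible reading of Steps 1 to 9 is a k-means-like grouping of pixel intensities that uses the absolute difference from the model in Step 3 as the distance. The sketch below follows that reading and should be taken as an assumption rather than the exact MRAS procedure; the function name, the number of groups, and the iteration cap are hypothetical.

```python
import numpy as np

def segment_brightest_group(img, n_groups=3, max_iter=100, seed=0):
    """Group pixel intensities and return the mask and pixels of the brightest group."""
    rng = np.random.default_rng(seed)
    pixels = img.astype(float).ravel()
    # Step 2: pick a random starting value for each group from the observed intensities
    # (assumes the image has at least n_groups distinct intensities).
    centers = rng.choice(np.unique(pixels), size=n_groups, replace=False)
    for _ in range(max_iter):
        # Step 3: assign every pixel to the group value with the smallest |x_ij - x_tn|.
        labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
        # Step 4: recompute each group's mean (empty groups keep their previous value).
        new_centers = np.array([
            pixels[labels == g].mean() if np.any(labels == g) else centers[g]
            for g in range(n_groups)
        ])
        # Steps 5-6: stop once the group values no longer change.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
    # Steps 7-9: keep only the Highest Brightness Pixel (HBP) group.
    brightest = int(np.argmax(centers))
    mask = (labels == brightest).reshape(img.shape)
    region = np.where(mask, img, 0)
    return mask, region

# Example: dark background, medium-intensity tissue, and a bright 4 x 4 mass.
demo = np.full((20, 20), 20, dtype=np.uint8)
demo[5:15, 5:15] = 100
demo[8:12, 8:12] = 220
mask, region = segment_brightest_group(demo)
```

Because the starting values are random, the grouping of noisy images can depend on the seed; running the procedure with several seeds and keeping the most compact grouping is a common safeguard.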

MRAS is used to enhance the visual quality of the image and to partition a digital image into several regions of pixels. Isolating the area of interest calls for a mathematical model rather than complicated procedures, and MRAS reduces the complexity of these operations while making them easier to handle.

3.3 GA Based Optimization

Typically, feature optimization is considered the primary module of a medical imaging system, because it improves the overall detection performance. A large number of features can degrade the classifier’s performance by requiring more time to train and test the models. As a result, this work makes use of a meta-heuristic optimization technique, the Genetic Algorithm (GA), for selecting features based on their fitness value. It includes the crossover, mutation, and selection operations, which automatically determine the relative importance of different features to provide the optimal set of values. In this work, the key reason for employing this strategy is to obtain a reduced set of features with the best fitness value. The processes involved in this technique are illustrated as follows:

Step 1 Optimize the neural network for the training of features using a Genetic algorithm

Step 2 It reduces the features from 37,636 to 1000.

Step 3 Get the position of a selected feature.

Step 4 Use this position to test the time.

The advantages of using this technique are improved optimization results, a high detection rate, and reduced effort for training and testing the system. Many search techniques can be applied to locate a better-quality subset. In the proposed work, GA is used to find a better, higher-quality subset of the extracted features. The components of a genetic algorithm are the population size, the number of generations, and the crossover and mutation probabilities. The search starts from an initial population created from the list of features, and the chromosomes are employed as feature filters in the genetic algorithm.
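A GA of this kind can be sketched with binary chromosomes, where each bit states whether a feature is kept. The sketch below is illustrative only: the population size, number of generations, and crossover and mutation probabilities are assumed values, and the fitness function is a hypothetical stand-in (nearest-centroid training accuracy with a small penalty on the number of kept features) rather than the CNN-based fitness used in this work.

```python
import numpy as np

def fitness(mask, X, y, penalty=0.01):
    """Hypothetical fitness: nearest-centroid accuracy minus a cost for kept features."""
    if not mask.any():
        return 0.0
    Xs = X[:, mask]
    classes = np.unique(y)
    centroids = np.stack([Xs[y == c].mean(axis=0) for c in classes])
    dists = ((Xs[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    accuracy = (classes[dists.argmin(axis=1)] == y).mean()
    return accuracy - penalty * mask.mean()

def ga_select(X, y, pop_size=20, generations=30, p_cross=0.8, p_mut=0.02, seed=0):
    """Return the best binary feature mask found and its fitness."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5          # random initial population
    best_mask, best_fit = pop[0].copy(), -np.inf
    for _ in range(generations):
        fits = np.array([fitness(ind, X, y) for ind in pop])
        if fits.max() > best_fit:
            best_fit = fits.max()
            best_mask = pop[fits.argmax()].copy()
        # Selection: binary tournament between randomly drawn individuals.
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = pop[np.where(fits[a] >= fits[b], a, b)]
        children = parents.copy()
        # Crossover: single cut point, applied to consecutive pairs.
        for i in range(0, pop_size - 1, 2):
            if rng.random() < p_cross:
                cut = rng.integers(1, n)
                children[i, cut:] = parents[i + 1, cut:]
                children[i + 1, cut:] = parents[i, cut:]
        # Mutation: flip each bit with a small probability.
        children ^= rng.random(children.shape) < p_mut
        pop = children
    return best_mask, best_fit

# Example: ten synthetic features, only the first one carries class information.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 100)
features = rng.normal(size=(200, 10))
features[:, 0] += 10.0 * labels - 5.0
selected, score = ga_select(features, labels)
positions = np.flatnonzero(selected)   # positions of the selected features
```

The positions of the set bits correspond to the "position of a selected feature" in Step 3 and can be reused to index the feature vector at test time.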

Feature maps are used to visualize the spatial characteristics of input patterns. A feature map reduces false positives and makes the classification process easier. A highly informative image such as a mammogram yields a large number of input features. When the segmentation process was finished, each training image was separated into segments, compared on the basis of color and shape, and a different set of features was retrieved from each segment. In the feature mapping process, each filter is 3 × 3 and the input image is 256 × 256. There are 7 convolutional layers and 7 pooling layers, resulting in a final feature map of size 228 × 228, which is flattened into a vector of length 51,984. The feature length is then reduced by the GA optimization, which handles feature selection.
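The stated sizes can be checked with a short calculation. The sketch assumes that every convolution and every pooling layer uses an unpadded 3 × 3 window with stride 1, so that each layer shrinks the side length by 2; this is an assumption chosen because it reproduces the reported 228 × 228 map, and the helper names are hypothetical.

```python
def output_size(size, window=3, stride=1):
    """Side length after an unpadded sliding-window operation."""
    return (size - window) // stride + 1

def feature_map_side(input_side=256, n_blocks=7, window=3):
    """Apply n_blocks of (convolution, pooling), both assumed 3 x 3 with stride 1."""
    side = input_side
    for _ in range(n_blocks):
        side = output_size(side, window)  # convolution layer
        side = output_size(side, window)  # pooling layer (assumed window and stride)
    return side

side = feature_map_side()
print(side, side * side)  # 228 51984
```

With 14 such layers the side length is 256 − 14 × 2 = 228, and flattening gives 228 × 228 = 51,984 features.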

3.4 Classification

After selecting the best combination of features, the normal and abnormal images are categorized by using the CNN based deep learning model. Compared with machine learning models, deep learning techniques have been extensively used in recent years due to their increased accuracy, reduced complexity, and improved detection efficiency. Typically, mammogram images contain large gray value areas, which decrease the accuracy of the classifier. So, it is highly important to separate the mammary area, represented as the ROI, from the given mammogram images. Here, the mass region is detected by sequentially scanning all rows of the image. Consequently, sub-region partitioning is performed to separate the non-overlapping sub-regions. After that, the deep, morphological, density, and texture features are extracted from the sub-regions, which helps to improve the accuracy and detection efficiency of classification. The deep features capture the most important characteristics of the images. The morphological features are considered essential indicators for medical experts to differentiate the tissues of benign and malignant masses. Similarly, the texture and density features aid in the detection of breast cancer in its early stages. The density feature is extracted by computing the correlation between the benign and malignant portions.

At last, the CNN technique utilizes these feature vectors for predicting the classified label into seven different classes, which include:

  1. CALC–Calcification

  2. CIRC–Well-defined/circumscribed masses

  3. SPIC–Spiculated masses

  4. MISC–Other, ill-defined masses

  5. ARCH–Architectural distortion

  6. ASYM–Asymmetry

  7. NORM–Normal

The CNN technique comprises convolution, max pooling, and fully connected layers. Compared with other classification models, the CNN technique has the benefits of better generalization performance, reduced parameter sensitivity, and high learning speed. Thus, this research work utilizes the CNN based classification model for accurately classifying breast cancer from mammogram images.

Step 1 Consider a CNN input of size 254 \(\times\) 254;

Step 2 Define the kernel size as 3;

Step 3 The first convolution layer filters the 254 \(\times\) 254 input image with a 3 \(\times\) 3 kernel;

Step 4 The output of the convolution layer is 250 \(\times\) 250:

$$CL^{t} \left( {k1,k2} \right) = \sum\limits_{i,j} {KV^{t,l} \left( {i,j} \right) \cdot IP^{k1} \left( {k1 - i,k2 - j} \right)} + BA^{t,l}$$

where KV indicates the tth kernel and \(BA^{t,l}\) is the bias of the tth layer;

$$OP^{t} \left( {k1,k2} \right) = \frac{1}{{1 + e^{{ - CL^{t} \left( {k1,k2} \right)}} }}$$

Step 5 The convolution layer is connected to the max pooling layer, and the convolution operation is continued up to the 15th layer;

Step 6 The classified output is predicted as follows: CALC, CIRC, SPIC, MISC, ARCH, ASYM, and NORM;
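A single convolution, activation, and pooling stage of the kind described in Steps 3 to 5 can be sketched as follows. The sketch assumes no padding and stride 1, uses the logistic sigmoid in its usual form 1/(1 + e^{-x}) as the activation, and assumes a 2 × 2 pooling window; these choices and the function names are assumptions made for illustration, not the configuration of the trained network.

```python
import numpy as np

def conv2d_valid(image, kernel, bias=0.0):
    """Unpadded 2-D convolution: out(k1, k2) = sum_ij K(i, j) * I(k1 - i, k2 - j) + bias."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    flipped = kernel[::-1, ::-1]   # flipping the kernel turns correlation into convolution
    out = np.full((h, w), float(bias))
    for i in range(kh):
        for j in range(kw):
            out += flipped[i, j] * image[i:i + h, j:j + w]
    return out

def sigmoid(x):
    """Logistic activation applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))

def max_pool(x, size=2):
    """Non-overlapping max pooling (window size is an assumed value)."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

# Example on a small random patch with a random 3 x 3 kernel.
rng = np.random.default_rng(0)
patch = rng.random((8, 8))
kernel = rng.normal(size=(3, 3))
activated = sigmoid(conv2d_valid(patch, kernel, bias=0.1))   # shape (6, 6)
pooled = max_pool(activated)                                 # shape (3, 3)
```

Under these assumptions an n × n input and a k × k kernel give an (n − k + 1) × (n − k + 1) output; other padding or stride conventions lead to different sizes.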

In this system, the output layer has 7 units, the number of hidden layers is 20, the size of each hidden layer is 10, and the input layer has 37,636 units.

The data on breast cancer were mapped, normalized, and encoded. The initial population was chosen at random. A random binary encoded vector is used as a feature selector with the GA, and a 0 bit indicates that the feature is not included. The suggested GA-CNN methodology, which uses a reduced set of features with the best fitness value, has the advantage of reducing complexity, and its predicted outcome precisely identifies cancer rather than any anomaly in the breast.

4 Results

The results and a comparative analysis of the conventional and proposed research methodologies are validated by using various measures, including sensitivity, specificity, accuracy, precision, recall, and similarity coefficients. Among these measures, sensitivity, specificity, and accuracy are widely utilized in medical imaging applications for assessing the overall detection efficiency of a methodology. Sensitivity is defined as the ratio of the number of true positives to the sum of true positives and false negatives. Similarly, specificity is defined as the ratio of the number of true negatives to the sum of true negatives and false positives. They are computed as shown below:

$$Sensitivity = \frac{TP}{{TP + FN}}$$
(1)
$$Specificity = \frac{TN}{{TN + FP}}$$
(2)

Consequently, the accuracy of the classifier is entirely based on the measures of sensitivity and specificity as shown below:

$$Accuracy = \frac{TP + TN}{{TP + TN + FP + FN}}$$
(3)
$$Precision = \frac{TP}{{TP + FP}}$$
(4)
$$F1\_Score = \frac{{2 \times Precision \times Recall}}{{Precision + Recall}}$$
(5)
$$MCC = \frac{TP \times TN - FP \times FN}{{\sqrt {(TP + FP)(TP + FN)(TN + FP)(TN + FN)} }}$$
(6)
$$Error\_Rate = 1 - Accuracy$$
(7)

where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively, and Recall is identical to Sensitivity. Figure 2 and Table 1 compare the existing and proposed tumor detection methodologies based on accuracy, sensitivity, and specificity. During this evaluation, the DT-CNN, CNN, DBN, and SAE techniques have been used for validation. The key benefit of the DT-CNN method is its high accuracy in determining whether the predicted tumor is malignant or benign. Deep belief networks (DBNs) are an excellent alternative, but the need to fine-tune the network weights and biases, as well as the number of hidden layers and neurons, makes them difficult to implement [71]. The Sparse AutoEncoder (SAE) is an automated approach [72] that uses a classifier to accomplish error-free breast cancer prediction by learning image features from the mammogram. The results reveal that the proposed GA-CNN method outperforms the other methods currently in use in terms of accuracy, sensitivity, and specificity.
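Equations (1) to (7) can be evaluated directly from the four confusion-matrix counts. The following sketch shows the calculation; the function name and the counts in the example are hypothetical and are not results of this study.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Compute the measures of Eqs. (1)-(7) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                      # Eq. (1), also called recall
    specificity = tn / (tn + fp)                      # Eq. (2)
    accuracy = (tp + tn) / (tp + tn + fp + fn)        # Eq. (3)
    precision = tp / (tp + fp)                        # Eq. (4)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)   # Eq. (5)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )                                                 # Eq. (6)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
        "error_rate": 1 - accuracy,                   # Eq. (7)
    }

# Illustrative counts only.
metrics = classification_metrics(tp=90, tn=80, fp=20, fn=10)
```

For these illustrative counts the sensitivity is 0.9, the specificity is 0.8, and the accuracy is 0.85. The sketch does not guard against empty classes, where some denominators become zero.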

Fig. 2 Accuracy, sensitivity, and specificity analysis

Table 1 Comparative analysis based on accuracy, sensitivity, and specificity

Table 2 presents the overall performance evaluation and comparison of both the conventional and proposed tumor detection methodologies. It quantitatively compares accuracy, sensitivity, specificity, precision, and F1 score. GA-CNN achieved an F1 score of 0.9952, which is higher than that of the existing model. In addition, GA-CNN outperformed the existing model in terms of accuracy, sensitivity, specificity, and precision. GA-CNN shows the best results with optimal feature selection. These results also indicate that the GA-CNN technique provides improved performance by accurately detecting the tumor region from the given mammogram images.

Table 2 Overall performance assessment

The standard deviation and the mean are typically employed in clinical and experimental investigations to illustrate the properties of sample data and to explain the findings of statistical analysis. The training statistics against feature selection are listed in Table 3. When constructing a predictive model, feature selection lowers the number of input variables. In some circumstances, reducing the number of input variables can enhance the efficiency of the model while also lowering the computing cost of modeling. The overall training performance with the GA-selected optimal features is better than that obtained when training with all the features, as shown in Table 3. If the irrelevant features are removed from the data set, GA-CNN performs much better. GA-CNN is therefore a suitable choice for finding the optimal features for classification tasks.

Table 3 Training statistics against feature selection

Figure 3 presents the comparative analysis of existing and proposed deep learning based classification techniques used for breast cancer detection. In the proposed system, AHE based contrast enhancement is performed to improve the quality of the original mammograms. The overall performance and detection efficiency of a medical image processing technique are highly dependent on the quality of the image.

Fig. 3 Sensitivity, specificity, and accuracy analysis of existing and proposed deep learning models

Due to the rising amount of high-dimensional data, feature selection has evolved into a crucial phase of data processing when developing a machine learning model. In the suggested approach, the GA is combined with the CNN to identify its best features. The improvement in the metrics obtained with GA feature optimization is sufficient to demonstrate the method’s viability. The accuracy of the findings is significantly improved for the GA-based model trained on the optimized feature selection. Compared with the existing method, the proposed GA-CNN provides increased accuracy (98.5), sensitivity (99.38), and specificity (98.4) by accurately predicting the class label.

Figure 4 compares the similarity coefficients of both existing and proposed methods based on precision, F1-score, MCC, and kappa. The closer these parameters are to their maximum value, the better the classification performs. Figure 5 shows the error values of the existing and proposed techniques with respect to FPR and overall error rate. Error rates must be established in order to assess performance, decide whether change is required, gauge the success of interventions, and provide transparency. The lower error rate of the proposed technique indicates results that are more exact and accurate.

Fig. 4 Similarity coefficients of existing and proposed techniques

Fig. 5 Error analysis

Figure 6 quantitatively compares the statistics of the GA-selected feature set with those of the feature sets selected by the standard techniques. To compare training with all features against training with the GA-determined optimal feature set, the mean and standard deviation are used as statistical validation metrics. The comparison of feature selection using a genetic algorithm with the traditional procedures supports the effectiveness of our approach. From these evaluations, it is evident that the proposed GA-CNN technique provides enhanced performance outcomes with reduced error rates through proper segmentation and classification processes.
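As an illustration of this kind of comparison, the sketch below summarizes two sets of training scores by their mean and sample standard deviation. The score lists and the function name `summarize` are hypothetical placeholders, not values from Table 3 or Figure 6.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return (mean, sample standard deviation) of a list of scores."""
    return mean(scores), stdev(scores)


# Hypothetical per-run training accuracies, NOT the paper's data.
all_features = [0.90, 0.92, 0.88, 0.91, 0.89]
ga_selected = [0.97, 0.98, 0.97, 0.99, 0.98]

all_mean, all_std = summarize(all_features)
ga_mean, ga_std = summarize(ga_selected)
print(f"all features: mean={all_mean:.4f} std={all_std:.4f}")
print(f"GA-selected : mean={ga_mean:.4f} std={ga_std:.4f}")
```

A higher mean together with a smaller standard deviation would indicate that a feature set gives both better and more stable training results across runs.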

Fig. 6 Training statistics against feature selection

When implementing the technique in a real-time application, computational complexity is a crucial factor. Table 4 compares the computational complexity of the suggested approach with that of existing tumor identification approaches, namely the hybrid of K-means and Gaussian mixture model, K-means, the Gaussian mixture model, and the growth region hand selection method, based on the input size of the breast mammogram images. According to Table 4, the suggested model offers a favorable time consumption.

Table 4 Computational complexity
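A comparison of this kind is usually obtained by timing each method on inputs of increasing size. The sketch below shows one way such a measurement could be set up; `time_over_sizes` and `threshold_segment` are hypothetical helpers, the synthetic images stand in for mammograms, and the simple thresholding workload is only a placeholder, not any of the methods compared in Table 4.

```python
import time


def time_over_sizes(process, sizes, repeats=3):
    """Return {size: best wall-clock seconds} for process(image)."""
    timings = {}
    for size in sizes:
        # Synthetic size x size grayscale image with values in 0..255.
        image = [[(r * size + c) % 256 for c in range(size)]
                 for r in range(size)]
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            process(image)
            best = min(best, time.perf_counter() - start)
        timings[size] = best   # keep the fastest run to reduce noise
    return timings


def threshold_segment(image, threshold=128):
    """Placeholder workload: global thresholding (NOT the paper's method)."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


timings = time_over_sizes(threshold_segment, sizes=[64, 128, 256])
for size, seconds in timings.items():
    print(f"{size}x{size}: {seconds:.6f} s")
```

Taking the fastest of several repeats is a common way to reduce the influence of background load on the measurement; the absolute numbers depend on the machine and are only meaningful relative to each other.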

5 Discussion

According to the obtained results, the proposed GA incorporated with the CNN technique outperforms the other techniques on the evaluated measures, because the best selection of features helps to obtain improved detection accuracy by efficiently training the samples during classification. In this system, the image over-segmentation problem is avoided by implementing the MRAS segmentation technique, which accurately detects the object boundary based on the random value selection. Moreover, proper segmentation combined with optimal feature selection improves the overall classification efficiency of the proposed system while reducing false positives and error values. This shows the overall effectiveness of the proposed breast cancer detection and classification technique over the other techniques.

6 Conclusion

This paper developed an improved optimization-based classification technique for accurately detecting breast cancer from mammogram images. The main objective of this work is to design an effective breast cancer detection system with reduced computational complexity, over-segmentation, and false positives. For this purpose, a set of image processing techniques has been implemented, which helps to improve the overall classification performance of the detection system. First, the Gaussian filtering and AHE techniques are utilized for noise removal and quality enhancement. The main benefit of these techniques is that the breast masses can be extracted efficiently with high contrast and quality. Then, the MRAS methodology is utilized for segmenting the tumor region by selecting the random value of each group. Subsequently, the GA-based optimization technique is utilized for effectively extracting the features depending on the best fitness value. It includes the selection, crossover, and mutation operations, which automatically determine the relative importance of the different features in order to provide the optimal set of values. Finally, the CNN-based deep learning classification technique is used to determine whether the given image is normal or tumor affected, together with its type. Like any adaptive process, genetic algorithms require feedback. For feature selection, this feedback is available only when a model trained on the selected features is assessed, which makes the fitness function expensive to evaluate. The performance of the proposed technique is validated by using various evaluation metrics. Then, the obtained results are compared with some of the existing techniques to demonstrate the improvement offered by the proposed system. Based on this evaluation, it is evident that the GA-CNN outperforms the other techniques with increased performance outcomes.
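The selection, crossover, and mutation loop summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: each candidate is a binary mask over the features, tournament selection and elitism are design choices made for the sketch, and `toy_fitness` is a hypothetical stand-in for the real fitness, which would require training and assessing the CNN on the selected features. Because that evaluation is expensive, the sketch caches fitness values so that each distinct mask is scored only once.

```python
import random
from functools import lru_cache


def run_ga(fitness, n_features, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search for a binary feature mask maximizing fitness(mask).

    Requires n_features >= 2. Masks are tuples of 0/1 so they can be cached.
    """
    rng = random.Random(seed)
    cached = lru_cache(maxsize=None)(fitness)  # avoid re-scoring a mask

    def tournament(pop):
        # Selection: pick two candidates at random, keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if cached(a) >= cached(b) else b

    pop = [tuple(rng.randint(0, 1) for _ in range(n_features))
           for _ in range(pop_size)]
    best = max(pop, key=cached)
    for _ in range(generations):
        next_pop = [best]  # elitism (an assumption of this sketch)
        while len(next_pop) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            if rng.random() < crossover_rate:
                # Single-point crossover between the two parents.
                cut = rng.randint(1, n_features - 1)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1
            # Mutation: flip each bit with a small probability.
            child = tuple(bit ^ 1 if rng.random() < mutation_rate else bit
                          for bit in child)
            next_pop.append(child)
        pop = next_pop
        best = max(pop, key=cached)
    return best, cached(best)


# Hypothetical fitness: rewards three "informative" features and
# penalizes mask size. A real fitness would train and evaluate the CNN.
INFORMATIVE = {0, 2, 5}


def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask, best_score = run_ga(toy_fitness, n_features=8)
print("selected features:", [i for i, bit in enumerate(best_mask) if bit])
print("fitness:", best_score)
```

Caching only helps when the same mask reappears across generations, which becomes common once the population starts to converge; with a CNN-based fitness, it does not remove the cost of scoring each new mask.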

7 Future Work

Automated segmentation of mammogram images is a challenging task. Future research will optimize self-segmentation for better performance in breast cancer diagnosis systems. In future work, we plan to incorporate an efficient deep U-Net segmentation method to enhance segmentation efficiency.