1 Introduction

Histopathology is a critical means of acquiring information about lesions, especially in cancer detection, and has a wide range of applications in health informatics. Automatic cancer classification from digital images of histology slides has therefore been gaining attention from industry, academia and research communities. The key problem in classification research is to assign every image to a class of interest based on discriminative features extracted from the image [1]. Many factors make histological image classification challenging, such as diversity in staining protocols, variation in tissue appearance across magnifications, and confounding tissue patterns [2,3,4]. State-of-the-art methods have addressed these challenges by designing a separate classifier for each magnification level, and occasionally by using a single classifier that is independent of magnification level, but only for binary classification [5,6,7,8]. However, these works do not exploit magnification-independent classification through pre-trained convolutional neural networks (CNNs) for the sub-classification of breast cancer histopathological images.

Using a pre-trained network as a feature extractor addresses the two most common problems encountered in CNN training: (1) high computational cost and (2) weight initialization [4]. For this reason, we adopt two powerful CNNs, the VGG and Xception models, both of which achieved state-of-the-art performance in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Both models were built for the classification of natural images; here, we employ them for the classification of biomedical images by altering their architecture through a transfer learning approach. In doing so, we present a faster, more robust and computationally efficient way to build an automated classification system for one of the most complicated imaging modalities. It is worth noting that the present work extends the binary classification of the BreakHis dataset by Shallu et al. [8].

The prime objective of the present study is to comparatively validate the potential of two pre-existing CNN architectures for the classification of a histopathological image dataset independently of magnification level. In addition, we pair the best pre-existing model with several conventional classification algorithms, namely random forest (RF), support vector machine (SVM), naïve Bayes (NB), logistic regression (LR), decision tree (DT) [10], k-nearest neighbour (K-NN), and linear discriminant analysis (LDA), to identify the most effective combination for the magnification-independent classification problem. To the best of the authors' knowledge, this is the first comparative study of hybrid neural networks for the multi-classification of magnification-independent histology images; the absence of state-of-the-art methods for this specific problem, however, prevents a direct comparison of our results with prior work. A sample of the input data used in the classification of breast cancer histological images is illustrated in Fig. 1.

Fig. 1

Typical information required for histological image classification which represents different microscopic structures: a fat tissue, b cell nuclei, and c collagen-rich stroma in an example of Mucinous Carcinoma at 100X [9]

2 Related work

Previous studies in the biomedical image processing domain have shown that combining a CNN with a conventional machine-learning classifier can improve diagnostic accuracy [7, 11,12,13,14,15]. The field spans segmentation [16,17,18,19], detection [20,21,22], localization [23] and classification [4, 5, 7, 23,24,25,27] across medical imaging modalities such as MRI, PET, CT, mammography, ultrasound and histopathology. The present work focuses on the classification of digitized histology images obtained from open surgical breast biopsy, so this review is limited to studies that classify breast cancer histological images. Despite considerable research interest in computer-aided cancer diagnosis through histopathological image analysis, there is a dearth of well-annotated, unified benchmarks, and most experiments are performed on private datasets [14]. Only two datasets, BreakHis [9] (8 classes, 7909 images) and BACH [28] (4 classes, 285 images), are publicly available for breast cancer classification and can serve as benchmarks for performance measurement.

The results reported on the aforesaid datasets make clear that CNNs can achieve state-of-the-art performance and are gaining popularity for automated image classification. Spanhol et al. [14] proposed a CNN and employed a patching trick to enlarge the dataset, reporting better performance on BreakHis than hand-crafted feature extraction methods. Akbar et al. [29] improved CNN performance with a new regularization technique based on a transition between the convolutional and fully connected layers of the network; however, the protocol is not clearly described in terms of how those layers are utilized, so the results are not directly comparable. Gupta and Bhavsar [30] proposed an integrated multi-scale model that combines six colour–texture representations with a cluster of heterogeneous classifiers, whose outputs are fused by majority voting; this yields better accuracy than the work of Spanhol et al. [14]. Additionally, Han et al. proposed a sub-classification model known as the class structure based deep (CSD) CNN [3], which maximizes the Euclidean distance between inter-class labels and reports better accuracy on BreakHis at both the image and patient level; data augmentation was also employed to mitigate the unbalanced dataset. CNN-based classifiers are therefore the natural choice for our work.

Other CNN-based literature leverages transfer learning to classify histopathological data, because the medical domain suffers from a scarcity of large datasets. In this context, Song et al. used a pre-trained VGG model to extract local features from breast cancer histopathology images, encoded them with Fisher vectors, and classified them using a linear support vector machine (SVM). Building on this idea, Zhi et al. [31] proposed an ensemble of three pre-trained CNNs whose outputs are combined by the max rule to make the final decision; they performed data augmentation (zooming, horizontal and vertical flips) to enlarge the dataset and reduce overfitting. Shallu et al. [7] also applied transfer learning with VGG and ResNet models on the BreakHis dataset for magnification-independent binary classification, and determined the impact of different training–testing split ratios on the classification performance of these models. We likewise perform data augmentation, but our analysis of the Xception model's potential for the sub-classification of the BreakHis dataset, in comparison with VGG16, sets this work apart from other contributions.

Several recent studies also employ feature selection frameworks to reduce the computational burden without compromising the performance of the classification model, although most of this literature concerns gastrointestinal disease or skin lesion recognition. Khan et al. [32] used VGG16 to recognize gastrointestinal disease, extracting features from two consecutive fully connected layers and selecting the most relevant ones with a metaheuristic approach. Majid et al. [33] applied a classical method of feature fusion (simple array concatenation) and selection (a genetic algorithm with a k-nearest neighbour (KNN) fitness function) for gastric infection recognition; the method achieved significant performance but did not surpass the metaheuristic approach to feature selection. Rashid et al. [34] hybridized two deep architectures (InceptionV3 and Deep Convolutional Networks for Large-Scale Image Recognition) and concatenated the features extracted from them for object detection; the features were fused with a parallel maximum covariance approach and the most discriminative ones were selected with a multi logistic regression controlled entropy-variance method. Iteration-controlled Newton–Raphson (IcNR) and variance-precise entropy-based feature reduction are also emerging as potential feature selection methods. Khan et al. [35] proposed a Newton–Raphson based feature selection framework for skin lesion recognition in which lesions were localized with Fast-RCNN and features were extracted with a pre-trained DenseNet201; the IcNR method then selected the relevant features, and performance was evaluated on the ISBI2016 and ISBI2017 datasets. Rehman et al. [36] also evaluated classification performance on the PH2, ISBI2016 and ISBI2017 datasets, extracting a variety of handcrafted features, namely histogram of oriented gradients (HOG), speeded up robust features (SURF), and colour; variance-precise entropy-based feature reduction was then applied for feature selection, and an SVM classified the reduced features with high accuracy.

3 Material and methodology

3.1 BreakHis dataset

The main aim behind the creation of the BreakHis dataset (https://web.inf.ufpr.br/vri/databases/breastcancerhistopathological-database-breakhis/) was to overcome the scarcity of large, comprehensive databases of histopathological images. The database was shared publicly to encourage researchers and machine learning engineers to develop automatic classification systems for cancer diagnosis. Spanhol et al. [9] laid its foundation in collaboration with the P&D (Pathological Anatomy and Cytopathology) Laboratory, Parana, Brazil. BreakHis contains 7,909 digitized histopathological images collected from 82 patients at four magnification factors (40X, 100X, 200X and 400X). The database is broadly split into two categories of cancer, benign (B) and malignant (M), corresponding to non-invasive and invasive cancer, respectively: non-invasive cancer remains confined to the breast, whereas invasive cancer affects nearby organs as well. Each category is further split into four sub-categories, as presented in Table 1. The database is released in a single image format (PNG), consisting of 3-channel RGB images of 700 × 460 pixels with an 8-bit depth per channel.

Table 1 Distribution of images in BreakHis dataset [9]

In the present study we perform magnification-independent sub-classification of histological images, so images at all magnification factors are pooled to build the training set; a sketch of this pooling appears after Fig. 3. All classes in the BreakHis dataset are depicted in Fig. 2, and the distribution of images per class for training and testing is shown in Fig. 3. The training set contains an equal number of images per class (380), whereas the testing set contains a different number of images per class.

Fig. 2

List of Classes in BreakHis database with an image sample and a label

Fig. 3

Histogram of image distribution per class for the training and testing sets, employed during the execution of this work. Note the unbalanced distribution of images in the testing set
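To make the pooling concrete, the following minimal Python sketch assembles a magnification-independent training set by merging images from all four magnification factors before a balanced 380-image split per class. The directory layout, the DATA_ROOT path and the folder names are illustrative assumptions, not the released structure of the dataset.

```python
import glob
import os
import random

# Hypothetical layout: breakhis/<class_name>/<magnification>/*.png
DATA_ROOT = "breakhis"              # assumption: local copy of the dataset
MAGNIFICATIONS = ["40X", "100X", "200X", "400X"]
TRAIN_PER_CLASS = 380               # balanced training set, as in the paper

train_set, test_set = [], []
for cls in sorted(os.listdir(DATA_ROOT)):
    # Pool images from all four magnifications into one list per class,
    # which makes the downstream classifier magnification independent.
    images = []
    for mag in MAGNIFICATIONS:
        images += glob.glob(os.path.join(DATA_ROOT, cls, mag, "*.png"))
    random.shuffle(images)
    train_set += [(path, cls) for path in images[:TRAIN_PER_CLASS]]
    test_set += [(path, cls) for path in images[TRAIN_PER_CLASS:]]  # unbalanced
```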

3.2 Data-augmentation

Augmentation is a process in which information-preserving transformations are applied to image samples to reduce the overfitting caused by an unbalanced dataset and to improve generalization [37]. Available data augmentation techniques can be broadly categorized into two classes: basic image manipulations and deep learning approaches. Basic image manipulations comprise kernel filters, colour space transformations, geometric transformations, random erasing and image mixing, while adversarial training, neural style transfer and GAN-based data augmentation fall under deep learning approaches.

A limited dataset is the most prominent challenge in medical image analysis. Patient privacy, expensive procedures, rare diseases, the manual effort of medical data processing and the expertise required make building large datasets very difficult [38,39,40]. For this reason, the adoption and evolution of new data augmentation approaches are imperative. However, preserving label information after transformation is a major concern when applying data augmentation; this property is also termed 'safety of application'. For example, rotation is safe for natural images but not for digit data, where rotation may alter the information contained in the image (e.g. turning a 6 into a 9). For histopathological images, rotation and flipping are the safest augmentation techniques, since such images are invariant to rotation and flipping.

Data augmentation can be conducted in two ways, offline and real-time, as illustrated in Fig. 4. In offline augmentation the images undergo the same transformations as in real-time mode, but the training set grows significantly in size, which leads to large memory and training time requirements. For better memory management we have therefore opted for real-time data augmentation, performed with the ImageDataGenerator module from Keras: random rotations and random flips are applied to the images. Photometric transformations are not applied, to avoid generating unrealistic training samples that could degrade system performance.

Fig. 4

Data Augmentation Methods: a Off-line Data Augmentation, b Real-time Data Augmentation
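A minimal sketch of the real-time pipeline described above, built on Keras' ImageDataGenerator, is given below. The paper states only that random rotations and flips were applied; the rotation range, rescaling, image size, batch size and directory path are illustrative assumptions.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Real-time (on-the-fly) augmentation: each batch is transformed in memory,
# so the dataset on disk never grows. Rotation and flips are label-safe for
# histopathology images; photometric transforms are deliberately omitted.
train_gen = ImageDataGenerator(
    rescale=1.0 / 255,       # assumption: pixel values scaled to [0, 1]
    rotation_range=90,       # assumption: exact range not stated in the paper
    horizontal_flip=True,
    vertical_flip=True,
)

# Hypothetical directory with one sub-folder per breast-cancer sub-class.
train_batches = train_gen.flow_from_directory(
    "breakhis/train",
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical",
)
```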

3.3 Fundamental framework

The transfer learning approach has been implemented efficiently to improve classification performance at minimal computational cost and training time in a variety of computer vision problems [22, 41, 42]. The major rationale for transfer learning is twofold: (1) network training starts from patterns already learned on a dataset of a related problem, instead of starting from scratch, and (2) optimization algorithms may fail to converge to a global minimum when the network weights are initialized randomly, whereas pre-trained weights provide a better starting point for reaching optimal results. In supervised machine learning, the algorithm receives training instances from an unknown function, defined as:

$$n = f\left( m \right)$$
(1)

where 'n' is the class label and 'm' is the feature vector. The algorithm then searches for a function 'g' within a hypothesis space 'H'. As reported in the literature [7, 13, 43], transfer learning can be performed by training the architecture of an existing CNN from scratch, by using the weights of a pre-trained network to initialize training, by fine-tuning an existing network on new data, or by using the pre-trained network as a feature extractor. Deploying an existing architecture trained from scratch on a new dataset consumes substantial computational resources; in contrast, initialization, feature extraction and fine-tuning from any layer of an existing network reduce the computational cost. Since the BreakHis dataset is much smaller than ImageNet, using the pre-trained network as a feature extractor is the most suitable way to implement transfer learning here.

In the current work we compare three CNN architectures, namely VGG16, Xception and, as a baseline, ResNet50. The selection of the VGG16 model is based on previous work by Shallu et al., in which the network was used for binary rather than multi-class classification [7]; their experimental analysis showed that a pre-trained VGG16 employed as a feature extractor performed far better than VGG19 and ResNet50 for binary classification. For this reason, a hybrid architecture combining the VGG16 model with a conventional classifier is reconsidered here for magnification-independent multi-class classification of histology images. The outstanding ILSVRC performance of Xception compared with ResNet and Inception [44] motivated us to build a hybrid Xception architecture for further sub-classification of the breast cancer images. These hybrid architectures are trained by freezing the weights of network layers learned on the ImageNet dataset. In every architecture the classification layer is replaced by a new one with 8 nodes for the 8 sub-classes of breast cancer. This shallow model is then hybridized with a logistic regression (LR) classifier trained on the extracted features, as depicted in Fig. 5. The parameters between the feature extractor outputs (li) and the output layer (ni) are estimated using the BreakHis dataset.

Fig. 5

Schematic for the training protocol of VGG16 and Xception model for magnification independent multiclassification of breast cancer
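A compact sketch of the hybrid scheme of Fig. 5, under stated assumptions, looks as follows: a frozen ImageNet-pretrained Xception trunk acts as the feature extractor, a flatten layer is appended, and scikit-learn's LogisticRegression is trained on the extracted features. The placeholder arrays stand in for preprocessed BreakHis images, and the input size and classifier settings are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from tensorflow.keras.applications import Xception
from tensorflow.keras.layers import Flatten
from tensorflow.keras.models import Model

# Frozen Xception trunk with ImageNet weights; the original
# classification head is discarded (include_top=False).
base = Xception(weights="imagenet", include_top=False,
                input_shape=(299, 299, 3))
base.trainable = False
extractor = Model(base.input, Flatten()(base.output))

# Placeholders standing in for preprocessed BreakHis images and the
# integer labels of the 8 sub-classes (one toy sample per class here).
x_train = np.random.rand(8, 299, 299, 3).astype("float32")
y_train = np.arange(8)
x_test, y_test = x_train, y_train

# The conventional classifier is trained on the extracted features only.
features_train = extractor.predict(x_train)
features_test = extractor.predict(x_test)
clf = LogisticRegression(max_iter=1000)
clf.fit(features_train, y_train)
print("test accuracy:", clf.score(features_test, y_test))
```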

3.4 Training details

Both networks are implemented using the Keras and TensorFlow frameworks and trained on an NVIDIA GeForce 940MX. Each network is trained with the Adam optimizer, which combines the best properties of the AdaGrad and RMSProp algorithms; its easy implementation, modest memory requirements and fast convergence make it a reliable route to a good minimum. The pre-trained VGG16 contains 12,420,544 parameters in its convolutional layers and 102,764,544 parameters in its first fully connected layer, trained on 1.2 million images annotated with 1000 semantic classes. Similarly, the Xception model contains 20,861,480 parameters up to its global average pooling layer, trained on the same 1.2 million ImageNet images for a coherent comparison. The learning rate and scheduling rate were tuned through an extensive set of trial-and-error experiments, as both heavily influence the convergence of CNNs: a larger learning rate generally failed to converge, while a smaller one unnecessarily delayed convergence. In our experiments, a learning rate of 10⁻⁵ and a scheduling rate of 0.925 proved to be a reasonable choice. In the devised framework, the pre-trained weights are used to extract features from the data, and the extracted features are passed to the conventional classifier, i.e. logistic regression (LR), by appending a flatten layer before the classifier.
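A brief sketch of this optimizer configuration is shown below. Since the paper does not specify how the scheduling rate is applied, interpreting it as a per-epoch multiplicative decay of the learning rate is our assumption.

```python
from tensorflow.keras.callbacks import LearningRateScheduler
from tensorflow.keras.optimizers import Adam

INITIAL_LR = 1e-5        # learning rate reported in the paper
SCHEDULING_RATE = 0.925  # assumption: applied as a per-epoch decay factor

def schedule(epoch, lr):
    # Shrink the learning rate geometrically as training progresses.
    return INITIAL_LR * (SCHEDULING_RATE ** epoch)

optimizer = Adam(learning_rate=INITIAL_LR)
lr_callback = LearningRateScheduler(schedule, verbose=1)

# Hypothetical usage with a compiled Keras model:
# model.compile(optimizer=optimizer, loss="categorical_crossentropy",
#               metrics=["accuracy"])
# model.fit(train_batches, epochs=30, callbacks=[lr_callback])
```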

4 Results and discussion

In this section, the performance of the hybridized VGG16, ResNet50 and Xception models is compared for the multi-classification of the BreakHis dataset. A balanced training set of 380 images per class is used for model training, and the remaining images of the dataset are used to compute model performance.

4.1 Classification performance

The confusion matrices for the three hybrid models under the optimized hyper-parameter configuration are presented in Fig. 6. Statistically, a comparison of the confusion matrices of the VGG16, Xception and ResNet50 models shows that the Xception model classifies the histopathological images more precisely than VGG16, while ResNet50 provides the lowest performance, as can be observed from its asymmetric confusion matrix. The hardest classes for VGG16 are fibroadenoma (mainly confused with phyllodes tumor) and lobular carcinoma (mainly confused with adenosis, ductal carcinoma and mucinous carcinoma). The network also shows high confusion between papillary and ductal carcinoma and between tubular adenoma and fibroadenoma, along with confusion among phyllodes tumor, lobular carcinoma and adenosis. The Xception model, on the other hand, is chiefly confused when differentiating lobular carcinoma from ductal carcinoma, and fibroadenoma from phyllodes tumor and tubular adenoma. These are mistakes that even an expert pathologist could easily make, as phyllodes tumors share the lobulated, round-mass appearance of fibroadenomas [37]. The images mislabelled by the automated system (Fig. 7) are evidence that these similar appearances can trick even an automated classifier.

Fig. 6

Confusion matrices for the hybridized a VGG16, b Xception and c ResNet50 models

Fig. 7

Test samples that were mislabelled by the Xception model

The discriminating potential of the three deep learning networks is evaluated using the following performance metrics: accuracy, recall, precision, the receiver operating characteristic (ROC) curve, the area under the ROC curve (AUC), the precision–recall (APR) curve and the average precision score (APS), all computed from the acquired confusion matrix as follows:

$$Pre_{i} = \frac{T_{p_{i}}}{T_{p_{i}} + F_{p_{i}}}$$
(2)
$$Rec_{i} = \frac{T_{p_{i}}}{T_{p_{i}} + F_{n_{i}}}$$
(3)

where

$$T_{p_{i}} = C_{i,i}$$
(4)
$$F_{p_{i}} = \sum_{j = 1,\, j \ne i}^{m} C_{j,i}$$
(5)
$$F_{n_{i}} = \sum_{j = 1,\, j \ne i}^{m} C_{i,j}$$
(6)

\(C_{i,i}\) is the number of correctly classified images of the ith class, and \(C_{i,j}\) is the number of images of the ith class classified as the jth class by the automated system. \(T_{p_{i}}\), \(F_{p_{i}}\) and \(F_{n_{i}}\) denote the numbers of true positives, false positives and false negatives for the ith class, respectively. The F-score is the weighted sum of the F-measures \(F_{i}\) computed for each class separately, using the following formulas:

$$F_{i} = \frac{2 \times Pre_{i} \times Rec_{i}}{Pre_{i} + Rec_{i}}$$
(7)
$$F\text{-}score = \frac{\sum_{i = 1}^{m} F_{i} \times w_{i}}{\sum_{i = 1}^{m} w_{i}}$$
(8)

where \(w_{i}\) is the weight of the ith class. The ROC analysis shows that the AUC obtained by the Xception model for each class exceeds that of VGG16 except for fibroadenoma (class 2), as shown in Fig. 8a, b, while the AUC obtained by the ResNet50 model is the lowest for every class, as illustrated in Fig. 8c. In addition, the highest APS (Table 2) of the Xception model reveals its ability to extract distinctive features from all sub-classes of breast cancer via transfer learning, owing to the linear stacking of depth-wise separable convolution layers with residual connections in its architecture. The Xception model nevertheless falls somewhat short in predicting fibroadenoma precisely, because of greater confusion in extracting its distinguishing characteristics, especially from phyllodes tumor and tubular adenoma. The literature reports a similar trend for the classification of the ImageNet dataset, on which the top-1 accuracy achieved by VGG16 was 71.5% and by Xception 79% [44]; Xception was also claimed to show a more pronounced improvement on the JFT dataset than on ImageNet. We believe the Xception model would deliver better results for the current classification problem if weights learned on the JFT dataset were used for its training.
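Equations (2)–(8) translate directly into a few lines of NumPy over a confusion matrix, as the sketch below shows; weighting each class by its support (row sum) is our assumption, since the paper does not state the weights \(w_{i}\).

```python
import numpy as np

def per_class_metrics(C, weights=None):
    """Precision, recall and weighted F-score from a confusion matrix C,
    following Eqs. (2)-(8); rows = true class, columns = predicted class."""
    C = np.asarray(C, dtype=float)
    tp = np.diag(C)                     # Eq. (4): diagonal entries
    fp = C.sum(axis=0) - tp             # Eq. (5): column sum minus diagonal
    fn = C.sum(axis=1) - tp             # Eq. (6): row sum minus diagonal
    precision = tp / (tp + fp)          # Eq. (2)
    recall = tp / (tp + fn)             # Eq. (3)
    f_i = 2 * precision * recall / (precision + recall)  # Eq. (7)
    if weights is None:
        weights = C.sum(axis=1)         # assumption: weight by class support
    f_score = (f_i * weights).sum() / np.sum(weights)    # Eq. (8)
    return precision, recall, f_score

# Toy 3-class example; the real BreakHis matrices are 8 x 8.
C = [[50, 3, 2], [4, 40, 6], [1, 5, 44]]
prec, rec, f = per_class_metrics(C)
print(prec, rec, f)
```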

Fig. 8

ROC and APR curve analysis in the sub-classification of breast cancer histopathological images using a VGG16, b Xception and c ResNet50 model

Table 2 Performance analysis for sub-classification of the histopathological image using VGG16, Xception model and ResNet50 with Logistic Regression

In addition, the running time of each designed classification model has been computed to determine its overall efficiency in a comparative manner. The total execution time of each model is measured with high precision through OpenCV library functions, and the results are tabulated in the last column of Table 2. Each experiment is repeated 10 times to suppress interference from operating system routines, and the reported running time (in minutes) is the average over these repetitions. The results show that the ResNet50 model takes the minimum execution time of 26.10 min for training and testing, while the VGG16 model requires the maximum, 39.72 min. The Xception model provides the best overall trade-off, with a running time of 36.57 min and an accuracy of 90%, the highest among all the designed models (Fig. 9). Although ResNet50 classifies the test data in the least time, it also has the lowest accuracy; hence, the Xception-based hybrid model is the most efficient classifier for histopathological images in terms of both accuracy and running time.
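A plausible sketch of this timing procedure, using OpenCV's tick counter and averaging over the 10 repetitions described above, is given below; the run_experiment callable wrapping training and testing is hypothetical.

```python
import cv2

def time_pipeline(run, repeats=10):
    """Average wall-clock time of `run` over several repetitions,
    measured with OpenCV's high-resolution tick counter."""
    times = []
    for _ in range(repeats):
        start = cv2.getTickCount()
        run()  # one full train + test pass of the classification model
        elapsed = (cv2.getTickCount() - start) / cv2.getTickFrequency()
        times.append(elapsed / 60.0)   # seconds -> minutes
    return sum(times) / len(times)

# Hypothetical usage:
# avg_minutes = time_pipeline(run_experiment)
```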

Fig. 9

Classification performance analysis for sub-classification of the histopathological image using VGG16, Xception and ResNet50 model with Logistic Regression classifier

4.2 Conventional machine learning algorithms impact on classification performance

The methodology implemented here classifies breast cancer histopathological images by extracting features via transfer learning, with a logistic regression algorithm producing the final outcome. To determine how the classification performance of the Xception model changes, different conventional machine learning algorithms are substituted for logistic regression, and each variant is evaluated on the same testing set for a coherent comparison. The experimental results show that pairing the same feature extractor (Xception) with different conventional algorithms yields different values of the evaluation metrics, such as accuracy, precision, recall, F-score and running time, as tabulated in Table 3. The SVM reported its best accuracy of 82.78%, with a sensitivity of 0.81 and a running time of 36.19 min, when a linear kernel with penalty parameter C = 5 was used. Gradually increasing the penalty parameter from 1 to 5 raised the accuracy of the linear-kernel SVM from 81.79% to 82.78%; the penalty parameter plays a crucial role in classification performance, as it controls how much misclassification is tolerated by relaxing the constraints. Changing the kernel from linear to radial basis function or sigmoid, however, dropped the accuracy of the model by a significant 8.94% and 16.89%, respectively, while the running time of the RBF-kernel SVM increased to 66 min.

Table 3 Effect of different conventional machine learning algorithm on the classification performance of the Xception model
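Once the features are extracted, swapping the classification head is a one-line change; the sketch below enumerates the conventional classifiers considered in this study via scikit-learn. Only the penalty C = 5 for the linear SVM comes from the paper; all other hyper-parameters are assumed defaults.

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Candidate heads trained on the same Xception features; C=5 for the
# linear SVM follows the paper, other hyper-parameters are defaults.
classifiers = {
    "LR": LogisticRegression(max_iter=1000),
    "SVM-linear": SVC(kernel="linear", C=5),
    "SVM-rbf": SVC(kernel="rbf"),
    "SVM-sigmoid": SVC(kernel="sigmoid"),
    "RF": RandomForestClassifier(),
    "NB": GaussianNB(),
    "DT": DecisionTreeClassifier(),
    "KNN": KNeighborsClassifier(),
    "LDA": LinearDiscriminantAnalysis(),
}

# features_train/test and labels come from the extractor sketched earlier:
# for name, clf in classifiers.items():
#     clf.fit(features_train, y_train)
#     print(name, clf.score(features_test, y_test))
```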

To better illustrate how combining different conventional machine learning algorithms with the same model changes its classification performance, detailed results are shown in Fig. 10 (precision, recall and F-score) and Fig. 11 (running time). The bar graphs indicate significant differences in classification ability and running time as the algorithm used for the final decision changes. The decision tree classifier yields the minimum value of every performance metric except running time, while SVM and logistic regression yield the maximum. This occurs because different algorithms make different assumptions about the data and have distinct convergence rates. A decision tree divides the high-dimensional feature space into hyper-rectangles under the assumption that decision boundaries are parallel to the axes; many partitions are made, and the tree grows to represent ever more complex functions, which raises the problem of overfitting. SVMs, in contrast, have high predictive power because they avoid overfitting. The logistic regression classifier, despite its probabilistic framework, assumes a linear decision boundary that may have any orientation, so it performs classification well when the feature space is not easily separated by axis-parallel boundaries; moreover, logistic regression is an inherently simple, low-variance classifier, which makes it less prone to overfitting.

Fig. 10

Effect of different conventional machine learning algorithms on the classification performance of the Xception model in terms of precision, recall and F-score

Fig. 11

Classification performance analysis for sub-classification of the histopathological image using different conventional machine learning algorithms with the Xception model

5 Conclusion

Automatic classification of the different types of breast cancer is critical, as making firm diagnostic decisions about a patient's health and survival places a heavy burden on medical practitioners. Many research communities and healthcare industries are working to resolve the problem of early diagnosis and prognosis of cancer. In this context, the present work identifies the potential of pre-trained networks and develops a framework capable of automatically classifying the sub-classes of breast cancer in histopathology images, with the considered histology images treated independently of their magnification factor. It is worth mentioning that the devised technique greatly improves computational speed, which makes it attractive when training time is constrained. The VGG16 and Xception deep models are used as feature extractors in conjunction with conventional classifiers (such as logistic regression, SVM and decision tree) to create an efficient hybrid model for the classification problem. The Xception deep model is observed to outperform the VGG16 model because it uses its parameters more effectively through the depth-wise separable convolution layers in its architecture, which makes it the more competent of the two for this classification task.

Moreover, the adoption of a real-time (online) data augmentation technique is also proposed, which further improves the performance of the designed framework while managing memory efficiently. The experimental results confirm that the performance of the Xception model, in terms of accuracy, AUC, APS and running time, is highly competitive with state-of-the-art techniques for the classification of breast cancer histology images.