1 Introduction

Artificial Neural Networks (ANNs) have become very popular, and many researchers use them to develop solutions for a wide range of applications. Neural network architectures have been evolving within machine learning since 1943, when McCulloch and Pitts [21] developed the first neuron model, a very simple model that takes binary decisions from several input factors. The first generation of multilayer perceptrons was introduced by Ivakhnenko [14], with the idea of making decisions by connecting a large number of artificial neurons in layers and processing them to achieve a desired result. Werbos [31] and Rumelhart et al. [26] used the backpropagation technique for training ANN models. The Universal Approximation Theorem [13] states that a network of neurons with a single hidden layer can, given sufficiently many hidden neurons, approximate any continuous function to any desired precision. ANNs have since achieved milestones in applications such as image classification, speech recognition, language processing, and series prediction. Pathological diagnosis services and medical image analysis rely mainly on machine learning. Classification with traditional machine learning methods involves a sequence of steps: preprocessing, feature extraction, feature selection, learning, and classification. ANNs with a single or a few hidden layers do not perform well on some applications, such as medical image segmentation, feature extraction, classification, and pattern recognition on images. Deep learning, which is based on multilayer ANNs called "deep neural networks," performs feature extraction and classification within a single deep architecture, in contrast to classical machine learning techniques.
Researchers have developed various deep architectures, such as Deep Autoencoders, Recurrent Neural Networks (RNN), and Convolutional Neural Networks (CNN), to perform better on real applications with large amounts of data. The Convolutional Neural Network [25] is well suited for automatic feature extraction, pattern recognition, and image classification. In 1980, Fukushima et al. [9] proposed a neural network architecture named Neocognitron for handwritten character recognition and pattern recognition, which is similar to a Convolutional Neural Network. The first CNN architecture trained using backpropagation was proposed by LeCun et al. [16]. LeCun et al. [17] later introduced the LeNet-5 CNN architecture, which outperformed other models on the MNIST dataset for handwritten digit recognition.

Deep learning has been a successful research domain in recent years, especially for image classification [23]. Researchers have proposed numerous machine learning and deep learning models [3, 8] to classify images. CNNs work well on images because the convolution process extracts features automatically by exploiting the spatial correlation among neighbouring pixels. They have proved to be better than other machine learning models in many real-world applications, such as handwritten digit recognition, ImageNet classification, face recognition, and facial expression recognition. Even though there are many state-of-the-art CNN models for real-world applications, medical image classification remains a challenging problem [2]. Unlike other image classification problems, medical image classification must identify ROI tissue pixels that differ very little from the normal tissue pixels. Making the machine learn such subtle differences is hard with the existing deep learning architectures. Hence, CNN models should be improved with a novel mechanism that enhances the input images so that the model can more easily distinguish the affected tissue pixels from the normal ones.

In this research, a new lightweight CNN model with sub-sampling and dropouts has been proposed for feature extraction and classification. The main contributions of the proposed CNN architecture are

  a. to replace the first convolution layer with a novel channelization strategy in order to assist the CNN model in extracting the finer features.

  b. to retain the significant features of the input images through a novel channelization layer for effective classification, irrespective of whether the input data is a three-channel (RGB) colour image or a single-channel grayscale image.

  c. to propose a new lightweight CNN model with sub-sampling and dropouts for better feature extraction and classification of histopathological medical images.

This paper is organised as follows: Section 2 discusses previous research related to this work. Section 3 explains the framework of the proposed algorithm. Section 4 presents the results of the proposed method on four experimental datasets, namely APTOS 2019, COVID-19, COVID-19 (2021), and Cancer. Section 5 compares the performance of the proposed architecture with other state-of-the-art architectures, and Section 6 concludes the paper.

2 Literature review

In recent years, CNN-based architectures have been widely preferred for classifying images. LeCun et al. [17] proposed the first convolution-based architecture for character and number recognition on handwritten text images, with an input of size 32x32x1 and two convolutional, two sub-sampling, and two fully connected layers. Krizhevsky et al. [15] proposed AlexNet, one of the state-of-the-art architectures, which won the ImageNet challenge in 2012. It has 5 convolutional layers, 3 pooling layers, and 2 dense layers, with an input image of dimension 224x224x3. Simonyan and Zisserman proposed VGG16 [28] with 16 layers and kernels of size 3x3 for all convolutions and 2x2 for all sub-sampling. Lin et al. [20] formed a Network in Network architecture in which 1x1 convolution is introduced for dimensionality reduction. Szegedy et al. [29] developed a deep CNN architecture with inception blocks, which won the ImageNet Classification Challenge in 2014. He et al. [12] created ResNet, which won the ImageNet Visual Recognition Challenge in 2015. Alizadeh et al. [4] developed a CNN architecture for recognizing facial expressions from facial images of dimension 48x48x1. An 8-layer CNN model was proposed by Dachapally et al. [6] for recognizing facial expressions from images normalized and resized to 48x48x1.

Many CNN models were proposed for classifying Histopathological Medical Images. Saha et al., [27] have proposed a Deep Learning model for detecting mitosis from Histopathological Breast images. Han et al., [11] have proposed a model for cancer classification from Histopathological Breast images. Zheng et al., [32] have introduced a new CNN architecture for Breast Tumor Classification from Histopathological Breast images. Rachapudi et al., [24] have proposed a lightweight CNN Model with 16 Convolutional, 5 Pooling, and a Softmax layer for cancer tissue classification from Histopathological images. Qing Li et al., [18] have proposed a minimal CNN architecture with just 1 Convolutional, 1 Pooling, and 3 Dense layers for medical image classification on Computed Tomography Lung Images. Li et al., [19] have proposed a new CNN architecture for Pulmonary Nodule classification on Computed Tomography Lung Images.

Many researchers have identified that the performance of Convolutional Neural Networks can be improved by supplying the model with more effective input features. Vulli et al. [30] proposed a model that fine-tunes DenseNet-169 using batch normalization and weight optimization strategies for more precise classification of histopathological images. The authors incorporated the 1-cycle policy and FastAI to help the model converge faster towards the solution. Dash et al. [7] proposed a technique that employs a fast guided filter and a matched filter to attain improved performance measures for vessel extraction.

Abdillah et al. [1] applied a filtering process to input images before feeding them into the CNN model and showed that the Canny filter works well with a CNN for vehicle classification problems. Condurache et al. [5] proposed a Hysteresis Thresholding algorithm to segment vessels in angiogram images. Mughal et al. [22] proposed an algorithm that combines curve stitching and Adaptive Hysteresis Thresholding for segmenting mammogram images. Zhu et al. [33] proposed an algorithm for mapping the gray levels to stretch the contrast of grayscale images. Unlike other image datasets such as ImageNet or digit recognition datasets, the images of different classes in medical image datasets appear similar, with very few distinct features. Even though many deep CNN models are available for image classification, only a few of them perform well on medical image classification. The model developed by Rachapudi et al. [24] with 16 convolution layers, claimed to be a lightweight CNN, performs better on histopathological medical images than other architectures.

Some models preprocess the images using various image enhancement techniques before feeding them into the CNN. Even so, extracting features from preprocessed grayscale or colour (RGB) images seems to be hard for a CNN with simple convolution layers. The performance of each convolution layer except the first depends on the features extracted by the previous convolution layer, so every later layer ultimately builds on the feature maps generated by the first convolution layer of the model. The feature maps extracted by the first layer of the network are therefore very significant for efficient classification. Hence, in this research, a new layer, namely the Channelizing Layer, has been introduced as the first layer of the model; it uses the Adaptive Hysteresis Thresholding technique to produce n feature maps with significant features from the input images.

3 Proposed work

This paper proposes a new channelization technique, named the Channelizing layer, to preprocess the input images efficiently. Initially, the input grayscale image is contrast stretched using histogram equalization [34], and Adaptive Hysteresis Thresholding [22] is then applied to it to obtain N-2 enhanced images. The input grayscale image, the contrast stretched image, and the N-2 feature maps extracted by Adaptive Hysteresis Thresholding are merged along the third dimension as N channels. The channelled data is then fed into the subsequent convolution layers of the CNN architecture. Classification is done by simple dense layers with a softmax layer at the end.

3.1 Histogram equalization

A histogram is a graphical description of the distribution of an image's pixel intensity values [10]. It uses a bar chart to represent the number of pixels at each gray level in the image. The visual quality of an image can be enhanced by equalizing its histogram. Histogram equalization expands the quantization interval and increases the contrast, which is especially useful for identifying the Region of Interest (ROI) in medical images by separating it from the background.
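As a point of reference, conventional (global) histogram equalization can be sketched in a few lines of NumPy. This is the textbook mapping through the cumulative distribution function for an 8-bit grayscale image, offered only as an illustration; the function name `equalize_histogram` is an assumption and is not taken from the implementation used in this work.

```python
import numpy as np

def equalize_histogram(gray):
    """Conventional histogram equalization of an 8-bit grayscale image."""
    gray = np.asarray(gray, dtype=np.uint8)
    hist = np.bincount(gray.ravel(), minlength=256)  # pixels per gray level
    cdf = hist.cumsum()                              # cumulative pixel count
    cdf_min = cdf[cdf > 0][0]                        # count at darkest level present
    total = gray.size
    if total == cdf_min:
        return gray.copy()                           # constant image: nothing to stretch
    # Map each gray level so that the cumulative distribution becomes linear.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]

image = np.array([[50, 50], [100, 150]], dtype=np.uint8)
equalized = equalize_histogram(image)
```

In this small example the darkest level present is mapped to 0 and the brightest to 255, so the narrow input range is stretched over the full 8-bit scale.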

Traditional histogram equalisation has drawbacks, such as the loss of important details and excessive brightness in local parts of the enhanced image. To address them, Zhu et al. [33] developed an adaptive histogram equalisation technique. The algorithm determines the type of image (low, middle, or high gray level) in terms of brightness or gray intensity levels and assigns the adaptive parameter β a value (0.8, 1.1, or 1.5) accordingly to map the gray levels.

The histograms of medical image datasets such as lung CT scans and retinal images vary considerably from one another in terms of the intensity levels and contrast of the pixels. Adaptive Histogram Equalization is therefore the better approach to identify the ROI in medical images, as shown in Fig. 1. Thus, in the proposed technique, Adaptive Histogram Equalization is applied for contrast stretching.

Fig. 1 Comparison of histograms of images: (a) Original Grayscale image (b) Conventional Histogram Equalized image (c) Adaptive Histogram Equalized image

Equalizing Histogram makes it easy to generalize the way of selecting the threshold values for the Adaptive Hysteresis Thresholding technique. The Adaptive Histogram Equalization technique is better than the Conventional Histogram Equalization technique [34] to identify the ROI from the medical images while channelizing using the Adaptive Hysteresis Thresholding technique as shown in Fig. 2.

Fig. 2 Comparison of images (d), (e), (f) after applying the Hysteresis Thresholding technique on (a) Original Grayscale image (b) Conventional Histogram Equalized image (c) Adaptive Histogram Equalized image

3.2 Adaptive hysteresis thresholding

After Histogram Equalization, the image is enhanced and channelized with the Adaptive Hysteresis Thresholding (AHT) technique. AHT has been used along with curve stitching for segmenting the ROI and for enhancing the image [22]. In the proposed work, AHT is used to retain the significant details of the input images and to distinguish the abnormal tissues or regions from the background or normal tissues in the medical images, which makes it easier for the Convolutional Neural Network to extract the features.

Usually, multi-thresholding maps the intensity value of each pixel in the input image to one of a list of more than two gray values. If n thresholds (T = {t0, t1, t2, …, tn-1}) are used for thresholding an image, the resultant image has pixels with n+1 different intensity values.
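The relation between n thresholds and n+1 output levels can be illustrated with a short sketch. The function name `multi_threshold` and the sample values are illustrative assumptions; each pixel is simply mapped to the index of the interval it falls into.

```python
from bisect import bisect_right

def multi_threshold(pixels, thresholds):
    """Map each intensity to the index of the interval between thresholds."""
    cuts = sorted(thresholds)
    # bisect_right counts how many thresholds are <= the pixel value,
    # so n thresholds give level indices 0..n (n + 1 levels in total).
    return [bisect_right(cuts, p) for p in pixels]

# Three thresholds split the intensity axis into four levels.
levels = multi_threshold([0.05, 0.30, 0.55, 0.90], [0.25, 0.50, 0.75])
```

With three thresholds, the four sample pixels land in four distinct levels, one per interval.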

In AHT, though two threshold values (ƟL and ƟH) are used, the intensity values of the pixels are mapped to only two gray levels as in Eqs. (1) and (2).

$${\text{g}}_{\text{i},\text{j}}=\left\{\begin{array}{ll}0&\text{if}\;{\text{f}}_{\text{i},\text{j}}\leq{\mathrm\theta}_\text{L}\;\\1&\text{if}\;{\text{f}}_{\text{i},\text{j}}\geq{\mathrm\theta}_\text{H}\\{\text{C}}_{\text{i},\text{j}}&\text{Otherwise}\end{array}\right.$$
(1)
$${\textrm{C}}_{\textrm{i},\textrm{j}}=\left\{\begin{array}{ll}1& \textrm{if}\ {\textrm{f}}_{\textrm{i},\textrm{j}}\kern0.5em \textrm{is connected}\ \textrm{to}\ 1\ \textrm{in}\ \textrm{g}\\ {}0& \textrm{Otherwise}\end{array}\right.$$
(2)

where fi,j is the intensity value of the pixel at position (i, j) in the input image and gi,j is the mapped gray level. Ci,j is set to 1 if the pixel is connected to a pixel that is already 1 in g, that is, to a pixel whose intensity is at least ƟH.
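A minimal sketch of Eqs. (1) and (2) in plain Python is given below. It assumes 8-connectivity and lets in-between pixels join a strong pixel through a chain of other in-between pixels, which is the usual reading of hysteresis thresholding; neither choice is stated explicitly in the equations, and the function name `hysteresis_threshold` is illustrative.

```python
from collections import deque

def hysteresis_threshold(image, theta_low, theta_high):
    """Binary map per Eqs. (1)-(2) for a 2-D list of intensities.

    Pixels >= theta_high become 1, pixels <= theta_low become 0, and
    in-between pixels become 1 only if they are connected to a 1-pixel.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with the strong pixels (f >= theta_high).
    for i in range(rows):
        for j in range(cols):
            if image[i][j] >= theta_high:
                out[i][j] = 1
                queue.append((i, j))
    # Grow the 1-regions into in-between pixels (assumed 8-connectivity).
    while queue:
        i, j = queue.popleft()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if (0 <= ni < rows and 0 <= nj < cols
                        and out[ni][nj] == 0
                        and image[ni][nj] > theta_low):
                    out[ni][nj] = 1
                    queue.append((ni, nj))
    return out

sample = [[0.9, 0.5, 0.1, 0.5],
          [0.1, 0.1, 0.1, 0.1]]
mask = hysteresis_threshold(sample, theta_low=0.3, theta_high=0.8)
```

In the sample, the 0.5 pixel next to the strong 0.9 pixel is kept, while the isolated 0.5 pixel at the end of the row is discarded, even though both lie between the two thresholds.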

3.3 Channelization

Usually, three-channel (RGB) colour images or one-channel grayscale or black-and-white images are fed into the convolutional neural network. During convolution, the input is converted into as many feature maps as there are filters in the given convolution layer. The obtained feature maps are then treated as an input image whose number of channels equals the number of feature maps. Subsampling and dropout layers are added in between as needed; they do not affect the number of channels. The novelty of the proposed architecture is that the convolutional neural network is fed with preprocessed images with N channels instead of just three-channel or single-channel images. Various versions of the input image are merged to make the channelized image, which retains the significant features of the input images with minimal loss.

In the proposed algorithm, the Adaptive Hysteresis Thresholding technique is used for channelizing the input image, which allows the architecture to learn features effectively and produce better results. The first convolution layer in the CNN model is replaced by the proposed channelizing layer to retain the significant features of the images with minimal loss. To get N-channelled features from the input image, the grayscale input image is used as the first channel to preserve the original image, and it is then enhanced using Adaptive Histogram Equalization and merged as the second channel. Adaptive Hysteresis Thresholding is then applied to the contrast stretched image with m low threshold values (ƟL = {ƟL0, ƟL1, …, ƟLm-1}) and n high threshold values (ƟH = {ƟH0, ƟH1, …, ƟHn-1}). Each low threshold is paired with every high threshold, which results in m × n = N-2 pairs of low and high threshold values, since 2 channels are already prepared. With these N-2 pairs of threshold values, Adaptive Hysteresis Thresholding is applied to the adaptive histogram equalized grayscale image (the second channel) to get N-2 feature maps from an input image. The generated N-2 thresholded images are then merged with the first two channels to get N channels.

$$\mathrm\theta_{L}=\left\{\mathrm\theta_{L0},\,\mathrm\theta_{L1},\dots,\mathrm\theta_{Lm-1}\right\}$$
(3)
$$\mathrm\theta_{H}=\left\{\mathrm\theta_{H0},\;\mathrm\theta_{H1},\dots,\mathrm\theta_{Hn-1}\right\}$$
(4)
$$\mathrm T=\left\{\left\{\mathrm{\theta}_{L0},\,\mathrm{\theta}_{H0}\right\},\left\{\mathrm{\theta}_{L0},\,\mathrm{\theta}_{H1}\right\},\dots,\left\{\mathrm{\theta}_{L0},\,\mathrm{\theta}_{Hn-1}\right\},\left\{\mathrm{\theta}_{L1},\,\mathrm{\theta}_{H0}\right\},\dots,\left\{\mathrm{\theta}_{Lm-1},\,\mathrm{\theta}_{Hn-1}\right\}\right\}$$
(5)

ƟL in Eq. (3) represents the set of low threshold values, and m is the number of low threshold values in the set. ƟH in Eq. (4) represents the set of high threshold values, and n is the number of high threshold values in the set. T in Eq. (5) represents the set of m × n = N-2 pairs of ƟL and ƟH values.
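The pairing of every low threshold with every high threshold is a Cartesian product, which can be sketched as follows. The function name `threshold_pairs` and the sample threshold values are illustrative assumptions.

```python
from itertools import product

def threshold_pairs(low_thresholds, high_thresholds):
    """Pair each low threshold with every high threshold (the set T)."""
    return list(product(low_thresholds, high_thresholds))

# m = 2 low thresholds and n = 3 high thresholds give m * n = 6 pairs.
pairs = threshold_pairs([0.35, 0.36], [0.53, 0.59, 0.65])
# The grayscale and equalized channels add two more channels.
n_channels = len(pairs) + 2
```

Here m × n = 6 thresholded channels plus the two base channels give N = 8.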

Figures 3 and 4 show the comparison of channelled feature maps using Adaptive Hysteresis Thresholding after the image is contrast stretched using Conventional Histogram Equalization and Adaptive Histogram Equalization.

Fig. 3 Comparison of applying Channelization after Normal Histogram Equalization and after Adaptive Histogram Equalization on CT-scanned lung images of the COVID-19 dataset

Fig. 4 Comparison of applying Channelization after Normal Histogram Equalization and after Adaptive Histogram Equalization on retinal images of the APTOS dataset

Figure 5 shows the various channels of the image created using the Adaptive Histogram Equalization and Adaptive Hysteresis Thresholding techniques. Channel 1 is the original grayscale image, Channel 2 is the Adaptive Histogram Equalized image, and the remaining (N-2) channels are the images created by applying Adaptive Hysteresis Thresholding to the Adaptive Histogram Equalized image with the m × n calculated threshold combinations (T).

Fig. 5 Channelization

Figure 6 shows the channelized feature maps prepared by merging the N channels using the proposed channelizing layer. The channelized image is then fed into the convolutional blocks and other layers to train the model for classification.

Fig. 6 Channelized Image
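Merging the channels along the third dimension can be sketched with NumPy as below. The function `channelize` and its `thresholder` argument are hypothetical names; the demo passes a simplified stand-in thresholder that ignores connectivity, purely to show how the H x W x N array is assembled.

```python
import numpy as np

def channelize(gray, equalized, pairs, thresholder):
    """Stack the grayscale image, the equalized image and one thresholded
    map per (low, high) pair into an H x W x N array."""
    channels = [np.asarray(gray, dtype=np.float32),
                np.asarray(equalized, dtype=np.float32)]
    for theta_low, theta_high in pairs:
        mask = thresholder(equalized, theta_low, theta_high)
        channels.append(np.asarray(mask, dtype=np.float32))
    return np.stack(channels, axis=-1)

def demo_thresholder(img, theta_low, theta_high):
    # Stand-in for the demo only: keeps pixels at or above the high
    # threshold and does not model the connectivity rule of hysteresis.
    return np.asarray(img) >= theta_high

gray = np.random.default_rng(0).random((4, 5))
stacked = channelize(gray, gray, [(0.35, 0.53), (0.35, 0.65)], demo_thresholder)
```

Two threshold pairs plus the two base channels give an array of shape (4, 5, 4); with 30 pairs the same call would produce 32 channels.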

The following algorithm represents the steps to estimate m Low Thresholds and n High Thresholds for getting (N-2) number of channels using Adaptive Hysteresis Thresholding technique.

Algorithm: To calculate Low and High Threshold values

  A) Calculate the range of Threshold values [ɑ,β] using Eq. (6)

  B) Divide the range into Quantiles {Q1, Q2, Q3, Q4, Q5, Q6}

  C) Assign the range for Low Thresholds {Q1} and High Thresholds {Q4, Q5, Q6}

  D) Calculate the Low Threshold Increment (LTinc) and High Threshold Increment (HTinc) using Eqs. (7) and (8)

  E) Calculate m Low Thresholds (ƟLi) and n High Thresholds (ƟHj) using Eqs. (9) and (10)

Initially, the range of thresholds [ɑ,β] is selected from one of two ranges, namely [0.05,0.35] and [0.35,0.65], using Eq. (6), depending on the characteristics of the dataset, so that the ROI is identified by the channelization process.

Estimating the low and high thresholds is the most significant part of Channelization. If the mean intensity value (μ) of the whole dataset is above 0.5, the ROI is expected to lie in the first third of the intensity levels, and hence the minimum (ɑ) and maximum (β) of the threshold values are set to 0.05 and 0.35. Otherwise, the minimum (ɑ) and maximum (β) of the threshold values are set to 0.35 and 0.65.

$$\left[\upalpha, \upbeta \right]=\left\{\begin{array}{cc}\left[\textrm{0.05,0.35}\right]& \textrm{if}\ \upmu >0.5\\ {}\left[0.35,0.65\right]& \textrm{Otherwise}\end{array}\right.$$
(6)

where μ is the mean, taken over all images in the dataset, of the per-image mean of the normalized intensity values.
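The selection rule of Eq. (6) can be sketched as follows. The helper names `dataset_mean_intensity` and `threshold_range` and the toy images are illustrative assumptions; each image is taken to be a 2-D list of intensities already normalized to [0, 1].

```python
def dataset_mean_intensity(images):
    """Mean of the per-image mean intensities (the quantity mu)."""
    per_image = []
    for img in images:
        values = [p for row in img for p in row]
        per_image.append(sum(values) / len(values))
    return sum(per_image) / len(per_image)

def threshold_range(mu):
    """Eq. (6): choose [alpha, beta] from the dataset mean intensity."""
    return (0.05, 0.35) if mu > 0.5 else (0.35, 0.65)

bright = [[[0.8, 0.9], [0.7, 0.6]], [[0.6, 0.6], [0.6, 0.6]]]
dark = [[[0.1, 0.2], [0.3, 0.2]]]
bright_range = threshold_range(dataset_mean_intensity(bright))
dark_range = threshold_range(dataset_mean_intensity(dark))
```

The bright toy dataset has μ above 0.5 and therefore receives the lower range, while the dark one receives [0.35, 0.65].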

Now the range of threshold values [ɑ,β] is divided into 6 parts, as shown in Fig. 7. The first quantile (Q1) is used for identifying the low threshold values, and the range covered by the final three quantiles (Q4, Q5, Q6) is used for identifying the high threshold values.

Fig. 7 Quantiles of Threshold Range

Before calculating the threshold values, minimum threshold values and threshold increment values need to be calculated. Let LTinc be the Low threshold increment value and HTinc be the High threshold increment value which can be calculated using the equations below,

$${LT}_{inc}=\frac{\text{Range assigned for Low Thresholds}}{\text{Number of Low Thresholds}\ (m)}$$
(7)
$${HT}_{inc}=\frac{\text{Range assigned for High Thresholds}}{\text{Number of High Thresholds}\ (n)}$$
(8)

Let LTmin be the lower bound of the first quantile, which is the smallest value for the Low Thresholds, and let HTmin be the lower bound of the fourth quantile, which is the base value for the High Thresholds. The m Low Thresholds (ƟLi) and n High Thresholds (ƟHj) can then be estimated using the equations below,

$$\mathrm{\theta}_{Li}={LT}_{min}+\left(i\times {LT}_{inc}\right),\quad i\in\left\{0,1,\dots,m-1\right\}$$
(9)
$$\mathrm{\theta}_{Hj}={HT}_{min}+\left(\left(j+1\right)\times {HT}_{inc}\right),\quad j\in\left\{0,1,\dots,n-1\right\}$$
(10)

Finally, N-2 number of combinations, obtained from m number of Low (ƟLi) and n number of High(ƟHj) thresholds are used for Adaptive Hysteresis Thresholding, to get N-2 number of feature maps in the Channelizing Layer.
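Steps B to E of the algorithm can be sketched in Python as follows. The sketch assumes that the six quantiles are equal-width parts of [ɑ,β] and that the "range assigned" in Eqs. (7) and (8) is the width of Q1 and of Q4 to Q6 respectively; the function name `compute_thresholds` is illustrative.

```python
def compute_thresholds(alpha, beta, m, n):
    """Low and high thresholds following Eqs. (7)-(10)."""
    q = (beta - alpha) / 6.0   # width of one part (assumed equal-width)
    lt_min = alpha             # lower bound of Q1
    ht_min = alpha + 3 * q     # lower bound of Q4
    lt_inc = q / m             # Eq. (7): width of Q1 divided by m
    ht_inc = (3 * q) / n       # Eq. (8): width of Q4..Q6 divided by n
    lows = [lt_min + i * lt_inc for i in range(m)]         # Eq. (9)
    highs = [ht_min + (j + 1) * ht_inc for j in range(n)]  # Eq. (10)
    return lows, highs

# Range [0.35, 0.65] with m = 6 low and n = 5 high thresholds.
lows, highs = compute_thresholds(0.35, 0.65, m=6, n=5)
pairs = [(lo, hi) for lo in lows for hi in highs]
```

Under these assumptions the low thresholds start at ɑ and stay inside Q1, the high thresholds end at β, and the 6 × 5 combinations give 30 threshold pairs.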

3.4 Proposed CNN Architecture

The detailed architecture of the proposed model is shown in Fig. 8. Input images of various sizes are fed to the preprocessing phase of the model, in which the color images are converted into grayscale images, resized, and contrast stretched using Adaptive Histogram Equalization.

Fig. 8 Architecture of the Proposed Model

The feature extraction phase of the model starts with the Channelization Layer in which the preprocessed images are applied with the proposed channelization technique to obtain N-Channelled feature maps. The channelled feature maps are then fed to the convolution blocks, pooling layers, and dropout layers, as shown in Fig. 8. Convolution blocks are composed of single or multiple convolution layers and are arranged sequentially. Filters of size 3 × 3, with padding and ReLU activation function are used for every convolution step in the model.

As a whole, the proposed architecture consists of one Channelizing layer, 15 convolution layers, 4 pooling layers, and 4 dropout layers in the feature extraction phase. In the architecture, 5 Convolution Blocks are utilised. The first 4 Convolution Blocks are followed by one maxpooling and a dropout layer.

Channeled feature maps in the Channelizing layer are fed to the first Convolution Block which has 3 convolution layers with 16 filters. The output of the first Convolution Block is then fed to one Maxpool Layer for subsampling and to a Dropout Layer as shown in Fig. 9.

Fig. 9 Convolution Block-1

Figure 10 shows the structure of Convolution Block-2. The output of the first Dropout is then fed to the second Convolution Block which has four convolution layers using 32 filters, followed by one Maxpool and a Dropout layer.

Fig. 10 Convolution Block-2

The next two Convolution blocks have 4 convolution layers using 64 filters, and 3 Convolution Layers using 128 filters respectively as shown in Figs. 11 and 12.

Fig. 11 Convolution Block-3

Fig. 12 Convolution Block-4

Figure 13 shows the final Convolution Block, which has only one convolution layer with 256 filters and is the final part of the Feature Extraction phase. The features extracted by the final convolution layer are then flattened and passed to the Classification phase, which has a Fully Connected Layer with 32 neurons and a Softmax layer with 2 neurons.

Fig. 13 Convolution Block-5, Dense Layer and Softmax Layer
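The block structure described above can be checked with a small bookkeeping sketch. The block layout (3, 4, 4, 3 and 1 convolution layers with 16, 32, 64, 128 and 256 filters) comes from the text; the 64 x 64 input size, the 2 x 2 pooling with stride 2, size-preserving padding and the presence of bias terms are assumptions made only for illustration.

```python
# (number of 3x3 convolution layers, filters) for Convolution Blocks 1-5
BLOCKS = [(3, 16), (4, 32), (4, 64), (3, 128), (1, 256)]

def conv_params(c_in, c_out, k=3):
    """Weights plus biases of one k x k convolution layer."""
    return k * k * c_in * c_out + c_out

def summarize(n_channels=32, height=64, width=64):
    """Count convolution layers and parameters and track the output shape."""
    total_layers, total_params, c_in = 0, 0, n_channels
    for idx, (n_layers, filters) in enumerate(BLOCKS):
        for _ in range(n_layers):
            total_params += conv_params(c_in, filters)
            c_in = filters
            total_layers += 1
        if idx < 4:
            # The first four blocks are followed by max-pooling
            # (assumed 2x2, stride 2) and a dropout layer.
            height, width = height // 2, width // 2
    return total_layers, total_params, (height, width, c_in)

layers, params, out_shape = summarize()
```

The layer count reproduces the 15 convolution layers stated for the feature extraction phase, and four poolings reduce the assumed 64 x 64 input to 4 x 4 with 256 feature maps before flattening.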

For medical images, the pixels of the affected tissues are either brighter or darker than the pixels of the normal tissues. Among the experimental datasets, the APTOS 2019 dataset has the opposite intensity characteristics to the others: its middle-intensity images have a darker ROI, whereas the other datasets contain low- or middle-intensity images with a brighter ROI.

As the medical image datasets have different intensity ranges, contrast stretching is done by applying Adaptive Histogram Equalization. Adaptive Histogram Equalization widens the histogram of the images regardless of their intensity type, which makes it easier to fix the threshold values for Adaptive Hysteresis Thresholding. Adaptive Hysteresis Thresholding has proved to be good at distinguishing the ROI from normal tissues by identifying vessels in medical images [31–33]. In the proposed method, the variously sized colour images are first resized and converted into grayscale images. The aligned grayscale images are then contrast stretched using Adaptive Histogram Equalization, and the lists of low and high threshold values are calculated for Adaptive Hysteresis Thresholding. Channelization is done by applying AHT with the (N-2) different combinations of low and high threshold values, producing different versions of the grayscale image, which are then appended as channels to the aligned grayscale image and its histogram equalized version. These channelized feature maps of size HxWxN are then fed into the proposed convolutional blocks and sub-sampling layers for feature extraction and are finally classified using the softmax layer.

4 Experiments and results

4.1 Environment

All experiments were implemented in Python using Anaconda 3.7 in Jupyter Notebook. The Keras package is used for creating the CNN model, and the scikit-image and PIL packages are used to implement histogram equalization and thresholding.

4.2 Datasets

The experiments are evaluated on four medical image datasets, namely APTOS 2019, COVID-19, COVID-19 (2021), and Cancer. The APTOS 2019 dataset (https://www.kaggle.com/datasets), downloaded from Kaggle, consists of retinal images of people affected by diabetic retinopathy and of normal people. The COVID-19 dataset (https://data.world/datasets/), downloaded from data.world, consists of lung images of COVID-19-affected, other pneumonia-affected, and normal people. The COVID-19 (2021) dataset (https://www.kaggle.com/datasets), which was published in 2021 and downloaded from Kaggle, consists of lung images of COVID-19-affected, other pneumonia-affected, and normal people. The Cancer dataset (https://www.kaggle.com/datasets), downloaded from Kaggle, consists of CT scan images of lungs affected by three different cancers and of normal lungs. As the proposed model is a binary classifier, the multiclass datasets are converted to binary datasets: the images were handpicked and resized, and each dataset is reduced to only 2 classes, namely Affected and Not Affected. Table 1 shows the properties of the four datasets used for evaluating the performance of the proposed model.

Table 1 Properties of experimental datasets
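The conversion of multiclass labels to the two classes used here can be sketched as below. The output names Affected and Not Affected follow the text; the function `to_binary_label` and the input class strings are hypothetical and may differ from the label names in the actual datasets.

```python
def to_binary_label(class_name, normal_classes=("normal",)):
    """Return 'Not Affected' for normal classes and 'Affected' otherwise."""
    is_normal = class_name.strip().lower() in normal_classes
    return "Not Affected" if is_normal else "Affected"

# Hypothetical multiclass labels collapsed into the two target classes.
labels = [to_binary_label(c) for c in ["Normal", "COVID-19", "Pneumonia"]]
```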

4.3 Parameters

The mean intensity (μ) of the images is greater than 0.5 for the APTOS dataset and below 0.5 for the other datasets, which consist of CT-scanned lung images. Accordingly, as per Eq. (6), the range [0.05, 0.35] is used as the threshold range for channelizing the APTOS dataset with Adaptive Hysteresis Thresholding, and [0.35, 0.65] is used for the remaining three datasets. To get 32 channels, m is fixed as 6 and n as 5, so that six low threshold values and five high threshold values yield 30 combinations of threshold pairs (T) and hence 30 thresholded channels.

4.4 Results

To evaluate the proposed model, the accuracy, precision, sensitivity, specificity, and F-measure are estimated. Classification accuracy is the number of correctly classified samples divided by the total number of samples. Accuracy is a reliable indicator only when the false positive and false negative rates are almost equal. Precision measures how many of the samples predicted as a certain class actually belong to that class. Sensitivity (recall) measures how many of the positive samples in the dataset are correctly predicted as positive, and specificity measures how many of the negative samples are correctly predicted as negative. The F-measure (F1-score) is the harmonic mean of the precision and recall values. The experimental results of the proposed model in terms of Accuracy, Precision, Sensitivity, Specificity, and F-Measure on the four benchmark datasets, namely COVID-19, Cancer, COVID-19 (2021), and APTOS 2019, are tabulated in Table 2.
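These measures follow directly from the counts of a binary confusion matrix, as the sketch below shows. The function name `classification_metrics` and the example counts are illustrative only and are not results from this work.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity, specificity and F1 from counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }

# Made-up counts: 40 true positives, 10 false positives,
# 45 true negatives and 5 false negatives.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

For these counts the accuracy is 0.85 and the precision is 0.80, while sensitivity and specificity differ from each other because the two error types are not balanced.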

Table 2 Performance measures of the proposed model on experimental datasets

The proposed model is trained for 200 epochs for all the experimental datasets considered in this research, and the estimated accuracy and loss of the proposed model on training data are plotted as a graph, which is shown in Fig. 14.

Fig. 14 Performance of the Proposed model on training data (a) Accuracy (b) Loss

Figure 14 demonstrates the continual reduction in loss and increase in accuracy as the number of epochs increases.

5 Comparative analysis

The performance of the proposed classification model is compared with existing deep neural network models for medical image classification and with a few state-of-the-art convolutional neural network architectures. The architectures considered for comparison are AlexNet [15], VGG16 [28], the model of Abdillah et al. [1], in which the images are preprocessed with the Canny filter, and the model of Rachapudi et al. [24], which performs very well in classifying medical images. Table 3 compares the accuracy of the proposed model with that of these four architectures, namely AlexNet, VGG16, Abdillah's Model, and Rachapudi's Model (2020). For easier comparison, the highest accuracy obtained for each dataset is shown in boldface.

Table 3 Accuracy comparison of the Proposed model with other existing models

Table 3 shows that the proposed model outperforms the other existing models on all four medical image datasets, with an accuracy of 92% on the APTOS 2019 dataset, 95.65% on the COVID-19 dataset, 83.20% on the Cancer dataset, and 85.60% on the COVID-19 (2021) dataset. Figure 15 compares the accuracy of the proposed model with that of the other existing models.

Fig. 15 Evaluating the Proposed model in terms of Accuracy

Figure 15 shows that the proposed model exceeds the high-performing Rachapudi's Model (2020) in accuracy by 4.28 percent, 2 percent, 2.40 percent, and 1.60 percent on the COVID-19, Cancer, APTOS 2019, and COVID-19 (2021) datasets, respectively. In summary, the experimental results consistently indicate that the proposed model can classify histopathological medical images efficiently.

6 Conclusion

As the correctness of image classification depends on the features considered for training the CNN, this research introduces a new layer, namely the Channelizing Layer, for extracting features without losing significant details of the image data. The proposed Channelizing layer uses the Adaptive Hysteresis Thresholding technique to channelize the image data into feature maps. The architecture of the proposed CNN was selected as the best among various trials with different combinations of input image size, filter size, dropout, convolution, pooling, and dense layers. The proposed model obtains better classification accuracy because the convolution blocks extract features from the feature maps generated by the Channelizing layer, which retains the significant details of the medical images. The proposed model achieves higher accuracy than the compared models, with values of 92%, 95.65%, 85.60%, and 83.20% on the four medical image datasets, namely APTOS 2019, COVID-19, COVID-19 (2021), and Cancer, respectively. The experimental results demonstrate that including the proposed channelizing technique as the first layer of the CNN architecture is a very promising way to classify histopathological medical images efficiently.

As the proposed model still requires additional computation to generate the N-channelled feature maps in the channelization layer, future work can introduce a better filtering technique and dropout mechanism in further layers to reduce the dimension of the input features and the network size, respectively, and thereby make the CNN model more lightweight. Moreover, an efficient filtering technique can be employed to preserve the significant micro features for effective histopathological image classification.