1 Introduction

The outbreak of the novel coronavirus 2019 (COVID-19) has caused a global panic and has emerged as a fatal disease worldwide. The worldwide health crisis has had a profoundly negative effect on people's physical and mental health. COVID-19 impairs the human respiratory system and causes lung lesions that adversely impact the lungs' ability to function normally. People infected with the virus typically experience fever, coughing, muscular weakness, and respiratory discomfort in addition to variable levels of stress and anxiety. This disease is caused by Severe Acute Respiratory Syndrome CoronaVirus-2 (SARS-CoV-2) which is structurally similar to SARS-COV [1, 2]. This virus was identified as COVID-19 by the World Health Organization (WHO) on February 11, 2020 [3, 4]. According to WHO, to date (31st December 2023), the worldwide statistics show that this pandemic has caused a total of 773,819,856 confirmed cases of COVID-19, including 7,010,568 deaths with a rate of 0.906% [5]. The reverse transcription polymerase chain reaction (RT-PCR) is frequently used to diagnose COVID-19. However, its limitations, which include a low sensitivity of up to 71% and a time-consuming method that is also constrained by limited test kits during the outbreak, limit its capacity to detect and confirm positive cases quickly [6]. X-ray imaging and computer tomography can offer feasible solutions for rapid viral detection screening but it is constrained by clinical equipment limited availability [7]. This raises the need to provide the support of computer technology to the medical specialists, to achieve the aim of sensible utilization of hospital resources.

Keeping in view the contribution of deep learning in the analysis of medical imagery, the research work aiming toward the identification of COVID-19 has seen an exponential rise. A study conducted by the Radiology Department at Wuhan concluded that accurate distinctions can be made between pneumonia, COVID-19, and other lung diseases using a deep learning model [8]. According to a study report, it is indicated that chest X-rays (CXR) started displaying the signs of this virus infection in 4 days [9]. In addition to that, based on the results of another study, it is concluded that the computed tomography (CT) scans of COVID-19-infected patients, possessed a peculiar pattern: multifocal patchy consolidation, ground-glass opacities and/or interstitial changes with a predominantly peripheral distribution [10, 11]. Furthermore, some studies have also discovered changes in the images of CT scans and chest X-rays before the commencement of the disease’s symptoms.

Broadly, image classification tasks can be divided into supervised and unsupervised learning. However, with the studies presented under Sect. 2, it is observed that for the classification of medical imagery, primarily supervised learning is followed. Consequently, while employing data labeling for the classifier, the major involvement of neural networks can be noticed and due to this involvement of imagery, most of the deep learning models proposed by various authors include the implementation of convolutional neural network (CNN) [12]. This deep learning technique also assists in identifying unique features embedded in the radio-graphic images. Figure 1 shows a week's record of the chest X-rays of a COVID-19-positive patient who also had pneumonia and was aged 50 years [13].

Fig. 1
figure 1

Evolution of COVID-19 pneumonia as seen over 1 week

The focus of this research is to build a layered CNN model from scratch, exploit the visual features involved in the images of chest X-ray, and present an image diagnostic deep learning model for the (i) binary classification (COVID vs. normal) and (ii) 3-class classification (COVID vs. normal vs. Pneumonia). The reason for undertaking CNN is to leverage the automated feature extraction offered in this model, which also presents promising results concerning accuracy. Moreover, CNNs are predominantly implemented for image data-related studies, which is divergent from the main purpose of artificial neural networks (ANNs) or recurrent neural networks (RNNs). Conventionally, ANNs are beneficial in tabular and text data, while RNNs are advantageous for time series and audio data.

With the circumstances of the pandemic, researchers are focused on achieving faster and more reliable mediums for detecting and identifying COVID-19 in an individual. This research study aims to achieve this objective by presenting a model that results in reasonable accuracy in detecting the target disease in less time. The organization of the remaining paper is as follows. In Sect. 2, a brief overview of the related works is presented. Section 3 presents the methodology followed in the proposed models and discusses their architectures. Subsequently, in Sect. 4, the experimental setup and results are presented and Sect. 5 discusses the observations gathered during experimentation along with scope for the future work. Finally, Sect. 6 concludes the research performed in this study.

The contributions of this study are highlighted as follows:

  1. 1.

    The detection of COVID-19-affected patients based on their chest X-rays is facilitated via the proposed binary classifier.

  2. 2.

    The distinction between COVID-19-affected and pneumonia-affected patients based on the radiographic scans supplied to the ternary classifier developed in this study.

  3. 3.

    The proposed models are more efficient than the existing methods.

2 Related Work

Machine learning techniques have hugely impacted the medical field in terms of their extensive applications and high-accuracy models in image analysis. These automated methods are regarded as one of the most efficient and accurate techniques [14,15,16,17,18]. Therefore, the outbreak of COVID-19 has escalated the speed and amount of work related to experiments conducted in comprehending this disease and presenting its analysis. In [19], a deep learning model named DarkCovidNet has been proposed for the detection of COVID-19 from raw chest X-rays. The authors exploit the use of the Darknet-19 classifier and implement the ‘You Only Look Once’ (YOLO), an object detection system on a real-time basis for presenting binary class classification and multiclass classification models. Their binary model recorded an accuracy of 98.08% while the multiclass model achieved an accuracy of 87.02%. The work [20] developed a CNN model following the procedure of transfer learning on a collection of COVID-19, common bacterial pneumonia, and normal case X-ray images and used a pre-trained network, Visual Geometry Group with 19 layers deep (VGG19) which yielded an accuracy of 96.78% for binary classification and 94.72% for ternary classification.

In [21], Karim et al. used a combination of two neural networks, VGG19 and DenseNet-161 to classify the input chest X-rays into one of the three categories that were specified by [20]. Furthermore, their model has also leveraged gradient-guided class activation maps (Grad-CAM + +) and layer-wise relevance propagation (LRP) to highlight the special features indigenous to each class. This resulted in a precision of 94.6%. Ghoshal et al. [22] presented a drop weights-based Bayesian Convolutional Neural Network (BCNN) combined with a Residual Network of 50 layers (ResNet50) to improve the performance of the diagnostics system running on COVID-19, bacterial pneumonia, viral pneumonia, and normal chest X-ray images. Their quaternary classifier of Bayesian ResNet50V2 achieved an accuracy of 89.82%. Narin et al. [23] presented three different pre-trained CNN models and evaluated them based on chest X-rays of coronavirus pneumonia-infected and normal patients. They concluded that ResNet50 achieved the highest accuracy of 98%; however, it could not address the main issue of differentiating between the pneumonia and COVID-19 cases. In [24], Hemdan et al. proposed a COVIDX-Net framework constituting seven pre-trained architectures of neural networks that used chest radiographs of the patients. They concluded that the VGG19 and DenseNet201 variants of the classifier achieved the best performance score of 90% for binary classification of COVID and normal cases. Another study conducted by Hall et al. presented that a small data set of chest X-rays was also useful in diagnosing this disease [25]. Based on 455 chest X-rays which comprised COVID-positive and pneumonic samples, their model achieved an overall accuracy of 89.2% using the pre-trained deep convolutional neural network, Resnet50.

The work by Wang et al. [26] presented a model for detecting COVID-19’s specific features observed in the CT scans of infected patients. Their implemented model showed a total accuracy of 73.1% on the testing data. Similarly, in [27], Xu et al. presented a case study that employed CT images for screening for COVID-19. An early screening, deep learning model was developed based on the collected data of COVID-19-positive, Influenza-A viral pneumonia, and normal cases and this presented an overall accuracy of 86.7%. Furthermore, their study presents the result of an experiment conducted on existing 2D and 3D machine learning models and leverages it for predicting COVID-19 cases in chest scans. Using the ResNet model combined with a location–attention mechanism, the accuracy achieved was 86.7%. In [28], Mishra et al. explored various approaches for COVID-19 detection based on chest CT scans and presented a model that fused the predictions from multiple models and provided a final decision on its basis. This fusion model achieved an accuracy of 88.34%, which was higher than those of individual models considered separately. Furthermore, Osman et al. [29] proposed a new technique that implemented locality-weighted learning and self-organization map (LWL–SOM) for the identification of coronavirus from chest X-rays. This methodology resulted in an accuracy of 97.88%. In [30], a review study conducted by Rahimi et al. provides a comparison of various machine learning and deep learning techniques that have been employed in radiological imaging to diagnose COVID-19. Similarly, another literature study by Low et al. [31] presents an overview of deep learning models that have been proposed in various studies that are aimed at distinguishing COVID-19 and pneumonia cases from chest radiography. Nayak et al. [32] introduced a comprehensive study on the performance evaluation of pre-trained CNN models, which concluded that the best accuracy of 98.33% was achieved by ResNet-34.

In [33], Akter et al. implemented twelve deep learning models to perform binary classification of X-ray image data. The proposed modified MobileNetV2 surpassed the other classifiers VGG16, ReseNet101, AlexNet, DenseNet, VGG19, InceptionV3, NFNet, GoogLeNet, MobileNetV2, EfficientNetB7, and ResNet50 achieving an accuracy of 98%. Attallah et al. [34] developed an ECG-BiCoNet, a pipeline framework for binary and multiclass classification of ECG images of patients with COVID-19 and other cardiac problems. It utilized five distinct deep learning networks ResNet, Inception, Xception, Inception-ResNet, and DenseNet, for retrieving high and low-level characteristics from different layers. These features were fused using discrete wavelet transform and prime features were selected utilizing the symmetrical uncertainty (SU) approach. The ensemble voting classifier gave an accuracy of 98.8% for binary and 91.73% for multiclass classification. Karim et al. [35] utilized the Naïve Bayes classifier along with the CNN-based feature extraction from AlexNet and Ant Lion Optimization method (ALO) for feature selection for detection of COVID from X-ray data. This method achieved an accuracy of 98.31% accuracy, an F1-score of 98.25%, and a precision of 100% for the binary classification of normal and COVID X-ray images. In Nasiri et al. [36], the XGBoost classifier was implemented for detecting COVID in the X-ray data. Transfer learning CNN (DenseNet169) was used for extracting features and the classifier achieved a 98.23% accuracy for two classes (COVID and normal) and 89.70% accuracy for three classes (COVID, normal, and pneumonia) classification. The investigation by Nahiduzzaman et al. [37] presented the ChestX-ray6 network and assessed it using chest X-rays from both healthy individuals and COVID patients. The pre-trained ChestX-ray6, a custom CNN network, resulted in an accuracy of 97.94% for binary classification on chest X-ray images. In Gaur et al. [38], three pre-trained deep learning networks (VGG-16, EffcientNetB0, and InceptionV3) developed along with a transfer learning approach for the detection of COVID, normal, and Pneumonia classes in X-ray images. EfficientNetB0 attained the highest accuracy score of 92.93%. Chow et al. [39] examined the efficacy of 18 popular CNN networks pre-trained on the ImageNet database, leveraging the transfer learning strategy for COVID and normal image detection in the X-ray data set. The VGG-16 succeeded in achieving the highest accuracy and precision among all with both at an overall 94.3% score.

This analysis has led to the conclusion that radiographic images are used extensively by medical practitioners for detecting this fatal disease. This summary of the pre-existing models discussed in the ‘Related Work’ section which have trained on different data sets of chest X-rays and CT scans, can be seen in Table 1. The prevailing limitation discussed in the presented studies is the comparison between pneumonia and COVID has not been performed in the models. However, with the proposed multi-class classifiers consisting of the pneumonic and COVID data sets, it can be observed that models trained on CT scans have resulted in a lower accuracy than those being trained on X-rays.

Table 1 Summary of pre-existing models and proposed model architectural accuracies

Hence, with the motivation of presenting a self-developed, layered neural network, in this work, an efficient deep learning model that is trained on a larger data set of COVID-19 positive, negative, and pneumonic cases is proposed and developed from scratch. In addition to that, its resulting accuracy has been improved as compared to that of the existing models that used pre-trained CNNs.

3 Proposed Methodology

The authors have proposed the architectures of two convolutional neural networks, which employ the chest X-ray images of the patient and categorize them based on the model selected: (1) binary classification: between COVID-19 and normal and (2) multiclass-classification or ternary classification, i.e., COVID-19 positive or pneumonia or normal. This prediction algorithm can be broadly divided into four principal steps: (1) data pre-processing; (2) setting up convolutional layers; (3) training the model; and (4) classification of the patient based on their lung X-ray images. The data set has been divided as 80% for training and 20% for testing. This bi-furcation has been achieved after implementing different combinations of data set division and selecting the one that resulted in providing high accuracy without overfitting or underfitting in the model. The general framework of the CNN architecture followed for the detection of COVID-19 can be seen in Fig. 2. Both architectures have been developed by the authors from scratch and their performance comparison with those of pre-trained networks is discussed in the ‘Results and Discussion’ section.

Fig. 2
figure 2

General framework of the CNN architecture to detect COVID-19

The reason behind developing two models for dealing with COVID-19 detection majorly concerns the requirements of the individual. Upon observing the similar symptoms of Pneumonia, the need to distinguish it from COVID would be requested and this would be catered by the ternary classifier. Furthermore, to achieve better accuracy, both binary and ternary classifiers have been compared and with the requirement of only COVID-19 detection, the requestor would have a choice to select a better model which would be more suited to their needs.

3.1 Data Set for COVID-19, Pneumonia, and Normal Chest X-ray Images

The data set of the COVID-19-infected chest X-ray was not sufficiently large for training the model, so the authors contrived a way to bring the COVID-19 data set on par with the abundantly available data of the pneumonia-infected and normal individuals. Therefore, an augmented data set has been taken from Mendeley [40] which consists of data collected from online available sources, such as Github [41] and Kaggle [42].

Consequently, the binary classifier has used a total of 1,824 images of the chest X-rays out of which, 912 images belonged to augmented COVID-19, and the remaining 912 pertained to non-augmented normal (non-COVID) images. Firstly, the image gets resized to 256 × 256 to reduce the computational cost. There is no specific rule to select the ratio of splitting the data into training and testing; however, it may depend on various factors, such as size of the data, model complexity, evaluation efficacy, domain-specific knowledge, etc. Iftikhar et al. [43, 44] discussed the significance of splitting data in different ratios in the performance evaluation of various models. Due to the size constraint of data available, the images have been taken and divided into an optimum ratio of 80:20 for building training and validation sets, respectively. Therefore, a total of 1,458 training set images and 366 testing set images have been used. Furthermore, the ternary-class classification model has leveraged a total of 2736 chest X-ray images which constitutes 912 images each from COVID-19 [40], pneumonia, and normal chest scans [42]. With the same division trend as followed in the previous model, 2187 images have been from the training set and 549 have been from the testing set. Both the proposed models have used their validation set as the testing set for analyzing the model’s performance hence in this paper, the terms, ‘testing set’ and ‘validation set’ have been used interchangeably. Figure 3 shows X-ray images obtained from the database where one image belonging to each category of the class is displayed as an example.

Fig. 3
figure 3

Chest X-rays from the data set used in training the proposed models. a COVID-19 lung X-ray. b Pneumonia lung X-ray. c Normal lung X-ray

3.2 Data Pre-processing

Once the training and testing data have been acquired, their pre-processing has been facilitated via the ImageDataGenerator class in the Keras library which assisted in performing the data augmentation technique. Therefore, it artificially increases the number of images by applying various transformations on attributes (excluding the label attribute) of each image. Both the models take input images of 256 × 256 pixels in their first convolution layer and use the batch size of 64. The reason for choosing this specific size of the images as the input to the proposed CNN model is that it presented a non-distorted, un-pixelated, and clear image for training. It is also observed that reducing the size of the images further than this hampered the feature extraction process, which led to a reduction in accuracy.

Data Augmentation of the Training Set and Testing Set: This step subdues the overfitting in the model hence preventing the large gap between the accuracies of the training and the testing data. With multiple experimental trials on different values of the tuning parameters, the following transformations have been performed on the training set, which resulted in achieving the best accuracy: (1) shearing has been done with 0.2 value; (2) zoom range has been set to 0.2; and (3) X-rays images have been horizontally flipped since vertical flip are often disregarded as chest X-rays and may not improve model’s training [45]. Furthermore, both the training and testing set have to undergo rescaling with a factor of \(0.1/255\). This represents the RGB (Red–Green–Blue) coefficients of the image in the range of 0–1 instead of 0–255 thereby, easing the processing of the model.

Data Encoding for Binary and Multi-class Models: As per the requirement of the model, different options are set in their class modes. The binary classification has the class mode set to ‘binary’, to return a one-dimensional (1D) array of binary labels, and, the multiclass model has used the value ‘categorical’, thereby one-hot encoding the classes into a 2D array.

3.3 Model Architecture for Binary-Class Classifier (Normal vs. COVID-19)

In this work, the proposed CNN architecture consists of a total of three convolutional layers in a binary classifier and the data set has been divided into 80% for training and 20% for testing. The model architecture comprises one input convolutional layer taking the pre-processed data as input, two hidden convolutional layers, and two dense layers out of which the last layer is the output. The first three convolutional layers possess the same arguments of a 5 × 5 sized kernel or filter resulting in 64 feature maps and with Rectified Linear unit or ReLu as the activation function. Each of them is accompanied by a max-pooling layer with a pool size of three and a default stride value of one. Subsequently, a dropout layer is added with a factor of 0.25 and flattened to obtain 3136 hidden layer neurons. Furthermore, these neurons are subjected to a fully connected layer with the ‘relu’ activation function after which a dropout with a probability value of 0.25 is performed. This is fed to the output layer which is equipped with the ‘sigmoid’ function. This proposed CNN architecture is elucidated in Fig. 4a and has been compiled using adaptive moment estimation (adam) optimization and ‘binary_crossentropy’ as the loss function.

Fig. 4
figure 4

a Proposed architecture of the CNN models for binary-class classifier. b Proposed architecture of the CNN models for ternary-class classifier

Furthermore, the CNN has used kernel convolution where the kernel or filter is passed over the input image and transformed into a feature map [19]. This convolution operation is depicted as *, where I, is the input image and K denotes the kernel, as shown in the following equation:

$${{I}}{*}{{K}}\left({{m}}, {{n}}\right)=\sum_{{{i}}}\sum_{{{j}}}{{K}}\left({{i}},{{j}}\right){{I}}({{m}}-{{i}},{{n}}-{{j}})$$
(1)

In this case, the indexes 'i' and 'j' indicate the coordinates of the components inside the kernel (K), while the indexes 'm' and 'n' indicate the coordinates of the components inside the source image (I). While 'm' and 'n' iterate across the rows and columns of the source image when the kernel is applied, 'i' and 'j' iterate across the rows and columns of the kernel during the convolution procedure.

Throughout the architecture of the binary model, different activation functions have been used. Activation functions aim to aid the neural network in learning complex patterns in the data and provide compatibility between the output of the previous layer and the input of the next. Each neuron is attached to this function and is activated or deactivated according to its relevance in the model’s prediction. The purpose and equations of each function used in this model, have been provided as follows.

The ReLu activation function has the formula as described in (2), 'x' denotes the input to the operator. This function is computationally efficient as compared to other functions and also introduces non-linearity in the network which is beneficial during the backpropagation [46]:

$$\begin{aligned} ReLu & = 0, \;{\text{for}}\; x < 0 \\ & = x, \;{\text{for}} \;x \ge 0 \\ \end{aligned}$$
(2)

The sigmoid function outputs in a range of 0–1 with any value from the input domain, therefore, making it a logistic function. This function is useful in the output layer since it assists in determining the probability of an image belonging to a particular class [47]. However, due to expensive computational cost, it has been used only once in the binary classifier. Its equation is shown in (3), 'x' denotes the input to the operator:

$${sigmoid}\left( {x} \right) = \frac{1}{{1 + {e}^{{ - {x}}} }}$$
(3)

The main reason behind implementing different activation functions was to leverage their capabilities in different situations.

3.4 Model Architecture for Ternary-Class Classifier (COVID-19 vs. Normal vs. Pneumonia)

The model proposed by the authors, for a triple-class classifier consists of four convolutional layers in total. With 80:20 training and testing data split, the model encompasses one input convolutional layer receiving the 3-channeled images, three hidden convolutional layers, then a dense layer for establishing a full connection, and finally an output layer to predict the outcome. This proposed architecture is illustrated comprehensively in Fig. 4b.

The input layer of the CNN model uses a hyperbolic tangent function (tanh) and possesses a 3 × 3 sized kernel which results in 64 feature maps. The succeeding two convolutional layers follow similar characteristics of the kernel and activation function with a default stride size of 1. Furthermore, a max-pooling layer having a filter of size 2 × 2 accompanies both layers. Another convolution layer with the same characteristics is used with the pooling layers of pool size 3 × 3. Consequently, the same arrangement of the convolutional and the max-pooling layer is created for the 4th layer which uses the leaky rectified linear unit (LReLU) function. The last max-pooling layer’s output acts as an input to the dropout layer set with a rate of 0.35 which is flattened to achieve 23,104 values of hidden layer neurons and then, is subjected to a fully connected layer with the ‘relu’ function. Finally, another 0.40 factored dropout is performed to this layer’s output and then fed to the last layer to get the predicted category of the input image out of three classes. This final layer leverages the ‘softmax’ algorithm which is used as a preference for multi-class image classification. This CNN model is compiled using the ‘adam’ optimizer and ‘categorical_crossentropy’ loss function. Being a CNN, this model also implements (1) for performing the convolution operation.

The convolutional layers of the multi-class model use the tanh function [48] where the range is from – 1 to 1. This function gives a better performance in multi-layer neural networks. It produces zero-centered output which is preferred in backpropagation and also helps in solving the problem of sigmoid function. Its equation is given in (4), where 'x' denotes the input to the operator:

$$\mathbf{tanh}(\mathbf{x})=\frac{{{\varvec{e}}}^{{\varvec{x}}}-{{\varvec{e}}}^{-{\varvec{x}}}}{{{\varvec{e}}}^{{\varvec{x}}}+{{\varvec{e}}}^{-{\varvec{x}}}}$$
(4)

Another activation function, LReLU [49] has been used in the convolution layer of the multi-class model, where is set to 0.2. This activation function has been used to prevent the dying neurons by providing a small epsilon value in the negative part of its derivative, whereas ReLu produces zero value in that region. It is described by the (5) given below, where 'x' denotes the input to the operator:

$$\begin{aligned} LeakyReLu & = \propto *x, \;{\text{for}}\; x < 0 \\ & = x, \;{\text{for}} \;x \ge 0 \\ \end{aligned}$$
(5)

3.5 Proposed Algorithm

  1. 1.

    Input data: Chest X-ray images (X) where Xϵ{COVID-19; normal} or Xϵ{COVID-19; normal; pneumonia}

  2. 2.

    Output data: The trained model which predicts the category of the input image.

  3. 3.

    Pre-processing: Modify the X-ray input to 256 × 256 pixels and data is augmented by shear, zoom, horizontal flip, and rescale.

  4. 4.

    Divide the data set into training and validation sets as 80% and 20% of the total data set, respectively.

  5. 5.

    Data encoding: binary or categorical encoding done according to the classifier used.

  6. 6.

    For epochs in 1 to 50, do:

  7. 7.

    Train the model through three convolutional layers (binary classifier) or four convolutional layers (ternary classifier) on a batch size of 64.

  8. 8.

    end.

4 Experimental Setup and Results

To increase the effectiveness of the two proposed layered models, different combinations of the layers involved in the CNN architectures have been applied and their hyperparameters have been tuned. Being a deep learning model, automatic feature extraction takes place. After several experiments on the model, only the relevant results are presented in the underlying section. During the prior experimentations that have been performed, values of different parameters were adjusted and tested with the model’s training/validation process. This practice assisted in finding the suitable set of parameters that maintained a balance between the model’s improved accuracy and lesser time consumption while presenting the result.

4.1 Implementation Environment

All the experiments have been conducted using Python 3 on Google Colab, a cloud service that provides a powerful GPU Nvidia K80/T4 with 12 GB GPU RAM. The experiment is carried out on a Python 3.7 programming environment, with modules such as Keras, Matplotlib, Scikit-learn, Numpy, itertools, and others being used.

4.2 Experimental Setup

Due to their different scenarios of implementation, both the proposed models have been provided with different data sets of X-ray images i.e., the classification of COVID-19 vs. normal is obtained from the binary classification model implemented on the data set [40] and the classification of the X-ray images into positive COVID cases vs. normal cases vs. pneumonic cases has obtained from triple-class classifier which leveraged the data set from [40, 42]. The models have been trained and validated on 80% and 20% of the total data set, respectively, and all the images of the data set have been augmented and resized to 256 × 256 pixels. The reason behind selecting this specific pixel resizing of input images has been mentioned while presenting the section involving “Data Pre-processing”. Both architectures have batch sizes set to 64 and have been trained for 50 epochs at a learning rate of 10–3 or 0.001 because, during experimentation, it was observed that epochs greater than the prescribed value, resulted in a large gap between validation accuracy and training accuracy, which is not desirable. The two proposed models have also used different combinations of activation functions in their convolutional layers. The dimensions of the layers involved in the proposed are depicted in Fig. 5. Their performances have been evaluated and the results of the best model from each binary and ternary classifier have been presented in the form of a confusion matrix along with classification metrics such as accuracy, f1-score, precision, and recall.

Fig. 5
figure 5

Architectural description of the proposed CNN models for a binary classification and b ternary classification

4.3 Results

In this study, the authors aim to achieve the most accurate models for each binary class and multiclass classifiers; therefore, extensive experimentation has been carried out in analysis, and based on the attained results, only the most valid ones have been reported. The outcome has been evaluated using the performance metrics accuracy, F-score, recall, and precision:

$${\text{Accuracy}} = \frac{{{\text{TPOS}} + {\text{TNEG}}}}{{{\text{TPOS}} + {\text{TNEG}} + {\text{FPOS}} + {\text{FNEG}}}}$$
(6)
$${\text{Precision}} = \frac{{{\text{TPOS}}}}{{{\text{TPOS}} + {\text{FPOS}}}}$$
(7)
$${\text{Recall}} = \frac{{{\text{TPOS}}}}{{{\text{TPOS}} + {\text{FNEG}}}}$$
(8)
$$F{\text{ - score}} = \frac{{2*{\text{PRE}}*{\text{SEN}}}}{{{\text{PRE}} + {\text{SEN}}}}$$
(9)

where TPOS = True Positive, TNEG = True Negative, FPOS = False Positive, and FNEG = False Negative.

In the first proposed model of binary classification, 99.18% training and 98.91% testing accuracy have been achieved. Figure 7 shows the confusion matrix for the binary classification model (COVID vs. normal). Among the 183 cases of each class, 182 COVID instances and 180 normal cases are correctly predicted. Only 1 COVID test sample and 3 normal test instances are predicted incorrectly. The variation in terms of accuracy and loss for training data and validation data concerning each epoch are shown in Fig. 6. The significant loss at the beginning of the training and then its gradual decrease in later epochs can be observed from these curves. The rationale behind this trend of the loss curve is that the deep learning model trains according to the error faced in each epoch, which is facilitated via backpropagation, therefore, as the epochs transcend, the loss decreases. Finally, at the later part of the training, the large variations in the upward and downward curves of the training and validation loss are reduced and a trained model is obtained. Figure 7 depicts the confusion matrix of the binary classification, which can be used to calculate the model’s f1-score, precision, recall factor, and accuracy. Therefore, the overall values of the f1-score, precision, and recall factor are 99%, 98.5%, and 98.5%, respectively, and these values for each class are summarized in Table 2. Based on true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN).

Fig. 6
figure 6

Plot obtained from the binary classifier. a Accuracy achieved from the data set vs. epoch. b Loss encountered by the data set vs. epoch

Fig. 7
figure 7

Confusion matrix for the two-class model, i.e., COVID vs. normal. a Non-normalized. b Normalized

Table 2 Performance metrics for the proposed binary-class model

Similarly, in the second proposed model during ternary classification, 96.5% training and 95.99% testing accuracy have been achieved. The confusion matrix for the ternary classifier, i.e., COVID vs. normal vs. Pneumonia is shown in Fig. 9. 178 COVID samples, 183 normal instances, and 166 Pneumonia images are correctly classified for 183 test images of each class. The unsuccessful predicted cases have 5 COVID instances, 0 normal instances, and 17 Pneumonia instances during testing of the data. In Fig. 8., the training and validation accuracy graph against each epoch is illustrated and their loss graphs are shown as well. Due to the classification into three categories and the high degree of similarity between COVID and pneumonic X-rays, the accuracy of the model was reduced as compared to its predecessor classifier. The confusion matrix shown in Fig. 9 verifies the result obtained. Consequently, Fig. 10 summarizes the comparison between true labels and predicted labels of the images available in the testing data set. As illustrated in the figure, the variation between the “Predicted Label” bar and “True Label” bar is observed to be minor for all classes; however, the minimum difference in the comparison can be seen by “Covid” category while the maximum difference is shown by “Normal” category. Therefore, it can be observed that the ternary model has achieved an acceptable performance in terms of accuracy. Furthermore, the f1-score, precision and recall factor of the proposed model are calculated as 96.33%, 96.3% and 96%, respectively, and these values pertaining to each class are given in Table 3.

Fig. 8
figure 8

Plot obtained from the ternary classifier. a Accuracy achieved from the data set vs. epoch. b Loss encountered by the data set vs. epoch

Fig. 9
figure 9

Confusion matrix for the ternary classifier, i.e., COVID vs. normal vs. pneumonia. a Non-normalized. b Normalized

Fig. 10
figure 10

Double bar graph illustrating the total number of images under each group. The blue colored bar represents images as categorized by the proposed ternary classifier. Meanwhile, the cyan-colored bar represents the actual number of images present under each class

Table 3 Performance metrics for the proposed multi-class model

Based on the results achieved from the models proposed in this study, some findings assisted in differentiating between COVID and pneumonic X-rays, which possess a high degree of similarity. These differences are depicted in Fig. 11. The following are commonly observed characteristics in a COVID-19-affected chest X-ray [50]:

  1. 1.

    Thickening of bronchovascular.

  2. 2.

    Ground-glass opacities (GGO) (subpleural, bilateral, medial, multifocal, peripheral, basal and posterior).

  3. 3.

    A crazy paving appearance (GGOs and inter-/intra-lobular septal thickening).

Fig. 11
figure 11

Characteristic feature distinguishing between chest X-rays of (a) COVID-19 and (b) pneumonia

In similar lines, pneumonic scans display the following characteristics [51]:

  1. 1.

    Thickening of the bronchial wall.

  2. 2.

    Vascular thickening.

  3. 3.

    GGO unilateral and central distribution.

5 Discussion and Future Scope

The development of the CNN models from scratch assisted the system in achieving efficiency. In comparison with existing models that used pre-trained CNN architectures as their base network for detecting COVID-19, the proposed layered networks have used fewer resources in terms of the convolutional layers and have shown a good accuracy inclusive of the data augmentation performed on chest X-ray images. The computational cost has also been reduced in terms of activation functions applied in the layers. Furthermore, even after having a high resemblance between pneumonic and COVID-infected radiographic images, the presented model achieved an accuracy of approximately 96%. The quintessential proposed models have demonstrated a steady result on the expanded data set. With the coalescence of a diversified set of CNN layers being transmogrified based on preliminary reports, the discussed architecture in this manuscript has furnished a more unfaltering system while performing diagnostics. The comparison of the proposed method with state-of-the-art methods is shown in Table 4.

Table 4 Comparison of the proposed method with existing research

The proposed architecture of both these classifiers has produced good results due to the use of X-rays for training the models. As observed from the literature studies considered in this research paper, the models that adopted X-rays have shown better output than those that implemented CT scans. The convolutional layers that have been added to each of the CNN models include the hyper-parameters which have been carefully tuned after performing various experiments on their values. This also assisted in improving the model’s performance.

With the available data set, the models achieved a balance between reliability and time consumed for producing the test results. This enabled the authors to achieve the aim of this study. Subsequently, with comparison to existing models, the accuracy is observed to be better. Another observation recorded while experimenting in this research study is that due to the limited availability of publicly released COVID-19 data, a comparison with studies having different setups has been performed. This limitation of the study can be revisited in the future once a larger data set with open access is available. Hence, a comparison can be achieved over a uniform setup of the discussed research models. Moreover, these proposed layered CNNs can be re-implemented on a data set which would be an amalgamation of CT scans and X-rays. This would bridge the research gap experienced by the majority of the presented literature studies. Nevertheless, the strength of these proposed models is their fast and early diagnostics system, which can help isolate the suspected cases hence reducing the spread of this fatal disease. The study may be impacted by resource constraints and might not take into consideration the temporal features of disease progression. Future studies may take into account potential shifts in COVID-19 strains as well as biases related to demographics and ethnicity present in the data set. Furthermore, the integration of sophisticated statistical methodologies, notably advanced tests such as the Diebold–Mariano test [52,53,54], McNemar test [55], etc. is intended to be explored in future investigations.

6 Conclusion

In this study, authors have proposed two models that can be used for detecting and classifying COVID-19 cases from chest X-ray images. The first, proposed layered CNN architecture for binary classification employed a total of three convolution layers accompanied by two max-pooling layers and two dense layers for establishing a full connection between neurons. Furthermore, another proposed ternary-class layered CNN comprised four convolutional layers with three max-pooling layers and two dense layers at the end. Hence, a layered novel approach has followed in this proposed work, which increases the accuracy and efficiency of the system. In addition to that, being a deep learning model, there was no need for manual feature extraction in the chest scans; therefore, a fully automated architecture is presented in this work, for both binary and ternary-class classifiers with an accuracy of 98.91% and 95.99%, respectively. The results section also encompasses the findings obtained which differentiate COVID-19 chest X-rays from those of pneumonia-infected. These distinguishing features are presented in Fig. 10. In the end, the proposed classifiers along with the pre-existing models are summarized based on their followed approach and accuracy obtained and are presented in the form of a table for comprehension. The developed system in this study intends to produce a quick and early diagnosis of COVID-19, including pneumonia cases, and provide assistance to the patients, as a second opinion.