1 Introduction

COVID-19, one of the most life-threatening viruses of recent times, emerged in December 2019 and has had a devastating effect on both daily life and public health. Thousands of lives are lost every day around the world. Unfortunately, there is no effective treatment to eliminate the virus and, so far, vaccines have not given reassuring results. Several approaches and applications that combine advanced artificial intelligence with medical resources have been presented to diagnose the virus in its early stages [1, 2]. Many studies have proposed detecting COVID-19 with machine learning algorithms such as random forests, genetic algorithms, and Artificial Neural Networks (ANNs) [3,4,5].

ANNs have attracted researchers’ interest in recent years because of their good performance on classification tasks. In particular, Deep Learning (DL) approaches are widely used for classification because they enable the automatic extraction of complex data features at high levels of abstraction. Recent DL approaches, which are characterized by a large number of hidden layers, provide efficient solutions to computationally intensive problems and allow machines to learn and predict object classes with higher accuracy [6, 7].

Many research papers have applied DL architectures to COVID-19 detection [5, 8]. In [5], the authors put forward a model for COVID-19 detection named DarkCovidNet. This model was designed to provide accurate binary (COVID, non-COVID) and three-class (COVID, non-COVID, pneumonia) classification from X-ray images. The dataset used included 500 healthy cases and 500 pneumonia cases. The DarkCovidNet architecture comprised 17 convolution layers, each followed by a Batch Normalization (BN) layer that normalizes the layer output. DarkCovidNet achieved an accuracy of 98.08% on binary classification and 87.02% on three-class classification after 100 epochs of training.

In [8], an end-to-end system for COVID-19 and pneumonia infection detection was proposed. Both X-ray images for COVID-19 detection and CT images for pneumonia detection, collected from different publicly available sources, were used to evaluate the model. The Inception Recurrent Residual Neural Network (IRRCNN) was suggested for the detection of COVID-19, and the NABLA-N network model was used for the identification and segmentation of the infected regions. The IRRCNN architecture was composed of an input layer, five inception recurrent residual units, a global average pooling layer and a softmax output layer. The NABLA-N architecture was based on a U-Net template containing two nested U-Nets. The detection model reached around 84.67% testing accuracy on X-ray images and 98.78% on CT images.

These studies applied a variety of DL approaches to COVID-19 detection, but each model was trained on either X-ray or CT data, not on both combined. To improve the performance of COVID-19 detection, we first combine the heterogeneous X-ray and CT chest datasets into a single large one. Second, we investigate the DL architectures best suited to combining the advantages of the approaches and techniques in the literature, in order to improve pre-processing and feature extraction for heterogeneous images. In this context, we put forward a novel DL model, called DarkCovidNet-NRC, which integrates Nested Residual Connections (NRCs) into the DarkCovidNet model.

This paper is organized as follows. Sect. 2 describes the proposed DL model. Sect. 3 describes the heterogeneous dataset and the software and hardware environment used for the experiments, and discusses the results obtained. The last section concludes and outlines some future perspectives of this work.

2 DarkCovidNet-NRC DL Architecture

In this paper, we investigate the most adequate DL architectures to come up with a better performing model. The proposed model, named DarkCovidNet-NRC, combines the DarkCovidNet architecture with NRC blocks, which are composed of nested residual blocks (residual blocks within a residual block). As described in [5], the DarkCovidNet model can achieve good accuracy because its convolution layers provide efficient feature-map extraction. Integrating residual blocks with skip-connections into the DarkCovidNet architecture can also make the model more robust and expandable and can reduce the risk of overfitting, thus improving COVID-19 classification performance. The DarkCovidNet-NRC architecture is depicted in Fig. 1. It starts with two single Dark Net (DN) blocks, each of which contains one convolutional layer followed by BN and PReLU operations. It then integrates four NRC blocks, each of which is a residual block composed of one DN block and two successive nested residual blocks. Each NRC block also provides a skip-connection from its DN block to the output of its second nested residual block, and a PReLU activation function is applied at the end of the block. The DarkCovidNet-NRC architecture ends with one DN block, a flattening layer and a softmax layer that produces the outputs. In the first six blocks (the two DN blocks and the four NRC blocks), the number of convolution channels doubles at each block. It then decreases to 2 in the last DN block, which equals the number of target classes (COVID, non-COVID).
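The data flow of one NRC block, as described above, can be sketched in plain Python. This is only an illustration of how the skip-connections are wired, not the authors' Keras implementation: the convolution, BN and learned PReLU slope are replaced by hypothetical stand-in functions (`double`, `half`, `negate`) and a fixed slope `alpha`, all of which are assumptions made to keep the arithmetic easy to follow.

```python
from typing import Callable, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def prelu(x: Vector, alpha: float = 0.25) -> Vector:
    """PReLU with a fixed slope; in a real network alpha is learned."""
    return [v if v > 0 else alpha * v for v in x]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum used by every skip-connection."""
    return [p + q for p, q in zip(a, b)]


def residual(body: Layer) -> Layer:
    """Wrap `body` so that its input is added back to its output."""
    return lambda x: add(x, body(x))


def nrc_block(dn: Layer, inner1: Layer, inner2: Layer) -> Layer:
    """One NRC block: a DN block, two successive nested residual blocks,
    an outer skip from the DN output, then PReLU."""
    def forward(x: Vector) -> Vector:
        h = dn(x)                                   # DN block output
        y = residual(inner2)(residual(inner1)(h))   # two nested residual blocks
        return prelu(add(h, y))                     # outer skip, then PReLU
    return forward


# Hypothetical stand-in layers (assumptions, not layers from the paper).
double = lambda x: [2.0 * v for v in x]
half = lambda x: [0.5 * v for v in x]
negate = lambda x: [-v for v in x]

block = nrc_block(dn=double, inner1=half, inner2=negate)
print(block([1.0, -1.0]))  # [2.0, -0.5]
```

In this toy case the two inner blocks cancel out, yet the DN output still reaches the final PReLU through the outer skip-connection, which is the information-propagation property that residual connections are meant to provide.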

Fig. 1. DarkCovidNet-NRC architecture

3 Experiments and Results

To implement the suggested DL model for COVID-19 detection, we first describe the dataset used, then set out the software and hardware environment, and finally present and discuss the implementation results.

3.1 Dataset

We use an open dataset named “Extensive COVID-19 X-ray and CT Chest Images Dataset”, published on 12/06/2020 by Walid El-Shafai and Fathi Abd El-Samie [9]. This open dataset has been collected from multiple sources and augmented with different techniques, and it contains 17,599 annotated COVID and non-COVID images. The X-ray subset contains 5,500 non-COVID images and 4,044 COVID images, and the CT subset contains 2,628 non-COVID images and 5,427 COVID images.
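The per-subset counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the four figures given in the text; the dictionary layout and variable names are illustrative choices.

```python
# Image counts as stated in the dataset description.
counts = {
    ("X-ray", "non-COVID"): 5500,
    ("X-ray", "COVID"): 4044,
    ("CT", "non-COVID"): 2628,
    ("CT", "COVID"): 5427,
}

total = sum(counts.values())

# Aggregate by imaging modality and by class label.
per_modality = {}
per_class = {}
for (modality, label), n in counts.items():
    per_modality[modality] = per_modality.get(modality, 0) + n
    per_class[label] = per_class.get(label, 0) + n

print(total)         # 17599
print(per_modality)  # {'X-ray': 9544, 'CT': 8055}
print(per_class)     # {'non-COVID': 8128, 'COVID': 9471}
```

The four counts sum to exactly 17,599 images, and the two classes are reasonably balanced in the combined set even though each modality on its own is skewed in opposite directions.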

3.2 Experiment Settings

The DL architectures are developed in Python with the Keras library. The experiments are run on an AMD R5 3600 CPU @ 3.6 GHz, a GTX 1060 GPU with 6 GB of memory, and 16 GB of RAM. For training and validation, we preprocess the images by dividing the pixel values by 255. The adaptive moment estimation (Adam) optimizer is used to learn the weights, with a learning rate of 0.001. The number of training epochs is set to 50, with early stopping if the model does not improve for 20 epochs. The batch size is set to 32. These hyper-parameters are fixed to ensure the convergence of the network.
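The settings above can be summarized in a small, framework-free sketch. It is not the authors' training script: the `TrainConfig` class and the helper names are hypothetical, and monitoring the validation loss for early stopping is an assumption, since the text does not say which quantity is tracked.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TrainConfig:
    learning_rate: float = 0.001  # Adam learning rate
    max_epochs: int = 50
    patience: int = 20            # epochs without improvement before stopping
    batch_size: int = 32
    pixel_scale: float = 255.0    # pixel values are divided by this


def normalize(pixels: List[int], scale: float = 255.0) -> List[float]:
    """Rescale 8-bit pixel values to the range [0, 1]."""
    return [p / scale for p in pixels]


def epochs_run(val_losses: List[float], cfg: TrainConfig) -> int:
    """Return how many epochs are executed before training stops.

    Assumes the monitored quantity is a loss, so lower is better.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses[: cfg.max_epochs], start=1):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= cfg.patience:
                return epoch
    return min(len(val_losses), cfg.max_epochs)


cfg = TrainConfig()
print(normalize([0, 255]))                # [0.0, 1.0]
print(epochs_run([1.0] + [2.0] * 60, cfg))  # 21
```

With a 50-epoch budget and a patience of 20, early stopping can only trigger when the best epoch occurs at epoch 30 or earlier; otherwise training simply runs to the 50-epoch limit.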

3.3 Results and Discussion

The experiments aim to show the effectiveness of the proposed architecture for the detection of COVID-19 cases. Training is formulated as a categorical classification task and evaluated with the K-fold cross-validation technique, with K = 5. In this section, we present and discuss the implementation results and then compare them with the state of the art.
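As a reminder of how 5-fold cross-validation partitions the data, the following minimal sketch generates the train/validation index splits. The shuffling, the fixed seed and the absence of stratification are assumptions made for the example; the text does not specify how the folds were drawn.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, k: int = 5, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Fold sizes for the combined dataset of 17,599 images.
splits = list(k_fold_indices(17599, k=5))
print([len(val) for _, val in splits])  # [3520, 3520, 3520, 3520, 3519]
```

Each image is used for validation exactly once and for training in the other four folds, so the reported metrics are averages over five models trained on about 80% of the data each.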

Implementation Results.

Training is carried out in three separate 5-fold cross-validation sessions. The first session uses only the X-ray dataset, the second uses only the CT dataset, and the final session uses the combination of the X-ray and CT datasets. Table 1 presents the accuracy, precision, recall and F1-score averaged over the five folds of each session. The results indicate that our model achieves its best results in the combined session, which suggests that it is capable of classifying combined heterogeneous X-ray and CT chest images. On the combined dataset, the proposed DL architecture achieves an accuracy of 0.9609 and a precision of 0.9780.
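For reference, the four reported metrics can be computed from per-fold confusion counts and then averaged, as sketched below. Treating COVID as the positive class is an assumption, and the confusion counts in the example are made-up placeholders, not results from the paper.

```python
from typing import Dict, List


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall and F1-score from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def average_over_folds(per_fold: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each metric over the folds of one cross-validation session."""
    return {name: sum(m[name] for m in per_fold) / len(per_fold)
            for name in per_fold[0]}


# Hypothetical confusion counts for two folds (placeholders only).
fold_metrics = [
    binary_metrics(tp=9, fp=1, fn=3, tn=7),
    binary_metrics(tp=8, fp=2, fn=2, tn=8),
]
session_average = average_over_folds(fold_metrics)
print(session_average)
```

Averaging per-fold metrics in this way gives one row of a table such as Table 1 for each session.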

Table 1. Performance measures of DarkCovidNet-NRC architecture of the three sessions

Discussion and Comparison with State of the Art.

We also implement other architectures recently and successfully used in the literature for COVID-19 detection in order to compare them with our model. Table 2 shows the accuracy, precision, recall and F1-score, averaged over 5-fold cross-validation, for the DarkCovidNet, MobileNet-v2 and VGG19 architectures validated on the mixed dataset. We note that the proposed model is competitive with the other models and outperforms them on the accuracy and precision metrics. This indicates that the model is well suited to classifying COVID-19 images. Integrating the NRC blocks improves on the DarkCovidNet architecture by enabling feature reuse and facilitating the propagation of information, which leads to better classification performance.

Table 2. Summary of state-of-the-art results

4 Conclusion and Perspectives

In this paper, we have introduced a novel DL architecture, DarkCovidNet-NRC, which integrates NRCs into the DarkCovidNet model. We have used a large dataset and the K-fold cross-validation technique to evaluate it on X-ray and CT chest datasets, first separately and then combined. DarkCovidNet-NRC improves COVID-19 detection performance on several metrics when the combined heterogeneous datasets are used. The new architecture is competitive with the state of the art and outperforms it on the accuracy and precision metrics. In the future, we will extend this architecture to the multi-class classification of lung diseases. Furthermore, this model could be used to classify other diseases.