1 Introduction

The Coronavirus (COVID-19) pandemic has confronted the whole world with various problems since December 2019. COVID-19 causes multiple symptoms, including dry cough, fever, headache, myalgia and chest troubles [5, 20]. To this date, the number of COVID-19 cases across the globe keeps increasing day by day. According to the World Health Organization, the number of deaths worldwide is approximately 2.6 million (https://covid19.who.int/). COVID-19 is generally considered a respiratory disease. The virus takes around 14 days to show its main symptoms and to become contagious to other persons. The most common test used to diagnose COVID-19 is Reverse Transcription Polymerase Chain Reaction (RT-PCR) [12, 30]. However, various countries and hospitals do not provide enough of these tests, or they are very costly to patients. To address these problems, multiple works have been proposed for COVID-19 image classification, detection, analysis and segmentation.

Various imaging techniques have recently been used to build new COVID-19 detection systems, such as chest x-ray and chest computed tomography (CT) scans. Indeed, COVID-19 causes visible abnormalities in lung radiography images, including chest x-ray and chest CT [11].

In this paper, we propose to build a chest CT COVID-19 image segmentation system to localize the infection indicators of COVID-19 in images. These visual indicators offer an alternative means for a rapid diagnosis of COVID-19-infected patients.

RT-PCR test kits are costly and available only in limited numbers in many hospitals around the world. Therefore, new COVID-19 detection and segmentation applications based on chest x-ray and CT images present a powerful and easily accessible alternative. Artificial intelligence (AI) and deep learning-based techniques have been widely used to build new COVID-19 detection and segmentation applications that aim to minimize the cost and the time needed to detect infected patients. More precisely, deep convolutional neural networks (DCNN) have demonstrated effective and competitive results in the medical imaging area [10, 23].

Developing new systems for early diagnosis and detection of COVID-19 infection on CT scan imaging using deep learning techniques is a very challenging task. In this work, we propose to build a new COVID-19 segmentation system by taking advantage of deep learning models. The proposed work presents a new tool to assist medical staff in detecting COVID-19 at early stages and thus save human lives.

In this work, we propose to develop an early diagnosis and segmentation application for COVID-19 chest lung CT images. A CT scan provides detailed information about different organs such as the lungs, blood vessels and bones. In contrast to chest x-ray images, CT images offer helpful information about a given region of the body without overlaying the body’s different structures. Thus, CT images provide more detail than conventional x-ray images. All this information serves to determine whether there is a problem and also contributes to locating it. Based on this fact, various deep learning methodologies have been proposed for COVID-19 CT scans, as presented in [3, 19, 25, 26, 31, 36]. To train and test these networks and algorithms, dedicated datasets such as the COVID-CT dataset [40] are needed. There is an increasing need to create new designs and applications that achieve an earlier diagnosis of COVID-19 disease in order to control the severe pandemic, which is spreading rapidly worldwide.

Deep learning-based techniques have demonstrated very efficient results in COVID-19 classification, detection and segmentation. Thorax computed tomography (CT) images can be widely used to perform an early diagnosis of COVID-19 patients. We also note that COVID-19 diagnosis based on thorax CT images requires extensive radiology expertise and represents a considerable time cost. Moreover, many countries cannot provide enough test kits as they are costly. Based on these facts, there is an increasing need to build new applications that diagnose the presence of COVID-19 in the human body early and put infected persons immediately under quarantine to avoid infecting other persons. Generally, the RT-PCR test suffers from a high false negative rate, and the machine used for the test takes between 4 and 8 hours to process one trial of one patient. The RT-PCR test also presents low sensitivity: in various cases, the infected person is not recognized, and the infected patient is sometimes considered COVID-19 negative because of the false negative rate [4]. In view of the above, there is an increasing need to build new automated structures for thorax CT analysis of COVID-19 images to improve performance and save much more time for the medical staff.

Deep learning-based architectures have demonstrated the best performances in classification, detection and segmentation tasks. They are efficient and robust enough to identify a set of illnesses quickly and accurately. Deep learning-based applications can be successfully applied in different areas, including uncertainty estimation [37], point cloud analysis [38], indoor object detection [1] and road sign detection [6]. Deep learning techniques are also widely applied in the medical field; for example, they are used to determine from CT images whether a patient is infected with COVID-19 or not.

The main aim of this work is to propose a new COVID-19 segmentation system based on deep learning techniques. The proposed method can detect and segment COVID-19 lesions on lung CT images. This work was developed using a context aggregation network, which consists of three main modules: the context fuse module (CFM), the attention mix module (AMM) and the residual convolution module (RCM). To the best of our knowledge, this work is the first to employ this type of network for COVID-19 segmentation. It can contribute widely to the early diagnosis of COVID-19 disease by highlighting the ground glass opacity and consolidation areas. Therefore, the proposed system can help the medical community diagnose COVID-19 infection more accurately and faster, and it can save human lives by detecting COVID-19 at an early stage. A graphical representation of our proposed system is presented in Fig. 1. First, the CT images are acquired by the CT scanner and sent to a computer workstation. At this stage, the context aggregation network is applied to the captured images to detect the ground glass opacity and consolidation areas, which are related to the presence of COVID-19 disease.

Fig. 1 Proposed workflow for CT scan image segmentation

2 Related work

Image segmentation is an essential task in medical imaging and healthcare. It is one of the primary steps for clinical applications. Image segmentation can be applied to different medical imaging problems, including brain glioma segmentation [32] and automatic brain lesion segmentation [17]. Several works have been conducted during the last few years in medical imaging, such as the diagnosis of CT scans using deep learning-based techniques. Recently, during the outbreak of coronavirus disease, CT and x-ray images have been highly valuable for diagnosing the presence or absence of COVID-19.

Since their appearance, deep convolutional neural networks (DCNNs) have demonstrated strong performance in CT image segmentation tasks. U-Net [27] has been evaluated to accurately detect COVID-19 in CT scans and has demonstrated good COVID-19 segmentation performance [21]. In [28], the authors evaluated SegNet [7] for COVID-19 tissue segmentation. The obtained results show the superiority of SegNet in classifying infected and non-infected CT scan tissues.

Deep learning-based architectures have demonstrated tremendous performance in terms of feature extraction and network learning capabilities [14, 15]. Image quality strongly influences these results, so good image quality is needed to ensure good detection and segmentation performance. In [13], Gu et al. introduced an image quality evaluation algorithm that can be used in different tasks.

Conventional procedures for early diagnosis of COVID-19 remain time-consuming and present a risk for medical staff. Moreover, the number of test kits is limited and they are costly. Therefore, medical imaging techniques such as x-ray or CT images have recently been used for COVID-19 screening and diagnosis. X-ray imaging has been widely used for COVID-19 diagnosis as it demands much less time to process images at a lower cost; x-ray scanners are also more widely available than other machines. However, x-ray inspection by radiologists is time-consuming and generally presents a high error rate due to the lack of prior knowledge about the infected regions in x-ray images. To address this problem, various works have turned to computed tomography (CT) images, taking advantage of deep learning-based algorithms. In [39], the authors provide a comprehensive literature review of COVID-19 detection in CT images. The critical components of CT scan image diagnosis have been the reticular pattern, ground-glass opacities and consolidation.

In [41], Zhao et al. investigated the relation between chest CT findings and the clinical conditions of COVID-19 pneumonia. In this work, the main symptoms associated with coronavirus disease were studied and analyzed. Data on 101 cases of COVID-19 were collected from four institutions in China. In this study, COVID-19 patients presented typical imaging features that can be highly useful in analyzing suspected cases of this disease.

A retrospective study relating CT findings to the time between symptom onset and the initial CT scan was performed in [8]. Generally, the main hallmarks of COVID-19 disease on CT imaging were bilateral and peripheral ground glass and consolidative pulmonary opacities; with a longer duration of COVID-19 symptoms, CT findings became more frequent, including greater total lung involvement.

Generally, an accurate and early diagnosis of COVID-19 suspected cases plays a vital role in putting infected persons in quarantine and saving lives. In [42], the authors proposed a deep learning-based model for automatic COVID-19 detection on chest CT images. This deep learning-based software system was developed using 3D CT volumes. The authors used U-Net to segment 3D lung regions in order to predict infectious COVID-19 areas. This system obtained promising results.

Chest CT images are widely used for the early diagnosis of coronavirus symptoms. They are also an essential complement to the RT-PCR test. In a study including 1014 paired chest CT and RT-PCR tests from patients in Wuhan, the authors of [2] evaluated the performance of chest CT diagnosis of COVID-19 using RT-PCR as the reference standard. Based on their study, chest CT images presented high sensitivity for the diagnosis of COVID-19 disease. In [29], a deep learning-based technique was used for early diagnosis of COVID-19 from CT scan images: the authors developed a new model named CTnet-10 and also tested other state-of-the-art networks, including DenseNet-169, VGG-16, VGG-19, ResNet-50 and Inception V3. In [28], the authors segmented COVID-19 lung CT images using two different architectures, U-Net and SegNet, for image tissue classification. Generally, SegNet is a scene segmentation network and U-Net is a medical segmentation tool. Both neural networks were evaluated for binary segmentation and multi-class segmentation. The two architectures were trained on 72 images, validated on 10 and tested on 18. SegNet showed superior ability in binary classification (infected, non-infected).

In contrast, U-Net showed better ability on the multi-class segmentation task. This work was conducted on CT scan images of COVID-19 patients. The authors achieved a classification rate of up to 0.95 for classifying infected and non-infected tissues when using SegNet, and a mean classification accuracy of 0.91 when using U-Net for multi-class segmentation. In [24], the authors developed a new deep learning-based model named CovNet to extract features from chest CT scans. The model was trained and tested on their collected dataset, composed of 4352 chest CT scans from 3322 patients, and reached sensitivity and specificity of up to 90%. In [33], a COVID-19 detection system was developed based on a lightweight Mask R-CNN to segment areas with ground-glass opacity. The authors used ResNet-18 and ResNet-34 with a single FPN layer as a feature extractor. The network, named COVID-CT-Mask-Net, was trained and tested on the COVID-x-CT dataset (21,191 images). It achieved 91.35% COVID-19 sensitivity, 91.63% common pneumonia sensitivity, a 96.95% true negative rate and 93.95% overall accuracy on the COVID-x-CT dataset. In [34], the authors used Mask R-CNN to segment lesion areas, using COVID-CT-Mask-Net and the COVID-x-CT dataset. They achieved 93.88% COVID-19 sensitivity, 95.64% overall accuracy, 95.06% pneumonia sensitivity and a 96.91% true negative rate on the COVID-x-CT dataset.

In [35], the authors proposed a COVID-CT-Mask-Net model. To build this system, the authors used the COVID-x-CT dataset, training the model on about 5% of its training split. As a result, they achieved 90.80% COVID-19 sensitivity, 91.62% common pneumonia sensitivity, 92.10% normal sensitivity and 91.66% overall accuracy.

The remainder of this paper is organized as follows: Section 3 describes and details the proposed method used for CT scan COVID-19 image segmentation. Experiments and results are provided in Section 4 and Section 5 concludes the paper.

3 Proposed methodology for COVID-19 segmentation

There is an increasing need to develop automatic COVID-19 detection from CT scan images. To create this system, we make use of deep learning algorithms; more specifically, we use the context aggregation network [9]. This model demonstrated good segmentation performance when it was first applied to remote sensing imaging. In this work, we evaluate this convolutional neural network on COVID-19 CT image segmentation and diagnosis. The network is specially designed for semantic labeling and is composed of three main modules:

  • Context Fuse Module (CFM).

  • Attention Mix Module (AMM).

  • Residual Convolution Module (RCM).

3.1 Context fuse module

Due to the vast scale, complexity and variation of the input data, context information is crucial for accurate semantic labeling. To address this problem, the context fuse module consists of two main parts: part A, a parallel convolution block containing four branches with multiple convolutions and kernel sizes, and part B, a global pooling branch consisting of global pooling followed by a 1 × 1 convolution and a batch normalization layer to form the global context information. The parallel convolution block is composed of four convolution branches with various kernel sizes (3 × 3, 7 × 7, 11 × 11 and 15 × 15). Every branch contains two convolution layers to extract the main features, two batch normalization layers to minimize the internal covariate shift and one ReLU activation layer [22]. This module uses large kernels of regular convolutions to contribute to more consistent results. The features extracted by every convolution branch, each with a different receptive field, are then concatenated. We note that using 15 × 15 convolution kernels ensures the network’s ability to treat more complex tasks.

Global pooling branch: this branch is used to introduce global data information. Feature maps are passed through the global average pooling layer to capture the main features and the global context of the data, followed by a 1 × 1 convolution to reduce the channel size. Finally, the feature map is passed through a batch normalization layer and then concatenated with the output of the parallel convolution block.

The whole procedure of the CFM is presented in Eq. 1:

$$ F=\mathbb{P}(x)\oplus {\sum}_i^{\ast }{\mathbb{C}}_i(x)\kern2em i=3,7,11,15 $$
(1)

where \( \mathbb{P}(x) \) presents the series of operations of the global pooling branch, \( {\mathbb{C}}_i(x) \) presents the stacked layers of the parallel convolution branch with kernel size i × i, and \( {\sum}_i^{\ast } \) denotes the consecutive concatenation operation over the branches. Considering all the above, the parallel convolution block provides multiscale information via the different kernel sizes of the convolution layers, so the CFM module as a whole provides multi-scale information together with global context information. Figure 2 presents the CFM block architecture.
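As an illustration, the following PyTorch sketch shows one possible implementation of the CFM described above; the branch channel width (64) and the padding choices are our assumptions, not the reference implementation of [9].

    import torch
    import torch.nn as nn

    class ConvBranch(nn.Module):
        # One parallel branch: two convolutions with the same kernel size,
        # two batch normalization layers and a final ReLU (as described above).
        def __init__(self, in_ch, out_ch, k):
            super().__init__()
            self.block = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, k, padding=k // 2), nn.BatchNorm2d(out_ch),
                nn.Conv2d(out_ch, out_ch, k, padding=k // 2), nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True))

        def forward(self, x):
            return self.block(x)

    class ContextFuseModule(nn.Module):
        # Part A: four parallel branches with 3x3, 7x7, 11x11 and 15x15 kernels.
        # Part B: global average pooling + 1x1 convolution + batch normalization,
        # upsampled and concatenated with the branch outputs.
        def __init__(self, in_ch, branch_ch=64):
            super().__init__()
            self.branches = nn.ModuleList(
                [ConvBranch(in_ch, branch_ch, k) for k in (3, 7, 11, 15)])
            self.global_branch = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(in_ch, branch_ch, 1),
                nn.BatchNorm2d(branch_ch))

        def forward(self, x):
            feats = [b(x) for b in self.branches]
            g = self.global_branch(x)
            g = nn.functional.interpolate(g, size=x.shape[2:], mode='nearest')
            return torch.cat(feats + [g], dim=1)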

Fig. 2 Context fuse module architecture details [9]

3.2 Attention mix module

Generally, in semantic labeling, the deeper layers contain semantically important information, so this module strongly affects semantic segmentation accuracy. To obtain more accurate results and location information, it is essential to combine shallow features with deeper ones. The feature fusion process in the AMM is presented in Eq. 2:

$$ {F}_c=\mathbb{C}\left({F}_h\oplus {F}_l\right) $$
(2)

where \( {F}_h \) and \( {F}_l \) are the high-level and low-level features, respectively, \( \mathbb{C} \) presents the convolution, batch normalization and ReLU activation layers, and \( {F}_c \) corresponds to the output just before the global pooling.

$$ F=\mathbb{P}\left({F}_c\right)\ast {F}_c+{F}_l $$
(3)

where \( \mathbb{P} \) corresponds to the global pooling operation. In the AMM module, the low-level features are concatenated with the high-level features, followed by a 3 × 3 convolution layer to reduce the number of channels. The resulting feature map is then passed through a 1 × 1 convolution and a global pooling, and the pooled output is multiplied with the feature map itself and added to the low-level features to obtain an explicit fusion (Eq. 3). All this contributes to obtaining new features with better recognition abilities. Figure 3 presents the attention mix module architecture.
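A minimal PyTorch sketch of the AMM, following Eqs. 2 and 3, is given below; it assumes that the low-level features already have the same number of channels as the fused output, which is our simplification.

    import torch
    import torch.nn as nn

    class AttentionMixModule(nn.Module):
        # Fc = ConvBNReLU(concat(Fh, Fl))  (Eq. 2)
        # F  = GlobalPool(Fc) * Fc + Fl    (Eq. 3)
        def __init__(self, high_ch, low_ch, out_ch):
            super().__init__()
            self.fuse = nn.Sequential(
                nn.Conv2d(high_ch + low_ch, out_ch, 3, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True))
            self.pool = nn.AdaptiveAvgPool2d(1)

        def forward(self, f_high, f_low):
            # Upsample the deep features to the spatial size of the shallow ones.
            f_high = nn.functional.interpolate(
                f_high, size=f_low.shape[2:], mode='bilinear', align_corners=False)
            fc = self.fuse(torch.cat([f_high, f_low], dim=1))
            attn = self.pool(fc)          # channel-wise attention weights
            return attn * fc + f_low      # assumes f_low has out_ch channels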

Fig. 3 Attention mix module architecture [9]

3.3 Residual convolution module

The backbone network was originally designed for classification tasks, so extra layers must be added to adapt it to semantic labeling problems. Figure 4 presents the RCM module.

As shown in Fig. 4, the RCM module consists of a 1 × 1 convolution that unifies the number of channels, followed by a residual-like block for feature refinement. This block also makes the network deeper, enabling it to capture more critical and sophisticated features.
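A possible PyTorch sketch of the RCM is shown below; the use of two 3 × 3 convolutions inside the residual block is an assumption based on the description above.

    import torch.nn as nn

    class ResidualConvModule(nn.Module):
        # 1x1 convolution to unify the number of channels, followed by a
        # residual-like block for feature refinement.
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.reduce = nn.Conv2d(in_ch, out_ch, 1)
            self.refine = nn.Sequential(
                nn.Conv2d(out_ch, out_ch, 3, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
                nn.Conv2d(out_ch, out_ch, 3, padding=1),
                nn.BatchNorm2d(out_ch))
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            x = self.reduce(x)
            return self.relu(x + self.refine(x))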

Fig. 4 Residual convolution module architecture [9]

All the modules presented above form the context aggregation network. This network offers an encoder/decoder-like architecture, with ResNet as the backbone adopted to extract robust feature maps [18]. The output of every stage of the ResNet backbone is passed through an RCM block to refine the output features. The last feature map is then passed through the CFM block to obtain multiscale feature information. Finally, the different generated feature maps are combined via the AMM module and refined by the RCM module. The per-class probability of each pixel is calculated using the softmax function as follows:

$$ {P}_k\left({x}_i^j\right)=\frac{\exp \left({h}_k\left({x}_i^j\right)\right)}{\sum_{n=1}^K\exp \left({h}_n\left({x}_i^j\right)\right)} $$
(4)

where \( {x}_i^j \) presents the jth pixel of the ith image, \( {h}_k\left({x}_i^j\right) \) is the network output for class k, and k ∈ {1, 2, …, K} with K the number of classes. The loss function is the normalized cross entropy. Figure 5 depicts the entire CAN with all the modules that compose it.
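In PyTorch, Eq. 4 and the cross-entropy loss can be expressed as in the short sketch below; the tensor shapes are illustrative.

    import torch
    import torch.nn as nn

    # logits: network output of shape (batch, K, H, W); target: label map (batch, H, W).
    logits = torch.randn(2, 3, 256, 256)          # dummy output, K = 3 classes
    target = torch.randint(0, 3, (2, 256, 256))   # dummy ground-truth labels

    probs = torch.softmax(logits, dim=1)          # per-pixel class probabilities (Eq. 4)
    loss = nn.CrossEntropyLoss()(logits, target)  # mean (normalized) cross entropy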

Fig. 5 Context aggregation network detailed architecture
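To show how the three modules fit together into the encoder-decoder structure of Fig. 5, the sketch below (reusing the ContextFuseModule, AttentionMixModule and ResidualConvModule classes from the earlier sketches) assembles them around a ResNet-50 backbone. The choice of ResNet-50, the 128-channel decoder width and the number of AMM stages are our assumptions; the description above only specifies ResNet as the backbone.

    import torch.nn as nn
    import torch.nn.functional as F
    from torchvision.models import resnet50

    class ContextAggregationNet(nn.Module):
        # Assumed assembly: every ResNet stage output -> RCM, deepest stage -> CFM
        # (+ RCM), AMM fusion from deep to shallow, then a 1x1 classifier.
        def __init__(self, num_classes=3, ch=128):
            super().__init__()
            b = resnet50(weights=None)
            self.stem = nn.Sequential(b.conv1, b.bn1, b.relu, b.maxpool)
            self.stages = nn.ModuleList([b.layer1, b.layer2, b.layer3, b.layer4])
            self.rcms = nn.ModuleList(
                [ResidualConvModule(c, ch) for c in (256, 512, 1024, 2048)])
            self.cfm = ContextFuseModule(2048, branch_ch=64)   # 5 * 64 = 320 channels out
            self.cfm_rcm = ResidualConvModule(320, ch)
            self.amms = nn.ModuleList(
                [AttentionMixModule(ch, ch, ch) for _ in range(3)])
            self.classifier = nn.Conv2d(ch, num_classes, 1)

        def forward(self, x):
            feats, f = [], self.stem(x)
            for stage in self.stages:
                f = stage(f)
                feats.append(f)
            laterals = [rcm(s) for rcm, s in zip(self.rcms, feats)]
            out = self.cfm_rcm(self.cfm(feats[-1]))
            for amm, lat in zip(self.amms, laterals[-2::-1]):  # deep to shallow
                out = amm(out, lat)
            out = self.classifier(out)
            return F.interpolate(out, size=x.shape[2:], mode='bilinear',
                                 align_corners=False)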

4 Experiments and results

In order to build an efficient and robust COVID-19 detection system, we make use of a deep learning network named the context aggregation network (CAN). We note that the proposed work applies the context aggregation network to the CT scan image segmentation task for the first time. Training and testing were performed on the COVID-x-CT dataset [16]. In the proposed work, CT scan images are processed and segmented using the CAN network to detect and segment ground glass opacity and consolidation areas, which are generally related to common pneumonia or COVID-19 cases.

In this work, we have comprehensively evaluated the efficiency and robustness of the context aggregation neural network applied to COVID-19 diagnosis and segmentation. Extensive experiments have been conducted, leading to very competitive results regarding segmentation precision and processing time. In the following, we detail all the experiments conducted in the proposed work.

4.1 Data preparation and augmentation

The context aggregation network (CAN) has been trained and evaluated on the COVID-x-CT dataset [16]. The COVID-x-CT dataset is more extensive than other CT datasets used for COVID-19 detection. It was derived from CNCB data collected from patients across all provinces of China. The dataset is divided into three parts: train, validation and test. Over 60,000 images were used for training, 21,036 for validation and 21,192 for testing the network. The validation set is used to study how well the network learns the data and to avoid overfitting problems.

Generally, one of the most common problems of deep learning algorithms is class imbalance. To overcome this problem, data augmentation has been applied using different techniques, including random cropping, brightness adjustment, horizontal flipping, vertical flipping and random image translation.
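A possible torchvision implementation of these augmentations is sketched below; the crop size and parameter ranges are illustrative assumptions, and for segmentation the same geometric transforms must also be applied to the ground-truth masks.

    import torchvision.transforms as T

    train_transform = T.Compose([
        T.RandomResizedCrop(512, scale=(0.8, 1.0)),       # random cropping
        T.ColorJitter(brightness=0.2),                    # brightness adjustment
        T.RandomHorizontalFlip(p=0.5),                    # horizontal flipping
        T.RandomVerticalFlip(p=0.5),                      # vertical flipping
        T.RandomAffine(degrees=0, translate=(0.1, 0.1)),  # random translation
        T.ToTensor(),
    ])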

4.2 Training details

All the proposed experiments are conducted under the PyTorch framework. We used the context aggregation network to build a deep learning-based COVID-19 CT scan image segmentation system. The batch size is set to 6 or 8. We used stochastic gradient descent (SGD) with momentum as the network optimizer during the training process. The learning rate and weight decay are set to 0.001 and 0.0005, respectively. We also adopted an adaptive learning rate during the training process to contribute to better training performance. Finally, we used cross entropy as the loss function and trained for 300,000 iterations. Table 1 provides all the settings used in the proposed experiments.
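The sketch below reproduces this configuration in PyTorch; the momentum value (0.9) and the polynomial decay schedule are assumptions, since only SGD with momentum and an adaptive learning rate are specified.

    import torch
    import torch.nn as nn

    model = nn.Conv2d(1, 3, 1)          # placeholder for the context aggregation network
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001,
                                momentum=0.9, weight_decay=0.0005)
    criterion = nn.CrossEntropyLoss()   # cross entropy loss
    max_iters = 300_000                 # number of training iterations

    def adjust_lr(optimizer, it, base_lr=0.001, power=0.9):
        # Example adaptive (poly) learning-rate schedule; the exact schedule
        # used in the experiments is not specified.
        lr = base_lr * (1 - it / max_iters) ** power
        for group in optimizer.param_groups:
            group['lr'] = lr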

Table 1 Experiments Settings

To assess the model performance, we evaluated our model using various metrics, including the F1 score, recall, precision and accuracy (Table 2).

Table 2 Evaluation metrics used
$$ Precision=\frac{TP}{TP+ FP} $$
(5)
$$ Sensitivity=\frac{TP}{TP+ FN} $$
(6)
$$ F1- score=2\ast \frac{Precision\ast Recall}{Precision+ Recall} $$
(7)
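Eqs. 5-7 translate directly into the small helper below, shown with hypothetical counts.

    def precision_recall_f1(tp, fp, fn):
        # Precision (Eq. 5), sensitivity/recall (Eq. 6) and F1 score (Eq. 7)
        # computed from true-positive, false-positive and false-negative counts.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        return precision, recall, f1

    # Example with hypothetical counts:
    p, r, f1 = precision_recall_f1(tp=950, fp=40, fn=60)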

To study the effectiveness of the proposed COVID-19 segmentation application, we varied the batch size. Batch size is one of the most critical hyperparameters for deep learning models; therefore, its effect on testing accuracy is studied. Table 3 shows the testing accuracy obtained for two different batch sizes, 6 and 8. Based on the results provided in Table 3, the highest testing accuracy was obtained when using a batch size of 8 instead of 6.

Table 3 Obtained testing accuracies for different batch sizes

The batch size adopted during the training process contributes to a more stable and better training process. We obtained a testing segmentation accuracy of 96.23%, which outperforms the state-of-the-art results. Figure 6 depicts a segmentation example of the context aggregation network when applied to a CT scan for common pneumonia and COVID-19 segmentation.

Fig. 6 A segmentation example: (a) original CT scan, (b) ground truth, (c) predicted segmentation

As shown in Fig. 6, red zones refer to ground glass opacity and green zones refer to consolidation areas. Both zones are detected and segmented very well. Table 4 compares state-of-the-art networks used for lesion detection and for common pneumonia and COVID-19 segmentation on CT scan images. We note that we used a specific data protocol in the proposed experiments (training: over 60,000 images, validation: 21,036 images, test: 21,192 images). For this reason, we reproduced the experiments with the same data partition. In addition, we implemented SegNet [7] and U-Net [27] to compare their performances with the context aggregation network before and after applying compression techniques.

As shown in Table 4, by using the context aggregation network for lesion segmentation for COVID-19 prediction, we obtained better results than those obtained with SegNet and U-Net on the COVID-x-CT dataset, both before and after applying compression techniques. Based on the obtained results, the proposed work shows competitive segmentation results, which can help the scientific community, and especially doctors, in the early diagnosis and prediction of COVID-19 and thus save more lives.

Table 4 Comparison of segmentation accuracies on the COVID-x-CT dataset

After COVID-19 CT scan segmentation, a post-segmentation diagnosis is performed. Three classes can be detected based on the segmented CT images: normal, COVID-19 and pneumonia. The confusion matrix is one of the most precise metrics for evaluating the effectiveness of this work. Figure 7 provides the confusion matrix, where p refers to pneumonia, c refers to COVID-19 and n refers to normal.
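For illustration, a three-class confusion matrix such as the one in Fig. 7 can be computed as in the sketch below; the label lists are dummy examples.

    from sklearn.metrics import confusion_matrix

    labels = ['n', 'p', 'c']                            # normal, pneumonia, COVID-19
    y_true = ['c', 'c', 'p', 'n', 'c', 'p', 'n', 'n']   # ground-truth classes (dummy)
    y_pred = ['c', 'c', 'p', 'n', 'p', 'p', 'n', 'c']   # predicted classes (dummy)
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    print(cm)                                           # rows: true class, columns: predicted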

Fig. 7 Confusion matrix

Based on the results of the confusion matrix, the developed system achieves good performance that can widely help the medical staff in the early diagnosis of COVID-19 and thus greatly reduce the spread of this virus.

5 Conclusion

In this work, we propose a new COVID-19 segmentation system for CT scan images. All the experiments have been conducted using the COVID-x-CT dataset. To develop the proposed system, we used the end-to-end context aggregation neural network, which provides an encoder-decoder-like architecture. This architecture is very efficient and robust in extracting the main features, as it combines a context-based module with attention-based multi-level feature fusion.

This work highlights the power and efficiency of deep learning models in combatting the COVID-19 epidemic. The proposed system presents a very efficient component for detecting COVID-19 from CT scans of suspected cases and combatting the pandemic. The system can extract the main features at different levels, making it suitable for rapidly detecting the two central regions related to COVID-19: ground glass opacity and consolidation. Furthermore, the proposed segmentation achieves a high accuracy of up to 96.23%.

The proposed work has not yet been evaluated under real-world conditions. As future work, the system accuracy can be further improved, as well as its time complexity. We can also apply different compression techniques in order to provide a lighter version of the proposed work.