IRDNU-Net: Inception residual dense nested u-net for brain tumor segmentation

AboElenein, Nagwa M.; Songhao, Piao; Afifi, Ahmed

doi:10.1007/s11042-022-12586-9


Published: 19 March 2022

Volume 81, pages 24041–24057, (2022)

Multimedia Tools and Applications


Abstract

Accurate segmentation of brain tumors is an essential stage in treatment planning. Fully convolutional neural networks, specifically encoder-decoder architectures such as U-Net, have proven successful in medical image segmentation. However, segmenting brain tumors with complex structure requires a deeper and wider model, which increases the computational complexity and may also cause the vanishing gradient problem. Therefore, in this work, we propose a novel encoder-decoder architecture called Inception Residual Dense Nested U-Net (IRDNU-Net). In this model, carefully designed Residual and Inception modules are used in place of the standard U-Net convolutional layers to increase the width of the model without increasing the computational complexity. Additionally, the encoder and decoder are connected via a sequence of Inception-Residual densely nested paths to extract more information and increase the depth of the network while reducing the number of network parameters. The proposed segmentation architecture was evaluated on two large brain tumor segmentation benchmark datasets, BraTS 2019 and BraTS 2020. It achieved a mean Dice similarity coefficient of 0.888 for the whole tumor region, 0.876 for the core region, and 0.819 for the enhancing region. Experimental results show that IRDNU-Net outperforms U-Net by 1.8%, 11.4%, and 11.7% in the whole tumor, core tumor, and enhancing tumor, respectively. Moreover, IRDNU-Net improves accuracy over the compared approaches, including on challenging cases such as small tumor regions, while using fewer parameters.


1 Introduction

Brain tumors are among the most deadly cancers worldwide [30]. The most prominent brain tumors [4] are gliomas, which are graded into High-Grade (HGG) and Low-Grade (LGG) gliomas. Based on pathological evaluation, gliomas comprise numerous sub-areas of heterogeneous histology, namely enhancing tumor, edema, and necrotic core. Magnetic Resonance Imaging (MRI) is one of the most important tools for assessing gliomas, as it provides detailed information about the tumor structure. The typical MRI sequences [21] include fluid-attenuated inversion recovery (FLAIR), T1-weighted, T1-weighted contrast-enhanced, and T2-weighted. Accordingly, brain tumor segmentation from multi-modality MRI is essential for evaluating tumor aggressiveness and responsiveness to glioma treatment, and it has beneficial applications in brain tumor diagnosis, tracking, and treatment. However, manual segmentation of brain tumors is time-consuming and vulnerable to human error. It also lacks reproducibility, which harms the effectiveness of patient management and can result in ineffective therapy [31]. Automatic brain tumor segmentation, on the other hand, can provide a more effective solution [29]. Deep learning has progressed immensely in this area in recent years [10, 23], and it is considered the state of the art in classification, segmentation, and detection applications [25]. Convolutional neural networks (CNNs) [3] in particular are one of the most popular techniques for building efficient segmentation approaches, as they can automatically learn the most useful and relevant features. However, accurate tumor segmentation remains a difficult task due to the heterogeneous appearance and multiple types of brain tumors, as well as the high variability of tumor size, shape, position, intensity, and contrast in different imaging modalities [34].
Therefore, in this paper we introduce a novel Inception Residual Dense Nested U-Net (IRDNU-Net) to address the insufficient segmentation precision for small-scale tumors while using fewer parameters. Experiments on two brain tumor segmentation datasets demonstrate that IRDNU-Net significantly improves accuracy compared to other models, especially for small tumors, with less computational complexity. The main contributions of this study can be summarized as follows:

Based on the U-Net architecture, we propose an efficient brain tumor segmentation approach called IRDNU-Net. It can extract more representative features from brain tumors, which enhances the segmentation accuracy, especially for small tumors.
In IRDNU-Net, to make the network structure wider, the standard convolutional layers used in the U-Net architecture are replaced by carefully designed Residual Inception modules. The IRDNU-Net encoder and decoder sub-networks are also linked by nested dense paths to increase the depth of the network.
We evaluate the proposed architecture using two brain tumor segmentation datasets, BraTS 2019 and BraTS 2020. The experimental results indicate that the proposed automated segmentation approach is accurate and computationally efficient.

Following this introduction, Section 2 presents a brief survey of the related work. Section 3 gives the details of the proposed architecture. Section 4 presents the experiments and results in detail. A discussion is provided in Section 5, and the conclusion is given in Section 6.

2 Related work

Utilizing deep learning for brain tumor segmentation has attracted increasing attention. Pereira et al. [32] proposed a deeper 2D CNN with small 3 × 3 kernels for automated brain tumor segmentation, which ranked second in the BraTS 2015 Challenge.

Havaei et al. [16] constructed a cascaded architecture that combines local and global paths. However, their architecture loses spatial continuity and requires a large storage space, which leads to low segmentation performance. Zhao et al. [39] combined fully convolutional networks (FCNs) with Conditional Random Fields (CRFs) for brain tumor segmentation and achieved competitive performance with only three imaging modalities rather than four.

Building on the FCN architecture, Ronneberger et al. [33] constructed an asymmetric, fully convolutional network called U-Net, comprising a contracting path that extracts spatial image features and an expanding path that generates a segmentation map from the encoded features. U-Net has been widely used for medical image segmentation tasks [17, 28], and brain tumor segmentation is no exception. Dong et al. [13] established a 2D U-Net approach for automatic tumor segmentation that utilizes the Dice loss function; they obtained equivalent results for the complete tumor region and better results for the core tumor regions. Cahall et al. [5] proposed a new image segmentation framework using a U-Net architecture with Inception modules to perform multi-scale feature extraction, together with a new loss function based on a modified Dice similarity coefficient. Li et al. [26] presented a novel end-to-end approach for brain tumor segmentation that uses an Inception module in each block and an efficient cascade training method for segmenting sub-regions; however, the approach suffers from data imbalance. Cheng et al. [7] introduced a Memory-Efficient Cascade 3D U-Net, which uses fewer down-sampling channels to achieve high segmentation precision with less memory. Ibtehaz et al. [20] proposed MultiRes blocks to build a more robust and reliable approach; although not perfect, it outperforms the classical U-Net by a moderate margin in most cases. Zhou et al. [41] presented an architecture with deep supervision and re-designed skip pathways; however, its number of parameters and its training time are significantly higher. Brain tumors are known to have complex forms and various sizes, which leads to the presence of small tumors. U-Net continuously reduces the image dimension during the down-sampling process, resulting in low segmentation accuracy for small-scale tumors.

To further enhance the segmentation performance of U-Net, several modules, such as the Multi residual module, the Dense module [12], and the Inception module [36], have been added to the baseline model, which has facilitated the development of methods for brain tumor segmentation. Ahmad et al. [2] introduced an approach based on residual-dense connections in a U-Net model; however, the final scores did not improve enough. In general, although some recent models have achieved some level of competitive success, they still have limitations. For example, 1) they are ineffective at identifying smaller tumors, and 2) the most advanced models need enormous computational resources to achieve high segmentation accuracy. To address these shortcomings, we developed a novel Inception Residual Dense Nested U-Net. The Inception-Residual block allows us to make the proposed network substantially wider, while the residual connection makes the network easier to train. Meanwhile, the nested dense paths increase the depth of the network, improve the network results, and minimize the computational complexity. In addition, compared to other state-of-the-art methods, our IRDNU-Net achieves comparable results for the whole tumor region and superior results for the core tumor and enhancing tumor regions, with fewer parameters.

3 Material and methods

3.1 Datasets

The proposed segmentation approach is evaluated using two benchmark datasets, the BraTS 2019 and BraTS 2020 brain tumor MRI datasets, which were published by the Multi-Modal Brain Tumor Segmentation Challenge (BraTS) [11]. The BraTS 2019 training dataset contains 335 cases (259 HGG and 76 LGG), while the BraTS 2019 validation dataset has 125 unlabeled cases. The BraTS 2020 dataset is larger than the BraTS 2019 dataset, with a training set of 367 cases. The datasets are co-registered, re-sampled to 1 mm³, and skull-stripped. The size of each MRI volume is 240 × 240 × 155, and each case has FLAIR, T1, contrast-enhanced T1 (T1c), and T2 volumes. Three tumor regions are evaluated: the whole tumor (WT) region, which includes all intra-tumor regions, i.e., necrosis, non-enhancing tumor, edema, and enhancing tumor; the tumor core (TC) region, which incorporates necrosis, non-enhancing tumor, and enhancing tumor; and the enhancing tumor (ET) region. For training, 25,000 2D patches of size 128 × 128 × 4, where the last dimension represents the four modalities, are randomly sampled from each case. From the BraTS 2019 training dataset, 205 cases are used to form a training set of 5,125,000 patches, and from the BraTS 2020 training dataset, 160 cases are used to form 4,000,000 patches, as shown in Table 1.
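The patch counts quoted above follow directly from the number of cases and the per-case patch budget. The sketch below checks that arithmetic and shows one possible way to draw a random patch position. It is illustrative only: the function names, the uniform sampling, and the choice of the 155-slice axis as the slicing direction are assumptions, since the text only states that patches are randomly sampled.

```python
import random

PATCH = 128                                   # patch side length stated in the text
SLICE_H, SLICE_W, N_SLICES = 240, 240, 155    # volume size stated in the text
PATCHES_PER_CASE = 25_000                     # patches sampled per case


def sample_patch_origin(rng):
    """Return (slice index, top, left) of one random 128 x 128 window.

    Assumption: slices are taken along the 155-voxel axis and positions
    are drawn uniformly; the paper does not specify the sampling scheme.
    """
    z = rng.randrange(N_SLICES)
    top = rng.randrange(SLICE_H - PATCH + 1)
    left = rng.randrange(SLICE_W - PATCH + 1)
    return z, top, left


def total_patches(n_cases, per_case=PATCHES_PER_CASE):
    """Total number of training patches for a given number of cases."""
    return n_cases * per_case


rng = random.Random(0)
print(sample_patch_origin(rng))   # one random window position
print(total_patches(205))         # BraTS 2019 subset: 5125000
print(total_patches(160))         # BraTS 2020 subset: 4000000
```

Each sampled window would then be cut from all four modalities at the same position to give one 128 × 128 × 4 patch.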

Table 1 Training data partitioning


3.2 Methods

The proposed IRDNU-Net model for brain tumor segmentation builds on the same encoder-decoder architecture as the U-Net model. However, as shown in Fig. 1, we deepen and widen the encoder-decoder architecture by incorporating Inception-Residual modules and re-designing the skip pathways as dense skip connections. This architecture is used for brain tumor segmentation from four MRI modalities, namely FLAIR, T1, T2, and T1c.

3.2.1 Inception Residual Dense Nested U-Net (IRDNU-Net)

U-Net is an encoder-decoder architecture. The encoder gradually lowers the spatial dimension of the feature maps while capturing more high-level semantic characteristics, whereas the decoder restores object details and spatial dimensions. Therefore, to increase segmentation performance, it is important to capture more high-level characteristics in the encoder and to preserve more spatial information in the decoder. Based on the encoder-decoder architecture, the Inception-Residual module, and the Dense model, Fig. 1a shows the proposed architecture, denoted IRDNU-Net. The proposed network structure is composed of an encoding branch and a decoding branch. In each encoder block, we replace the two convolution layers of the original U-Net model with the proposed Inception-Residual block, described in the next section, followed by a 2 × 2 max-pooling operation for down-sampling. At each down-sampling step we double the number of feature channels. Correspondingly, the same number of up-sampling steps is performed in the decoding branch to restore the spatial size of the segmented output. Each up-sampling step is implemented by a 2 × 2 transposed convolution, and the number of feature channels is halved. In U-Net, the decoder receives the feature maps of the encoder directly; in our proposed model, however, dense nested paths built from Inception-Residual blocks connect the encoder to the decoder. We allocate a parameter P to each Inception-Residual block to control the number of convolution filters in the block, which keeps a clear relation between the number of parameters of the baseline U-Net model and that of the proposed one. The value of P is calculated as follows:

$$ P= F \times \alpha $$

(1)

Where F represents the number of the filters in the U-Net layer, and α represents a scalar coefficient.

Similar to the baseline U-Net, we set the number of filters, F, to 32, 64, 128, 256, and 512. To minimize the number of parameters in our model, we set α = 1.8. Inside the Inception-Residual block, we assign ${\left [\frac {P} {12}\right ]}$, ${\left [\frac {P}{6}\right ]}$, ${\left [\frac {P}{4}\right ]}$, and ${\left [\frac {P}{2}\right ]}$ filters to the four consecutive convolutional layers, respectively. In our experiments, we found that this combination balances the reduction in the number of parameters against the improvement in segmentation accuracy. As in the U-Net architecture, P doubles or halves after each pooling or deconvolution operation.
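As a rough illustration of (1) and the filter allocation above, the following sketch tabulates, for each encoder level, the spatial size of the feature maps, the U-Net filter count F, and the four per-layer filter counts derived from P = F × α. Two points are assumptions rather than statements from the text: the square brackets are read as truncation to an integer, and the 128 × 128 training patch is taken as the network input size. The function names are hypothetical.

```python
ALPHA = 1.8                              # scalar coefficient from the text
UNET_FILTERS = [32, 64, 128, 256, 512]   # F at each encoder level
INPUT_SIZE = 128                         # assumed input side length (patch size)


def block_filters(f, alpha=ALPHA):
    """Filter counts of the four consecutive conv layers in one block.

    P = F * alpha, split as P/12, P/6, P/4 and P/2.
    Assumption: the brackets [.] mean truncation to an integer.
    """
    p = f * alpha
    return tuple(int(p / d) for d in (12, 6, 4, 2))


def level_plan(input_size=INPUT_SIZE):
    """List (spatial size, F, per-layer filters) for every encoder level."""
    plan = []
    size = input_size
    for f in UNET_FILTERS:
        plan.append((size, f, block_filters(f)))
        size //= 2                       # 2 x 2 max-pooling halves the size
    return plan


for size, f, filters in level_plan():
    print(f"{size:>3} x {size:<3}  F={f:<3}  filters={filters}")
```

Under these assumptions the first level (F = 32, P = 57.6) gets 4, 9, 14, and 28 filters, and the deepest level operates on 8 × 8 feature maps.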

The Rectified Linear Unit (ReLU) activation function [15] is used for all convolutional layers in this network. The output layer uses the sigmoid activation function, as in the U-Net model. Figure 1 shows a diagram of the proposed Inception-Residual Dense Nested U-Net model. The details of our architecture are listed in Table 2.

Table 2 Inception-residual dense nested U-Net architectural details


Inception-Residual Block

The Inception module [37] has been shown experimentally to enhance the visual representation. To increase the segmentation network’s ability, we also use a residual connection, owing to its effectiveness in the segmentation of medical images [14]. The goal is to combine feature maps from kernels of different sizes, which widens the network and allows it to learn multi-scale features. Based on the Inception-Residual module [36], a modified Inception-Residual U-Net, denoted IResU-Net, is proposed in this work. Figure 1c shows the modified Inception-Residual block, which includes multiple sets of 1 × 1, 7 × 1, and 1 × 7 convolutions. Compared with the original Inception-Residual module, we first add a batch normalization (BN) [22] layer after each convolutional layer to alleviate the vanishing gradient problem. Second, 1 × 1 convolutions are applied on the identity skip connections to preserve a similar relationship between the number of filters in the base U-Net and in our model. As can be seen from Fig. 1c, the output of the convolution layers in the right branch (1 × 1, 1 × 7, 7 × 1) is concatenated with the output of the convolution layer in the middle branch (1 × 1), and the result is then added to the output of the convolution layer in the left branch (1 × 1). Suppose that y_l is the output of the l-th layer, h_n×n(⋅) is a convolutional layer with an n × n kernel, h_b(⋅) is the BN layer, f_r(⋅) is the ReLU activation function, and ∘ denotes concatenation. The output of each Inception-Residual module can then be calculated as in (2):

$$ y_{l+1}=h_{b}(f_{r}(h_{b}(h_{1\times1}(h_{b}(h_{1\times1}(y_{l}))\circ h_{b}(h_{7\times 1}(h_{1\times 7}(h_{1\times 1}(y_{l}))))))+h_{b}(h_{1\times1}(y_{l})))) $$

(2)
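Equation (2) may be easier to follow when read branch by branch. The decomposition below is a sketch based on the description of Fig. 1c; the intermediate symbols m, r, c, and s are introduced here for illustration only and are not part of the original formulation.

$$ \begin{array}{ll} m = h_{b}(h_{1\times1}(y_{l})) & \quad \text{middle branch} \\ r = h_{b}(h_{7\times 1}(h_{1\times 7}(h_{1\times 1}(y_{l})))) & \quad \text{right branch} \\ c = h_{b}(h_{1\times1}(m \circ r)) & \quad \text{concatenation followed by a } 1\times 1 \text{ convolution} \\ s = h_{b}(h_{1\times1}(y_{l})) & \quad \text{left branch (shortcut)} \\ y_{l+1} = h_{b}(f_{r}(c + s)) & \quad \text{residual addition, ReLU, and BN} \end{array} $$

Read this way, the 1 × 1 convolution in c brings the concatenated features to the same number of channels as the shortcut s, so that the element-wise addition is well defined.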

Dense Nested Paths

In the original U-Net, the decoder receives the encoder feature maps directly. The main purpose of dense connections, in contrast, is to make the network deeper and wider without causing the vanishing gradient problem. So, in IRDNU-Net, we also re-model the skip pathways to enhance encoder-decoder connectivity by using dense nested paths (denoted DNU-Net) with Inception-Residual blocks, which increases the model capacity for accurate brain tumor segmentation. We use three, two, and one Inception-Residual blocks in the four Dense Nesting paths, respectively; in the blocks of the four Nested paths, we use 32, 64, 128, and 256 filters, respectively, as seen in Fig. 1a. Figure 1b shows the skip pathway between Inception-Residual Block^0,0 and Inception-Residual Block^0,4, which comprises a dense block of three Inception-Residual blocks. Before each Inception-Residual block, a concatenation layer merges the outputs of the preceding Inception-Residual blocks in the same dense block with the up-sampled output of the corresponding lower dense block. The skip pathway is formulated as follows. Let b^i,j denote the output of an Inception-Residual block, where i indexes the down-sampling layer along the encoder and j indexes the Inception-Residual layer of the dense block along the skip pathway. Then b^i,j is calculated according to:

$$ b^{i,j} =\left\{ \begin{array}{ll} IR(b^{i-1 ,j})& \quad ,j=0. \\ IR\left( \left[ \left[b^{i,k}\right]_{k=0}^{j-1},U(b^{i+1,j-1}) \right] \right) & \quad ,j>0. \end{array}\right. $$

(3)

Where the function IR(⋅) is the Inception-Residual operation, U(⋅) represents an up-sampling layer, and [ ] represents a concatenation layer. Blocks at j = 0 receive only one input, from the previous encoder layer. Blocks at j = 1 receive two inputs, both from two consecutive levels of the encoder network, and blocks at j > 1 receive j + 1 inputs: the outputs of the previous j blocks on the same skip path and the up-sampled output from the lower skip path. Figure 1b illustrates (3) by displaying how the feature maps move along the top skip path of our proposed architecture.
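The wiring rule in (3) can be sketched as a small function that lists the inputs of a node b^{i,j}. This is only a bookkeeping illustration, not the network itself: the function name and the tuple labels are hypothetical, and treating the network input as the predecessor of b^{0,0} is an assumption.

```python
def skip_inputs(i, j):
    """List the inputs of node b^{i,j} according to (3).

    ("b", i, k) stands for the output of block b^{i,k};
    ("U", i, k) stands for the up-sampled output of block b^{i,k}.
    """
    if j == 0:
        # Encoder node: a single input from the previous encoder level.
        # Assumption: b^{0,0} takes the network input itself.
        return [("input",)] if i == 0 else [("b", i - 1, 0)]
    # Dense node: all earlier blocks on the same skip path ...
    same_path = [("b", i, k) for k in range(j)]
    # ... plus the up-sampled output from the path one level below.
    from_below = ("U", i + 1, j - 1)
    return same_path + [from_below]


# Inputs of the last dense block on the top skip path.
print(skip_inputs(0, 3))
```

For j > 0 the list always has j + 1 entries, which matches the input counts described above.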

3.3 Combined loss function

In a deep learning system, the loss function is crucial when we are dealing with extremely imbalanced data. A proper selection of the loss function may also improve the model accuracy. In this study, we utilize a combined loss function, defined in (4), that integrates the weighted Cross-Entropy loss (WCE), defined in (5) and Generalized Dice Loss (GDL) [35], defined in (6), to alleviate the effect of the class imbalance problem.

$$ CL=WCE + GDL $$

(4)

$$ WCE=\frac{-1} {k} \sum\limits_{k} {\sum\limits_{i}^{L}} w_{i} g_{ik} \log (p_{ik}) $$

(5)

$$ GDL=1-2\frac{{\sum\limits_{i}^{L}} w_{i}\sum\limits_{k} g_{ik} p_{ik}} {{\sum\limits_{i}^{L}} w_{i}\sum\limits_{k} (g_{ik}+p_{ik})} $$

(6)

Where L represents the total number of labels, k runs over the samples in a batch (in the factor 1/k of (5) it denotes the batch size), and w_i is the weight of the i-th label. For the generalized Dice loss, g_ik and p_ik represent the pixel values of the binary ground truth image and the binary segmented image, respectively.
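A minimal pure-Python sketch of the combined loss in (4) to (6) is given below for small lists of one-hot labels and predicted probabilities. It is not the training code used in this work: the function names, the clipping constant eps, and the choice to pass the label weights w in as given values are assumptions made for illustration.

```python
import math


def weighted_cross_entropy(g, p, w, eps=1e-7):
    """Weighted cross-entropy as in (5).

    g[k][i] is the one-hot ground truth and p[k][i] the predicted
    probability for sample k and label i; w[i] is the label weight.
    Probabilities are clipped at eps (an assumption) to avoid log(0).
    """
    total = 0.0
    for gk, pk in zip(g, p):
        for wi, gi, pi in zip(w, gk, pk):
            total += wi * gi * math.log(max(pi, eps))
    return -total / len(g)


def generalized_dice_loss(g, p, w, eps=1e-7):
    """Generalized Dice loss as in (6); eps guards against a zero denominator."""
    num = 0.0
    den = 0.0
    for i, wi in enumerate(w):
        num += wi * sum(gk[i] * pk[i] for gk, pk in zip(g, p))
        den += wi * sum(gk[i] + pk[i] for gk, pk in zip(g, p))
    return 1.0 - 2.0 * num / (den + eps)


def combined_loss(g, p, w):
    """Combined loss CL = WCE + GDL as in (4)."""
    return weighted_cross_entropy(g, p, w) + generalized_dice_loss(g, p, w)


truth = [[1, 0], [0, 1]]               # two samples, two labels
uncertain = [[0.5, 0.5], [0.5, 0.5]]   # uninformative prediction
weights = [1.0, 1.0]                   # equal label weights (illustrative)
print(combined_loss(truth, truth, weights))      # close to zero
print(combined_loss(truth, uncertain, weights))  # clearly larger
```

With a perfect prediction both terms vanish (up to eps), while the uninformative prediction is penalized by both the cross-entropy and the Dice term.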

4 Experiments and results

4.1 Experimental setup

All experiments were performed using the Keras framework with the TensorFlow back-end. We used stochastic gradient descent (SGD) [9] as the optimizer with a batch size of 8. The model was trained for 20 epochs because the validation loss did not change afterward. The momentum was 0.8, and the initial learning rate was 0.0001, decayed with a decay factor of 0.1. Training was carried out on an Intel Core i7 3.5 GHz machine with an NVIDIA GeForce GTX 1070 GPU.

4.2 Evaluation metrics

In this work, we utilized the Dice score, Sensitivity, and Specificity metrics to assess the segmentation results. The Dice similarity score measures the overlap between the segmented lesion and the ground truth segmentation. Sensitivity is the true positive rate, and Specificity is the true negative rate. The three metrics are calculated using the following equations, respectively:

$$ DSC=\frac{2TP}{(FP+2TP+FN)} $$

(7)

$$ Sensitivity =\frac{TP}{TP+FN} $$

(8)

$$ Specificity=\frac{TN}{TN+FP} $$

(9)

Where TP, TN, FN, and FP denote the numbers of true positive, true negative, false negative, and false positive pixels, respectively.
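The three metrics can be sketched for flat binary masks as follows. Specificity is computed here as the true negative rate, TN / (TN + FP), following the verbal definition given above. The function names are hypothetical, and the sketch assumes masks for which none of the denominators is zero.

```python
def confusion_counts(pred, truth):
    """Return (TP, FP, FN, TN) for two flat binary masks of equal length."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    return tp, fp, fn, tn


def dice(pred, truth):
    """Dice similarity coefficient, 2TP / (FP + 2TP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    return 2 * tp / (fp + 2 * tp + fn)


def sensitivity(pred, truth):
    """True positive rate, TP / (TP + FN)."""
    tp, _, fn, _ = confusion_counts(pred, truth)
    return tp / (tp + fn)


def specificity(pred, truth):
    """True negative rate, TN / (TN + FP)."""
    _, fp, _, tn = confusion_counts(pred, truth)
    return tn / (tn + fp)


pred = [1, 1, 0, 0]    # toy prediction mask
truth = [1, 0, 1, 0]   # toy ground truth mask
print(dice(pred, truth), sensitivity(pred, truth), specificity(pred, truth))
# 0.5 0.5 0.5
```

In the toy example each of TP, FP, FN, and TN equals 1, so all three metrics evaluate to 0.5.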

4.3 Experimental results

Our experiments are divided into three parts, carried out on the BraTS 2019 training dataset, the BraTS 2019 validation dataset, and the BraTS 2020 testing dataset. Evaluation results for the BraTS 2019 training and validation datasets are published on the challenge leaderboard website. The results for the BraTS 2020 testing dataset, in contrast, come from our own individual runs.

4.3.1 Evaluation results on BraTS2019 training dataset

In this experiment, 205 cases from the BraTS 2019 dataset are used. 80% of these cases (164 subjects) are used for model training and the remaining 20% (41 subjects) are used for validation, as listed in Table 1. The evaluation results of the proposed IRDNU-Net on the BraTS 2019 training dataset are presented in Table 3. Quantitatively, the proposed network achieved a DSC of 0.888 for the whole tumor, 0.876 for the core tumor, and 0.819 for the enhancing tumor. The mean, standard deviation, median, and 25th and 75th percentiles of all metrics are shown in Table 3. DSC, Specificity, and Sensitivity were measured using the online evaluation system on the BraTS 2019 leaderboard website.

Table 3 Segmentation results on Training BraTS 2019 Dataset


Ablation Study

To study the effect of the different modules and enhanced architectures, we perform an ablation study on the DNU-Net, IRU-Net, and IRDNU-Net models, which are our enhancements of U-Net. The results are summarized in Table 4 in terms of the Dice similarity coefficient (DSC). As can be seen from Table 4, IRDNU-Net produces the most accurate segmentation results among the four models, with an improvement over the standard U-Net of 1.8% for the whole tumor, 11.4% for the core tumor, and 11.7% for the enhancing tumor. It outperforms DNU-Net by 1.2% for the whole tumor, 4.3% for the core tumor, and 7.8% for the enhancing tumor. Compared with IRU-Net, IRDNU-Net achieves gains of 0.6% for the whole tumor, 6.7% for the core tumor, and 5% for the enhancing tumor.

Table 4 Ablation study on the training BraTS 2019 Dataset and comparison segmentation results with baselines


Comparative Study

The proposed IRDNU-Net is also compared to other related brain tumor segmentation approaches to assess its efficiency. This comparison is presented in Table 5. IRDNU-Net outperforms the other top networks in DSC for the core tumor and enhancing tumor, but it is slightly lower than the approaches proposed by Li et al. [26] and Hu et al. [18] for the whole tumor. In Li et al. [26], the network structure is optimized by designing and refining the U-Net architecture. Hu et al. [18] apply multi-cascaded convolutional neural networks with fully connected conditional random fields. In comparison to the methods of Zhang et al. [38] and Chen et al. [8], the proposed IRDNU-Net model achieves enhanced segmentation efficiency. Zhao et al. [39] utilized conditional random fields to increase efficiency; nevertheless, our IRDNU-Net achieves gains over their method of 0.8% for the whole tumor, 2.7% for the core tumor, and 4.9% for the enhancing tumor, without applying any post-processing strategy. Compared to the Memory-Efficient Cascade 3D U-Net [7], our IRDNU-Net is better for the core tumor and enhancing tumor by large margins of 5.6% and 5.4%, respectively.

Table 5 Comparison of segmentation results on the BraTS 2019 Training Dataset with typical methods


In terms of sensitivity, our IRDNU-Net achieves a score of 0.883 for the whole tumor, 0.869 for the core tumor, and 0.857 for enhancing tumor segmentation. In particular, it obtains the best sensitivity score for core tumor segmentation, while the best sensitivity scores for the whole and enhancing tumors are obtained by Hu et al. [18]. The comparative sensitivity results indicate, to a certain extent, the efficacy of IRDNU-Net in the segmentation of small tumors. In terms of specificity, our IRDNU-Net achieved 0.994 for the whole tumor, 0.998 for the core tumor, and 0.997 for the enhancing tumor. In general, the IRDNU-Net model achieves competitive efficiency and outperforms other state-of-the-art techniques.

4.3.2 Evaluation results on BraTS2019 validation dataset

We use 66 cases from the validation dataset to take part in the BraTS 2019 competition. The segmentation efficiency of our algorithm was calculated in terms of DSC, Specificity, and Sensitivity using the online evaluation system on the challenge leaderboard website. These results are available in the leaderboard section of the challenge under the title “Nagwasalim”. The experimental results are shown in Table 6. Quantitatively, the DSC is 0.865 for the whole tumor, 0.864 for the core tumor, and 0.806 for the enhancing tumor. The mean, standard deviation, median, and 25th and 75th percentiles of all metrics are also shown in Table 6.

Table 6 Segmentation results on BraTS 2019 validation dataset


Ablation Study

Table 7 presents a comparison of the segmentation results with the baselines, and Table 8 presents the comparative results with other standard approaches. The comparison between U-Net, DNU-Net, IRU-Net, and IRDNU-Net shown in Table 7 is consistent with that in Table 4. IRDNU-Net achieves higher performance than U-Net: it improves on U-Net by 0.1% and 11.8% for whole tumor and core tumor segmentation, and by 11.2% for the enhancing tumor, demonstrating its good effect on small tumor segmentation. IRDNU-Net outperforms DNU-Net by 0.3%, 4.3%, and 6.5% on the whole, core, and enhancing tumors, respectively. It exceeds IRU-Net by a 6.1% margin in core tumor segmentation. These comparative results demonstrate the ability of IRDNU-Net to segment brain tumors.

Table 7 Ablation study on the Validation BraTS 2019 Dataset and comparison segmentation results with baselines


Table 8 Comparison of segmentation results on the BraTS 2019 Validation Dataset with typical methods


Comparative Study

Table 8 compares the effectiveness of the suggested technique with other advanced techniques on the 66 validation cases; IRDNU-Net provides highly competitive performance relative to other advanced brain tumor segmentation approaches. IRDNU-Net achieves DSC values of 86.5% on the whole tumor, 86.4% on the core tumor, and 80.6% on the enhancing tumor. Specifically, our approach achieves the highest DSC values for the core and enhancing tumors and the highest core tumor sensitivity. Hu et al. [18], who proposed multi-cascaded convolutional neural networks, achieved a slightly higher score on whole tumor segmentation; still, their models cannot achieve good segmentation results for each view. Our IRDNU-Net achieves better segmentation efficiency in DSC and Sensitivity than some recent approaches by Hu et al. [19] and Aboelenein et al. [1].

Figure 2 shows sample results of the standard U-Net model compared to the proposed one. The red regions in this figure denote necrosis, the green regions denote edema, and the yellow regions denote enhancing tumor. From left to right, the figure shows the FLAIR image, the ground truth, the U-Net segmentation result, and the IRDNU-Net segmentation result. It can be seen from Fig. 2 that IRDNU-Net evidently produces the best performance in the segmentation of brain tumors. Figure 3 shows the results for Dice, Sensitivity, and Specificity on the validation data. The boxplots show the minimum, lower quartile, median, upper quartile, and maximum. Points outside the whiskers are referred to as outliers. From the boxplots, it is evident that our algorithm achieves considerably high segmentation accuracy in most cases.

4.3.3 Evaluation results on BraTS 2020 training dataset

We also run an experiment on the BraTS 2020 training dataset to further demonstrate the effectiveness of our approach. Here, we use 160 cases from the training dataset to train our brain tumor segmentation models and 56 cases for testing, as listed in Table 1. Table 9 shows the ablation study and the comparison with the baselines, and Fig. 4 presents bar plots of the average DSC on the BraTS 2020 test dataset for the three tumor regions. In this experiment, IRDNU-Net achieves higher segmentation efficiency in the three tumor regions than its baseline U-Net: it exceeds U-Net by 0.9% for the whole tumor, 4.7% for the core tumor, and 5% for the enhancing tumor. Compared with DNU-Net, IRDNU-Net increases the DSC scores by 0.7%, 0.8%, and 3%, respectively, on the three tumor regions. Our models thus gain accuracy over the baselines, which is attributable to the efficacy of combining Inception-Residual blocks with the dense nested U-Net in improving the segmentation of small brain tumors. In addition, Fig. 5 shows a scatter diagram of the validation set. It reveals that our algorithm performs well for most brain images. Because the BraTS 2020 data varies widely and the class distribution is severely imbalanced, some outliers reduce the average score.

Table 9 Ablation study on the BraTS 2020 test dataset and segmentation results with baselines


Moreover, the proposed model decreases the number of trainable parameters, reducing the computational cost. Table 10 compares the trainable parameters and the time for one epoch of the proposed model and related models. IRDNU-Net has only 5.91M trainable parameters, the fewest among all competing methods. Lin et al. [27] have 24.62M parameters, which is more than four times the number in the proposed method. The training time per epoch of Zhou's method exceeds that of our method by a small margin. These results show that IRDNU-Net requires fewer computational resources, with only 5.91M parameters. Overall, our suggested approach obtains a good balance between brain tumor segmentation accuracy and the number of trainable parameters on the BraTS 2019 and BraTS 2020 datasets.

Table 10 The number of trainable parameters and the average time for one epoch


5 Discussion

Accurate segmentation of gliomas has garnered considerable interest from medical doctors and researchers as a critical component of tumor diagnosis, treatment preparation, and subsequent assessment. Since manual tumor segmentation is laborious and time-consuming, developing an accurate automated segmentation method is very important. Therefore, in this study, we developed an Inception-Residual model with nested dense paths based on U-Net to achieve high segmentation accuracy with fewer parameters. We evaluated the network on the BraTS 2019 and BraTS 2020 datasets. These datasets are composed of MRI images taken from various institutions, gathered by regular clinical assessment of preoperative scans of glioblastoma patients. The proposed network achieved better results than U-Net and other techniques. The results in Fig. 2 indicate that, although the size, shape, location, and intensity of the tumors in these samples differ, the proposed network also enhances the segmentation performance for small tumor regions. Generally, the results of the proposed architecture are comparable to the ground truth. Tables 5 and 8 also show small gaps in the evaluation metrics between the training and validation datasets, because we used only 66 validation cases due to the memory limitations of the current GPU. Figure 3 shows boxplots for the validation data of the BraTS 2019 dataset. The variance of the tumor core (TC) specificity is larger than that of the enhancing tumor (ET) specificity; the most likely reason is that the network sometimes incorrectly predicts the whole tumor as the tumor core due to the influence of the LGG tumor samples, resulting in increased variance. It is worth noting that, given the memory limitations of the current GPU and the multi-modality nature of MRI, the time for one epoch of the proposed method is around 5 min.

6 Conclusion

This paper introduced an efficient IRDNU-Net model for automated brain tumor segmentation from multi-modality MRI images. This approach is an efficient extension of the successful idea of encoder-decoder fully convolutional neural networks. First, we integrate the Inception module and residual units into each block of the U-Net to enhance brain tumor segmentation performance. A series of dense nested pathways then connects the encoder and decoder sub-networks. The re-modeled skip connections aim to minimize the semantic gap between the feature maps of the encoder and decoder networks. We assessed our proposed approach using the BraTS 2019 and BraTS 2020 datasets. The experimental results showed that IRDNU-Net surpassed the U-Net and other typical brain tumor segmentation approaches by a large margin, and that it achieves this segmentation accuracy with fewer parameters. However, we built our segmentation model on 2D slices because of limited computational resources. In the future, we expect to work on 3D networks while seeking a balance between high accuracy and computational resources. In addition, we will use a more powerful GPU. Also, for further evaluation, we will extend our model to other medical image segmentation tasks.
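
The dense nested pathways can be pictured as a grid of nodes between the encoder and the decoder. The sketch below is an assumption, not the authors' implementation. It enumerates the connectivity of a UNet++-style nested grid [Zhou et al. 2018, in the reference list], where node (i, j) sits at resolution level i and position j along the skip path. The function name and the depth value are hypothetical, and the Inception-residual blocks that would sit at each node are omitted.

```python
def nested_inputs(depth):
    """Map each node (i, j) to the nodes feeding it in a UNet++-style grid.

    i is the resolution level (0 = finest) and j is the position along the
    skip path (j = 0 is the encoder backbone).
    """
    graph = {}
    for j in range(depth):
        for i in range(depth - j):
            if j == 0:
                # Encoder backbone: each level is fed by the downsampled
                # output of the level above it.
                graph[(i, 0)] = [(i - 1, 0)] if i > 0 else []
            else:
                same_level = [(i, k) for k in range(j)]  # dense skip inputs
                from_below = [(i + 1, j - 1)]            # upsampled input
                graph[(i, j)] = same_level + from_below
    return graph


graph = nested_inputs(depth=4)
for node in sorted(graph):
    print(node, "<-", graph[node])
```

Under this assumed topology the final decoder node at the finest level receives every earlier feature map on its own level plus one upsampled map from the level below. This is the sense in which nested skip paths narrow the semantic gap between encoder and decoder features.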

References

Aboelenein NM, Songhao P, Koubaa A, Noor A, Afifi A (2020) HTTU-Net: hybrid two track U-Net for automatic brain tumor segmentation. IEEE Access 8:101406–101415
Ahmad P, Qamar S, Hashemi S R, Shen L (2019) Hybrid labels for brain tumor segmentation. In: International MICCAI brainlesion workshop, pp 158–166
Bakas S, Reyes M, Jakab A, Bauer S, Rempfler M, Crimi A, Eaton-Rosen Z (2018) Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge. arXiv preprint arXiv:1811.02629
Bauer S, Wiest R, Nolte L P, Reyes M (2013) A survey of MRI-based medical image analysis for brain tumor studies. Physics in Medicine and Biology, 58
Cahall DE, Rasool G, Bouaynaya N C, Fathallah-Shaykh HM (2019) Inception modules enhance brain tumor segmentation. Front Comput Neurosci 13:44
Chandra S, Vakalopoulou M, Fidon L, Battistella E, Estienne T, Sun R, Paragios N (2018) Context aware 3D CNNs for brain tumor segmentation. In: International MICCAI brainlesion workshop. Springer, Berlin, pp 299–310
Cheng X, Jiang Z, Sun Q, Zhang J (2019) Memory-efficient cascade 3D U-Net for brain tumor segmentation. In: International MICCAI brainlesion workshop. Springer, Berlin, pp 242–253
Chen W, Liu B, Peng S, Sun J, Qiao X (2018) S3D-UNet: Separable 3D U-Net for brain tumor segmentation. In: International MICCAI brainlesion workshop. Springer, Berlin, pp 358–368
Kingma DP, Ba J (2014) Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980
Dargan S, Kumar M, Ayyagari MR, Kumar G (2019) A survey of deep learning and its applications: A new paradigm to machine learning. Archives of Computational Methods in Engineering, pp 1–22
Dataset: CBICA. https://www.med.upenn.edu/cbica/brats2019/data.html
Dolz J, Gopinath K, Yuan J, Lombaert H, Desrosiers C, Ayed I B (2018) HyperDense-Net: A hyper-densely connected CNN for multi-modal image segmentation. IEEE Trans Med Imaging 38(5):1116–1126
Dong H, Yang G, Liu F, Mo Y, Guo Y (2017) Automatic brain tumor detection and segmentation using u-net based fully convolutional networks. In: Annual conference on medical image understanding and analysis. Springer, Berlin, pp 506–517
Drozdzal M, Vorontsov E, Chartrand G, Kadoury S, Pal C (2016) The importance of skip connections in biomedical image segmentation. In: Deep learning and data labeling for medical applications. Springer, Berlin, pp 179–187
Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, Cambridge
Havaei M, Davy A, Warde-Farley D, Biard A, Courville A, Bengio Y, Larochelle H (2017) Brain tumor segmentation with deep neural networks. Med Image Anal 35:18–31
He H, Zhang C, Chen J, Geng R, Chen L, Liang Y, Xu Y (2021) A hybrid-attention nested UNet for Nuclear segmentation in histopathological images. Front Mol Biosci 8:6
Hu K, Gan Q, Zhang Y, Deng S, Xiao F, Huang W, Gao X (2019) Brain tumor segmentation using multi-cascaded convolutional neural networks and conditional random field. IEEE Access 7:92615–92629
Hu Y, Xia Y (2017) 3D deep neural network-based brain tumor segmentation using multimodality magnetic resonance sequences. In: International MICCAI brainlesion workshop. Springer, Berlin, pp 423–434
Ibtehaz N, Rahman M S (2020) MultiResUNet: Rethinking the U-Net architecture for multimodal biomedical image segmentation. Neural Netw 121:74–87
Işın A, Direkoǧlu C, Şah M (2016) Review of MRI-based brain tumor image segmentation using deep learning methods. Procedia Comput Sci 102:317–324
Ioffe S, Szegedy C (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: International conference on machine learning. PMLR, pp 448–456
Ker J, Wang L, Rao J, Lim T (2017) Deep learning applications in medical image analysis. IEEE Access 6:9375–9389
Kermi A, Mahmoudi I, Khadir MT (2018) Deep convolutional neural networks using U-Net for automatic brain tumor segmentation in multimodal MRI volumes. In: International MICCAI brainlesion workshop. Springer, Berlin, pp 37–48
Kumar M, Gupta S, Kumar K, Sachdeva M (2020) Spreading of COVID-19 in India, Italy, Japan, Spain, UK, US: A prediction using ARIMA and LSTM model. Digit Gov: Res Prac 1(4):1–9
Li H, Li A, Wang M (2019) A novel end-to-end brain tumor segmentation method using improved fully convolutional networks. Comput Biol Med 108:150–160
Lin F, Wu Q, Liu J, Wang D, Kong X (2020) Path aggregation U-Net model for brain tumor segmentation. Multimedia Tools and Applications, pp 1–14
Lou A, Guan S, Loew M H (2021) DC-UNet: Rethinking the U-Net architecture with dual channel efficient CNN for medical image segmentation. In: Medical imaging 2021: image processing 11596: 115962T, international society for optics and photonics
Menze BH, Jakab A, Bauer S, Kalpathy-Cramer J, Farahani K, Kirby J, Van Leemput K (2014) The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans Med Imaging 34(10):1993–2024
Miranda-Filho A, Piñeros M, Soerjomataram I, Deltour I, Bray F (2017) Cancers of the brain and CNS: global patterns and trends in incidence. Neuro-oncology 19(2):270–280
Olabarriaga SD, Smeulders AW (2001) Interaction in the segmentation of medical images: a survey. Med Image Anal 5(2):127–142
Pereira S, Pinto A, Alves V, Silva C A (2016) Brain tumor segmentation using convolutional neural networks in MRI images. IEEE Trans Med Imaging 35(5):1240–1251
Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. Springer, Berlin, pp 234–241
Saouli R, Akil M, Kachouri R (2018) Fully automatic brain tumor segmentation using end-to-end incremental deep neural networks in MRI images. Comput Methods Prog Biomed 166:39–49
Sudre C H, Li W, Vercauteren T, Ourselin S, Cardoso M J (2017) Generalised dice overlap as a deep learning loss function for highly unbalanced segmentations. In: Deep learning in medical image analysis and multimodal learning for clinical decision support. Springer, Berlin, pp 240–248
Szegedy C, Ioffe S, Vanhoucke V, Alemi A (2017) Inception-v4, inception-resnet and the impact of residual connections on learning. In: Proceedings of the AAAI conference on artificial intelligence, vol 31, p 1
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1–9
Zhang J, Jiang Z, Dong J, Hou Y, Liu B (2020) Attention Gate ResU-Net for automatic MRI brain tumor segmentation. IEEE Access 8:58533–58545
Zhao X, Wu Y, Song G, Li Z, Zhang Y, Fan Y (2018) A deep learning model integrating FCNNs and CRFs for brain tumor segmentation. Med Image Anal 43:98–111
Zhou C, Ding C, Wang X, Lu Z, Tao D (2020) One-pass multi-task networks with cross-task guided attention for brain tumor segmentation. IEEE Trans Image Process 29:4516–4529
Zhou Z, Siddiquee M M R, Tajbakhsh N, Liang J (2018) Unet++: A nested u-net architecture for medical image segmentation. In: Deep learning in medical image analysis and multimodal learning for clinical decision support. Springer, Berlin, pp 3–11

Author information

Authors and Affiliations

School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
Nagwa M. AboElenein & Piao Songhao
Faculty of Computers and Information, Menoufia University, Menoufia, 32511, Egypt
Nagwa M. AboElenein & Ahmed Afifi
Department of Computer Science, College of Computer Science and Information Technology, King Faisal University, P.O. Box 400, Al-Ahsa, 31982, Saudi Arabia
Ahmed Afifi

Corresponding author

Correspondence to Piao Songhao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

AboElenein, N.M., Songhao, P. & Afifi, A. IRDNU-Net: Inception residual dense nested u-net for brain tumor segmentation. Multimed Tools Appl 81, 24041–24057 (2022). https://doi.org/10.1007/s11042-022-12586-9

Received: 19 March 2021
Revised: 22 May 2021
Accepted: 31 January 2022
Published: 19 March 2022
Issue Date: July 2022
DOI: https://doi.org/10.1007/s11042-022-12586-9
