Abstract
Automated brain lesion detection is an important and very challenging clinical diagnostic task, because lesions vary in size, shape, contrast and location. Deep learning has recently shown promising progress in many application fields, which motivates us to apply this technology to this important problem. In this paper we propose a novel, end-to-end trainable approach for brain lesion classification and detection using a deep Convolutional Neural Network (CNN). To investigate its applicability, we applied our approach to several brain diseases, including high- and low-grade glioma tumors, ischemic stroke and Alzheimer's disease, with brain Magnetic Resonance Images (MRI) as the input for the analysis. We propose a new operation unit that receives features from several projections of a subset of units of the bottom layer and computes a normalized l2-norm for the next layer. We evaluated the proposed approach on two different CNN architectures and a number of popular benchmark datasets. The experimental results demonstrate the superior ability of the proposed approach.
1 Introduction
Annually in the United States alone, 24,000 adults and 4,830 children are diagnosed with new cases of brain cancer. Many people die from brain tumors, multiple sclerosis, ischemic stroke and Alzheimer's disease. Medical imaging is an important tool for diagnosing brain diseases and for planning surgical or chemical treatment. Magnetic Resonance Imaging (MRI) can provide rich information before and during treatment, which is extremely helpful for evaluating the treatment and the progression of lesions. However, the raw data extracted from MR images is hard to apply directly to diagnosis because of its sheer volume. An accurate brain lesion detection and classification algorithm based on MR images might improve prediction accuracy and efficiency, which in turn enables better treatment planning and a more efficient diagnostic process. As mentioned by Menze et al. [1], the number of clinical studies on automatic brain lesion detection has grown significantly in the last several decades. Some brain lesions, such as ischemic strokes or even tumors, can appear with different shapes, varying sizes and unpredictable locations within the brain. Furthermore, different types of MRI machines with specific acquisition protocols may produce MR images with a wide variety of gray-scale representations of the same lesion cells.
Recent research has shown the strong ability of Convolutional Neural Networks (CNN) to learn hierarchical representations of image data without requiring any effort to design handcrafted features [2,3,4]. This technology has become very popular in the computer vision community for image classification [5, 6], object detection [7,8,9], medical image classification [10, 11] and segmentation [12, 13]. As mentioned by LeCun et al. in [2], different layers of a network are capable of different levels of abstraction and capture different amounts of structure from the patterns present in the image.
In this work we investigate the applicability of CNNs to brain lesion detection. Our goal is to localize and classify single as well as multiple anatomic regions in volumetric clinical images from various image modalities. To this end we propose a novel framework based on a CNN with an l2-norm unit. We provide a detailed evaluation of parameter variations and network architectures. We show that the l2-norm operation unit is robust to error variations in the classification task and is able to improve the prediction result. We conducted experiments on a number of brain MRI datasets, which demonstrate the excellent generalization ability of our approach. The contributions of this work can be summarized as follows:
- We propose a robust solution for brain lesion classification. We achieved promising results on four different brain diseases (the overall accuracy is over 95%).
- We applied multiple MRI modalities as network input, which improved the Dice coefficient by up to 30% on the ISLES benchmark.
- We implemented the l2-norm unit in the Caffe [14] framework for both CPU and GPU computation. The experimental results demonstrate the superior ability of the l2-norm unit in various tasks.
The rest of the paper is organized as follows: Sect. 2 describes the proposed approach, Sect. 3 presents the detailed experimental results, and Sect. 4 concludes the paper and gives an outlook on future work.
2 Methodology
In this section we describe our deep networks for the classification and detection tasks in detail, together with the core techniques applied in our approach. In the recent deep learning context, a deep neural network can be built according to two principles: modularity and residual learning. Modularity means composing the network from a set of repeatable smaller neural network units, which enables the learning of high-level visual representations. The bottleneck module of the Inception architecture [15] and the corresponding units in VGG-Net [16] can be considered typical examples. In such networks the width and depth have been significantly increased. Residual learning [6], on the other hand, takes a different view of each layer. Every subsequent layer is responsible for, in effect, fine-tuning the output of the previous layer by adding a learned “residual” to its input. This essentially drives the new layer to learn something different from what the input has already encoded. Another important advantage is that such residual connections help to handle the vanishing gradient problem in very deep networks [6]. Figure 1 shows an exemplary residual building block, where \(F(x) + x\) denotes the element-wise addition of the original input and the residual connection. The block on the left depicts the vanilla residual unit proposed by He et al. [6], whereas the one on the right is a dense block that we utilize in our classification network.
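As an illustration of the element-wise addition \(F(x) + x\), the following is a minimal sketch in plain Python. It is not the Caffe implementation used in this work: the names `residual_block` and `toy_residual` are hypothetical, and the toy mapping merely stands in for the learned convolutional layers.

```python
def residual_block(x, f):
    """Return F(x) + x: the learned residual added element-wise to the input."""
    fx = f(x)
    if len(fx) != len(x):
        # The identity shortcut only works when F preserves the input shape.
        raise ValueError("F(x) must have the same length as x")
    return [a + b for a, b in zip(fx, x)]


def toy_residual(x):
    """Hypothetical stand-in for the learned mapping F (not a trained layer)."""
    return [0.1 * v for v in x]


out = residual_block([1.0, 2.0, 3.0], toy_residual)
identity = residual_block([1.0, 2.0, 3.0], lambda x: [0.0] * len(x))
```

When the residual mapping outputs zeros, the block reduces to the identity, which is one way to see why adding such blocks does not have to degrade what earlier layers have already encoded.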
2.1 l2-Norm Unit
In linear algebra, the size of a vector v is called the norm of v. The two-norm (also known as the l2-norm, mean-square norm, or least-squares norm) of a vector v is defined by Eq. 2. Assume we have a 2D matrix \(X_{i,j}\) (cf. Eq. 1), which is the output for a specific patch \(a_{i,j}\) from the first convolution layer. Then, for each element in the forward or backward pass, we calculate the l2-norm as described by Eqs. 2 and 3. We treat the l2-norm operation as a pooling function and apply it to reduce the dimension of the learned representations, which can lead to better generalization. For example, in the classification task an input volume of size \(224\times 224\times 64\) is pooled by the l2-norm operator with filter size 2 and stride 2 into an output volume of size \(112\times 112\times 64\).
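The pooling behaviour described above can be sketched for a single channel as follows. This is a plain-Python illustration under stated assumptions, not the Caffe layer implemented in this work: the function name `l2_norm_pool` is hypothetical, the window value is taken as the square root of the sum of squares, and the optional `normalize` flag (dividing the sum by the window size) is only one possible reading of a "normalized" l2-norm.

```python
import math


def l2_norm_pool(channel, size=2, stride=2, normalize=False):
    """Pool one 2D feature map (a list of rows) with an l2-norm per window."""
    height, width = len(channel), len(channel[0])
    out_h = (height - size) // stride + 1
    out_w = (width - size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Sum of squared activations inside the current window.
            sq = 0.0
            for di in range(size):
                for dj in range(size):
                    v = channel[i * stride + di][j * stride + dj]
                    sq += v * v
            if normalize:
                # Assumed normalization: average the squares over the window.
                sq /= size * size
            row.append(math.sqrt(sq))
        pooled.append(row)
    return pooled


feature_map = [
    [3.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 2.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
pooled = l2_norm_pool(feature_map)

# Spatial size check for the 224 x 224 example with filter size 2 and stride 2.
big = l2_norm_pool([[0.0] * 224 for _ in range(224)])
```

With filter size 2 and stride 2 each spatial dimension is halved, so applying the function to every one of the 64 channels of a \(224\times 224\times 64\) volume yields \(112\times 112\times 64\).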
2.2 Brain Abnormality Classification
Recently, ResNet (Deep Residual Network) [6] has achieved state-of-the-art performance in object detection and other vision-related tasks. As mentioned above, we explored the ResNet architecture with the l2-norm unit for brain abnormality classification. Figure 2 depicts the network architecture. Our classification network takes 2D images with three channels, where each channel contains a gray-scale copy of the same size and the same plane from a different MRI modality, together with the respective class label \(l \in \{0, 1, \ldots, 4\}\). The gray-scale copies extracted from T1, T1c and FLAIR of the same MRI category are mapped to the Red, Green and Blue channels of a standard image container, respectively. The proposed network is strongly inspired by the vanilla ResNet block depicted in Fig. 1.
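The channel mapping can be pictured with the short sketch below. It assumes the three slices are already co-registered and given as equally sized lists of rows; the function name `stack_modalities` and the toy intensity values are hypothetical and not taken from the datasets.

```python
def stack_modalities(t1, t1c, flair):
    """Combine three gray-scale slices into one H x W image of (R, G, B) tuples.

    T1 goes to the red channel, T1c to green and FLAIR to blue.
    """
    shapes = {(len(s), len(s[0])) for s in (t1, t1c, flair)}
    if len(shapes) != 1:
        raise ValueError("all modality slices must have the same size")
    return [
        list(zip(row_r, row_g, row_b))
        for row_r, row_g, row_b in zip(t1, t1c, flair)
    ]


# Toy 2 x 2 slices with made-up intensities.
t1 = [[1, 2], [3, 4]]
t1c = [[5, 6], [7, 8]]
flair = [[9, 10], [11, 12]]
image = stack_modalities(t1, t1c, flair)
```

Each pixel of the resulting image carries the intensities of the same anatomical location in all three modalities, which is what lets a standard three-channel network consume them jointly.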
As shown in Fig. 2, we apply the l2-norm operation after the first convolution layer and before the first inner product layer. In the experiments we observed that the l2-norm layer has an effect similar to that of a pooling operator: it reduces the spatial size of the feature representations and extracts features that are not covered by standard pooling operators. This allows the network to learn more discriminative feature information, such as variance, from the data stream, which could improve the overall generalization ability of the model.
2.3 Brain Lesions Detection
Unlike image classification, object detection extracts location and region information of a target object within an image. Figure 3 shows our network for brain abnormality detection. In our workflow, we extract and apply multiple modalities from MRI images, where the images are sampled as 2D slices of various sizes from the axial, coronal and sagittal views. Inspired by the Fast R-CNN network [17], we build our CNN on a VGG-16 [16] style architecture as the feature extractor. Instead of using max-pooling and spatial max-pooling, we place the l2-norm unit after the second convolution (conv1-2) layer and before the first fully connected (inner product) layer, respectively. We utilize selective search [18] to generate object proposals, i.e. a set of object bounding boxes. The proposal sampling process is performed on top of the dense feature layer after layer conv5-3. We follow the solution suggested by Girshick et al. [17] to handle the heterogeneous collection of computed proposals and divide each of them into a pyramid grid of sub-windows. Here three pyramid levels \(4\times 4, 2\times 2, 1\times 1\) are used, and l2-norm “pooling” is applied in each sub-window to generate the corresponding output grid cell. Subsequently each output feature vector is fed into a sequence of fully connected layers, which is followed by two sibling output layers: the SVM (Support Vector Machine) classifier for object class estimation [19], and the bounding box regression layer to calculate the loss of the proposed object bounding boxes. The overall training is performed in a supervised manner, and the loss of the whole network is the sum of the losses from object classification and bounding box regression.
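To make the pyramid step concrete, the sketch below pools one channel of a proposal's feature map over \(4\times 4\), \(2\times 2\) and \(1\times 1\) grids with an l2-norm in every sub-window. It is an illustration only: the function name `pyramid_l2_pool` is hypothetical, and the floor/ceil rule for sub-window boundaries is an assumption borrowed from common spatial pyramid pooling practice rather than a detail stated in this paper.

```python
import math


def pyramid_l2_pool(region, levels=(4, 2, 1)):
    """Return a fixed-length vector of l2-norms over a pyramid of sub-windows."""
    height, width = len(region), len(region[0])
    features = []
    for n in levels:
        for i in range(n):
            # Assumed bin boundaries: floor for the start, ceil for the end.
            top = (i * height) // n
            bottom = max(top + 1, math.ceil((i + 1) * height / n))
            for j in range(n):
                left = (j * width) // n
                right = max(left + 1, math.ceil((j + 1) * width / n))
                sq = sum(
                    region[r][c] ** 2
                    for r in range(top, bottom)
                    for c in range(left, right)
                )
                features.append(math.sqrt(sq))
    return features


# Proposals of different sizes map to vectors of the same length (16 + 4 + 1).
square = pyramid_l2_pool([[1.0] * 8 for _ in range(8)])
odd = pyramid_l2_pool([[1.0] * 7 for _ in range(5)])
```

Because the output length depends only on the pyramid levels, proposals of arbitrary size can be passed to the same fully connected layers.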
3 Experimental Results
In the experiments we used real patient data from five popular benchmarks to evaluate the proposed methods. For the classification task we compiled a total of 1500 MRI images labeled as healthy, tumor-HGG, tumor-LGG, Alzheimer or multiple sclerosis. We used 80% of the data for training and 20% for testing. The IXI dataset [20] contains 600 MRI images from normal, healthy subjects. The MRI acquisition protocol for each subject includes six modalities, from which we used the T1, T2, PD and MRA images. The first column of Fig. 4 shows healthy brain images from the IXI dataset in the sagittal, coronal and axial sections. The BraTS2016 benchmark [1, 21] provides data in two parts: High and Low Grade Glioma (HGG/LGG) tumors. All images have been aligned to the same anatomical template and interpolated to \(1\,\mathrm{mm}^3\) voxel resolution. The training dataset consists of 220 HGG and 108 LGG MRI images; for each patient, T1, T1-contrast (T1c), T2 and FLAIR images as well as ground truth labeled by medical experts are provided. The Alzheimer's disease dataset comes from the Open Access Series of Imaging Studies (OASIS). It consists of a cross-sectional collection of 416 subjects aged 18 to 96. For each subject, 3 or 4 individual T1-weighted MRI scans were obtained in single scan sessions. 18 MRI images with multiple sclerosis from the ISBI challenge 2008 [22] were also used in the classification task. The ISLES benchmark 2016 [23] (Ischemic Stroke Lesion Segmentation) comes from the MICCAI challenge in two parts, of which we used only the SPES dataset with 30 brain images and 7 modalities. A visual overview of the applied datasets can be found in Fig. 4.
Because the MRI volumes in the BraTS and ISLES datasets do not possess an isotropic resolution, we prepared 2D slices in the sagittal, axial and coronal views. As mentioned by Havaei et al. [24], brain imaging data are unfortunately rarely balanced, because the lesion is small compared to the rest of the brain. For example, the volume of a stroke is rarely more than 1% of the entire brain, and a tumor (even a large glioblastoma) never occupies more than 4% of the brain. Training a deep network with imbalanced data often leads to a very low true positive rate, since the system becomes biased towards the over-represented class. To overcome this problem we selected MRI volumes containing lesions and augmented the training data using horizontal and vertical flipping and multiple scaling. Using the re-designed ResNet architecture described in Sect. 2, we achieved over 95% classification accuracy as shown in Table 1, while Fig. 5 shows the confusion matrix of the classification result. We also compared our result with the most recent deep learning based approaches as shown in Table 1, where the reference method also used the IXI and OASIS datasets. Figure 6 shows the learning curves of testing accuracy and of training and testing losses during the training process.
For the brain lesion detection experiment we used both the BraTS and ISLES datasets. We used 70% of the data for training, 10% for validation and 20% for testing. We expected that more general features could be learned from multiple modalities and that the testing accuracy would improve accordingly. The brain lesion detection results in Table 2 support this assumption: better detection results were achieved by increasing the number of data modalities used in model training. The detection result can be improved by 20% on the BraTS dataset and by 30% on the ISLES dataset.
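The flipping part of the augmentation can be sketched as below, assuming the flips are horizontal (left-right) and vertical (top-bottom) mirror images of each 2D slice. The helper names `flip_horizontal`, `flip_vertical` and `augment` are hypothetical, and the multiple-scaling step is omitted.

```python
def flip_horizontal(image):
    """Mirror a 2D slice left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror a 2D slice top to bottom."""
    return [row[:] for row in image[::-1]]


def augment(image):
    """Return the original slice together with its two flipped copies."""
    return [image, flip_horizontal(image), flip_vertical(image)]


slice_2d = [[1, 2], [3, 4]]
augmented = augment(slice_2d)
```

For the detection task, any bounding-box annotation would have to be mirrored in the same way as the slice it belongs to.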
From Table 2, we can also infer that the FLAIR modality is the most relevant one for identifying the complete tumor (Dice: 73.38%). However, the ISLES benchmark does not include this modality, which explains the lower accuracy on that dataset. This motivates us to work on generating the missing modalities in the future. The subjects in Fig. 7 are from our testing set, on which the model was not trained, so the detection results for these subjects give a good estimate of the model performance. Table 3 presents the evaluation results of the detection architectures with and without the l2-norm unit, which clearly show the superior ability of the proposed l2-norm operator. We are able to improve the detection performance significantly on both datasets by using this novel operator.
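For reference, the Dice coefficient reported above measures the overlap between a predicted region A and the ground truth B as \(2|A \cap B| / (|A| + |B|)\). A minimal sketch on flattened binary masks follows; the function name `dice_coefficient` and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two flattened binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


half_overlap = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
perfect = dice_coefficient([0, 1, 1], [0, 1, 1])
```

A value of 1 means the two regions coincide exactly, and 0 means they do not overlap at all.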
4 Conclusion
In this paper, we explored two important clinical tasks: brain lesion classification and detection. We proposed end-to-end trainable approaches based on state-of-the-art deep convolutional neural networks. We implemented a novel pooling operator, the l2-norm unit, which helps the network generalize and makes the learned model more robust. The applicability, model accuracy and generalization ability have been evaluated using a set of publicly available datasets. In future work we will further investigate the automatic segmentation of tumor regions based on the detection results.
References
Menze, B., Reyes, M., Van Leemput, K.: The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans. Med. Imaging 34, 1993–2024 (2014)
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)
Gulcehre, C., Cho, K., Pascanu, R., Bengio, Y.: Learned-norm pooling for deep feedforward and recurrent neural networks. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds.) ECML PKDD 2014. LNCS, vol. 8724, pp. 530–546. Springer, Heidelberg (2014). doi:10.1007/978-3-662-44848-9_34
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Szegedy, C., Toshev, A., Erhan, D.: Deep neural networks for object detection. In: Advances in Neural Information Processing Systems, pp. 2553–2561 (2013)
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587 (2014)
Ren, S., He, K., Girshick, R.B., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. CoRR abs/1506.01497 (2015)
Paul, J.S., Plassard, A.J., Landman, B.A., Fabbri, D.: Deep learning for brain tumor classification, vol. 10137, pp. 1013710-1–1013710-16 (2017)
El Abbadi, N.K., Kadhim, N.E.: Brain cancer classification based on features and artificial neural network. Brain 6(1) (2017)
Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). doi:10.1007/978-3-319-24574-4_28
Dai, J., He, K., Sun, J.: Instance-aware semantic segmentation via multi-task network cascades. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R.B., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. CoRR
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. CoRR (2015)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Girshick, R.B.: Fast R-CNN. CoRR abs/1504.08083 (2015)
Uijlings, J., van de Sande, K., Gevers, T., Smeulders, A.: Selective search for object recognition. Int. J. Comput. Vis. 104, 154–171 (2013)
Liu, G., Zhang, X., Zhou, S.: Multi-class classification of support vector machines based on double binary tree. In: Fourth International Conference on Natural Computation, ICNC 2008, vol. 2, pp. 102–105. IEEE (2008)
http://www.medinfo.cs.ucy.ac.cy/index.php/downloads/datasets/
Havaei, M., Davy, A., Warde-Farley, D., Biard, A., Courville, A., Bengio, Y., Pal, C., Jodoin, P.M., Larochelle, H.: Brain tumor segmentation with deep neural networks. Med. Image Anal. 35, 18–31 (2017)
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Rezaei, M., Yang, H., Meinel, C. (2017). Deep Neural Network with l2-Norm Unit for Brain Lesions Detection. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science(), vol 10637. Springer, Cham. https://doi.org/10.1007/978-3-319-70093-9_85
DOI: https://doi.org/10.1007/978-3-319-70093-9_85
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70092-2
Online ISBN: 978-3-319-70093-9
eBook Packages: Computer Science (R0)