1 Introduction

Brain tumors are among the major causes of mortality worldwide. A brain tumor is a group of cells that grows abnormally by multiplying rapidly inside or around the brain. Besides increasing the pressure in the brain, tumors also cause abnormal neurological symptoms, which can lead to the death of the patient. Early diagnosis of brain tumors is imperative to prevent permanent brain damage or death [1, 2]. Brain tumor detection is a complicated task because tumors vary in both structure and intensity. Advances in imaging systems have greatly enhanced the interpretation of medical images [3, 4]. Magnetic resonance imaging (MRI) is preferred over other imaging modalities for brain tumor analysis and monitoring [5]. To detect an abnormality, physicians visually scan these images to ensure a reliable result. In some cases, complex anatomy and the varying abilities of physicians may result in interpretation errors [6]. In this regard, Computer-Aided Detection (CADe) systems have taken a major place in routine clinical work to assist medical experts in decision-making. These systems aim to enhance the diagnostic capabilities of physicians and reduce the time required for an accurate diagnosis by providing a computer output as a second opinion [7]. The goal of a CADe system for brain images is to help radiologists avoid missing a tumor by detecting and marking suspicious areas in an image [8]. Different CADe systems may apply different steps, but a CADe system typically comprises two principal stages: (1) pre-processing and (2) segmentation. The objective of image pre-processing is to enhance the quality of the data by eliminating noise and other useless information from the raw images.
Segmentation is an important step in the CADe process, because the effectiveness of subsequent tasks, including feature extraction and classification, depends highly on the quality of the region of interest (ROI). The main purpose of the segmentation stage in brain tumor detection is to accurately separate the tumor parenchyma from other tissues [9]. Recently, Deep Learning (DL) [10, 11] technologies have played a crucial role in CADe and have demonstrated remarkable abilities in detecting a variety of diseases, such as skin cancer [12], breast cancer [13] and glaucoma [14]. Using a large dataset for training helps the learned model achieve an accuracy comparable to that of human experts [15].

Unfortunately, many application fields, such as medical imaging diagnosis systems, do not have access to big data. Moreover, the majority of medical data are imbalanced: samples with abnormalities are usually scarcer, which leads to unsatisfying results. Most deep learning methods assume an equal occurrence of classes and do not consider the misclassification cost in a general classification process. This imbalanced distribution biases these models toward the majority class, while their ability to detect positive cases is fairly weak. Therefore, traditional data augmentation techniques have been introduced to generate new synthetic data and overcome this shortage of data. Synthetic data can be obtained in several ways; making simple adjustments to the visual data is common [16]. Generative Adversarial Networks (GANs) [17] are one of the most powerful data augmentation approaches; they aim to enhance the size and quality of training datasets and to build more accurate detection models. This type of generative network has proven its ability to generate very convincing samples of a particular class as a result of the game between a discriminator and a generator.

Another challenge for AI on medical data is detecting objects such as tumors in an MRI image. With advances in machine learning techniques, different types of DL models have been proposed to detect abnormalities in brain images [18,19,20,21]. An exhaustive review of these techniques and their application in the medical domain can be found in Liu et al. [22]. Advanced deep learning techniques such as U-Net and SegNet are used in CADe systems to detect the region of interest (ROI), that is, the region that contains an abnormality, in order to classify it. Meanwhile, a newer architecture based on adversarial networks, named CycleGAN, has shown satisfying results in object detection.

The objective of this paper is to investigate different variants of Generative Adversarial Networks to solve these two problems: using a GAN to synthesize images of the minority class, since it is one of the best-performing techniques for data augmentation, and investigating CycleGAN in a new domain, brain tumor detection.

The remainder of this paper is organized as follows: Sect. 2 reviews related work on GAN architectures for data augmentation and synthesis. Section 3 presents the concepts used, and Sect. 4 describes the proposed approach and its main parts. Experimental results are presented in Sect. 5. Finally, a conclusion and future work are outlined in Sect. 6.

2 Related Work

This field has been approached from different points of view. For instance, some works have focused on data augmentation or data segmentation to improve the classification model using GANs. In this section, we review some of the major brain studies reported in the literature. Most of these studies work with Magnetic Resonance Imaging (MRI). Important studies have been published on cancer, epilepsy and Alzheimer’s disease. These studies used different models for either data augmentation or data segmentation. For instance, Moeskops et al. [23] used adversarial training and dilated convolutions for brain MRI segmentation to improve its performance (their average Dice coefficient (DC) for 7 classes is 0.85 ± 0.01 using the dilated network). On the other hand, Borne et al. [24] used a 3D U-Net on a heterogeneous training dataset, achieving good results of 85%. Rezaei et al. [25] used a cGAN together with a semantic segmentation convolutional neural network, which improved performance for brain tumor segmentation. Several studies on segmentation and image synthesis are presented in Table 1.

Table 1. Segmentation and image synthesis publications.

3 Used Concepts

  • Generative Adversarial Networks (GANs) were introduced as a solution for generating images using perceptron-based neural networks [21].

    GANs consist of two models: the generator (G) and the discriminator (D). These neural networks can be likened to an artist who paints and a critic who criticizes the work and advises the artist on how to improve it. The generator is the artist. It generates data from a random noise vector z sampled from a distribution p(z), which is Gaussian or uniform. The discriminator is the critic. It checks how realistic the generated data are and gives feedback to the generator. The input of D is therefore the generated fake data G(z) and the real data X used for training. Figure 1 presents the structure of a basic GAN model.

    The objective is to minimize the generator’s loss \(L_{G}\) and maximize the discriminator’s loss \(L_{D}\).

  • The U-Net architecture was proposed for image segmentation [12] and later improved to better serve medical images. As its name suggests, this symmetric architecture has a “U” shape. It consists of two paths: the down-sampling “encoder” (for contraction) and the up-sampling “decoder” (for expansion). The bottleneck is where the two paths meet.

  • Many variants have been derived from the GAN, such as CycleGAN [13], which is designed for image-to-image translation. Like a basic GAN, CycleGAN has a discriminator and a generator, but for image-to-image translation it uses two generators and two discriminators in order to handle two image domains. This architecture makes the data flow more complex. Transformations take place from domain A to domain B by \(G_{B}\) and from domain B to domain A by \(G_{A}\), which amounts to two reciprocal mappings.

Images \(X_{fA}\) are generated from images \(X_{B}\) of domain B by \(G_{A}\) using domain A characteristics. On the other hand, images \(X_{fB}\) are generated from images \(X_{A}\) of domain A by \(G_{B}\) using domain B characteristics. Images of domain A are identified by discriminator \(D_{A}\) and images of domain B are identified by discriminator \(D_{B}\) (Figs. 2 and 3).
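The two reciprocal mappings described above can be illustrated with a toy sketch. The functions `g_b` and `g_a` below are hypothetical stand-ins for the generators \(G_{B}\) and \(G_{A}\) (simple arithmetic maps rather than trained networks), and an "image" is just a short list of pixel values. The point is only to show the data flow A → B → A and how a reconstruction error can be measured.

```python
# Toy stand-ins for the two CycleGAN generators (hypothetical, not trained networks).
# g_b plays the role of G_B (domain A -> domain B),
# g_a plays the role of G_A (domain B -> domain A).

def g_b(x_a):
    """Translate a domain-A 'image' (list of pixel values) to domain B."""
    return [2.0 * p + 1.0 for p in x_a]

def g_a(x_b):
    """Translate a domain-B 'image' back to domain A (exact inverse of g_b)."""
    return [(p - 1.0) / 2.0 for p in x_b]

def l1_distance(u, v):
    """Mean absolute difference between two images of equal length."""
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)

x_a = [0.0, 0.25, 0.5, 1.0]      # a sample from domain A
x_fb = g_b(x_a)                  # fake domain-B image, X_fB in the text
x_rec = g_a(x_fb)                # mapped back to domain A
cycle_error = l1_distance(x_a, x_rec)
```

Because the toy maps are exact inverses, `cycle_error` is zero here; with learned generators this error is generally non-zero and is what a cycle-consistency term penalizes.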

Fig. 1. GAN model components.

Fig. 2. U-Net architecture (Zhang et al. 2008 [26]).

Fig. 3. Segmentation using CycleGAN.

4 Proposed Approach

Figure 4 presents the architecture of the proposed system. First, the data are preprocessed in order to improve the segmentation results. A GAN is then applied to the minority data class to generate synthetic images. Once the dataset is balanced, CycleGAN is applied to it to detect the tumors, along with U-Net as a reference model. The system operates in two stages:

The training stage: during this stage, the data were augmented using the GAN, and the generated brain tumor MRI images were then passed as input to the next step, the segmentation. The dataset was segmented using CycleGAN and U-Net. As a result, two models were obtained.

The decision stage: during this stage, we used the CycleGAN-based model and the U-Net-based model to classify the input images. Each model generates a contour of the detected object.
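As a rough illustration of what balancing the dataset involves before segmentation, the sketch below computes how many synthetic images each class would need to match the largest class. The helper name `synthetic_needed`, the class names and the counts are all hypothetical and chosen only for the example; they are not figures from the dataset used in this paper.

```python
def synthetic_needed(class_counts):
    """Return how many synthetic samples each class needs to match the largest class."""
    target = max(class_counts.values())
    return {name: target - n for name, n in class_counts.items()}

# Hypothetical counts, for illustration only (not taken from the dataset).
counts = {"normal": 1000, "tumor": 300}
needed = synthetic_needed(counts)
```

With these made-up counts, the minority class would need 700 GAN-generated images and the majority class none.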

4.1 GAN for Generating Synthesis Images of the Minority Data Class

As mentioned in the previous section, the GAN has two models: the generator and the discriminator.
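For reference, in the standard GAN formulation the two models are coupled through a single minimax value function. The expression below is the usual textbook form, given here as an assumption about the setup since the paper does not write it out; \(p_{data}\) denotes the distribution of the real images X, and V is a symbol introduced only for this illustration.

$$\begin{aligned} \min _{G} \max _{D} V(D, G) = \mathbb {E}_{X \sim p_{data}}[\log D(X)] + \mathbb {E}_{z \sim p(z)}[\log (1 - D(G(z)))] \end{aligned}$$

The discriminator is trained to increase V by pushing D(X) toward 1 and D(G(z)) toward 0, while the generator is trained to decrease V by making D(G(z)) larger.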

Discriminator:

This model is a classifier that is trained on two kinds of input: once on the real images and once on the images produced by the generator. At the start of training, it is easy for the discriminator to classify the samples, since the generator outputs noise. We compare the discriminator’s output with the true labels (real or generated) to calculate the loss for this cycle. This loss is used to adjust the weights of the discriminator, so that it classifies more precisely in the next cycle. This network consists of three linear layers. The activation function of the output layer is a sigmoid, so that the output is a probability between zero and one that the image is real.

Fig. 4. Architecture of the proposed approach.

Generator:

This network is made of three linear layers. The activation function of the output layer is tanh. It generates a batch of fake images that are passed to the discriminator. If the latter classifies them as “fake”, which will be the case at the beginning since the generator starts from random noise, we use the loss to update the weights of this network. The discriminator, for its part, is trained so that D(X) is close to 1, since X are real samples. As the generator improves, D(G(z)) moves away from 0, and at the end the generator maps a low-dimensional noise space to a high-dimensional space of realistic data.
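The interplay between the two losses can be made concrete with a small numerical sketch. It assumes binary cross-entropy on the sigmoid output of the discriminator and the common "non-saturating" generator loss (label 1 on generated samples); the paper does not state which loss variant was used, so both choices, as well as the function names and the probability values, are assumptions made for illustration.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one prediction; prob is clipped to avoid log(0)."""
    eps = 1e-7
    p = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    """Average BCE with label 1 for real samples and label 0 for generated ones."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Average BCE with label 1 on generated samples (the generator wants D fooled)."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)

# Early training (made-up outputs): D easily spots the noise-like fakes.
early = (discriminator_loss([0.9, 0.95], [0.05, 0.1]), generator_loss([0.05, 0.1]))
# Later training (made-up outputs): fakes are convincing, D(G(z)) is near 0.5.
late = (discriminator_loss([0.6, 0.55], [0.45, 0.5]), generator_loss([0.45, 0.5]))
```

Under these assumptions, the discriminator loss is lower early than late, and the generator loss is higher early than late, which matches the qualitative behaviour described for the two networks.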

4.2 CycleGAN and UNet for Data Segmentation

The CycleGAN’s generator performs the segmentation. It generates segmentation maps of brain tumors from MRI images. The discriminator compares the generated images with the real images segmented by health experts. This process continues until the generator produces segmented images that are close to the real ones and the discriminator is no longer able to distinguish whether the images are real or fake. The CycleGAN objective function is:

$$\begin{aligned} L(G_{A}, G_{B}, D_{A}, D_{B}) = {}&L_{GAN}(G_{A}, D_{A}, X_{B}, X_{A}) + L_{GAN}(G_{B}, D_{B}, X_{A}, X_{B}) \\&+ L_{cyc}(G_{A}, G_{B}, X_{A}, X_{B}) \end{aligned}$$
(1)

Our objective is:

$$\begin{aligned} \arg \min _{G_{A}, G_{B}} \max _{D_{A},D_{B}} L(G_{A}, G_{B}, D_{A}, D_{B}) \end{aligned}$$
(2)
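The individual terms of this objective are not expanded in the text. Assuming the standard CycleGAN formulation with an L1 reconstruction penalty, and keeping the convention used earlier that \(G_{B}\) maps domain A to domain B and \(G_{A}\) maps domain B to domain A, the cycle-consistency term can be written as:

$$\begin{aligned} L_{cyc}(G_{A}, G_{B}, X_{A}, X_{B}) = \mathbb {E}_{x_{A} \sim X_{A}}\big [\Vert G_{A}(G_{B}(x_{A})) - x_{A}\Vert _{1}\big ] + \mathbb {E}_{x_{B} \sim X_{B}}\big [\Vert G_{B}(G_{A}(x_{B})) - x_{B}\Vert _{1}\big ] \end{aligned}$$

Under the same assumption, the adversarial term that pairs \(G_{A}\) with \(D_{A}\) (written \(L_{GAN}^{A}\) here, a shorthand introduced only for this illustration) takes the usual GAN form, and the term for domain B is obtained by exchanging the roles of A and B:

$$\begin{aligned} L_{GAN}^{A} = \mathbb {E}_{x_{A} \sim X_{A}}[\log D_{A}(x_{A})] + \mathbb {E}_{x_{B} \sim X_{B}}[\log (1 - D_{A}(G_{A}(x_{B})))] \end{aligned}$$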

In the U-Net architecture, the encoder (down sampling path) is used to capture the context of the image. This path is a stack of convolutions and max pooling [15]. The decoder (up sampling path) is used for a precise localization via transposed convolutions.
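The contraction and expansion of the two paths can be followed with simple shape arithmetic. The sketch below assumes that each encoder level halves the feature-map side with 2×2 max pooling, that each decoder level doubles it with a stride-2 transposed convolution, and that the convolutions are padded so they do not change the size. The function name, the input size and the depth are illustrative choices, not the configuration used in this paper.

```python
def unet_spatial_sizes(input_size, depth):
    """Trace the feature-map side length through a U-Net-like encoder/decoder.

    Assumes each encoder level halves the size (2x2 max pooling) and each
    decoder level doubles it (stride-2 transposed convolution); convolutions
    are assumed to be padded so they keep the size unchanged.
    """
    if input_size % (2 ** depth) != 0:
        raise ValueError("input_size must be divisible by 2**depth")
    encoder = [input_size // (2 ** level) for level in range(depth)]
    bottleneck = input_size // (2 ** depth)
    decoder = [bottleneck * (2 ** level) for level in range(1, depth + 1)]
    return encoder, bottleneck, decoder

encoder, bottleneck, decoder = unet_spatial_sizes(256, 4)
# encoder: [256, 128, 64, 32], bottleneck: 16, decoder: [32, 64, 128, 256]
```

The decoder sizes mirror the encoder sizes in reverse, which is what gives the architecture its symmetric "U" shape.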

5 Experiments and Results

5.1 Dataset

The dataset used, from Figshare, is a reference for tumor diagnosis [18]. It contains 3064 T1-weighted contrast-enhanced images from 233 patients, covering three types of tumors: glioma, meningioma, and pituitary tumor. These data were collected in China, at Nanfang Hospital and Tianjin Medical University, over five years (2005 to 2010) (Figs. 5 and 6).

Fig. 5. Architecture of the used CycleGAN.

Fig. 6. Brain tumor images.

5.2 Augmentation of the Minority Data Class Using GAN

The minority data class consists of brain MRIs with abnormalities. After trying different parameters, a precise brain MRI image showing a frontal lobe tumor was generated from random noise. The result shown in Fig. 7 was obtained after only one hour of GAN training, for 3000 epochs.

Fig. 7. GAN generated image after 3000 epochs (one hour).

Figure 8 presents the GAN’s loss. The generator’s loss decreases and becomes almost stable over 5000 iterations, while the discriminator’s loss increases over the iterations. As mentioned in Sect. 3, the desired outcome is to minimize the generator’s loss and maximize the discriminator’s loss.

Fig. 8. GAN’s loss.

5.3 Data Segmentation Using CycleGAN and U-Net

In order to detect brain tumors in MRI images, CycleGAN is applied to the balanced dataset. Figure 9 shows the CycleGAN segmentation results.

Fig. 9. MRI brain image, CycleGAN segmented result, prediction of the corresponding segmented MRI.

5.4 Evaluation

To evaluate the segmentation performance, we used the ROC, DICE and precision (PR) measures. Table 2 shows that CycleGAN outperformed U-Net: CycleGAN scores 0.9, 0.76 and 0.74, and U-Net scores 0.79, 0.75 and 0.7, for ROC, PR and DICE respectively.

Table 2. Quantitative comparison of the models, using ROC, PR and DICE (brain tumor).

Fig. 10. Data segmentation model’s performance.

The CycleGAN performance reaches its peak within the first 1000 iterations but never stabilizes (even after 20000), while the U-Net performance decreases over the first 200 iterations, then increases to reach its peak at 1000 iterations and stabilizes.
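For readers less familiar with the overlap measures used in this evaluation, the sketch below shows how DICE and precision can be computed for a pair of binary masks. It is a minimal illustration on flat lists of 0/1 values with made-up masks; the function names are hypothetical and this is not the evaluation code used to produce the reported scores.

```python
def dice_coefficient(pred, truth):
    """Dice = 2*|P ∩ T| / (|P| + |T|) for flat binary masks (lists of 0/1)."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 1.0 if total == 0 else 2.0 * intersection / total

def precision(pred, truth):
    """Precision = TP / (TP + FP); returns 0.0 when nothing is predicted positive."""
    true_pos = sum(p * t for p, t in zip(pred, truth))
    predicted_pos = sum(pred)
    return 0.0 if predicted_pos == 0 else true_pos / predicted_pos

# Made-up masks: 1 marks a tumor pixel, 0 marks background.
truth = [0, 1, 1, 1, 0, 0]
pred = [0, 1, 1, 0, 1, 0]
dice = dice_coefficient(pred, truth)   # 2*2 / (3+3)
prec = precision(pred, truth)          # 2 / 3
```

Both measures lie between 0 and 1, with 1 indicating a predicted mask that coincides with the expert annotation.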

5.5 Discussion

After comparing the performance of the two models, it is clear that U-Net performs well, but below CycleGAN. Both models were trained using the Adam optimizer. U-Net was run for 2000 iterations, while CycleGAN was run for 50000 iterations. The results of the proposed CycleGAN are presented in Table 2 and Fig. 10(a).

Although CycleGAN brings a gain in quality, it ignores important diagnostic information during the translation process. Thus, it was hard to extract the characteristics from results such as those in Fig. 9.

Qualitatively, we observe a good representation using CycleGAN: in Fig. 9, the distribution of images translated from the input MRI images is indistinguishable from the distribution of segmented results under the adversarial loss.

6 Conclusion

In this study, the imbalanced learning problem in brain tumor MRI images was tackled and a CADe system based on adversarial networks was proposed for tumor detection. First, a GAN was used to synthesize data, which gave good results with brain tumor MRI images. Then, to segment the brain tumor images, two models were trained: CycleGAN, and U-Net as a reference model. In conclusion, the use of GAN-generated images has significantly influenced the performance of the network trained for tumor segmentation. GANs make it possible to generate realistic MRI images that are useful to solve the problem of limited data or imbalanced datasets, and we can see them as an efficient approach for augmentation. Applying the CycleGAN cycle-consistency loss in turn improves the predictive power of the segmentation step.

Qualitative results show that generative adversarial networks gave significant results and can be used for handling medical data, in particular for brain tumor detection.

In future work, we plan to explore other types of medical datasets through the transfer of the knowledge gained while learning using CycleGAN for MRI segmentation. We also aim to optimise the model complexity for better interpretability.