1 Introduction

Cancer may be defined as the unrestricted, abnormal growth of cells or tissue. When such abnormal cell growth occurs in the brain, it is called a brain tumor. Brain tumors are considered fatal cancers. Tumors that originate in the brain itself are classified as primary. Gliomas are brain tumors that arise from glial cells. Early diagnosis of a brain tumor is vital for improving treatment options. Medical imaging techniques such as computed tomography (CT), single photon emission computed tomography (SPECT), positron emission tomography (PET), magnetic resonance spectroscopy (MRS), and magnetic resonance imaging (MRI) are used together to obtain important information about the shape, size, location, and metabolism of brain tumors [1].

Many research studies use brain MRI because of its high resolution. After the brain MRI is acquired, the tumor region must be separated from the rest of the image. Accurate segmentation of brain MRI images helps medical practitioners plan the treatment of patients. Because of the complicated structure of brain tissue, manual tumor segmentation from MRI images is difficult and depends heavily on the operator's experience and subjective judgment. Computerized segmentation strategies are therefore required. Automated algorithms in turn face several difficulties, because brain tumors vary widely in shape, size, regularity, location, and appearance [2].

Artificial neural networks (ANNs) are used in both machine learning (ML) and deep learning (DL), and they play an important role in the classification of biomedical images. An ANN consists of input, hidden, and output layers. The inputs can be radiomic features extracted from the images. ML plays a vital role in automated image segmentation, data analysis, and image reconstruction [3, 4]. Figure 1a and b show sample normal and abnormal brain MRI images.

Fig. 1 Brain MRI images. a Normal brain MRI. b Brain MRI with tumor

In this paper, an algorithm is proposed based on convolutional neural network (CNN) to detect the tumor in brain MRI images. CNNs are an improvement on the general idea of artificial neural networks. Their ability to automatically learn appropriate representations of the data makes problems easier to solve, especially problems involving large amounts of data that would otherwise require a lot of pre-processing.

The remainder of the paper is organized as follows. Section 2 reviews previous research on brain tumor detection. Section 3 explains the proposed technique and its methodology in detail. Section 4 presents the results, findings, and a discussion of the proposed method. Section 5 comprises concluding remarks and opportunities for future work.

2 Literature review

In deep learning, the machine learns useful knowledge and features from raw data [5, 6], bypassing manual and laborious feature-engineering steps. CNNs are an effective method for learning good representations of images [7, 8]. CNN architectures include AlexNet [9], ZFNet [10], VGGNet [11], GoogLeNet [12], and ResNet [13]. Of these, AlexNet [9] gave the best results in the experiments reported here. Table 1 summarizes these network architectures.

Table 1 Brief tabulation of DL network architectures and object Recognition Challenge [8]

A typical CNN architecture consists of multiple layers, such as convolution, pooling, activation, and classification (fully connected) layers [14]. A convolutional layer produces feature maps by convolving a kernel across the input image [15]; different kernels produce different feature maps. A pooling layer down-samples the output of the preceding convolutional layer by passing the maximum or average of a defined neighborhood to the next layer. The rectified linear unit (ReLU) is the most commonly used activation function [16]. Neurons in a fully connected layer are connected to all activations in the previous layer. The architecture of AlexNet is shown in Table 2.

Table 2 Alexnet architecture [using basic mode transfer learning] [17]
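As an illustration of the convolution, ReLU, and pooling operations described in this section, the following minimal Python sketch applies them to a tiny hand-made image. The function names, the 5 × 5 image, and the 2 × 2 edge kernel are illustrative assumptions; they are not taken from AlexNet or from the proposed method.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map.

    As in most deep learning libraries, the kernel is not flipped
    (strictly speaking this is cross-correlation).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Replace every negative activation with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2, stride=2):
    """Keep the maximum of each size x size neighborhood."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(feature_map[r * stride + i][c * stride + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 5 x 5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical kernel that responds to dark-to-bright vertical edges.
edge_kernel = [[-1, 1], [-1, 1]]

pooled = max_pool(relu(conv2d_valid(image, edge_kernel)))
print(pooled)  # [[2, 0], [2, 0]]
```

The non-zero column in the pooled output marks where the edge was detected, which is the kind of low-level response the first convolutional layers of a CNN typically produce.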

3 Methodologies

3.1 Brain MRI dataset

Dataset collection is a crucial step in any research work. BRATS 2013 and BRATS 2015 are the main datasets used in this study, along with the Open-i NLM dataset [18]. The datasets contain both types of images, i.e., images with tumors and images without tumors. The MRI images were acquired on machines with different field strengths. However, images captured below 1.5 T field strength are included in this work. This field strength is sufficient to visualize the tumors in the images. A large number of datasets are available on the Open-i dataset portal. Table 3 lists some major repositories used by researchers in this area.

Table 3 Summary of various datasets used for brain tumor detection [8]

MRI images come in different modalities, such as T1, T2, and FLAIR. In the proposed work, the dataset is divided into two parts: 80% for training and validation, and 20% for testing. Figure 2 shows this split [19, 20].

Fig. 2 Dataset splitting for the experimentation
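The 80/20 split described above could be sketched in Python as follows. This is only a minimal illustration: the function name, the fixed random seed, and the file names are assumptions, and the text does not say whether the split was stratified by class, so no stratification is attempted here.

```python
import random


def split_dataset(items, test_fraction=0.2, seed=0):
    """Shuffle `items` and return (train_and_validation, test) lists."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical file names standing in for the MRI slices.
image_files = [f"slice_{i:03d}.png" for i in range(100)]
train_val, test = split_dataset(image_files)
print(len(train_val), len(test))  # 80 20
```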

The sample images from the dataset are shown in Fig. 3, with three different views.

Fig. 3 MRI image slices showing a patient’s brain tumor. A Axial view. B Coronal view. C Sagittal view

3.2 Proposed method

3.2.1 Transfer learning

Knowledge gained while solving one type of problem can be used to solve other, similar problems. Previously gained knowledge, in the form of a pretrained network, can thus be reused to learn new, similar tasks. While solving the first problem, the pretrained network has learned a rich set of features that can readily be applied to related problems [21]. For example, one can take a network that has already been trained on millions of images and retrain it for a new object classification task using only hundreds of images. Such retraining is faster and easier than training the network from scratch. Fine-tuning the pretrained network is the important stage when using it for a new application.

In a CNN, the convolutional layer processes the input image as many small regions. The output layer produces the class probabilities. CNN-based brain tumor classification is split into two stages, training and testing. The dataset images are divided into two classes, tumor and non-tumor brain images. In the training phase, pre-processing, feature extraction, and classification with a loss function are carried out to make a prediction. In the pre-processing phase, the images are resized. The general framework of brain tumor classification using a CNN is shown in Fig. 4.

Fig. 4 Flow of proposed technique

Brain MRI images are taken from the Open-i biomedical image dataset and from the BRATS 2013 and BRATS 2015 datasets. AlexNet, a pre-trained convolutional neural network, is used as the model for brain tumor classification. Transfer learning is used to speed up training. To adapt the pre-trained network to the classification task, its first and last three layers are modified. In the fully connected layer, the output size corresponds to the two classes, absence or presence of a tumor.

The loss function is minimized using the gradient descent algorithm. A score function maps the pixels of an unknown image to a score for each class. The loss function measures the quality of a particular set of parameters, so computing the loss is essential for improving accuracy. The algorithm of the proposed work is shown in Fig. 5, which explains the CNN workflow and the steps required for training and calculating the accuracy.
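The interplay between a score function, a loss function, and gradient descent can be sketched with a toy linear classifier. This is an illustrative assumption and not the network used in this work: the two-feature input, the zero initialization, the softmax cross-entropy loss, and the learning rate of 0.1 are all hypothetical choices made for the example.

```python
import math


def scores(weights, bias, x):
    """Linear score function: one score per class."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(z):
    """Turn class scores into probabilities (shifted for numerical stability)."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Loss is small when the true class receives a high probability."""
    return -math.log(probs[label])


def gradient_step(weights, bias, x, label, lr=0.1):
    """One gradient descent update of the parameters (in place) for one sample."""
    probs = softmax(scores(weights, bias, x))
    for k in range(len(weights)):
        err = probs[k] - (1.0 if k == label else 0.0)   # dLoss/dscore_k
        weights[k] = [w - lr * err * xi for w, xi in zip(weights[k], x)]
        bias[k] -= lr * err
    return weights, bias


# Hypothetical two-class problem (0 = no tumor, 1 = tumor) with two features.
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]
x, label = [1.0, 2.0], 1

loss_before = cross_entropy(softmax(scores(W, b, x)), label)
gradient_step(W, b, x, label)
loss_after = cross_entropy(softmax(scores(W, b, x)), label)
print(loss_before > loss_after)  # True
```

In a CNN the same idea applies, except that the gradients are propagated back through all layers and the update is averaged over a mini-batch.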

Fig. 5 Flow of proposed techniques

A set of 32 intensity and texture features is extracted from the segmented region of interest (SROI) of the tumor. These features comprise first-order statistical features and features derived from the gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRLM), gray level gap length matrix (GLGLM), and gray level size zone matrix (GLSZM).
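Of the texture descriptors listed above, the GLCM is the simplest to sketch. The following Python example counts gray-level co-occurrences for a single pixel offset and derives the contrast feature from them. The function names, the tiny two-level image, and the choice of a horizontal offset are assumptions for illustration; the exact GLCM settings used in this work are not stated in the text.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    h, w = len(image), len(image[0])
    for r in range(h):
        for c in range(w):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < h and 0 <= c2 < w:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """GLCM contrast: squared gray-level difference weighted by pair frequency."""
    total = sum(sum(row) for row in matrix)
    return sum((i - j) ** 2 * v
               for i, row in enumerate(matrix)
               for j, v in enumerate(row)) / total


# Hypothetical image quantized to two gray levels (0 and 1).
region = [[0, 0, 1],
          [1, 1, 0]]
m = glcm(region, levels=2)
print(m)            # [[1, 1], [1, 1]]
print(contrast(m))  # 0.5
```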

4 Outcomes and discussion

Detecting tumors from imaging is a challenging task. In this work, a CNN-based approach is used for tumor detection. Simulations were run in MATLAB 2020 (evaluation version) on a system with an i5-8250U processor, 8 GB of RAM, and a 64-bit operating system. Experiments are performed on the BRATS 2013 [22], BRATS 2015 [23], and Open-i [18] datasets. Table 4 shows how the total image collection available for experimentation is divided into training and testing portions. Some subjects have more than one measurement; only one measurement per patient is used.

Table 4 Dataset splitting

Image data augmentation is used to enlarge the dataset. Random combinations of resizing, cropping, rotation, reflection, shear, and translation are applied. Experiments are run on the original dataset as well as on the augmented dataset.
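Two of the transformations mentioned above, reflection and translation, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 50% reflection probability, and the one-pixel shift range are hypothetical, and the remaining transformations (resizing, cropping, rotation, shear) are omitted for brevity.

```python
import random


def reflect_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def translate(image, dx, dy, fill=0):
    """Shift the image by (dx, dy) pixels, filling uncovered pixels with `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            nr, nc = r + dy, c + dx
            if 0 <= nr < h and 0 <= nc < w:
                out[nr][nc] = image[r][c]
    return out


def augment(image, rng, max_shift=1):
    """Apply a random reflection followed by a random translation."""
    out = image
    if rng.random() < 0.5:
        out = reflect_horizontal(out)
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    return translate(out, dx, dy)


# Hypothetical 2 x 2 image used only to show the effect of each transform.
img = [[1, 2],
       [3, 4]]
print(reflect_horizontal(img))  # [[2, 1], [4, 3]]
print(translate(img, 1, 0))     # [[0, 1], [0, 3]]
```

Calling `augment(img, random.Random(0))` repeatedly yields different variants of the same image, which is how an augmented dataset grows beyond the original one.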

The AlexNet CNN architecture is used in the proposed work. The transfer learning approach is adopted to reduce execution time and computational complexity. The parameters tuned and finalized for the model are listed in Table 5.

Table 5 Hyperparameters set during the training phase

A sample output obtained during the training phase is shown in Fig. 6. The blue line is the smoothed training curve of the CNN, the faint blue line is the raw training curve, and the black line is the validation curve.

Fig. 6 Output obtained during the training progress

Similarly, the features extracted layer by layer after training the AlexNet architecture are shown in Fig. 7. The CNN is then explored through visual inspection of its intermediate layers.

Fig. 7 Feature map view obtained from the proposed algorithm after CNN training on entire dataset

4.1 Features on convolutional layer 1

This layer is the second layer in the network and is named ‘conv1’. These images mostly contain edges and colors, which indicates that the filters at layer ‘conv1’ are edge detectors and color filters. The edge detectors are at different angles, which allows the network to construct more complex features in the later layers.

4.2 Features on convolutional layer 2

These feature visualizations are generated from the second convolutional layer, which is named ‘conv2’ and corresponds to layer 6. The first 30 features learned by this layer are visualized by setting the channels to the vector of indices 1:30. Figure 7 shows the visualization of the first five features.

The same applies to the features from layers ‘conv3’, ‘conv4’, and ‘conv5’.

The following output is obtained when the features of a single image are observed as it passes through the AlexNet architecture. The input image size is 227 × 227 × 3. Each convolutional layer usually contains many kernels of the same size. Convolutional layer C1 has 96 kernels of size 11 × 11, applied with a stride of 4 and padding of 0, so its output is of size 55 × 55 × 96 (one channel for each kernel). Convolutional layer C2 has 256 kernels of size 5 × 5, applied with a stride of 1 and padding of 2, so its output is of size 27 × 27 × 256. Convolutional layer C3 has 384 kernels of size 3 × 3, applied with a stride of 1 and padding of 1, so its output is of size 13 × 13 × 384. Convolutional layer C4 has 384 kernels of size 3 × 3, applied with a stride of 1 and padding of 1, so its output is of size 13 × 13 × 384. Convolutional layer C5 has 256 kernels of size 3 × 3, applied with a stride of 1 and padding of 1, so its output is of size 13 × 13 × 256. Various kernels (convolutional filters) are applied to the input image to extract the required features. Convolutional layers 1 and 2 capture lower-level image descriptors, as shown in Fig. 8.

Fig. 8 Feature maps obtained from the proposed algorithm for single image as input
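The spatial size of each convolutional output follows from the standard formula floor((input + 2 × padding − kernel) / stride) + 1. The short Python sketch below walks a 227 × 227 input through the five convolutional layers using the kernel sizes, strides, and paddings quoted above. The 3 × 3, stride-2 max-pooling steps after C1 and C2 are an assumption based on the usual AlexNet layout; they are not stated in the text.

```python
def output_size(input_size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (input_size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding); pooling layers are assumed, see lead-in.
layers = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
]

size = 227
sizes = {}
for name, kernel, stride, padding in layers:
    size = output_size(size, kernel, stride, padding)
    sizes[name] = size
    print(f"{name}: {size} x {size}")
```

The number of output channels is not computed here because it simply equals the number of kernels in each layer (96, 256, 384, 384, and 256).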

These images mostly contain edges and colors, which indicates that the filters at layer ‘conv1’ are edge detectors and color filters. The edge detectors are at different angles, which allows the network to construct more complex features in the later layers. Higher layers in the network might build upon these representations to represent larger structures; convolutional layer 5 (a higher layer) might represent whole objects.

To check the significance of selected features, the image is divided into eight partitions as shown in Fig. 9.

Fig. 9 Partitioning of the image in eight parts

In the image shown in Fig. 9, the tumor is split across two partitions, P2 and P3. The statistical features obtained for each partition are presented in Table 6. Various first-order and second-order statistical features are used in the experiments.

Table 6 Statistical features of each part — Sample Image 1

As Table 6 shows, the tumor in the image of Fig. 9 spreads across partitions 2 and 3. Partitions 1 and 4 show only minor changes in their statistical features, whereas partitions 2 and 3 show prominent changes.
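The idea of comparing statistics across partitions can be sketched as follows. The layout of the eight partitions is not specified in the text, so a 2 × 4 grid numbered row by row is assumed here; the function name, the synthetic 4 × 8 image, and the choice of mean and variance as the reported first-order features are likewise illustrative assumptions.

```python
from statistics import mean, pvariance


def partition_stats(image, rows=2, cols=4):
    """Return mean and variance of each grid cell, numbered row by row."""
    h, w = len(image), len(image[0])
    stats = []
    for pr in range(rows):
        for pc in range(cols):
            r0, r1 = pr * h // rows, (pr + 1) * h // rows
            c0, c1 = pc * w // cols, (pc + 1) * w // cols
            pixels = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            stats.append({"mean": mean(pixels), "variance": pvariance(pixels)})
    return stats


# Synthetic 4 x 8 image: a bright 2 x 2 blob straddling partitions 2 and 3.
image = [[0] * 8 for _ in range(4)]
for r in (0, 1):
    for c in (3, 4):
        image[r][c] = 100

for number, s in enumerate(partition_stats(image), start=1):
    print(f"P{number}: mean={s['mean']}, variance={s['variance']}")
```

In this synthetic case only P2 and P3 report a non-zero mean and variance, mirroring the observation that partitions containing tumor tissue stand out statistically from the others.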

4.3 Comparison of the result

The AlexNet and VGG-16 architectures are implemented in the proposed work. The effect of varying the training function is studied and listed in Table 7. The initial learning rate is also varied during experimentation, since hyperparameter tuning plays a crucial part in the implementation of CNN algorithms. The training functions analyzed are Adam (adaptive moment estimation) and RMSprop (root mean square propagation).

Table 7 Results of the proposed algorithm using Alexnet architecture [Adam optimizer]

The best results for CNN training using the AlexNet architecture with transfer learning are obtained with 100 epochs, a mini-batch size of 64 images, and an initial learning rate of 3.00E-07. With the AlexNet architecture, the maximum accuracy of 98.67% is achieved with the Adam training function. Accuracy is calculated using the following formula:

$$\mathrm{Accuracy}=\left[\mathrm{TP}+\mathrm{TN}\right]/\left[\mathrm{TP}+\mathrm{TN}+\mathrm{FP}+\mathrm{FN}\right]$$
(1)
  • TP: True positive: Images with tumor correctly identified as images with tumor

  • FP: False positive: Images without tumor incorrectly identified as images with tumor

  • TN: True negative: Images without tumor correctly identified as Images without tumor

  • FN: False negative: Images with tumor incorrectly identified as images without tumor
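Equation (1) translates directly into code. The sketch below is a minimal Python version; the function name and the confusion-matrix counts in the usage example are hypothetical and do not come from the experiments reported here.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all images that were classified correctly (Eq. 1)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total


# Hypothetical counts for a test set of 100 images.
print(accuracy(tp=45, tn=40, fp=5, fn=10))  # 0.85
```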

The Adam optimizer minimizes the cost function effectively with little parameter tuning [24]. Dropout is applied to improve generalization and performance on the test set. The convolutional layers are followed by pooling layers, which limits the capability of the network because pooling discards information aggressively. Table 8 shows the results of the proposed algorithm using the AlexNet architecture with the RMSprop optimizer.

Table 8 Results of the proposed algorithm using Alexnet Architecture [Rmsprop optimizer]

The best results for CNN training using the AlexNet architecture with transfer learning are obtained with 15 epochs, a mini-batch size of 64 images, and an initial learning rate of 3.00E-05. With the AlexNet architecture, the maximum accuracy of 98.67% is achieved with the Adam training function, as shown in Table 9, whereas the maximum accuracy achieved with the RMSprop training function is 93.33%.

Table 9 Results of the proposed algorithm using Alexnet architecture

The hyperparameters of the VGG-16 ConvNet are tuned, and the training functions considered in this experiment are SGDM, Adam, and RMSprop. The results for the VGG-16 architecture with the SGDM, Adam, and RMSprop optimizers are listed in Tables 10, 11, and 12, respectively.

Table 10 Results of the proposed algorithm using VGG16 with SGDM training function
Table 11 Results of the proposed algorithm using VGG16 with ADAM training function
Table 12 Results of the proposed algorithm using VGG16 with RMSPROP training function

For VGG-16 with the SGDM training function, the maximum accuracy achieved is 89.33% with a learning rate of 1.00E-4.

For the VGG-16-TL network with the Adam training function, an accuracy of 90.67% is obtained with learning rate = 1.00E-05, number of epochs = 3, iterations = 82, and normalized error rate = 0.25.

For VGG-16 with the RMSprop training function, the maximum accuracy achieved is 88.00% with a learning rate of 1.00E-5.

The proposed work is tested with various optimizers. Their performance, compared in terms of the accuracy achieved, is tabulated in Table 13 and plotted in Fig. 10.

Table 13 Optimizer performance
Fig. 10 Performance of various techniques used for experimentation

For the AlexNet architecture, the Adam and RMSprop optimizers are compared; the maximum accuracy of 98.67% is achieved with Adam [24]. For the VGG-16 architecture, Adam, SGDM, and RMSprop are compared; the maximum accuracy of 90.67% is achieved with Adam.

An important step in validating the results is cross-validation of the experimental results. The cross-validation proportion can be varied. Table 14 shows the results obtained for random tenfold cross-validation. Table 15 compares the results obtained for the convolutional neural network architectures.

Table 14 Ten-fold cross validation results (Alexnet)
Table 15 Comparison of obtained results for the convolutional neural network architecture
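Random tenfold cross-validation, as reported in Table 14, partitions the samples into ten folds and uses each fold once as the test set. A minimal Python sketch of the index bookkeeping is given below; the function name, the seed, and the sample count of 25 are illustrative assumptions, and the model training step itself is omitted.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # random assignment to folds
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical dataset of 25 images.
for fold_number, (train, test) in enumerate(k_fold_indices(25), start=1):
    print(fold_number, len(train), len(test))
```

Averaging the accuracy over the ten folds gives an estimate that depends less on one particular train/test split.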

The proposed convolutional neural network, based on the AlexNet architecture, is modelled using a transfer learning approach. In any neural network, hyperparameter tuning is a vital step, so the values must be selected with care. Understanding the characteristics of the input images and applying appropriate hyperparameters is a must.

The kernels available in the AlexNet architecture are found to be suitable for brain tumor detection from MRI images. The features extracted using these kernels are also appropriate for characterizing the tumor, which is why the proposed work reaches its maximum accuracy with this architecture. The findings of various researchers are compared with the developed approach in Table 16. The investigation was carried out on standard image datasets. The results obtained were approved by two medical experts. Research in this domain varies in many respects, such as the database used, the cross-validation method, and the performance measure (many studies use sensitivity, Dice coefficient, Tanimoto, or Jaccard similarity coefficient). Despite this diversity, an attempt is made to give an overall picture of the state-of-the-art research in this domain. For this purpose, accuracy is fixed as the common performance measure, and the other methods are compared with respect to it.

Table 16 Comparison of obtained results with approaches used in literature

In the recent literature, AlexNet, GoogLeNet, and VGG are found to be the most popular pre-trained CNN models and are used in many classification applications. Different approaches have been used to identify brain tumors. To overcome the drawbacks of the conventional machine learning approach, a CNN architecture is used. In this approach, the kernels in the convolutional layers extract the required features from the input images. The features extracted in these layers are a combination of all types of features.

Transfer learning is better than random initialization for training a CNN model when the dataset is small. As the number of convolutional layers increases, accuracy increases, but training time also increases.

Further experiments are carried out to calculate the dimensions of the tumor (i.e., tumor parameters). A few sample images are shown in Fig. 11 [43]. Tumor-describing parameters such as diameter, area, perimeter, eccentricity, and circularity are calculated as shown in Table 17. Diameter is the mean of the major and minor axis lengths; it is a scalar value. Area is the number of pixels in the region. Perimeter is the number of pixels on the boundary of the nodule. Eccentricity is the ratio of the distance between the foci of the ellipse to its major axis length; its value ranges from 0 to 1. An ellipse with eccentricity 0 is a circle, while one with eccentricity 1 is a line segment. Circularity measures the roundness of the shape; it equals 1 only for a perfect circle and is less than 1 for any other shape.

Table 17 Tumors’ dimensions and classifications
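The shape descriptors defined above can be computed from a few basic measurements. The sketch below assumes that the major axis length, minor axis length, area, and perimeter of the tumor region are already available; the function names are hypothetical, and the circularity formula 4πA/P² is the commonly used definition, assumed here because the text does not state the formula explicitly.

```python
import math


def mean_diameter(major_axis, minor_axis):
    """Diameter as the mean of the major and minor axis lengths."""
    return (major_axis + minor_axis) / 2.0


def eccentricity(major_axis, minor_axis):
    """Distance between the foci divided by the major axis length (0 = circle)."""
    return math.sqrt(1.0 - (minor_axis / major_axis) ** 2)


def circularity(area, perimeter):
    """4*pi*area / perimeter^2: 1 for a perfect circle, smaller otherwise."""
    return 4.0 * math.pi * area / perimeter ** 2


# Hypothetical ellipse-like region with axes of 10 and 6 units.
print(mean_diameter(10, 6))           # 8.0
print(round(eccentricity(10, 6), 3))  # 0.8

# Ideal circle of radius 5: circularity evaluates to 1 (up to rounding).
print(round(circularity(math.pi * 25, 2 * math.pi * 5), 3))  # 1.0
```

On pixel images the measured perimeter is only an approximation of the true boundary length, so circularity values for nearly round regions can deviate slightly from 1.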

Tumors are classified on the basis of their growth rate. To calculate the growth rate, at least two samples from the same patient are needed to reach a proper conclusion. In the present system, because of dataset limitations, only the size of the tumor is measured; Table 17 reports the dimensions of the nodules (one tumor) in mm for the sample images shown in Fig. 11. The tumor parameters are calculated in order to classify the tumor into various classes. Tumors with a diameter greater than 10 mm require the special attention of a radiologist.

Fig. 11 Few sample images under test to calculate tumor parameters

Benign tumors have clearly defined borders and are composed of harmless cells. Benign brain tumors do not infiltrate nearby tissue. Malignant brain tumors, on the other hand, lack distinct borders; they tend to grow rapidly and invade other parts of the brain. Brain tumor classification depends on “how rapidly it is growing” and “how likely it is to invade other tissues.” The World Health Organization grading system classifies brain tumors by growth rate into four categories: grades I, II, III, and IV. Grade I tumors are the least malignant and grow slowly, but even a grade I tumor may be life-threatening if it is inaccessible for surgery. Grade II tumors grow slightly faster than grade I tumors and have a slightly abnormal microscopic appearance. These tumors may invade surrounding normal tissue and may recur as grade III or higher tumors. Grade III tumors are malignant, and their chance of recurrence is quite high. Grade IV tumors are the most malignant and invade wide areas of surrounding normal tissue.

5 Conclusion and future scope

Brain tumors are relatively diverse in their spatial location and structure. Data augmentation is used to capture this variability. A robust CNN-based image processing algorithm is presented for classifying brain MRI images as normal or abnormal. The algorithm has been successfully tested on the BRATS 2013, BRATS 2015, and Open-i datasets.

The presented method is based on a CNN whose first convolutional layer uses 11 × 11 kernels to capture specific features of the images. For the VGG-16-TL network with the Adam training function, an accuracy of 90.67% is obtained with learning rate = 1.00E-05, number of epochs = 3, iterations = 82, and normalized error rate = 0.25. With the RMSprop training function, VGG-16 achieves a maximum accuracy of 88.00% at a learning rate of 1.00E-5, and with SGDM it achieves a maximum accuracy of 89.33% at a learning rate of 1.00E-4. The best-performing classifier, the AlexNet-TL network, has an accuracy of 98.67% with learning rate = 3.00E-07, normalized error rate = 0.45, number of epochs = 100, and iterations = 400.

The simulation results show that the highest classification accuracy, 98.67%, is achieved with the AlexNet architecture and the Adam training function. The proposed methodology is valid for axial, coronal, and sagittal slice images of the brain. The tumor parameters will help doctors classify tumors into the grades defined by the WHO. In future work, the proposed system may be verified with real-time images from other datasets, including images with multiple tumors, to confirm the results more generally. The work can be further extended to detect brain tumors at an early stage, using a combination of two CNNs for increased accuracy. Researchers can also explore the use of bio-inspired algorithms for brain tumor detection.