1 Introduction

Healthcare professionals rely on medical images for disease detection and treatment when direct visual inspection is not possible. Because this modality is so frequently preferred in health services, medical images should be examined as efficiently as possible [1]. At the same time, imaging devices and the images they produce continue to improve in quality. The analysis of these images with intelligent systems, in order to facilitate their use in healthcare services, has therefore become an active research topic. Physicians and radiologists can obtain more effective results in image analysis with computer-aided systems: repetitive, computation- and data-intensive processes can be automated, and such systems have become an important reference source for healthcare professionals [2].

In classical machine learning techniques, unlike deep learning methods, features are extracted manually so that the images can be described numerically. Manual feature extraction is not feasible in every field, and the necessary domain expertise may not be available in medicine. In addition, manually extracted features do not produce effective results in all cases, and successful results cannot be obtained in complex scenarios. One possible solution is to learn task-relevant features directly from the medical data. Such a data-driven approach can capture domain information without manual feature engineering, but it faces constraints such as optimization difficulty and hardware limitations. Since these constraints have historically forced the use of shallow architectures, more reliable results can now be obtained with deep learning models [3].

El Kaitoun et al. have used an improved Markov method together with a U-net-based deep learning method for brain tumor detection [4]. Ramirez et al. have presented a new variational model for MRI image segmentation [5]. Sobhaninia et al. have obtained more effective segmentation results by using multiple scales and two cascaded networks [6]. Wu et al. have used a 3D U-net-based deep learning model for tumor image segmentation [7]. Chetty et al. have presented a new approach for classifying brain MRI images based on the 3D U-Net deep learning architecture [17].

The development of computer hardware and machine learning techniques in recent years has made it possible to apply deep learning models to a wide range of problems. Classification, object detection, and other computer vision tasks have thus become active research topics for deep learning, and the shift from traditional machine learning methods to deep learning methods has gained great momentum. Although deep learning has become widespread in medical applications, there is still much room for improvement. In this study, a model for the classification of brain tumor images is presented, built with the CNN and VGG deep learning methods, which are among the standard computer vision approaches.

2 Materials and Methods

In this study, a classification model has been created using MR images, which are frequently used by healthcare professionals. With this model, brain tumor findings in patients can be detected by computer vision methods. A publicly available dataset on Kaggle has been used as the data source [8]. The dataset includes 155 MR images of tumor patients and 98 MR images of healthy subjects. Figure 1 contains sample tumor and healthy brain images from the dataset.

Fig. 1 Dataset sample images

The performance of classification models built with the CNN and VGG architectures, two widely used deep learning techniques, has been compared. All images are resized to a fixed resolution of 224 × 224 pixels; a minimal loading-and-resizing sketch is given after Fig. 2. The model is designed with the steps shown in Fig. 2.

Fig. 2 Classification model
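
To make the preprocessing step concrete, the following is a minimal loading-and-resizing sketch in Keras. The folder layout ("no"/"yes" subdirectories) and the .jpg extension are assumptions about the Kaggle dataset [8], not details given in the paper.

```python
# Minimal sketch: load images and resize them to a fixed 224 x 224 size.
# Folder names "no"/"yes" and the .jpg extension are assumed, not confirmed.
from pathlib import Path

import numpy as np
from tensorflow.keras.preprocessing.image import img_to_array, load_img

def load_dataset(root="brain_mri", size=(224, 224)):
    images, labels = [], []
    for label, folder in enumerate(["no", "yes"]):  # 0 = healthy, 1 = tumor
        for path in Path(root, folder).glob("*.jpg"):
            img = load_img(str(path), target_size=size)  # resize to 224 x 224
            images.append(img_to_array(img) / 255.0)     # scale pixels to [0, 1]
            labels.append(label)
    return np.array(images), np.array(labels)

X, y = load_dataset()
```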

2.1 CNN and VGG Architecture

A convolutional neural network consists of the layers shown in Fig. 3: the image is processed through the Convolutional, Pooling, Fully Connected, and Output layers in turn. The VGG architecture, a deeper arrangement of the same layer types, is shown in Fig. 4, and a transfer-learning sketch based on it follows the figure [9,10,11,12,13,14,15,16].

Fig. 3 CNN architecture

Fig. 4 VGG architecture
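
To illustrate how a pre-trained VGG backbone can be adapted to this two-class problem, the sketch below freezes the VGG16 convolutional base and adds a small classification head. The head sizes are illustrative assumptions, not the configuration reported in Table 2.

```python
# Minimal transfer-learning sketch: freeze a pre-trained VGG16 base and
# add a small classification head (head sizes are illustrative assumptions).
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # keep the ImageNet-learned features fixed

vgg_model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(2, activation="softmax"),  # healthy vs. tumor
])
vgg_model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
```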

Each step through which the image is processed performs the following operations; a code sketch assembling these layers into a full model is given in Sect. 2.2.

Convolutional Layer and Activation Function: The convolutional layer extracts features from the input, and non-linearity is introduced to the model through the activation function.

Pooling Layer: This layer is added after the convolutional layer and reduces the spatial size of the feature maps. By reducing the number of parameters in the network, it lowers the computational complexity.

Flattening Layer: Converts the feature maps produced by the convolutional and pooling layers into a one-dimensional array, preparing the data for the classical neural network.

Fully Connected Layer: The standard neural network method used for classification is applied to the one-dimensional array produced by the flattening layer.

ReLU Activation Function: This function is applied after the convolutional layers. Its main task is to set negative input values to zero. The output of the preceding operations is linear and must be made non-linear; with this non-linearity, the model learns more efficiently [10, 11].

Softmax Activation Function: Softmax is mainly used in the output layer. It calculates the probability that the input belongs to a particular class, producing values between 0 and 1 with a probabilistic interpretation. It is expressed mathematically in Eq. (1): from a real-valued input vector x, a probability vector p is produced [12, 13].

$$p = \begin{pmatrix} p_1 \\ \vdots \\ p_n \end{pmatrix}, \qquad p_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}$$
(1)
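
As a quick numerical check of Eq. (1), the following sketch implements softmax in NumPy; the input scores are illustrative values.

```python
# Numerical check of Eq. (1): softmax maps real scores to probabilities
# that sum to 1; the input values here are purely illustrative.
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract max for numerical stability
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # -> [0.659 0.242 0.099], sums to 1
```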

2.2 Model Implementation and Performance Analysis

The model has been implemented in Python 3.7 using the Keras deep learning library [15]. The dataset used in the study has been divided into train and test sets at a ratio of 75:25. First, all images have been resized to a fixed 224 × 224 resolution, with the aim of achieving more consistent results. Two models, one based on the CNN architecture and one on the VGG architecture, have been designed. The layer information and parameter counts of the proposed models are given in Tables 1 and 2.

Table 1 Model 1 Parameters
Table 2 Model 2 Parameters
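
Since the exact layer configurations are those listed in Tables 1 and 2, the sketch below shows only a generic plain-CNN pipeline with the 75–25 split described above; filter counts and dense sizes are illustrative assumptions, and X and y come from the hypothetical load_dataset helper in the preprocessing sketch.

```python
# Generic plain-CNN pipeline with a 75-25 train/test split. Layer sizes are
# illustrative assumptions, not the configurations of Tables 1 and 2.
from sklearn.model_selection import train_test_split
from tensorflow.keras import layers, models

cnn_model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(2, activation="softmax"),
])
cnn_model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

# X, y from the loading sketch above (hypothetical load_dataset helper)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)  # 75-25 split
history = cnn_model.fit(X_train, y_train, epochs=50,
                        validation_data=(X_test, y_test))
```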

Model performance has been measured after training the two models. Model 2 achieves a higher success rate and a loss value closer to zero, so it gives a better classification result than model 1: the classification accuracy is 92% for model 2 and 85% for model 1. Accuracy and loss values over 50 training epochs are reported in Table 3. Since accuracy alone is not a sufficient measure on imbalanced datasets [14], precision and recall have also been reported as additional performance metrics in Table 4.

Table 3 Accuracy and loss rates
Table 4 Precision and Recall Metrics

TP (True positive): A tumor image correctly detected as tumor.

FP (False positive): A healthy image incorrectly detected as tumor.

TN (True negative): A healthy image correctly detected as healthy.

FN (False negative): A tumor image incorrectly detected as healthy.

Precision rate (P) = TP / (TP + FP).

Recall rate (R) = TP / (TP + FN).
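
Following these definitions, precision and recall can be computed from the confusion matrix as in the sketch below; the trained model and test split are assumed to carry over from the training sketch in Sect. 2.2.

```python
# Precision and recall from the confusion matrix (class 1 = tumor), matching
# the definitions above. cnn_model, X_test, y_test come from the earlier sketch.
import numpy as np
from sklearn.metrics import confusion_matrix

y_pred = np.argmax(cnn_model.predict(X_test), axis=1)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

precision = tp / (tp + fp)  # P = TP / (TP + FP)
recall = tp / (tp + fn)     # R = TP / (TP + FN)
print(f"precision={precision:.3f}, recall={recall:.3f}")
```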

3 Conclusion

In this study, the classification performance of the proposed CNN- and VGG-based models for tumor detection in MRI images, which are frequently used in healthcare, has been evaluated. The number of convolutional layers, the quality of the dataset, and the number of epochs are among the main factors that affect model success during training. The number of patient samples in the training set may be limited, and this situation can lead to overfitting. It has been observed that, in such cases, better results can be obtained from a limited number of images with the VGG-based model, which uses a pre-trained architecture. Pre-trained VGG-based models therefore show high applicability in the health field, where data acquisition is limited, even when they were originally trained on objects with very different characteristics. The features learned during pre-training can be transferred with high accuracy to different models, which makes such models a viable option for classification problems in healthcare.