1 Introduction

Lung cancer patients’ relative five-year survival rate is around 20%, making it the deadliest cancer type. A man’s lifetime risk of developing lung cancer is about 1 in 15, while a woman’s is about 1 in 17; the risk is considerably higher for smokers [1]. The major issue with lung cancer is the difficulty of its treatment, because it is usually identified at later stages [2]. Identifying lung cancer at early stages can boost patients’ survival rate to 60–70% [3]. Predicting the estimated survival period after diagnosis improves prognostic accuracy, which supports better decision-making by physicians and patients’ families [4]. Predicting lung cancer survivability has therefore become an active research topic for scholars in both the medical and computer science domains. The advent of artificial intelligence (AI) techniques, specifically machine learning algorithms, has improved the diagnosis and treatment of cancer patients.

Machine learning develops models capable of learning from large amounts of historical data and providing informed decisions. The training data used by machine learning algorithms also enable professionals to gain insight into early cancer diagnosis, variation of treatments, and drug discovery [5]. This automatic process of identification and exploration has proved its performance and efficiency in classifying patients’ lung cancer images using various machine learning algorithms [6]. The existing literature has relied extensively on computed tomography (CT) or X-ray images for identifying lung cancer patients. For instance, Vas and Dessai [7] used an artificial neural network (ANN) to categorize cancer stages based on CT scan images, while Gang et al. [8] employed a parallel immune algorithm for cancer diagnosis based on X-ray images. However, these techniques have been criticized for several reasons. First, errors in the images acquired through CT and X-ray techniques lead to false-negative reports, causing delays in cancer treatment. Second, these techniques can be difficult to use for patient screening because of the low number of available devices, high costs, and radiation doses. Sathesh (2020) proposed a lung nodule segmentation algorithm that uses adaptive weights as a feature for a recurrent neural network. The algorithm initially detects the lung parenchyma, from which the background region is minimized; however, the boundaries of the obtained nodule candidate region are not accurate. The evaluation was performed on the LIDC dataset using metrics such as Hausdorff distance, probability rand index (PRI), accuracy, recall, and precision. The scheme provides accuracy, recall, and precision of 94.08%, 89.3%, and 94.1%, respectively [19, 20].

To overcome the limitations of CT and X-ray images, machine learning algorithms have been used to diagnose lung cancer patients based on their clinical features. However, there is a lack of research on predicting lung cancer patients’ survival or death period, specifically with deep learning techniques. Unlike classical machine learning, deep learning techniques learn by building increasingly abstract representations of the data as the network grows deeper [10, 11]. This, in turn, helps maximize the prediction accuracy of deep learning models compared to their classical machine learning counterparts. Therefore, this research aims to predict lung cancer patients’ survival or death period through a comparison between classical machine learning and deep learning techniques using patients’ demographic and clinical features. Early prediction of the survival period of lung cancer helps patients and healthcare professionals manage costs better and provide treatment at the appropriate time [12,13,14,15].

2 Proposed Methodology

In recent years, deep learning methods have gained significant interest in lung cancer segmentation and hold a significant advantage over other approaches. Deep learning lets computers learn features automatically in a data-driven manner, reducing the complexity of hand-engineered feature design, and models with substantially enlarged depth further advance segmentation performance. Figure 1 summarizes the proposed methodology.

Fig. 1 Overall block diagram of the proposed lung cancer segmentation model

2.1 Pre-processing

We used a randomization strategy for image preprocessing, which ensures that the deep learning model maintains strong generalization performance after a large number of repeated training passes. Multimodal images of the same patient receive the same processing within one training epoch and different random transformations in different epochs, which helps the network learn the image features of the different modalities of the same scan while retaining generalization. The preprocessing methods applied are 3D random clipping, 3D random rotation, 3D random intensity enhancement, 3D random mirror inversion, and normalization.

Image normalization is a widely used technique in computer vision, pattern recognition, and other fields. Z-score normalization was applied in this work; it is defined in Eq. (1):

$$z = \frac{x - \mu }{\sigma }$$
(1)

where \(\sigma\) is the standard deviation and \(\mu\) is the mean value. The 3D random clipping method then randomly crops the volume of size (240, 240, 155) into a (144, 144, 128) matrix. The 3D random rotation method rotates the cropped image by an angle drawn from \(U(-10, +10)\) degrees. The 3D random intensity enhancement method rescales each voxel value as defined in Eq. (2):

$$\begin{array}{*{20}r} \hfill {x_{new} = x_{old} *U(0.9,1.1) + U( - 0.1,0.1)} \\ \end{array}$$
(2)

where \(U\) denotes the uniform distribution. The random mirror processing flips the image along the depth, height, and width directions. We applied these augmentation routines to extend the training data set and improve the performance and generalization ability of the deep neural network.
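To make the pipeline concrete, a minimal NumPy/SciPy sketch of these augmentation steps is given below. The crop size, rotation range, and intensity parameters come from the description above, but the function names and code are an illustrative reconstruction, not the authors’ implementation.

```python
import numpy as np
from scipy.ndimage import rotate

def z_score(x):
    # Eq. (1): zero-mean, unit-variance normalization
    return (x - x.mean()) / (x.std() + 1e-8)

def random_crop(x, size=(144, 144, 128)):
    # 3D random clipping: cut a (144, 144, 128) block out of the full volume
    d, h, w = x.shape
    sd, sh, sw = size
    i = np.random.randint(0, d - sd + 1)
    j = np.random.randint(0, h - sh + 1)
    k = np.random.randint(0, w - sw + 1)
    return x[i:i + sd, j:j + sh, k:k + sw]

def random_rotate(x):
    # 3D random rotation by an angle drawn from U(-10, +10) degrees
    angle = np.random.uniform(-10, 10)
    return rotate(x, angle, axes=(0, 1), reshape=False, order=1)

def random_intensity(x):
    # Eq. (2): x_new = x_old * U(0.9, 1.1) + U(-0.1, 0.1)
    return x * np.random.uniform(0.9, 1.1) + np.random.uniform(-0.1, 0.1)

def random_mirror(x):
    # random mirror inversion along the depth, height, and width axes
    for axis in range(3):
        if np.random.rand() < 0.5:
            x = np.flip(x, axis=axis)
    return x.copy()
```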

2.2 Deep Learning Model

Deep learning, a subset of artificial intelligence, is gaining momentum each day by making different tasks easier and more efficient. The convolutional neural network (CNN), a type of deep learning mechanism, is an indispensable part of image vision problems, and in recent years deep learning methods have gained significant interest in lung cancer segmentation. The performance of deep learning segmentation methods usually depends on the size of the training data, yet it is always difficult to acquire a large number of images with pixel-level annotation in clinical practice. To address the challenge of scarce annotation, studies [12,13,14,15,16,17] have applied few-shot learning to medical image analysis, where the labeled data is a small portion of the whole dataset; for these methods, it is often difficult to utilize the unlabeled data effectively. In recent years, unsupervised learning has also been performed on medical images, using unlabeled data for model optimization.

The U-shaped model is an efficient and straightforward segmentation network for 3D medical images, especially in lung cancer, learning features from deep and shallow neural units. The UNet model consists of four encoder layers and four decoder layers. The proposed model is shown in Fig. 2. In this model, the four channel inputs correspond to the images of the four modalities, respectively. The main body of the network is composed of auto-weight dilated (AD) units, residual (Res) units, linear upsampling, and the first and last convolution units. In the downsampling stage (feature coding extraction), we use 8 AD units to obtain multi-scale feature maps. In the upsampling stage (feature decoding), an AD unit, a Res unit, and a linear upsampling layer form a primary decoding layer. Finally, a convolution unit outputs the results of the network model. Each convolution unit, AD unit, and Res unit contains batch normalization and ReLU functions. We used dilated convolution to extract fine-grained, multi-scale tumor features, and employed a residual structure to obtain long-dependence tumor features.

Fig. 2 An illustration of the proposed architecture for the lung cancer classification system

As for the Res unit layer, we used two convolution units to reduce and then enlarge the number of convolution kernels, realizing feature learning and feature map reorganization; from an experimental point of view, this is an efficient coding method. Then we used two group convolution units with stride 1 and group 16, with a kernel size of 3 × 3 × 3. Finally, we used a convolution residual element to obtain the long-dependence feature map.
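A hedged PyTorch sketch of a Res unit consistent with this description follows. The helper conv_unit (convolution + batch normalization + ReLU), the channel-reduction factor, and all names are our illustrative assumptions, not the authors’ code.

```python
import torch
import torch.nn as nn

def conv_unit(cin, cout, k, groups=1, dilation=1):
    # convolution unit: Conv3d followed by batch normalization and ReLU
    pad = dilation * (k - 1) // 2
    return nn.Sequential(
        nn.Conv3d(cin, cout, k, padding=pad, groups=groups, dilation=dilation),
        nn.BatchNorm3d(cout),
        nn.ReLU(inplace=True),
    )

class ResUnit(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # two convolution units that reduce and then enlarge the kernel count
        self.reduce = conv_unit(channels, channels // 2, 1)
        self.expand = conv_unit(channels // 2, channels, 1)
        # two group convolution units: stride 1, 16 groups, kernel 3x3x3
        self.group1 = conv_unit(channels, channels, 3, groups=16)
        self.group2 = conv_unit(channels, channels, 3, groups=16)

    def forward(self, x):
        y = self.expand(self.reduce(x))
        y = self.group2(self.group1(y))
        return y + x  # residual element for the long-dependence feature map
```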

As for the AD unit layer, we first used two convolution units (as in the Res unit). Then we used dilated convolution units (with dilation parameters 1 and 2, respectively) and two learnable parameters to adjust and fuse the features of the two grouped convolution branches. Finally, a group convolution unit outputs the result of the AD unit. We also set up residual calculations in the AD unit. Dilated convolution expands the receptive field of the convolutional kernel without sacrificing computational resources, while normal convolution provides a more accurate feature map; fusing the two types of convolution strengthens the network’s ability to extract features.
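Continuing the sketch above (reusing conv_unit), an AD unit consistent with this description and with Eq. (6) below might look as follows. The fusion weights a and b are the learnable parameters mentioned in the text; their initialization and all names are our assumptions.

```python
class ADUnit(nn.Module):
    # Auto-weight dilated (AD) unit: two 1x1x1 convolution units, a dilated and a
    # normal 3x3x3 branch fused by learnable weights a and b, a group convolution
    # output unit, and a residual path (cf. Eq. (6)).
    def __init__(self, cin, cout):
        super().__init__()
        self.c1 = conv_unit(cin, cout, 1)
        self.c2 = conv_unit(cout, cout, 1)
        self.d0 = conv_unit(cout, cout, 3, dilation=2)  # dilated: wider receptive field
        self.c3 = conv_unit(cout, cout, 3)              # normal: more accurate feature map
        self.c4 = conv_unit(cout, cout, 3, groups=16)   # group convolution output unit
        self.c5 = conv_unit(cin, cout, 1)               # residual calculation
        self.a = nn.Parameter(torch.tensor(0.5))        # learnable fusion weight a
        self.b = nn.Parameter(torch.tensor(0.5))        # learnable fusion weight b

    def forward(self, x):
        y = self.c2(self.c1(x))
        y = self.a * self.d0(y) + self.b * self.c3(y)   # auto-weighted fusion
        return self.c4(y) + self.c5(x)
```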

In the encoder stage, each residual block has a dual-pathway structure, and we set the channel depths to 32, 64, 128, and 256. The residual block is the critical structure of downsampling. In the decoder stage, we connect a convolutional unit and a de-convolutional unit for upsampling. The stride of the de-convolutional unit is 2 \(\times\) 2 \(\times\) 2, and the kernel size is 3 \(\times\) 3 \(\times\) 3. Batch normalization and RReLU activation functions follow every convolutional and de-convolutional unit in this stage. Similarly, we set the channel depths in the decoder stage to 32, 64, 128, and 256. Combining the deep neural network with residual blocks enables the network to obtain more significant gradients in deep layers.
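A sketch of one decoder stage under the stated settings (stride 2 × 2 × 2 transposed convolution, kernel 3 × 3 × 3, batch normalization and RReLU after each unit, 3D concatenation for the skip connection); again, the class and argument names are illustrative assumptions.

```python
class DecoderBlock(nn.Module):
    def __init__(self, cin, cout):
        super().__init__()
        # convolutional unit followed by batch normalization and RReLU
        self.conv = nn.Sequential(
            nn.Conv3d(cin, cout, 3, padding=1),
            nn.BatchNorm3d(cout),
            nn.RReLU(inplace=True),
        )
        # de-convolutional unit: kernel 3x3x3, stride 2x2x2, doubles the resolution
        self.deconv = nn.Sequential(
            nn.ConvTranspose3d(cout, cout, 3, stride=2, padding=1, output_padding=1),
            nn.BatchNorm3d(cout),
            nn.RReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = self.deconv(self.conv(x))
        # skip connection via 3D matrix concatenation along the channel axis
        return torch.cat([x, skip], dim=1)
```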

Consequently, the phenomenon of gradient disappearance is relatively rare, and the network obtains more practical tumor features. The gradient propagation in the convolutional layer can be defined as in Eq. (3):

$$\delta_{l} = \sigma^{\prime}(O_{l}) \cdot (w_{l+1})^{T} \delta_{l+1}$$
(3)

where \(\sigma^{\prime}\) denotes the first derivative of the activation function, \(w\) the weight, \(O\) the output matrix vector, and \(l\) the layer index. The gradients in Block-R1, Block-R2, and Block-R3 can then be defined as in Eq. (4):

$$\begin{aligned} O_{r1_{l+1}} &= f(\delta_{2}(f(\delta_{1}(O_{r1_{l}})) + O_{r1_{l}})) \\ O_{r2_{l+1}} &= f(\delta_{2}(f(\delta_{1}(O_{r2_{l}})) + O_{r2_{l}})) \\ O_{r3_{l+1}} &= \delta_{2}(f(\delta_{1}(f(O_{r3_{l}})) + O_{r3_{l}})) \end{aligned}$$
(4)

where \(f\) denotes the activation function, and \({\delta }_{1}\) and \({\delta }_{2}\) represent the first and second convolution calculations, respectively. It is worth noting that the difference between Eq. (3) and Eq. (4) lies in the order of normalization, which is not reflected in the equations.

Multiplication is widely used in the calculation of series convolutions, such as \({\delta }_{2}(f({\delta }_{1}))\). The cumulative multiplication of values within \((-1, 1)\) can drive the gradient toward approximately zero, producing the classical vanishing gradient. The residual connection weakens this problem through weighted addition and enhances the stability of the network. It is therefore a very effective strategy to build very deep architectures from residual blocks, especially when computing deep feature maps.
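To make this concrete, consider a generic residual block \(O_{l+1} = O_{l} + F(O_{l})\); the chain rule then gives the following (an illustrative derivation added here, not part of the original formulation):

$$\frac{\partial \mathcal{L}}{\partial O_{l}} = \frac{\partial \mathcal{L}}{\partial O_{l+1}}\left(1 + \frac{\partial F(O_{l})}{\partial O_{l}}\right)$$

The additive identity term provides a direct gradient path, so the gradient cannot be driven to zero by cumulative multiplication alone, even when \(\partial F(O_{l})/\partial O_{l}\) is small.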

We defined the convolution block (BN, RL, Conv) in the AD unit as per Eq. (5):

$$\ell_{({c}_{i},k)}={{w}_{{c}_{i}}}^{T}f({I}_{i})+{b}_{{c}_{i}}$$
(5)

where \({c}_{i}\) is the convolution layer \(i\), \(k\) describes the kernel size, and \(f\) is the activation function. The \(w\) and \(b\) represent the convolution weight and bias, and \({I}_{i}\) is the input data. The AD block can then be defined as per Eq. (6):

$$\ell_{AD} = \ell_{(c_1,1)}\,\ell_{(c_2,1)}\bigl(a\,\ell_{(d_0,3)} + b\,\ell_{(c_3,3)}\bigr)\,\ell_{(c_4,3)} + \ell_{(c_5,1)}$$
(6)

where \(\ell_{{d}_{0}}\) is the dilated convolution. During gradient back-propagation, \(a\) and \(b\) automatically adjust the weight ratio of the two convolution branches in the main path. In addition, the channel parameters of the AD-Net were set to 32, 64, 128, and 256, and the skip connections adopted 3D matrix concatenation. The residual structure is a necessary element: the residual calculation ensures the stability of the gradient in deep feature computation.
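As a quick usage illustration continuing the sketches above (the shapes and names are ours), a single AD unit can be smoke-tested on a dummy four-modality volume:

```python
# dummy batch: one subject, 4 modality channels, a 32^3 sub-volume
x = torch.randn(1, 4, 32, 32, 32)
unit = ADUnit(4, 32)   # first encoder depth from the 32/64/128/256 schedule
print(unit(x).shape)   # torch.Size([1, 32, 32, 32, 32]); spatial size preserved
```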

3 Experimental Results and Discussion

This research involved 10,001 subjects from the SEER database collected between 2000 and 2019 [16] (SEER, 2021). The age of the patients ranged between 19 and 94 years (Mean = 65.85, SD = 9.72). Further, 51.8% of the patients were male (n = 5181) and the remainder were female (n = 4820). There were 1214 (12.1%) alive cases and 8787 (87.9%) death cases. The age of alive cases ranged between 23 and 90 years (Mean = 64.40, SD = 8.71), whereas the age of death cases ranged between 19 and 94 years (Mean = 66.05, SD = 9.83). In terms of race, 86.3% of patients were White, 10.8% Black, and 5.1% of other races (i.e., American Indian/AK Native, Asian/Pacific Islander). The alive cases included 571 (47%) male and 643 (53%) female patients, and the death cases included 4610 (52.5%) male and 4177 (47.5%) female patients. An independent-samples t-test was performed to test the difference between male and female patients. The outcome indicated a significant difference between male and female subjects (t(9999) = 3.55, p < 0.001), suggesting that the probability of death was higher in males than in females. Sample lung cancer CT images collected from the SEER database are given in Fig. 3.

Fig. 3 Sample lung cancer CT images from the SEER database

The following evaluation metrics are used to validate classifier performance, as given in Eq. (7):

$$\begin{aligned} Accuracy &= \frac{TP + TN}{TP + TN + FP + FN} \\ Precision &= \frac{TP}{TP + FP} \\ Recall &= \frac{TP}{TP + FN} \\ F1\,score &= 2 \times \frac{Precision \times Recall}{Precision + Recall} \end{aligned}$$
(7)

where accuracy explains how well the architecture can classify the images, given as the ratio of correct predictions to the total number of predictions made. Precision is the ratio of correctly classified positive cases to all predicted positive cases; a high precision is required to minimize the number of false-positive classifications. Recall is the ratio of correctly classified positive cases to all actual positive cases, and the F1 score is the harmonic mean of precision and recall, which can also be viewed as their weighted average [15,16,17,18]. The performance comparison results in Tables 1 and 2 suggest that all applied classifiers perform well. Experimental results of the different models are visualized in Fig. 4, and the classification rates of the various models under different statistical methods are given in Fig. 5.
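A minimal Python sketch of Eq. (7), computing the four metrics from confusion-matrix counts; the function name and interface are ours, and the counts in the example call are illustrative only.

```python
def classification_metrics(tp, tn, fp, fn):
    # Eq. (7): accuracy, precision, recall, and F1 score from TP/TN/FP/FN counts
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# illustrative counts, not results from this study
print(classification_metrics(tp=941, tn=8900, fp=59, fn=60))
```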

Table 1 Performance measures of different lung cancer classification models
Table 2 Performance comparison of the classifiers
Fig. 4 Experimental results of the different classification models

Fig. 5 Classification rates of the different models using various statistical methods

4 Conclusion

In this paper, a new multi-scale approach to segment lung cancer in CT images was described and evaluated on several publicly available databases. This paper also presents an assessment of the most appropriate scales for lung cancer segmentation, complementing previous work that defines these scales empirically. Furthermore, it was demonstrated that a multi-scale analysis can improve lung cancer segmentation. Although recent research has focused on deep learning methods, rule-based methods can also be important for defining features that significantly improve the outcome of these methods. The achieved results show that the proposed approach is very competitive with current state-of-the-art methods, particularly on high-resolution images. Our method still needs further improvement in segmenting the enhancing tumor region, but it proved to be a practical tool for 3D lung cancer segmentation.