1 Introduction

Automatic diagnostic systems are an important application of database analysis and pattern recognition. They aim to support doctors in making diagnostic decisions [25], and they are used mainly to diagnose various types of cancer. Cancer is the second leading cause of death in the world and is expected to become the leading cause within a few years [1]. The classification of medical images is essential in the medical field: it is important for therapy preparation, identifying deformities, quantifying tissue volume to monitor tumor progress, and analysing anatomical structure. Manual classification of Computed Tomography (CT) images is a challenging and cumbersome task, and it is prone to error because of inter-observer variability. The resulting classifications can be substandard and lead to erroneous results. An automatic classification approach is therefore highly desirable, as it reduces the burden of manual work. The rule-based Adaptive Neuro-Fuzzy Inference System (ANFIS) has been employed in a variety of applications over the last decade, such as disease detection, data fusion [14], information security and trust management [2, 19]. ANFIS is one of the most extensively used neuro-fuzzy systems [13, 16]. In this research work, the neuro-fuzzy approach ANFIS is applied to tumor recognition and classification.

2 Literature Review

The underlying objective of this research is to discover useful knowledge from Computed Tomography (CT) images in order to support effective radiotherapy. This literature review presents the works that are relevant to that objective.

Hosseini et al. emphasized ANFIS as a classifier that overcomes the problems of fuzzy systems and neural networks [8]. Roy et al. explored an improved classifier with ANFIS for brain tumor tissue characterization. They used the Harvard benchmark dataset and obtained 98.25% accuracy for both contrast and non-contrast images [21]. Deshmukh et al. presented a computerized identification approach for MRI images based on neuro-fuzzy logic [4]. The iteration time and precision level are reported to improve by around 50–60% compared with the existing neuro classifier. Mishra et al. presented ANFIS and ANN for identifying tumor cells in the brain [18]. Sharma and Mukherji presented an image segmentation technique for locating brain tumors. GLCM is used for feature extraction in their work, and fuzzy rules and membership functions are defined from selected features using a hybrid genetic algorithm to increase accuracy. The adaptive network integrates the benefits of both fuzzy systems and neural networks for segmenting brain tumors from MRI images [23]. Selvapandian and Manivannan proposed fusion-based brain tumor detection and segmentation using ANFIS classification on the BRATS open data set [22]. Fathima Zahira M et al. proposed a novel segmentation and classification methodology using ANFIS for efficient classification [26]. Mahmud et al. gave a broad survey of the application of various learning techniques to biological data mining, presented several open source tools and specified their pros and cons [15, 17]. Kaiser et al. presented a fuzzy neural network based COVID-19 self-screening approach for determining whether employees may be admitted to the workplace. Their proposed system, iWorkSafe, can help in confirming social distancing parameters, with scores that reflect the fitness of the employees [13].

3 Tumor Stage Identification

Staging describes the severity of an individual’s cancer based on the size of the primary tumor and on the extent to which the cancer has spread in the body. It is essential for the following reasons:

  • Staging helps the physician plan a strategy and decide on the appropriate therapy.

  • The stage of the cancer can be used in estimating a person’s prognosis.

  • Knowing the cancer stage is important for identifying clinical trials that may be an appropriate treatment option for a patient.

Automatic diagnostic systems are a major application of database analysis and pattern recognition, and they aim to help medical experts make diagnostic assessments [25]. Automated diagnosis is mainly used to recognize cancer types. The classification of medical images is becoming increasingly significant in the medical domain, since it is vital for therapy preparation, identifying anomalies, quantifying tissue volume to track tumor progress, and studying anatomical structure. Manual classification of Computed Tomography images is challenging and time-consuming. Hence, an automatic diagnostic method is required to reduce the difficulties of the manual process. ANFIS is one of the most extensively used neuro-fuzzy systems. In this research work, the neuro-fuzzy approach ANFIS is applied to tumor recognition and classification.

4 Adaptive Neuro-Fuzzy Inference System (ANFIS) for Tumor Stage Classification

ANFIS is one of the most popular techniques applied in recent years. Proposed by Jang et al. [10], it combines two predictive analytics methods: the Neural Network (NN) and the Fuzzy Inference System (FIS). The aim of ANFIS is to combine the advantages of fuzzy systems and neural networks. The benefit of the fuzzy set is that prior knowledge can be expressed as a set of constraints that reduce the search space. ANFIS is a hybrid intelligent system that trains a Sugeno-type fuzzy inference system (FIS) using a combination of least squares estimation and back-propagation gradient descent. The classification precision of ANFIS is relatively higher than that of fuzzy and neural classifiers. Its convergence time is also better than that of the neural and fuzzy classifiers [7]. ANFIS is mainly used to optimize the parameters of fuzzy systems, replacing the manual tuning process.

Some benefits of ANFIS are:

  • It is mainly used in image segmentation to refine the fuzzy if-then rules.

  • It does not require manual intervention.

  • It allows many membership functions to be used.

  • It gives more accurate classification. Because the differences between tumor stages are small, ANFIS is used for precise tumor classification.

4.1 Architecture of ANFIS

ANFIS adjusts the parameters and structure of an FIS by applying neural learning rules. FIS can be categorized into three types. In this research work, the type 3 architecture, which uses Takagi and Sugeno’s fuzzy if-then rules, is used together with the triangular membership function illustrated in Fig. 1. The membership degree of an object is a value between 0 and 1 that indicates how strongly the object belongs to the fuzzy set. The membership function maps each input value to its membership value in the fuzzy set.

Fig. 1. Illustration of the triangular membership function.

ANFIS uses two sets of parameters: premise parameters for the membership functions and consequent parameters for the rules. Two fuzzy if-then rules are used to describe the ANFIS architecture.

$$\begin{aligned} R_1&:~\text {If}~p~\text {is}~A_1~\text {and}~q~ \text {is}~B_1,~\text {then}~f_1 = l_1p + m_1q + n_1 \\ R_2&:~\text {If}~p~\text {is}~A_2~\text {and}~q~\text {is}~B_2,~\text {then}~f_2 = l_2p + m_2q + n_2 \end{aligned}$$

where p and q are the inputs, \(A_i\) and \(B_i\) are the fuzzy sets, \(f_i\) are the outputs specified by the fuzzy rules within the fuzzy region, and \(l_i\), \(m_i\) and \(n_i\) are the design parameters determined during the training process.

Fig. 2. Structure of the ANFIS layers.

These two rules are used in the ANFIS architecture shown in Fig. 2, where a circle denotes a fixed node and a square denotes an adaptive node. The ANFIS has a five-layer architecture, as follows:

Layer 1: Every node in layer 1 is an adaptive node:

$$\begin{aligned} O_{1,i}&=\mu _{A_i}(p); \quad i = 1, 2\\ O_{1,i}&=\mu _{B_{i-2}}(q); \quad i = 3, 4 \end{aligned}$$

where \(\mu _{A_i}(p)\) and \(\mu _{B_{i-2}}(q)\) are fuzzy membership functions [9, 11]. The triangular membership function is defined by three parameters a, b and c:

$$\begin{aligned} \text {triangle}(x; a,b,c)= \left\{ \begin{matrix} 0, &{} x\le a\\ \frac{x-a}{b-a},&{} a\le x \le b\\ \frac{c-x}{c-b}, &{} b\le x \le c\\ 0, &{} c \le x \end{matrix}\right. \end{aligned}$$

An equivalent, more compact expression for the previous equation is:

$$\begin{aligned} \text {triangle}(x; a,b,c) = \max \left( \min \left( \frac{x-a}{b-a}, \frac{c-x}{c-b} \right) , 0 \right) \end{aligned}$$
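The compact max/min form translates directly into code. The following is a minimal Python sketch of it; the function name `triangle` and the example parameter values (a = 2, b = 3, c = 5) are illustrative assumptions and are not taken from the experiments in this work.

```python
def triangle(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership degree of x, with feet at a and c and peak at b."""
    if not a < b < c:
        raise ValueError("expected a < b < c")
    # max(min(rising edge, falling edge), 0), as in the compact expression.
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)


if __name__ == "__main__":
    # Hypothetical fuzzy set with feet at 2 and 5 and peak at 3.
    for x in (1.0, 2.0, 2.5, 3.0, 4.0, 5.0, 6.0):
        print(x, triangle(x, 2.0, 3.0, 5.0))
```

The membership degree is 0 outside the interval [a, c], rises linearly to 1 at b, and falls linearly back to 0 at c.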

Layer 2: The nodes in this layer are fixed and are labeled \(\Pi \). Each node multiplies its incoming signals, and its output is the firing strength of a rule:

$$\begin{aligned} O_{2,i} = w_i = \mu _{A_i}(p)\, \mu _{B_i}(q)~\text {for}~i = 1,2 \end{aligned}$$

Layer 3: The nodes in this layer normalize the firing strengths from layer 2 [20]. The outputs are called normalized firing strengths and are given by:

$$\begin{aligned} O_{3,i} = \overline{w_i} = \frac{w_i}{w_1+w_2}~\text {for}~i = 1,2 \end{aligned}$$

Layer 4: It contains adaptive nodes. The outputs of this layer are:

$$\begin{aligned} O_{4,i} = \overline{w_i}f_i = \overline{w_i}(l_ip+m_iq+n_i)~\text {for}~i = 1,2 \end{aligned}$$

Layer 5: This layer has a single fixed node, labeled \(\sum \), which sums all incoming signals. The overall output is:

$$\begin{aligned} O_{5,1}=\sum _{i} \overline{w_i}f_i=\frac{\sum _{i}w_if_i}{\sum _{i}w_i} \end{aligned}$$

Layer 1 has three modifiable parameters \(a_i\), \(b_i\), \(c_i\), called premise parameters, which are associated with the input membership functions. Layer 4 also has three modifiable parameters \(l_i\), \(m_i\), \(n_i\), called consequent parameters, which belong to the first-order polynomial. Hence, an adaptive network is constructed that is functionally equivalent to a type-3 fuzzy inference system [10, 24].
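To make the data flow through the five layers concrete, the following Python sketch evaluates one forward pass of the two-rule network for given premise and consequent parameters. It is an illustration under stated assumptions, not the implementation used in this work: the function `anfis_forward`, the type aliases and all numeric parameter values in the usage example are hypothetical, and no training step is included.

```python
from typing import Sequence, Tuple

Tri = Tuple[float, float, float]  # premise parameters (a, b, c)
Lin = Tuple[float, float, float]  # consequent parameters (l, m, n)


def triangle(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function with feet at a, c and peak at b."""
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)


def anfis_forward(p: float, q: float, A: Sequence[Tri], B: Sequence[Tri],
                  consequents: Sequence[Lin]) -> float:
    """One forward pass; rule i pairs A[i] with B[i] and consequents[i]."""
    # Layer 1: membership degrees of the two inputs.
    mu_a = [triangle(p, *abc) for abc in A]
    mu_b = [triangle(q, *abc) for abc in B]
    # Layer 2: firing strength w_i as the product of the two degrees.
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    total = sum(w)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    # Layer 3: normalized firing strengths.
    w_bar = [wi / total for wi in w]
    # Layer 4: normalized strength times the first-order polynomial f_i.
    o4 = [wb * (l * p + m * q + n) for wb, (l, m, n) in zip(w_bar, consequents)]
    # Layer 5: sum of all incoming signals.
    return sum(o4)


if __name__ == "__main__":
    # Hypothetical parameters: two overlapping triangles per input and
    # constant consequents f_1 = 1 and f_2 = 3.
    A = [(0.0, 2.0, 4.0), (2.0, 4.0, 6.0)]
    B = [(0.0, 2.0, 4.0), (2.0, 4.0, 6.0)]
    rules = [(0.0, 0.0, 1.0), (0.0, 0.0, 3.0)]
    print(anfis_forward(3.0, 3.0, A, B, rules))
```

In the usage example both rules fire with equal strength at p = q = 3, so the output lies midway between the two rule outputs.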

The structure of the ANFIS consists of two input nodes and ten output nodes. The two inputs are the height and width calculated from each lung tumor image slice. The triangular membership function is used. The outputs of the ten rules are condensed into a single output representing the lung cancer stage for a particular patient. The sets of premise and consequent parameters are the most significant features of the ANFIS architecture. The parameters that change the range of the membership function are called premise parameters, and the parameters that determine the output based on the condition are called consequent parameters. Two nonlinear parameters and ten linear parameters are used in the proposed ANFIS architecture: height and width are the premise parameters, and the stages from stage 1a to stage 3 are the consequent parameters. Fuzzy if-then rules are used to form the input to the ANFIS architecture [12].

5 NCCN Guidelines Version 2.0 Staging Non-Small Cell Lung Cancer

Lung cancer is the leading cause of cancer death worldwide, and delay in identification is a fundamental obstacle to improving lung cancer outcomes. Lung cancer causes 1.59 million deaths worldwide. The patient survival rate can be increased substantially if lung cancer is identified at an early stage. Males are affected by lung cancer more often than females, in a ratio of 5:1 [3].

6 The Tumor, Node, and Metastasized Staging System

Staging helps to choose the suggested therapy plan. Staging means finding out:

  • tumor location.

  • its size.

  • if and how much the lung cancer has spread.

The stage of a cancer describes how far it has spread in the human body. The International Association for the Study of Lung Cancer (IASLC) [5, 6] revised the international staging system. Accurate identification of the cancer stage is very important for selecting the appropriate treatment.

The TNM staging system is used to describe the development and extent of Non-Small Cell Lung Cancer (NSCLC).

  • T represents the size of the tumor and the places it affects.

  • N describes the spread of cancer to the lymph nodes, which are groups of immune cells in the human body.

  • M indicates whether the cancer has spread (metastasized) to other organs of the body.

6.1 Prediction of Stages in Lung Cancer

The stage grouping is formed from the T, N and M values to give the overall stages. Some stages are subdivided into two sub-stages, A and B. Cancers grouped into the same stage have a similar outlook and are treated with a similar approach.

  • Stage I: The cancer is confined to the lung and has not spread to nearby sites such as lymph nodes.

  • Stage II: The cancer is present in both the lung and nearby lymph nodes.

  • Stage III: This is an advanced stage. The cancer is located in the lung and in the lymph nodes in the middle of the chest. Stage III has two subtypes:

    • Stage IIIA: the affected lymph nodes are on the same side of the chest as the cancer.

    • Stage IIIB: the affected lymph nodes are on the opposite side of the chest.

  • Stage IV: The cancer has spread to both lungs or to another part of the body.

The NCCN Clinical Practice Guidelines for lung tumors are summarized in Table 1.

Table 1. Descriptor, T and M Categories, and Stage Grouping (NCCN Clinical Practice Guidelines in Oncology, 2012)

7 ANFIS Rules

The fuzzy if-then rules for cancer stage classification are outlined below.

if biopsy == p && stage_1a_height == 1 && stage_1b_height == 0 && stage_2a_height == 0 && stage_2b_height == 0 && stage_3_height == 0 then tumor_height_stage = 1;

if biopsy == p && stage_1a_height == 0 && stage_1b_height == 1 && stage_2a_height == 0 && stage_2b_height == 0 && stage_3_height == 0 then tumor_height_stage = 2;

if biopsy == p && stage_1a_height == 0 && stage_1b_height == 0 && stage_2a_height == 1 && stage_2b_height == 0 && stage_3_height == 0 then tumor_height_stage = 3;

if biopsy == p && stage_1a_height == 0 && stage_1b_height == 0 && stage_2a_height == 0 && stage_2b_height == 1 && stage_3_height == 0 then tumor_height_stage = 4;

if biopsy == p && stage_1a_height == 0 && stage_1b_height == 0 && stage_2a_height == 0 && stage_2b_height == 0 && stage_3_height == 1 then tumor_height_stage = 5;

if biopsy == p && stage_1a_width == 1 && stage_1b_width == 0 && stage_2a_width == 0 && stage_2b_width == 0 && stage_3_width == 0 then tumor_width_stage = 1;

if biopsy == p && stage_1a_width == 0 && stage_1b_width == 1 && stage_2a_width == 0 && stage_2b_width == 0 && stage_3_width == 0 then tumor_width_stage = 2;

if biopsy == p && stage_1a_width == 0 && stage_1b_width == 0 && stage_2a_width == 1 && stage_2b_width == 0 && stage_3_width == 0 then tumor_width_stage = 3;

if biopsy == p && stage_1a_width == 0 && stage_1b_width == 0 && stage_2a_width == 0 && stage_2b_width == 1 && stage_3_width == 0 then tumor_width_stage = 4;

if biopsy == p && stage_1a_width == 0 && stage_1b_width == 0 && stage_2a_width == 0 && stage_2b_width == 0 && stage_3_width == 1 then tumor_width_stage = 5;
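Each of the ten rules above fires when the biopsy condition holds and exactly one of the five stage flags is set, and the resulting stage index is the position of that flag. A compact Python sketch of this reading is given below. It rests on assumptions: `p` is treated as the string value "p", flags missing from the input are treated as 0, and the function name `stage_from_flags` is hypothetical.

```python
from typing import Dict, Optional

# Flag name prefixes in the order used by the rules (stage index 1..5).
STAGE_FLAGS = ("stage_1a", "stage_1b", "stage_2a", "stage_2b", "stage_3")


def stage_from_flags(biopsy: str, flags: Dict[str, int],
                     dimension: str) -> Optional[int]:
    """Return the stage index 1..5 for 'height' or 'width', or None if no rule fires."""
    if biopsy != "p":  # assumed encoding of the biopsy condition
        return None
    values = [flags.get(f"{name}_{dimension}", 0) for name in STAGE_FLAGS]
    # A rule fires only when exactly one flag is 1 and the other four are 0.
    if sorted(values) != [0, 0, 0, 0, 1]:
        return None
    return values.index(1) + 1


if __name__ == "__main__":
    example = {"stage_2a_height": 1, "stage_3_width": 1}
    print(stage_from_flags("p", example, "height"))  # third height rule
    print(stage_from_flags("p", example, "width"))   # fifth width rule
```

Writing the rules as a lookup keeps the height and width cases symmetric, which matches the way the two rule groups differ only in the suffix of the flag names.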

8 Experimental Results and Discussions

An automatic method for detecting the lung cancer stage from multiple CT image slices is presented, based on ANFIS rules. The experimental results show the lung tumor slices among all the slices of a patient. Six patients who had previously undergone CT scans for the treatment of lung cancer were selected for this study. The gross tumor volumes (GTVs), clinical target volumes (CTVs) and planning target volumes (PTVs) were contoured manually on all tumors by the radiation oncologist. Only the tumor slices of patient id 002 are presented here.

8.1 Patient Id: 002 – Slice No 63-74

The CT scan of patient id 002 consists of 103 slices. The tumor is present in slices 63 to 74.

Fig. 3. Slice-by-slice contours of GTV, CTV and PTV identified on CT slices 63 to 74 for a lung tumor patient.

Fig. 4. Contours of GTV, CTV and PTV identified on a CT slice for a lung tumor (slice no. 68).

In Fig. 4, red indicates the GTV, which is the primary tumor volume. Pink indicates the CTV, which is the GTV surrounded by a fixed or variable margin. Yellow indicates the PTV, which is the CTV plus a fixed or variable margin.

In this research, stage classification is considered only for lung tumors, because the staging of esophagus and rectum tumors is not based on size at all; it depends entirely on the depth of tumor invasion. The size parameter therefore cannot be correlated with staging, and the stage is not identified for esophagus and rectum tumors. In addition to the tumor volumes, a biopsy report is used as supporting information for this research. However, the aim of the research is to identify the tumor and its stage before the biopsy is performed.

Two parameters, height and width, are used to identify the stage. The tumor height and width are calculated from the number of pixels in each row and column of the tumor contour, with each pixel assigned a value of 0.1. The size of the tumor is therefore based on its height and width and is expressed in cm. According to Table 1, the stages are classified as stage IA, stage IB, stage IIA, stage IIB and stage III. In this research, only the primary tumor stage is determined; the lymph node categories N0, N1, N2 and N3 are not considered. The height and width are calculated for all tumor slices, and the average over all tumor slices is taken as the size of the GTV. In this research work, the lung tumor stage classification is reported for patient ids 002, 005, 006, 008, 013 and 014.
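The measurement step can be sketched in Python as follows. This is an illustration under several assumptions that the text does not confirm: each slice is given as a binary mask (1 inside the GTV contour), the height is taken as the largest count of tumor pixels in any column and the width as the largest count in any row, and the value 0.1 per pixel is interpreted as 0.1 cm. The names `slice_extent` and `gtv_size` are hypothetical.

```python
from statistics import mean
from typing import List, Sequence, Tuple

PIXEL_SIZE_CM = 0.1  # assumed interpretation of the value 0.1 per pixel

Mask = Sequence[Sequence[int]]  # binary mask, 1 = pixel inside the GTV contour


def slice_extent(mask: Mask) -> Tuple[float, float]:
    """Return (height_cm, width_cm) of the tumor region in one slice."""
    row_counts = [sum(row) for row in mask]          # tumor pixels per row
    col_counts = [sum(col) for col in zip(*mask)]    # tumor pixels per column
    width = max(row_counts, default=0) * PIXEL_SIZE_CM
    height = max(col_counts, default=0) * PIXEL_SIZE_CM
    return height, width


def gtv_size(masks: List[Mask]) -> Tuple[float, float]:
    """Average height and width over all tumor slices."""
    extents = [slice_extent(m) for m in masks]
    return mean(h for h, _ in extents), mean(w for _, w in extents)


if __name__ == "__main__":
    # Two tiny hypothetical slices, for illustration only.
    slice_a = [[0, 1, 1, 0],
               [0, 1, 1, 1]]
    slice_b = [[1]]
    print(gtv_size([slice_a, slice_b]))
```

Other definitions of extent, such as the bounding box of the contour, would fit the description equally well; the sketch only shows how per-slice values are averaged into a single GTV size.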

The CT scan of patient id 002 consists of 103 slices. The tumor is present in slices 63 to 74, as shown in Fig. 3. The height and width of the GTV are calculated for every slice, and the maximum, minimum and average values are used to identify the stage. For patient id 002, the tumor height is 11 cm and the width is 10 cm, giving an average of 10.5 cm. According to Table 1, this value exceeds 7 cm, which indicates T3, i.e., Stage III. The lymph nodes are not considered in this research. The width and height values for the other lung tumor patients are shown in Table 2, which lists the results obtained by the proposed method alongside the results of the radiation oncologist. The proposed method gives a very close result for patient ids 005, 006, 013 and 014 and exactly the same result for patient ids 002 and 008. The primary tumor (T) result is the same for all patients.
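The size-based decision for patient id 002 can be reproduced with a few lines of Python. Only the 7 cm cut-off for T3 is stated in the text. The remaining cut-offs (2, 3 and 5 cm) and the T1a to T2b labels are assumptions based on commonly used size descriptors and should be checked against Table 1. The function name `t_category` is hypothetical.

```python
from typing import Tuple

# (upper bound in cm, T label). Assumed size descriptors; only the
# "greater than 7 cm means T3" case is taken from the text.
T_THRESHOLDS = ((2.0, "T1a"), (3.0, "T1b"), (5.0, "T2a"), (7.0, "T2b"))


def t_category(height_cm: float, width_cm: float) -> Tuple[float, str]:
    """Return the average tumor size and the size-based T category."""
    size = (height_cm + width_cm) / 2
    for upper, label in T_THRESHOLDS:
        if size <= upper:
            return size, label
    return size, "T3"  # size exceeds 7 cm


if __name__ == "__main__":
    # Patient id 002: height 11 cm, width 10 cm.
    print(t_category(11.0, 10.0))  # (10.5, 'T3')
```

The sketch stops at the T category because the grouping of T categories into overall stages is defined by Table 1.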

Table 2. Comparison of Lung Tumor Stage Classification by the Radiation Oncologist and by proposed method with ANFIS

9 Conclusion

Medical diagnosis data is large and complex, and mining understandable knowledge from it is a difficult task in the medical domain. This research presents a knowledge discovery process that uses ANFIS rules to identify the cancer stage from the gross tumor volume (GTV). The proposed system yielded good results and can be used to diagnose cancer efficiently. Primary lung tumor detection has been discussed in terms of classification accuracy. The study found that the accuracy of the proposed method with ANFIS is 98%, which suggests that the system can help radiologists and radiation oncologists increase their diagnostic confidence. The tumor stage results indicate that this research is valuable for improving diagnosis and reducing the number of unnecessary biopsies. The system can be used as an intelligent tool by radiologists and radiation oncologists to help them make more reliable diagnoses.