1 Lung Cancer Overview

Non-small cell lung cancer (NSCLC) remains the leading cause of cancer-related mortality in men and women, with an overall 5-year survival rate of 19.3 % [1]. TNM staging, which uses tumor size, local invasion, and the presence of nodal and distant metastases, remains the prevailing method to predict patient survival, with 5-year stage-specific survival rates ranging from 73 % in stage IA disease to 13 % in stage IV disease [2]. In addition to these long-established clinical methods of predicting survival, newer prognostic tools based on individual tumor mutations and protein expression show great promise in providing additional personalized genetic information with the potential to revolutionize treatment algorithms and tumor classifications.

2 Lung Cancer Staging

Cancer staging systems provide a standardized framework to define a tumor’s spread so that homogeneous patient groups can be studied and discussed by different sources. The lung cancer staging system provides useful prognostic information for patients and structures treatment plans for providers. The current staging system developed by the International Association for the Study of Lung Cancer (IASLC) Lung Cancer Staging Project is the seventh widely used NSCLC staging system and is the first in NSCLC to be developed from an international patient database and to be internally and externally validated to significantly stratify patients based on survival outcomes.

2.1 History of Lung Cancer Staging

In the 1950s, the Veterans’ Affairs Lung Study Group introduced a two-stage system to classify lung cancer for use in clinical trials, which described patients as having either limited or extensive disease. The first TNM classification of lung cancer was introduced by the Union Internationale Contre le Cancer (UICC) now known as the International Union Against Cancer in 1966 as part of a series of brochures that proposed TNM descriptions for a variety of different organ sites. In 1968, lung TNM definitions were published under the section “other sites” in the UICC “TNM Classification of Malignant Tumors.” No stage groupings were suggested, and the TNM descriptors were used to simply convey the anatomic extent of the tumor: T1 for a tumor localized to one lung segment, T2 for a tumor confined to one lobe, T3 for a tumor involving the main bronchus or more than one lobe, T4 for tumors extending beyond the lung, and N1 to describe any involvement of intrathoracic lymph nodes [3].

In 1973, soon after the initial UICC proposal, the American Joint Committee for Cancer Staging and End Results Reporting (AJC), now the American Joint Committee on Cancer (AJCC) Task Force on Lung Cancer, proposed new data-driven TNM definitions and introduced stage groupings. The AJCC system, published by Clifton Mountain, David Carr, and W.A. Anderson, was based on 2155 surgical lung cancer specimens, mainly from MD Anderson Cancer Center in Houston, Texas, from patients with at least 4 years of follow-up data. This first AJCC system outlined the majority of the T descriptors still used today, including a size cutoff of 3 cm and invasion of the visceral and parietal pleura, chest wall, diaphragm, and mediastinum, and it added an N2 lymph node category to describe mediastinal lymph node involvement. Different TNM permutations were grouped into stages I, II, and III to allow for the maximum separation in survival outcomes between the groups [4]. While some TNM groupings had too few cases for analysis and there was no validation of the proposed stages, this data-driven publication represented a major step forward in NSCLC staging and laid the framework for the current staging system.

Based mainly on a growing number of patients in Dr. Mountain’s MD Anderson database and some from the National Cancer Institute (NCI), newer updated editions of the lung cancer staging system were published through 1997. These newer editions divided the T classification into subdivisions such as T1a and T1b, added N3 to accommodate contralateral or distant nodal metastases, further stratified TNM stage groupings into A and B, and added stage IV to describe metastatic disease. In addition, the descriptors “c,” “p,” “y,” and “r” were introduced to identify tumors staged clinically, pathologically, following treatment, and following recurrence, respectively. TNM groups were assigned to stages I to IV based on survival data, with statistically significant survival differences seen between different stage groups [5].

All of the AJCC staging system revisions continued to be based on Dr. Mountain’s database, which at the time of the last revision consisted of 5319 specimens. At the time, this was the largest collection of patient pathologic and survival information available, but relying on Dr. Mountain’s database was flawed in that the samples were mainly drawn from a single institution in the USA, some survival data were more than twenty years old, and none of the staging cutoffs were externally validated. In addition, the patient population reflected historic lung cancer demographics. Dr. Mountain’s original staging study patients were mostly male, and the database contained 1712 cases of NSCLC, of which 30 % were adenocarcinoma and 58 % were squamous cell carcinoma [4]. Since that time, the histologic prevalence of lung cancer had shifted, major advances in imaging had dramatically changed the way lung cancer was diagnosed and staged, and new chemotherapy regimens and radiation treatments had evolved.

The IASLC Lung Cancer Staging Project was an unprecedented international effort to revise the staging system to reflect a global patient population, all treatment modalities of care, and current survival outcomes. The IASLC Staging Project led to the 2010 adoption of the seventh and current edition of the TNM staging in lung cancer, which is based on 81,015 international lung cancer cases including 67,725 NSCLC, 13,290 small cell lung cancer (SCLC), and 513 carcinoid tumors. The IASLC staging system represents a milestone in accurate and scientifically based lung cancer staging as it underwent extensive internal and external validation and resulted in modified T and M categories and updated stage groupings to reflect the most current survival data.

2.2 Non-small Cell Lung Cancer Staging

NSCLC is the broad grouping of primary lung tumors including adenocarcinoma, squamous cell carcinoma, and large cell neuroendocrine carcinoma, which combined comprise 85–90 % of all newly diagnosed lung and bronchus tumors [1]. Adenocarcinoma is the most common form of NSCLC and of lung cancer overall, accounting for about 50 % of NSCLC and 38 % of newly diagnosed lung cancers. Squamous cell carcinoma has slowly been decreasing in incidence and is currently the second most common NSCLC. Recent Surveillance, Epidemiology, and End Results (SEER) cancer registry data indicate that it accounts for 30 % of newly diagnosed NSCLC in men and 20 % of new diagnoses in women [6, 7].

2.2.1 IASLC NSCLC TNM Descriptors

The result of the IASLC Staging Project was the seventh edition of the UICC/AJCC TNM system for NSCLC. The TNM system is used to stage most cancers and describes the anatomic spread of a tumor. In it, the T descriptor describes the extent of the primary tumor, the N descriptor reflects the extent of lymph node involvement, and the M descriptor defines spread to distant sites.

The T descriptor in most cases is determined by tumor size as measured by the greatest dimension on computerized tomography (CT) imaging with T1a ≤ 2 cm, T1b > 2 but ≤ 3 cm, T2a > 3 but ≤ 5 cm, T2b > 5 but ≤ 7 cm, and T3 > 7 cm [8]. There is debate and no official consensus regarding how to measure semisolid lesions or ground glass opacities with a solid component as the measurable tumor dimensions change when viewed on a lung or a mediastinal window [9]. In our practice, both measures are reported, but we base our clinical T stage on the measured solid component. For the pathologic T stage, tumors should be measured prior to fixation to determine the greatest diameter as fixation in formalin can cause up to 20 % shrinkage in tumor size [10]. Beyond size criteria alone, direct invasion of nearby structures can increase a tumor’s T stage. T2 is used to describe tumors that invade the visceral pleura, involve the main bronchus but remain ≥2 cm away from the carina, or cause atelectasis or obstructive pneumonia that does not involve the entire lung. T3 tumors directly invade the chest wall, diaphragm, phrenic nerve, mediastinal pleura, or parietal pericardium, and T3 also describes a tumor in the main bronchus <2 cm from the carina, a tumor causing atelectasis or obstructive pneumonia of the entire lung, or a separate tumor nodule(s) in the same lobe. T4 describes a tumor of any size with invasion of the heart, great vessels, trachea, recurrent laryngeal nerve, esophagus, vertebral body, or carina, or a separate tumor nodule(s) in a different ipsilateral lobe [8]. Pancoast tumors are classified as T3 if they invade thoracic nerve roots and as T4 if they invade C8 or higher cervical nerve roots, the brachial plexus, subclavian vessels, vertebral bodies, lamina, or the spinal canal [9] (Table 1).
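The size cutoffs above can be written as a small lookup. The following is a minimal illustrative sketch in Python that covers the size criterion only; the function name `t_descriptor_by_size` is hypothetical, and invasion, bronchial involvement, atelectasis, and separate nodules, which can raise the T descriptor, are deliberately left out.

```python
def t_descriptor_by_size(size_cm: float) -> str:
    """Return the seventh edition T descriptor implied by tumor size alone.

    size_cm is the greatest tumor dimension in centimeters. Invasion and
    other non-size criteria are NOT considered in this sketch.
    """
    if size_cm <= 0:
        raise ValueError("tumor size must be a positive number of centimeters")
    if size_cm <= 2:
        return "T1a"  # <= 2 cm
    if size_cm <= 3:
        return "T1b"  # > 2 but <= 3 cm
    if size_cm <= 5:
        return "T2a"  # > 3 but <= 5 cm
    if size_cm <= 7:
        return "T2b"  # > 5 but <= 7 cm
    return "T3"       # > 7 cm
```

For example, a 3.0 cm tumor maps to T1b because the 3 cm boundary is inclusive, whereas a 3.1 cm tumor maps to T2a.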

Table 1 Seventh edition TNM staging system for NSCLC

Pleural invasion, particularly the presence of tumor at the surface of the visceral pleura, has been an indicator of a poor prognosis since the early systems for lung cancer staging were established [4]. In subsequent years, the definition of what constitutes invasion has been interpreted differently by different clinicians and varied from gross pleural puckering to histologic confirmation of tumor on the visceral pleural surface. Several studies showed that there was a significant survival difference in patients when tumor crossed the visceral pleural elastica [11, 12]. Additionally, cases with invasion across the visceral pleural elastica showed similar prognoses as those with tumor at the visceral pleural surface. Pleural invasion can be classified using histologic criteria put forth by Hammar [13]. Using these criteria, a tumor can be classified as PL0 (no invasion), PL1 (invasion through visceral pleural elastica), PL2 (tumor present at surface of visceral pleura), and PL3 (tumor invades into parietal pleura) (Fig. 1). Tumors with visceral pleural invasion (PL1 and PL2) are classified as T2a unless other factors result in a higher designation. Tumors with parietal pleural invasion (PL3) are classified as T3 unless other factors result in a higher designation [14].
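The effect of pleural invasion on the T descriptor can be expressed as a floor: PL1 and PL2 raise a tumor to at least T2a, and PL3 raises it to at least T3, while a descriptor that is already higher is left unchanged. The sketch below is illustrative only; the names `T_ORDER`, `PL_MINIMUM_T`, and `adjust_t_for_pleural_invasion` are hypothetical.

```python
# T descriptors in increasing order of severity (seventh edition).
T_ORDER = ["T1a", "T1b", "T2a", "T2b", "T3", "T4"]

# Minimum T descriptor implied by each pleural invasion category.
# PL0 (no invasion) imposes no minimum.
PL_MINIMUM_T = {"PL0": None, "PL1": "T2a", "PL2": "T2a", "PL3": "T3"}


def adjust_t_for_pleural_invasion(current_t: str, pl: str) -> str:
    """Return the higher of current_t and the minimum T implied by pl."""
    if current_t not in T_ORDER:
        raise ValueError(f"unknown T descriptor: {current_t}")
    if pl not in PL_MINIMUM_T:
        raise ValueError(f"unknown pleural invasion category: {pl}")
    floor = PL_MINIMUM_T[pl]
    if floor is None:
        return current_t
    # Pick whichever descriptor ranks later in T_ORDER.
    return max(current_t, floor, key=T_ORDER.index)
```

With this mapping, a 1.5 cm tumor (T1a by size) with PL1 invasion becomes T2a, while a T2b tumor with PL2 invasion stays T2b.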

Fig. 1

Pleural invasion. Hematoxylin and eosin (H&E) stain 100x (a) and Verhoeff-van Gieson (VVG) stain 100x (b) of a lung adenocarcinoma invading the visceral pleura. The visceral pleural surface is seen in the top left corner inked blue with the tumor invading from the bottom of the image. Multiple tumor deposits (orange arrow) can be seen approaching the visceral pleural surface on the H&E stain. The VVG stain allows visualization of the visceral pleural elastica layer (black arrows). The presence of the tumor deposit (orange arrow) superficial to the visceral pleural elastica layer stages this tumor as PL1 pleural invasion

Nodal involvement is characterized by the N descriptor, with N0 indicating no nodal involvement. N1 is defined by tumor metastasis or direct extension into ipsilateral peribronchial or perihilar lymph nodes and intrapulmonary nodes, representing lymph node stations 10–14. N2 describes tumor metastasis or direct extension into ipsilateral mediastinal or subcarinal lymph nodes, representing lymph node stations 2–9. N3 status reflects metastasis into contralateral mediastinal or contralateral hilar nodes, ipsilateral or contralateral scalene nodes, or station 1 supraclavicular nodes. Extrathoracic nodal involvement, such as a positive axillary lymph node, is classified as M1b [8] (Table 1). The lymph node stations and radiographic borders defined by IASLC are shown in Fig. 2.
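The station ranges above can be sketched as a function that returns the highest N descriptor among the involved nodes. This is an illustrative simplification, not a clinical tool: the function name `n_descriptor` and the `(station, side)` input format are assumptions, station 7 (subcarinal) is treated as N2 regardless of side, scalene and extrathoracic nodes are not modeled, and contralateral stations 11–14 are rejected because the text does not classify them.

```python
def n_descriptor(positive_nodes):
    """Return "N0".."N3" for an iterable of (station, side) pairs.

    station is an integer from 1 to 14; side is "ipsilateral" or
    "contralateral" relative to the primary tumor.
    """
    rank = 0  # no positive nodes means N0
    for station, side in positive_nodes:
        if side not in ("ipsilateral", "contralateral"):
            raise ValueError(f"unknown side: {side}")
        if station == 1:
            node_rank = 3  # supraclavicular: N3 on either side
        elif station == 7:
            node_rank = 2  # subcarinal: N2 (assumed side-independent)
        elif 2 <= station <= 9:
            # mediastinal: N2 if ipsilateral, N3 if contralateral
            node_rank = 2 if side == "ipsilateral" else 3
        elif station == 10:
            # hilar: N1 if ipsilateral, N3 if contralateral
            node_rank = 1 if side == "ipsilateral" else 3
        elif 11 <= station <= 14:
            if side != "ipsilateral":
                raise ValueError("contralateral stations 11-14 are not covered")
            node_rank = 1  # intrapulmonary and peribronchial: N1
        else:
            raise ValueError(f"unknown station: {station}")
        rank = max(rank, node_rank)
    return f"N{rank}"
```

For instance, positive ipsilateral stations 11 and 4 together yield N2, because the highest-ranking involved station determines the descriptor.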

Fig. 2

Non-small cell lung cancer lymph node stations. a International Association for the Study of Lung Cancer (IASLC) lymph node station map, stations, and CT scan. b Application of the IASLC lymph node stations and borders to CT scans. Reproduced from Rusch et al. [101] with permission

Micrometastases as defined by UICC and AJCC contain cancerous cells with mitoses and invasion and can be seen on standard hematoxylin and eosin staining. Micrometastases in lymph nodes should be considered a positive node and described as N2 (mi). However, isolated tumor cells, which are differentiated as being small clusters of tumor cells without mitosis, vascular invasion, or lymphatic invasion, should not be counted as a positive metastasis [9].

The M descriptor relates to distant metastatic disease and is divided into M1a and M1b. M1a describes tumors with a separate tumor nodule in a contralateral lobe, pleural nodules, or malignant pleural dissemination. M1b describes metastases to distant sites and extrathoracic organs (Table 1). The same rules regarding nodal micrometastases and isolated tumor cells apply to M staging [8].

The TNM stage grouping scheme was adjusted in the seventh edition of the staging guidelines to best separate survival outcomes between stages [2]. Stage IA includes tumors up to 3 cm with no lymphatic spread. These early-stage tumors are managed with surgical resection alone. Adjuvant treatments are not indicated in stage IA NSCLC, and patients are followed with surveillance CT scans. Stage IB tumors measure between 3 and 5 cm or have other criteria to make them T2, such as invasion of the visceral pleura or involvement of the main bronchus, without any lymphatic spread. Stage IIA tumors are any T1 or T2 tumor with N1 nodal involvement or a tumor between 5 and 7 cm in size without nodal involvement. A 5- to 7-cm tumor with positive N1 nodal involvement becomes stage IIB. Stage IIB also includes T3 tumors without nodal involvement such as tumors greater than 7 cm, tumors that invade the chest wall, diaphragm, or mediastinal pleura, tumors that involve the main bronchus, or a separate tumor nodule in the same lobe [2]. Stage IB-IIB tumors without nodal involvement amenable to complete surgical resection can be managed with upfront surgery followed by adjuvant therapy. Tumors that are locally invasive or have suspected N1 disease should undergo neoadjuvant chemotherapy to attempt to reduce and downstage tumors prior to surgical resection.

Stage IIIA tumors are the most heterogeneous group with a wide range of presentations from smaller tumors with mediastinal nodal involvement to large and locally invasive tumors. This stage grouping includes any T1, T2, and T3 tumor with N2 nodal involvement and T3 tumors with N1 nodal involvement. Stage IIIA also includes T4 tumors with invasion of the great vessels or heart or with a separate ipsilateral tumor nodule with N0 or N1 lymph nodes [2]. It is challenging to develop rigid treatment algorithms for stage IIIA patients due to the diversity of tumors, and as a result, treatment plans for IIIA patients should be discussed by a multidisciplinary tumor board. Survival outcomes can vary widely within this complex group depending on the presence of mediastinal nodal disease, the tumor’s response to neoadjuvant chemotherapy and/or radiation, and the pulmonary operation required to achieve a complete resection [15].

Advanced-stage lung cancers, where a surgical resection for local control no longer offers a survival advantage, include stage IIIB and IV tumors. Stage IIIB tumors are T4 tumors which invade the heart, great vessels, trachea, or other major nearby structures or T4 tumors with a separate ipsilateral tumor nodule with N2 lymph node involvement, or any tumor with N3 lymph node involvement of contralateral mediastinal, contralateral hilar, ipsilateral or contralateral scalene, or supraclavicular nodes. Stage IV disease comprises tumors with any M1 distant metastases including separate tumor nodules in a contralateral lobe, pleural nodules, malignant effusion, or metastasis to an extrathoracic organ [2] (Table 2).
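The stage groupings described in this section can be summarized as a short decision cascade over the T, N, and M descriptors. The sketch below is an illustrative reading of the text rather than a clinical tool; the function name `stage_group` and the string encodings of the descriptors are assumptions.

```python
VALID_T = {"T1a", "T1b", "T2a", "T2b", "T3", "T4"}
VALID_N = {"N0", "N1", "N2", "N3"}
VALID_M = {"M0", "M1a", "M1b"}


def stage_group(t: str, n: str, m: str) -> str:
    """Map seventh edition T, N, and M descriptors to a stage group."""
    if t not in VALID_T or n not in VALID_N or m not in VALID_M:
        raise ValueError(f"unrecognized descriptor in {t} {n} {m}")
    if m != "M0":
        return "IV"        # any M1a or M1b disease
    if n == "N3":
        return "IIIB"      # any T with N3
    if t == "T4":
        return "IIIB" if n == "N2" else "IIIA"   # T4 N0-1 is IIIA
    if n == "N2":
        return "IIIA"      # T1-T3 with N2
    if t == "T3":
        return "IIIA" if n == "N1" else "IIB"
    if t == "T2b":
        return "IIB" if n == "N1" else "IIA"
    if t == "T2a":
        return "IIA" if n == "N1" else "IB"
    # Remaining cases are T1a or T1b with N0 or N1.
    return "IIA" if n == "N1" else "IA"
```

For example, `stage_group("T2b", "N1", "M0")` returns `"IIB"`, matching the statement that a 5- to 7-cm tumor with N1 involvement is stage IIB.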

Table 2 Seventh edition NSCLC staging definitions

TNM staging can be assessed at multiple time points, and tumors can be downstaged by treatment or upstaged as disease progresses. The type of staging classification is denoted by a prefix, with clinical stage and pathologic stage being the two most commonly used types. Clinical stage, denoted by a “c” prefix, refers to staging based on exam, imaging, biopsy, and surgical staging done prior to any treatment. Pathologic staging, denoted by a “p” prefix, is the gold standard and is based on the surgical specimen and information obtained from a definitive surgical resection. The seventh edition of the NSCLC staging system allows clinical and pathologic classifications to be applied to T, N, and M individually when only partial information is available [16]. After induction treatment, staging or restaging is denoted with the prefix “y,” which can be further described as “yc” or “yp.” Staging done after a recurrence has developed is denoted by the prefix “r,” and staging done postmortem based on an autopsy is denoted by the prefix “a” [9] (Table 3).
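As an illustration of how these prefixes combine, the sketch below splits a staging label into its prefix meanings and the TNM portion. The compact label format (for example "ypT2aN0M0") and the names `PREFIX_MEANINGS` and `describe_prefix` are assumptions made for this example and are not defined by the staging system text above.

```python
import re

# Meaning of each single-letter staging prefix described in the text.
PREFIX_MEANINGS = {
    "c": "clinical",
    "p": "pathologic",
    "y": "after induction treatment",
    "r": "after recurrence",
    "a": "autopsy",
}

# Zero or more prefix letters, followed by the TNM portion starting at "T".
_LABEL = re.compile(r"^(?P<prefix>[cpyra]*)(?P<tnm>T.+)$")


def describe_prefix(label: str):
    """Return (list of prefix meanings, TNM portion) for a staging label."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized staging label: {label}")
    meanings = [PREFIX_MEANINGS[letter] for letter in match.group("prefix")]
    return meanings, match.group("tnm")
```

Under these assumptions, `describe_prefix("ypT2aN0M0")` returns `(["after induction treatment", "pathologic"], "T2aN0M0")`.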

Table 3 Staging modifiers

The most recent NCI SEER data on 48,315 annual cases of lung cancer show the following NSCLC incidence by stage: stage IA 11.7 %, stage IB 6.1 %, stage IIA 3.6 %, stage IIB 3.7 %, stage IIIA 11.7 %, stage IIIB 5.6 %, stage IV 49.3 %, with 1.5 % being occult and 5.1 % of cases with stage unknown. SEER data indicate that over the past few years, there has been a steady rise in the incidence of smaller, early-stage tumors, and particularly stage IA lesions which has been attributed to the increasing use of chest CT scans and increasing detection of incidental lung lesions [17].

2.2.2 Synchronous Tumor Nodules

The most important distinction in approaching additional pulmonary nodules in the setting of a primary lung cancer is determining whether they represent a separate primary lung cancer, an isolated pulmonary metastasis, or multifocal lung cancer. IASLC guidelines give the pathologist primary responsibility for determining when nodules represent a synchronous primary lung cancer or a pulmonary metastasis [5]. This distinction was historically more difficult to make as most synchronous primary lung cancers have the same histologic type [18]. However, the current era of rapid tumor mutation profiling will likely simplify this process though mutation profiling for this exact purpose has yet to be validated.

The distinction between a metastatic single lung cancer and two separate early-stage cancers dramatically alters the clinical stage and patient management, so this determination ideally needs to occur prior to a surgical resection. For this reason, others have recommended [9], and it is our practice to discuss these complex patients with an experienced multidisciplinary tumor board before defining lesions as synchronous primary lung cancers with separate TNM staging and treatment plans.

The most widely known criteria for histologic differentiation of synchronous primaries from intrapulmonary metastases are those proposed by Martini and Melamed [19]. According to these criteria, tumors of similar histology are categorized as synchronous primaries if they are in different segments, lobes, or lungs, show a component of carcinoma in situ, and lack both intralymphatic tumor in shared lymphatics and extrapulmonary metastasis. At the time of the publication of the criteria of Martini and Melamed, the majority of the tumors evaluated were squamous cell carcinoma, and the diagnosis of adenocarcinoma in situ had not been accepted in lung tumors. More recently, Girard et al. [20] presented a method of comprehensive histologic assessment to compare separate nodules to determine whether they represent synchronous primaries or metastases. This method evaluates tumor histologic type (e.g., adenocarcinoma, squamous cell carcinoma), histologic pattern and percentage breakdown of pattern for adenocarcinomas (e.g., lepidic, acinar), and stromal and cytologic features (e.g., lymphoid hyperplasia, signet ring cells). Comprehensive histologic assessment correlated well with molecular profiling and showed prognostic accuracy when staging patients. Patients determined to have intrapulmonary metastases within the same lobe have survival outcomes similar to patients with solitary tumors designated as T3. Patients with ipsilateral metastases to a different lobe show survival outcomes similar to patients with solitary tumors designated as T4. Patients with contralateral metastases are designated as M1a [14].

2.3 Pulmonary Carcinoid Tumor Staging

As part of the IASLC Lung Cancer Staging Project, 513 carcinoids were submitted to the international lung cancer database used to define the NSCLC TNM stage groups. These tumors were excluded from the NSCLC analysis and were not used in creating the new TNM categories; however, subsequent review of the IASLC data as well as SEER data has demonstrated that the T, N, and M categories as well as the TNM groupings for NSCLC are also significant predictors of survival when applied to pulmonary carcinoid tumors [21].

SEER data on 1437 pulmonary carcinoid tumors indicate that carcinoids are diagnosed at an earlier stage than NSCLC with the following incidence: stage IA 57 %, stage IB 22 %, stage IIA 9 %, stage IIB 3 %, stage IIIA 6 %, stage IIIB <1 %, and stage IV 3 % [21]. Overall, carcinoid tumors have a better prognosis stage-for-stage than NSCLC with 5-year survival rates of 93, 85, 75, and 57 % in stage I, II, III, and IV tumors, respectively. As with NSCLC, older age and male sex are significantly associated with worse survival [21]. While SEER data analysis of the TNM staging system did not distinguish between typical and atypical carcinoid tumors, typical carcinoids have a better prognosis than atypical carcinoids. Long-term survival data show 5-year and 10-year overall survival rates of 97 and 90 % for typical carcinoids and 71 and 62 % for atypical carcinoids, respectively [22].

2.4 Small Cell Lung Cancer Staging

Small cell lung cancer (SCLC) is characterized by rapid doubling time, early development of widespread metastases, and markedly worse survival outcomes than NSCLC [23]. SCLC has been decreasing in incidence with current NCI data indicating that SCLC currently comprises only 10 % of all new lung cancer diagnoses [1]. While most patients with SCLC will initially respond to chemotherapy and radiation, disease recurrence remains a major problem [24]. Outcomes have remained poor over the past several decades and only 4.6 % of all patients are still alive two years following diagnosis [25].

Over 60 % of SCLC patients present with overt metastatic disease and almost all of the remaining 35–40 % have locally advanced disease not amenable to surgical resection. Therefore, the classic TNM staging systems, based on pathologic confirmation from a surgical specimen, were historically considered neither practical nor clinically useful in these advanced-stage patients. Instead, a modification of the original VALSG two-stage lung cancer staging system was widely used with SCLC patients described as having “limited” or “extensive” disease which corresponded to TNM stages I-IIIB and stage IV, respectively.

Patients with “limited” disease were described as tumors confined to the ipsilateral hemithorax and regional nodes, which could be included in a single radiation treatment field. Such SCLC patients are generally treated with curative-intent chemoradiation and chemotherapy. In these favorable “limited” disease patients, there is still only a 10 % 5-year survival rate [25]. “Extensive” disease in SCLC is the stage IV equivalent and is defined as tumor beyond the boundaries of limited disease including distant metastasis, malignant pericardial or pleural effusion, or contralateral hilar or supraclavicular involvement. In this group, there are no long-term survivors.

While the two-stage system is still widely used in SCLC, the seventh edition of the TNM system that was developed for use in NSCLC successfully applies to SCLC. The IASLC staging project collected data on 12,620 cases of SCLC and had sufficient data to apply the new NSCLC TNM criteria to 8088 of them. SCLC clinical TNM stage data were used instead of pathologic TNM data as only 5 % of SCLC patients are eligible for surgery, and therefore, pathologic TNM stage cannot be obtained from the majority of patients. The TNM staging system predicted survival for SCLC patients with significantly worse outcomes among patients with increasing cT stage. There was no significant difference in survival between patients with cN0 versus cN1 nodal stage; however, cN2 and cN3 disease did correlate with progressively worse survival. Increased TNM stage groupings were generally associated with worse outcomes, with median survival times of 30 months in stage IA, 18 months in stage IB, 33 months in stage IIA, 18 months in stage IIB, 14 months in stage IIIA, 12 months in stage IIIB, and 7 months in stage IV patients. Five-year overall survival rates were as follows: stage IA 38 %, stage IB 21 %, stage IIA 38 %, stage IIB 18 %, stage IIIA 13 %, stage IIIB 9 %, and stage IV 1 %. Of note, outcomes in stage IIA patients were slightly off trend as N0 versus N1 nodal status was not shown to be an important distinction in SCLC [26].
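The off-trend behavior of stage IIA can be checked directly from the figures quoted above. The following minimal sketch tabulates the reported SCLC values and lists any stage whose value exceeds that of the preceding, lower stage; the names `MEDIAN_SURVIVAL_MONTHS`, `FIVE_YEAR_SURVIVAL_PCT`, and `off_trend_stages` are hypothetical.

```python
# SCLC outcomes by clinical stage group as quoted in the text [26].
MEDIAN_SURVIVAL_MONTHS = {
    "IA": 30, "IB": 18, "IIA": 33, "IIB": 18, "IIIA": 14, "IIIB": 12, "IV": 7,
}
FIVE_YEAR_SURVIVAL_PCT = {
    "IA": 38, "IB": 21, "IIA": 38, "IIB": 18, "IIIA": 13, "IIIB": 9, "IV": 1,
}


def off_trend_stages(outcome_by_stage):
    """Return stages whose outcome is better than the preceding stage.

    Relies on the dictionary listing stages in increasing order.
    """
    stages = list(outcome_by_stage)
    return [
        current
        for previous, current in zip(stages, stages[1:])
        if outcome_by_stage[current] > outcome_by_stage[previous]
    ]
```

Applied to either table, the function returns `["IIA"]`, the only stage group that breaks the otherwise decreasing sequence.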

3 Prognosis

The international database created for the IASLC lung cancer staging project led to data-driven and extensively validated T, N, and M categories with significant differences in survival. TNM stage remains the most important factor in survival prognostication; however, heterogeneity in outcomes within the same TNM group suggests that other clinical or molecular prognostic markers should be developed and used to further refine risk stratification. Here, we present a review of the current evidence behind the major prognostic factors in NSCLC.

3.1 Stage-Based Survival Outcomes

The IASLC database of 81,015 eligible cases yielded prognostic information on median survival times and overall 5-year survival rates by T, N, M, and overall stage from the largest collection of patient data ever available. Clinical and pathologic stage-based survival estimates are available and often provide different prognostic estimates. For example, tumors staged clinically as cT1a or cT1b have 5-year survival rates of 53 and 47 %, respectively, whereas pathologically staged pT1a and pT1b tumors have 5-year survival rates of 77 and 71 %, reflecting that early-stage tumors are often clinically under-staged. The presence of any nodal spread is a poor prognostic indicator. The clinical presence of cN1 nodal disease is associated with a 67 % 1-year survival rate and a 29 % 5-year survival rate. Pathologically staged pN1 nodal disease has a slightly better prognosis at 77 % 1-year survival and 38 % 5-year survival, reflecting how the inclusion of patients with micrometastatic nodal disease in the pathologically staged group will upstage the same patient, and therefore increase survival rates in the pathologically staged group. Any M1 categorization by malignant pleural effusion, contralateral nodule, or distant disease was associated with 5-year survival rates of less than 6 % [8]. Detailed information on survival by each T, N, and M descriptor from the IASLC database from the original Detterbeck et al. study is reproduced in Table 4.

Table 4 Survival by T, N, and M descriptor

TNM stage groupings currently provide the most accurate prognostic estimate of overall survival. Pathologically staged stage IA patients have a median survival time of 119 months or almost 10 years, and a 5-year overall survival rate of 73 %. This compares to a 46 % 5-year survival rate among stage IIA patients and a 24 % 5-year survival rate among IIIA patients (Table 5; Fig. 3) [2]. However, there is obvious heterogeneity within each stage group with some patients rapidly developing systemic disease and others surviving long term without recurrence. There is great interest in identifying the clinical characteristics and tumor biologic markers that might be used to pinpoint more personalized and accurate survival outcomes within each TNM stage group.

Table 5 Prognosis by TNM stage group
Fig. 3

NSCLC 5-year overall survival rates. a Overall 5-year survival by clinical stage. From Goldstraw et al. with permission. b Overall 5-year survival by pathologic stage. From Goldstraw et al. [2] with permission

3.2 Clinical and Demographic Prognostication

In addition to TNM stage, other factors that have been shown to have prognostic value include tumor grade, sex, age over 65 years, smoking status, performance status, comorbidities, type of pulmonary resection, and hospital case volume [27]. A Mayo Clinic review of 5018 NSCLC patients found that following TNM stage, the most important prognostic factor was tumor grade, with a 70 and 80 % higher risk of death for poorly differentiated and undifferentiated carcinomas, respectively, after controlling for age, sex, smoking history, tumor stage, histologic cell type, and treatment [28].

In addition to poorly differentiated tumor grade, prognostic factors shown in multiple studies to be independently associated with worse long-term survival include male gender, increased age, high pT stage, and poor performance status [28–30]. Histologic subtype has often been cited as a prognostic factor in NSCLC with improved survival outcomes in patients with squamous cell histology [31]. However, repeated multivariate analyses have failed to identify histologic subtype as an independent prognostic marker [27]. Comorbid diseases at the time of diagnosis have been shown to independently decrease survival rates, and a Charlson comorbidity score ≥3 is associated with an 80 % increased risk of death at 1 year [32]. Specifically, cardiovascular comorbidities have been shown to increase NSCLC risk of death by 30 %, diabetes increases mortality by 20 %, cerebrovascular disease increases mortality by 20 % [33], and a history of chronic obstructive pulmonary disease (COPD) decreases 5-year survival by 20 % [34]. In a recently published study of 394 patients with advanced NSCLC, the median survival was only 7.8 months, and on multivariate analysis, only performance status was a significant prognostic factor that influenced survival [35]. Smoking cessation following diagnosis with early-stage NSCLC has also been shown in meta-analysis to improve prognosis. Patients who continued to smoke following NSCLC diagnosis had increased mortality (HR 2.94, 95 % CI 1.15–7.54) compared with patients who stopped smoking after diagnosis [36].

A review of 19,702 stage I NSCLC cases from the California Cancer Registry found that advanced age, male sex, low socioeconomic status, non-surgical treatment, and poor histologic grade were associated with increased mortality, whereas bronchoalveolar carcinoma histology and Asian ethnicity were associated with decreased mortality [37]. Unmarried patients and patients with lower socioeconomic status with early-stage NSCLC are less likely to undergo surgery. Lower socioeconomic status is associated with other potential prognostic factors including male sex, unmarried status, squamous cell histology, poorly differentiated tumors, fewer surgical resections, and fewer treatments overall in NSCLC. When these other factors are controlled for on multivariate analysis, low socioeconomic status remains an independent poor prognostic factor [38].

Long-term NSCLC survival data beyond 5 years from the SEER database demonstrate that patients still alive at 5 years can expect long-term overall survival rates of 55.4, 33.1, and 24.3 % at 10, 15, and 18 years, respectively, and disease-specific survival rates of 76.6, 65.4, and 59.4 % at 10, 15, and 18 years, respectively. Significant predictors of improved long-term disease-specific survival after 5 years include tumor size < 3 cm, age < 60 years, female gender, right-sided tumor, non-squamous histology, and having undergone lobectomy or pneumonectomy. Poor predictors of long-term survival beyond 5 years include squamous cell histology and having a pulmonary wedge resection or no surgery at all [39].

For patients who undergo a surgical resection, hospital case volume has been shown repeatedly to impact prognosis. A study of 119,146 NSCLC patients from the National Cancer Database found that among patients who underwent surgical resection, 30-day mortality was highest among patients who required a pneumonectomy (8.5 %), older patients (age > 85, 7.1 %), male patients (4.4 %), and patients with increasing comorbidities (Charlson score ≥ 2, 5.0 %). Hospital case volume was also a significant independent predictor of 30-day mortality, with an overall 3.6 % 30-day mortality in low-volume hospitals that perform fewer than 47 pulmonary resections a year and a 0.7 % 30-day mortality in high-volume hospitals that perform more than 190 pulmonary resections a year (p < 0.0001) [40]. In addition to the expected impact of case volume on 30-day mortality, SEER data indicate that 5-year survival is also impacted by hospital case volume. In the SEER database, patients operated on at high-volume centers have 5-year survival rates of 44 % compared with 33 % at low-volume centers [41].
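To put the case-volume figures above on a common scale, the short sketch below computes the crude ratio and the absolute difference between the reported rates. It uses only the percentages quoted from [40] and [41]; the function name `compare_rates` is a hypothetical helper introduced here for illustration, and the results are unadjusted comparisons, not the adjusted estimates from either study.

```python
def compare_rates(rate_a_pct, rate_b_pct):
    """Return (ratio a/b, absolute difference a - b in percentage points)."""
    return rate_a_pct / rate_b_pct, rate_a_pct - rate_b_pct


# 30-day mortality: low-volume (3.6 %) versus high-volume (0.7 %) hospitals [40]
mortality_ratio, mortality_diff = compare_rates(3.6, 0.7)

# 5-year survival: high-volume (44 %) versus low-volume (33 %) centers [41]
survival_ratio, survival_diff = compare_rates(44, 33)

print(f"30-day mortality ratio: {mortality_ratio:.1f}, "
      f"difference: {mortality_diff:.1f} percentage points")
print(f"5-year survival ratio: {survival_ratio:.2f}, "
      f"difference: {survival_diff} percentage points")
```

On these crude figures, 30-day mortality is roughly five times higher at low-volume hospitals, an absolute gap of about 2.9 percentage points, while the 5-year survival gap is 11 percentage points.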

3.3 Biomarkers and Genetic Prognostic Indicators

Only 53 % of stage I and II NSCLC patients are alive 5 years after a complete surgical resection, with most deaths being directly related to cancer recurrence [42]. While the TNM staging system remains the strongest predictor of survival, tumor biology and survival outcomes vary widely within each stage. In the modern era of molecular biomarkers and rapid genetic sequencing, increasing amounts of tumor-specific information can refine prognostic estimates beyond crude anatomic TNM stage alone.

Over a thousand studies have been published that identify prognostic biomarker proteins, mRNA, miRNA, and oncogenes in NSCLC; however, no dominant single biomarker has withstood sufficient validation to be incorporated into clinical use. Immunohistochemistry (IHC) staining of tumors to identify overexpressed proteins is the most common method used to identify and evaluate potential prognostic biomarkers, but IHC methods are not standardized: studies use different antibodies and different “positive” cutoff definitions, and as a result, data are inconsistent between studies. Single protein markers that initially seemed promising, such as insulin-like growth factor-1 receptor (IGF1R), hepatocyte growth factor receptor (MET), cyclin D1, excision repair cross-complementation group 1 (ERCC1) [43, 44], and many others, have later failed to be prognostic in subsequent cross-validation studies [45–50]. The most promising proteins, which have shown more consistent support or been backed by meta-analysis, include epidermal growth factor receptor (EGFR) [51, 52] and B cell lymphoma 2 (Bcl-2) [53, 54] as favorable prognostic markers and human epidermal growth factor receptor 2 (HER-2) [55], vascular endothelial growth factor (VEGF) [56, 57], Kirsten rat sarcoma (KRAS) [31, 51, 52], tumor protein p53 (TP53) [31, 52], and Ki-67 [58] as poor prognostic markers.

EGFR mutations are found at much higher rates among certain patient populations, most notably in over 60 % of never-smoking Asian women with lung adenocarcinoma [59] and in 20 % of NSCLC patients under the age of 50 [60]. In the TRIBUTE study, EGFR mutations were detected in 13 % of tumors in previously untreated NSCLC patients. These patients with EGFR mutations had longer overall survival times regardless of treatment and improved responses to the EGFR tyrosine kinase inhibitor erlotinib [51]. A study in 397 Japanese patients found EGFR mutations in 49 % of patients and showed that EGFR mutations were a favorable prognostic indicator of improved overall survival times. However, multivariate analysis accounting for smoking history and tumor stage did not find EGFR mutations to be an independent prognostic indicator when controlling for other prognostic factors (p = 0.03225) [52].

Data regarding Bcl-2 and prognosis in lung cancer are mixed. One study found Bcl-2 to be highly expressed in 63 % of lung adenocarcinomas and 45 % of lung squamous cell carcinomas, and patients with high Bcl-2 expressing tumors had longer survival times. Bcl-2 was found to be independently associated with survival on multivariate analysis [53]. Other studies have found no correlation between Bcl-2 expression and survival [31], but a recent meta-analysis of 7765 patients demonstrated that high expression of Bcl-2 protein was a favorable prognostic indicator [54].

In stage IB and IIA NSCLC, HER-2 expression is associated with poor prognosis [55]. Vascular endothelial growth factor (VEGF) overexpression has also been associated with poor survival [56]. A recent meta-analysis of the prognostic impact of VEGF expression found that increased expression of VEGFA and VEGFR was independently associated with poor survival outcomes in NSCLC and particularly in lung adenocarcinoma [57].

KRAS mutations have been repeatedly identified as a poor prognostic indicator [31]; however, this may be due to their known association with other prognostic factors including smoking history and tumor stage, so the validity of KRAS as an independent prognostic marker is still under debate. In the TRIBUTE study, KRAS mutations were present in 21 % of tumors and were associated with shorter time to progression and worse survival in patients treated with erlotinib [51]. The association between KRAS mutation and poorer survival outcomes has also been shown in Japanese patients, with shorter survival times among patients with a KRAS or TP53 mutation on univariate analyses. Interestingly, KRAS and TP53 mutations seem to correlate with other clinical prognostic factors such as smoking history and tumor stage. While smoking history (p = 0.0310) and tumor stage (p < 0.0001) remained significant poor prognostic indicators on multivariate analysis, neither KRAS (p = 0.8500) nor TP53 (p = 0.3191) was an independent prognostic factor [52]. In IHC studies, TP53 overexpression has been shown to correlate with worse survival outcomes [31].

Tumor cell proliferation measured by Ki-67 staining on IHC has produced conflicting results as a biomarker in NSCLC. However, a large study of 1065 patients demonstrated that perhaps some of these differences occurred from grouping lung adenocarcinoma and squamous cell carcinoma together in the analysis. The mean Ki-67 index in squamous cell carcinoma was twice as high as in lung adenocarcinoma, and data from this study indicated that high Ki-67 was a stage-independent negative prognostic factor in lung adenocarcinoma, whereas a high Ki-67 was a favorable prognostic factor in squamous cell cancer [58].

There has been much interest in developing liquid biopsy technology that detects either circulating tumor cells (CTCs) or circulating free DNA (cfDNA) in blood samples from patients with solid tumors. The detectable presence of CTCs itself has been suggested as a poor prognostic indicator in many types of malignancy. In NSCLC, serial analysis of CTCs has demonstrated that a decreasing number of captured cells correlates with disease regression in response to treatment and that an increase in the number of CTCs is associated with tumor progression [61]. A recent meta-analysis of a total of 1576 patients found that CTCs were associated with lymph node metastasis, tumor stage, shorter overall survival, and shorter progression-free survival [62]. Many of the studies in this area have examined the use of CTCs or cfDNA to characterize well-known mutations such as EGFR mutations and secondary mutations at multiple time points during a patient’s treatment. Patients who responded by RECIST criteria to treatment with pertuzumab and erlotinib had decreased CTC counts, and the patients with decreasing CTC counts had significantly longer progression-free survival times (p = 0.05) [63]. The relative amount of cfDNA has also been shown to be of prognostic value in early studies. In a study of advanced NSCLC patients, levels of cfDNA increased as their disease progressed, and overall survival and progression-free survival were both significantly shorter in patients with higher levels of cfDNA [64].

A large number of studies have used microarray technology to generate validated gene expression signatures from thousands of markers using high-throughput sequencing and improving computational tools. Multiple assays have shown some prognostic value; however, there is disappointingly little overlap between different gene sets [65, 66]. Much of the problem in creating these prognostic algorithms lies in over-fitting of the prognostic signatures to the thousands of microarray data elements from a relatively small number of patients. Efforts led by major scientific journals that require authors to make raw microarray data available in places such as the Broad Institute, Gene Expression Omnibus, or ArrayExpress may improve this computational process by sharing data and allowing more independent validation [45].
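The over-fitting problem described above can be illustrated with a small simulation. The sketch below is not drawn from any of the cited studies: it generates pure random noise for a hypothetical panel of 2000 markers in a 20-patient training cohort, selects the marker that correlates best with an equally random outcome, and then checks that marker in 200 independent validation patients. All names (`pearson`, `best_noise_marker`) and cohort sizes are assumptions chosen for illustration.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def best_noise_marker(n_train=20, n_valid=200, n_markers=2000, seed=0):
    """Pick the noise marker most correlated with a noise outcome.

    Every value is random noise, so no marker carries real prognostic
    information. Returns (training r, validation r) for the marker that
    looked best in the small training cohort.
    """
    rng = random.Random(seed)
    n_total = n_train + n_valid
    outcome = [rng.gauss(0, 1) for _ in range(n_total)]
    best_r, best_marker = 0.0, None
    for _ in range(n_markers):
        marker = [rng.gauss(0, 1) for _ in range(n_total)]
        # Selection uses the training patients only.
        r = pearson(marker[:n_train], outcome[:n_train])
        if abs(r) > abs(best_r):
            best_r, best_marker = r, marker
    # Re-evaluate the selected marker in patients it has never seen.
    valid_r = pearson(best_marker[n_train:], outcome[n_train:])
    return best_r, valid_r


train_r, valid_r = best_noise_marker()
print(f"training r = {train_r:.2f}, validation r = {valid_r:.2f}")
```

With thousands of candidates and only a few dozen patients, some marker will appear strongly associated with outcome by chance alone, and that apparent association largely disappears in independent patients. This is why external validation cohorts and shared raw data matter for signature development.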

It has been shown in NSCLC that global DNA hypermethylation is associated with a worse prognosis [67]. However, it has been challenging to identify specific gene hypermethylation signatures that have consistent prognostic value. A study of 237 stage I NSCLC patients found that hypermethylation of five genes (HIST1H4F, PCDHGB6, NPBWR1, ALX1, and HOXA9) was significantly associated with shorter recurrence-free survival. The accompanying DNA methylation signature assay was able to divide patients into high- and low-risk groups with significant differences in recurrence [68]. Other studies have created other DNA methylation signatures that correlate with survival [69] or identified select genes that have prognostic significance within the dataset [67], but none have passed external validation. As with microarray signatures, the problem with these prognostic assays lies in over-fitting of the data, and little overlap in hypermethylated genes is seen between studies.

Given the complexity of tumor biology, a panel of genes that reflects the multiple mutations acquired by a tumor is likely to be more accurate and widely applicable than a single prognostic biomarker. A handful of assays have been developed and validated to show prognostic value. Of these, the most widely tested and validated is a 14-gene expression assay on formalin-fixed paraffin-embedded tumor specimens developed at our institution. This gene expression assay uses quantitative PCR and a computational algorithm on a panel of 14 genes to stratify non-squamous NSCLC patients into low-, intermediate-, or high-risk categories. It has proven to have prognostic value in over 2000 patients from multiple international validation cohorts [70–72]. In the initial validation study among 433 stage I, non-squamous NSCLC patients with an R0 surgical resection from the Kaiser Permanente Division of Research, 5-year overall survival rates were 71.4, 58.3, and 49.9 % among low-, intermediate-, and high-risk patients, respectively (p = 0.0003) [70]. Rigorously validated prognostic assays such as this one have clinical utility in identifying early-stage patients at higher risk of recurrence who may benefit from adjuvant chemotherapy and separating them from those with a low risk of recurrence who might be spared the toxicity of unnecessary adjuvant treatments.

Other notable multi-gene prognostic signatures include a 160-gene signature developed from 332 stage I to III NSCLC patients from the Directors’ Challenge Consortium and validated on 264 patients from combined test series. Patients identified as “high-risk, poor prognosis” by this gene prognostic signature had 2.8 times greater risk of 5-year lung cancer-related mortality than “low-risk, good prognosis” patients (p < 0.0001) [73]. The University of Texas Southwestern 12-gene signature was also developed from Directors’ Challenge Consortium non-squamous NSCLC data on 422 patients and validated in two data sets consisting of a total of 266 validation patients. This gene signature predicts which patients are likely to benefit from adjuvant chemotherapy, with improved survival (HR 0.34, p = 0.017) seen among patients predicted to benefit from adjuvant therapy and no improvement in survival (HR 0.80, p = 0.070) among the group predicted not to benefit [74]. Another signature, a 15-gene signature based on microarray of 133 Canadian patients from the Joint British Recommendations-10 trial, has been validated in 5 microarray cohorts of fully resected, stage I to II NSCLC patients, with worse survival (HR range 1.92–3.57) among patients with “high-risk” gene signatures [75, 76]. A cell cycle proliferation (CCP) score based on 31 genes, originally developed from RT-qPCR of formalin-fixed paraffin-embedded prostate cancer samples, has been validated in lung adenocarcinoma cohorts such as the Directors’ Consortium Cohort to predict cancer-specific survival (HR = 2.08, p = 0.00014), with significant prognostic value in both univariate and multivariate analyses [77].

In early-stage lung cancer, these prognostic assays can serve a valuable role in identifying which patients are more likely to recur following surgery, and therefore who may benefit from a more aggressive treatment approach or increased monitoring. Adjuvant chemotherapy has been shown repeatedly to add a survival benefit in fully resected, early-stage NSCLC [78, 79] and is recommended by National Comprehensive Cancer Network (NCCN) guidelines for patients with stage IIB and greater NSCLC and for stage IB and IIA patients with certain “high-risk” clinicopathologic features [80]. Use of tumor molecular profiles to further risk-stratify early-stage NSCLC patients has been demonstrated to better predict patients’ recurrence risk following surgery than NCCN “high-risk” features [72].

3.4 Predictive Biomarkers

In addition to the aforementioned prognostic markers which provide survival outcomes information, many separate predictive markers have been identified that can be used to predict response to treatment. Studies of these predictive biomarkers are plagued by the same difficulties of over-fitting datasets and failure in cross-validation that make prognostic biomarkers challenging to identify. VeriStrat is a proteomic signature based on mass spectrometry that was developed to predict which advanced NSCLC patients would respond best to the EGFR tyrosine kinase inhibitors gefitinib and erlotinib. While initial data in validation cohorts seemed encouraging [81], testing on later patient cohorts showed that VeriStrat did not significantly predict erlotinib response. Though VeriStrat did not prove to be useful as a predictive marker, it did have some support as a prognostic marker in the subset of patients who did not receive erlotinib treatment where a VeriStrat “poor” stratification was predictive of worse survival overall [82].

A meta-analysis of BRCA1 as a predictive biomarker of outcome in NSCLC treated with platinum-based and paclitaxel-based chemotherapy showed that overall lower levels of BRCA1 were associated with greater responses to chemotherapy and better overall survival [83]. Another predictive biomarker in NSCLC is ribonucleotide reductase M1 (RRM1), which may have some use in predicting response to gemcitabine. Meta-analysis of data on 1243 patients has shown that low RRM1 is associated with a better response to gemcitabine-based regimens and improved survival [84].

3.5 Immunotherapy and Prognosis

The recent FDA approval of the programmed death-1 (PD-1) inhibitor nivolumab as a second-line treatment for squamous non-small cell lung cancer marks the beginning of a new era of treatment options for advanced NSCLC with the potential for durable responses and prolonged survival in some patients. The major immune checkpoint modulators PD-1, PD-L1, and cytotoxic T lymphocyte antigen-4 (CTLA-4) are targets of new drugs in various stages of clinical trials and along with these expanding treatment options come new prognostic and predictive immunotherapy biomarkers.

The PD-1 receptor or CD279 is an immune checkpoint modulator that is expressed on the surface of CD4 and CD8 lymphocytes, B lymphocytes, and natural killer (NK) cells and plays a key role in blunting T cell immune function. PD-1 is also preferentially expressed on regulatory T cells, which generate the immunosuppressive tumor microenvironment. The ligand of PD-1, PD-L1, is upregulated in many solid tumors including NSCLC where it binds to regulatory T cells and exploits the PD-1/PD-L1 pathway to evade recognition by the host’s anti-tumor immune system [85].

A recent meta-analysis of 1157 NSCLC patients showed that PD-L1 expression was significantly associated with poorly differentiated tumor histology (OR 1.91, p = 0.001), and high PD-L1 expression was correlated with shorter overall survival times (HR 1.75, p < 0.001) [86]. Another study of 164 NSCLC surgical specimens found higher PD-L1 expression in tumors from female patients, never smokers, and higher expression in adenocarcinoma versus squamous cell carcinoma with EGFR mutations and adenocarcinoma histology independently associated with increased PD-L1 expression. This study also showed that higher levels of PD-L1 in resected tumors were associated with significantly shorter overall survival times and were a poor prognostic indicator [87].

PD-L1 has also been linked to the EGFR pathway. Activation of the EGFR pathway in NSCLC leads to overexpression of PD-L1, IL-6, and TGFβ, all of which contribute to immunosuppression. In a xenograft model of EGFR-driven tumors, PD-1 inhibition has been shown to cause tumor regression and improved survival [88]. PD-L1 overexpression has been correlated with EGFR mutations and is a poor prognostic indicator in EGFR wild-type patients; however, it has not been shown to correlate with survival in EGFR mutant patients [89].

Some data have suggested that PD-L1 expression might be a useful predictive biomarker for response to immune therapy; however, this has not been consistently supported by more recent clinical trial data. Nivolumab is a PD-1 monoclonal antibody that works by blocking PD-1-mediated T cell tolerance and thereby activating the immune system against cancer cells. Phase 1 clinical trials of nivolumab demonstrated a 17 % objective response rate in patients with heavily pre-treated NSCLC [91]. PD-L1 is expressed by 50–95 % of all NSCLC [90], but trial data suggest that there is no clear association between PD-L1 expression and response to nivolumab or survival [91]. Pembrolizumab (MK-3475, Merck) is another anti-PD-1 immunotherapy that is FDA approved for ipilimumab-refractory melanoma and is currently in clinical trials for NSCLC. Phase 1 trial data of pembrolizumab in advanced NSCLC showed an overall response rate of 19.4 %. NSCLC patients with elevated levels of PD-L1 on IHC had a 45.2 % pembrolizumab response rate versus a 16.5 % response rate in patients with low levels of PD-L1 and a 10.7 % response rate in patients with no PD-L1 expression, suggesting that PD-L1 is a predictive biomarker of pembrolizumab response [92].

Selecting patients most likely to respond to immune therapy remains a critical question in order to avoid the risks of autoimmune toxicity and pneumonitis in patients unlikely to respond to treatment. In the nivolumab phase I clinical trial, only 17 % of patients responded to treatment and only 2 patients had a response last longer than one year. Squamous cell tumors were more likely to respond than non-squamous tumors with response rates of 33 and 12 %, respectively [93]. In the 3 mg/kg dosing cohort with the best response rates, the median OS was 14.9 months, 1-year OS was 56 %, and 2-year OS was 45 % [94]. Squamous cell histology may be a useful predictive marker of nivolumab response, but current data suggest that PD-L1 on IHC is not. Using PD-L1 as a predictive marker is also complicated by the fact that studies have used different IHC detection antibodies and different expression thresholds to define tumors as PD-L1 positive or negative [90]. More importantly, robust responses have also been observed in patients with low PD-L1 expression and use of PD-L1 as a predictive marker remains in early stages of development. Data from 135 patients that received nivolumab in the phase 3 clinical trial in squamous cell NSCLC showed that PD-L1 expression was neither prognostic nor predictive of benefit from nivolumab [95].

Ipilimumab is a human monoclonal antibody to CTLA-4 that has shown promise in early clinical trials of advanced NSCLC [96]. CD4 and CD8 T cells are activated when antigen presentation by a major histocompatibility complex is accompanied by binding of B7 molecules on the antigen presenting cell to CD28 receptors on the T cell. CTLA-4 acts competitively with CD28 for B7 binding, and when bound to B7 CTLA-4 inhibits T cell activation [97]. CTLA-4 is expressed in 51–87 % of NSCLC tumors, and its expression is associated with adenocarcinoma histology, older patient age, and poor tumor differentiation; however, none of the current studies have found it to be independently prognostic of overall survival nor has CTLA-4 expression been shown to be predictive of treatment response [98, 99].

3.6 The Future of NSCLC Prognostication

An updated IASLC database of 94,708 new patients diagnosed with lung cancer between 1999 and 2010 is currently being analyzed to inform recommendations for the eighth edition of the TNM NSCLC guidelines, which are projected for release in 2016 [100]. This dataset is expected to yield updated survival estimates and clarify minor issues with the seventh staging system, but dramatic changes in the TNM classifications are not anticipated. Major shifts in the future of lung cancer prognostication are likely to come from widespread use of molecular testing and clinical application of our increasing knowledge of biomarkers in lung cancer. Further understanding of tumor biology and rapid genetic analysis will improve risk stratification within each TNM stage and lead to more individualized treatment plans and more precise survival prognostication.