Introduction

Non-small cell lung cancer (NSCLC) accounts for more than 85 % of all lung cancer cases, with adenocarcinoma as the main histological subtype. In the last decade, the advent of novel therapeutics targeting signalling pathways activated by genetic alterations has revolutionized the way patients with NSCLC are treated [1]. Recent practice guidelines in oncology and pathology recommend that all locally advanced and metastatic NSCLC with adenocarcinoma histology undergo testing for the most common targetable genetic abnormalities, such as epidermal growth factor receptor gene (EGFR) mutations and anaplastic lymphoma kinase gene (ALK) rearrangements, and for non-targetable abnormalities, such as Kirsten rat sarcoma viral oncogene homolog (KRAS) mutations. Approximately 15 % of all NSCLCs in patients of European ethnicity and 50 % of NSCLCs in never-smokers are EGFR mutation-positive. The most frequent EGFR mutations (sensitizing activating mutations) are associated with tumour sensitivity to EGFR tyrosine kinase inhibitors (gefitinib, erlotinib, and afatinib) [2]. ALK rearrangements have a lower incidence (<7 % of all NSCLC) and are more frequent in never/former smokers. Crizotinib was the first drug approved for NSCLC harbouring ALK rearrangements, while ceritinib and alectinib have been approved only in some countries (USA and Japan, respectively) [3–5]. KRAS is a member of the Ras family of GTPases that promote cell growth and division. Approximately 20–25 % of NSCLCs harbour KRAS mutations, which are associated with smoking and adenocarcinoma histology [6]. To date, KRAS has been considered a non-druggable target that predicts poor response to standard and targeted therapies, although therapeutic strategies are currently under clinical investigation [7].

While tumour genotyping can reveal important information related to underlying biology, tissue acquired through invasive biopsies is often obtained from only a portion of a generally heterogeneous lesion and cannot completely represent the lesion’s anatomic, functional, and physiologic properties, such as its size, location, and morphology [8]. Moreover, the molecular profile of an initial surgical specimen might differ significantly from that of the primary tumour and its metastases [9], but sequential or multiple biopsies to identify subclones can rarely be implemented in routine clinical care because of logistical and financial barriers.

Radiogenomics, focused on defining relationships between “image phenotypes” and “molecular phenotypes” [10], is currently considered a promising new paradigm for extending clinical imaging into the era of molecular imaging [11].

If CT imaging correlates of clinically relevant gene expression signatures, such as EGFR, ALK, and KRAS, were established in NSCLC, this could help redefine existing staging and diagnostic paradigms and would thus be of clinical benefit.

Previous studies have evaluated the relationship between some CT imaging features and single genetic mutations in NSCLC [12–14], but, to the best of our knowledge, no previous study has evaluated the association of the three most common NSCLC genetic mutations with CT features. Therefore, the purpose of this study was to assess the association between CT features and tumoral mutations of EGFR, ALK, and KRAS in patients with NSCLC (adenocarcinoma).

Materials and methods

Patient selection

The study population was retrospectively selected from a database where patients with lung adenocarcinoma undergoing testing for EGFR and/or KRAS mutations and/or ALK rearrangements were prospectively recorded between May 2006 and February 2014. Inclusion criteria consisted of a pretreatment chest CT study of the primary tumour at our institution, a cell type diagnosis of adenocarcinoma, and data on EGFR and/or KRAS mutation status and/or ALK rearrangement status. Exclusion criteria consisted of a CT performed at another institution or CT not including the chest at our institution. Written informed consent to undergo the pathological test and to the use of clinical and imaging data for scientific and/or educational purposes was obtained from all patients beforehand.

CT image acquisition

CT examinations were randomly performed either on a 16-slice Lightspeed CT system (General Electric Healthcare, Milwaukee WI, USA) or on a 64-slice GE MSTC Optima 660 (General Electric Healthcare, Milwaukee WI, USA). All examinations extended in a craniocaudal direction, with or without contrast medium. All images were archived in digital format. On the 16-slice CT, images were acquired with the following parameters: tube rotation time 0.8 s; pitch 1.75; standard soft-tissue algorithm reconstruction; collimation 20 mm (16 × 1.25 mm); slice thickness 2.5 mm; reconstruction interval 2.5 mm; display field of view (DFOV) 320–360 mm; tube voltage 120 kV; tube current 100–440 mA; Noise Index (NI) =11.57. On the 64-slice CT, images were acquired with the following parameters: tube rotation time 0.6 s; pitch 1.375; standard soft-tissue algorithm reconstruction; collimation 20 mm (32 × 0.625 mm); slice thickness 2.5 mm; reconstruction interval 2.5 mm; DFOV 320–360 mm; tube voltage 120 kV; tube current 80–440 mA; NI = 18.2.

Evaluation of CT features

Two radiologists with different degrees of experience in interpreting chest CT images independently performed all qualitative image analyses. One was a senior radiologist with 11 years’ experience in thoracic imaging (SR); the other was a fellow with 3 years’ experience in interpretation of CT images (FDM). They both analyzed the Digital Imaging and Communications in Medicine (DICOM) images from CT studies without access to genomic data, but were aware of the presence and site of an NSCLC. The first 15 cases were analyzed in consensus to standardize the reading. If interpretations differed, the senior reader’s decision was accepted.

For each patient, date of CT examinations, age, sex, presence of distant metastases, and smoking status (positive for current or former smoking for at least 10 years) were extracted from the medical records. For each CT (each corresponding to a single patient), the following data were recorded on an Excel spreadsheet file (Microsoft Office Excel 2003, Richmond, VA, USA): 1) site of the lesion, indicated as right upper lobe (RUL), middle lobe (ML), right lower lobe (RLL), left upper lobe (LUL), left lower lobe (LLL), and mixed when infiltrating more than one lobe; 2) maximum diameter of the lesion (in millimetres) evaluated on the multiplanar reconstructed (MPR) images with a soft tissue window; 3) shape, indicated as complex, round, or oval; 4) margins, evaluated in the lung window, and indicated as smooth, lobulated, or spiculated/irregular; 5) presence or absence of a ground-glass opacity; 6) lesion density, indicated as subsolid or solid; 7) presence or absence of cavitation; 8) presence or absence of air bronchogram; 9) thickening of the adjacent pleura (fissural or peripheral pleura); 10) presence or absence of intratumoral necrosis; 11) presence or absence of satellite nodules in primary tumour lobe; 12) presence or absence of nodules in non-tumour lobes; 13) presence or absence of pleural retraction; 14) location of the lesion, as central or peripheral; 15) presence or absence of intranodular calcifications; 16) presence or absence of emphysema; 17) presence or absence of fibrosis (related to presence of honeycombing, traction bronchiectasis, lung architectural distortion, reticulation); 18) presence or absence of pleural contact; 19) presence or absence of pleural effusion.

Selected gene sequencing and identification of mutations

Tumours were classified according to the 2004 WHO Classification [15]. DNA mutational analysis was performed on formalin-fixed, paraffin-embedded tumour tissues. Areas with at least 60 % tumour cells were selected and macrodissected. DNA was extracted using a commercially available kit (QIAamp DNA FFPE Tissue kit, Qiagen), and exons 18 through 21 of EGFR and exons 2 and 3 of KRAS were amplified. The polymerase chain reaction products were purified by enzyme treatment with Exonuclease I and shrimp alkaline phosphatase (GE Healthcare, Life Sciences). The cycle sequencing reactions were carried out using BigDye Terminator chemistry (Applied Biosystems), followed by removal of unincorporated reagents with BigDye XTerminator kit (Applied Biosystems). Sequencing reaction products underwent capillary electrophoresis using a 3500Dx Genetic Analyzer (Applied Biosystems) and were sequenced in both forward and reverse directions. Mutations were confirmed in two independent experiments. ALK gene rearrangement was detected by fluorescence in situ hybridization. An unstained slide was incubated with an ALK dual colour probe (Break Apart Rearrangement-DNA probe, Abbott Molecular, Des Plaines, IL, USA). At least 100 tumour cells were screened to detect any rearrangement at the ALK locus.

The evaluation criteria provided by the manufacturer were applied: cells were considered positive if a break-apart pattern of orange and green signals, at least one additional orange signal or a combination of both patterns appeared. Tumours were considered to reveal an ALK rearrangement if at least 15 % of cells were positive.
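The case-level decision rule described above can be sketched in a few lines. This is only an illustration of the stated thresholds (at least 100 cells screened, at least 15 % positive cells); the function name `alk_rearranged` and its parameters are hypothetical and are not part of the manufacturer's software or of the study's workflow.

```python
def alk_rearranged(positive_cells, screened_cells,
                   min_screened=100, cutoff_percent=15):
    """Return True if a case meets the ALK FISH positivity threshold.

    Assumed rule (taken from the text): at least `min_screened` tumour
    cells are screened, and the case is positive when at least
    `cutoff_percent` % of them show a positive signal pattern.
    """
    if screened_cells < min_screened:
        raise ValueError("too few tumour cells screened")
    if not 0 <= positive_cells <= screened_cells:
        raise ValueError("positive_cells must lie between 0 and screened_cells")
    # Integer comparison avoids floating-point rounding at the cutoff.
    return positive_cells * 100 >= cutoff_percent * screened_cells
```

For example, a hypothetical case with 15 positive cells out of 100 screened would be classified as rearranged, whereas 14 out of 100 would not.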

Statistical analysis

Patient and CT characteristics of the study population were expressed as mean and standard deviation for continuous variables (age and lesion size), and as frequency and percentage for categorical variables. Since the Kolmogorov-Smirnov test suggested a non-normal distribution for age and lesion size, non-parametric tests were used for the analysis of these variables.

Inter-reader agreement for CT features was assessed by percent of concordant cases and Kappa of agreement, with 95 % confidence intervals (CI). Conventionally, a value of Kappa lower than 0.20 is considered poor agreement, between 0.21 and 0.40 fair, between 0.41 and 0.60 moderate, between 0.61 and 0.80 substantial, and finally, a value greater than 0.80 is considered almost perfect agreement [16].
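As an illustration of the agreement measure, the following minimal sketch computes the percent of concordant cases and Cohen's Kappa for two readers, and maps Kappa to the conventional categories listed above. The function names are hypothetical, the confidence interval is not computed, and the handling of a Kappa of exactly 0.20 (labelled here as poor) is an assumption, since the convention quoted above does not specify it. The study itself used SAS.

```python
from collections import Counter


def percent_concordance(reader_a, reader_b):
    """Fraction of cases on which the two readers gave the same rating."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(reader_a, reader_b)) / len(reader_a)


def cohen_kappa(reader_a, reader_b):
    """Cohen's Kappa: agreement corrected for chance agreement."""
    n = len(reader_a)
    observed = percent_concordance(reader_a, reader_b)
    freq_a, freq_b = Counter(reader_a), Counter(reader_b)
    # Chance agreement from the product of the readers' marginal frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used a single identical category
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa):
    """Conventional interpretation of Kappa (boundary handling assumed)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"
```

Each reader's ratings are passed as a list with one entry per case, for example `"present"` or `"absent"` for a binary CT feature.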

Univariate analysis was performed to assess the association of patient and CT features with each gene mutation, using non-parametric two-sample Wilcoxon test for continuous variables, and chi-square test for categorical variables; subsequent multivariate analysis was performed and odds ratios with 95 % confidence intervals were calculated by a logistic regression model with stepwise selection of variables. According to stepwise selection, effects were entered into and removed from the model so that each forward selection step could be followed by one or more backward elimination steps. At each forward selection step, the score chi-square statistic was computed for each effect not in the model and the largest of these statistics was examined. If it was significant at the p = 0.05 level, the corresponding effect was added to the model. At each backward elimination step, results of the Wald test for individual parameters were examined. The least significant effect that did not meet the p = 0.05 level for staying in the model was removed. The stepwise selection process terminated when no further effect could be added to the model or when the current model was identical to a previously visited model. A ROC curve was drawn for each gene mutation prediction according to each significant characteristic and with the full model, and the corresponding area under the curve (AUC) was calculated. P-values <0.05 were considered significant. The analysis was performed with SAS Software, version 9.2.
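The AUC reported for each model can be read as the probability that a randomly chosen mutation-positive patient receives a higher predicted score than a randomly chosen mutation-negative patient. The sketch below computes it directly from that definition, with ties counted as one half. It is an illustration only: the function name `roc_auc` is hypothetical, the scores would in practice be the predicted probabilities from the fitted logistic model, and the study's own analysis was performed in SAS.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve from two groups of predicted scores.

    Counts, over all (positive, negative) pairs, how often the positive
    case scores higher; ties contribute 0.5. This equals the area under
    the empirical ROC curve.
    """
    if not scores_positive or not scores_negative:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))
```

A value of 0.5 corresponds to a model that does not discriminate between the two groups, and a value of 1.0 to perfect separation.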

Results

According to the inclusion and exclusion criteria, from the original database of 410 patients, 125 were excluded and the final study cohort included 285 patients (mean age 65.21 ± 9.58 years; M:F = 160:125). Characteristics of the study population are summarized in Table 1. Patients positive for gene alterations were: 60/280 (21.43 %) for EGFR; 31/270 (11.48 %) for ALK; and 64/240 (26.67 %) for KRAS.

Table 1 Patient, CT, and tumor characteristics of the study population

Inter-observer agreement for the studied CT features was substantial for pleural contact (Kappa = 0.68), margins (Kappa = 0.77), air bronchogram (Kappa = 0.75), and pleural retraction (Kappa = 0.78), and it was almost perfect (Kappa > 0.80) for all the remaining CT features (Table 2).

Table 2 Analysis of inter-reader agreement: percent of concordance and kappa of agreement

As shown in Table 3, univariate analysis showed that EGFR+ patients had significantly higher percentages of air bronchogram and pleural retraction (Fig. 1) and of absence of emphysema (CT features), and of female sex and non-smoking status (clinical features). Subsequent multivariate analysis confirmed the significance of these features, with the exception of emphysema, and identified two further significant CT features (small lesion size and absence of fibrosis). Figure 2 shows the ROC curves for prediction of EGFR mutation (the AUC for the full model was 0.82).

Table 3 Association of patient, tumor, and CT features with the EGFR mutation: univariate and multivariate analysis

Fig. 1 Axial CT image of an 89-year-old man with an EGFR+ left upper lobe lung adenocarcinoma, showing an internal air bronchogram (black arrow) and a peripheral pleural retraction (white arrow)

Fig. 2 Comparison between ROC curves for EGFR mutation prediction according to each significant characteristic and with the full model

Table 4 shows a significant association of ALK rearrangements with pleural effusion (CT feature) (Fig. 3) and age (clinical feature). Figure 4 shows ROC curves for ALK translocation prediction (the AUC for the full model was 0.65).

Table 4 Association of patient, tumor, and CT features with the ALK mutation: univariate and multivariate analysis

Fig. 3 Axial CT image of a 53-year-old man, smoker, showing a right pleural effusion (white arrow) concomitant with an ALK+ right lung adenocarcinoma and atelectasis (black arrow)

Fig. 4 Comparison between ROC curves for ALK mutation prediction according to each significant characteristic and with the full model

Table 5 shows a significant association of KRAS mutation with cavitation (p = 0.05) and emphysema (p = 0.03) at univariate analysis. Neither association was confirmed at multivariate analysis, which instead showed a significant association of KRAS mutation with round shape and nodules in non-tumour lobes (CT features) (Fig. 5) and with smoking (clinical feature). Figure 6 shows ROC curves for KRAS mutation prediction, where the AUC was 0.67.

Table 5 Association of patient, tumor, and CT features with the KRAS mutation: univariate and multivariate analysis

Fig. 5 Axial CT image of a 79-year-old man, smoker, showing a left lower lobe round lesion (adenocarcinoma KRAS+; black circle), associated with the presence of nodules in non-tumour lobes (right middle lobe and right lower lobe; white arrows)

Fig. 6 Comparison between ROC curves for KRAS mutation prediction according to each significant characteristic and with the full model

Discussion

Radiogenomics has the potential to affect therapy strategies by evaluating the patient-specific likelihood of response to therapy [8].

Recently, Jain et al. demonstrated that a combination of clinical, imaging, and genomic markers could provide important and unique prognostic information about the poorly understood non-enhancing regions in glioblastomas [17]. Data from the I-SPY 2 trial have permitted computer analyses of imaged breast lesions that can potentially be related to molecular classifications of cancer (e.g., estrogen-progesterone receptor and HER2 status) [18]. Karlo et al. found significant associations between gene mutations and phenotypic characteristics of clear-cell renal carcinoma by contrast-enhanced MDCT [19]. Evidence of intra-tumour genetic heterogeneity in high-grade serous ovarian cancer has suggested the need for tumour analysis of both primary and metastatic cancer for development of targeted therapies and validation of biomarkers for therapeutic response [20].

On one hand, the need for an inexpensive and easily obtainable source of material for analysis of tumour molecular aberrations has led to the possibility of identifying molecular signatures on circulating free DNA. On the other hand, the NCI workshop report suggests the opportunity for replacing repeated biopsies with validated imaging approaches using feature extraction methods, and the lung CT has been indicated as the first research area in the priority list for data collections [8].

Therapeutic success with EGFR tyrosine kinase inhibitors (TKIs) in EGFR-mutated advanced lung adenocarcinoma has improved the survival and quality of life of patients with lung cancer [21–23]. If imaging traits can be associated with previously determined treatment-response gene expression patterns, routine imaging studies performed for staging and follow-up might indicate the likely response to specific chemotherapeutics and aid the decision-making process towards an optimal form of treatment, for a more genetic-based and personalized medicine [11].

The present study found a significant association between EGFR mutation and air bronchogram, present in 60 % of tumours with the EGFR mutation and in only 35 % of wild-type tumours. This result is concordant with a recently published study showing a significant association between air bronchogram and EGFR mutation in lung adenocarcinomas [12]. The association found here between EGFR+ and air bronchogram does not exclude an association with gene activation in the hypoxia pathway, hypothesized by Gevaert et al. [13] for the correlation between air bronchogram and overexpression of KRAS, although assessment of this association was not a study objective.

We also found an association between EGFR mutation and pleural retraction (present in 65 % of tumours with EGFR mutation, and in 35 % of those without the mutation). Pleural retraction, a frequent sign of visceral pleural invasion, is one of the most important prognostic factors in patients undergoing complete resection for NSCLC [24]. Indeed, a retrospective analysis of prognostic factors in stage I patients showed that an air bronchogram and the absence of pleural retraction were associated with a better 5-year disease-free survival [25]. At multivariate analysis, two more signs were significant: small lesion size and absence of fibrosis. The association of EGFR+ and small tumour size is concordant with Hsu et al. who demonstrated that adenocarcinomas with wild-type EGFR were larger than those with EGFR mutation [12]. This association might be related to the better prognosis of EGFR mutation carriers compared with the KRAS+ and ALK+ tumours. Lung fibrosis detected at CT is frequently seen as a late consequence of smoking; therefore, its absence as a significant feature may be related to the absence of smoking in EGFR+ patients.

The known association between EGFR mutation, female sex, and never-smoking [26] was also confirmed in this series, both at univariate and multivariate analysis. When associating all the significant features for EGFR mutation (female sex, never-smoking, air bronchogram, pleural retraction, lesion size, absence of fibrosis), the AUC of the ROC curve was 0.82.

ALK is a tyrosine kinase receptor; its rearrangements are rare, but occur more frequently in young never-smoker patients with clinically advanced lung adenocarcinoma [22]. In our study cohort, which included a relatively high percentage of ALK translocation-positive cases (31/270; 11.48 %), a significant correlation between ALK+ and young age was confirmed. A significant association between pleural effusion and the ALK translocation was also found. This is concordant with Yamamoto et al., who recently demonstrated that a large pleural effusion was significantly associated with ALK+ status [14]. Considering both significant features (age and pleural effusion) for ALK+ status, the AUC was 0.65. In agreement with previous studies [27], we did not find a significant correlation between ALK rearrangements and lesion margins, presence of air bronchogram, or pleural retraction.

KRAS proto-oncogene mutation is one of the most common mutations in lung adenocarcinomas [28]. KRAS mutations were described as a negative prognostic marker in lung adenocarcinoma more than 20 years ago [29], and since then the prognostic significance of KRAS has been investigated extensively in NSCLC, with uncertain results. EGFR and KRAS mutations are almost mutually exclusive, and some but not all studies have suggested that KRAS mutation confers some degree of resistance to EGFR TKIs in patients with wild-type EGFR. Although many studies suggest no significant effect of KRAS on differential survival benefit from EGFR TKIs in NSCLC, two meta-analyses suggest that KRAS may be a negative predictive biomarker for response to EGFR TKIs [28,30,31]. The present study found a significant association between KRAS mutation, round lesion shape, nodules in different lobes, and smoking. Nodules in non-tumour lobes are highly suggestive of haematogenous lung metastases and relate to a more aggressive behaviour of KRAS-mutated tumours [32]; however, although these findings were statistically significant, the confidence intervals (shown in Table 5) were close to unity. Therefore, this association will need to be confirmed by larger studies. The association between KRAS+ and smoking has already been described and was confirmed in our series [33,34].

This study has some limitations: it is a discovery-phase study without validation of the findings. However, in a relatively large group of patients (n = 285) we found some significant associations between selected gene mutations and CT imaging features of NSCLC, which may assist in the selection of features for a subsequent validation study. Another limitation is that not all CT examinations were performed with the same protocol. However, this was a retrospective study in which only CT studies performed at our institution were included in order to limit variations in acquisition protocols. Furthermore, since not all CT examinations had a standard contrast enhancement protocol, our analysis did not include CT features related to contrast enhancement of the lesions. With regard to the genomic findings, our data were recorded as presence or absence of mutations/rearrangements. This may be considered a limitation because genes may have different types of mutations. Moreover, these mutations could be associated with defined histological subtypes of adenocarcinoma [35] that could produce specific CT image morphologies. However, our study aimed to assess association between CT features and gene mutations as the basis for further analysis of mutation types in subsequent validation studies.

In conclusion, this preliminary radiogenomics analysis of NSCLC revealed associations between CT features and alterations of EGFR (internal air bronchogram, pleural retraction, small lesion size, and absence of fibrosis), ALK (pleural effusion), and KRAS (round lesion shape and nodules in non-tumour lobes). The association of these features with significant clinical features, such as female sex and non-smoking for EGFR, young age for ALK, and smoking for KRAS, may suggest which patients are more likely to be mutation carriers.