Introduction

Peripheral adenocarcinomas are the most common type of lung cancer, accounting for 30–35 % of all primary lung tumours, and are being diagnosed with increasing frequency [1]. The increasing frequency of adenocarcinomas may be attributed in part to the growing use of chest computed tomography (CT) and a corresponding increase in the detection rate of early adenocarcinomas manifesting as persistent subsolid nodules.

Subsolid nodules represent a wide spectrum of pathology, ranging from pre-invasive lesions such as atypical adenomatous hyperplasia (AAH) and adenocarcinoma in situ (AIS) to invasive lesions including minimally invasive adenocarcinoma (MIA) [2]. The pre-invasive lesions usually appear as pure ground-glass nodules (GGNs), especially when the lesion size is small, while the invasive lesions typically contain solid components of various sizes, which reflect invasive components on pathology [2–9].

In a recent study, the maximal diameter of the solid component alone was shown to be a better prognostic indicator than the maximal diameter of the whole lesion in part-solid nodules [10]. Therefore, accurate evaluation of solid components is considered key to establishing an optimal management plan for subsolid nodules. In particular, accumulating evidence of the indolent nature of pure GGNs and part-solid nodules with solid components less than 5 mm in size has suggested that conservative management may be indicated for these nodules [2].

However, assessing the presence or absence and the size of solid components in subsolid nodules may be challenging because of their small sizes and is inevitably subject to interobserver variability [11, 12]. The lack of a standardised approach may further complicate the assessment of solid components. At present, the Fleischner Society recommends using the mediastinal window setting for this assessment [13], while Lung-RADS recommends using the lung window setting for measuring nodules [14]. Given the previous finding that disagreements on the presence and size of a solid component were the main sources of discrepancies in nodule categorisation [15], it is essential to recommend a single window setting for assessing subsolid nodules, because interobserver variability would increase further if observers used different window settings. In this context, it has been reported that neither the accuracy nor the interobserver agreement between two readers for solid component measurements differed significantly between lung and mediastinal windows [16]. To our knowledge, however, an in-depth investigation of the difference in interobserver agreement for nodule classifications and solid component measurements between the two window settings has not yet been conducted.

The purpose of this study was to compare interobserver agreements among multiple readers and accuracy for the assessment of solid components in subsolid nodules between the lung and mediastinal window settings.

Materials and methods

This retrospective study was approved by our institutional review board, and informed consent was waived.

Patient selection

One hundred and two consecutive subsolid nodules, which were surgically resected between April 2013 and April 2015, were selected from our radiology report database. Prior to surgery, the sizes of the entire nodules and their solid components on preoperative chest CT scans were measured by one experienced radiologist (J.M.G., with 25 years of experience). Of the 102 nodules, 25 nodules with solid components larger than 8 mm were excluded. The cut-off value for the solid component was set at 8 mm, given that 8 mm is the threshold discriminating the 4A and 4B categories (i.e. suspicious category with findings for which additional diagnostic testing and/or tissue sampling is recommended) in Lung-RADS (Version 1.0) [14] and that the effect of interobserver variability is less apparent if the solid components are relatively large.

Finally, 77 nodules from 76 patients (32 men [mean age, 63 years; age range, 45–75 years] and 44 women [mean age, 55 years; age range, 33–72 years]) were included in this study. Part of the patient population in this study (n = 19) overlaps with that of the previous study [16].

Image acquisition

Four CT scanners were used to obtain preoperative chest CT scans: Sensation 16 (Siemens Medical Solutions, Forchheim, Germany), Somatom Definition (Siemens Medical Solutions, Forchheim, Germany), LightSpeed Ultra (GE Healthcare, Milwaukee, WI, USA), and Brilliance 64 (Philips Medical Systems, Best, The Netherlands). Given the retrospective design of this study, various CT protocols were used, including low-dose (n = 53) and standard-dose (n = 24) CT protocols with (n = 41) or without (n = 36) contrast administration, with the tube current ranging from 20 to 40 mAs for the low-dose and from 200 to 400 mAs for the standard-dose techniques at a fixed tube voltage of 120 kV. All CT images were reconstructed using the high-frequency algorithm with a section thickness of 0.625–1.25 mm. The field of view was optimised for the patient’s size and ranged from 300 to 350 mm. CT images were obtained in the supine position at full inspiration for all patients.

Image analysis

All CT scans were reviewed by five independent readers (R.E.Y., E.J.H., S.H.Y., C.M.P. and C.H.L. with 5, 4, 10, 14 and 20 years of experience, respectively), who were blinded to pathologic diagnosis, on a picture archiving and communication system workstation (Infinitt Healthcare, Seoul, Korea).

With regard to the nodule classifications (i.e. pure GGN or part-solid nodule), the presence or absence of a solid component was first visually determined on a representative axial section of each nodule in both the lung (window width, 1,500 HU; level, -700 HU) and mediastinal (window width, 400 HU; level, 30 HU) window settings. Because the nodules were assessed on both windows, the presence of ground-glass components in the nodules could be appreciated on lung windows. A nodule was classified as a GGN on mediastinal windows if there was a ground-glass component on lung windows but no demonstrable solid component on mediastinal windows. The representative axial section for each nodule was chosen by one experienced radiologist (J.M.G., with 25 years of experience) such that the section would contain the longest diameter of the solid component.
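To make the two display settings concrete, the following minimal Python sketch maps attenuation values to display grey levels with a simple linear width/level transform. It is an illustration only and not part of the study workflow; the function name `apply_window` and the sample attenuation of -300 HU are assumptions chosen for the example, and workstation implementations may differ in detail.

```python
def apply_window(hu, width, level):
    """Map an attenuation value (HU) to a display grey level in [0.0, 1.0]
    using a simple linear window defined by its width and level (centre)."""
    lower = level - width / 2.0
    upper = level + width / 2.0
    if hu <= lower:
        return 0.0  # at or below the window: displayed black
    if hu >= upper:
        return 1.0  # at or above the window: displayed white
    return (hu - lower) / (upper - lower)


# Window settings reported in this study
LUNG = {"width": 1500, "level": -700}       # displays -1450 to 50 HU
MEDIASTINAL = {"width": 400, "level": 30}   # displays -170 to 230 HU

# Hypothetical voxel of intermediate attenuation (value chosen for illustration)
sample_hu = -300
lung_grey = apply_window(sample_hu, **LUNG)
mediastinal_grey = apply_window(sample_hu, **MEDIASTINAL)
print(f"lung window: {lung_grey:.2f}, mediastinal window: {mediastinal_grey:.2f}")
```

Under these assumptions, the hypothetical -300 HU voxel is rendered fairly bright on the lung window but falls below the lower bound of the mediastinal window and is displayed as black, which is one way to picture how the same component can be visible in one setting and not in the other.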

Cases in which there was discordance in nodule classifications between the lung and mediastinal windows were reviewed and the potential reasons for the discordance were categorised by one experienced radiologist (J.M.G.): (1) small-sized solid component, (2) intermediate attenuation (between completely ground-glass and completely solid) of solid components and (3) prominent vessels within the subsolid nodules.

Subsequently, one-dimensional measurements of the nodule sizes were performed by the five readers. The longest diameters of the solid components, if present, were measured on the representative sections in both window settings. In addition, overall sizes of the nodules (the longest diameters of the whole nodules including the ground-glass components) were measured on the representative sections in the lung window setting. Although contiguous axial CT sections containing the entire nodules were provided to the readers to help differentiate solid components from vessels, the readers were instructed to perform both the nodule classifications and solid component measurements on preselected representative sections to minimise other sources of variability.

Pathology measurement

The longest diameter of the tumour was measured on the gross specimen with a ruler in most cases. However, when the entire tumour was small enough to be mounted on pathological slides, it was measured on a representative slide containing the largest cross-section of the tumour under light microscopy. The invasive component was measured with a ruler under light microscopy. In cases with invasive components larger than 10 mm, exact sizes were not provided on pathology reports and the sizes of the invasive components were simply reported to be larger than 10 mm.

Statistical analysis

All statistical analyses were performed with R software (version 3.2.0; http://www.r-project.org/). Results with P values less than 0.05 were considered statistically significant.

Comparison of interobserver agreements in the lung and mediastinal window settings

Fleiss' kappa (k) and intraclass correlation coefficients (ICCs) were calculated to determine interobserver agreements for the nodule classifications and solid component measurements, respectively, in both lung and mediastinal window settings. A k value or ICC of 0.00–0.20 was considered to indicate slight agreement, 0.21–0.40 fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 substantial agreement and 0.81–1.00 almost perfect agreement [17].
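As an illustration of the agreement statistic for the nodule classifications, the following Python sketch computes Fleiss' kappa for several readers and maps the result to the verbal categories listed above. The analyses in this study were performed in R; this is not the authors' code, and the function names and the four-nodule toy data set are hypothetical.

```python
def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for subjects that are each rated by the same number of raters.

    ratings: list of lists; ratings[i] holds the labels given to subject i.
    categories: the full list of possible labels.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j]: number of raters who assigned subject i to category j
    counts = [[row.count(c) for c in categories] for row in ratings]

    # Mean observed pairwise agreement over subjects
    p_obs = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_subjects

    # Agreement expected by chance from the overall category proportions
    total = n_subjects * n_raters
    p_exp = sum(
        (sum(row[j] for row in counts) / total) ** 2
        for j in range(len(categories))
    )

    if p_exp == 1.0:
        return 1.0  # kappa is undefined when only one category is used; treated as 1 here
    return (p_obs - p_exp) / (1.0 - p_exp)


def agreement_label(value):
    """Verbal category for a kappa or ICC value, using the cut-offs stated above.
    Values below 0 are also labelled 'slight' in this sketch."""
    if value <= 0.20:
        return "slight"
    if value <= 0.40:
        return "fair"
    if value <= 0.60:
        return "moderate"
    if value <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical toy data: 4 nodules, 5 readers each (PS = part-solid nodule)
toy = [
    ["PS", "PS", "PS", "PS", "PS"],
    ["GGN", "GGN", "GGN", "GGN", "GGN"],
    ["PS", "PS", "PS", "GGN", "GGN"],
    ["PS", "GGN", "GGN", "GGN", "GGN"],
]
kappa = fleiss_kappa(toy, ["PS", "GGN"])
print(f"kappa = {kappa:.2f} ({agreement_label(kappa)} agreement)")
```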

Interobserver agreements in the two window settings were compared using 95 % confidence intervals (CI) derived from 1,000 bootstrap replications. The difference was considered to be significant if the 95 % CI for the difference between the two kappa values or ICCs did not include 0 [18].
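The bootstrap comparison can be sketched in Python as follows. The resampling unit and interval type are not detailed in the text, so this sketch assumes that nodules are resampled with replacement (keeping the lung and mediastinal readings of each nodule paired) and that a percentile interval is used. The function names and the toy counts, which give the number of readers out of five who called each nodule part-solid, are hypothetical.

```python
import random


def kappa_binary(pos_counts, n_raters):
    """Fleiss' kappa for a two-category rating, given for each nodule the number
    of raters (out of n_raters) who chose the 'positive' category."""
    n = len(pos_counts)
    p_obs = sum(
        (k * k + (n_raters - k) ** 2 - n_raters) / (n_raters * (n_raters - 1))
        for k in pos_counts
    ) / n
    p = sum(pos_counts) / (n * n_raters)
    p_exp = p * p + (1.0 - p) ** 2
    if p_exp == 1.0:
        return 1.0  # undefined when all ratings fall in one category; treated as 1 here
    return (p_obs - p_exp) / (1.0 - p_exp)


def bootstrap_diff_ci(lung, mediastinal, n_raters, n_boot=1000, seed=0):
    """Percentile 95 % CI for kappa(lung) - kappa(mediastinal).
    Assumption: nodules are resampled with replacement, paired across settings."""
    rng = random.Random(seed)
    n = len(lung)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        k_lung = kappa_binary([lung[i] for i in idx], n_raters)
        k_med = kappa_binary([mediastinal[i] for i in idx], n_raters)
        diffs.append(k_lung - k_med)
    diffs.sort()
    lower = diffs[(n_boot * 25) // 1000]        # approx. 2.5th percentile
    upper = diffs[(n_boot * 975) // 1000 - 1]   # approx. 97.5th percentile
    return lower, upper


# Hypothetical toy data: readers (of 5) calling each of 10 nodules part-solid
lung_counts = [5, 5, 3, 1, 0, 4, 5, 2, 0, 5]
med_counts = [5, 4, 0, 0, 0, 5, 5, 1, 0, 5]
lower, upper = bootstrap_diff_ci(lung_counts, med_counts, n_raters=5)
significant = not (lower <= 0.0 <= upper)  # significant if the CI excludes 0
print(f"95 % CI for the difference: ({lower:.2f}, {upper:.2f}); significant: {significant}")
```

The same resampling loop could be applied to ICCs by swapping in an ICC function for `kappa_binary`.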

Imaging-pathology correlation

The incidence of cases in which the presence or size of the invasive component was either underestimated or overestimated on CT with respect to pathology was compared between the lung and mediastinal window settings by using McNemar’s test.
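For readers unfamiliar with McNemar's test, the Python sketch below shows the exact (binomial) two-sided form applied to paired outcomes. Only the discordant pairs enter the test: nodules misjudged in one window setting but not in the other. Whether the exact or the chi-squared form was used is not stated in the text, so the exact form is an assumption here, and the counts in the example are hypothetical rather than taken from the study.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar test on the discordant pairs.

    b: pairs with an error in setting A only.
    c: pairs with an error in setting B only.
    Under the null hypothesis, min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical discordant counts: 1 nodule with a false negative on lung windows
# only, 9 nodules with a false negative on mediastinal windows only
p_value = mcnemar_exact(1, 9)
print(f"P = {p_value:.3f}")
```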

In addition to the Friedman test and the Wilcoxon signed-rank test, linear mixed model analysis was performed to compare the difference between the solid component size and the invasive component size according to window settings.

Results

Pathological findings of nodules

All nodules were surgically resected by lobectomy (n = 14), segmentectomy (n = 22), wedge resection (n = 40), or combined segmentectomy and wedge resection (n = 1). Pathological diagnoses of the nodules were as follows: AAH (n = 3), AIS (n = 8), MIA (n = 27), invasive adenocarcinoma (n = 35), focal fibrosis (n = 1), meningothelial hyperplasia (n = 1), anthracofibrotic lesion (n = 1), and inflammatory myofibroblastic tumour (n = 1). On pathology, the median overall size of the 73 premalignant, pre-invasive, and invasive lesions of the lung adenocarcinoma spectrum was 12 mm (interquartile range, 9–14 mm).

Comparison of the lung and mediastinal window settings

Results of the nodule classifications and CT measurements are summarised in Table 1. With regard to the nodule classifications, the results for the potential reasons for discordance between the lung and mediastinal window settings are provided in Table 1. In 19 nodules, there was at least one discordant result between the lung and mediastinal window settings among the five readers. The discordant nodule classifications between lung and mediastinal windows are shown in Table 2.

Table 1 Summary of classifications and measurements of 77 subsolid nodules
Table 2 Discordant nodule classifications between lung and mediastinal windows

Interobserver agreements were moderate for the nodule classifications and substantial for the solid component measurements in both lung (k = 0.51 and ICC = 0.70) and mediastinal (k = 0.57 and ICC = 0.69) window settings (Figs. 1, 2, and 3). There were no significant differences in the interobserver agreements between the two window settings with respect to both the nodule classifications (95 % CI, -0.22, 0.12) and solid component measurements (95 % CI, -0.14, 0.16).

Fig. 1

Case with a high interobserver agreement on both lung (a) and mediastinal (b) windows. The subsolid nodule was interpreted as a part-solid nodule by all five readers in both window settings. The size of the solid component (arrow) was measured to be 5 mm by three readers, 4 mm by one reader and 6 mm by one reader on lung windows, while it was measured to be 3 mm by three readers and 4 mm by two readers on mediastinal windows

Fig. 2

Case with a low interobserver agreement on both lung (a) and mediastinal (b) windows. In both window settings, the subsolid nodule was interpreted as a part-solid nodule by two readers and as a GGN by three readers. On retrospective review by a single experienced radiologist, difficulty in differentiating a solid component from a vessel was suggested as a potential reason for the discrepant readings

Fig. 3

Case with a lower interobserver agreement on lung windows (a) compared with mediastinal windows (b). a On lung windows, a small area of equivocally increased attenuation (arrow) was interpreted as a solid component by three of the five readers. b On mediastinal windows, the nodule was classified as a pure GGN by all five readers

Imaging-pathology correlation

CT prediction of the presence of invasive components

Thirteen nodules (benign or premalignant lesions [n = 7]; nodules for which the presence or absence of invasive components was not reported on pathology [n = 6]) were excluded from the analysis. Pathological diagnoses of the remaining 64 nodules were AIS (n = 8), MIA (n = 27), and invasive adenocarcinoma (n = 29).

Given that an invasive component of adenocarcinoma is generally solid on CT, a false-negative result corresponds to the case in which an invasive component was present on pathology although there was no discernible solid component on CT. In contrast, a false-positive result corresponds to the case in which an invasive component was absent on pathology despite the presence of a definite solid component on CT. False-negative results for the presence of an invasive component were significantly more common in the mediastinal window setting than in the lung window setting (P < 0.001) (Fig. 4). There was no significant difference in the incidence of false-positive results between the two window settings (Table 3).

Fig. 4

Minimally invasive adenocarcinoma in a 61-year-old woman. The subsolid nodule was classified as a part-solid nodule by all five readers on lung windows (a) and as a pure GGN by four readers on mediastinal windows (b). The presence of invasive component was correctly predicted only on lung windows

Table 3 Imaging-pathology correlation for the presence and size of invasive components

CT prediction of the size of invasive components

Twenty-four nodules (benign or premalignant lesions [n = 7]; nodules for which the presence or absence of invasive components was not reported on pathology [n = 6]; nodules for which only the ranges, and not the specific sizes, were given on pathology [n = 11]) were excluded from the analysis. Pathological diagnoses of the remaining 53 nodules were AIS (n = 8), MIA (n = 27), and invasive adenocarcinoma (n = 18). On pathology, the median sizes of the tumour and invasive components were 11.0 mm (interquartile range, 9.0–14.0) and 4.0 mm (interquartile range, 2.0–6.0), respectively.

There were significant differences between the solid component size and the invasive component size in both window settings (2.2 mm [interquartile range, 0.2–4.4] for lung windows versus 4.0 mm [interquartile range, 2.0–6.0] for pathology, P = 0.001; 0.8 mm [interquartile range, 0.0–2.5] for mediastinal windows versus 4.0 mm [interquartile range, 2.0–6.0] for pathology, P < 0.001). However, the median absolute difference between the solid component size and the invasive component size was significantly larger in the mediastinal window setting than in the lung window setting (2.0 mm [interquartile range, 1.2–3.7] and 1.8 mm [interquartile range, 0.6–2.8], respectively; P < 0.001). In addition, the median solid component size on lung windows differed significantly from that on mediastinal windows (P < 0.001) (Table 3).

Discussion

With regard to the evaluation of lung adenocarcinomas manifesting as subsolid nodules, the initial step is to determine the presence or absence of solid components within the nodules, which reflect invasive components on pathology. Accumulating evidence suggests that subsolid nodules may be managed differently according to the presence or absence of solid components. Part-solid nodules warrant an aggressive diagnostic approach owing to their substantially greater likelihood of being malignant, except those with solid components measuring 5 mm or less, which frequently correspond to either AIS or MIA [19, 20]. As for pure GGNs, based on the results from a large prospective cohort [International Early Lung Cancer Action Program (I-ELCAP)], Yankelevitz et al. [21] have recently recommended that nodules of any size can be safely followed with annual repeat scans to monitor transition to part-solid nodules, and that pathological diagnosis may be delayed until the development of solid components because earlier treatment provides no additional benefit. In cases with part-solid nodules, the exact sizes of solid components are also of interest, given their close relationship to the prognosis of the patients. Several previous studies have shown that the solid component size may have a higher prognostic value than the total size of subsolid nodules [4, 10, 22–25].

In parallel with the increasing emphasis placed on solid components as the key determinant of the prognosis and optimal management of patients with subsolid nodules, there has been a growing interest in establishing a standardised approach for assessing the solid components [26–33]. In particular, although the mediastinal window setting is currently the one recommended by the Fleischner Society, the lack of evidence for the inferiority of the lung window setting compared with the mediastinal window setting prompted the further investigation in this study.

First, to evaluate reproducibility, we compared interobserver agreements among multiple readers with respect to the nodule classifications and the solid component measurements between lung and mediastinal windows. Both window settings showed moderate agreement for the nodule classifications and substantial agreement for the solid component measurements. Low interobserver agreement was noted for cases in which there was difficulty in distinguishing a solid component from a vessel, regardless of the window setting (Fig. 2). In addition, interobserver agreement for the nodule classifications was found to be slightly lower on lung windows than on mediastinal windows, partly because the area of equivocally increased attenuation on lung windows, which may have posed a diagnostic dilemma to some readers, appeared completely ground-glass on mediastinal windows (Fig. 3). However, there were no significant differences in the interobserver agreements between the two window settings. The results are in keeping with those of the previous study, which demonstrated no significant difference in the interobserver agreement between two observers for solid component measurements between the two window settings [16]. Our results confirm the previous findings with stronger evidence provided by larger numbers of nodules and readers.

As for the accuracy, the mediastinal window setting tended to underestimate the presence and size of invasive components, compared with the lung window setting. The higher false-negative rate on mediastinal windows can be attributed to the fact that invasive components do not always manifest as dense solid components on CT. Invasive components may have various CT morphologies, ranging from ground-glass to solid. In a previous study pertaining to annual follow-up screening CT of non-solid nodules, pathological evidence of stromal invasion was found to be present in 52 of 62 (84 %) nodules which remained non-solid prior to resection. Since it is impossible to discriminate ground-glass invasive components from lepidic growth, the invasive components appearing ground-glass are inevitably underestimated in both lung and mediastinal windows. However, invasive components with intermediate attenuation between completely ground-glass and completely solid are likely to be underestimated in mediastinal windows but not in lung windows.

Meanwhile, the incidence of overestimation of the invasive components turned out to be significantly higher on lung windows than on mediastinal windows, although false-positive rates for the presence of invasive components did not significantly differ between the two windows. It has been speculated that inadequately inflated status of the lung tissue after resection and tissue processing may have an influence on pathology measurements and may cause tumour sizes to be measured smaller, compared with CT measurements [34]. Given that solid components were measured significantly larger on lung windows than on mediastinal windows in both the previous and present studies [16], it might not be surprising that the incidence of overestimation of the invasive components was significantly higher in the lung window setting.

In addition, we found the median absolute difference between the solid component size and the invasive component size to be significantly larger on mediastinal windows than on lung windows, despite the presence of significant differences between the solid component size and the invasive component size in both window settings. Based on these findings, it is inferred that the accuracy of the nodule classification and solid component measurement may be higher on lung windows than on mediastinal windows. In the previous study by Lee et al. [16], however, the accuracy for solid component measurements in lung windows was not shown to differ significantly from that in mediastinal windows. Our results may differ from those of the previous study because we investigated a larger number of subsolid nodules with a wider spectrum of pathology, ranging from AIS to invasive adenocarcinoma, to obtain results that can be generalised to other settings.

Apart from the intrinsic limits of any retrospective study, there were a few other limitations that should be mentioned. First, pathology assessments themselves, which served as the reference standard in this study, may not be accurate and may be subject to variability depending on several confounding factors. Specifically, it has been reported that interobserver agreement for the presence of invasive components among pathologists may vary widely depending on histological patterns [35]. Furthermore, inadequately inflated status of the lung tissue after resection and tissue processing may have caused tumour sizes to be measured smaller than the actual sizes. Discordance in measurement planes between CT and pathology and interobserver variability between pathologists may have also contributed to the variability. Second, some cases were excluded from the subgroup analysis for imaging-pathology correlation because the exact sizes of invasive components were unavailable on pathology reports. Nonetheless, reproducibility and accuracy are of concern mainly when evaluating small solid components, and therefore the exclusion of these cases does not undermine the clinical importance of our findings. Third, although the representative axial section for each nodule was used for analysis to minimise other sources of variability, the use of the preselected axial section only could have been a potential cause of selection bias and lowered agreement with pathology. Moreover, given that readers may choose different sections for nodule measurements in routine practice, the use of a preselected axial section may be a potential limitation for extrapolating the interobserver variability seen in this study to routine practice. In addition, coronal or sagittal sections (if available) could have been helpful in distinguishing solid components from vessels in some cases. Fourth, the potential reasons for discordance in nodule classifications between the two window settings should be interpreted with caution because they were based on the opinion of a single experienced radiologist.

In conclusion, the lung window setting had a comparable reproducibility but a higher accuracy than the mediastinal window setting for nodule classifications and solid component measurements in subsolid nodules. These findings imply that the lung window setting may be better than the mediastinal window setting for the evaluation of solid components in subsolid nodules.