Introduction

Idiopathic pulmonary fibrosis (IPF) is defined as a specific form of chronic progressive fibrosing interstitial pneumonia of unknown cause, occurring primarily in older adults and limited to the lungs [1, 2]. The American Thoracic Society and the European Respiratory Society (ATS/ERS) [1] published an international consensus statement on the diagnosis and treatment of this disease in 2000. The statement defined IPF as a clinical entity associated with the histopathologic and/or radiologic pattern of usual interstitial pneumonia (UIP). The definition of IPF required the exclusion of other forms of interstitial pneumonia with known aetiology, such as environmental exposure, medication or systemic disease. The ATS/ERS 2002 statement [2] included IPF among the seven diseases defined as idiopathic interstitial pneumonias (IIPs) and detailed the specific clinical, radiologic and histologic features. A recent update of the classification included IPF among the major idiopathic interstitial pneumonias, together with INSIP, RB-ILD, DIP, COP and AIP [3].

Surgical lung biopsy and the pathological pattern of UIP can predict the mortality rate and the long-term outcome in patients with IPF [4, 5]. Nevertheless, surgical lung biopsy is not frequently performed because of sampling problems [6] and possible complications in patients with more advanced disease [7–10]. In patients with severe physiologic impairment or substantial comorbidity, the risk of surgical lung biopsy may outweigh the benefits. Therefore, non-invasive evaluations have been suggested as an alternative method for diagnosis and prognosis in patients with a typical high-resolution computed tomography (HRCT) pattern of UIP [11–14]. In 2011, the American Thoracic Society, European Respiratory Society, Japanese Respiratory Society and Latin American Thoracic Association (ATS/ERS/JRS/ALAT) published a new evidence-based consensus statement guideline [15] on the diagnosis and management of IPF.

These guidelines identified HRCT as an essential component of the diagnostic pathway in IPF and defined the criteria for UIP pattern, possible UIP pattern and inconsistent with UIP pattern (Figs. 1, 2, 3). The positive predictive value of an HRCT diagnosis of UIP is high, but in patients whose HRCT does not demonstrate a UIP pattern, surgical lung biopsy may still demonstrate a UIP pattern on histopathology. Therefore, in an appropriate clinical setting, the presence of a UIP pattern on HRCT is sufficient to make the diagnosis of IPF. Recently, it was suggested that surgical lung biopsy might not be necessary even in patients with a possible UIP pattern on HRCT [16].

Fig. 1

HRCT images demonstrating (at three different levels, a–c) UIP pattern in a patient classified into group 1. The four described features are: honeycombing, reticular abnormalities, subpleural basal predominance and the absence of features listed as inconsistent with UIP pattern. Estimated overall extent of lung abnormality was 65 %; predominant features are honeycombing (black arrow), reticulation (white arrow), and bronchiectasis (arrowhead)

Fig. 2

HRCT images demonstrating (at three different levels, a–c) possible UIP pattern in a patient classified into group 2. The three described features are: reticular abnormality, subpleural basal predominance and the absence of features listed as inconsistent with UIP pattern. Estimated overall extent of lung abnormality was 15 %; predominant features are reticulation (arrowhead) and ground glass (white arrow)

Fig. 3

HRCT images demonstrating (at three different levels, a–c) a pattern inconsistent with UIP in a patient classified into group 3. The features listed as inconsistent with UIP are: upper or mid-lung predominance, peribronchovascular predominance, extensive ground-glass abnormality, profuse micronodules, discrete cysts, diffuse mosaic attenuation/air trapping, consolidation in bronchopulmonary segment(s)/lobe(s). Estimated overall extent of lung abnormality was 50 %; predominant features are extensive ground-glass abnormality (black arrow) and diffuse mosaic attenuation

This is a new approach with respect to the 2002 recommendations [2], which stated that the histologic patterns provide the primary basis for the various categories of IIP and that a highly probable diagnosis of IPF can be made without a lung biopsy, whereas a definitive diagnosis can be established only with the aid of a surgical lung biopsy. The impact of the new guidelines on the estimation of prognosis and life expectancy of patients with IPF has not yet been evaluated.

On the basis of the concept that in a clinical classification “diagnosis is prognosis” [6], the objective of this study is to determine whether HRCT criteria for UIP, possible UIP or no-UIP pattern recommended by ATS/ERS/JRS/ALAT guidelines 2011 are able to predict progression and prognosis of the disease in a group of patients with fibrotic IIP.

Materials and methods

Patient selection

One hundred forty-four patients consecutively admitted to the Cardiothoracic Department, University Hospital Pisa, Italy, from January 1996 to February 2012 with a previous diagnosis of fibrotic interstitial lung disease, in whom the diagnosis of IPF had not been excluded, were retrospectively evaluated. Seventy-four patients were excluded through multidisciplinary discussion for the following reasons: chemotherapy-related and postradiation therapy pneumonia, environmental exposure, clinical and serologic evidence of collagen vascular disease, history of ARDS or sarcoidosis. We also excluded patients who did not complete clinical and radiological follow-up, patients with lung cancer, and patients without a definitive diagnosis of interstitial lung disease.

All 70 patients included in the study received a diagnosis of fibrotic IIP based on the clinical manifestations and the presence of an HRCT pattern showing reticular opacities and/or traction bronchiectasis in association with honeycombing or ground-glass opacities, on the basis of the American Thoracic Society and European Respiratory Society international consensus statement 2002 [2]. Video-assisted thoracic surgery lung biopsies were available for nine patients.

This was a retrospective study and received the approval of the ethics committee, which did not require patients’ informed consent.

Methods

Pulmonary function tests such as percentage of predicted slow vital capacity (SVC), forced vital capacity (FVC) and diffusing capacity of the lungs for carbon monoxide (DLCO), blood gas analysis, 6-min walking test, bronchoscopy, bronchoalveolar lavage, differential cell count, chest X-ray and HRCT were performed at the first visit and during the follow-up at variable time intervals.

All the patients included in the study had HRCT examination with 1.0-mm-thick sections at 1-cm intervals throughout the entire lung during inspiratory apnoea in the supine position. The scans were performed at the first visit and during the follow-up on 7 different spiral scanners including two with a single row of detectors and 5 with multiple rows of detectors (two with 4, one with 16 and two with 64 rows).

Two thoracic radiologists with 25 and 15 years of experience, belonging to a multidisciplinary group composed of pneumology, radiology and pathology specialists, blindly and independently examined the baseline and follow-up HRCT scans (total n = 178) in separate sessions and made a subjective visual assessment, as a percentage (%) of the total lung sections, of the overall extent of the pulmonary parenchymal abnormalities (overall CT score), approximated to the nearest 5 % according to the studies of Shin et al. [17], Schmidt et al. [18], and Lee et al. [19]. In brief, the two readers, having viewed twenty sample cases together, separately scrolled through all the apex-to-base slices of each HRCT scan in a few minutes and quantified the percentage of interstitial involvement (Figs. 1, 2, 3). The extent of ground-glass opacity, reticulation and honeycombing was similarly scored, in such a way that the sum of these abnormalities corresponded to the overall extent of the interstitial involvement. A total fibrotic score was computed as the sum of the honeycombing and reticulation percentage scores. Finally, the number of lobes with traction bronchiectasis was determined. In the follow-up, HRCT scans were examined in comparison with previous examinations, as in clinical practice.
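
The score arithmetic described above can be illustrated with a short sketch. The class and function names are hypothetical, and the example percentages are invented for illustration; they are not patient data from this study.

```python
import math
from dataclasses import dataclass


def round_to_nearest_5(percent: float) -> int:
    """Approximate a visual estimate to the nearest 5 % (halves round up)."""
    return int(math.floor(percent / 5 + 0.5) * 5)


@dataclass(frozen=True)
class HrctScores:
    """Per-scan visual scores, each in % of total lung (hypothetical container)."""
    ground_glass: int
    reticulation: int
    honeycombing: int
    lobes_with_traction_bronchiectasis: int

    @property
    def overall_ct_score(self) -> int:
        # Sum of the three component abnormalities.
        return self.ground_glass + self.reticulation + self.honeycombing

    @property
    def fibrotic_score(self) -> int:
        # Honeycombing plus reticulation only.
        return self.honeycombing + self.reticulation


# Invented example values, rounded to the nearest 5 %.
example = HrctScores(
    ground_glass=round_to_nearest_5(12),
    reticulation=round_to_nearest_5(23),
    honeycombing=round_to_nearest_5(31),
    lobes_with_traction_bronchiectasis=4,
)
print(example.overall_ct_score, example.fibrotic_score)
```

Keeping the overall score as a derived property, rather than a separately entered value, enforces the constraint that the three component scores sum to the overall extent.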

The same two radiologists, after having assessed baseline HRCT, separately distributed the 70 patients into three groups (UIP type = group 1, possible UIP = group 2, inconsistent with UIP = group 3) on the basis of ATS/ERS/JRS/ALAT 2011 guidelines (Figs. 1, 2, 3). The UIP pattern was characterized on HRCT by all four of the following features: the presence of honeycombing with or without traction bronchiectasis, reticular abnormalities, subpleural and basal predominance and the absence of features listed as inconsistent with UIP pattern. The possible UIP pattern was characterized by all three of the following features: reticular abnormality, subpleural basal predominance and the absence of features listed as inconsistent with UIP pattern. The inconsistent with UIP pattern was characterized by any of the following seven features: extensive ground-glass abnormality, profuse micronodules, discrete cysts, diffuse mosaic attenuation/air trapping, consolidation in bronchopulmonary segment/lobe, upper or mid-lung predominance and peribronchovascular predominance.
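
The three-group rule set can be written as a small decision function. This is only a sketch of the logic as stated in the text: the feature identifiers and the function name are hypothetical, and returning `None` for a scan that meets none of the three definitions is our assumption, since the text does not describe that case.

```python
from typing import Iterable, Optional

# The seven features listed as inconsistent with UIP (identifiers are hypothetical).
INCONSISTENT_WITH_UIP = frozenset({
    "upper_or_mid_lung_predominance",
    "peribronchovascular_predominance",
    "extensive_ground_glass",
    "profuse_micronodules",
    "discrete_cysts",
    "diffuse_mosaic_attenuation_or_air_trapping",
    "segmental_or_lobar_consolidation",
})


def classify_hrct_pattern(
    honeycombing: bool,
    reticular_abnormality: bool,
    subpleural_basal_predominance: bool,
    inconsistent_features: Iterable[str] = (),
) -> Optional[int]:
    """Return 1 (UIP), 2 (possible UIP), 3 (inconsistent with UIP) or None."""
    found = set(inconsistent_features)
    unknown = found - INCONSISTENT_WITH_UIP
    if unknown:
        raise ValueError(f"unknown feature(s): {sorted(unknown)}")
    if found:
        return 3  # any one of the seven features is enough
    if reticular_abnormality and subpleural_basal_predominance:
        # Honeycombing separates UIP from possible UIP.
        return 1 if honeycombing else 2
    return None  # assumption: not covered by the three definitions


print(classify_hrct_pattern(True, True, True))                       # 1
print(classify_hrct_pattern(False, True, True))                      # 2
print(classify_hrct_pattern(True, True, True, ["discrete_cysts"]))   # 3
```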

A consensus was reached by the readers for the definitive categorization of the patients after the separate reading sessions.

Data analysis

The inter-observer agreement between the two radiologists in the scoring of the abnormalities and in the categorization of the patients into the three groups was evaluated with the kappa statistic [20].
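
For readers unfamiliar with the kappa statistic, the following is a minimal sketch of unweighted Cohen's kappa for two readers. The text does not state whether a weighted or unweighted form was used, so the unweighted form is an assumption; the reader labels in the example are invented. The verbal labels follow the conventional Landis and Koch bands.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(reader_a: Sequence[Hashable], reader_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two readers rating the same cases."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must rate the same non-empty set of cases")
    n = len(reader_a)
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    count_a, count_b = Counter(reader_a), Counter(reader_b)
    # Agreement expected by chance from each reader's marginal frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a.keys() | count_b.keys()) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def agreement_label(kappa: float) -> str:
    """Conventional Landis and Koch verbal bands."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Invented group assignments (1, 2 or 3) for ten scans.
reader_1 = [1, 1, 1, 2, 2, 3, 3, 1, 2, 3]
reader_2 = [1, 1, 1, 2, 2, 3, 3, 1, 2, 2]
k = cohen_kappa(reader_1, reader_2)
print(round(k, 3), agreement_label(k))
```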

The baseline distribution of various abnormalities in the three groups was expressed as means and standard deviation and their statistical relevance was calculated using ANOVA and multiple comparisons such as Bonferroni test [21] and Mann–Whitney test [22].

The percentage variation over time of each abnormality (honeycombing, reticulation, ground glass, traction bronchiectasis) and of the overall CT score and fibrotic score during follow-up was assessed for all the patients with two or more interval scans. Longitudinal data were analysed using a least squares multilevel model (ML GLS); each subject was observed at least twice [23].
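
To make the unit of progression (points per year) concrete, the sketch below fits an ordinary least-squares line to the serial scores of a single patient. This is a deliberately simplified stand-in and not the multilevel GLS model used in the analysis, which additionally accounts for repeated measurements across subjects. The function name and the example values are hypothetical.

```python
from typing import Sequence

DAYS_PER_YEAR = 365.25


def slope_points_per_year(days: Sequence[float], scores: Sequence[float]) -> float:
    """Ordinary least-squares slope of score against time, in points per year."""
    if len(days) != len(scores) or len(days) < 2:
        raise ValueError("need at least two scans with matching dates and scores")
    years = [d / DAYS_PER_YEAR for d in days]
    mean_x = sum(years) / len(years)
    mean_y = sum(scores) / len(scores)
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("all scans share the same date")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, scores))
    return sxy / sxx


# Invented example: overall CT score at baseline, 1 year and 2 years.
slope = slope_points_per_year([0, 365.25, 730.5], [40, 42, 44])
print(slope)
```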

We then evaluated the mortality rate during the follow-up for each group of patients. Survival in the three groups of patients was compared using the log rank test and displayed using Kaplan–Meier curves [24].
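
The Kaplan–Meier product-limit estimate behind such survival curves can be sketched in a few lines. The function name and the follow-up data below are invented for illustration; the log rank comparison between groups is not shown.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(durations: Sequence[float], died: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct time with a death.

    `died[i]` is False when patient i was censored (alive at last follow-up).
    """
    if len(durations) != len(died):
        raise ValueError("durations and died must have the same length")
    data = sorted(zip(durations, died))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Invented follow-up times (years) for five patients; two are censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False])
for t, s in curve:
    print(t, round(s, 2))
```

Censored patients still count in the number at risk up to their last follow-up, which is why the estimate differs from a simple proportion of survivors.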

In the patients belonging to the UIP group, we used age-corrected Cox regression analysis [24] to evaluate, for each abnormality, the most accurate threshold for differentiating patients with different life expectancy, and we compared the resulting survival curves. Cox proportional hazard models were constructed for honeycombing, reticulation, ground glass, overall CT score and fibrotic score thresholds ranging from 5 to 45, in increments of 5. For traction bronchiectasis, all the possible thresholds ranging from 1 to 6 lobes were evaluated.
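
The threshold search can be outlined as follows. The sketch only enumerates the candidate cut-offs and dichotomizes patients at each one; fitting the age-corrected Cox model for every split would be done with a statistics package and is not shown. The names and scores are hypothetical, and placing a patient whose score equals the threshold in the lower group is our assumption, as the text does not specify it.

```python
from typing import List, Sequence, Tuple

PERCENT_THRESHOLDS = list(range(5, 50, 5))  # 5, 10, ..., 45
LOBE_THRESHOLDS = list(range(1, 7))         # 1 to 6 lobes


def split_by_threshold(scores: Sequence[float], threshold: float) -> Tuple[List[int], List[int]]:
    """Return patient indices with score above the threshold, and the rest."""
    above = [i for i, s in enumerate(scores) if s > threshold]
    at_or_below = [i for i, s in enumerate(scores) if s <= threshold]  # assumption for ties
    return above, at_or_below


# Invented baseline honeycombing scores (%) for five patients.
honeycombing = [10, 30, 25, 40, 5]
above_25, rest_25 = split_by_threshold(honeycombing, 25)

# Size of the "above" group at every candidate threshold.
sizes = {t: len(split_by_threshold(honeycombing, t)[0]) for t in PERCENT_THRESHOLDS}
print(above_25, rest_25, sizes)
```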

Finally, a multivariate Cox analysis of the correlation between the best-identified thresholds of the abnormalities and survival was performed in group 1.

Results

The mean age of the patients at the time of the first visit was 69 years (range 51–88); 34 patients were non-smokers, 5 were smokers and 31 ex-smokers; 36 were males and 34 females.

All the patients were treated with corticosteroid and acetyl cysteine therapy.

The inter-observer agreement between the two radiologists in the scoring of the abnormalities was substantial or almost perfect (k = 0.72–0.85). In particular, the agreement was substantial for the fibrotic score (k = 0.73), overall CT score (k = 0.72), reticulation (k = 0.79) and traction bronchiectasis (k = 0.80) and was almost perfect for honeycombing (k = 0.84) and ground glass (k = 0.81). Since data were comparable, the scores attributed by the first reader were used for the computation.

Furthermore, with regard to the categorization of patients into group 1 (UIP type), group 2 (possible UIP) and group 3 (inconsistent with UIP) on the basis of ATS/ERS/JRS/ALAT 2011 guidelines, the inter-observer agreement between the two radiologists was almost perfect (k = 0.92). Forty-four patients were classified in consensus into group 1, 13 into group 2 and 13 into group 3.

The diagnoses from the surgical lung biopsies were: UIP in five group 1 patients and in one group 2 patient, “no classifiable fibrosis” in another group 2 patient, “probable UIP” in one group 1 patient and non-specific interstitial pneumonia in the only group 3 patient who underwent biopsy.

The baseline distribution of the various abnormalities in the three groups and their statistical relevance are shown in Table 1.

Table 1 Distribution of age, sex, smoking habit, HRCT abnormalities and functional parameters at baseline in the three groups and their statistical relevance

The honeycombing score was significantly higher in group 1 than in group 3 (p < 0.003). The percentage of reticulation was significantly lower in group 1 than in group 2 (p < 0.03). The distribution of ground glass differed significantly only between group 1 and group 3 (p < 0.001), as did the fibrotic score (p < 0.03). The distribution of overall CT score and traction bronchiectasis did not differ significantly among the groups.

The overall follow-up was 3029 days (mean follow-up 1386 days, SD 915, range 730 days). The total number of HRCT scans evaluated in the 70 patients was 178: 103 for group 1, 37 for group 2 and 38 for group 3.

The changes in the HRCT abnormalities during the follow-up in the three groups are summarized in Table 2. The time distribution of overall CT score is shown in Fig. 4. The increase was significant only in group 1 (p < 0.01) (Fig. 5a, b). In contrast, in the patients belonging to the possible UIP group, the overall CT score did not show a significant progression during the follow-up (Fig. 5c, d). Honeycombing showed a significant progression in group 1 (p < 0.01) (Fig. 5a, b) and in group 3 (p < 0.01), and traction bronchiectasis showed a significant increase in group 1 (p < 0.01) and group 2 (p < 0.01). In contrast, the reticulation score showed a significant regression in group 1 (p < 0.007) and ground glass a significant regression in group 3 (p = 0.04).

Table 2 Distribution of the changes of CT abnormalities during the follow-up in the three groups and their statistical relevance
Fig. 4

Plots of overall CT score against time. Each point represents the percentage of pulmonary involvement in each HRCT during the follow-up in the three groups of patients. The increase was significant only in group 1 (mean 1.87 points/year, p < 0.01). In contrast, in the possible UIP or no-UIP groups, the overall CT score did not show a significant progression during the follow-up

Fig. 5

a, b HRCT images of a patient belonging to UIP group at the baseline (a) and after 1 year of follow-up (b). Axial HRCT images show overall increment of the abnormalities and honeycombing progression (black arrows). c, d HRCT images of a patient belonging to possible UIP group at the baseline (c) and after 2 years of follow-up (d). Axial HRCT images show the stability of the interstitial pattern

The mortality rate during the follow-up was significantly higher (p < 0.04) in group 1 (18 deaths) than in groups 2 and 3 (1 death in each group) (Fig. 6).

Fig. 6

Survival in the three groups of patients displayed using Kaplan–Meier curves [24]. The mortality rate during the follow-up was significantly higher in group 1 (18 deaths) than in groups 2 and 3 (1 death in each group)

A series of possible thresholds that best differentiate patients with different life expectancy was found in group 1 patients (Table 3). Patients with baseline honeycombing score >25 % had a worse 5-year survival rate (53 %) than patients with honeycombing <25 % (72 %) (p < 0.04). Patients with baseline fibrotic score >30 had a worse 5-year survival rate (58 %) than patients with fibrotic score <30 (88 %) (p < 0.03). Patients with baseline overall CT score >45 had a worse 5-year survival rate (41 %) than patients with overall CT score <45 (80 %) (p < 0.0003). Patients with traction bronchiectasis in more than 4 lobes had a worse 5-year survival rate (51 %) than patients with traction bronchiectasis in fewer than 4 lobes (88 %) (p < 0.002).

Table 3 CT abnormalities: list of thresholds determined by univariate analysis in group 1 patients that best differentiate patients with different life expectancy (p values, hazard ratios and 95 % confidence intervals are reported)

In a multivariate analysis including age, gender and smoking habit, overall CT score <45 and honeycombing <25 % predicted survival rate (HR = 9.07, p = 0.001 and HR = 4.43, p = 0.026, respectively).

When overall CT score was excluded from the analysis, traction bronchiectasis in more than four lobes also significantly predicted survival (HR = 8.43, p = 0.008) (Fig. 7).

Fig. 7

Multivariate Cox regression curves for patients of group 1 with overall CT score above and below 45 (a), for patients with honeycombing score above and below 25 % (b), and for patients with traction bronchiectasis in more and fewer than 4 lobes (c)

Discussion

The classification of IIPs was the focus of the review by Hansell [6], which emphasized that the basic requirements of a clinical classification are that it is readily applied and that it usefully informs patient monitoring and treatment. The aim of our study is in line with this strategic perspective.

The currently accepted IIP classification derives from the 2002 ATS/ERS consensus [2] and from the 2013 update [3], but it is widely recognized that patients often do not fit into the ATS/ERS categories, and the categorization of the fibrotic disease is often problematic for clinicians, radiologists and pathologists [25].

Idiopathic pulmonary fibrosis is the most frequently seen severe disorder in the IIP group, with a prognosis worse than that of many cancers, and it is the most important condition to rule out. The new ATS/ERS/JRS/ALAT [15] guidelines have identified HRCT as an essential component of the diagnostic pathway in IPF and have defined the criteria for UIP pattern, possible UIP pattern and inconsistent with UIP pattern. The latter categorization [15] seems easier to use, is potentially more reproducible than the 2002 classification and may have a significant prognostic value, even without lung biopsy.

In our study, HRCT was used to characterize IPF according to the 2011 ATS/ERS/JRS/ALAT statement. The role of HRCT was evaluated in identifying the UIP pattern, in recognizing progression of the different abnormalities and in predicting mortality.

The HRCT abnormalities were simply scored to the nearest 5 % of parenchymal involvement, based on the percentage of lung parenchyma that showed evidence of each abnormality. This method allowed for a high inter-observer agreement in evaluating the lung parenchyma abnormalities (k ranging from substantial to almost perfect), in line with previous studies [19, 26, 27].

The HRCT classification into three groups (UIP type, possible UIP and no-UIP) proved easy and reproducible (k = 0.92). It was obtained by evaluating the HRCT sections (20–30 images), and the results were determined within minutes.

In contrast, the difficulty of diagnosing IPF on the basis of HRCT and of distinguishing UIP from NSIP and other interstitial patterns, as classified in the 2002 and 2013 consensus statements, is well known in the literature [28]. As reported by Thomeer et al. [29], the level of agreement on the definition of UIP and NSIP pattern between HRCT readers has been fair to moderate and, as stated by Sverzellati et al. [30], thin-section CT findings in patients with IPF overlap with those of other chronic interstitial lung diseases, particularly NSIP, chronic hypersensitivity pneumonitis or sarcoidosis; as much as 62 % of biopsy-proved IPF was regarded as alternative diagnoses in that study.

In our experience, the difference in the distribution of honeycombing, reticulation and ground glass, the subpleural and basal predominance and the absence of features inconsistent with UIP allowed for an easy and reproducible categorization of the fibrotic disorder on the basis of the 2011 consensus statement, with a significant prognostic implication. Indeed, the patients belonging to the UIP group, classified on the basis of the HRCT UIP pattern, had a poor survival rate compared with patients belonging to the other two groups, as shown in Fig. 6, demonstrating a high prognostic value of the classification. This result is in apparent conflict with the view taken by Raghu [16], in whose study 79 out of 84 patients with possible UIP pattern on HRCT had a biopsy confirmation of UIP. In that paper, however, no data are provided on the prognosis and evolution of these subjects, who, free of honeycombing by definition, may express a less advanced and/or less progressive IPF. As a matter of fact, Sumikawa et al. [31] recently demonstrated that the outcome of patients with a possible UIP pattern at HRCT was significantly better than that of patients with a UIP pattern (mean survival 73.0 versus 33.5 months, respectively).

Moreover, in our study, in the UIP-type group, the subgroups of patients with different prognosis were well differentiated on the basis of HRCT abnormalities. A threshold honeycombing score of 25 %, a threshold fibrotic score of 30 %, a threshold overall CT score of 45 % and traction bronchiectasis in more than 4 lobes were able to identify a longer or shorter life expectancy in multivariate and/or univariate analysis. These data are in agreement with previous studies [15, 19, 27, 32–34]. As mentioned in the ATS/ERS/JRS/ALAT statement 2011 [15], several groups have demonstrated that the extent of fibrosis and honeycombing on HRCT are predictive of survival in IPF. Best et al. [32] stated that the visually determined disease extent on CT images was a strong independent predictor of mortality in IPF. Sumikawa et al. [33] demonstrated in 98 patients with histological diagnosis of UIP that traction bronchiectasis and fibrotic score influenced prognosis. Lynch et al. [34] reported that a higher fibrotic score increased the risk of death in 315 patients with IPF. In the study by Edey et al. [27], CT features predictive of a worse outcome were coarse reticulation, honeycombing, traction bronchiectasis and overall extent of parenchymal abnormality, whereas Lee et al. [19] in 2012 showed on univariate and multivariate analysis that the overall extent of parenchymal abnormalities was a prognostic factor predictive of poor survival duration in fibrotic interstitial pneumonias with little honeycombing. However, our HRCT analysis is the first to demonstrate the different time progression of overall CT score, fibrotic score and individual abnormalities in the three IIP groups determined on the basis of the ATS/ERS/JRS/ALAT 2011 guidelines. In particular, in the UIP group, the time progression of honeycombing, fibrotic score and overall CT score was demonstrated, with an increase of about 2–3 points/year, more than in no-UIP patients, whereas only bronchiectasis progressed significantly in the possible UIP group (Table 2). This time progression was in agreement with the temporal course of IPF on HRCT, as shown by Misumi et al. [35] in pharmacologically treated patients, where the reticular abnormalities usually progress to honeycombing and the areas of honeycombing increase inexorably in extent. Also in fibrotic IIP with little honeycombing, Lee et al. [19] showed on serial scans that honeycombing and reticulation progressed in extent and ground glass decreased differently in UIP and in fibrotic NSIP.

Finally, our study seems to indicate the utility and the accuracy of the classification based on the 2011 statement which emphasizes the role of HRCT and gives the opportunity to verify the progression and the mortality rate considering three different patterns (UIP type, possible UIP and inconsistent with UIP) on the basis of simple morphologic CT criteria, obviating surgical biopsy in many cases.

Our study had some limitations.

First, the study is retrospective and the HRCT examinations were not homogeneous for CT scanner and technical parameters.

The second limitation is that surgical lung biopsies were available for only nine patients, because of the potential risks associated with this procedure, which are often greater than the expected benefit.

The third limitation is that during the follow-up, the CT scans were not performed at regular time intervals and the functional parameters and DLCO were not available at the same time points, so that a reliable comparison with CT with regard to the evaluation of progression and prognosis of IPF was not possible.

Conclusion

HRCT classification based on ATS/ERS/JRS/ALAT 2011 guidelines showed high accuracy in stratifying fibrotic changes because, in our study, the UIP, possible UIP and inconsistent with UIP patterns seem to be correlated with different progression and mortality rates of fibrotic IIP.

As a consequence, it is possible to stratify the risk of an individual patient on the basis of belonging to one of the three groups and, among the UIP-type group subjects, on the basis of the extent of the abnormalities present, even without biopsy confirmation. It is then possible to use the classification to predict the likely outcome for a given patient: “diagnosis is prognosis” [6].