Introduction

In patients with emphysema, there is a need for a clinically accessible marker that accurately reflects the morphologic extent of disease. The last decade has seen a resurgence of interest in lung volume reduction surgery for patients with emphysema [1–4]. Thus, there is now a need to stratify patients according to disease severity so that those most likely to benefit from surgical intervention are identified. The relationships between clinical features and disease extent are complex, and assessment of symptom severity alone cannot accurately gauge the morphologic extent of emphysema [5]. Similarly, because of issues of observer disagreement [6, 7] and an insensitivity to anything but relatively extensive disease [8, 9], plain chest radiography is a poor predictor of emphysema severity. Compared with plain radiography, computed tomography (CT) is a more sensitive tool for detection and quantification of emphysema [10, 11], and previous studies have documented the relationships between CT extent, physiologic tests and morphometric estimates [12–15]. Methods for quantifying emphysema on CT include visual estimation and automated techniques, both of which have their proponents [16–19].

Pulmonary function tests are the most widely used measures of emphysema severity. However, many physiologic indices are measured routinely, and the selection of indices for optimally quantifying the extent of emphysema is problematic. The aim of the present study was to construct a composite physiologic index (CPI) best reflecting the global extent of emphysema as judged by CT quantification using visual and automated methods.

Materials and methods

Consecutive patients presenting to our institution between July 1994 and July 2000 with evidence of emphysema on high-resolution computed tomography (HRCT) were identified retrospectively from computerised CT reports; the cohort also included a subgroup of consecutive patients evaluated at our institution as part of an ongoing (Institutional Review Board approved) trial of lung volume reduction surgery (LVRS). Case notes were reviewed for demographic data and the following exclusion criteria: (1) α-1-antitrypsin deficiency; (2) concurrent interstitial lung disease; (3) previous pulmonary resection; (4) bronchogenic neoplasm exceeding 5 cm in diameter; (5) HRCT features of bronchiectasis; (6) concurrent pneumothorax; (7) concurrent pleural disease (large pleural effusion or significant pleural thickening) and (8) concurrent left ventricular failure. Generic approval was given by the Ethics Committee at our institution for retrospective analysis of clinical and imaging data.

CT technique

Patients were scanned in accordance with departmental protocol on an ultrafast electron beam scanner (Imatron Inc., San Francisco, CA, USA) using a collimation of 1 mm with a 10-mm interslice spacing. Scanning was performed in the supine position and in full inspiration. All images were reconstructed using a high spatial frequency algorithm. Individual studies were downloaded from digital storage discs onto a workstation (MagicView 1000, Version 40 (release B), Siemens, Erlangen, Germany) and displayed at window settings appropriate for viewing the lung parenchyma (window centre=−550 HU; window width=1,500 HU).

CT estimation of emphysema extent: automated quantification

The extent of emphysema (emphysemaauto) was estimated on each CT section using an automated technique in which voxels with attenuation values below a specified threshold of −950 HU were highlighted; the threshold was chosen since this value was reported to predict the macroscopic and microscopic extent of emphysema [14, 15]. Prior to the application of the density threshold, macroscopic anatomical structures (i.e. chest wall and blood vessels) were segmented; the two most cranial and caudal images were excluded from automated segmentation because of the variable paucity of lung parenchymal markings in the upper zones and movement artefact from the diaphragm in the lower zones.
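The density-threshold step described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the function name `emphysema_extent`, the nested-list image representation and the precomputed `lung_mask` are all assumptions made for the example; only the −950 HU threshold comes from the text.

```python
def emphysema_extent(section_hu, lung_mask, threshold_hu=-950):
    """Percentage of lung pixels on one section with attenuation below threshold_hu.

    section_hu: 2D list of attenuation values in HU (hypothetical input format).
    lung_mask:  2D list of booleans, True where the pixel belongs to segmented lung.
    """
    lung_pixels = 0
    low_attenuation_pixels = 0
    for hu_row, mask_row in zip(section_hu, lung_mask):
        for hu, in_lung in zip(hu_row, mask_row):
            if in_lung:
                lung_pixels += 1
                if hu < threshold_hu:
                    low_attenuation_pixels += 1
    if lung_pixels == 0:
        return 0.0  # no lung on this section, so nothing to score
    return 100.0 * low_attenuation_pixels / lung_pixels
```

A per-patient value would then require some rule for combining the per-section percentages (for example, a mean across sections); the text does not specify that rule, so it is left out here.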

The technique for segmentation has been described in detail in previous studies [20, 21]. In brief, the chest wall (which has a high-intensity value) was thresholded out using an empirically selected CT value of −250 HU. Because of the large intrinsic difference in CT attenuation between chest wall (soft tissue approximately 0–100 HU and bone approximately 700 HU) and lung parenchyma (approximately −800 HU), an empirically selected threshold is satisfactory despite intersubject variation. To account for the partial volume effect at the interface between the lung parenchyma and chest wall, a morphologic dilation operator with a square flat-top structuring element with a size of five pixels was applied to remove those pixels located within these regions, creating a regional mask from which the largest connected component was extracted. This component represents the chest wall, and the two largest inner regions delineate the boundary of the lung parenchyma. Smaller inner regions, when visible, were discarded, as they represent other anatomical structures such as the oesophagus. For segmentation of pulmonary vessels, an empirically selected threshold of −750 HU was chosen and combined with a morphologic dilation operator to create a vessel mask that removes superimposed blood vessels. A three-pixel flat-top square structuring element was employed for this morphologic operator. Finally, morphologic erosion with a structuring element of one pixel was applied to eliminate blood vessels [20].
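The thresholding and dilation steps can be illustrated with a short sketch. It assumes a section stored as a nested list of HU values and uses hypothetical function names. It covers only the two thresholds (−250 HU and −750 HU) and the five- and three-pixel square structuring elements named in the text. Connected-component extraction and the final erosion are omitted, and combining the two masks with a logical OR is an assumption of the example rather than a documented detail of the original method.

```python
def threshold_mask(section_hu, threshold_hu):
    """Mark pixels whose attenuation exceeds threshold_hu (dense structures)."""
    return [[hu > threshold_hu for hu in row] for row in section_hu]


def dilate(mask, size):
    """Binary dilation with a flat square structuring element of odd width `size`."""
    half = size // 2
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                # Switch on every pixel covered by the element centred at (r, c).
                for rr in range(max(0, r - half), min(rows, r + half + 1)):
                    for cc in range(max(0, c - half), min(cols, c + half + 1)):
                        out[rr][cc] = True
    return out


def parenchyma_candidates(section_hu):
    """Pixels remaining after removing the dilated chest-wall and vessel masks."""
    wall = dilate(threshold_mask(section_hu, -250), 5)     # five-pixel element
    vessels = dilate(threshold_mask(section_hu, -750), 3)  # three-pixel element
    return [
        [not (w or v) for w, v in zip(wall_row, vessel_row)]
        for wall_row, vessel_row in zip(wall, vessels)
    ]
```

Dilating before masking widens each excluded structure by a margin of pixels, which is how the sketch mimics the stated aim of discarding partial-volume pixels at tissue interfaces.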

CT estimation of emphysema extent: visual quantification

Soft-copy images were reviewed independently by three thoracic radiologists on a workstation. As with automated quantification, the two most cranial and caudal images were excluded from visual analysis. The extent of emphysema (emphysemavis) was estimated visually to the nearest 5% on each image section. Emphysema was defined as areas of decreased attenuation, usually without discrete walls and of nonuniform distribution, causing permeative destruction of lung parenchyma [22].

Pulmonary function tests

Pulmonary function tests (expressed as percentages of predicted values for the patient’s age, gender and height [23]) included the forced expiratory volume in 1 s (FEV1), forced vital capacity (FVC) and their ratio (FEV1/FVC); total lung capacity (TLC), residual volume (RV) and their ratio (RV/TLC); and carbon monoxide diffusing capacity (Dlco), corrected for haemoglobin concentration. Dlco was measured with a single-breath technique, or with a rebreathing technique with adjustment to single-breath values, using a P.K. Morgan respirometer (P.K. Morgan Ltd., Chatham, Kent, UK), and was also adjusted for alveolar volume (Kco). Ear-lobe blood gases were measured at rest with the patient breathing air. The median interval between pulmonary function testing and CT examination was 1 day (range=0 days to 8 months); the interval between physiologic tests and CT exceeded 3 months in only five patients.

Statistical analysis

The mean extent of emphysema estimated visually by the three observers and the extent of emphysema quantified using automated estimation were analysed. Interobserver variation was quantified using analysis of variance (with individual patients as the “group variable”) [24]. For selected subanalyses, the cohort was divided into equal groups designated as “limited” (extent of emphysemavis<45%; n=53) and “extensive” emphysema (extent of emphysemavis≥45%; n=53). Univariate correlations were examined using Pearson’s product-moment correlation coefficient because CT data comprised normally distributed continuous variables, and pulmonary function tests are evaluated parametrically in clinical practice. For generation of composite pulmonary functional indices, separate multiple linear regression models were constructed, with the extent of emphysema (based on visual and automated estimation) designated as the dependent variable in separate models. Independent functional variables comprised FEV1, FVC, TLC, alveolar volume (VA), Dlco and Kco. Stepwise techniques were used to exclude collinear variables not contributing to equation explanatory power; all variables not independently linked to the extent of emphysema (p<0.01) were excluded. Testing for heteroscedasticity confirmed that neither model violated the parametric assumptions of multiple linear regression.
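The core of the regression step, fitting emphysema extent against several functional indices by ordinary least squares, can be sketched as below. This is an illustrative sketch only. The function name `fit_linear` is hypothetical, the fit solves the normal equations directly, and stepwise variable selection, significance testing and the heteroscedasticity check described above are not implemented.

```python
def fit_linear(predictors, response):
    """Ordinary least squares with an intercept.

    predictors: list of rows, one per patient, e.g. (Kco, FEV1) in % predicted.
    response:   list of CT emphysema extents (%), one per patient.
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    rows = [[1.0] + [float(v) for v in p] for p in predictors]
    k = len(rows[0])
    # Augmented normal equations: (X^T X | X^T y).
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, response))]
        for i in range(k)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(a[i][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for i in range(col + 1, k):
            factor = a[i][col] / a[col][col]
            for j in range(col, k + 1):
                a[i][j] -= factor * a[col][j]
    # Back substitution.
    coef = [0.0] * k
    for i in range(k - 1, -1, -1):
        known = sum(a[i][j] * coef[j] for j in range(i + 1, k))
        coef[i] = (a[i][k] - known) / a[i][i]
    return coef
```

In a stepwise procedure, a fit of this kind would be repeated while candidate variables are added or removed according to a significance criterion (p<0.01 in the present study); a statistics package would normally handle that selection.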

Results

Patient characteristics

The study population comprised 106 patients, with a median age of 61 years (range 26–86 years) and a male:female ratio of 71:35. Clinical characteristics included a previous suspected diagnosis of asthma (n=33), current bronchodilator therapy (n=86) and chronic bronchitis (n=19). A partial or full smoking history was available in 104/106 patients: there were 13 current smokers, 91 ex-smokers and two patients with inadequate data. A full pack-year smoking history was available in 96 patients (median=35 pack-years; range 7–175 pack-years). Pulmonary function test results and the extent of emphysema (from visual and automated quantification) on HRCT are summarised in Table 1.

Table 1 Pulmonary function indices and extent of visually (emphysemavis) quantified and automated (emphysemaauto) emphysema in 106 patients

Interobserver agreement

The standard deviation between the three observers for emphysema extent was 10.6%. There were striking correlations between observers (observer 1 versus observer 2, r=0.92; observer 1 versus observer 3, r=0.89; observer 2 versus observer 3, r=0.93; p<0.0005 for all three relationships).
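Assuming the pairwise observer correlations above are Pearson product-moment coefficients, as specified for univariate correlations in the statistical methods, each can be computed from two observers' per-patient scores as in the following sketch. The function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

A high r indicates that observers rank patients similarly. It does not by itself exclude a systematic offset between observers, which is why the between-observer standard deviation is reported alongside it.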

Relationship between visual and automated estimation of emphysema extent

There was a strong correlation between the extent of emphysemavis and emphysemaauto (r=0.90, p<0.0005) (Fig. 1). The extent of emphysemavis was greater than the extent of emphysemaauto in the whole group (40.4±30.5% versus 28.4±12.3%; p<0.00005) and in the subgroup with extensive emphysema (68.3±11.4% versus 38.4±7.5% respectively; p<0.00005). However, emphysemaauto was more extensive than emphysemavis in patients with limited emphysema (18.3±6.7% versus 12.6±12.8%; p<0.0001).

Fig. 1 Relationship between extent of visually quantified emphysema (emphysemavis) and extent of emphysema using automated estimation (emphysemaauto); dashed line indicates the line of identity. There was a strong correlation between the extent of visually estimated and automated emphysema (r=0.90; p<0.0005). However, there is apparent systematic overestimation of emphysema by automated methods when disease is of limited extent and overestimation by visual methods when emphysema is extensive

Differences between the extent of emphysemavis and emphysemaauto were further evaluated in a subgroup of 23 patients with percentage predicted Kco levels less than 40% (mean±SD=31.2±6.8), indicative of functionally severe disease. The extent of emphysemavis (mean±SD=63.8±20.2%) was compatible with severe disease. However, the extent of emphysemaauto (mean±SD=38.0±8.9%) in this subgroup suggested major underestimation of disease severity (as judged by pulmonary function tests) when using automated estimation.

Functional impairment in relation to CT emphysema extent

As shown in Table 2, functional–morphologic correlations were similar in strength for emphysemavis and emphysemaauto, although most indices, especially measures of gas transfer, correlated slightly more strongly with the extent of emphysemavis. The strongest correlations involved measures of gas transfer (Dlco, Kco), FEV1 and the FEV1/FVC ratio.

Table 2 Univariate correlations between pulmonary function indices and the extent of visually estimated and automated emphysema in 106 patients

Correlations between CT and functional indices were examined separately in limited and extensive emphysema (Table 3). Functional–morphologic correlations were generally weaker in extensive than in limited emphysema, regardless of the method of quantification. However, in extensive disease, pulmonary function indices were more closely linked to emphysemavis than to emphysemaauto.

Table 3 Functional–morphologic correlations in patients with limited (n=53) and extensive (n=53) emphysema

Bivariate models were constructed to determine whether functional CT relationships were strengthened with the combined use of visual and automated quantification of emphysema. On multivariate analysis, individual pulmonary function variables were linked solely to the extent of emphysemavis, with no significant linkage to emphysemaauto once the extent of emphysemavis had been accounted for.

Composite physiologic indices

CPIs were generated in separate models against the extent of emphysemavis and emphysemaauto using stepwise techniques. As shown in Table 4, FEV1 and Kco levels were the only retained variables; the equation explanatory power was higher when using emphysemavis (r²=0.57) than when quantification of emphysema was automated (r²=0.48). The combinations of indices fitting best with the extent of emphysemavis and emphysemaauto are expressed by the formulae:

$$\text{Extent of emphysema}_{\text{vis}} = 96.8 - (0.67 \times \text{Kco}) - (0.41 \times \text{FEV}_{1})$$
$$\text{Extent of emphysema}_{\text{auto}} = 47.9 - (0.19 \times \text{Kco}) - (0.20 \times \text{FEV}_{1})$$

Table 4 Independent relationships between pulmonary function indices and the extent of visually estimated (emphysemavis) and automated (emphysemaauto) emphysema

There was excellent correlation between the CPIs derived by visual and automated methods (r²=0.98) (Fig. 2). Furthermore, correlation between the extent of emphysemavis and the CPI (r=0.75; p<0.0005) (Fig. 3) was stronger than the relationship between extent of emphysemavis and any individual pulmonary function parameter.
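As a worked illustration, the two composite indices can be evaluated directly from the regression formulae given in the Results. The function names and the example patient are hypothetical; the coefficients are those reported in the text, and both inputs are assumed to be percentages of predicted values.

```python
def cpi_visual(kco_pct, fev1_pct):
    """CPI from the visual-scoring model: predicted extent of emphysema (%)."""
    return 96.8 - 0.67 * kco_pct - 0.41 * fev1_pct


def cpi_automated(kco_pct, fev1_pct):
    """CPI from the automated-scoring model: predicted extent of emphysema (%)."""
    return 47.9 - 0.19 * kco_pct - 0.20 * fev1_pct


# Hypothetical patient: Kco 40% predicted, FEV1 50% predicted.
print(f"{cpi_visual(40, 50):.1f}")     # 49.5
print(f"{cpi_automated(40, 50):.1f}")  # 30.3
```

Both indices fall as Kco and FEV1 rise, so a higher CPI corresponds to more extensive predicted emphysema. The automated index spans a narrower range, as its smaller coefficients imply.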

Fig. 2 Relationship between composite physiologic indices derived by visual [composite physiologic index (CPIvis)] and automated (CPIauto) methods

Fig. 3 Relationships between the extent of visually estimated emphysema (emphysemavis) and (a) composite physiologic index (CPI), (b) percent predicted forced expiratory volume in 1 s (FEV1) and (c) percent predicted carbon monoxide diffusing capacity adjusted for alveolar volume (Kco). The correlation of emphysemavis with the CPI (r²=0.57) is stronger than its correlations with FEV1 and Kco (r²=0.41 and 0.49, respectively)

Discussion

One of the principal goals of the current study was to determine whether estimation of the morphologic extent of emphysema is improved by using a combination of routine pulmonary function indices rather than individual measures in isolation. We have shown that gas transfer and FEV1 levels correlated best with the global extent of emphysema as judged by CT. It is emphasised that the composite index does not replace the morphological information provided by CT but does add precision to evaluation of functional severity. In the present study, functional–CT relationships were stronger with combinations of the Kco and FEV1 levels whether using visual or automated quantification of disease extent. Moreover, visual estimation of disease extent was more closely related to the severity of pulmonary function impairment, despite measurement noise due to interobserver variation in CT scoring. Thus, the CPI derived from visual scoring [96.8 − (0.67 × Kco) − (0.41 × FEV1)] is preferable to the composite index derived from automated scoring.

In an early study, the extent of emphysema on postmortem histopathologic examination correlated negatively with the most recent premortem percentage predicted FEV1 and Dlco [25]. Subsequently, it was demonstrated that the extent of emphysema on CT is consistently related to indices of gas transfer and/or airflow obstruction [26–30]. Results of the present study integrate these two relationships in a CPI. Furthermore, our results may be reconciled with the recent findings of the National Emphysema Treatment Trial Research Group investigation [4]. In that multicentre study, patients deemed to be at high risk of mortality from LVRS were randomised to surgical or medical intervention. The authors showed convincingly that mortality in patients with severe impairment of FEV1 and severe reduction in gas transfer levels (Dlco) was significantly higher with surgical treatment. Our index provides a means of combining these two functional variables into a single endpoint although Kco (rather than Dlco) was retained in stepwise regression.

Statistical methods in the current study, which were chosen to address the problem of collinearity in multivariate analysis, warrant explanation. Many pulmonary function indices are measured routinely and might, in theory, be combined in a complex CPI. However, there are important mathematical interrelationships between many indices and therefore their integration is not straightforward. Stepwise techniques exclude collinear variables that add little or nothing to the explanatory power of multivariate equations, an approach that provided a simple two-component index derived from both models in the present study.

The principle of integrating pulmonary functional indices into a CPI is desirable because multiple functional measurements can be integrated into a single index, an approach that has been adopted recently in patients with other diffuse parenchymal diseases such as idiopathic pulmonary fibrosis (IPF) [31]. By using CT estimates of the extent of interstitial fibrosis in patients with IPF, a three-component index was developed that was more accurate at predicting global disease extent than any individual functional parameter taken in isolation. Moreover, the CPI was also the strongest predictor of survival in patients with IPF. One of the principal benefits of the composite index in IPF was that, as in the current study, the score was derived from routinely collected functional measures. In patients with emphysema, there is a theoretical advantage to combining the Kco and FEV1 in a single physiologic index. The composite index, developed in the present study, can be conceptualised as quantifying the degree of airflow obstruction ascribable to emphysema alone. The advantage of Kco over Dlco in this regard is that Dlco generally falls in intrinsic airway disease when the FEV1 is less than 1 l. By contrast, Kco tends to be preserved [32] and may therefore be more robust in making the distinction between intrinsic airways disease and emphysema.

The difficulties of quantifying emphysema on CT need to be considered since both visual and automated methods have flaws. A potential drawback of visual estimation is measurement “noise” due to observer variation even though observer disagreement for scoring the extent of emphysema has not been a major factor in previous studies [31, 33, 34]. The theoretical attraction of computer-aided estimation of emphysema is elimination of measurement noise [18, 19]. Nevertheless, it must be emphasised that automated quantification is also prone to measurement error [35–40]. Moreover, our results show that correlations between individual physiologic indices and visually estimated emphysema were similar in magnitude to, and in some instances stronger than, those with automated scoring.

It is commonly assumed that automated methods are superior to visual scoring; for example, visual estimation has been reported to overestimate the extent of emphysema, particularly in the upper zones [41]. On first analysis, the same conclusions might be reached from our findings and from those of Bankier and colleagues [17]; in both studies, advanced emphysema was strikingly more extensive when scored visually. Against this, we have shown that functional–CT relationships in extensive emphysema were consistently stronger using visual scoring. Moreover, our results indicate that in end-stage emphysema (defined by a Kco level of less than 40% predicted), the mean extent of emphysema, using automated scoring, was only 38% and never exceeded 50%. Thus, it would seem that the discrepancy in quantification of emphysema extent between visual and automated estimation, observed by ourselves and others, might represent underscoring by automated techniques rather than overscoring using visual estimation.

Plainly, both visual and automated approaches to CT quantification of emphysema are imperfect, and it was for this reason that we explored both methods for the purpose of deriving the CPI. The absence of a readily available gold standard (i.e. morphometric estimation of emphysema extent) against which to compare CT quantification is a limitation in this field. However, the relative similarity in structure–function relationships in the present study and the strong correlation between the visual and automated composite indices are themselves striking findings. Thus, we believe that our results are robust since, regardless of the method of quantification on CT, the same two physiologic components (FEV1 and Kco) were identified in separate multivariate analyses.

In summary, we developed a two-component CPI for predicting the global extent of emphysema. The CPI can be derived from routinely available pulmonary function tests and, thus, should be both easily applicable and reproducible. We suggest that the benefits of the index include its application to accurate stratification of patients with emphysema in epidemiologic studies and trials of new therapies, such as LVRS.