Introduction

Systemic sclerosis (SSc) is an autoimmune connective tissue disease (CTD) characterized by progressive thickening and hardening of the skin as well as various internal organs. Oesophageal involvement is frequent in SSc [1, 2]. The relationship between oesophageal dilatation (OD), detected on chest high-resolution computed tomography (HRCT), and oesophageal dysmotility is well established in these patients [3,4,5,6].

Interstitial lung disease (ILD) is also a common feature in SSc patients [7]. Although the pathologic mechanisms underlying ILD are not yet fully elucidated, there is evidence that oesophageal motility disturbances and gastro-oesophageal reflux (GER) are implicated in ILD development in several lung conditions, including SSc. It is assumed that both abnormalities of oesophageal peristalsis and decreased lower oesophageal sphincter pressure may lead to repeated microaspirations of acidic gastric content into the respiratory tract, with consequent and progressive airway damage [8,9,10].

OD detected on chest HRCT is frequently associated with extensive ILD, as well as with poorer pulmonary function [4, 5]. The coexistence of ILD and GER disease (GERD) in patients with CTDs [6] and the evidence that CTD patients with ILD have a higher incidence of pathologic reflux reinforce the assumption that GERD may play a role in the natural history of lung disease in subjects with CTD [9].

Several studies have reported the prevalence of OD on chest HRCT in SSc, using empirical cutoff values to define OD without regard to normal standards [3, 4, 11,12,13].

The purposes of this study were to confirm the association of OD and ILD on chest HRCT in patients with SSc, and to identify the oesophageal diameter cutoff value with the best association (higher sensitivity and specificity) with SSc-ILD.

Materials and methods

Study population

From January 2016 to December 2017, consecutive SSc patients, defined by the American College of Rheumatology classification criteria [14], were recruited from the Rheumatological Clinic and the Medical Clinic of the Università Politecnica delle Marche. Participants were classified as having limited or diffuse cutaneous involvement (lcSSc and dcSSc, respectively), according to LeRoy et al. [15]. The modified Rodnan skin score (mRSS) (score 0–51, where lower values represent a better condition) was employed to assess skin damage [16].

The presence of autoantibodies, including anti-topoisomerase I and anti-centromere, was also assessed.

The exclusion criteria were: current or recent (within 3 months) respiratory infection, severe pulmonary hypertension requiring specific treatment, uncontrolled congestive heart failure, and clinically significant abnormalities other than ILD identified on chest HRCT. Echocardiography and right heart catheterization were not part of the evaluation for this study.

Patient-centred measures

Patient-centred measures were collected to evaluate dyspnoea, physical function, and GER symptoms, using the Borg Dyspnea Index (Borg score) [17, 18], the Health Assessment Questionnaire-Disability Index (HAQ-DI) [19, 20], and the GerdQ questionnaire [21], respectively.

The Borg score assesses perceived dyspnoea (breathing discomfort) on a numerical scale from 0 to 10 (0 = no breathlessness at all, 0.5 = very, very slight (just noticeable), 1 = very slight, 2 = slight breathlessness, 3 = moderate, 4 = somewhat severe, 5 = severe breathlessness, 7 = very severe breathlessness, 9 = very, very severe (almost maximum), and 10 = maximum) [22].

The HAQ-DI is a tool to measure the functional status (evaluating activities of daily living), and is calculated as an ordinal variable (from 0 = no disability, to 3 = severe disability).

The HAQ-DI was originally intended for use in arthritis [19]; however, it has been shown to correlate with visceral and cutaneous involvement in SSc and to detect deterioration of function in these patients [20, 23].

The GerdQ questionnaire is a simple six-item self-administered tool [24]. Four items assess the symptoms and situations considered positive predictors for GERD diagnosis: heartburn, regurgitation, sleep disturbance, and use of over-the-counter products. The other two items evaluate symptoms considered negative predictors for reflux, namely nausea and epigastric pain. The patient answers each question about symptom frequency during the last week using a Likert-like scale scored from 0 to 3 for positive predictors and from 3 to 0 for negative predictors [21]. The maximum score that can be obtained is 18. A GerdQ cutoff of 9 gave the best balance between sensitivity (66%; 95% confidence interval (CI) 58–74) and specificity (64%; 95% CI 41–83) for GERD.
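The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the validated instrument: the function name `gerdq_total`, the coding of each answer as a frequency category from 0 (least frequent) to 3 (most frequent), and the example patient are assumptions made for the illustration.

```python
from typing import Sequence

GERDQ_CUTOFF = 9  # threshold for a positive GerdQ as applied in this study


def gerdq_total(positive: Sequence[int], negative: Sequence[int]) -> int:
    """Sum a six-item GerdQ from frequency categories coded 0..3 (assumed coding).

    positive: four positive-predictor items (heartburn, regurgitation,
              sleep disturbance, over-the-counter product use)
    negative: two negative-predictor items (nausea, epigastric pain)
    """
    if len(positive) != 4 or len(negative) != 2:
        raise ValueError("GerdQ needs 4 positive and 2 negative items")
    if any(not 0 <= v <= 3 for v in (*positive, *negative)):
        raise ValueError("frequency categories must be in 0..3")
    # Positive predictors score 0..3; negative predictors are reverse-scored 3..0.
    return sum(positive) + sum(3 - v for v in negative)


# Hypothetical patient: frequent heartburn and regurgitation, no nausea or pain.
example_total = gerdq_total(positive=[3, 2, 1, 1], negative=[0, 0])
print(example_total, example_total >= GERDQ_CUTOFF)
```

Reverse-scoring the two negative predictors is what allows a patient with frequent reflux symptoms and no nausea or epigastric pain to reach the maximum of 18.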

Pulmonary function tests (PFTs)

PFTs were carried out within 2 weeks of the chest HRCT assessment, by spirometry using a computerized lung analyser (MasterScreen Diffusion, Jaeger GmbH, Höchberg, Germany). Forced vital capacity (FVC), forced expiratory volume in the first second (FEV1), and the single-breath carbon monoxide diffusing capacity of the lung (DLco) were recorded. These PFT parameters were expressed as a percentage of the predicted value. At least three measurements were taken for each variable to guarantee repeatability.

Parenchymal abnormalities on HRCT

All HRCT examinations were performed according to a standard protocol, using a 64-slice CT scanner (GE LightSpeed VCT Power) with a tube rotation time of 0.65 s. Scans were acquired in full inspiration from the apex to the lung base in the supine position, at 120 kV and 300 mAs, with a slice thickness of 1.25 mm and a slice spacing of 7 mm. No contrast medium was administered. Lung abnormalities were examined by an experienced general and thoracic radiologist (MC), blinded to clinical and functional findings. The lung parenchymal abnormalities were assessed according to the Warrick scoring. For a detailed description of the Warrick scoring, the reader can refer to the original article [25].

Oesophageal diameter measurement in HRCT

In this study, the widest oesophageal diameter (WOD) was used as a measure of OD. As employed by Richardson et al. [5], for each patient WOD was collected on chest HRCT (axial images) by measuring the largest distance (mm) between the internal oesophageal mucosal limits at three levels: above the aortic arch, between the right inferior pulmonary vein and the aortic arch, and between the diaphragmatic hiatus and the right inferior pulmonary vein (Fig. 1). A similar method to measure OD has already been used in other SSc-ILD studies with good interobserver agreement [12, 13, 26]. The oesophageal axial diameter in HRCT was measured with OsiriX MD version 7 (64-bit), a DICOM viewer (Fig. 2), on a Mac Mini (2.8 GHz Intel Core 2 Duo Desktop Computer, 16 GB random-access memory; Apple Computer, Cupertino, CA, USA) running macOS High Sierra, version 10.13.2.

Fig. 1 Widest oesophageal diameter (WOD) assessment (arrows) on chest high-resolution computed tomography scans. According to Richardson et al. [5], the three oesophageal diameter measurements were performed: above the aortic arch (a), between the right inferior pulmonary vein and the aortic arch (b), and between the diaphragmatic hiatus and the right inferior pulmonary vein (c). Note interstitial lung disease with ground-glass and reticular opacities and traction bronchiectasis

Fig. 2 Representative sequence of the OsiriX measurement process of the widest oesophageal diameter in axial high-resolution computed tomography scan

Interobserver agreement for the oesophageal axial diameter measurement in HRCT was tested in 20 examinations. There was good agreement for all three oesophageal diameter measurements on supine axial HRCT images (above the aortic arch, weighted kappa = 0.69; between the right inferior pulmonary vein and the aortic arch, weighted kappa = 0.71; between the diaphragmatic hiatus and the right inferior pulmonary vein, weighted kappa = 0.63).

Statistical analysis

Data were entered into a Microsoft Excel database and analysed using MedCalc® version 16.0 (MedCalc Software, Mariakerke, Belgium). Values were expressed as both mean ± SD (standard deviation) and median (interquartile range, IQR). Interobserver agreement was calculated using a Fleiss weighted kappa test. A value of 0–0.20 was considered poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 good, and 0.81–1.00 excellent. A two-sample t test was used to compare continuous variables and the χ2 test to compare categorical variables between patient groups. The relationships among WOD, Warrick score, PFT results, and patient-centred measures were assessed using Pearson’s product–moment correlation (Pearson r values). Furthermore, multivariate regression analysis was performed to identify the factors associated with OD on HRCT. Covariates considered in the model included age, gender, disease duration, anti-topoisomerase antibodies, mRSS, Borg score, GerdQ, HAQ-DI, FVC, and DLco. The results were expressed as the multivariate regression coefficient (R) and the coefficient of determination (R2) adjusted for the number of variables entered in the analysis. Significance was set at p < 0.05. Receiver operating characteristic (ROC) curve analysis, summarized by the area under the curve (AUC-ROC), was used to identify the WOD cutoff with the best sensitivity and specificity for SSc-ILD. A Warrick score of 7 was used as the cutoff point for the presence of significant SSc-ILD [27].
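As an illustration of the kappa interpretation bands listed above, the following Python sketch maps a kappa value to its agreement label and applies it to the three weighted kappa values reported for the oesophageal measurements. The function name `kappa_band` is hypothetical, and assigning values that fall between two printed bands (for example 0.205) to the higher band is an assumption, since the bands are stated to two decimals only.

```python
def kappa_band(kappa: float) -> str:
    """Return the agreement label for a kappa value in [0, 1]."""
    if not 0.0 <= kappa <= 1.0:
        raise ValueError("this sketch only covers kappa in [0, 1]")
    # Upper bound of each band, in ascending order.
    bands = ((0.20, "poor"), (0.40, "fair"), (0.60, "moderate"), (0.80, "good"))
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "excellent"


# Weighted kappa values reported for the three measurement levels.
reported_kappas = {
    "above the aortic arch": 0.69,
    "right inferior pulmonary vein to aortic arch": 0.71,
    "diaphragmatic hiatus to right inferior pulmonary vein": 0.63,
}
labels = {level: kappa_band(k) for level, k in reported_kappas.items()}
for level, label in labels.items():
    print(f"{level}: {label}")
```

All three reported values fall in the 0.61–0.80 band, which corresponds to the "good" agreement stated for the interobserver test.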

Results

Overall, 126 SSc patients were included in the study. The mean (± SD) age was 60.7 (± 10.7) years, the mean (± SD) disease duration was 11.15 (± 7.96) years, and 82% were women. Patients with dcSSc (53 patients) were older than those with lcSSc (73 patients) (64 vs. 58 years; p = 0.002). The mean (± SD) WOD was 13.5 (± 4.2) mm, and in 76 (60.3%) participants WOD was ≥ 11 mm. ILD was diagnosed in 86 SSc patients (Warrick score ≥ 7), while in 40 subjects the lung findings were normal. On PFTs, mean FVC was 87.0 ± 78.5%, mean FEV1 87.9 ± 16.2%, and mean DLco 71.6 ± 14.4% of predicted. Sixty-four (50.8%) patients had a total GerdQ score ≥ 9. SSc patients with ILD (86 patients) had a larger mean oesophageal diameter than those without lung disease (mean ± SD 14.8 ± 4.3 vs. 10.6 ± 2.7 mm; p = 0.003). Subjects with greater WOD were more likely to be anti-topoisomerase I positive (31% vs. 19%, p = 0.002), to have dcSSc (59.2% vs. 40.7%, p = 0.001), and to have longer disease duration (12.6 vs. 9.0 years, p = 0.013). They were also more likely to be older (64 vs. 55 years, p = 0.001). Baseline study cohort characteristics are shown in Table 1.

Table 1 Baseline study cohort characteristics

The results of the analyses of the relationships among WOD, patient-centred measures, Warrick score, and PFTs results are shown in Fig. 3. A high correlation was observed between WOD and GerdQ (r = 0.886, p < 0.001), Borg score (r = 0.705, p < 0.001), and Warrick score (r = 0.614, p < 0.001). WOD negatively correlated with DLco (r = − 0.508, p < 0.001). Fair to moderate correlations were found between WOD and disease duration (r = 0.302, p = 0.001), age (r = 0.346, p = 0.001), and HAQ-DI (r = 0.422, p < 0.001).

Fig. 3 Scatter plots with regression line, illustrating the correlation between the widest oesophageal diameter (WOD) and GerdQ (r = 0.886, p < 0.001) (a), Borg index (r = 0.705, p < 0.001) (b), DLco (r = − 0.508, p < 0.001) (c), and Warrick score (r = 0.614, p < 0.001) (d)

The results of the multivariate regression analysis revealed positive associations between WOD and GerdQ (p < 0.0001), Borg score (p = 0.0005), and Warrick score (p = 0.0192) with a coefficient of determination R2 of 0.837 (Table 2). Age, sex, disease duration, SSc disease pattern, anti-topoisomerase I antibodies, HAQ-DI, mRSS, and PFTs were not significantly associated with WOD on HRCT.

Table 2 Multivariate regression analysis between the widest oesophageal diameter (WOD) and the other variables

The AUC-ROC analysis for the presence of significant SSc-ILD gave the optimal balance between sensitivity (80.2%; 95% CI 70.2–88.0) and specificity (72.5%; 95% CI 56.1–85.4) with a WOD cutoff ≥ 11 mm (AUC = 0.819; SD 0.036; 95% CI 0.746–0.891) (Fig. 4).
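For readers who wish to relate these figures to predictive values and likelihood ratios, the standard definitions can be sketched in Python as below. This is a sketch under stated assumptions, not a reproduction of the authors' MedCalc output: the function names are hypothetical, the prevalence is taken as the 86 of 126 patients with a Warrick score ≥ 7 in this cohort, and the inputs are the rounded sensitivity and specificity reported for the ≥ 11 mm cutoff, so the results are approximate.

```python
def predictive_values(sens: float, spec: float, prev: float) -> tuple[float, float]:
    """Positive and negative predictive values via Bayes' rule."""
    ppv = sens * prev / (sens * prev + (1.0 - spec) * (1.0 - prev))
    npv = spec * (1.0 - prev) / (spec * (1.0 - prev) + (1.0 - sens) * prev)
    return ppv, npv


def likelihood_ratios(sens: float, spec: float) -> tuple[float, float]:
    """Positive and negative likelihood ratios (independent of prevalence)."""
    return sens / (1.0 - spec), (1.0 - sens) / spec


# Reported values for the WOD cutoff >= 11 mm; prevalence assumed from the cohort.
SENS, SPEC = 0.802, 0.725
PREVALENCE = 86 / 126

ppv, npv = predictive_values(SENS, SPEC, PREVALENCE)
lr_pos, lr_neg = likelihood_ratios(SENS, SPEC)
print(f"PPV={ppv:.3f} NPV={npv:.3f} LR+={lr_pos:.2f} LR-={lr_neg:.2f}")
```

Predictive values depend on how common ILD is in the population studied, so values derived from this cohort would not transfer directly to a setting with a different ILD prevalence; the likelihood ratios do not share that dependence.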

Fig. 4 Receiver operating characteristic curve for determination of the widest oesophageal diameter (WOD) optimal extent threshold. The circle on the curve shows the optimal cutoff point, corresponding with the maximum sum of sensitivity and specificity

Calculations of the negative and positive predictive values (NPV and PPV), as well as of the positive and negative likelihood ratio (LR+ and LR−) confirmed the optimal cutoff point of 11 mm in this study population. Values > 9 mm for WOD increased the sensitivity to 93.0% but decreased the specificity to 30.5%, whereas measures > 19 mm increased the specificity to 97.5% but decreased the sensitivity to 23.3% (Table 3).

Table 3 Receiver operating characteristic curve analysis for the best cutoff point for the widest oesophageal diameter (WOD) associated with the presence of interstitial lung disease (applying a Warrick score of 7 as external criterion)
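The selection criterion used here, the maximum sum of sensitivity and specificity, is equivalent to maximizing Youden's J statistic (sensitivity + specificity − 1); the text does not use that name, so it is introduced here only for illustration. The Python sketch below compares the three cutoffs for which sensitivity and specificity are quoted in the text. The dictionary keys and function name are hypothetical, and the full set of candidate cutoffs in Table 3 is not reproduced.

```python
def youden_j(sens: float, spec: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sens + spec - 1.0


# (sensitivity, specificity) pairs quoted in the text for three WOD cutoffs.
cutoffs = {
    "> 9 mm": (0.930, 0.305),
    ">= 11 mm": (0.802, 0.725),
    "> 19 mm": (0.233, 0.975),
}

j_by_cutoff = {name: youden_j(*pair) for name, pair in cutoffs.items()}
best_cutoff = max(j_by_cutoff, key=j_by_cutoff.get)

for name, j in j_by_cutoff.items():
    print(f"{name}: J = {j:.3f}")
print("best:", best_cutoff)
```

Among these three candidates the ≥ 11 mm threshold gives the largest J, consistent with the cutoff identified by the ROC analysis; the lower and higher thresholds each trade most of one metric for the other.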

Discussion

This study demonstrated that OD is a frequent feature of SSc patients, and that this condition is more common in subjects with coexisting ILD. Moreover, there is a clinically significant association between OD and HRCT findings of ILD: oesophageal diameter correlates positively with patient-centred measures of dyspnoea, gastro-oesophageal symptoms, and functional disability, and negatively with DLco. Furthermore, OD is more prevalent in subjects with longer disease duration and in those with anti-topoisomerase I serum autoantibodies.

Additionally, to the best of our knowledge, this is the first study to define a cutoff point for OD associated with SSc-ILD using ROC curve analysis.

The mechanisms underlying SSc-ILD are not yet completely known. Some evidence suggests that both cell-mediated and humoral immunity play a role in the pathogenesis of ILD [28,29,30,31,32].

Oesophageal motor alterations have also been considered as contributing factors to SSc-ILD [8,9,10]. The changes in oesophageal peristalsis and decreased lower oesophageal sphincter tone may predispose to GER [8,9,10, 33].

Many investigators have described how GER can be one of the initiating factors of a variety of respiratory disorders (e.g., asthma, bronchiectasis, and recurrent acute pneumonia) [34,35,36,37]. Microaspirations of gastric content into the airways are believed to act as a trigger mechanism in inducing pulmonary parenchymal lesions. Several studies have pointed out that GER therapy could potentially improve symptoms and PFT parameters in these patients [34, 37, 38].

Several studies have reported the prevalence of OD on chest CT scans in SSc patients. These studies used empirical cutoff values to define OD without regard to normal standards. For example, Bhalla et al. [3] and Pitrez et al. [11] employed a definition of dilatation as an oesophageal diameter below the aortic arch > 10 mm on axial scans, based on a computed tomography atlas [39]. Takekoshi et al. proposed a cutoff value of 10 mm at the carinal level and 15 mm for maximum diameter [40]. Pitrez et al. used the ROC curves to determine the oesophageal diameter associated with oesophageal dysmotility, as assessed by radionuclide scintigraphy [11]. They found that an oesophageal diameter below the aortic arch > 9 mm had 83.1% sensitivity and 94.1% specificity for dysfunction.

However, the literature is somewhat conflicting regarding the association between OD and SSc-ILD.

Previous studies that extrapolated the 9 or 10 mm oesophageal diameter cutoff point to study the association with radiographic ILD on HRCT yielded conflicting results: Vonk et al. (≥ 10 mm) and Pandey et al. (≥ 9 mm) did not find a significant association between OD and ILD [4, 13]. However, both 10 and 9 mm oesophageal diameter cutoff points seem to have low specificity for the association with ILD.

Although Pandey et al. concluded that there was no association between OD and ILD, it is possible that the size of the cohort or the cutoff point of 9 mm may account for the lack of association. Interestingly, the authors noted a statistically significant reduction in DLco and a non-significant trend towards reduction in total lung capacity in those patients with oesophageal diameters > 9 mm. This finding may suggest that DLco is a more sensitive marker of lung injury related to silent aspiration as has been shown in other forms of lung injury.

In 2012, Patiwetwitoon et al. published results from another study involving 71 patients with SSc and showed a significant correlation between the extent of honeycombing on HRCT and oesophageal diameter [12]. The authors did not report PFTs results.

Lock et al. found that in SSc patients, the presence of hypomotility or aperistalsis detected on oesophageal manometry is associated with lower lung volumes and reduced DLco values [33]. In addition, Richardson et al. showed that an increased oesophageal diameter on HRCT in SSc patients is associated with more severe ILD, lower lung volumes, and worse CO diffusion [5].

Although our study does not demonstrate a causal relationship between oesophageal diameter and SSc-ILD, our findings are consistent with the results of previous studies corroborating the hypothesis that GER and microaspiration may be involved in the SSc-ILD pathogenesis.

Three potential limitations of our study have to be mentioned. Firstly, there were some intrinsic issues: the study was cross-sectional, and information on risk factors for SSc-ILD progression was not available for our cohort. Moreover, endoscopic oesophageal techniques were not performed routinely, and information about some baseline variables, such as pack-years of tobacco exposure, was not available. Secondly, the generalizability of our results may be limited by recruitment at a single university centre. Thirdly, we had no control group or patients with other causes of oesophageal dysfunction to compare with SSc patients.

In conclusion, our findings confirm that patients with SSc-ILD have a more dilated oesophagus on chest HRCT than patients with SSc and no significant lung disease. Chest HRCT measurements have several advantages over conventional methods for assessing oesophageal alterations (they are non-invasive and HRCT is widely used in SSc patients). Therefore, the detection of OD in the early stage of ILD may help start early treatment and prevent further progression of lung disease [6]. Future longitudinal studies to determine whether a dilated oesophagus is a risk factor for ILD progression should be designed to include careful assessment of the SSc subset, quantitative changes on HRCT scan of the lungs [18], and objective reference criteria for GERD diagnosis.