Introduction

Systemic sclerosis (SSc) is a generalized disorder of the connective tissue characterized by progressive fibrosis of the skin (in both the limited and diffuse forms) and of the internal organs, which causes disability and may lead to organ failure and death.

While ultrasound (US) assessment of heart, kidney, and vascular involvement has been extensively investigated in the past, only a few, very recent studies have examined its role in the assessment of the skin [1, 2], musculoskeletal disease [3], nerves [4], and lung involvement. This review provides an update of the available data on the role of US in the diagnosis and clinical evaluation of lung involvement in SSc.

The lung involvement in SSc

Lung involvement is common in patients with SSc (second only to esophageal involvement in terms of prevalence) and most often comprises fibrosis or interstitial lung disease (ILD) and pulmonary vascular involvement leading to pulmonary arterial hypertension [5–7], which may cause severe morbidity and mortality [6]. Clinically significant ILD occurs in about 40 % of patients and can present with atypical chest pain, cough, fatigue, and dyspnea [6], but, above all in the early phase, it may be subclinical or difficult to distinguish from other comorbidities, particularly when there is relevant involvement of the heart. In fact, lung fibrosis is found in most SSc patients at postmortem examination [8] and may account for 16 % of deaths [9].

Although ILD is more frequently associated with the diffuse pattern of SSc [10–12], neither the extent nor the severity of cutaneous manifestations correlates with pulmonary involvement [13], which may also occur in limited SSc or in patients without skin involvement (“SSc sine scleroderma”) [14]. Given the high prevalence, disability, and mortality of ILD, its early detection and the prompt initiation of treatment are of paramount importance in improving the outcome of SSc. For these reasons, in the last few years there has been much debate over establishing a diagnostic gold standard for early diagnosis and subsequent follow-up in SSc [15, 16].

Lung biopsy is still considered the gold standard for diagnosis but is not suitable as a screening method because of its invasiveness and the possibility of sampling errors [17]. High-resolution computed tomography (HRCT) is therefore considered essential for differential diagnosis, early detection of alveolitis, definition of the geographical distribution and pattern of fibrosis (reticular or ground glass), and, finally, staging of the disease [18], even though HRCT features and histological findings are not always completely concordant [19].

In the last few years, the use of HRCT has significantly increased the sensitivity of lung fibrosis diagnosis, particularly with respect to chest X-ray, which seems to be quite insensitive [20], and it might play a determining role in the prognosis of SSc patients [21]. However, it is not used frequently during follow-up [7], in order to avoid the risk associated with its high radiation dose.

Thus, during follow-up, ILD is monitored through the reduction of the diffusing capacity for carbon monoxide (DLCO) on pulmonary function tests (PFTs), which has been shown to correlate with the severity of ILD detected by HRCT [22, 23]. Unfortunately, PFTs might not reveal specific changes in the very initial phase of disease [24], and they give information only on lung function, not on the amount or localization of fibrosis. Therefore, the need for an imaging technique that is easy to perform, rapid, cheap, and radiation free for routine monitoring has led, in recent years, to exploration of the possible use of transthoracic US to assess lung fibrosis in SSc.

Lung US evaluation

Lung US evaluation was initially performed in the late 1960s to assess pleural effusion, but it was only in the 1990s that the “comet tail” artifacts in the lung, or US lung comets (ULCs), were first described by Lichtenstein [25] for the assessment of alveolar-interstitial syndrome.

ULCs are US signs of thickening of the subpleural interlobular septa (from the usual value of 300 to 700 μm) due to the presence of fluid in lung edema or to collagen accumulation in pulmonary fibrosis [26, 27]. Physically, the pathological lung shows these vertical US “artifacts,” initially called “ULCs” and subsequently termed “B lines,” which arise strictly from the pleural line, are well defined and laser-like, move with lung sliding, and spread to the edge of the screen without fading, erasing the normal “A lines” (horizontal lines that arise from the pleural line; Fig. 1) [28]. This artifact is not found exclusively in pathologic conditions and can be visualized in about 27 % of healthy subjects [25, 29].

Fig. 1 Gray-scale US of the lung with the probe on the anterior axillary line showing US lung comets or B lines (arrowheads). White arrows indicate the pleural line

Lichtenstein et al. [25] established for the first time that B lines correlate with the thickening of interlobular septa caused by water in interstitial disease and in alveolar edema, through a systematic comparison with chest computed tomography (CT). Furthermore, the CT data showed that US is able to distinguish the “water pattern” from the “fibrotic pattern” of the thickening in chronic obstructive pulmonary diseases.

A few years later, Jambrik et al. [29] described the methodology of the US examination and provided an initial scoring system. In this study, the intercostal spaces of 121 consecutive hospitalized patients with cardiac or pulmonary diseases (32 with interstitial disease) were examined first on the parasternal and midclavicular lines (with the patient supine), then on the anterior, mid, and posterior axillary lines (with the patient lying laterally), and finally on the paravertebral and angular scapularis lines (with the patient seated). All of the spaces from II to V (on the anterior and lateral thorax), from I to XI (on the paravertebral line), and VII–VIII on the angular scapularis line were scanned longitudinally and bilaterally. The authors then demonstrated a correlation between the amount of fluid estimated on X-ray [30] and the number of B lines [29].

US assessment was performed with a low-frequency cardiac transducer (2.5–3.5 MHz), and the intra- and inter-observer variability was 5.1 and 7.4 %, respectively. A limitation of the study, however, was the use of two different types of machines (Optigo Philips and Hewlett Packard Sonos 5500) to perform US. One year later, Agricola et al. [31], using the PiCCO System (a device for measuring cardiac output and extravascular lung water [EVLW]), demonstrated a significant positive linear correlation between the total B-line score and both the radiologic score and EVLW.

In 2007, Gargani et al. [32] showed in an animal model that lung biopsies (showing interstitial and alveolar edema) confirmed the US finding of ULCs. Similar results were reported by Jambrik et al. [33] in 2010.

The first ILD severity score, based on the total number of B lines counted during the full examination, was proposed by Picano et al. [34] (0, normal, with <5 ULCs; 1, low, with 5–15 ULCs; 2, moderate, with 15–30 ULCs; 3, severe, with >30 ULCs).
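
As a rough illustration, the grading bands quoted above can be written as a small lookup. This is only a sketch: the function name `picano_grade` is hypothetical, and because the published bands overlap at 15 and 30 ULCs, the assignment of exactly 15 to the "low" grade and exactly 30 to the "moderate" grade is an assumption made here, not something stated in the source.

```python
def picano_grade(total_b_lines: int) -> int:
    """Map a total B-line (ULC) count to the 0-3 severity grade quoted from [34].

    Assumption (not from the source): the overlapping band edges are resolved
    so that a count of 15 is graded "low" and a count of 30 is graded "moderate".
    """
    if total_b_lines < 0:
        raise ValueError("B-line count cannot be negative")
    if total_b_lines < 5:
        return 0  # normal
    if total_b_lines <= 15:
        return 1  # low
    if total_b_lines <= 30:
        return 2  # moderate
    return 3      # severe (>30)
```

For example, a patient with 22 B lines over the whole examination would fall in grade 2 under these assumptions.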

The real history of US assessment in SSc (and, more generally, in rheumatology) starts in 2008, when Doveri et al. [35] and Gargani et al. [36] described ULCs as an index of interstitial fibrosis in SSc patients. Doveri et al. [35] examined 30 SSc patients using the method of Jambrik [29] and the score of Picano [34] with a cardiac probe (2.5–3.5 MHz, Optigo Philips), without any healthy or disease controls. HRCT was graded from 0 to 3 (0, normal; 1, bibasilar fibrosis; 2, diffuse fibrosis; and 3, honeycombing), and this grade correlated with the ULC score. In the same study, the number of ULCs was higher in Scl-70-positive patients (a group usually associated with a worse prognosis) than in ACA-positive patients. Finally, the number of B lines was not correlated with pulmonary arterial hypertension (PAH). The limitations of this study are that neither the intra- nor the inter-reader variability was reported and that the US results were not compared with a more structured HRCT score that details the elementary alterations and the lung areas involved.

Subsequently, Gargani et al. [36], using the same Jambrik methodology [29] and the same type of US machine as Doveri [35], studied 33 SSc patients, comparing the total number of ULCs with the HRCT Warrick score (which considers parenchymal alterations such as ground glass, irregular pleural margins, septal lines, honeycombing, and subpleural cysts, as well as the number of lung segments involved) [37]. They modified the Picano cutoff [34], defining as “pathologic” a total number of B lines higher than ten (a fully “white” image in a single scan was counted as ten B lines).
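
The counting rule just described (sum the B lines over all scanned sites, count a fully white scan as ten, and call the examination pathologic when the total exceeds ten) could be sketched as follows. The function names and the use of the string `"white"` to encode a white-lung scan are hypothetical choices for illustration only.

```python
WHITE_SCAN_EQUIVALENT = 10   # a fully "white" scan is counted as ten B lines
PATHOLOGIC_CUTOFF = 10       # "pathologic" means a total strictly above ten

def total_b_lines(site_findings):
    """Sum B lines over all scanned sites.

    Each entry is either an integer B-line count or the string "white"
    (a hypothetical encoding of a fully white scan).
    """
    total = 0
    for finding in site_findings:
        if finding == "white":
            total += WHITE_SCAN_EQUIVALENT
        else:
            total += int(finding)
    return total

def is_pathologic(site_findings):
    """Apply the modified cutoff: pathologic if the total is higher than ten."""
    return total_b_lines(site_findings) > PATHOLOGIC_CUTOFF
```

Note that, read literally, a total of exactly ten B lines is not pathologic under this cutoff.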

The number of ULCs correlated with the HRCT assessment. However, in this study too, the results did not describe the concordance between the individual segments involved on US and on HRCT. Interestingly, weaker but still statistically significant correlations were found between the number of B lines and the impairment of PFT parameters (total lung capacity, vital capacity, and DLCO).

Finally, in this study, the intra- and inter-observer variability between two blinded and independent sonographers was evaluated and confirmed the data previously published by Jambrik [29].

More recently, Delle Sedie et al. [38] demonstrated that B lines are also visible using a linear probe with a higher frequency (6 MHz). The authors modified the previously described scanning protocol [35], omitting the mid axillary line because, given the width (6 cm) of the linear probe footprint, B lines might be counted twice. Twenty-five patients were studied with both a linear transducer (6 MHz, Toshiba Powervision 6000) and a cardiac transducer (2.5–3.5 MHz, Optigo Philips), the latter already used in other studies, and the results were compared with the gold standard provided by HRCT (using the Warrick score).

A significant intra-class correlation (ICC) was found between ULCs obtained by both probes (ICC = 0.681), with better (but not statistically significant) results for the anterior chest region than for the posterior side. Moderate to good intra-class correlation was shown between cardiac or linear probe and HRCT (ICC = 0.547 and 0.600, respectively). Furthermore, specificity and sensitivity of US with respect to the HRCT were 70 and 85 % for the cardiac probe, and 60 and 85 % for the linear one. Pathologic cutoffs for B lines were calculated for each probe used, resulting in 11 and 5 for linear and cardiac probe, respectively. Finally, the authors concluded that lung US examination does not seem to be related to a specific US transducer, but a specific cutoff has to be set for each type of US equipment.

Recently, Gutierrez et al. [39] evaluated the feasibility of a reduced US scan protocol compared with the Jambrik method [29] (employed in all of the previous studies), in order to obtain a less time-consuming methodology. The authors studied a group of 36 patients with different connective tissue diseases (28 with SSc), comparing the US results (obtained with a MyLab70 Esaote machine and a 2–7-MHz convex transducer) with the HRCT Warrick score.

The comprehensive US assessment was performed using the Jambrik approach [29]. Through a simple post hoc analysis of the comprehensive assessment, 14 sites were then chosen for the simplified US assessment (because of the higher prevalence of B lines demonstrated at those sites in the comprehensive assessment and their easy accessibility by US). The 14 lung intercostal spaces (LIS) chosen were: the II LIS on the parasternal lines; the IV LIS on the midclavicular, anterior axillary, and mid axillary lines; and the VIII LIS on the paravertebral, subscapular, and posterior axillary lines. A positive correlation was found between the number of B lines and the Warrick score with both methods, and the two methods were also highly correlated with each other. The inter-operator agreement for the simplified assessment was very high (κ > 0.8), although slightly lower than for the comprehensive assessment (κ > 0.9). Finally, the mean time needed to perform the simplified assessment was shorter than for the comprehensive one (8.3 versus 23.3 min).
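
To make the simplified protocol concrete, the scanning sites listed above can be enumerated as data. This is a minimal sketch: the names `SIMPLIFIED_SITES` and `simplified_scan_sites` are hypothetical, and it assumes that each listed space is scanned on both the right and left side, which is how seven space-and-line combinations yield the 14 sites reported.

```python
# Intercostal space -> scanning lines, as listed for the simplified assessment.
SIMPLIFIED_SITES = {
    "II": ["parasternal"],
    "IV": ["midclavicular", "anterior axillary", "mid axillary"],
    "VIII": ["paravertebral", "subscapular", "posterior axillary"],
}

def simplified_scan_sites():
    """Return (side, intercostal space, line) tuples for the simplified scan.

    Assumption (not stated explicitly in the text): every listed space is
    examined bilaterally, giving 7 x 2 = 14 sites.
    """
    sites = []
    for space, lines in SIMPLIFIED_SITES.items():
        for line in lines:
            for side in ("right", "left"):
                sites.append((side, space, line))
    return sites
```

A per-site B-line count recorded against this list could then be summed to give the total that was compared with the Warrick score.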

Conclusions

On the basis of the present literature, US could have a role in the assessment of ILD. Moreover, it is a non-ionizing, inexpensive, widely available bedside procedure with low variability, and it is well accepted by patients. It can be performed with different machines and probes and with simplified, less time-consuming methods.

Although the results are promising, some important questions remain unanswered. Although a few data on healthy subjects are available [40, 41], no large population studies have confirmed that the proposed cutoffs are applicable in healthy subjects.

Furthermore, even though US results are concordant with the HRCT score, US data have not been supported by lung biopsy results (which still represent the gold standard for fibrosis), either in patients with SSc or, more generally, in patients with ILD. Finally, the sensitivity to change of US (the variation of the parameter over time) should be demonstrated in future studies before US can be used for follow-up in daily practice and in clinical trials. In conclusion, lung US is currently a promising imaging technique for the assessment of ILD, but larger, prospective studies (involving healthy subjects and patients affected by other diseases) that include a comparison with the gold standard (biopsy) are needed to define its role in clinical practice.

Key messages

• Lung US is a simple, radiation free, available, and well-reproducible imaging technique.

• In SSc patients, it correlates well with HRCT results.

• It does not need a sophisticated machine and might be scored with less “time-consuming” methods.