1 Introduction

Lung ultrasonography (LUS) has been increasingly used as a surrogate for chest X-ray in the intensive care unit (ICU) for more than 15 years. Lung ultrasound is the second most frequently performed type of ultrasound exam in critically ill patients [1]. The diagnostic accuracy of LUS is superior to that of conventional chest X-ray in the ICU or emergency department, especially for pleural effusion, atelectasis and pneumothorax [2,3,4]. Finally, the daily use of LUS has been shown to be associated with a reduction in the number of computed tomography (CT) scans and conventional chest X-rays without increasing morbidity or mortality [5,6,7].

Lichtenstein and coworkers emphasized that a very simple, low-cost device with a 5 MHz microconvex probe is sufficient for LUS in ICU patients. The feasibility of LUS with such a device has been largely demonstrated [3].

However, the rapid development of new ultrasound techniques such as 3-D imaging raises the question of their potential usefulness for LUS. The 3-D LUS semiology has never been described. Therefore, the primary objective of the present study was to determine the overall diagnostic concordance between 2-D and 3-D LUS in ventilated critically ill patients. The secondary objective was to assess the inter-observer reliability. Finally, the concordance between 2-D and 3-D exams was assessed for four specific lung diagnoses: normal lung, interstitial syndrome, consolidation and pleural effusion.

2 Materials and methods

2.1 Study design

The present trial was a prospective, single-center, observational pilot study. As it was strictly observational, the Institutional Review Board (IRB) approved the study protocol (no. 15/07.06, IRB of Centre Hospitalier Universitaire de Nîmes) before the enrollment of the first patient. In accordance with French law, the IRB waived patient informed consent. However, an information letter was given to each patient when conscious, or otherwise to the relatives, who could refuse participation. Anonymous data were declared to the French data protection authority (CNIL, Commission Nationale de l’Informatique et des Libertés; declaration number 1878494, 30 July 2015).

Patients were enrolled in the study from 20 July to 10 August 2015.

2.2 Population description

Inclusion criteria were: age ≥18 years, admission to the ICU, invasive ventilation and need for LUS examination.

Non-inclusion criteria were: age <18 years, absence of social insurance, refusal by a non-sedated patient, chest drainage and extensive dressings.

Moreover, patients in whom LUS examination was not available were excluded.

2.3 Lung ultrasound protocol

Imaging was performed with an EPIQ 7 system and an X5-1 probe (5–1 MHz matrix transducer, Philips®, Eindhoven, the Netherlands), allowing real-time 2-D and 3-D examinations.

First, a trained non-expert operator (defined as a daily LUS practice ≥5 years) performed a 2-D LUS examination of eight lung areas (four on the right side and four on the left side) in each patient, as described in the international guidelines published in 2012 [8] (Fig. 1). Immediately afterwards, the same operator performed a 3-D LUS examination with the same ultrasound machine. A simple switch from 2-D to 3-D mode on the machine allowed both examinations to be performed consecutively.

Fig. 1

Areas of interest for LUS according to literature [8]. A side denotes the right side of patient; B is for the left side. PSL parasternal line, AAL anterior axillary line, PAL posterior axillary line

For 2-D and 3-D analysis, one cine loop over five cardiac cycles was stored on a computer hard drive.

According to literature, four main LUS diagnoses were possible [8]:

  • Normal lung: defined as sliding pleural line with A-lines without B-lines patterns.

  • Interstitial syndrome: defined as presence of B-lines, discrete laser-like vertical hyperechoic reverberation artifacts arising from pleural line to the bottom of the ultrasonography screen without fading, moving synchronously with lung sliding.

  • Pleural effusion: defined as a generally hypoechoic space between parietal and visceral pleura with or without consolidation.

  • Consolidation without pleural effusion: defined as a subpleural echo-poor region or a region with tissue-like echotexture.

Because the study period was short (3 weeks), there were not enough patients with pneumothorax to analyze this diagnosis.

Second, two experts (1 and 2) reviewed these loops. An expert was defined as an intensivist holding a national diploma in critical care ultrasonography, with a daily practice of LUS for more than 5 years and regular teaching of ultrasonography at a medical university.

All exams were anonymized and stored without any visible link between the 2-D and 3-D exams. For each exam, the experts had to choose one of the four possible diagnoses and to rate the quality of the exam on a numerical scale from 0 (poorest quality) to 10 (best quality).

The experts blindly and randomly analyzed the exams on a computer (Fig. 2).

Fig. 2

Comparative exams 2-D and 3-D for main diagnoses. 2-D two-dimensional, 3-D three-dimensional

2.4 Studied parameters

For each patient, sex, age, main diagnosis and Simplified Acute Physiology Score II (SAPS II) [9] were recorded.

2.5 Statistical analysis

Assumptions for the sample size were based on daily practice in our ICU. The number of examinations needed was calculated in order to estimate a kappa (κ) coefficient over 0.6 [10, 11] with a one-sided alpha-risk of 5% and a power of 90%, expecting the following proportions: normal lung (40%), interstitial syndrome (30%), lung consolidation (10%) and pleural effusion (20%). Therefore, 114 LUS examinations were required.

For the primary endpoint, we tested the concordance between 2-D and 3-D LUS diagnoses for expert 1 using the unweighted Cohen κ [11].

For the secondary objectives, the inter-rater reliability for 3-D LUS was tested with Fleiss’ κ [12], the unweighted Cohen κ was calculated for each main diagnosis, and the quality scale values were compared with the Wilcoxon signed-rank test for paired samples. The value of κ ranges from −1 (total disagreement) to 1 (perfect agreement); a value of 0 indicates no agreement beyond chance [11]. We used the Landis and Koch table to characterize κ values: <0 indicating no agreement, from 0 to 0.20 a slight agreement, from 0.21 to 0.40 a fair agreement, from 0.41 to 0.60 a moderate agreement, from 0.61 to 0.80 a substantial agreement and from 0.81 to 1.0 an almost perfect agreement [13].
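As an illustration of the concordance measure described above, the following minimal sketch computes an unweighted Cohen κ from two paired lists of diagnoses and maps it to the Landis and Koch categories. The function names and the example labels are hypothetical and are not taken from the study data; the software actually used by the authors is not stated.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two paired lists of categorical labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of pairs with identical labels.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement computed from the marginal label frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch agreement category."""
    if kappa < 0:
        return "no agreement"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical diagnoses for ten lung areas (illustration only).
exam_2d = ["normal"] * 4 + ["interstitial"] * 3 + ["effusion"] * 2 + ["consolidation"]
exam_3d = ["normal"] * 3 + ["interstitial"] * 4 + ["effusion"] * 2 + ["consolidation"]

kappa = cohen_kappa(exam_2d, exam_3d)
print(round(kappa, 2), landis_koch(kappa))
```

In this toy example the two exams disagree on a single area, so the observed agreement is 0.9 and the chance agreement is 0.29.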

Quantitative data are expressed as mean ± standard deviation (SD). Qualitative variables are expressed as absolute number and percentage (%).

Statistical significance was set at 5% for every test used.

3 Results

The study enrolled 16 patients. Demographic data and the main reasons for ICU admission are shown in Table 1. One hundred and twenty-eight cine loops of LUS examinations were performed in 2-D and 126 cine loops were performed in 3-D. Two cine loops were excluded because of poor quality in 2-D exams, leaving 126 cine loops in 2-D and 126 in 3-D for analysis (Fig. 2).

Table 1 Demographic data

3.1 Primary endpoint

The ultrasound semiology described for 2-D LUS was the same in 3-D LUS (Fig. 3). The observed concordance between 2-D and 3-D exams was 83.3% (105/126) for the operator, 80.2% (101/126) for expert 1 and 77.6% (99/126) for expert 2.

Fig. 3

Flow chart. ICU intensive care unit, LUS lung ultrasonography, 2-D two-dimensional, 3-D three-dimensional

The Cohen’s κ value was 0.69 [95% CI 0.58–0.80] for expert 1, indicating substantial agreement between 2-D and 3-D diagnoses.
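The source does not state how the confidence interval was obtained. As a rough plausibility check only, the sketch below applies the simple large-sample standard error for κ, SE ≈ sqrt(p_o(1 − p_o)/n) / (1 − p_e), to the figures reported for expert 1 (κ = 0.69, observed concordance 101/126). The function name and the choice of this particular approximation are assumptions, not the authors' method.

```python
import math


def kappa_ci(kappa, p_o, n, z=1.96):
    """Approximate confidence interval for Cohen's kappa (simple large-sample SE)."""
    # Chance agreement implied by kappa = (p_o - p_e) / (1 - p_e).
    p_e = (p_o - kappa) / (1 - kappa)
    # Simple approximation of the standard error (assumed here, not from the source).
    se = math.sqrt(p_o * (1 - p_o) / n) / (1 - p_e)
    return kappa - z * se, kappa + z * se


# Values reported for expert 1: kappa = 0.69, observed concordance 101/126.
low, high = kappa_ci(kappa=0.69, p_o=101 / 126, n=126)
print(f"{low:.2f} to {high:.2f}")
```

Under these assumptions the interval is approximately 0.58 to 0.80, which is in line with the reported one; an exact variance formula or a bootstrap could give slightly different bounds.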

3.2 Secondary endpoints

The inter-rater reliability was almost perfect according to the Fleiss’ κ value, which was 0.94 [95% CI 0.87–1.0] for 3-D exams (Table 2).
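For readers unfamiliar with Fleiss’ κ, the following minimal sketch shows how it can be computed from a table of rating counts. It assumes three raters (for example the operator and the two experts) and four diagnostic categories; the count table is hypothetical and does not reproduce the study data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = number of raters assigning exam i to category j."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each exam must be rated by the same number (>= 2) of raters")
    n_categories = len(counts[0])
    # Mean of the per-exam agreement proportions.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_subjects
    # Chance agreement from the overall category proportions.
    total = n_subjects * n_raters
    p_j = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical counts for five exams; columns are normal lung, interstitial
# syndrome, consolidation and pleural effusion (illustration only).
ratings = [
    [3, 0, 0, 0],
    [0, 3, 0, 0],
    [0, 0, 3, 0],
    [0, 0, 0, 3],
    [2, 1, 0, 0],  # one rater disagrees with the other two
]
print(round(fleiss_kappa(ratings), 2))
```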

Table 2 Concordance tests for 2-D and 3-D imaging by using the Landis and Koch scale [12]

By main diagnosis, for expert 1, the Cohen’s κ was almost perfect for pleural effusion (κ = 0.93 [95% CI 0.76–1.0]), substantial for normal lung (κ = 0.68 [95% CI 0.51–0.86]) and interstitial syndrome (κ = 0.62 [95% CI 0.45–0.80]), and moderate for consolidation (κ = 0.47 [95% CI 0.30–0.64]) (Table 3).

Table 3 Inter-rater reliability for 2-D and 3-D exams [11, 12]

The quality scores were 6.8 ± 1.6 in 2-D versus 6.7 ± 1.4 in 3-D for the operator (p = 0.50), 7.3 ± 1.7 in 2-D versus 7.2 ± 1.9 in 3-D for expert 1 (p = 0.76) and 7.5 ± 1.4 in 2-D versus 7.1 ± 1.7 in 3-D for expert 2 (p = 0.09) (Fig. 1).
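The paired comparison of quality scores can be illustrated with a minimal sketch of the Wilcoxon signed-rank test. It uses the normal approximation without tie or continuity correction, which is an assumption made for brevity; a statistical package would normally apply these corrections or an exact method. The function name and the example scores are hypothetical.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Return (W, z, two-sided p) for paired samples, normal approximation."""
    # Zero differences carry no sign information and are dropped.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    # Normal approximation (no tie or continuity correction in this sketch).
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p


# Hypothetical quality scores (0 to 10) for eight paired exams.
quality_2d = [7, 8, 6, 9, 5, 7, 8, 6]
quality_3d = [6, 8, 7, 7, 5, 4, 7, 5]
w, z, p = wilcoxon_signed_rank(quality_2d, quality_3d)
print(w, round(p, 3))
```

With only a handful of pairs, as in this toy example, the normal approximation is crude and an exact test would be preferable.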

4 Discussion

The present study showed a substantial agreement (κ = 0.69) between 2-D and 3-D real-time LUS. The inter-operator reliability was excellent (κ = 0.95). Nevertheless, the concordance was only moderate for lung consolidation (κ = 0.47), whereas it was substantial for normal lung (κ = 0.68) and interstitial syndrome (κ = 0.62), and almost perfect for pleural effusion (κ = 0.92).

Modern ultrasound machines allow real-time 3-D ultrasound imaging (4-D). The usefulness of the technique has been described in fetal echocardiography [14]. It has also been used in adult echocardiography for cardiac chamber quantification or valvular disease [15, 16]. To the best of our knowledge, this is the first study showing the feasibility of LUS by using a 3-D ultrasound probe.

Although daily practice of LUS is recommended in critically ill patients, it remains underused in the ICU, as it represents only 20% of ultrasound exams [1]. This could reflect that the technique is not easy to implement and to teach, especially because basic images rely more on artifacts than on anatomical structures. This is particularly true for lung sliding and interstitial syndrome (B-line artifacts). It could therefore be hypothesized that 3-D ultrasound might improve image quality and diagnostic accuracy.

In clinical practice, the present study shows that the overall quality of imaging with 2-D LUS differs from that of 3-D LUS, without establishing which is better. This was particularly true for lung consolidation. The 2-D semiology for consolidation is probably not strictly concordant with the 3-D semiology. As CT imaging was not performed in the present study, this raises the need to describe a specific semiology for 3-D LUS. Moreover, considering the price of ultrasonography machines with 3-D probes (compared with conventional 2-D devices), it is at present hard to see how this technique could be developed outside the context of clinical research aimed at describing 3-D LUS imaging and reporting its clinical implications.

The second difficulty for the use of 3-D LUS in clinical practice is the acquisition of volume data. When positioning the 3-D LUS probe on the thorax, the physician obtains a cone-shaped image. Using the machine's trackball, this cone can be oriented in any direction in space. It is hard to determine the optimal position of this cone in order to obtain the best diagnostic performance.

Some limitations should be acknowledged.

First, the number of patients is low. This is due to the fact that the ultrasonography machine was lent to our unit for only 3 weeks. We increased the statistical power of this study by performing a large number of exams of each region of interest. This led to the analysis of 252 LUS exams. Second, the post hoc analysis of video loops probably reduced the diagnostic accuracy because the experts could not control the transducer position on the thorax. The numerical quality scale was 7.3 ± 1.8 in 2-D and 7.2 ± 1.9 in 3-D for expert 1. These two values are moderate, suggesting that post hoc computer analysis is more difficult than performing a real-time ultrasound examination. Third, we chose to present 3-D loops in the same orientation as 2-D loops. This implies the loss of a large part of the interest of 3-D imaging. Fourth, we used a cardiac phased array probe whereas LUS has been extensively described using a 5 MHz microconvex probe. However, the international evidence-based recommendations for point-of-care lung ultrasound proposed that the choice of the transducer could be based on physician preference [17]. Finally, we did not compare 2-D and 3-D LUS exams to a gold standard such as CT.

5 Conclusion

This first pilot study shows a substantial concordance between 2-D and 3-D LUS diagnoses in ventilated critically ill patients. The inter-observer reliability was good. Nevertheless, the diagnostic concordance for lung consolidation was only moderate. As no comparison with CT was performed, larger studies are needed to clarify the role of 3-D LUS in clinical practice.