Introduction

Two-dimensional ultrasound (2DUS) is the primary investigation for gallbladder disease. It is accurate, with reported sensitivity of 85% to 99% for the most common gallbladder diseases [1–4], but 2DUS is operator-dependent [5, 6] and leaves a minimal permanent record of the examination.

Three-dimensional ultrasound (3DUS) is an emerging technology that manufacturers claim addresses these issues and increases department efficiency by cutting examination time [7, 8]. Several studies have shown good equivalence between 2DUS and 3DUS in obstetric image quality, with time efficiency improvements of up to 57% [6, 9–11].

Similar improvements in efficiency have been demonstrated in abdominal imaging [12]; however, there has been little research to investigate whether 3DUS has the diagnostic image quality to maintain the high sensitivity rates currently seen in abdominal ultrasound.

Materials and methods

Approval for the study was obtained from the local Research Ethics Committee, and consent was obtained from each participant after the nature of the study was fully explained.

An equivalence trial was felt to be the best methodology to address the research question, given that there is no “true” gold standard with which to compare 2DUS and 3DUS. The sample size was therefore calculated using a power calculation based on equivalence studies [13], with a two-tailed confidence level of 95% and a power of 80%.

Eighty consecutive patients referred for abdominal ultrasound were included in the study over a period of 12 weeks between May and August 2008. The population recruited for this study comprised 37 (46%) female and 43 (54%) male patients with an average age of 53.5 years (range, 20–88 years) and an average body mass index (BMI) of 26.2 (range, 17.6–39.4). Twenty-six patients (33%) were referred with a history of various gallbladder/biliary conditions, and the remaining 54 patients (68%) were referred according to other non-gallbladder abdominal referral criteria.

Each patient underwent imaging of the gallbladder and was included in the study regardless of the presence or absence of disease. A sonographer performed the routine imaging using 2DUS with a C5–1 abdominal probe and an iU22 ultrasound machine (Philips Medical Systems, Bothell, WA). The gallbladder was assessed in transverse and longitudinal sections with all available image optimisation, and representative images were stored digitally.

At the end of the routine examination two anonymised 3DUS volumes were acquired by the same sonographer using a V6–2 mechanically steered probe (Philips Medical Systems), one sweep with the longitudinal section in the A-plane with the patient lying supine and the other with the transverse section in the A-plane with the patient lying decubitus (Fig. 1). Once again, all image optimisation tools were used to optimise the B-mode image before the volume sweep.

Fig. 1 Cartesian axes and planes used in geometric description

The sonographer performing the examination reported the 2DUS images, and a radiologist blinded to the 2DUS results reported the 3DUS data independently using QLab 3DUS manipulation software (Philips Healthcare, WA, USA). The diagnoses from both techniques were compared for equivalence.

Results

In three patients (4%) the gallbladder could not be examined.

In 91% of patients in whom a volume of the gallbladder was acquired (70/77), it was estimated that the entire gallbladder was visualised on the 3DUS volume sets, and at least 90% of the gallbladder was visualised in 97% of the examinations (75/77).

Baseline 2D ultrasound made 82 diagnoses in 77 patients: 45 (58%) normal gallbladders were reported, 18 (23%) gallbladders contained calculi, 10 (13%) polyps were reported, 2 (3%) cases of acute cholecystitis, 1 (1%) case of chronic cholecystitis and 6 (8%) diagnoses that fell into the “other” category [2 (3%) wall-thickening but not cholecystitis, 3 (4%) biliary sludge and 1 (1%) contracted gallbladder (Fig. 2)]. No gallbladder carcinomas were identified in the course of this study.

Fig. 2 Graph demonstrating the breakdown of diagnoses by 2DUS and 3DUS

Overall agreement of diagnosis with 2DUS and 3DUS occurred in 89% of the cases (73/82). The negative predictive value was 91% (41/45), positive predictive value was 89% (33/37), and the specificity was 86% (32/37).
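As a worked check, the percentages quoted above can be recomputed from the fractions given in the text. The short Python sketch below is illustrative only; the dictionary and function names are assumptions introduced here and are not part of the study's analysis.

```python
# Fractions exactly as reported in the text: (numerator, denominator).
# The variable and function names are hypothetical, chosen for illustration.
reported = {
    "overall agreement": (73, 82),
    "negative predictive value": (41, 45),
    "positive predictive value": (33, 37),
    "specificity": (32, 37),
}


def as_percent(numerator, denominator):
    """Return numerator/denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


# Recompute each measure from its reported fraction.
percentages = {name: as_percent(n, d) for name, (n, d) in reported.items()}

for name, value in percentages.items():
    n, d = reported[name]
    print(f"{name}: {n}/{d} = {value:.1f}%")
```

Rounded to the nearest whole number, each recomputed value matches the percentage quoted in the text.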

Diagnoses from the two tests demonstrated substantial agreement [14] (Cohen’s kappa = 0.67; two-tailed p = 0.05). A chi-squared test found no significant difference between the two techniques (p = 0.95).

Of the nine discrepancies, four were polyps that were seen on 2DUS but not diagnosed on 3DUS, one was a calculus seen on 2DUS but diagnosed as a polyp on 3DUS, and four cases were felt to be normal on 2DUS but were diagnosed as polyps (two cases), calculi (one case) or cholecystitis (one case) using 3DUS.

Joint re-evaluation of the data showed that in four of the nine cases without agreement the discrepancies were due to technical factors, such as boundary clarity not being sufficient to distinguish polyps measuring <4 mm. The other five cases were felt to be reporter or operator discrepancies (Figs. 3 and 4). The theoretical agreement based on technical factors and allowing for operator error is therefore 95% (78/82) with a negative predictive value of 98% (42/43), a positive predictive value of 95% (37/39) and a specificity of 92% (36/39).

Fig. 3 Table showing the joint evaluation of the nine cases where there were discrepancies between the diagnoses

Fig. 4 Graph demonstrating the breakdown of diagnoses by 2DUS and 3DUS after joint re-evaluation

Cohen’s kappa for the adjusted data demonstrates almost perfect agreement between the 3DUS and 2DUS groups overall (kappa = 0.81; two-tailed p = 0.05). The chi-squared test shows even less statistical divergence (p > 0.995).
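For readers unfamiliar with the statistic, the following is a minimal sketch of how Cohen's kappa is computed from a cross-tabulation of two sets of ratings, together with the descriptive bands of Landis and Koch, under which 0.67 is labelled substantial and 0.81 almost perfect. The function names and the 2 × 2 toy table are hypothetical and are not the study data (the full cross-tabulation is not given in the text), and it is an assumption that these bands are the ones used in reference [14].

```python
def cohens_kappa(table):
    """Cohen's kappa for a square table of counts.

    table[i][j] is the number of cases placed in category i by the
    first method and in category j by the second method.
    Undefined (division by zero) if chance agreement equals 1.
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    # Observed agreement: proportion of cases on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / total
    # Chance agreement: product of the marginal proportions, summed.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Descriptive label for kappa using the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Toy 2 x 2 table, NOT the study data: rows = method A, columns = method B.
toy_table = [[20, 5],
             [10, 15]]
toy_kappa = cohens_kappa(toy_table)

# Labels for the two kappa values reported in the text.
label_unadjusted = landis_koch_label(0.67)
label_adjusted = landis_koch_label(0.81)
print(label_unadjusted, label_adjusted)
```

In the toy table the observed agreement is 0.7 and the chance agreement is 0.5, giving a kappa of about 0.4; the two reported values map to the same labels used in the text.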

Discussion

This study has shown that diagnoses from remotely reported 3DUS volumes of gallbladders correlate substantially with 2DUS diagnosis and that a diagnostic volume of the gallbladder is achievable in most patients.

The 3DUS dataset was unachievable in only three patients. This was due to absence or complete contraction of the gallbladder in two cases and the patient being unable to tolerate the probe due to severe pain in one case. In all three cases 2DUS was also unable to satisfactorily image the gallbladder.

Interestingly, all the false negatives were in the polyp group and the polyps measured 4.1 mm or smaller. Any polyps that were over 4.1 mm in diameter were reported. This therefore appears to be the cutoff diameter for 3DUS visualisation in this study (Fig. 5).

Fig. 5 Graph demonstrating polyp diameters and their visualisation on 3DUS

The reason for the difference in polyp detection between 2DUS and 3DUS is difficult to assess, but it was felt to be at least partially due to the clarity of the fluid-edge boundary of the gallbladder. Assuming the boundary is not parallel to the incident beam, a cystic-solid boundary should be well defined; however, with 3DUS there appears to be a marked difference in definition compared with 2DUS. Whether this is inherent in the technology or software or is due to technique, such as movement artefact or control manipulation, is an area that requires further investigation.

Patient size was not a contributing factor in the difficulty 3DUS has in picking up small polyps. The average BMI of this group was 25.8 (range 19.8–31.8) compared with 26.2 for the whole cohort.

The average BMI of the entire discrepancy group is 26.0 (range 17.6–32.9), which, again, is not significantly different from that of the overall study population (26.2). From this it can be assumed that BMI is not a factor in the visualisation of disease with 3DUS compared with 2DUS.

There is some doubt as to the relevance to patient management of a polyp measuring <4 mm. It is standard practice in many hospitals to only follow up or treat polyps that have a diameter of greater than 10 mm with cholecystectomy. Gallbladder polyps have a baseline malignancy rate of 3–8% [15], but with a diameter of 10 mm or larger this rises to a significantly greater (37–88%) chance of being malignant [16, 17]. However, current opinion suggests that patients with polyps over 5 mm in diameter are at a greater risk of malignancy if combined with patient age of over 50, ethnicity or rapid growth of the polyp, and follow-up and/or cholecystectomy should be considered for this group [18, 19]. While this is still above the 4 mm visualisation cutoff for this study, it does not leave much room for leeway.

Two of the false-positive cases could be attributed to grating or side lobe artefacts mimicking polyps. On re-evaluation, what appeared polypoid on 3DUS on the C-plane was in fact linear on the A- and B-planes (Fig. 6), suggesting artefact. This demonstrates the importance of visualising disease in at least two planes, which can be simply done using a (split-pane) multi-plane reconstruction (MPR) view. Learning to recognise artefacts in 3DUS reconstructions may represent a new skill set to be acquired by the remote reporter.

Fig. 6 Polypoid echoes on the 3DUS C-plane (pane 3) appearing linear on the A- and B-planes (panes 1 and 2)

The viewing platform was also felt to be of great importance. The same volume sets appeared to show differing properties on different viewing platforms. In our opinion there was a distinct variation in clarity when using two different offline 3DUS manipulation software tools [ViewForum™, which was developed as a multimodality workstation, and QLAB™, a dedicated ultrasound workstation (both Philips Healthcare, WA) (Fig. 7)], which may be due to the monitors or graphics capability of that particular workstation. This observation was not quantified in any way, but all 3DUS manipulation was done using QLAB for this reason. Further assessment of this perceived phenomenon is warranted.

Fig. 7 Slight difference in edge clarity between two different offline 3DUS manipulation tools on the same MPR image, using the same workstation (left image is ViewForum™; right image is QLAB™). Both software packages are manufactured by Philips Healthcare (Philips Medical Systems, Bothell, WA)

Despite conventional 2DUS being considered the gold standard in this study, in one case the 3DUS diagnosis of acute cholecystitis, which was not diagnosed on 2DUS, was made with sufficient confidence to alter the formal report for the patient. It was felt that the laminar thickening of the gallbladder wall and the peri-cholecystic oedema were more easily seen on the 3DUS volume (Figs. 8 and 9). This highlights the difficulty in equivalence studies without an absolute gold standard.

Fig. 8 2DUS reported as demonstrating biliary sludge but no other abnormality

Fig. 9 3DUS showing biliary sludge but also peri-cholecystic oedema with laminar wall-thickening consistent with acute cholecystitis

The fact that the potential agreement is good is important. As the 3DUS technique comes into wider use, both the operator and the reporter will become more proficient at acquiring, manipulating and interpreting the data, and these adjusted data suggest that the technology is not the limiting factor for improvement.

This study has some limitations. As often seen when comparing imaging techniques, there was no true gold standard with which to compare the diagnoses. Therefore, the sensitivity and specificity of 3DUS were in relation to the ability to predict the 2DUS result rather than the true diagnosis.

There were several positive groups, such as acute and chronic cholecystitis, that would have benefited from larger numbers in the group. This would have allowed these findings to be generalised with greater confidence.

The relative experience of the operator and remote reporter is a factor. The operator acquiring and reporting on the 2DUS has 4 years’ post-qualification experience compared with 31 years for the 3DUS reviewer. However, there was little difference in experience with the regular use of 3DUS in detailed clinical assessment of the abdomen and offline manipulation, both approximately 3 years.

While there is a difference in operator ultrasound experience, the fact that a sonographer and a radiologist are being compared should not significantly affect the findings of this study, as it has been shown that there is no statistically significant difference between sonographers and radiologists with regard to routine abdominal ultrasound [20].

Conclusion

3DUS could replace 2DUS in the detection of most significant gallbladder problems and maintain the high sensitivity and specificity seen with 2DUS.

However, there are still some reservations considering the difficulty in detecting polyps measuring <4 mm with 3DUS, although these are not regarded as clinically significant. Given the rate at which this technology is improving, with increased speed of acquisition, improved isotropic resolution and boundary clarity, the next generation of volume transducers should address these issues.

Therefore, the authors feel that 3DUS should be used as a complementary tool to 2DUS until these issues are resolved.

As non-obstetric applications of 3DUS increase, this study demonstrates the need to validate each application individually.

Quality assurance assessment of the 3DUS manipulation software requires further study.