Introduction

The assessment of cardiac morphology and function by ultrasound imaging took a significant step forward with the introduction of real-time 3D echocardiography (RT3DE), which allows a truly 3D visualization of the heart and avoids some of the problems intrinsically associated with 2D imaging (foreshortening, the need for geometrical assumptions when computing volumes, out-of-plane motion artifacts, etc.). Several studies comparing RT3DE findings with the current gold standard, magnetic resonance imaging (MRI), have shown that left- and right-ventricular volumes (end-diastolic volume, end-systolic volume and stroke volume) can be accurately measured with volumetric ultrasound [1–6]. Given the advantages of ultrasound imaging in a clinical environment (such as its low cost and its bed-side applicability), RT3DE may well become the standard for the assessment of left ventricular volumes in the future.

However, accurate volume measurements require precise delineation of the endocardial border. Manual delineation of these boundaries in three-dimensional data is a cumbersome and time-consuming task, making the introduction of this approach in clinical routine impractical. To this end, several software packages have been introduced to aid the clinician in this contouring process by providing some form of automation. TomTec Imaging Systems (Unterschleissheim, Germany) presented commercial tools for 3D volume quantification, using a semi-automated approach [7]. The initial software relied on a multi-planar interpolation approach (MIM), where multiple equidistant long-axis planes were sliced through the 3D data. After manual initialization and semi-automated border detection plus manual correction in each 2D slice, a spatio-temporal smoothing spline was used to reconstruct the LV model. A more recent software tool from the same company expanded the scope of the LV segmentation to 4D detection (full-volume reconstruction approach, FVR) [8]. In this software, the user is asked to manually trace the endocardial border in 3 long-axis planes at end-diastole and end-systole. A spatio-temporal deforming balloon is then fitted to these initial contours and deformed until it best fits the cardiac walls in each frame, with manual correction afterwards. In contrast to the purely offline approach offered by TomTec, Philips Healthcare (Best, The Netherlands) introduced the possibility of both offline and online analysis with their 3DQ Advanced software suite [9]. Using this tool, 5 anatomical landmarks are used to initialize a deformable shell model [10], which is afterwards deformed towards the LV boundaries, with the option of manual correction. More recently, General Electric (GE Vingmed, Horten, Norway) also introduced a software package, 4D AutoLVQ, which allows automated segmentation and volume quantification of the left ventricle [11]. After a total of nine landmarks are defined at end-diastole (2 points at the mitral annulus plus 1 point at the apex in each of the three long-axis apical views), non-temporal 3D surface detection is immediately performed to extract the endocardial border and to compute the EDV. Additional landmarks can be added to improve the surface detection, and the process is then repeated for end-systole. A more extensive overview of the available methods for automated LV border detection can be found in the recent review by Leung and Bosch [12].

Next to the accuracy and precision of a given segmentation methodology, the time required to obtain the results is critical for introducing the approach in clinical routine. Jacobs et al. [9] have shown that online LV volumetric analysis can provide accurate results in less than 2 min per volume. However, in their study, manual adjustments were required in 42 % of the cases analyzed with the online quantification tool, increasing the analysis time from 2 min to up to 5 min per volume. Jenkins et al. [13] have shown the superiority of offline approaches in terms of accuracy and precision, at the cost of nearly tripling the analysis time (630 ± 30 s vs. 240 ± 20 s). Hansegaard et al. [11] and Muraru et al. [14] have shown that a more advanced, automated software package (AutoLVQ 4D, GE Vingmed, Horten, Norway) can reduce the average analysis time when compared with standard semi-automated strategies, while keeping comparable accuracy. However, the results of Muraru et al. show that noticeably better agreement can be achieved by manually adjusting the results of an automated method, at the cost of more than doubling the total analysis time (from 48 ± 24 s to 112 ± 30 s). Thus, the manual input required for accurate LV volumetric analysis has a strong impact on the human time cost associated with the extraction of such indices from RT3DE data. Furthermore, state-of-the-art automated methods can reduce the analysis time when compared to semi-automated approaches but still depend heavily on user correction to achieve accurate results. Since manual refinement depends on the expertise of the operator, these corrections also intrinsically introduce observer-dependent variability [15].

We have recently presented a fast automated segmentation framework (B-spline Explicit Active Surfaces, BEAS) that allows segmenting challenging inhomogeneous data in real-time [16]. This algorithm only requires manual initialization using 6 points per 3D volume, with the subsequent segmentation being automated. The focus of the present work was the assessment of the accuracy and precision of BEAS against manually extracted references from RT3DE. To remove the influence of manual editing, no additional refinement was allowed, so that only the quality of the results provided by the algorithm upon manual initialization was evaluated. Moreover, in this way, a time-consuming part of the volume assessment (typically required in the current state-of-the-art commercial software packages) was avoided, which has direct consequences for the feasibility of these measurements in routine clinical practice.

Methods

Image acquisition

Twenty-four real-time 3D echocardiography exams were acquired with a Siemens Acuson SC2000 rev. 1.5 (Siemens Ultrasound, Mountain View, CA) equipped with a 4Z1c matrix transducer. Volume sequences were acquired from an apical window, and the sonographer aimed at the inclusion of the entire LV within the pyramidal field of view. Volume rates ranged from 20 to 40 volumes per second. The ultrasonic transducer operated at a central frequency of 2.8 MHz, with the acquisition sector width varying between 60° and 90°. The image data had isotropic voxel dimensions, which varied between 0.7 and 1.3 mm (1.038 ± 0.17 mm).

The acquired data correspond to RT3DE scans randomly taken from an existing database, where the only inclusion criterion was that the entire LV endocardial border lay inside the image pyramid, even if not properly visible. All patients and healthy volunteers gave their written informed consent.

The image quality of each exam in the analyzed dataset was classified by three experts as poor, fair or good according to the percentage of the myocardial wall clearly visible in the image (<60 %, 60–75 %, >75 %), the contrast between the blood pool and the tissue, and the presence of severe image artifacts.

LV volumetric analysis

Manual analysis

The end-diastolic (ED) and end-systolic (ES) frames were defined based on the ECG and on visual identification of the frames with the largest and smallest LV volume. Next, both ED and ES volumes were segmented by the operator using a semi-automated method (eSie LVA pre-release software, Siemens Ultrasound, Mountain View, California, USA). For this purpose, the apex and the base of the septal and lateral walls were marked on the apical 4-chamber view. The software then detected and displayed a mesh on the endocardial left ventricular surface. The user then visually assessed the resulting contour in multiple views and planes for accuracy and adjusted it manually as required.

Automated analysis

The automated border detection algorithm has been previously described; its key feature is the ability to capture the boundaries of challenging inhomogeneous objects in real-time [16]. Furthermore, it does not use any kind of statistical shape prior or geometrical assumption. The underlying principle of the algorithm is to use local contrast as an attractor of the contour. Thus, after initialization of the LV boundary position, the contour is automatically deformed towards positions of high contrast, which indicate the blood-tissue interface. This is done in a framework that guarantees the smoothness and the spatial coherency of the 3D surface, yielding a result that maximizes the total contrast around the entire contour without introducing any cusps or spatial irregularities in the segmented LV shape. Furthermore, the evolution of the contour is intrinsically robust to noise. Further information regarding the technical details of the BEAS algorithm can be found in Barbosa et al. [16].

The analysis dataflow is shown in Fig. 1. After loading the data and clicking once at the center of the LV in an automatically defined short-axis (SAX) image, one long-axis (LAX) plane is sliced from the 3D data. For this plane, the user can scroll between the end-diastolic and end-systolic frames, in order to better visualize the cardiac structures by including their temporal behavior throughout the cycle. The user is then asked to place 3 points indicating the mitral annulus and the apex, at end-diastole and end-systole. These points are used to re-slice a second LAX image orthogonal to the first LAX plane, and the manual landmark identification process is repeated. The points indicated by the user are simply used to initialize the algorithm, through an ellipsoid fitting process. From this initial ellipsoid the segmentation algorithm iterates in order to find the endocardial border, without further geometrical assumptions on the segmentation result. The left ventricular volumetric parameters are then computed once both segmentation results are available.

Fig. 1 Dataflow for automated segmentation algorithm
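
The volumetric parameters computed at the end of this dataflow can be illustrated with the standard definitions of stroke volume and ejection fraction. The following is a minimal sketch, not the implementation used in the study; the function name `lv_indices` and the input values are hypothetical.

```python
def lv_indices(edv_ml, esv_ml):
    """Return (SV in ml, EF in %) from end-diastolic and end-systolic volumes.

    Standard definitions: SV = EDV - ESV and EF = 100 * SV / EDV.
    The function name and interface are illustrative assumptions.
    """
    if edv_ml <= 0:
        raise ValueError("EDV must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("ESV must lie between 0 and EDV")
    sv_ml = edv_ml - esv_ml
    ef_percent = 100.0 * sv_ml / edv_ml
    return sv_ml, ef_percent


# Hypothetical volumes, not taken from the study dataset
sv, ef = lv_indices(120.0, 48.0)
```

Because both indices are derived from EDV and ESV, any error in either segmented volume propagates into SV and EF.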

Statistical analysis

All data are expressed as mean ± standard deviation. For accuracy and agreement assessment, linear regression and Bland–Altman analysis [17] were performed between the volumetric indices estimated with BEAS (end-diastolic volume, end-systolic volume, stroke volume and ejection fraction; EDV, ESV, SV and EF respectively) and the reference ones (average of the volumetric indices extracted manually by the 3 experts). Note that in the Bland–Altman analysis the difference is calculated as the reference value minus the automated software value.
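
The Bland–Altman quantities reported in this work (bias and limits of agreement, μ ± 1.96σ) can be sketched as follows. This is an illustrative example only: the use of the sample standard deviation is an assumption, the function name `bland_altman` is hypothetical, and the input values are made up rather than taken from the study.

```python
from statistics import mean, stdev


def bland_altman(reference, automated):
    """Return the bias and the 95 % limits of agreement.

    Differences follow the sign convention of the text:
    reference value minus automated value.
    """
    if len(reference) != len(automated) or len(reference) < 2:
        raise ValueError("need two equally long series with at least 2 cases")
    diffs = [r - a for r, a in zip(reference, automated)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (assumption)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)


# Hypothetical EDV values in ml, not taken from the study dataset
reference_edv = [100.0, 150.0, 80.0, 200.0]
automated_edv = [98.0, 145.0, 82.0, 195.0]
bias, (lower, upper) = bland_altman(reference_edv, automated_edv)
```

With this sign convention, a positive bias means that the automated method underestimates the index relative to the reference.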

The Williams index test, as introduced by Chalana et al. [18], was employed to assess whether the results from the automated method are within the agreement limits of the experts’ manual references. This test compares the agreement of one observer with the joint agreement of the other observers. A Williams index close to one indicates that the LV indices estimated with the automated method differ from the manual LV indices as much as the LV indices differ from one manual expert to another. A Williams index higher than one implies that the average agreement between the automated method and the manual references is actually higher than the average agreement between the observers. Further details regarding this test can be found in the original work of Chalana et al. [18].
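
As an illustration of the idea, the index can be sketched as the ratio between the average inverse disagreement of the automated method with each observer and the average inverse disagreement among observer pairs. The sketch below rests on assumptions: the mean absolute difference is used as the disagreement measure, the confidence interval is not computed, and all names and values are hypothetical. The exact definition used in the study is given in [18].

```python
from itertools import combinations


def disagreement(a, b):
    """Mean absolute difference between two raters (assumed measure)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def williams_index(automated, observers):
    """Ratio of method-observer agreement to observer-observer agreement.

    Agreement is taken as the inverse of the disagreement, so the
    function fails if any two raters agree exactly on every case.
    """
    if len(observers) < 2:
        raise ValueError("at least two observers are required")
    numerator = sum(1.0 / disagreement(automated, obs)
                    for obs in observers) / len(observers)
    pairs = list(combinations(observers, 2))
    denominator = sum(1.0 / disagreement(a, b) for a, b in pairs) / len(pairs)
    return numerator / denominator


# Hypothetical EDV measurements in ml for three cases
observers = [[100, 150, 80], [104, 146, 84], [98, 153, 78]]
automated = [101, 149, 81]
wi = williams_index(automated, observers)
```

In this made-up example the automated values lie between those of the observers, so the index is larger than one.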

For the assessment of reproducibility, a second user was asked to initialize the algorithm for the entire data set. In order to provide a standard for comparison of the user dependency of the algorithm, the pairwise difference between the 3 experts was used. It is important to note that the two users who initialized the automated method were not the same users who provided the manual analysis, and were thus fully blinded to the results of the manual contouring process.

The software tools PASW Statistics 18 (SPSS Inc, Chicago, IL, USA) and Excel 2007 (Microsoft Corporation, Redmond, WA, USA) were used for the statistical analysis. A significance level of p < 0.05 was used in the statistical tests.

Results

The analyzed dataset includes a wide spread in LV dimensions [EDV range = (50, 240 ml)] and LV global functional status [EF range = (22, 68 %)]. The dataset comprised both healthy individuals (# = 5) and patients (# = 19) with a multitude of conditions, including valvular disease, congenital heart defect and ischemic/dilated cardiomyopathy with severely distorted LV geometry. An overview regarding the clinical details of the dataset used in the present study is given in Table 1. The results of the expert assessment on image quality are given in Table 2, illustrating the variable image quality of the data.

Table 1 Clinical and echocardiographic characteristics of the study population [μ ± σ, (range in absolute values)]
Table 2 Image quality evaluation

The average analysis time required to extract all the LV volumetric indices was 30.7 ± 7.5 s. Note that most of this time was associated with the user preview of the data and manual initialization with 12 + 1 clicks per dataset. Indeed, the time required by the automatic segmentation algorithm to process one 3D volume was 0.33 ± 0.06 s, in a non-optimized MATLAB (MathWorks, Natick, MA, USA) implementation.

The results of the linear regression analysis are shown in Fig. 2. Regression analysis shows a strong correlation between the LV volumetric analysis using the BEAS algorithm and the manual reference values (Pearson product-moment correlation coefficients of 0.963, 0.947, 0.944 and 0.853 for EDV, ESV, SV and EF respectively).

Fig. 2 Linear regression analysis for EDV, ESV, SV and EF

Bland–Altman analysis, shown in Fig. 3, reveals a bias (limits of agreement) of 2.59 (−25.39, 30.57) ml, −2.11 (−24.91, 20.69) ml, 4.70 (−12.93, 22.34) ml and 3.45 (−8.96, 15.87) %, for EDV, ESV, SV and EF respectively. However, this bias was only statistically significant for SV and EF (paired t test, p < 0.05).

Fig. 3 Bland–Altman plots for EDV, ESV, SV and EF, comparing the measured errors against the reference values [green bias (*, p < 0.05), red limits of agreement (μ ± 1.96σ)]

The Williams index (and its 95 % confidence interval) was 1.06 (0.99, 1.13), 1.10 (1.02, 1.17), 1.05 (0.98, 1.12) and 0.97 (0.91, 1.04) for EDV, ESV, SV and EF respectively. These results show that the level of agreement of the BEAS algorithm with respect to the manual references is within the inter-observer joint agreement interval.

The inter-observer variability results are shown in Table 3. The inter-observer agreement using the automated method was significantly better than that of the manual analysis for both EDV and ESV (unpaired t test, p < 0.05).

Table 3 Inter-observer variability analysis

Discussion

The major finding of this study is twofold: (1) the proposed BEAS algorithm shows accuracy and reproducibility similar to those of current state-of-the-art tools for the assessment of volumetric parameters from 3D cardiac ultrasound data; (2) the proposed methodology obtains this accuracy without the need for further user correction after manual initialization. As a result, analysis time is significantly reduced, which is an advantage over the current state-of-the-art.

The good agreement of LV volumetric measurements between RT3DE and MRI opens a very important opportunity to include accurate 3D analysis in clinical routine without the high cost of MR imaging. Nonetheless, automatic left ventricular segmentation is not a trivial task, as image segmentation is strongly influenced by ultrasonic data quality [19]. Furthermore, the results of Muraru et al. show that even current state-of-the-art commercial solutions only offer the required accuracy when a subsequent manual correction is performed, which implies spending additional time on data analysis. This makes the translation of these approaches to routine clinical practice more difficult.

In the present study, we aimed to assess a recently developed algorithm without manual correction of its results, in order to tackle the above problem and to avoid an over-dependency on the user’s skill in correcting possible flaws of the software. In addition, we did not explicitly select a data set comprising patients with excellent image quality, in order to truly represent the conditions met in clinical routine. In this setting, the presented results show that BEAS is a valid tool for 3D LV analysis, given that its accuracy is well within the manual inter-observer agreement range. Furthermore, it provides these results with minimal user input and within roughly 30 s (30.7 ± 7.5 s), including the user input stage, making it suitable for online use. This time window is very competitive when compared with the results reported in the literature, which range from around 2 min [11, 14] to 10 min [1, 2], although several studies report analysis times around 5 min per dataset [5, 6, 9, 15].

The present results outperform those of another automated tool recently reported by Muraru et al. [14]. In that study, the correlations found between the results of the automated software and a semi-automated approach were lower (R = 0.88, 0.82 and 0.75 for EDV, ESV and EF respectively) than the ones found in the present study. The biases found in the present study are also smaller than the ones previously reported for automated methods by Muraru et al. (bias: −24 ml, −6.5 ml and −4.3 % for EDV, ESV and EF respectively). The software analyzed in the present study has slightly tighter limits of agreement for both EDV and ESV than the ones reported by Muraru et al. [LOA: (−57, 9) ml and (−31, 18) ml for EDV and ESV respectively], while for EF they are comparable [LOA: (−15, 6) %]. Nevertheless, it should be noted that after manual correction of the automated results, Muraru et al. have shown better limits of agreement and smaller bias. This points to the fact that whenever the user is allowed to refine the results of the automated method, the software validation process becomes more dependent on the user’s skills.

Ejection fraction was underestimated with BEAS. Although this trend was already present in the results of the automated method presented in Muraru et al. [14], the importance of this index in clinical practice makes the EF bias estimated with BEAS an important issue. This might be related to the fact that the user is asked to independently initialize the ED and ES frames, which can introduce a bias in both EDV and ESV measurements. Given that both SV and EF are calculated from these values, there is a cumulative error effect that translates into a higher bias and wider limits of agreement for these measures. Note that this cumulative effect on the SV and EF bias arises from the opposite signs of the EDV and ESV biases.
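
For SV, this accumulation can be checked directly against the Bland–Altman biases reported in the Results. Assuming that SV is obtained per case as EDV minus ESV, the mean differences subtract in the same way:

```latex
\mathrm{bias}_{SV} = \mathrm{bias}_{EDV} - \mathrm{bias}_{ESV}
                   = 2.59 - (-2.11) = 4.70\ \mathrm{ml}
```

Had the EDV and ESV biases shared the same sign, they would have partially cancelled in SV instead of adding up.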

For EDV and ESV, the inter-observer variability was lower using BEAS than for fully manual analysis. This is one of the main advantages of automated analysis. It is also noticeable that the inter-observer variability for the manual references is higher than the values that have been previously reported. This most likely stems from image quality issues (cf. Table 2), since the dataset comprises exams of regular quality which were not selected based on image quality, in order to mimic clinical routine realistically. Nonetheless, it may also be related to the different levels of experience of the clinical experts providing the references.

However, inter-observer variability for functional indices (SV and EF) is still comparable to that of fully manual analysis. Therefore, future work will aim at strategies to reduce the user dependency of the algorithm, either by implementing automatic initialization procedures or by explicitly expanding the scope of the boundary detection algorithm. Nonetheless, it is worth mentioning that these values can be measured much faster with BEAS, which could lead to a significant improvement in routine clinical practice.

Study limitations

An important limitation of the present study is that no direct comparison was made to clinically available automated software. However, manual analysis, which was used to test the proposed framework, remains the most accurate way of extracting LV volumes from ultrasound recordings. In addition, the results presented were compared primarily with the study of Muraru et al., as this remains, to the authors’ best knowledge, the only study to explicitly assess the added value of manual correction of the results offered by an automated volume quantification software suite.

Since the primary aim of the present study was to assess the performance of BEAS on the extraction of the most relevant LV volumetric parameters from RT3DE, manual segmentation of RT3DE data was used as reference. Nonetheless, MRI remains the gold standard for LV volume assessment. On the other hand, several previous studies have shown that manual segmentation of RT3DE data offers an accurate alternative to MRI estimation of LV volumetric indices [1–6]. In the scope of the present study it was therefore decided to use manual segmentation of RT3DE data as reference measurements and to focus our efforts on validating software tools that may relieve physicians of the cumbersome task of manual 3D segmentation.

Another limitation of the present study was the available dataset size. However, it should be noted that several relevant studies have used similar dataset sizes (Muraru et al. (# = 23) [14], Kuhl et al. (# = 24) [7], van den Bosch et al. (# = 29) [3], Sugeng et al. (# = 31) [4], Hansegaard et al. (# = 35) [11]). Furthermore, more important than the dataset size is its heterogeneity, both in terms of image quality (cf. Table 2) and global LV dimensions and function (cf. Table 1), which demonstrates that the tested algorithm is able to cope with significantly different imaging conditions, as one would encounter in routine clinical practice.

It is important to stress that, although automated, the BEAS software still relies on manual initialization. Given that the underlying algorithm looks for optimal boundary positions in a neighborhood region, the initialization does influence the convergence basin of the algorithm. Furthermore, because different errors are introduced at the end-diastolic and end-systolic frames, the indices that express the volume change between these cardiac phases are naturally more prone to errors and less repeatable. Although this could be tackled with automatic initialization procedures, these are not straightforward and do not guarantee an optimal correspondence between the automatic initializations at different cardiac phases. This will be a topic of future research, in order to move one step closer to fully automatic LV volumetric analysis in RT3DE data.

A final note concerns the importance of accurate EF estimation in patients with low EF values. Although BEAS presents a competitive performance over the entire tested dataset, whose EF ranged from 22 to 68 %, there are only two patients with EF < 35 %. Thus, future work should further address the use of BEAS to accurately estimate EF in at-risk patients.

Conclusion and future work

The BEAS framework provides an accurate and repeatable platform for 3D LV volumetric analysis, without the need for manual editing. It is an automated method that relies only on manual initialization with 6 clicks per 3D volume. Furthermore, extraction of the relevant volumetric cardiac indices takes about 30 s on average. Therefore, the BEAS framework allows a fast quantification of 3D cardiac morphology and global function, facilitating its use in clinical routine and in bed-side applications.