Introduction

In the last decade, transcatheter aortic valve replacement (TAVR) has become a routine alternative to open surgical valve replacement in high-risk patients, with comparable short- and long-term results [1, 2]. ECG-triggered computed tomography (CT) has become one of the most important imaging modalities prior to implantation and provides exact measurements of the aortic root, access paths and fluoroscopic projections in patients with severe aortic stenosis [3, 4]. CT provides more accurate sizing of the aortic annulus than 2D echocardiography and can reduce the rate of paravalvular leakage [5–9].

The number of TAVR procedures has been rising rapidly in recent years. In addition, improvements in the implantation method and in preoperative imaging have introduced a variety of new parameters that are measured prior to the procedure, which has increased the effort and prolonged the evaluation time. In particular, accurate measurement of the aortic root, including the sizing of the aortic annulus and of the distance to the coronary ostia, is crucial for the success of the procedure and the avoidance of perioperative complications [9–11]. Several studies have described slightly different approaches for evaluating the aortic root [4, 7, 12–14], and recent studies have shown a wide range of inter-observer reliability for these measurements [11, 15, 16].

It was hypothesized that semi-automatic workflows that guide physicians through the evaluation process, together with automatic detection of the aortic annulus plane, could reduce the duration of the preoperative analysis and improve measurement accuracy, particularly for inexperienced observers.

The aim of this study was to evaluate the potential advantage of semi-automatic image data segmentation versus a manual system for experienced and inexperienced observers.

Materials and methods

Patient population

Thirty consecutive, high-risk patients who had been scheduled for TAVR due to severe symptomatic aortic stenosis and received a successful ECG-triggered cardiac CT as a part of standard preoperative planning between the 4th and 22nd of October 2013 were included in our study. The baseline characteristics of the patients are shown in Table 1. Prior to the CT scan, the patients provided written informed consent. The scan protocol was performed according to the statutes of our local ethics committee.

Table 1 Baseline characteristics (N = 30)

CT protocol

All CT scans were performed using a 128-row-detector, dual-source CT (SOMATOM Definition Flash® Siemens Healthcare, Erlangen, Germany) with a prospectively ECG-triggered, high-pitch technique according to the standard protocol of our facility. The prospective ECG-triggering was set to capture the aortic root in late diastole to avoid movement artifacts. In total, 70 ml of non-ionic, iodinated contrast medium (Iomeprol, 400 mg iodine/ml, Bracco Imaging, Konstanz, Germany) was injected at a flow rate of 3.5 ml/s prior to the scan. After a delay of 6 s, bolus tracking was performed in the ascending aorta using a threshold of 100 HU; once the threshold was reached, the scan was started in the cranio-caudal direction. Scanner parameters are as previously published [17].

Evaluated software

The entire CT image analysis was completed using commercially available, client–server-based software for 3D reading and advanced visualization (syngo.via VA20, Siemens AG, Healthcare, Erlangen, Germany). For the analysis, two separate workflows (“semi-automatic segmentation” and “manual segmentation”) were used. First, a TAVR workflow (“semi-automatic segmentation”; syngo.CT Cardiac Function—Valve Pilot) was used that included automatic detection of the annulus plane and the coronary ostia plus an automatic tool for estimating the fluoroscopy angulations (cf. Fig. 1). Second, a version of the TAVR workflow (“manual segmentation”) was used that required manual segmentations using MPR reconstructions to measure the aortic root (cf. Fig. 2). In both workflows, automatic detection of the vessel centerlines of the aorta and iliac runoffs was available, as was an automatic 3D volume rendering technique (VRT) tool.

Fig. 1

Semi-automatic segmentation and evaluation. A, B Semi-automatic segmentation of the aortic valve annulus plane, with marking of the most basal aortic cusp insertion points. C Semi-automatic segmentation of the aortic valve annulus, with automatic calculation of the surface- and perimeter-derived maximal as well as minimal effective diameters (EDs). D Semi-automatic estimation of the aortic annulus angulations, including 3D-visualization. Ao aorta, ED effective diameter, LA left atrium, LV left ventricle

Fig. 2

Manual segmentation and evaluation. A, B Manual segmentation of the aortic valve annulus plane using the MPR tool. C Manual segmentation of the aortic valve annulus. The effective diameter has to be calculated separately. D Manual estimation of the aortic annulus angulations, including 3D visualization. The angulation plane was defined as the plane parallel to the right coronary sinus (red). After adjustment, the angulations are estimated using the red-framed image (A). Ao aorta, LA left atrium, LV left ventricle

Image analysis

All measurements were performed separately by two radiologists (B.F., C.J.) with 5 years and 1 year of experience, respectively, in analyzing cardiac CT images. First, all datasets were analyzed using the manual segmentation, followed by the semi-automatic software. All cases were evaluated in multiple sessions without a fixed order, and the cases were selected randomly. Both examiners were blinded to the results of the other examiner as well as to the results of the manual analysis.

Based on our institutional standards, the duration of the following evaluation steps was estimated: pre-processing, data loading, 3D-VRT and measurements of aorta diameter; the curved multiplanar reconstructions (cMPR) of the aorta and iliac runoffs with diameter measurement and planimetry of the aortic annulus; estimation of the surface- and perimeter-derived effective diameter (ED); measurement of coronary ostial heights; and determination of the aortic annulus angulations for fluoroscopy. The list of these variables with corresponding description is shown in Table 2. Additionally, inter-software and inter-observer reliability for the most critical parameters, such as the surface- and perimeter-derived ED as well as for the ostial heights, were assessed.
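To illustrate how the surface- and perimeter-derived EDs relate to a planimetered annulus contour, the following minimal Python sketch uses the common geometric definitions, i.e. the diameter of the circle that has the same area or the same perimeter as the contour. These formulas, the function names and the example ellipse are assumptions made for illustration; the text does not state how the evaluated software computes the ED.

```python
import math


def ed_from_area(area_mm2: float) -> float:
    """Diameter (mm) of the circle with the same area as the annulus contour."""
    return 2.0 * math.sqrt(area_mm2 / math.pi)


def ed_from_perimeter(perimeter_mm: float) -> float:
    """Diameter (mm) of the circle with the same perimeter as the annulus contour."""
    return perimeter_mm / math.pi


# Hypothetical elliptical annulus with semi-axes of 14 mm and 11 mm
# (illustrative values only, not taken from the study).
a, b = 14.0, 11.0
area = math.pi * a * b
# Ramanujan's approximation of the perimeter of an ellipse.
perimeter = math.pi * (3 * (a + b) - math.sqrt((3 * a + b) * (a + 3 * b)))

print(f"area-derived ED:      {ed_from_area(area):.1f} mm")
print(f"perimeter-derived ED: {ed_from_perimeter(perimeter):.1f} mm")
```

For any non-circular contour, the perimeter-derived ED under these definitions is at least as large as the area-derived ED, so the two values should not be used interchangeably.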

Table 2 Measurements

Statistical analysis

Statistical analyses were performed using commercially available statistical software (SPSS, v.17, for Windows, IBM Corp., Chicago, IL, USA). All variables were tested for normal distribution using the Shapiro–Wilk test. Normally distributed continuous variables were expressed as means ± standard deviations and compared using Student’s paired t test. Non-normally distributed variables were expressed as medians with quartiles and compared using the Mann–Whitney U test or the Wilcoxon signed-rank test. The level of significance was defined as p < 0.05.

The intra-class correlation coefficient (ICC), which was defined as the ratio of the between-subject variance to the total variance, and Cronbach’s alpha were determined to estimate the inter-software and inter-observer reliability. The 95 % confidence intervals (CIs) were calculated, and the ICC values were interpreted as follows: absence of agreement, 0 or less; poor agreement, less than 0.20; fair agreement, 0.21–0.40; moderate agreement, 0.41–0.60; good agreement, 0.61–0.80; and excellent agreement, greater than 0.80 [18]. Bland–Altman plots with 95 % limits of agreement were used to visualize the differences between the results from the analysis software and from the observers [19].
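The reliability measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS procedure used in the study: it assumes a one-way random-effects ICC (the text does not state which ICC model was applied) and limits of agreement computed as the mean difference ± 1.96 standard deviations. All function names are hypothetical.

```python
from statistics import mean, stdev


def icc_oneway(ratings):
    """One-way random-effects ICC for a list of per-subject rating tuples."""
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of ratings per subject
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    # Between-subject and within-subject mean squares (one-way ANOVA).
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((v - m) ** 2 for row, m in zip(ratings, row_means) for v in row) / (n * (k - 1))
    # Estimated between-subject variance divided by estimated total variance.
    return (msb - msw) / (msb + (k - 1) * msw)


def agreement_label(icc):
    """Map an ICC value to the agreement categories quoted in the text."""
    if icc <= 0:
        return "absence of agreement"
    for upper, label in ((0.20, "poor"), (0.40, "fair"), (0.60, "moderate"), (0.80, "good")):
        if icc <= upper:
            return f"{label} agreement"
    return "excellent agreement"


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) for paired measurements x and y."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired effective diameters in mm (not study data).
semi_auto = [23.5, 25.9, 24.2, 26.9, 23.0]
manual = [23.1, 25.4, 24.0, 26.2, 22.8]

icc = icc_oneway(list(zip(semi_auto, manual)))
print(agreement_label(icc), bland_altman(semi_auto, manual))
```

The category boundaries in the text leave small gaps (for example between 0.20 and 0.21); the sketch assigns such values to the lower category, which is an arbitrary choice.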

Results

Patient characteristics

All of the patients fulfilled the inclusion criteria (N = 30). The study population consisted of 17 (56.7 %) female and 13 (43.3 %) male patients with an average age of 78.4 ± 6.7 years. All of the patients had a tricuspid native aortic valve, and the mean heart rate during the image acquisition was 79.7 ± 16.8 bpm (ranging from 58 to 104 bpm). Because of clinical contraindications, no beta-blockers were administered prior to the examination.

Analysis duration

The mean total evaluation times when using semi-automatic segmentation were 520.4 ± 117.6 (median 485.5) s for the expert and 580.2 ± 162.8 (median 517.0) s for the novice. When using manual segmentation, the expert required 607.8 ± 104.6 (median 592.0) s for the total evaluation, and the novice required 693.2 ± 159.5 (median 670.0) s (Fig. 3). The expert was, on average, 59.8 ± 101.4 s (approx. 10.3 %, p ≤ 0.02) faster than the novice when using semi-automatic segmentation and was 85.4 ± 83.8 s (approx. 12.3 %, p ≤ 0.001) faster when using the manual software.
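The percentages quoted for the expert-versus-novice comparison can be reproduced from the reported means. The short Python sketch below assumes that each percentage expresses the mean paired difference relative to the novice's mean total time for the same workflow; this reference choice is an inference from the numbers, not something the text states, and the function name is hypothetical.

```python
def relative_difference(diff_s: float, reference_s: float) -> float:
    """Time difference expressed as a percentage of the reference time."""
    return 100.0 * diff_s / reference_s


# Mean total evaluation times of the novice (s), as reported in the text.
novice_semi, novice_manual = 580.2, 693.2
# Mean paired expert-versus-novice differences (s), as reported in the text.
diff_semi, diff_manual = 59.8, 85.4

pct_semi = relative_difference(diff_semi, novice_semi)
pct_manual = relative_difference(diff_manual, novice_manual)
print(f"{pct_semi:.1f} % / {pct_manual:.1f} %")  # prints "10.3 % / 12.3 %"
```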

Fig. 3

Total evaluation duration. Distribution of the evaluation duration depending on the observer and software solution (novice = 1 year of experience; expert = 5 years of experience)

The mean duration of each step, which depended on the software type as well as on the observer, is shown in Fig. 4 and Table 3. Regardless of the level of experience, the three most time-consuming steps were the measurements of aortic diameters, the 3D-VRT and the sizing of the aortic annulus.

Fig. 4

Duration of the individual evaluation steps when using the semi-automatic workflow. The duration of the individual measurement steps per observer. The three most time-consuming steps include obtaining the measurements of the aortic and annular diameters as well as performing the 3D VRT

Table 3 Duration of the individual evaluation steps A: Manual segmentation, B: Semi-automatic segmentation

Semi-automatic segmentation (Valve Pilot) successfully recognized the aortic annulus plane in 28 (93.3 %) patients. In 4 of these 28 patients (13.3 % of all patients), additional corrections of the aortic annulus plane were necessary, and minor manual corrections of the automatically recognized annulus contour were performed in all of the patients. The automatic recognition of the coronary ostia worked correctly in 30 (100 %) patients. No patient had an abnormal coronary origin or a bicuspid aortic valve. All patients presented with at least mild valvular calcifications.

Inter-software reliability

The overall inter-software reliability was excellent, and the details are listed in Table 4. The Bland–Altman analysis of the mean inter-software measurements is shown in Fig. 5A (mean difference: 0.42 mm, upper/lower limit: 2.61/−1.76). Two measurements exceeded the limits of agreement.

Table 4 Inter-software and inter-observer reliability
Fig. 5

Bland–Altman analysis of the mean inter-software (A) and inter-observer (B) measurements of the perimeter-derived effective diameter. A Mean difference: 0.42 mm, upper/lower limit: 2.61/−1.76. B Mean difference: 0.53 mm, upper/lower limit: 1.63/−0.58
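As a plausibility check on the reported limits of agreement, the mean difference and the standard deviation of the differences can be recovered from the upper and lower limits. The Python sketch below assumes that the limits were computed as the mean difference ± 1.96 standard deviations, which is the usual convention but is not stated explicitly in the text; the function name is hypothetical.

```python
def bias_and_sd_from_limits(upper: float, lower: float, z: float = 1.96):
    """Recover the mean difference and SD of the differences from the limits of agreement."""
    bias = (upper + lower) / 2.0
    sd = (upper - lower) / (2.0 * z)
    return bias, sd


# Limits of agreement (mm) as given in the caption of Fig. 5.
inter_software = bias_and_sd_from_limits(2.61, -1.76)
inter_observer = bias_and_sd_from_limits(1.63, -0.58)

for name, (bias, sd) in (("inter-software", inter_software), ("inter-observer", inter_observer)):
    print(f"{name}: bias {bias:.2f} mm, SD of differences {sd:.2f} mm")
```

Under this assumption, the recovered mean differences agree with the reported values of 0.42 mm and 0.53 mm to within rounding.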

Expert

The mean differences between the semi-automatic and manual segmentations performed by the expert were 0.5 ± 1.1 mm for the perimeter-derived ED and 0.5 ± 0.5 mm for the area-derived ED. The inter-software reliability was excellent for the perimeter- and surface-area-derived EDs (ICC 0.96, 95 % CI 0.92–0.98/ICC 0.98, 95 % CI 0.87–0.99).

For the expert, the mean LCA and RCA heights did not significantly differ between the semi-automatic and manual measurements (0.5 ± 1.9 mm, p = 0.10/0.7 ± 3.9 mm, p = 0.36). Furthermore, the inter-software reliability was high for the LCA and RCA height measurements (ICC 0.88, 95 % CI 0.74–0.94/ICC 0.91, 95 % CI 0.81–0.96).

Novice

The measurements performed by the novice resulted in mean differences between the semi-automatic and manual segmentations of 0.9 ± 1.4 mm for the perimeter-derived ED and 0.3 ± 1.1 mm for the area-derived ED. The inter-software reliability for the inexperienced observer was slightly lower than that for the experienced observer; however, both were excellent (ICC 0.92, 95 % CI 0.75–0.97/ICC 0.96, 95 % CI 0.91–0.98).

For the novice, the mean LCA and RCA heights for the semi-automatic and manual measurements were not significantly different (0.1 ± 2.4 mm, p = 0.92/0.24 ± 2.3 mm, p = 0.58). The inter-software reliability for the novice was lower than that for the expert; however, both were high (ICC 0.87, 95 % CI 0.73–0.94/ICC 0.88, 95 % CI 0.76–0.95).

Inter-observer reliability

The overall inter-observer reliability was high to excellent; the details are provided in Table 4. The Bland–Altman analysis of the mean inter-observer measurements is shown in Fig. 5B (mean difference: 0.53 mm, upper/lower limit: 1.63/−0.58). One measurement exceeded the limits of agreement.

When the two software versions were compared, the mean differences between the observers in the measurements of the perimeter-derived ED and of the LCA and RCA ostial heights were lower with semi-automatic segmentation. The corresponding mean differences are listed below.

Semi-automatic segmentation

When semi-automatic segmentation was applied, the differences between the means of the two observers were not significant: 0.1 ± 0.6 mm for the perimeter-derived ED and 0.02 ± 0.7 mm for the surface-area-derived ED. The inter-observer reliability for the perimeter- and surface-area-derived ED was excellent (ICC 0.99, 95 % CI 0.97–0.99/ICC 0.99, 95 % CI 0.97–0.99).

The semi-automatically assessed mean LCA heights between the observers differed significantly, with a mean difference of 0.9 ± 1.5 mm, p = 0.01, whereas the mean difference in the RCA height was not significant (0.9 ± 2.4 mm, p = 0.06). The inter-observer reliability was high, with an ICC of 0.90 for the LCA (95 % CI 0.75–0.96) and an ICC of 0.94 for the RCA (95 % CI 0.87–0.97).

Manual segmentation

The manual segmentation revealed mean differences of 1.0 ± 1.0 mm and 0.8 ± 0.8 mm for the perimeter- and surface-area-derived diameters, respectively. The inter-observer reliabilities for the perimeter- and surface-area-derived EDs were excellent (ICC 0.95, 95 % CI 0.67–0.99/ICC 0.96, 95 % CI 0.64–0.99).

The manually assessed mean LCA heights differed significantly between the observers, with a mean difference of 1.6 ± 2.0 mm, p < 0.001, whereas there was no significant difference in the mean RCA heights (0.9 ± 1.7 mm, p = 0.01). The inter-observer reliability was high, with an ICC of 0.81 for the LCA (95 % CI 0.39–0.93) and an ICC of 0.91 for the RCA (95 % CI 0.78–0.96).

Discussion

The results of this study show that software using semi-automatic segmentation of the aortic root and recognition of the aortic valve annulus, contours and angulation significantly reduces the evaluation duration, improves the inter-observer reliability, and maintains the accuracy of manual measurements, particularly for examiners with little experience in evaluating CT images prior to TAVR.

Accurate CT measurements of the aortic root and of the dimensions of the implantation path are essential for correct prosthesis sizing and for selecting the implantation path, and thus for successful implantation and the avoidance of complications in patients undergoing TAVR. Aortic root measurements are particularly challenging and time consuming, and with an ever-increasing number of implantations performed worldwide, measurement accuracy and efficiency are becoming increasingly important. Several automatic or semi-automatic tools for the evaluation of CT data prior to TAVR are available. Automatic tools are only useful if they do not compromise the accuracy of the assessments, if they facilitate significant time saving, and if they support the complete range of measurements that are required for the implantation. Additionally, the possibility of manually editing the measurements is essential, particularly for patients with abnormal anatomical characteristics or strong calcifications.

The software for routine 3D and advanced visualization that was applied in this study allowed all of the measurements that are required for implantation in our facility (Table 2), which correspond to the parameters recommended by Holmes et al. [3]. Converting to the manual mode or editing the automatically assessed findings was always possible. Only the centerline of the aortic root could not be edited within the automatic segmentation tool.

In most cases, the patients in our study population presented with extensive valvular calcifications, which were expected to be a challenge for the automatic annulus segmentation. However, automatic recognition of the aortic root failed in only two (6.7 %) patients, and the aortic annulus plane had to be adjusted manually in four (13.3 %) patients. Due to heavy calcifications in most cases, minor adjustments of the contours of the aortic annulus were required for all patients. These results correspond to those of a recently published study that examined the same semi-automatic segmentation software [20]. With the automatic segmentation, both the correction of the aortic annulus plane and the adjustment of the annulus contours could be done quickly and easily.

Regarding the duration of each evaluation step, there were some outliers. Most of these outliers were caused by the failure of the automatic vessel segmentation, which was the same in the manual and the semi-automatic software versions. The measurements of the aorta and iliac runoffs were affected.

Regarding the inter-software reliability, there were differences between the mean perimeter-derived ED measurements. In the Bland–Altman analysis of both software types, only two measurements lay outside the limits of agreement, showing that both software versions can deliver significantly different values. Furthermore, there were lower mean differences between the two observers in the measurements of the perimeter-derived ED as well as of the LCA and RCA ostial heights when using the semi-automatic segmentation. This allows the conclusion that semi-automatic segmentation is more accurate for the measurement of the variables that are crucial prior to TAVR.

Furthermore, the inter-observer agreement in our study, as shown in the ICC analysis, was excellent and comparable to that of recently published studies [16, 20, 21]. Although the manual estimation of the diameter of the aortic annulus revealed excellent inter-observer reliability, the reliability was further improved when using semi-automatic segmentation, which is beneficial, particularly for the novice examiner. Thus, semi-automatic segmentation might improve the reproducibility and help to standardize the evaluation process.

There was a significant saving of evaluation time of approximately 12 % for the novice and of 10 % for the expert when using semi-automatic evaluation. Therefore, semi-automatic evaluation allows a more time-efficient and accurate estimation of the required measurements in high-risk patients prior to TAVR. Given the steadily rising number of TAVR cases in clinical routine, a time saving of a couple of minutes per case becomes clinically relevant. In our study, the novice already had approximately 1 year of experience with the manual evaluation of TAVR cases; we therefore conclude that centers without any experience in TAVR evaluation might benefit even more from semi-automatic segmentation. The preset and guided order of the evaluation steps might be especially advantageous.

One of the major limitations of the semi-automatic software is the semi-automatic segmentation itself. On the one hand, the semi-automatic segmentation provides a convenient, guided way of evaluating all variables that are currently required prior to TAVR. On the other hand, the automatic segmentation has only limited flexibility, which might become an issue if additional variables become important or if existing evaluation steps need to be replaced or merely adjusted. Additionally, semi-automatic segmentation delivers results whose accuracy always needs to be confirmed. Thus, semi-automatic segmentation should support manual segmentation tools rather than replace them.

This study has some limitations. It is a single-center study describing the efficiency of two particular software solutions, and its data are only comparable in settings using the same software tools. To confirm our results, further studies with additional software solutions and larger study populations should be performed. Our study population consisted of patients who had datasets with good image quality and who, apart from severe aortic stenosis, had neither valvular nor coronary anatomical abnormalities, so the accuracy of the semi-automatic approach in patients with anatomical abnormalities could not be evaluated.

Conclusions

TAVR evaluations can be accomplished significantly faster and without compromising accuracy when using semi-automatic compared with manual segmentation, showing a benefit that is independent of the experience level of the examiner.