1 Introduction

The aortic valve allows one-way flow of oxygenated blood from the left ventricle (LV) to the aorta. It prevents regurgitation during ventricular diastole and allows sufficient cardiac output to be ejected from the LV during ventricular systole [6]. Regurgitation and stenosis are common malfunctions of the aortic valve. Aortic regurgitation, or leakage of blood back into the LV, can be caused by the aging process, high blood pressure or injury, whereas aortic stenosis is characterized by narrowing of the valve opening due to the buildup of calcium and scarring [4, 23]. Both conditions can increase the volume and pressure in the LV, resulting in ventricular hypertrophy and eventually heart failure in untreated cases. The overall burden of disease due to severe aortic stenosis and regurgitation in the elderly population is substantial, with 12.4% and 14.6% of those above approximately 70 years old being affected, respectively [25, 27]. Standard treatment strategies for aortic valve disease include catheterization by balloon valvotomy or transcatheter aortic valve replacement (TAVR) [29]. Alternatively, patients with severe aortic stenosis who are at high surgical risk are prescribed transcatheter valve implantation (TAVI) [17]. These minimally invasive procedures have been shown to reduce the risk and morbidity associated with open surgery [17].

Imaging modalities such as echocardiography and cardiac CT are integral to the diagnosis of aortic valve disease and to procedural success. Cardiac CT is widely adopted for pre-procedural assessment of candidates for TAVR and TAVI [10]. The volumetric CT scans provide information critical to treatment planning, including valve morphology and the severity of stenosis. However, cardiac CT cannot feasibly be used for real-time intra-procedural guidance because of its high-dose ionizing radiation, and it does not provide crucial hemodynamic information such as transvalvular pressure gradients and the presence of regurgitation [5]. Echocardiography, by contrast, is commonly applied as an in vivo imaging tool during minimally invasive aortic procedures [10]. Using a safe probing beam, echocardiography provides structural details of the aortic valve and regurgitation, as well as systolic ejection performance and the extent of wall hypertrophy [32]. Although it is low cost and easy to use, echocardiography yields images of lower quality than cardiac CT owing to speckle noise and a limited field of view [9]. Establishing a direct spatial correspondence between the two modalities, so that complementary anatomical and functional information can be fused, could improve the diagnostic accuracy for aortic dysfunction and facilitate pre-procedural planning and intra-procedural navigation during TAVR and TAVI.

The fundamental step in the fusion process is to bring the imaging modalities into spatial alignment through image registration. Intra-modality cardiac image registration has been proposed to date for modalities including echocardiography [9], MRI [19], PET [26] and SPECT [24]. These registration frameworks were introduced primarily to address motion artifacts and to enable the quantitation of cardiac wall and fluid dynamics. Efforts to register images from multiple modalities have also been reported, but these works focused mainly on CT-MR [35], CT-PET [3], SPECT-PET [8], MR-PET [28] and MR-SPECT [7]. Very few registration frameworks [12, 13, 18, 30] have been investigated for spatially aligning 2D echocardiography with CT volumetric data. Of these, two studies [12, 18] used an optical tracker to track the position and orientation of the echocardiography probe during echo-CT registration. These methods are currently restricted by the coarse resolution and the limited availability of optical trackers for clinical application. In addition, these methods have not been investigated for aortic valve registration in TAVR and TAVI, but were tested only on the mitral valve, phantoms and animal studies. Registration between 3D echocardiography and CT has also been described [15, 21], but this specialized ultrasound technology is relatively new, requires additional user training and has therefore yet to be widely adopted. The additional examination cost associated with 3D ultrasound imaging and the limited access to 3D streaming make it poorly suited for real-time fusion with other modalities [13]. Moreover, little attention has been paid to the use of image fusion for assessing aortic valve disease and for surgical guidance during TAVR and TAVI.

In the present study, we propose an automatic 2D to 3D registration framework for the fusion of echocardiography and CT data, aimed specifically at guiding aortic surgery. Our technique addresses temporal synchronization and spatial alignment together, offering opportunities for novel ways to display composite structural and functional information from intra-operative transthoracic echocardiography and preoperative CT data. We demonstrate the applicability of this technique using echocardiography images acquired without optical tracking information, and we validate the accuracy of the proposed framework quantitatively and qualitatively using the short-axis “Mercedes Benz” sign views of the aortic valve and the long-axis parasternal views of 10 patients.

2 Methodology

2.1 Cardiac CT and echocardiography acquisition

All data were collected retrospectively under the standard clinical acquisition protocol, and the study received prior approval from the Hospital Research Ethics Committee. Ten patients undergoing TAVR were recruited, and the data collected were anonymized. Each data set contains a retrospectively gated cardiac CT volumetric scan acquired preoperatively using a 64-slice dual-source CT scanner (Somatom Definition, Siemens Medical Solutions, Germany). The volumetric scan has a spatial resolution of 0.4 × 0.4 × 0.4 mm, with a single frame per cardiac cycle. The raw CT data were reconstructed at 40–70% of the R–R interval with a slice thickness of 0.75 mm and a matrix size of 512 × 512.

Additionally, each patient underwent 2D time series echocardiography scanning on the same day as the cardiac CT scan or a few days after it. Echocardiography scanning was performed using a Philips IE33 ultrasound system equipped with an S5-1 (1.0–3.0 MHz) transducer. The images acquired include short-axis “Mercedes Benz” sign views of the aortic plane and long-axis parasternal views. The images have 800 × 600 pixels at a resolution of 0.3 × 0.3 mm, were acquired at a frame rate of 60–80 frames/s and cover 4–5 cardiac cycles.

2.2 Image registration

The proposed method consists of two major steps: temporal synchronization and spatial registration. Temporal synchronization was performed to align the echocardiography and cardiac CT images in time, because the two modalities produce images with different sampling latencies and temporal resolutions [12]. Depending on heart rate and frame rate, echocardiography typically depicts a pumping heart moving through the systolic and diastolic phases during surgical navigation. The cardiac CT scan, on the other hand, was captured at a single time point, typically in the diastolic phase, during preoperative assessment. Since the structure of the heart changes between diastole and systole, correct selection of the 2D planar echocardiography images is critical for the success of spatial registration with the CT volume.

During spatial registration, a rigid geometrical transformation was applied to the selected 2D planar echocardiography image to align it spatially with the static cardiac CT volume using an intensity-based registration algorithm. The overall proposed registration workflow is shown in Fig. 1, and the details of each step are described in the following sections.

Fig. 1 Overall registration workflow

2.2.1 Temporal synchronization

Temporal synchronization of both modalities was achieved by time stamping using ECG gating signals that were obtained concurrently during echocardiography imaging. The R peaks were automatically detected from the ECG signal based on the signal amplitude. In order to select echocardiography images that have similar cardiac phases to the preoperative cardiac CT volume, the ECG signal was linearly interpolated within the R–R interval and 3 frames located within 40–70% of the interval were chosen (Fig. 2). This interval was determined from the capturing time delay recorded in the CT header file and was found to reside close to the end-diastolic phase of the cardiac cycle.

Fig. 2 Echocardiography image frames were selected based on ECG gating information
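The frame selection described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the fixed amplitude threshold, and the rule of preferring frames nearest the centre of the 40–70% window are all assumptions made for the example.

```python
from bisect import bisect_right


def detect_r_peaks(ecg, times, threshold):
    """Return the times of local ECG maxima at or above an amplitude threshold."""
    peaks = []
    for i in range(1, len(ecg) - 1):
        if ecg[i] >= threshold and ecg[i] > ecg[i - 1] and ecg[i] >= ecg[i + 1]:
            peaks.append(times[i])
    return peaks


def cardiac_phase(t, r_peaks):
    """Fraction (0..1) of the R-R interval elapsed at time t.

    r_peaks must be sorted. Returns None when t is not enclosed by two peaks.
    """
    i = bisect_right(r_peaks, t) - 1
    if i < 0 or i + 1 >= len(r_peaks):
        return None
    start, end = r_peaks[i], r_peaks[i + 1]
    return (t - start) / (end - start)  # linear interpolation within R-R


def select_frames(frame_times, r_peaks, lo=0.40, hi=0.70, count=3):
    """Pick up to `count` frame indices whose cardiac phase lies in [lo, hi].

    Frames nearest the centre of the window are preferred (an assumption;
    the text only states that 3 frames within 40-70% were chosen).
    """
    centre = 0.5 * (lo + hi)
    candidates = []
    for idx, t in enumerate(frame_times):
        phase = cardiac_phase(t, r_peaks)
        if phase is not None and lo <= phase <= hi:
            candidates.append((abs(phase - centre), idx))
    candidates.sort()
    return sorted(idx for _, idx in candidates[:count])
```

For example, with R peaks at 0.0 s and 1.0 s, frames stamped at 0.5, 0.55 and 0.6 s fall inside the window, whereas frames at 0.3 s and 0.9 s do not.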

2.2.2 Spatial registration

To register each 2D planar echocardiography image frame to the corresponding preoperative cardiac CT volume, a rigid 2D to 3D intensity-based registration scheme was employed, with mutual information as the similarity metric and a pattern search algorithm as the underlying optimizer. In the registration process, the 2D planar echocardiography image was spatially maneuvered within the cardiac CT volume to search for the best matching cardiac CT plane. The search was started by placing the echocardiography image at the seed position, defined by the orientation and position of the aortic valve plane, which were estimated by the expert on the CT volume during pre-procedural planning with respect to the 3D axes of the patient’s body.

A rigid spatial transformation with 8 degrees of freedom was applied to the 2D planar echocardiography images, comprising 3D translation (t_x, t_y and t_z), 3D rotation (R_x, R_y and R_z) and 2D scaling (S_x and S_y, i.e., along both dimensions of the planar image), as shown in Eqs. 1 and 2.

$$\left[ {\begin{array}{*{20}c} {u^{{\prime }} } \\ {v^{{\prime }} } \\ {w'} \\ 1 \\ \end{array} } \right] = T \times \left[ {\begin{array}{*{20}c} u \\ v \\ w \\ 1 \\ \end{array} } \right]$$
(1)
$$\begin{aligned} T & = 2{\text{D}}\;{\text{Scaling}} \times 3{\text{D}}\;{\text{Rotation}} \times 3{\text{D}}\;{\text{Translation}} \\ & = \left[ \begin{array}{cccc} S_{x} & 0 & 0 & 0 \\ 0 & S_{y} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array} \right] \times \left[ \begin{array}{llll} \cos \beta \cdot \cos \gamma & - \cos \beta \cdot \sin \gamma & - \sin \beta & 0 \\ - \sin \alpha \cdot \sin \beta \cdot \cos \gamma + \cos \alpha \cdot \sin \gamma & \sin \alpha \cdot \sin \beta \cdot \sin \gamma + \cos \alpha \cdot \cos \gamma & - \sin \alpha \cdot \cos \beta & 0 \\ \cos \alpha \cdot \sin \beta \cdot \cos \gamma + \sin \alpha \cdot \sin \gamma & - \cos \alpha \cdot \sin \beta \cdot \sin \gamma + \sin \alpha \cdot \cos \gamma & \cos \alpha \cdot \cos \beta & 0 \\ 0 & 0 & 0 & 1 \end{array} \right] \times \left[ \begin{array}{cccc} 1 & 0 & 0 & t_{x} \\ 0 & 1 & 0 & t_{y} \\ 0 & 0 & 1 & t_{z} \\ 0 & 0 & 0 & 1 \end{array} \right] \end{aligned}$$
(2)

where u, v and w are the input image coordinates, u′, v′ and w′ are the transformed coordinates, and α, β and γ are the rotation angles about the x, y and z axes, respectively.
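As a rough illustration of how the 8 parameters combine into a single homogeneous matrix, the sketch below composes scaling, rotation and translation in the order given in Eq. 2 and applies the result to a pixel coordinate. It assumes the column-vector convention of Eq. 1, so the translation sits in the last column, and it takes the (1,3) entry of the rotation matrix to be −sin β, which keeps the rotation block orthonormal. The function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def build_transform(sx, sy, alpha, beta, gamma, tx, ty, tz):
    """Compose T = Scaling x Rotation x Translation (angles in radians)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    scaling = [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    rotation = [
        [cb * cg, -cb * sg, -sb, 0],
        [-sa * sb * cg + ca * sg, sa * sb * sg + ca * cg, -sa * cb, 0],
        [ca * sb * cg + sa * sg, -ca * sb * sg + sa * cg, ca * cb, 0],
        [0, 0, 0, 1],
    ]
    # Column-vector convention: translation occupies the last column.
    translation = [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]
    return matmul(matmul(scaling, rotation), translation)


def apply_transform(t, u, v, w=0.0):
    """Map a homogeneous point (u, v, w, 1) through T and drop the last entry."""
    p = (u, v, w, 1.0)
    out = [sum(t[i][k] * p[k] for k in range(4)) for i in range(4)]
    return out[0], out[1], out[2]
```

With this composition order, a point is translated first, then rotated, then scaled. For instance, with S_x = 2, t_x = 1 and no rotation, the pixel (1, 0, 0) maps to (4, 0, 0).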

Linear interpolation was performed on the cardiac CT volume to sample a 2D CT plane of the same size and at the same spatial location as the echocardiography image. The best matching interpolated 2D CT plane, and therefore the optimal alignment, was determined using normalized mutual information (NMI). NMI is a widely used similarity metric that has been shown to be robust and well suited for multimodal image registration [11, 20]. This metric does not assume a linear relationship between the image intensities of different modalities, and the co-occurrence of the most probable values in the two modality images is maximized at the optimal registration [13]. Assuming A is the 2D interpolated CT plane and B_j the j-th echocardiography frame, the metric NMI(A, B) is given as follows:

$${\text{NMI}}\left( {A,B} \right) = \frac{1}{n}\sum\limits_{j = 1}^{n} {\frac{{H\left( A \right) + H\left( {B_{j} } \right)}}{{H\left( {A,B_{j} } \right)}}}$$
(3)

where H(A) and H(B_j) are the Shannon–Wiener entropies of images A and B_j, respectively, H(A, B_j) is the joint entropy of the two images [34], and n is the number of echocardiography frames used in the registration process.
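To make Eq. 3 concrete, the following sketch estimates the entropies from intensity histograms and averages the ratio over the selected frames. It is only an illustration under stated assumptions: images are passed as flat sequences of intensities in the range 0 to `max_value`, the bin count is arbitrary, and the value returned for two constant images (where the ratio is undefined) is a choice made for this example.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (in bits) of a histogram given as bin counts."""
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def nmi(a, b, bins=32, max_value=255):
    """(H(A) + H(B)) / H(A, B) for two equally sized flat intensity sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and of equal size")
    width = (max_value + 1) / bins
    qa = [min(int(x / width), bins - 1) for x in a]  # quantise into bins
    qb = [min(int(x / width), bins - 1) for x in b]
    n = len(a)
    h_a = entropy(Counter(qa).values(), n)
    h_b = entropy(Counter(qb).values(), n)
    h_ab = entropy(Counter(zip(qa, qb)).values(), n)  # joint histogram
    # The ratio is undefined when both images are constant; 0.0 is an
    # arbitrary choice for this sketch.
    return (h_a + h_b) / h_ab if h_ab > 0 else 0.0


def mean_nmi(ct_plane, echo_frames, **kwargs):
    """Average NMI between one CT plane and n echocardiography frames (Eq. 3)."""
    return sum(nmi(ct_plane, f, **kwargs) for f in echo_frames) / len(echo_frames)
```

Under this definition the metric equals 2 for two identical non-constant images and 1 when the two images are statistically independent, so larger values indicate better alignment.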

The registration task was carried out iteratively using the generalized pattern search (GPS) algorithm to find the optimal spatial transformation parameters that maximize the NMI of the overlapping image pixel intensities. The GPS algorithm [1] belongs to a family of numerical optimization methods that do not require the gradient of the objective function, which makes it well suited to functions that are not differentiable, such as mutual information in its usual form. GPS works by sampling points in a fixed pattern around the seed point [2], which was provided from pre-procedural planning using CT.

During optimization, all transformation parameters were scaled to the range of −1 to 1 (except the scaling parameters, which were scaled to the range of 0 to 1) to compensate for the difference in the order of magnitude between parameters and to improve convergence speed as well as the accuracy of the output. The objective function was evaluated using parallel processing, and the optimizer was iterated until the search step became sufficiently small and convergence to a maximum NMI was achieved. The optimal transformation found was subsequently applied to the cardiac CT volume to create the matching 2D CT planar image. This image can be fused with the echocardiography image to create a composite image for surgical guidance. The flow of the proposed registration algorithm is summarized in Fig. 3.

Fig. 3 Intensity-based spatial registration pipeline
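The derivative-free search can be illustrated with a basic compass search, one of the simplest members of the pattern search family. This sketch is not the MATLAB optimizer used in the study: the function name, the poll pattern (plus and minus one step along each parameter axis), the halving rule and the default tolerances are assumptions chosen for clarity.

```python
def pattern_search(objective, x0, step=0.5, tol=1e-4, max_iter=1000):
    """Maximise `objective` by polling +/- step along each coordinate axis.

    A trial point is accepted as soon as it improves the objective. When a
    full poll yields no improvement, the step is halved. The search stops
    once the step falls below `tol` or `max_iter` polls have been made.
    """
    x = list(x0)
    best = objective(x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] += sign * step
                value = objective(trial)
                if value > best:
                    x, best, improved = trial, value, True
        if not improved:
            step *= 0.5  # unsuccessful poll: refine the pattern
    return x, best
```

In a registration setting, `objective` would map the (rescaled) transformation parameters to the NMI of the resampled CT plane and the echocardiography frames, and `x0` would be the seed position from pre-procedural planning. Because no gradient is evaluated, the same loop works for a histogram-based metric.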

The registration framework was implemented in MATLAB (R2014a, MathWorks, Natick, USA) on a computer with an Intel(R) Xeon(R) CPU E5-2620 @ 2.00 GHz.

2.3 Validation of registration accuracy

The accuracy of our proposed method was compared against gold standard manual registration, in which an expert selected the best matching CT plane by manually manipulating the transformation parameters. The validation was performed on both the short-axis aortic valve planes and the long-axis parasternal planes of 10 patients. The expert contoured the aortic valve annulus and aorta on both the echocardiography images and the best matching CT planes (obtained from manual and automatic registration). Two quantitative validation measures were calculated to assess registration accuracy, namely the Dice Similarity Coefficient (DSC) and the Hausdorff Distance (HD).

DSC measures the spatial overlap between two contoured regions of interest (ROI) in the best matching 2D CT plane and echocardiography frames. This measure is sensitive to both location and size of the contoured ROI [38], and we defined the measure as:

$${\text{DSC}} = \frac{1}{n}\sum\limits_{j = 1}^{n} {\frac{{2\left| {R_{A} \cap R_{{B_{j} }} } \right|}}{{\left| {R_{A} } \right| + \left| {R_{{B_{j} }} } \right|}}}$$
(4)

where R_A and R_Bj are the contoured ROIs in the 2D CT plane and the j-th echocardiography frame, respectively, with the overlap and the region sizes measured as areas (pixel counts), and n is the number of echocardiography frames used in registration. DSC ranges from 0 to 1, indicating no overlap (DSC = 0) and complete congruence (DSC = 1) between two ROIs.
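A minimal sketch of Eq. 4 is given below, assuming each contoured ROI is represented as a collection of (row, column) pixel coordinates lying inside the contour. The function names and the treatment of two empty regions are assumptions made for this illustration.

```python
def dice(region_a, region_b):
    """Dice similarity coefficient of two regions given as pixel coordinates."""
    a, b = set(region_a), set(region_b)
    if not a and not b:
        return 1.0  # both empty: treated as perfect agreement in this sketch
    return 2.0 * len(a & b) / (len(a) + len(b))


def mean_dice(ct_region, echo_regions):
    """Average DSC between one CT ROI and the ROIs of n echo frames (Eq. 4)."""
    return sum(dice(ct_region, r) for r in echo_regions) / len(echo_regions)
```

For example, two regions of two pixels each that share one pixel give DSC = 2 × 1 / (2 + 2) = 0.5.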

HD, by contrast, is a distance measure to estimate the distance between two ROIs by the spatial distance of their contour points. This measure is defined as:

$${\text{HD}}\left( {A,B} \right) = \frac{1}{n}\sum\limits_{j = 1}^{n} {\max \left( {h\left( {A,B_{j} } \right),h\left( {B_{j} ,A} \right)} \right)}$$
(5)
$$h\left( {A,B} \right) = \mathop {\hbox{max} }\limits_{a \in A} \mathop {\hbox{min} }\limits_{b \in B} \left\| {a - b} \right\|$$
(6)

where A and B_j are the sets of contour points in the cardiac CT plane and the j-th echocardiography frame, respectively. In other words, HD measures the mutual proximity of two ROIs as the largest distance from any point on one contour to the closest point on the other [14].
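Eqs. 5 and 6 can be sketched directly from their definitions, assuming each contour is a non-empty list of 2D points expressed in millimetres. The brute-force double loop and the function names are choices made for this illustration only.

```python
import math


def directed_hausdorff(a, b):
    """h(A, B): largest distance from a point of A to its nearest point in B."""
    return max(min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance: max of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def mean_hausdorff(ct_contour, echo_contours):
    """Average HD between one CT contour and n echo contours (Eq. 5)."""
    return sum(hausdorff(ct_contour, c) for c in echo_contours) / len(echo_contours)
```

The directed distance is not symmetric. With A = {(0, 0), (1, 0)} and B = {(0, 0), (3, 0)}, h(A, B) = 1 while h(B, A) = 2, so the Hausdorff distance is 2.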

The DSC and HD are indirect assessments of registration accuracy, which evaluate the degree of area similarity and the shape difference of the aortic valve between the two planes being compared. As the clinical intention is to fuse 2D echocardiography with a 2D interpolated CT plane, 2D measurement of the aortic valve is considered a reasonable basis for assessing the accuracy of the proposed technique.

In addition, the diameter of the aortic valve annulus was measured from both the automatically and the manually registered CT planes. Since the shape of the aortic valve annulus is non-circular, the diameter was defined as the circumcircle diameter of the valve annulus. Agreement between the diameter measurements obtained from the two registration mechanisms was subsequently assessed using a Bland–Altman plot. Statistical differences between echocardiography-automatically registered CT (echo-autoCT) and echocardiography-manually registered CT (echo-manualCT) were also analyzed using Student’s t test with the significance level set at 0.05.
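The quantities underlying a Bland–Altman plot can be computed as in the sketch below. It assumes paired diameter measurements (one automatic and one manual value per patient) and the common convention of placing the limits of agreement at the mean difference ± 1.96 standard deviations; the function name is hypothetical, and the numbers in the example are made up for illustration and are not study data.

```python
from statistics import mean, stdev


def bland_altman(auto, manual):
    """Return (bias, sd, lower limit, upper limit) for paired measurements."""
    diffs = [a - m for a, m in zip(auto, manual)]
    bias = mean(diffs)   # mean difference between the two methods
    sd = stdev(diffs)    # sample standard deviation (n - 1 denominator)
    return bias, sd, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up diameters in mm, for illustration only.
bias, sd, lower, upper = bland_altman([21.0, 23.0, 25.0, 27.0],
                                      [20.0, 24.0, 24.0, 28.0])
```

In the plot, each patient contributes one point whose horizontal coordinate is the mean of the two measurements and whose vertical coordinate is their difference, with horizontal lines drawn at the bias and at the two limits.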

3 Results

Figure 4 depicts the registration results for 3 patients at the short-axis “Mercedes Benz” sign view of the aortic valve. Row 1 shows one of the echocardiography images of the aortic valve ring with its 3 leaflets, extracted from the time series data set at the temporal synchronization step. Row 2 displays the interpolated CT planes resulting from the proposed automatic registration method. The appearance of the aortic valves in these images was found to resemble that in the CT planes manually located by the expert (row 3).

Fig. 4 Registration result on 3 patients. Row 1 shows the echocardiography images extracted from temporal synchronization. The interpolated CT planes resulting from the proposed automatic registration method are shown in row 2. Row 3 depicts the corresponding interpolated CT planes which were manually selected by the expert

The DSC and HD of the common features, including the aortic valve annulus and aorta, as delineated in both the echocardiography and CT planes, are tabulated in Table 1. Comparing echo-autoCT and echo-manualCT in the short-axis “Mercedes Benz” sign views, the DSC values are comparable at 0.81 (±0.08) and 0.79 (±0.09), respectively. A DSC value above 0.7 indicates that the contoured ROIs in both types of images have considerable similarity in area and spatial overlap [37]. Similar HD values were also obtained for the two registration pairs, with 1.30 (±0.13) mm for the former and 1.32 (±0.04) mm for the latter. This registration error corresponds to about 3–4 pixels at the resolution of CT and echocardiography (i.e., 0.4 × 0.4 × 0.4 mm for a CT volume and 0.3 × 0.3 mm for a 2D echocardiography plane), which is acceptable in view of the much larger field of view of both imaging modalities. The performance of the proposed algorithm was found to be consistent for the long-axis parasternal views.

Table 1 Registration accuracy as measured using DSC and HD

The agreement between the automatic registration method and the gold standard manual registration was assessed through Bland–Altman analysis. The Bland–Altman plot in Fig. 5 demonstrates good agreement in the diameter measurements of the aortic annulus obtained from the automatically and manually registered CT planes. While the limits of agreement are about −2.83 to 3.07 mm, the mean difference in diameter measurement was found to be −0.12 ± 1.50 mm, indicating no significant bias between the two methods.

Fig. 5 Bland–Altman plot of aortic annulus diameter measurement obtained from automatically and manually registered CT. The blue line indicates the bias, and the black dotted lines indicate the 95% limits of agreement

Figure 6 shows the fusion of echocardiography and interpolated CT at both the short-axis aortic plane and the long-axis parasternal plane. The aortic valves are well aligned in both views. The expert’s visual inspection of the overlapped aortic valves and neighboring structures indicates that the proposed registration method is able to align the images as desired in clinical applications.

Fig. 6 Fusion of echocardiography frame and cardiac CT at short-axis “Mercedes Benz” sign view (left image) and long-axis parasternal view (right image) of the aortic valve and aorta

There was no significant difference in aortic annulus diameter measurement between the echo-autoCT and echo-manualCT (p > 0.05). The average computation time for a registration was approximately 50 s.

4 Discussion

TAVR and TAVI are important procedures for treating patients with aortic valve disease. Although echocardiography can serve as a noninvasive guidance tool during the TAVR or TAVI procedure, the difficulty of relating its images to their anatomical context, owing to poor image quality, limits its utility for procedural guidance. Registration with high-quality 3D volumes such as CT can improve the interpretability of these images while retaining echocardiography as a flexible real-time imaging modality. In this study, we have presented a novel registration method for information fusion between 2D planar echocardiography and 3D cardiac CT for TAVR and TAVI surgical guidance. Compared to previous studies [12, 13, 18, 30] in which an optical tracking system was employed to provide probe localization, our proposed method has shown practicability in the absence of such tracking technology. This registration aims to establish a direct spatial relationship between the anatomical and functional information from both modalities for rapid and accurate assessment of the aortic valve structure during procedural planning, potentially reducing the risk and morbidity associated with the surgery. Other potential advantages of this multimodality registration method include the assessment of ventricular hypertrophy and of left ventricular motion and thickness, which are not available from either echocardiography or cardiac CT alone.

At present, we have applied the proposed method only to retrospective data. The validation results have nevertheless shown promising accuracy for the application of the proposed technique to intra-procedural navigation during TAVR and TAVI. In addition, the automatically registered CT plane has shown good agreement with the CT plane manually selected by an expert, with no significant visual or quantitative difference in aortic valve appearance and diameter between the respective interpolated planes. With the current prototype framework in MATLAB, without speed optimization, registration requires an average computation time of 50 s. Implementation on another platform such as ITK, used as standalone software, may provide higher image processing speed for real-time application. Furthermore, registration speed may be improved via a pyramid or multi-resolution implementation [31].

Our method requires prior knowledge of the rough position and orientation of the aortic valve plane to initialize the search space. This information is currently estimated by the clinician during treatment planning, before the actual surgical procedure. Inaccurate positioning of the initial seed plane by the physician may jeopardize the accuracy of the final CT plane found through the optimization process. Current results show a small distance between the seed position and the correct position of the aortic valve plane. When tested against a large displacement of a few centimeters, the resulting interpolated CT plane may show a low degree of area similarity (DSC) and a large shape difference (HD) relative to the echocardiography image. In case no prior knowledge is available, a feature-based method [22, 36] may be incorporated into the current framework to provide a rough estimate of the initial valve plane position before fine tuning with intensity-based registration. The inclusion of a feature-based registration method may also improve the overall registration time for practical clinical application.

In the current registration, the heart is assumed to be a rigid body; therefore, a rigid spatial transformation was employed in the registration process. Since the heart is naturally non-rigid and dynamic, deformation inevitably occurs during the pumping cycle, making perfect registration unachievable. Although our current registration based on rigid spatial transformation provides reasonable accuracy, it may be further improved with the use of non-rigid registration techniques [16] to compensate for such deformation. However, this may add complexity to the process and compromise the speed of registration.

Linear interpolation was used in the proposed method to sample a 2D CT plane of the same size and at the same spatial location as the echocardiography image. Since the choice of interpolation method influences the quality of the interpolated image, the accuracy of image registration may be affected to some extent by this choice [33]. In the current study, we did not observe any significant image artifacts within the 2D interpolated CT plane when using linear interpolation. However, possible errors due to this interpolation warrant further investigation, and a better interpolation method may be sought in future studies.

5 Conclusion

The primary objective of this work is to fuse 2D planar echocardiography images with the cardiac CT volume. To the best of our knowledge, this is the first discussion of image registration for the aortic valve region with no optical tracking information provided to aid the process. Instead, prior knowledge of the orientation of the echocardiography image was determined from pre-procedural planning and used to initialize the optimization search volume. Our registration produces an interpolated cross-sectional view from the cardiac CT volume which matches the echocardiography image during the navigation process. The proposed technique could be applied in image guidance and diagnosis for TAVR and TAVI. Results on 10 patients indicate promising accuracy for the application of the proposed technique to intra-procedural navigation during TAVR and TAVI. In future work, the use of a pyramid or multi-resolution image registration approach and GPU processing will be investigated in order to reduce the computation time. A feature-based method could also be incorporated into the current framework to help the physician estimate the initial position of the echocardiography views.