1 Introduction

The diagnosis and follow-up of cardiac diseases require a precise assessment of cardiac morphology and function. Cardiac magnetic resonance imaging modalities, such as Cine or tagged MR, have been shown to provide accurate evaluation of global and regional cardiac function. However, the analysis of cardiac deformation still largely relies on manually tracing the contours in Cine images, which is a time-consuming process. Although tagged MR is considered the gold standard for quantifying local myocardial deformations, its use in clinical practice is held back by the lack of reliable automatic post-processing tools.

Recently, fast automatic or semi-automatic quantification algorithms in 3D were proposed for processing tagged MR [1]. The introduction of these new algorithms requires a solid validation process. One common approach is to use synthetic sequences with known ground truth. In such cases, the exact motion and/or deformation of the myocardium is known and serves as a reference to assess the accuracy of semi- or fully automatic algorithms. The usefulness of such a tool is strongly linked to the degree of realism of the generated sequence.

Several groups have already worked on the in silico generation of synthetic cardiac tagged MR sequences. Crum et al. [2] and Waks et al. [3] were among the first to simulate tagged MR images. Both simulated the tagging pattern by applying a sinusoidal modulation function in the spatial domain. Crum et al. [2] simulated the left ventricle (LV) in short-axis slices and modeled the corresponding anatomy using a simple ring shape. Using a motion field computed directly from a real Cine sequence, the authors then warped the initial simulated image at end-diastole to the rest of the sequence. Later, in [4], Crum et al. improved the generation of tag intensity profiles by using a frequency-domain model. Similarly, Waks et al. [3] used a prolate spheroid to define the LV geometry and a 13-parameter kinematic motion model. The model parameters were determined by a least-squares fit to the displacements of implanted markers tracked from a Cine sequence of a dog heart. Sermesant et al. [5] segmented the myocardium from a real tagged MR image and added tag lines to the binary mask. This image was then warped by cardiac deformations generated by an electro-mechanical (E/M) model. Clarysse et al. [6] warped a real short-axis tagged MR image at end-diastole with a simple kinematic mode-based heart motion model. The use of real images ensures the realism of myocardium/background intensities. However, the motion model is too simplistic to represent the complexity of true heart motion, and the integration of pathological cases is not straightforward.

Figure 1 shows typical simulated images obtained from the work described in [2, 3, 5]. From these images one can see that in all cases only the intensity inside the myocardium is simulated. The absence of any intensity or motion artifact in the background considerably reduces the realism of the synthesized images. Moreover, these methods produce highly contrasted borders between the myocardium and the background, which is not realistic. Finally, it is worth pointing out that all the proposed tagged MR simulators generate synthetic data in 2D; nothing has yet been proposed for 3D.

Fig. 1. Synthetic 2D short-axis tagged MR images presented in [2, 3, 5].

In this study, we propose a pipeline for generating realistic 3D+t tagged MR images. The proposed pipeline is inspired by the work presented in [7] and [8], where realistic ultrasound and cine MR images were simulated. In particular, we propose to combine a template tagging sequence acquired from a volunteer (in order to derive a realistic pixel intensity mapping) and an electro-mechanical (E/M) model [9] (in order to apply realistic cardiac motion and deformation). The template sequence we used comes from 3D CSPAMM acquisitions [12] and consists of three sequences with orthogonal tagging directions.

Our contributions in this paper are two-fold. (1) The reference displacement field involved in the simulation is generated by the E/M model and is therefore not biased toward any motion estimation algorithm. Another benefit of using the E/M model is the possibility of generating a wide range of synthetic deformation fields, from normal to different pathological cases. (2) We make full use of the real tagging sequence in order to simulate realistic intensity information for the myocardium and the surrounding structures. A background with artifacts, or subject to a different motion field than the myocardium, represents a difficult challenge for any tracking algorithm. It is therefore important to have such challenges properly represented in the validation data.

2 Methodology

Figure 2 shows the pipeline of the proposed method. We first segment and track the heart in the template sequence (referred to as the image space hereinafter). Next, we use the E/M model to generate myocardial deformations (referred to as the simulation space hereinafter) corresponding to the heart geometry in the template sequence. In this way, the image and simulation spaces are naturally aligned at the first frame. Further, as described in Sect. 2.3, we define a set of spatio-temporal transformations that establish a direct correspondence between a point in the simulation space and its equivalent in the image space. As a result, each voxel of the simulated sequence can be assigned an intensity value sampled from the real image sequence. In the following, we describe the proposed pipeline in more detail.

Fig. 2. The proposed pipeline for simulating tagged MR sequences. The three tagged MR sequences with line tagging patterns were multiplied for better visualization.

2.1 Segmenting and Tracking the LV in the Template Sequence

The left ventricle (LV) needs to be segmented and tracked for two reasons: (1) the E/M model requires a heart geometry at end-diastole as input; and (2) the heart motion needs to be tracked in order to build the spatio-temporal alignment that is later used to assign voxel intensities.

In the template tagged MR sequence used, tissue and blood are both tagged and cannot be distinguished at the first frame, so we chose to perform the segmentation at the last frame. A bandpass filter introduced in [1] was applied to untag the last frame of the template sequence. We then segmented the LV manually as described in [1]. This yields a surface mesh encompassing both the endo- and epicardium. In order to have a dense representation of the LV myocardium, we further resampled the surface into a volumetric mesh using the methods described in [1].

Finally, this mesh was tracked backward in time in order to obtain the segmented LV shape for each frame in the sequence [1]. In what follows, \(\mathcal {M}_t\) denotes the LV volumetric mesh generated from the template sequence at time t.

2.2 Simulating Heart Deformations by the E/M Model

To run the E/M model [9] for simulating myocardial deformations, a biventricular heart geometry (tetrahedral mesh) with defined LV/RV electrical (activation) and mechanical (fibers, contractilities) properties is required. Instead of redefining all this information on the segmented LV mesh, we opted for mapping a template heart geometry to the image space through a Thin Plate Spline (TPS)-based transformation.

To achieve this goal, the LV AHA segments were defined manually for both the tracked LV mesh \(\mathcal {M}_0\) (Sect. 2.1) and the template geometry. The centers of these 17 AHA segments were then taken as control points to build the TPS transformation from the template geometry to \(\mathcal {M}_0\). This transformation was then applied to both the mesh nodes and the fiber orientation vectors, leading to a well-defined biventricular geometry corresponding to the first frame of the template sequence.

We then simulated myocardial deformations by the E/M model taking the transformed heart geometry as input. The output of the simulation was a sequence of tetrahedral meshes denoted as \(\mathcal {S}_t\).

2.3 Spatio-Temporal Alignment

The simulated heart motion is completely independent of the motion in the template sequence. Since we intend to sample image intensities from the real acquisition, a spatio-temporal alignment of the template sequence and the E/M simulation is required. This enables us to sample intensities from the template image and assign them to voxels located in the simulation space.

Temporal Alignment. We propose to match each time point in the simulation space to a continuous time in the real image space by linearly stretching or shrinking the time axis. Both the template sequence and the E/M simulation consist of one cardiac cycle, but with different numbers of frames (denoted as \(\mathcal {N}_{img}\) and \(\mathcal {N}_{simu}\), respectively). In addition, the end-systolic frame indexes (denoted as \(n^{es}_{img}\) and \(n^{es}_{simu}\), respectively) differ. As a result, we opted for aligning the systolic and diastolic time intervals separately.

We aim to map a discrete time point \(t_{simu}\) of the simulation space to a continuous timing \(t_{img}\) in the real image space. We used linear mappings defined as follows:

$$\begin{aligned} \phi (t)={\left\{ \begin{array}{ll} \frac{n_{img}^{es}}{n_{simu}^{es}}t,&{} \text {if } t \le n^{es}_{simu}\\ \\ \frac{\mathcal {N}_{img}-1-n_{img}^{es}}{\mathcal {N}_{simu}-1-n_{simu}^{es}}(t - n_{simu}^{es})+n_{img}^{es},&{} \text {otherwise } \\ \end{array}\right. } \end{aligned}$$
(1)

Then the non-integer timing value in the image space corresponding to the discrete time instant in the simulation is computed by \(t_{img}=\phi (t_{simu})\).
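As an illustration, the piecewise-linear mapping of Eq. (1) can be sketched in a few lines of Python. The function name, argument names and the frame counts used in the example are hypothetical; they only exercise the mapping and are not the values of the dataset used in this paper.

```python
def phi(t_simu, n_img, n_simu, es_img, es_simu):
    """Map a simulation frame index to a continuous template time (Eq. 1).

    n_img, n_simu   : number of frames in the template / simulated sequence
    es_img, es_simu : end-systolic frame indexes in each sequence
    """
    if t_simu <= es_simu:
        # Systole: stretch [0, es_simu] onto [0, es_img].
        return es_img / es_simu * t_simu
    # Diastole: stretch [es_simu, n_simu - 1] onto [es_img, n_img - 1].
    slope = (n_img - 1 - es_img) / (n_simu - 1 - es_simu)
    return slope * (t_simu - es_simu) + es_img


# Hypothetical frame counts and end-systolic indexes, for illustration only.
N_IMG, N_SIMU, ES_IMG, ES_SIMU = 23, 30, 9, 12
for t in (0, 6, 12, 29):
    print(t, phi(t, N_IMG, N_SIMU, ES_IMG, ES_SIMU))
```

By construction, the first, end-systolic and last simulated frames map to the first, end-systolic and last template frames, while intermediate frames generally fall between two template frames.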

Spatial Alignment. Once the simulation and the real recording are temporally aligned, we need to compute the correspondences between the spatial locations at these two time points.

In this section, we describe how to map a spatial position \(\mathbf {x}_{simu}\) (at time \(t_{simu}\) in the simulation space) to its corresponding position \(\mathbf {x}_{img}\) (at time \(t_{img}\) in the real image space). This can be achieved by chaining two TPS transformations.

First, in the simulation space, we compute a TPS transformation from the simulation mesh \(\mathcal {S}_{t_{simu}}\) to \(\mathcal {S}_0\) (Sect. 2.2). We denote this TPS as \(\mathcal {T}_{\mathcal {S}_{t_{simu}} \rightarrow \mathcal {S}_0}\). Then, in the real recording sequence, we match time 0 and time \(t_{img}\) using the two tracked meshes \(\mathcal {M}_0\) and \(\mathcal {M}_{t_{img}}\) (Sect. 2.1). A second TPS transformation is computed and denoted as \(\mathcal {T}_{\mathcal {M}_0 \rightarrow \mathcal {M}_{t_{img}}}\).

Thanks to these transformations, a point \(\mathbf {x}_{simu}\) in the simulated sequence can then be located in the template image space through the following expression:

$$\begin{aligned} \mathbf {x}_{img}=\mathcal {T}_{\mathcal {M}_0 \rightarrow \mathcal {M}_{t_{img}}} \circ \mathcal {T}_{\mathcal {S}_{t_{simu}} \rightarrow \mathcal {S}_0}(\mathbf {x}_{simu}) \end{aligned}$$
(2)

where \(\mathcal {M}_{t_{img}}\) (i.e. the tracking mesh at \(t_{img}\)) was computed by a linear interpolation between \(\mathcal {M}_{\lfloor {t_{img}}\rfloor }\) and \(\mathcal {M}_{\lceil {t_{img}}\rceil }\) (\(\lfloor {}\rfloor \) and \(\lceil {}\rceil \) are respectively the floor and ceiling operators).
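The chaining in Eq. (2) can be sketched as follows. This is a minimal illustration rather than the implementation used here: it assumes SciPy's `RBFInterpolator` with a thin-plate-spline kernel as the TPS, uses all mesh nodes as control points, and the helper names (`fit_tps`, `interpolate_mesh`, `simu_to_img`) as well as the toy meshes are hypothetical.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator


def fit_tps(src_nodes, dst_nodes, smoothing=0.0):
    """Fit a 3D thin-plate-spline map sending src_nodes (P, 3) onto dst_nodes (P, 3)."""
    return RBFInterpolator(src_nodes, dst_nodes,
                           kernel="thin_plate_spline", smoothing=smoothing)


def interpolate_mesh(meshes, t_img):
    """Linearly interpolate tracked node positions at a non-integer time."""
    lo, hi = int(np.floor(t_img)), int(np.ceil(t_img))
    w = t_img - lo
    return (1.0 - w) * meshes[lo] + w * meshes[hi]


def simu_to_img(x_simu, S_t, S_0, M_0, M_timg, smoothing=0.0):
    """Apply Eq. (2): T_{M_0 -> M_timg} composed with T_{S_t -> S_0}."""
    to_first_frame = fit_tps(S_t, S_0, smoothing)   # simulation space, t -> 0
    to_template = fit_tps(M_0, M_timg, smoothing)   # image space, 0 -> t_img
    return to_template(to_first_frame(x_simu))


# Toy meshes (random nodes, affine motion), for illustration only.
rng = np.random.default_rng(0)
S_0 = rng.uniform(-1.0, 1.0, size=(20, 3))
S_t = 0.9 * S_0 + 0.05                    # simulated mesh at time t_simu
M = [S_0, 1.1 * S_0, 1.2 * S_0]           # tracked template meshes per frame
M_timg = interpolate_mesh(M, 1.5)
x_simu = np.array([[0.1, 0.2, 0.3]])
x_img = simu_to_img(x_simu, S_t, S_0, M[0], M_timg)
print(x_img)
```

A non-zero `smoothing` value plays the role of the spatial regularization of the TPS; with `smoothing=0.0` the control points are interpolated exactly.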

2.4 Image Generation

Our goal is to assign realistic image intensities for voxels located in the simulation space. In this section, we will describe how to assign voxel intensities using the spatio-temporal alignment technique we introduced previously in Sect. 2.3.

First of all, we need to define the voxel positions. Since the simulation and image spaces are naturally aligned at the first frame, as described in Sect. 2.2, we reuse the image grid (origin, spacing, size and axis orientations) of the template recording. This grid defines the voxel positions of the simulated images.

Next, using the temporal alignment, we associated each time frame \(t_{simu}\) in the simulation space to a continuous time \(t_{img}\) in the template sequence. A new image \(\mathcal {I}(t_{img})\) was created by linearly interpolating images of the two closest time frames \(\lfloor {t_{img}}\rfloor \) and \(\lceil {t_{img}}\rceil \).

Finally, using the spatial alignment, each voxel position of frame \(t_{simu}\) in the simulation space was mapped to a spatial location at time \(t_{img}\) in the template space. Spatially interpolating \(\mathcal {I}(t_{img})\) at that position yields the intensity value to be assigned.
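The two interpolation steps above can be sketched as follows. This is only an illustrative sketch under simplifying assumptions: the image grid is taken to be axis-aligned (axis orientations are ignored), `scipy.ndimage.map_coordinates` with `order=1` stands in for the spatial interpolation, and the function names and the toy ramp image are hypothetical.

```python
import numpy as np
from scipy.ndimage import map_coordinates


def blend_frames(frames, t_img):
    """Linearly interpolate the template sequence (T, X, Y, Z) at a non-integer time."""
    lo, hi = int(np.floor(t_img)), int(np.ceil(t_img))
    w = t_img - lo
    return (1.0 - w) * frames[lo] + w * frames[hi]


def sample_intensities(image, points_mm, origin, spacing):
    """Trilinearly sample `image` at physical positions given as an (N, 3) array."""
    idx = (np.asarray(points_mm) - origin) / spacing   # physical -> voxel index
    return map_coordinates(image, idx.T, order=1, mode="nearest")


# Toy template sequence: a linear intensity ramp that brightens over time.
i, j, l = np.meshgrid(np.arange(4), np.arange(5), np.arange(6), indexing="ij")
frames = np.stack([i + 2 * j + 3 * l + 10 * k for k in range(3)]).astype(float)

image_t = blend_frames(frames, 1.5)                    # I(t_img) for t_img = 1.5
x_img = np.array([[1.5, 2.0, 3.0]])                    # mapped positions, in mm
values = sample_intensities(image_t, x_img,
                            origin=np.zeros(3), spacing=np.array([1.0, 1.0, 2.0]))
print(values)
```

In the full pipeline, `x_img` would hold the positions obtained by mapping every voxel of frame \(t_{simu}\) through Eq. (2).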

2.5 Correcting the Intensities of Myocardium

So far, the cardiac motion represented in the simulated images corresponds to the TPS transformation \(\mathcal {T}_{\mathcal {S}_{t_{simu}} \rightarrow \mathcal {S}_0}\) described in Sect. 2.3. This transformation is computed from the simulation meshes \(\mathcal {S}_{t_{simu}}\) and \(\mathcal {S}_{0}\) but differs slightly from the true displacements represented in the simulations, mainly because of the spatial regularization used when computing the TPS. As a result, it is necessary to further correct the intensities of the myocardium so that the motion corresponds exactly to that simulated by the E/M model.

We chose to leave the image at the first frame unchanged and to propagate its myocardial intensities to all the other time instants through the displacements contained in the simulation sequence \(\mathcal {S}_t\).

For each myocardial voxel \(\mathbf {x}^t\) at time t, we first find the tetrahedral cell of the simulation mesh \(\mathcal {S}_t\) that contains \(\mathbf {x}^t\), and compute the barycentric coordinates of \(\mathbf {x}^t\) in that cell. Since all the tetrahedra are indexed, we can find the tetrahedron with the same index at the first frame. By combining the node positions of this tetrahedron at time 0 with the previously evaluated barycentric coordinates of \(\mathbf {x}^t\), we can compute the voxel’s corresponding position at the first frame, denoted as \(\mathbf {x}^0\). Finally, since the simulation and image spaces are naturally aligned at the first frame, we compute the voxel intensity by linearly interpolating the first-frame image at position \(\mathbf {x}^0\).
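The barycentric mapping described above can be sketched as follows. This is a minimal NumPy illustration with hypothetical function names; it uses a brute-force search over the cells, whereas a practical implementation would rely on a spatial locator to find the containing tetrahedron.

```python
import numpy as np


def barycentric_coords(x, tet):
    """Barycentric coordinates of point x in a tetrahedron given as a (4, 3) array."""
    # Solve [v1 - v0, v2 - v0, v3 - v0] @ b = x - v0 for the last three weights.
    T = (tet[1:] - tet[0]).T
    b = np.linalg.solve(T, x - tet[0])
    return np.concatenate(([1.0 - b.sum()], b))


def map_to_first_frame(x_t, nodes_t, nodes_0, cells):
    """Return x^0 for a point x^t, or None if x^t lies outside the mesh."""
    for cell in cells:                      # brute-force search over all cells
        w = barycentric_coords(x_t, nodes_t[cell])
        if np.all(w >= -1e-9):              # inside or on the boundary of this cell
            return w @ nodes_0[cell]        # same cell index, first-frame node positions
    return None


# Toy mesh with a single tetrahedron, for illustration only.
nodes_0 = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
nodes_t = 2.0 * nodes_0 + np.array([1.0, 0.0, 0.0])   # deformed configuration
cells = np.array([[0, 1, 2, 3]])

x0 = map_to_first_frame(np.array([1.5, 0.5, 0.5]), nodes_t, nodes_0, cells)
print(x0)
```

The intensity of the voxel at \(\mathbf {x}^t\) is then obtained by interpolating the first-frame image at the returned position.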

After correcting the intensities of all myocardial voxels in this way, the myocardial motion underlying the synthetic images corresponds to the E/M model. It is therefore legitimate to compare the displacement fields estimated by cardiac motion tracking algorithms against the ground truth generated by the E/M model.

3 Results

3.1 Simulation of a Normal Heart

A healthy volunteer dataset from [11] was used as the template sequence. The tagged MR images simulated for a normal heart are shown in Fig. 3. Three sequences with orthogonal line tagging directions were generated with a tag line spacing of 7 mm. Each sequence consists of 17 slices. The inter-slice thickness is 7.71 mm, and the in-plane pixel resolution is \(0.96\,\mathrm {mm} \times 0.96\,\mathrm {mm}\).

Compared to the previous work shown in Fig. 1, the images simulated by the proposed method show more realistic surrounding tissue intensities instead of an entirely black background. Moreover, the appearance of the myocardium is less binary-like. To facilitate the use of our simulated images for benchmarking, we have made this synthetic dataset (both the images and the ground truth meshes) publicly available to the research community at http://bit.ly/1nnEIMl.

Fig. 3. Synthetic tagged MR images for a normal heart using the proposed pipeline. The intensities of the three line tagging sequences were multiplied for better visualization.

3.2 Evaluation of State-of-the-Art Algorithms

In this section, we illustrate the value of the simulated dataset for evaluating the performance of two recent cardiac motion tracking algorithms: HarpAR [1] and Sparse Demons (with default parameters) [10]. They were compared using the synthesized tagging sequence described in Sect. 3.1.

The LV of the first simulation mesh was tracked over the cardiac cycle by both methods. The tracking results were compared against the ground truth (i.e. the E/M simulations). Since the motion error reaches its maximum at end-systole, we compare the two methods at that time point. In Fig. 4, we can observe that HarpAR yielded a smaller median and variance of the tracking errors. This result is further confirmed by two statistical tests. Levene’s test returned a p-value below 0.05, rejecting the hypothesis that the variances are equal. We also applied the Wilcoxon signed-rank test to check whether the median values are equal. The returned p-value is below 0.05, rejecting the hypothesis that the median values are equal. The results of the statistical tests are consistent with what we observe in Fig. 4. Note that our aim here is to show the possibility of using the simulated dataset for benchmarking different algorithms, rather than to determine which one is superior, especially given that a thorough parameter tuning remains to be done for Sparse Demons.
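A comparison of this kind can be sketched with SciPy as follows. The sketch assumes that the tracking error of each mesh node is the Euclidean distance to its ground-truth position and that the errors of the two methods are paired node by node; the function names are hypothetical, and the arrays are random placeholders, not the results reported in this paper.

```python
import numpy as np
from scipy.stats import levene, wilcoxon


def node_errors(tracked, truth):
    """Per-node Euclidean distance between tracked and ground-truth positions."""
    return np.linalg.norm(tracked - truth, axis=1)


def compare_methods(err_a, err_b):
    """Return the p-values of Levene's test and of the Wilcoxon signed-rank test."""
    p_var = levene(err_a, err_b).pvalue      # equality of variances
    p_med = wilcoxon(err_a, err_b).pvalue    # paired test on per-node errors
    return p_var, p_med


# Random placeholder meshes, NOT the results reported in the paper.
rng = np.random.default_rng(0)
truth = rng.uniform(0.0, 50.0, size=(200, 3))
tracked_a = truth + rng.normal(0.0, 0.5, size=truth.shape)
tracked_b = truth + rng.normal(0.0, 2.0, size=truth.shape)

p_var, p_med = compare_methods(node_errors(tracked_a, truth),
                               node_errors(tracked_b, truth))
print(p_var, p_med)
```

A p-value below the chosen significance level (0.05 here) leads to rejecting the corresponding null hypothesis.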

Fig. 4. Comparison of HarpAR [1] and Sparse Demons [10] with respect to end-systolic tracking errors using the synthesized 3D+t tagged MR sequence (each data point represents the motion error of one mesh node).

4 Discussion

The proposed pipeline has several limitations. First, since the myocardial intensities are corrected a posteriori as described in Sect. 2.5, the transitions between the myocardium and the surrounding structures are not perfectly smooth, as revealed by a careful visual inspection of the output image. This is because the warping fields applied to the myocardium and to the surrounding tissues are different: the myocardial motion corresponds to the E/M simulations, while the warping applied to the other parts comes from the TPS transformation. Although the two warping fields are close, the difference results in intensity inconsistencies that are visible to the human eye, as shown in Fig. 3. This somewhat reduces the realism of the simulated images and should be specifically addressed in the future.

Second, as described in Sect. 2.5, the intensities of the myocardium are all assigned from the first frame, meaning that no tag fading over the cardiac cycle is simulated. Tag fading has always been one of the key issues in tagged MR. However, in CSPAMM acquisitions, tag fading effects are much less apparent after the subtraction of two SPAMM acquisitions with inverse tagging preparation patterns [12]. Indeed, in the kind of CSPAMM images we aim to simulate, tag fading is rather limited, as can be seen in [13]. The absence of tag fading in the myocardium may therefore decrease the realism of the simulation, but not to a significant extent. Nonetheless, we expect to integrate tag fading into the pipeline, which also remains part of future work.

A final aspect to be improved is the MR simulator. The proposed simulation pipeline is principally based on warping image intensities. A physical MR simulator taking tissue-specific properties (T1, T2 and proton density) as inputs would be more principled and might yield better results. In the future, we wish to integrate an MR simulator based on solving the Bloch equations, which we think would perform better than simply warping the image intensities.

5 Conclusion

In this paper, we proposed a novel pipeline for synthesizing realistic 3D+t cardiac tagged MR images. To achieve this goal, we combined an electro-mechanical model and a template tagged MR sequence of a healthy volunteer. The E/M model was used for simulating heart motion and the template sequence was used for sampling realistic image intensities. A spatio-temporal alignment technique was applied to map between the simulation and image spaces. One major advantage brought by the E/M model is that the heart deformation results solely from the model and is thus not biased toward any motion tracking algorithm.

As a preliminary result, we generated a synthetic dataset of a normal heart, although the E/M model used is quite flexible and can generate a range of heart deformations corresponding to both normal and pathological cases. In the future, we intend to extend the simulation to pathologies of different severity. The current work merely aims to show the feasibility of combining the E/M model and a template sequence for simulating CSPAMM images with a relatively good level of realism. Moreover, we illustrated in this paper the value of our simulated dataset for comparing the performance of two recent cardiac motion tracking algorithms (HarpAR and Sparse Demons). This comparison can easily be extended to more algorithms and can be made more thorough by including additional measurements such as myocardial strains. All of these are left to future work.