Introduction

Diffeomorphic image registration computes a smooth and invertible deformation field and thus ensures that salient image features are not lost when an image is resampled with the obtained deformation field. A key step in many clinical applications, diffeomorphic image registration can be employed in quantifying inter-subject variability of the brain [1], studying Alzheimer’s disease [2], statistical shape analysis [3], brain atlas construction [4], and estimation of tissue deformation for surgery [5].

Several studies have proposed diffeomorphic algorithms to perform intra-modal/contrast image registration. Beg et al. [6] implemented the Large Deformation Diffeomorphic Metric Mapping (LDDMM) to register brain MRIs of Alzheimer’s and schizophrenia patients, but its computational cost was high. Later, many algorithms were proposed to make the computation more efficient. Vialard et al. [7] shortened the computational time by employing geodesic shooting to register 3D MRI scans of fetal brains. Zhang et al. [8] proposed Fourier-Approximated Lie Algebras (FLASH) to perform inter-subject registration of 3D brain MRIs. Similar to [7], they employed geodesic shooting, and they improved the efficiency by performing the calculations in a band-limited space. Wu et al. [9] implemented cross-correlation (CC)-based LDDMM for fast brain image registration via GPU acceleration.

In general, performing diffeomorphic image registration with iterative optimization can be computationally expensive and time-consuming. Therefore, a number of deep learning (DL)-based algorithms have been designed to tackle this problem [10,11,12]. However, a comparison across multiple registration tasks in [13] suggests that, compared with DL-based techniques, classic registration methods still perform well and can offer satisfactory speed with the option of parallel computing.

In the last decade, several groups have designed inter-modal diffeomorphic image registration techniques for various applications. Mitra et al. [14] proposed an inter-modal diffeomorphic algorithm to register 2D transrectal ultrasound images to MR slices. Kutten et al. [15] implemented mutual information (MI)-based LDDMM in a Hamiltonian framework to register CLARITY images. Reaungamornrat et al. [16] proposed MIND Demons, which is based on SyN [17], diffeomorphic Demons [18], and MIND features [19], to perform deformable MRI-CT registration for image-guided surgery. However, inter-modal registration remains a challenging task in medical image registration. In general, the algorithms should show a certain degree of robustness against intensity inhomogeneities, noise, and image artifacts. Moreover, the algorithms should be time-efficient for real clinical applications. To address some of these requirements, Rivaz et al. [20] proposed RaPTOR to register 3D inter-modal images of the BITE database [21]. Later in [22], an affine version of RaPTOR was used to successfully register inter-modal images of the RESECT [23] and BITE [21] databases. Recently in [24], a rigid version of RaPTOR was employed to register preoperative CT and intraoperative US images of lumbar vertebrae.

This study aims to design a diffeomorphic algorithm that performs both intra- and inter-modal image registration. In [20, 22, 24], it was shown that RaPTOR could successfully align images of different modalities. In [8], it was shown that FLASH could perform diffeomorphic registration more efficiently than vector momentum LDDMM [25]. However, RaPTOR and FLASH have the following drawbacks. First, RaPTOR uses B-splines as the transformation model, which does not guarantee a smooth inverse transformation. Second, FLASH uses the sum of squared differences (SSD), which is unable to directly measure the similarity between images of different modalities and contrasts [26]. Therefore, FLASH cannot be used to perform inter-modal/contrast image registration. Third, FLASH does not use multiresolution image pyramids to tackle larger deformations, which is a standard approach in many inter-modal image registration methods. Herein, we propose DiffeoRaptor, a novel algorithm that brings together the benefits of RaPTOR and FLASH while mitigating their drawbacks. We chose to build on the RaPTOR similarity metric by embedding it in a diffeomorphic framework. Other excellent choices of metric are normalized gradient fields (NGF) and MIND. The FLASH framework was selected over other diffeomorphic approaches because it is based on the well-established LDDMM framework. The performance of DiffeoRaptor is demonstrated in three applications: (1) healthy individual MRI-to-template registration; (2) registration between Alzheimer’s disease (AD) and healthy brains, as well as between brain scans at different stages of AD; and (3) nonlinear registration of MR and CT abdominal data. The contributions of this work are threefold:

  1. Proposing a diffeomorphic image registration framework using RaPTOR.

  2. Devising inter-modal/contrast image registration with geodesic shooting in the bandlimited space of velocity fields.

  3. Employing gradient descent (GD) with momentum to improve the convergence in contrast with classical GD optimization in FLASH and RaPTOR.

Our results show that DiffeoRaptor could achieve (1) better alignment of brain and abdominal images compared to Mattes MI+SyN, NiftyReg [27], and FLASH as assessed by Dice scores; (2) smoother deformation fields compared to Mattes MI+SyN and NiftyReg in the alignment of brain MR images, and (3) comparable computation time with FLASH while performing more challenging tasks.

Methodology

In this section, background on the bandlimited space of velocity fields, bandlimited geodesic shooting, and the formulation of the RaPTOR metric is presented. Then, the DiffeoRaptor objective function is derived. Lastly, the optimization technique used to minimize the objective function is detailed.

Space of bandlimited velocity fields

In pairwise diffeomorphic image registration, the reference image \(X\in \varOmega \) and the source image \(Y\in \varOmega \) are given. Ideally, the objective is to find a mapping \(\phi \in \mathrm{Diff}(\Omega )\) such that \(X\circ \phi \approx Y\) and \(Y\circ \phi ^{-1}\approx X\). A diffeomorphism \(\phi :\varOmega \rightarrow \varOmega \) is a smooth mapping that has a smooth inverse \(\phi ^{-1}\). The tangent space at the identity \(\mathrm{id}\in \mathrm{Diff}(\Omega )\) of the space of diffeomorphisms is defined as \(V=T_{\mathrm{id}}\mathrm{Diff}(\Omega )\). Given V, the space of bandlimited velocity fields \({\widetilde{V}}\) was constructed and a proper Lie algebra on this space was defined in [8]. A time series of diffeomorphisms \(\phi _t\in \mathrm{Diff}(\Omega )\), \(t\in \left[ 0, 1\right] \), is created by solving an ordinary differential equation (ODE). The time series of bandlimited velocity fields \({\tilde{v}}_t\in {\widetilde{V}}\) is related to \(\phi ^{-1}_t\) by Eq. (1).

$$\begin{aligned} \frac{d\phi ^{-1}_t}{dt}=-D\phi ^{-1}_t\cdot \iota \left( {\tilde{v}}_t\right) \end{aligned}$$
(1)

where D is the derivative operator and \(\iota :{\widetilde{V}}\rightarrow V\) is the inverse Fourier transform from the bandlimited space to the space of dense velocity fields [8]. Geodesic shooting is the process of integrating the geodesic path of diffeomorphisms forward in time; the path is uniquely determined by the initial velocity \({\tilde{v}}_0\) at \(t=0\). The geodesic evolution equation in the discrete Fourier space is defined in Eq. (2).

$$\begin{aligned} \frac{\partial {\tilde{v}}_t}{\partial t}=-{\widetilde{K}}\bigg [({\widetilde{D}}{\tilde{v}}_t)^T\star {\tilde{m}}_t+{\widetilde{\varGamma }}({\tilde{m}}_t\otimes {\tilde{v}}_t)\bigg ] \end{aligned}$$
(2)

where \({\widetilde{K}}\) is the bandlimited representation [8] of the smoothing operator K, which is the inverse of the differential operator L. There is an in-depth discussion of possible choices of L in [6, 28, 29]. In this paper, we set \(L=(-\alpha \varDelta +I)^c\), similar to [6, 8], where \(\varDelta \) is the Laplacian operator and I is the identity. Furthermore, \(\star \) is the truncated auto-correlation, \({\widetilde{\varGamma }}\) is the discrete divergence, \({\tilde{m}}_t={\widetilde{L}}{\tilde{v}}_t\) is the momentum, \({\widetilde{L}}\) is the representation of L in the frequency domain, \(\otimes \) denotes the tensor product, and \({\widetilde{D}}\) is an operator that computes the spatial gradient in the bandlimited Fourier space [8].
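To make the operator \(L=(-\alpha \varDelta +I)^c\) concrete, the following minimal sketch builds its Fourier-domain symbol on a regular grid, assuming a standard finite-difference discretization of the Laplacian. The function name `fourier_operator_L` and the default values of `alpha` and `c` are hypothetical choices for illustration and are not taken from this paper; the smoothing operator is then the pointwise reciprocal of the result.

```python
import numpy as np

def fourier_operator_L(shape, alpha=3.0, c=3):
    """Fourier symbol of L = (-alpha * Laplacian + I)^c on a periodic grid.

    Assumption: the Laplacian is the usual second-order finite difference,
    whose symbol along one axis is -(2 - 2*cos(2*pi*k)) for a frequency k
    given in cycles per sample. alpha and c are illustrative defaults.
    """
    freqs = [np.fft.fftfreq(n) for n in shape]          # cycles per sample
    grids = np.meshgrid(*freqs, indexing="ij")
    # Symbol of the negative Laplacian (non-negative everywhere).
    neg_laplacian = sum(2.0 - 2.0 * np.cos(2.0 * np.pi * k) for k in grids)
    return (alpha * neg_laplacian + 1.0) ** c

# L acts by pointwise multiplication in the Fourier domain, so the
# smoothing operator K = L^{-1} is simply the reciprocal of the symbol.
L_hat = fourier_operator_L((8, 8, 8))
K_hat = 1.0 / L_hat
```

Because the symbol grows quickly with frequency, \({\widetilde{K}}\) strongly damps high frequencies, which is consistent with restricting the velocity fields to a low-frequency band.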

Geodesic shooting in the bandlimited space

With geodesic shooting imposed as a constraint on the cost function, the velocity fields \({\tilde{v}}_t\) and diffeomorphisms \(\phi _t\) need not be optimized on a dense time grid; it suffices to estimate the initial velocity \({\tilde{v}}_0\in {\widetilde{V}}\). The cost function for FLASH was defined as in Eq. (3).

$$\begin{aligned} E({\tilde{v}}_0)=\frac{1}{2\sigma ^2}\big \Vert Y\circ \phi ^{-1}_1-X\big \Vert ^2+\langle {\widetilde{L}}{\tilde{v}}_0,{\tilde{v}}_0\rangle ,\quad \mathrm{s.t.} \,\mathrm{Eq. } (2) \end{aligned}$$
(3)

where \(\sigma ^2\) is the noise variance, \(\Vert \cdot \Vert \) is the norm operator in the space \(\varOmega \), \({\widetilde{L}}\) is the inverse of \({\widetilde{K}}\), and \(\langle ,\rangle \) is the inner product in the space \({\widetilde{V}}\) [8]. To minimize the cost, the gradient of the energy function E can be calculated as in Eq. (4).

$$\begin{aligned} \nabla _{{\tilde{v}}_1}E=\nu \bigg (-K\bigg (\frac{1}{\sigma ^2}(Y\circ \phi ^{-1}_1-X)\cdot \nabla (Y\circ \phi ^{-1}_1)\bigg )\bigg ) \end{aligned}$$
(4)

where \(\nu :V\rightarrow {\widetilde{V}}\) is the projection mapping to the bandlimited space of velocity fields and K is the smoothing operator.
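As an illustration of the SSD data term in Eq. (3) and the force inside the parentheses of Eq. (4), the sketch below evaluates both for an image that has already been warped to \(t=1\). It is a simplified example under stated assumptions: the function name `ssd_energy_and_force` is hypothetical, spatial gradients are approximated with `numpy.gradient`, unit voxel spacing is assumed, and the smoothing by K and the projection \(\nu \) to the bandlimited space are deliberately left out.

```python
import numpy as np

def ssd_energy_and_force(warped, reference, sigma=1.0):
    """SSD data term and its pointwise image force (2D or 3D images).

    warped    : source image already resampled at t = 1 (Y o phi_1^{-1})
    reference : reference image X, same shape as warped
    sigma     : noise standard deviation (sigma**2 is the noise variance)
    """
    warped = np.asarray(warped, dtype=float)
    reference = np.asarray(reference, dtype=float)
    residual = warped - reference
    energy = 0.5 * np.sum(residual ** 2) / sigma ** 2
    # Spatial gradient of the warped image, one component per image axis.
    grad = np.stack(np.gradient(warped), axis=0)
    # Residual times image gradient, scaled by 1 / sigma^2.
    force = residual[None, ...] * grad / sigma ** 2
    return energy, force
```

When the warped source equals the reference, both the energy and the force vanish, which is the fixed point that the SSD term drives toward and also the reason it is unsuitable when the two images have different contrasts.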

RaPTOR

One possible choice for the similarity metric is the Correlation Ratio (CR) [30]. For challenging inter-modal image registration tasks, the calculation of CR needs to be robust and, ideally, time-efficient. RaPTOR is a dissimilarity metric that is based on CR [20] and addresses the shortcomings of CR [30]. RaPTOR and its derivative can be calculated as in Eq. (5). It calculates CR in local patches \(\varTheta \). Instead of calculating the iso-sets of X, the histogram of X over \(N_b\) bins is calculated, and then Parzen windowing is applied to make the binning continuous and differentiable.

$$\begin{aligned} 1-\eta (Y|X)&=\frac{1}{N\sigma ^2}\bigg (\sum _{i=1}^{N}y^{2}_i - \sum _{j=1}^{N_b}N_j\mu _j^{2}\bigg ) \end{aligned}$$
(5a)
$$\begin{aligned} \mu _j&=\frac{\sum _{i=1}^{N}\lambda _{ij}y_i}{N_j}, N_j=\sum _{i}\lambda _{ij} \end{aligned}$$
(5b)
$$\begin{aligned} \mathrm {RaPTOR}(Y,X)&=\varPsi (Y,X)=\frac{1}{N_p}\sum _{i=1}^{N_p}(1-\eta (Y|X;\varTheta _i)) \end{aligned}$$
(5c)
$$\begin{aligned} \nabla _{\varphi }\varPsi&=\frac{\partial \varPsi }{\partial \varphi }=\frac{\partial \phi }{\partial \varphi }\cdot \frac{\partial Y}{\partial \phi }\cdot \frac{\partial \varPsi }{\partial Y} \end{aligned}$$
(5d)
$$\begin{aligned} \frac{\partial (1-\eta )}{\partial y_i}&=\frac{2}{N\sigma ^2}\bigg (y_i-\lambda _{i,j-1}\mu _{j-1}-\lambda _{ij}\mu _j\nonumber \\&\quad -\frac{1}{(N-1)\sigma ^2}(y_i-\mu )\bigg (\sum _{a=1}^{N}y_a^{2}-\sum _{c=1}^{N_b}N_c\mu _c^{2}\bigg )\bigg ) \end{aligned}$$
(5e)

where N is the number of pixels in an image patch \(\varTheta _i\), \(\sigma ^2=\mathrm {Var}[Y;\varTheta _i]\) is the variance of patch i in Y, \(y_i\) is the intensity of sample i in image Y, \(N_p\) is the number of patches, \(\varphi \) is the parameter of the transformation \(\phi \), and \(\mu =E[Y]\) is the average value of Y. Let j and \(j-1\) be the closest bins to sample \(x_i\) (the intensity of sample i in X); then \(\lambda _{ij}\) is the linear contribution of \(x_i\) to bin j, determined by the distance of \(x_i\) to the centers of these two bins. \(\eta (Y|X)\) measures the functional dependence between the input images: \(\eta (Y|X)=0\) when there is no functional dependence, and \(\eta (Y|X)=1\) when there is a deterministic relationship between X and Y. Calculating the gradient of RaPTOR analytically enables efficient minimization of the dissimilarity metric using gradient-based optimization together with the outlier suppression technique elaborated in [20].
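The patch-wise quantity in Eqs. (5a) to (5c) can be sketched as follows. This is a minimal illustration rather than the reference implementation: the function names `one_minus_eta` and `raptor`, the placement of bin centers uniformly between the minimum and maximum of the patch of X, and the value returned for degenerate (constant) patches are all assumptions made for the example. Outlier suppression [20] and the derivative in Eq. (5e) are omitted.

```python
import numpy as np

def one_minus_eta(y, x, n_bins=32, eps=1e-12):
    """1 - eta(Y|X) for one patch, Eqs. (5a)-(5b), with linear soft binning.

    Requires n_bins >= 2. Assumption: bin centers are spread uniformly over
    [min(x), max(x)] of the patch.
    """
    y = np.asarray(y, dtype=float).ravel()
    x = np.asarray(x, dtype=float).ravel()
    n = y.size
    var_y = y.var()                      # sigma^2 = Var[Y; patch]
    x_range = x.max() - x.min()
    if var_y < eps or x_range < eps:
        return 1.0                       # assumption: flat patch is uninformative
    # Continuous bin coordinate of each x_i, in [0, n_bins - 1].
    pos = (x - x.min()) / x_range * (n_bins - 1)
    lower = np.minimum(np.floor(pos).astype(int), n_bins - 2)
    w_upper = pos - lower                # contribution to the upper bin
    lam = np.zeros((n, n_bins))          # lambda_ij
    idx = np.arange(n)
    lam[idx, lower] = 1.0 - w_upper
    lam[idx, lower + 1] = w_upper
    n_j = lam.sum(axis=0)                # N_j
    s_j = lam.T @ y
    mu_j = np.divide(s_j, n_j, out=np.zeros_like(s_j), where=n_j > eps)
    return float((np.sum(y ** 2) - np.sum(n_j * mu_j ** 2)) / (n * var_y))

def raptor(patch_pairs, n_bins=32):
    """Eq. (5c): average of 1 - eta over (y_patch, x_patch) pairs."""
    return float(np.mean([one_minus_eta(y, x, n_bins) for y, x in patch_pairs]))
```

For example, with two bins, the patch `y = [3, 3, 7, 7]`, `x = [0, 0, 1, 1]` is a deterministic function of X and gives a dissimilarity of 0, whereas `y = [3, 7, 3, 7]` with the same `x` has identical bin means and gives 1.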

DiffeoRaptor

The energy function in Eq. (3) can be generalized to the form in Eq. (6).

$$\begin{aligned} E({\tilde{v}}_0)=\mathrm {dist}\big (Y\circ \phi ^{-1}_1,X\big )+\langle {\widetilde{L}}{\tilde{v}}_0,{\tilde{v}}_0\rangle ,\quad \mathrm{s.t. }\,\mathrm{Eq. }(2) \end{aligned}$$
(6)

where \(\mathrm{dist}(,)\) is a normalized distance function or a dissimilarity function. DiffeoRaptor is the cost function in the form of Eq. (6) with RaPTOR, as defined in Eq. (5), as the dissimilarity function. It therefore takes the form in Eq. (7).

$$\begin{aligned} E({\tilde{v}}_0)=\varPsi \big (Y\circ \phi ^{-1}_1,X\big )+\langle {\widetilde{L}}{\tilde{v}}_0,{\tilde{v}}_0\rangle ,\quad \mathrm{s.t. }\,\mathrm{Eq. }(2) \end{aligned}$$
(7)

Equation (4) is no longer valid for Eq. (7), and the gradient of the cost function needs to be recalculated for the optimization. An approach similar to [6] is taken to calculate \(\partial _u\varPsi \), the variation in the cost in Eq. (7) with respect to the velocity \(u=D\phi ^{-1}_1\), which is obtained by taking the derivative of \(\phi ^{-1}_1\).

Given the fact that we are working with image intensities in a grid according to Eq. (5), the variation in energy \(\partial _uE\) takes the form \(\partial _uE=\langle \nabla _u E,u\rangle _{V_g}\), and therefore, \(\partial _u\varPsi =\langle \nabla _u \varPsi ,u\rangle _{V_g}\). The inner-product \(\langle ,\rangle _{V_g}\) calculation is over a finite grid (\(V_g\) is the space of velocities where the inner-product \(\langle ,\rangle _{V_g}\) is taken). To calculate the Gateaux derivative of cost in Eq. (7), one is required to derive \(\partial _u\varPsi \) first as in Eq. (8).

$$\begin{aligned} \partial _u\varPsi =\left\langle \frac{\partial \varPsi }{\partial Y}\cdot \nabla (Y\circ \phi ^{-1}_1),u\right\rangle _{V_g} \end{aligned}$$
(8)

Detailed derivation of Eq. (8) is presented in Sect. S5 of Supplementary Material. Equation (8) indicates that \(\nabla _{u}\varPsi =\frac{\partial \varPsi }{\partial Y}\cdot \nabla (Y\circ \phi ^{-1}_1)\) which is known and can be calculated using Eq. (5e). By similar calculations to [6] and [8], the gradient of cost can be written as Eq. (9).

$$\begin{aligned} \nabla _{{\tilde{v}}_1}E=\nu \bigg (-K\bigg (\frac{\partial \varPsi }{\partial Y}\cdot \nabla (Y\circ \phi ^{-1}_1)\bigg )\bigg ) \end{aligned}$$
(9)

The gradient in Eq. (9) for the velocity at \(t=1\) can be used to find the gradient \(\nabla _{{\tilde{v}}_0}E\) at \(t=0\) with the reduced adjoint Jacobi field in bandlimited velocity fields elaborated in [8]. This process is called backward integration. To minimize the cost in Eq. (7), forward integration of Eq. (2) is used to find the velocity at \(t=1\). Then, \(\nabla _{{\tilde{v}}_0}E\) is used in GD with momentum optimization to update the initial velocity. Finally, Eq. (1) is used to calculate the diffeomorphisms. Since a similar process was used in [8] to calculate diffeomorphisms, the diffeomorphic property of the registration is preserved. The use of the multi-resolution pyramid, gradient descent with momentum, and implementation details of DiffeoRaptor can be found in the Supplementary Materials.
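The update step of this procedure can be illustrated with a generic gradient descent with momentum loop. The sketch below uses the classical heavy-ball form, which is an assumption: the exact variant and the hyperparameters used in DiffeoRaptor are described in the Supplementary Materials and are not reproduced here. The callable `grad_fn` is a hypothetical stand-in for the full chain of forward shooting with Eq. (2), evaluation of Eq. (9), and backward integration to \(t=0\), and the velocity is treated as a real-valued array for simplicity.

```python
import numpy as np

def gd_with_momentum(grad_fn, v0, step_size=0.1, beta=0.9, n_iters=500):
    """Heavy-ball gradient descent on an initial velocity v0.

    grad_fn   : hypothetical callable returning the gradient of E at v
    step_size : gradient step (illustrative value)
    beta      : momentum coefficient (illustrative value)
    """
    v = np.array(v0, dtype=float)
    m = np.zeros_like(v)                 # accumulated search direction
    for _ in range(n_iters):
        g = grad_fn(v)
        m = beta * m + g                 # blend previous direction with new gradient
        v = v - step_size * m            # update the initial velocity
    return v

# Toy usage: minimize E(v) = 0.5 * ||v||^2, whose gradient is v itself.
v_opt = gd_with_momentum(lambda v: v, [1.0, -2.0])
```

Setting `beta=0` recovers classical GD, which makes the two optimizers easy to compare on the same cost.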

Experiments and results

DiffeoRaptor was validated on three public datasets: IXI (http://brain-development.org/ixi-dataset), OASIS3 [31], and The Cancer Imaging Archive (TCIA) MR-CT abdominal data [32]. It is compared against Mattes MI+SyN, which is available in Advanced Normalization Tools (ANTs) [33], and against NiftyReg [27] with normalized mutual information (NMI) as the similarity metric; in several tasks, it is also compared against FLASH. Dice scores of overlapping regions are used as evaluation metrics. For NiftyReg, the default parameters with GD optimization produced the best results in our experiments. Mattes MI+SyN is a diffeomorphic algorithm that uses Mattes MI as the similarity metric and models the deformation fields with SyN, and it is suitable for inter-modal/contrast image registration. The parameters for Mattes MI+SyN were tuned to produce the best results. The number of bins for MI was set to 32, and the gradient step, the update field variance, and the total field variance for SyN were set to 0.5, 3, and 0.5, respectively.
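For reference, the Dice score between two label maps can be computed as in the sketch below. The function name `dice_score` and the convention of returning NaN when a label is absent from both segmentations are illustrative assumptions, not details taken from the evaluation pipeline of this study.

```python
import numpy as np

def dice_score(seg_a, seg_b, label):
    """Dice overlap 2|A and B| / (|A| + |B|) for one label in two label maps."""
    a = np.asarray(seg_a) == label
    b = np.asarray(seg_b) == label
    denom = a.sum() + b.sum()
    if denom == 0:
        return float("nan")              # assumption: label absent in both maps
    return 2.0 * np.logical_and(a, b).sum() / denom

# Toy usage on two flattened label maps.
reference_labels = np.array([1, 1, 0, 0])
warped_labels = np.array([1, 0, 0, 0])
score = dice_score(reference_labels, warped_labels, label=1)   # 2*1 / (2+1)
```

A score of 1 indicates perfect overlap of the structure after registration and 0 indicates no overlap.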

Pre-processing of brain MRI

Table 1 Abbreviation of subcortical structures which were automatically labeled in the segmentation of brain volumes using volBrain [34]
Fig. 1 Coronal view of two slices (rows) of four different IXI dataset subjects (columns). The images are overlaid by the segmentation of CSF, GM, and WM. The large variability of structures across subjects requires a deformable registration

Brain MR images of the IXI and OASIS3 datasets were first skull-stripped using nonlocal intracranial cavity extraction [35]. For each case, the extracted brain was carefully inspected. Then, two types of segmentation were generated for each volume using the volBrain algorithm [34] so that Dice scores could be used to evaluate registration accuracy. In the first type, brain tissues are classified into Cerebrospinal Fluid (CSF), Gray Matter (GM), and White Matter (WM). The second type consists of 16 subcortical structures, whose abbreviations are listed in Table 1. Lastly, the volumes were affinely registered using ANTs with Mattes MI as the metric (see Fig. 1).

IXI dataset: inter-subject registration

Table 2 Dice score (mean ± sd) evaluation of T1-T1, T1-T2, and T1-PD registrations of IXI dataset for DiffeoRaptor, Mattes MI+SyN, FLASH, and NiftyReg in overlapping regions of brain tissues and sixteen subcortical structures

Twenty young adult subjects (\(\mathrm{age}<30\mathrm{yo}\)) of the IXI dataset were selected randomly. Since the IXI dataset offers T1w, T2w, and PDw scans for each subject, three different tasks were designed: T1-T1, T1-T2, and T1-PD registration. T1w MRI scans of three subjects (Subjects 15, 17, and 21) were randomly selected as the reference volumes, and the remaining subjects were set as the source volumes for inter-subject registration (in total \(3\times 19=57\) registrations per task). The results of the Dice score evaluation are summarized in Table 2, which shows that DiffeoRaptor, Mattes MI+SyN, and NiftyReg could successfully align volumes in each task, whereas FLASH underperformed in terms of Dice scores in the intra-contrast task and, as expected, failed in the inter-contrast tasks. It can also be seen that DiffeoRaptor in general performed better than Mattes MI+SyN.

IXI dataset: subject-to-template registration

Table 3 Dice score (mean ± sd) evaluation of ICBM152-T1, ICBM152-T2, and ICBM152-PD registrations of IXI dataset for DiffeoRaptor, Mattes MI+SyN, FLASH, and NiftyReg in overlapping regions of brain tissues and sixteen subcortical structures

The volumes of the IXI subjects from Section “IXI dataset: inter-subject registration” were set as the source volumes and registered to the T1w ICBM152 template [36], which was set as the reference volume. The same three tasks as in Section “IXI dataset: inter-subject registration” were performed for subject-to-template registration. The results are summarized in Table 3, which shows that DiffeoRaptor, Mattes MI+SyN, and NiftyReg could successfully align volumes in each task, while DiffeoRaptor in general performed better than Mattes MI+SyN and NiftyReg, especially in the alignment of subcortical structures.

Fig. 2 From the left to right: coronal slices of the ICBM152 (reference volume), the PDw source volume of the IXI dataset, result of NiftyReg, FLASH, Mattes MI+SyN, and DiffeoRaptor, respectively. Rows show different coronal views. Subcortical structural segmentations are shown in colored contours. Arrows are pointing to the regions where the image alignments are more visible

Figure 2 shows two coronal views of the registration results. The subcortical structures are shown in the figure as colored outlines. DiffeoRaptor shows better alignment of slices and anatomical structures compared to the other methods. The shape of the cerebrum after DiffeoRaptor registration looks closer to the ICBM152 template than after registration with the other methods.

OASIS3 dataset: intra- and inter-subject registration

The OASIS3 dataset consists of subjects intended for investigating Alzheimer’s disease (AD) [31]. Twenty AD patients from this dataset were randomly selected with matching T1w and T2w MRIs. In the first sub-task, intra-contrast intra-subject registration was performed for brain scans obtained at different stages of AD progression, where the T1w volume at the baseline was set as the reference and the T1w image from the latest session (> 6 months apart) with visible atrophy was registered to the reference. This sub-task represents the need in neuroimage analysis for tracking disease-related anatomical changes. The results are included in Table S1 of the Supplementary materials.

Table 4 Dice score evaluation (mean ± sd) of T1-T2 inter-subject registration of IXI data with OASIS3 data for DiffeoRaptor, Mattes MI+SyN, and NiftyReg in overlapping regions of brain tissues and sixteen subcortical structures
Fig. 3 From the left to right: axial slices of the T1w reference volume from the IXI dataset, the T2w MRI source volume of the OASIS3 dataset, result of NiftyReg, Mattes MI+SyN, and DiffeoRaptor, respectively. Rows show different axial views. Subcortical segmentations are shown in colored contours

In the second sub-task, T1w MRIs of four young healthy adults of the IXI dataset in Section “IXI dataset: inter-subject registration” were used as the references, and the T2w MRI scans of the latest session for each subject from the OASIS3 dataset were set as the source volumes, resulting in \(4\times 20=80\) registrations. This way, we defined a more challenging inter-contrast, inter-subject, and inter-dataset task to better compare DiffeoRaptor with Mattes MI+SyN and NiftyReg. The results of the T1-T2 registrations are summarized in Table 4, where DiffeoRaptor outperformed Mattes MI+SyN and NiftyReg. Note that FLASH was not included in these experiments because it consistently failed to perform inter-contrast registration. In Fig. 3, it can be seen that DiffeoRaptor aligned the subcortical structures and ventricles better than Mattes MI+SyN and NiftyReg.

TCIA abdominal MR-CT intra-subject registration

The TCIA dataset contains eight subjects. Each subject has a T1w MRI scan and a CT scan (with deformation) of the abdomen. The manual segmentations of the liver, spleen, left kidney, and right kidney are provided by the Learn2Reg organizers (https://learn2reg.grand-challenge.org). With the MRI scan of each subject set as the reference volume, the CT scans were aligned to perform intra-subject registration. Deformable MR-CT registration is required for these subjects because the images were acquired at different time points and with different modalities, and they contain misalignments due to patient movement, respiration, and other factors. The results are summarized in Table 5.

Table 5 Dice score (mean ± sd) evaluation of MR-CT intra-subject registration for TCIA abdominal data using DiffeoRaptor, RaPTOR [20], Mattes MI+SyN, and NiftyReg
Fig. 4 From left to right: coronal slices of Subject 7’s MRI (reference volume), the corresponding CT source volume, results of NiftyReg, Mattes MI+SyN, DiffeoRaptor, and NiftyReg, respectively. Rows show different slices of volumes. Segmentations of key organs are shown with colored contours. Arrows are pointing to the regions where the image alignments are more visible

Table 6 Dice scores (mean ± sd) of cumulative results for DiffeoRaptor, Mattes MI+SyN, and NiftyReg in overlapping subcortical structures. The p-values from ANOVA are shown for each anatomical structure

Given that the initial affine registration achieved a mean Dice score of \(0.72\,\pm \,0.10\), Table 5 shows that DiffeoRaptor, RaPTOR [20], Mattes MI+SyN, and NiftyReg could successfully improve the image alignment. Moreover, DiffeoRaptor outperformed Mattes MI+SyN and NiftyReg in the alignment of all targeted regions. Note that two subjects did not have a segmentation of the right kidney, and thus they were excluded from the mean Dice calculation for this organ in Table 5. In Fig. 4, it can be seen that, compared to the affine registration, Mattes MI+SyN and DiffeoRaptor improve the alignment of the segmented organs. However, DiffeoRaptor shows better alignment of organs compared to Mattes MI+SyN and NiftyReg.

Cumulative results

Given the inter-contrast registration results (total 291) in Sections “IXI dataset: inter-subject registration”, “IXI dataset: subject-to-template registration”, and “OASIS3 dataset: intra- and inter-subject registration” for brain structures, the mean Dice scores and the associated p-values from comparing the three methods using one-way analysis of variance (ANOVA) are listed for the sixteen subcortical structures in Table 6. Furthermore, post hoc multiple comparison (Tukey–Kramer) tests were performed to compare the methods pairwise (Table 7). The statistical tests confirm that DiffeoRaptor outperforms the other methods in terms of Dice scores for aligning each subcortical region, as well as in the mean Dice score (\(p<0.05\)). It is worth mentioning that the average Dice score is \(0.63\pm 0.12\) for the affine registration. To better visualize the results in the last row of Table 6, box plots of the average Dice scores over all evaluation regions are shown in Fig. 5.

Table 7 Post hoc multiple comparison (Tukey-Kramer) tests of DiffeoRaptor against Mattes MI+SyN and NiftyReg for the average Dice in overlapping subcortical structures
Fig. 5 The box plots of average Dice score for the total of 291 brain image registrations. DiffeoRaptor has a higher mean and lower std with fewer outliers

Deformation smoothness analysis

Fig. 6 Logarithm of determinant of Jacobian \(\mathrm{log}_{10}(\mathrm{det}(J))\) was calculated for each voxel of the deformation field. Then, they were accumulated in bins for Mattes MI+SyN and DiffeoRaptor

For each registration included in the cumulative results in Section “Cumulative results”, the determinant of the Jacobian J was calculated for each voxel of the deformation field. Figure 6 shows the values of \(\mathrm{log}_{10}(\mathrm{det}(J_\phi ))\) over all voxels, accumulated in bins, for DiffeoRaptor and Mattes MI+SyN. The bin centered at the origin corresponds to no local volume change, bins with negative centers indicate contraction, and bins with positive centers indicate expansion. The further a bin is from the center, the larger the deformation it represents. From the experiments, we observed that the number of nonzero samples is similar across DiffeoRaptor, Mattes MI+SyN, and NiftyReg. However, DiffeoRaptor has fewer samples far from the central bin (Fig. 6) and generates smoother deformations than Mattes MI+SyN and NiftyReg, as shown in Table S2 of the Supplementary Materials. The determinants of the Jacobians for DiffeoRaptor, Mattes MI+SyN, and NiftyReg are visualized in Fig. S2 of the Supplementary Materials. For the ablation study, the deformation smoothness of DiffeoRaptor, RaPTOR [20], Mattes MI+SyN, and NiftyReg is compared on the TCIA abdominal dataset, and the results are summarized in Table S3 of the Supplementary Materials.
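The voxel-wise analysis described above can be sketched as follows. This is an illustrative example under stated assumptions: the deformation is given as a 3D mapping stored with shape `(3, X, Y, Z)` that holds the mapped coordinate of every voxel, derivatives are approximated with `numpy.gradient`, and the function names, the number of bins, and the histogram range are hypothetical choices rather than the settings used for Fig. 6.

```python
import numpy as np

def jacobian_determinant(phi, spacing=(1.0, 1.0, 1.0)):
    """det(J) per voxel for a 3D mapping phi of shape (3, X, Y, Z)."""
    phi = np.asarray(phi, dtype=float)
    # jac[i, j] = d phi_i / d x_j, estimated by finite differences.
    jac = np.stack(
        [np.stack(np.gradient(phi[i], *spacing), axis=0) for i in range(3)],
        axis=0,
    )
    jac = np.moveaxis(jac, (0, 1), (-2, -1))   # -> (X, Y, Z, 3, 3)
    return np.linalg.det(jac)

def log_jacobian_histogram(det, bins=21, value_range=(-1.0, 1.0)):
    """Histogram of log10(det J) over voxels with a positive determinant."""
    log_det = np.log10(det[det > 0])
    return np.histogram(log_det, bins=bins, range=value_range)

# Toy usage: the identity mapping has det(J) = 1 (log10 = 0) at every voxel.
grid = np.stack(np.meshgrid(np.arange(4.0), np.arange(5.0), np.arange(6.0),
                            indexing="ij"))
counts, edges = log_jacobian_histogram(jacobian_determinant(grid))
```

Voxels with a non-positive determinant indicate local folding and are excluded from the logarithm here; for a diffeomorphic deformation, no such voxels should occur.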

Discussions

When RaPTOR is employed as the similarity metric, it may require additional parameter tuning. This motivates the use of a more advanced optimization technique than classical GD to minimize the cost function, as shown and explored in [22] and [24]. For DiffeoRaptor, the parameter settings were mostly the default values from RaPTOR and FLASH, as elaborated previously. However, in cases where affine registration fails to provide a good initial alignment, the step size for the gradient update and the maximum number of iterations should be chosen with care. The average computational times were calculated for DiffeoRaptor and FLASH on a single core of a 6-core Linux Mint system for 10 T1-T1 brain MRI registrations with an image size of \(176\times 256\times 256\) voxels. The mean computational time per registration of DiffeoRaptor (\(384.50\pm 0.01\) s) is comparable to that of FLASH (\(416.14\pm 0.01\) s). It should be noted that there are issues with using surrogates such as tissue overlap to evaluate the performance of registration methods [37], as outlined in more detail in the Supplementary Materials.

Conclusion

We present DiffeoRaptor, a diffeomorphic inter-modal/contrast image registration algorithm based on RaPTOR and geodesic shooting in bandlimited space. The algorithm is validated on several different applications. Compared with FLASH, Mattes MI+SyN, and NiftyReg, it achieves comparable or better results. In addition, DiffeoRaptor offers smoother deformation fields than Mattes MI+SyN and NiftyReg.