1 Motivation

Deep-learning-based approaches for medical image registration usually involve an elaborate learning procedure, yet they often struggle to estimate large deformations and to remain usable across a wide range of tasks. To address the different registration tasks of the Learn2Reg2021 challenge [8], we present a fast and accurate optimisation method for image registration that requires little learning. Our method robustly captures large deformations by using discretised displacements and a coupled convex optimisation. To be versatile across tasks, our method includes a hand-crafted feature extractor that is contrast and modality invariant yet still highly discriminative for local geometry.

2 Methods

The main idea of our method is to perform large-deformation image registration by using a coupled convex optimisation [6] that approximates a globally optimal solution of a discretised cost function, followed by an Adam-based instance optimisation to further improve the local registration accuracy. Dense correlation has already been used extensively in learning-based optical flow estimation (cf. PWC-Net [16]) and end-to-end trainable 3D registration networks (cf. PDD-Net [5]); however, both approaches have limitations. PWC-Net requires multiple warping steps and is difficult to extend from 2D to 3D (see [4]). PDD-Net employs dense 3D displacements but substantially simplifies the optimisation strategy, which may lead to some inaccuracies. ConvexAdam aims to combine the best of both worlds (learning-based and optimisation-based) by leveraging segmentation priors where available and relying on robust hand-crafted features and fast discrete optimisation.

Fig. 1. The structure of our registration method. It consists of a feature extractor (MIND and/or nnU-Net) and a dense correlation layer followed by a coupled convex optimisation and an Adam-based instance optimisation.

As visualised in Fig. 1, the basic structure of our registration method consists of a feature extractor, a correlation layer, a coupled convex optimisation, and an instance optimisation.

The feature extractor outputs contrast and modality invariant features from the fixed and moving input images. For this, hand-crafted MIND features [7] can be employed, which ensure versatility across different types of registration tasks. Depending on the availability of labelled image data, automatic segmentations as provided by the nnU-Net [10] can be used instead. In contrast to other state-of-the-art supervised deep learning registration methods [14], we avoid using the expert labels only at the end for the warping loss, which may lead to sub-optimal results due to limited gradient backflow. Instead, we found that using off-the-shelf segmentation networks produces the best results.

The obtained features are fed into a correlation layer, which computes a sum-of-squared-differences (SSD) cost volume with a box filter and gives an initial best displacement for each voxel (simply taking the \({\text {argmin}}\)). To this end, we employ a search space with up to 5000 discretised displacements per voxel. The capture range extends to at least 48 voxels in each dimension (setting for Task 2), so that large motion can be estimated accurately.
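The correlation step can be illustrated with a minimal NumPy sketch. It is not the implementation from our repository: the function names, the edge padding at the volume border, the window radius, and the tiny search range are all assumptions chosen for readability.

```python
import itertools
import numpy as np

def box_filter(vol, radius):
    """Separable mean filter over a (2*radius+1)^3 window with edge padding."""
    k = 2 * radius + 1
    out = vol.astype(np.float64)
    for axis in range(3):
        n = out.shape[axis]
        pad = [(0, 0)] * 3
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        # Average k shifted copies along the current axis.
        out = sum(np.take(padded, np.arange(i, i + n), axis=axis) for i in range(k)) / k
    return out

def ssd_cost_volume(feat_fixed, feat_moving, max_disp, radius=1):
    """Box-filtered SSD cost for every integer displacement in [-max_disp, max_disp]^3.

    feat_fixed, feat_moving: feature arrays of shape (C, D, H, W).
    Returns cost of shape (K, D, H, W) and the K candidate displacements (K, 3).
    """
    _, D, H, W = feat_fixed.shape
    r = range(-max_disp, max_disp + 1)
    disps = np.array(list(itertools.product(r, r, r)))
    m = max_disp
    padded = np.pad(feat_moving, ((0, 0), (m, m), (m, m), (m, m)), mode="edge")
    cost = np.empty((len(disps), D, H, W))
    for k, (dz, dy, dx) in enumerate(disps):
        # Moving features sampled at x + d for every voxel x.
        shifted = padded[:, m + dz:m + dz + D, m + dy:m + dy + H, m + dx:m + dx + W]
        ssd = ((feat_fixed - shifted) ** 2).sum(axis=0)
        cost[k] = box_filter(ssd, radius)
    return cost, disps

def initial_displacements(cost, disps):
    """Per-voxel argmin over the displacement candidates, shape (D, H, W, 3)."""
    return disps[cost.argmin(axis=0)]
```

With max_disp=2 this enumerates 125 candidates per voxel. The loop over candidates keeps the sketch short; a practical implementation would compute the candidates in parallel on the GPU.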

The correlation layer’s output is used to solve two coupled convex optimisation problems for efficient global regularisation. Over several iterations, alternating steps are performed for similarity and smoothness optimisation: a spatially smoothed field is computed from the current \({\text {argmin}}\) (minimal SSD cost) displacements, followed by adding a penalty to the discretised SSD costs based on the discrepancy to this current globally smooth optimum.
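The alternating scheme can be sketched as follows, again in NumPy and under stated assumptions: the coupling weights in coeffs, the box smoothing with radius 1, and the squared-distance penalty are illustrative choices, not the settings used for the challenge.

```python
import numpy as np

def box_smooth(field, radius):
    """Box-filter a (D, H, W, 3) field over its three spatial axes (edge padding)."""
    k = 2 * radius + 1
    out = field.astype(np.float64)
    for axis in range(3):
        n = out.shape[axis]
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        out = sum(np.take(padded, np.arange(i, i + n), axis=axis) for i in range(k)) / k
    return out

def coupled_convex(cost, disps, coeffs=(0.003, 0.01, 0.03, 0.1, 0.3, 1.0), radius=1):
    """Alternate between a smoothness step and a penalised similarity step.

    cost:  (K, D, H, W) similarity costs per displacement candidate.
    disps: (K, 3) candidate displacements.
    Returns a smooth displacement field of shape (D, H, W, 3).
    """
    # Start from the plain per-voxel argmin and smooth it.
    u_smooth = box_smooth(disps[cost.argmin(axis=0)], radius)
    for coeff in coeffs:  # hypothetical schedule with increasing coupling
        # Squared distance of every candidate to the current smooth field.
        # This materialises a (K, D, H, W, 3) array; fine for a toy example only.
        diff = disps[:, None, None, None, :] - u_smooth[None]
        penalty = (diff ** 2).sum(axis=-1)
        # Similarity step: argmin of the penalised cost.
        u = disps[(cost + coeff * penalty).argmin(axis=0)]
        # Smoothness step: spatial smoothing of the new argmin field.
        u_smooth = box_smooth(u, radius)
    return u_smooth
```

Increasing the coupling weight gradually lets isolated outliers in the initial argmin field be pulled towards the displacements of their neighbours without discarding the similarity term.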

The resulting displacements in turn are used as a starting point for an Adam-based instance optimisation in order to provide the final deformation grid used for warping of the moving input image. This step is very similar to classic optical flow estimation [15]. For this purpose, the cost function is linearised and the Adam optimiser [11] is used for gradient descent. Smoothness of the displacement field is induced by adding a B-spline deformation model and diffusion regularisation.
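One way to write down the objective of this step is given below. The notation is an assumption introduced here for illustration: \(F_{\text{fix}}\) and \(F_{\text{mov}}\) denote the fixed and moving feature maps, \(\mathbf{u}\) the displacement field parameterised on a B-spline grid, and \(\lambda\) a regularisation weight.

\[
E(\mathbf{u}) = \sum_{\mathbf{x}} \left\lVert F_{\text{fix}}(\mathbf{x}) - F_{\text{mov}}\big(\mathbf{x} + \mathbf{u}(\mathbf{x})\big) \right\rVert_2^2 + \lambda \sum_{\mathbf{x}} \left\lVert \nabla \mathbf{u}(\mathbf{x}) \right\rVert_2^2
\]

The first term is the feature dissimilarity after warping and the second term is the diffusion regulariser. Adam then updates \(\mathbf{u}\) along \(-\nabla_{\mathbf{u}} E\), starting from the displacements delivered by the coupled convex optimisation.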

3 Experiments and Results

Each of the Learn2Reg2021 tasks entails certain challenges that we face with slightly varying experimental setups as outlined in the following. The complete implementation details can be found in our publicly available repository. Table 1 presents quantitative results and Fig. 2 shows qualitative results for the individual tasks.

Table 1. Results for the different Learn2Reg2021 tasks. Accuracy is measured by the Dice similarity of organ segmentations (Dice), the target registration error for anatomical landmarks (TRE), and the \(95\%\) Hausdorff distance for segmentations (HD). Robustness is measured by the \(30\%\) lowest Dice scores (\(\mathrm {Dice}_{30}\)), Dice scores for additional segmentations (\(\mathrm {Dice}_{+add}\)) and the \(30\%\) highest TRE values (\(\mathrm {TRE_{30}}\)). Plausibility of the deformations is measured by the standard deviation of the logarithmic Jacobian determinant (SDlogJ). Dice similarities are reported in \(\%\), TRE and HD values are given in millimetres and inference time is given in seconds. The last table displays the challenge scores and ranks for the overall 1st, 2nd, and 3rd place.
Fig. 2. Qualitative results of our proposed method (top row: colourmap overlay of fixed and moving image (Task 1 and 2) or segmentation (Task 3); bottom row: overlay of fixed and warped moving image or segmentation).
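Two of the metrics named in the caption of Table 1, Dice and SDlogJ, can be sketched in NumPy as follows. This is not the official challenge evaluation code: unit voxel spacing, finite differences via np.gradient, and the clipping constant eps are assumptions.

```python
import numpy as np

def dice(seg_a, seg_b, label):
    """Dice overlap of one label between two segmentations (NaN if the label is absent)."""
    a = seg_a == label
    b = seg_b == label
    denom = a.sum() + b.sum()
    if denom == 0:
        return float("nan")
    return 2.0 * np.logical_and(a, b).sum() / denom

def sd_log_jacobian(disp, eps=1e-9):
    """Standard deviation of log det J for the transform x -> x + u(x).

    disp: displacement field of shape (3, D, H, W), in voxel units.
    """
    # grads[i, j] holds d u_i / d x_j on the voxel grid.
    grads = np.stack([np.stack(np.gradient(disp[i]), axis=0) for i in range(3)], axis=0)
    jac = grads + np.eye(3)[:, :, None, None, None]
    jac = np.moveaxis(jac, (0, 1), (-2, -1))  # (D, H, W, 3, 3)
    det = np.linalg.det(jac)
    # Clip non-positive determinants (folding) so that the logarithm stays finite.
    return float(np.log(np.clip(det, eps, None)).std())
```

A field with a spatially constant Jacobian determinant, such as the identity or a uniform scaling, gives an SDlogJ of zero; larger values indicate less regular deformations.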

Task 1 Thorax-Abdomen CT-MR. The first task aims to align multimodal intra-patient data [1, 2, 3, 12]. Besides multimodal image registration itself, learning from few and noisy labels as well as dealing with large deformations and missing correspondences are challenging. For this task, we extract hand-crafted MIND features and include an inverse-consistency constraint as introduced in [6], which minimises the discrepancy between the forward and backward transformations in order to avoid implausible deformations. To further regularise the displacement field during Adam instance optimisation, we add thin plate splines, yielding smooth deformation fields. As large deformations are to be expected, we chose a search space that includes discretised displacements with a capture range of 64 mm for each dimension within the scanned anatomy.

Task 2 Lung CT. The second task is to perform inspiration-expiration registration on intra-patient lung CT data [9]. The challenge in this task is to estimate large breathing motion for scans in which the lungs are only partially visible in the expiration scans. The displacement search range is selected to capture motion of up to \(42 \times 30 \times 42\) mm in the x-, y-, and z-dimension, respectively. As in the first task, MIND features of both input images are used to compute the SSD cost volume.

Task 3 Whole Brain MR. The third task deals with the registration of inter-patient T1-weighted brain MRI [13]. Here, the main challenge is to precisely align small structures of variable shape. For this reason, we chose a displacement capture range of 16 mm for each dimension within the scanned brain structures. As this task comprises a large amount of labelled image data, nnU-Net predictions for segmentation guidance are employed. We use the nnU-Net predictions in the form of inverse class-weighted one-hot encodings as features for our method’s optimisation steps.
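A minimal NumPy sketch of such a feature encoding is given below. The exact weighting formula is not specified above, so weighting each class by the reciprocal of its voxel count and normalising the weights to sum to one are assumptions, as is the function name.

```python
import numpy as np

def weighted_one_hot(labels, num_classes):
    """Turn an integer label map (D, H, W) into class-weighted one-hot features.

    Returns an array of shape (num_classes, D, H, W) in which rare classes
    receive larger feature magnitudes than frequent ones.
    """
    one_hot = np.moveaxis(np.eye(num_classes)[labels], -1, 0)
    counts = np.bincount(labels.ravel(), minlength=num_classes).astype(np.float64)
    weights = np.zeros(num_classes)
    present = counts > 0
    # Assumed weighting: inverse class frequency, normalised to sum to one.
    weights[present] = 1.0 / counts[present]
    weights /= weights.sum()
    return one_hot * weights[:, None, None, None]
```

The resulting channels can be passed to the SSD correlation layer in place of MIND features, so that small structures contribute to the cost as much as large ones.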

4 Conclusion

Our contribution to the Learn2Reg2021 challenge showed that image registration can be performed quickly and accurately using an optimisation strategy with little learning. The method is highly parallelisable on a GPU and is robust owing to its large search space of discretised displacements. Smoothness of the deformation fields is induced by a global convex regularisation, diffusion regularisation, and B-spline interpolation. The efficient Adam-based instance optimisation yields very precise results, and the modality-invariant feature extractor makes the method widely versatile. We achieved second place overall in the Learn2Reg2021 challenge, winning Task 1, placing second in Task 3, and placing third in Task 2.