1 Introduction

In medical image analysis, image registration is a fundamental task underlying many clinical applications, such as population analysis, longitudinal studies, image fusion, and image-guided interventions [1,2,3,4,5,6,7]. Its significance lies in aligning medical images acquired from different perspectives or modalities, facilitating comprehensive analysis and assisting medical professionals in making treatment decisions.

In recent years, medical image registration techniques have advanced considerably, driven by rapid progress in computer hardware and software. In particular, deep learning-based methods have attracted wide attention due to their constant-time inference, prompting a rapid technological shift in the field of image registration; notable examples include DIRNet, VM, and TransMorph [8,9,10,11]. These methods leverage network training to obtain deformation fields between images, enabling accurate registration even for previously unseen images. Nonetheless, the substantial training data requirements, limited robustness to multi-site data, and constraints imposed by computational hardware restrict the applicability of deep learning-based registration, especially when compared with classical techniques. In particular, as highlighted in [12], deep learning methods in unsupervised settings have yet to outperform their classical counterparts based on continuous iterative optimization. These challenges motivate us to revisit classical registration methods.

Compared to deep learning-based methods, classical registration methods are more robust and accurate for large-scale internal motion tasks. For example, Rühaak et al. propose an NLR method [13] based on minimizing the distance metric of a normalized gradient field with curvature regularization, supplemented with lung segmentation as a mask, exhibiting remarkable runtime performance. However, such mask-based methods present drawbacks, notably the need for a challenging initial segmentation stage.

Moreover, Markov random field (MRF)-based image registration methods also show commendable performance. For instance, Tang et al. [14, 15] introduce the graph cut method to transform image registration into a label matching problem on an MRF and then optimize it with the expansion algorithm, achieving strong performance. Heinrich et al. [16] employ a key-point operator for feature extraction and use a part-based model to exploit contextual information for regularizing neighboring displacement vectors through the MRF, overcoming the limitation of estimating larger deformations. In addition, Peng et al. [17] apply the Markov chain Monte Carlo (MCMC) algorithm to image registration and propose the LO-MRF and HO-MRF methods, depending on whether higher-order cliques are used, effectively dealing with mismatches arising from local extrema. However, the optimization of general MRFs is NP-hard, making it challenging to produce accurate and fast solutions.

In response, registration methods that leverage parametric optimization and regularization augmentation have been explored to provide viable solutions. For example, Vishnevskiy et al. [18, 19] use anisotropic total variation (aTV) regularization and leverage the alternating direction method of multipliers (ADMM) for registration. This strategy helps to avoid potential local minima in the optimization process, leading to efficient solutions. To better estimate the sliding motion between thoracic and abdominal organs during respiration, Vishnevskiy et al. propose an isotropic total variation (isopTV) [20] regularization metric for registration. This approach enables accurate registration near the sliding interface and demonstrates satisfactory performance on both lung CT images and liver MRI images. Subsequently, for the same energy function, the authors further enhance the registration performance by replacing the original ADMM optimization algorithm with the limited-memory BFGS optimization algorithm, naming the method pTVreg [21]. Unlike ADMM, the limited-memory BFGS algorithm avoids the decomposition of the original problem into sub-problems for iterative optimization. The results further confirm the influence of optimization algorithms on registration performance and motivate further research on optimization algorithms.

During the optimization of the energy function, the attention allocated to each term directly affects the performance of the algorithm. For example, in pTVreg, which utilizes non-decompositional optimization, attention depends on the weights between similarity and regularization terms. Therefore, a favorable registration performance can be achieved by empirically selecting an optimal weight. However, in isopTV, which employs decomposition-based optimization, the attention is determined comprehensively by the optimization order of the subproblems and the weights between similarity and regularization. Therefore, it is necessary to investigate the optimization order of the subproblems in the decomposition-based optimization algorithm to alter the attention allocation strategy and thus improve the registration performance.

In this work, we choose the challenging task of lung 4D CT image registration for experimental analysis. According to [22], lung 4D CT registration faces the challenges of local intensity inhomogeneity and sliding motion between different organs. To address these issues, we propose a modified version of the SSIM [23] metric and a vector-modulus-based regularization metric. In summary, the specific contributions are as follows:

  • Under the decomposition-based optimization framework, the optimization order is altered to prioritize the optimization of the regularization term by modifying the distributed alternating direction method of multipliers (DADMM) [19]. Such an attention allocation strategy achieves fast convergence without loss of registration accuracy.

  • The vector-modulus-based regularization metric is proposed. The proposed regularization metric takes into account inhomogeneous contributions from different directions, which can effectively preserve the topological structure of the deformation field and accurately estimate the non-smooth motion of the anatomies at the sliding interface.

  • A modified SSIM metric is designed by integrating the local correlation coefficients with the intensity metric. The proposed similarity metric takes into account both the structural and intensity information of the image in a comprehensive way, enabling better registration performance.

The remainder of the paper is structured as follows. In Sect. 2, the deformable image registration model is formulated as a decomposition-based problem by introducing redundant variables. The modified SSIM metric, which considers both structural and intensity information, is presented, and the computation of the vector-modulus-based regularization term is introduced to efficiently handle complex sliding motions. The experimental results on the 4D-CT and COPD image datasets are then discussed and analyzed in Sect. 3. Finally, we summarize the proposed image registration method and discuss possible directions for further research.

2 Methods

2.1 Deformable Image Registration Model

Assume that there are an N-dimensional fixed image I and a moving image J in the image domain \(\Omega\). The deformed image J(d) is obtained by applying an N-dimensional displacement field \(d=(d_1,d_2,\dots ,d_N)\) to J. To achieve registration, J(d) should be brought as close as possible to the fixed image I. The image registration problem can therefore be formulated with the following mathematical model:

$$\begin{aligned} E(d)=E_D(I,J(d))+\lambda {E_R(d)} \end{aligned}$$
(1)

where \(E_D(I,J(d))\) measures the similarity or dissimilarity between I and J(d), depending on whether the energy function E(d) is maximized or minimized. \(E_R(d)\) regularizes the deformation field to maintain its topology and smoothness, typically by constraining the displacement relationships between neighboring pixels. The coefficient \(\lambda\) strikes a balance between registration similarity and deformation field smoothness. Its value should be chosen so that the regularization is neither too weak, which reduces smoothness and causes topological confusion, nor too strong, which suppresses image features and degrades registration similarity.

2.2 Dissimilarity Metric

In this paper, we adopt a minimization strategy for optimizing the registration model, and accordingly choose a dissimilarity metric \(E_D\) between the deformed image J(d) and the fixed image I. Dissimilarity between images is commonly described by distances, such as the Manhattan distance (\(\hbox {L}_1\)) and the Euclidean distance (\(\hbox {L}_2\)). In addition, similarity metrics \(E_S\), such as the correlation coefficient and structural similarity (SSIM) [23], can be converted into dissimilarity metrics \(E_D\) through simple mathematical transformations (e.g., taking the negative or subtracting from one).

In [24], the authors use the transformed local correlation coefficients (LCC) as a dissimilarity metric and convert complex convolution operations into simple frequency domain products to compute a weighted sum of correlation coefficients for pixel-centric image patches. Such a strategy measures the structural similarity between images and reduces the computational complexity. The details of \(E_D{(I,J(d))}\) and its gradient are as follows:

$$\begin{aligned} E_D{(I,J(d))}&=\sum _{p\in {\Omega }}{\left[ 1-LCC_p(I,J(d))\right] } \end{aligned}$$
(2)
$$\begin{aligned}&=\sum _{p\in {\Omega }}{\left[ 1-\frac{<I,J(d)>_p}{\sigma _p(I)\cdot {\sigma _p(J(d))}}\right] } \end{aligned}$$
(3)
$$\begin{aligned} \frac{\partial {E_D{(I,J(d))}}}{\partial {J(d)}}\approx {-\frac{\left[ (I-\bar{I}) -{\frac{(J(d)-\bar{J}(d))\cdot <I,J(d)>}{\sigma ^2(J(d))}}\right] }{\sigma (I)\cdot {\sigma (J(d))}}} \end{aligned}$$
(4)

where

$$\begin{aligned} \begin{aligned} \bar{J}_p&=\sum _{q\in {\Omega _p}}{G_p[q]*J[q]}\\ \sigma _p^2{(J)}&=\sum _{q\in {\Omega _p}}{G*(J^2)[q]-(G*J)^2[q]}\\ <I,J>_p&=\sum _{q\in {\Omega _p}}{G_p*(I\cdot {J})_q-\left( {G_p*I}\right) _q\cdot \left( {G_p*J}\right) _q} \end{aligned} \end{aligned}$$
(5)

and \(G_p\) denotes the Gaussian convolution kernel function centered at pixel p, and \(*\) denotes the convolution operation.
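For clarity, a minimal NumPy/SciPy sketch of Eqs. (2)-(5) is given below. The window width sigma and the stabilizer eps are illustrative choices rather than values from the original implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def lcc_dissimilarity(I, J, sigma=2.0, eps=1e-8):
    """LCC dissimilarity of Eqs. (2)-(5): local means, variances, and
    covariances are obtained by Gaussian filtering, so the patch-wise
    sums reduce to convolutions."""
    mu_I, mu_J = gaussian_filter(I, sigma), gaussian_filter(J, sigma)
    var_I = np.maximum(gaussian_filter(I * I, sigma) - mu_I ** 2, 0.0)
    var_J = np.maximum(gaussian_filter(J * J, sigma) - mu_J ** 2, 0.0)
    cov_IJ = gaussian_filter(I * J, sigma) - mu_I * mu_J
    lcc = cov_IJ / (np.sqrt(var_I * var_J) + eps)   # LCC_p(I, J(d))
    return np.sum(1.0 - lcc)                        # E_D = sum_p [1 - LCC_p]
```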

However, the LCC metric only captures structural correlations between image patches, while ignoring similarities in image intensities. The SSIM metric [23] detailed below integrates image intensity, contrast, and structural information by computing the corresponding similarities l(I,J), c(I,J), and s(I,J). Compared to LCC, SSIM evaluates the similarity between images more comprehensively.

$$\begin{aligned} SSIM(I,J(d))=l(I,J(d))\cdot {c(I,J(d))}\cdot {s(I,J(d))} \end{aligned}$$
(6)

where

$$\begin{aligned} \begin{aligned} l(I,J)&=\frac{2\bar{I}\cdot \bar{J}+c_1}{\bar{I}^2+\bar{J}^2+c_1}\\ c(I,J)&=\frac{2\sigma _{I}\cdot \sigma _{J}+c_2}{\sigma _{I}^2+\sigma _{J}^2+c_2}\\ s(I,J)&=\frac{\sigma _{IJ}+c_3}{\sigma _{I}\cdot \sigma _{J}+c_3}\\ \end{aligned} \end{aligned}$$
(7)

and the constants \(c_1\), \(c_2\), and \(c_3\) are often introduced as corrections to ensure the stability of SSIM.
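For reference, a minimal sketch of Eqs. (6)-(7) using global image statistics is shown below; the constants c1, c2, and c3 are the usual small stabilizers, and their values here are placeholders.

```python
import numpy as np

def ssim(I, J, c1=1e-4, c2=9e-4, c3=4.5e-4):
    """SSIM of Eqs. (6)-(7) computed from global image statistics."""
    mu_I, mu_J = I.mean(), J.mean()
    sigma_I, sigma_J = I.std(), J.std()
    sigma_IJ = ((I - mu_I) * (J - mu_J)).mean()
    l = (2 * mu_I * mu_J + c1) / (mu_I ** 2 + mu_J ** 2 + c1)              # intensity
    c = (2 * sigma_I * sigma_J + c2) / (sigma_I ** 2 + sigma_J ** 2 + c2)  # contrast
    s = (sigma_IJ + c3) / (sigma_I * sigma_J + c3)                         # structure
    return l * c * s
```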

Since the optimization algorithm used in this paper requires gradient computation, we ultimately choose a variant of SSIM as the dissimilarity metric to reduce the computational complexity. Only the intensity and structural terms are retained. In the intensity term, the intensity of the current pixel is used instead of the mean intensity of the image patch, and in the structural term, LCC is used in place of s(I,J). This approach reduces computation while still accurately reflecting the similarity between images. The details of \(E_D\) and its gradient are given below. For reference, we also tried using SSIM directly, but the experimental results were worse than those reported in this paper.

$$\begin{aligned} \begin{aligned} E_D{(I,J(d))}&=\sum _{p\in {\Omega }}{\left[ 1-E_S^p(I,J(d))\right] }\\&=\sum _{p\in {\Omega }}{\left[ 1-h_p(I,J(d))\cdot {LCC_p(I,J(d))}\right] } \end{aligned} \end{aligned}$$
(8)

where h(I,J) represents the proposed intensity similarity,

$$\begin{aligned} h(I,J)=\frac{2{I}\cdot {J}+c_1}{{I}^2+{J}^2+c_1} \end{aligned}$$
(9)

and the gradients can be expressed as:

$$\begin{aligned} \frac{\partial {h(I,J)}}{\partial {J}}\approx {\frac{2{I}}{{I}^2+{J}^2+c_1} -\frac{2{J}\cdot {\left( 2{I}{J}+c_1\right) }}{{\left( {{I}^2+{J}^2+c_1}\right) ^2}}} \end{aligned}$$
(10)
$$\begin{aligned} \frac{\partial {E_D{(I,J(d))}}}{\partial {J(d)}}\approx {-LCC\cdot \frac{\partial {h}}{\partial {J(d)}}-{h}\cdot \frac{\partial {LCC}}{\partial {J(d)}}} \end{aligned}$$
(11)
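A minimal sketch of Eqs. (8)-(11) is given below, reusing Gaussian-window statistics for the LCC part; sigma, c1, and eps are illustrative choices.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def modified_ssim_dissimilarity(I, J, sigma=2.0, c1=1e-4, eps=1e-8):
    """Dissimilarity E_D of Eq. (8), sum_p [1 - h_p * LCC_p], together with
    the approximate gradient of Eq. (11) w.r.t. the warped moving image J."""
    mu_I, mu_J = gaussian_filter(I, sigma), gaussian_filter(J, sigma)
    var_I = np.maximum(gaussian_filter(I * I, sigma) - mu_I ** 2, 0.0)
    var_J = np.maximum(gaussian_filter(J * J, sigma) - mu_J ** 2, 0.0)
    cov = gaussian_filter(I * J, sigma) - mu_I * mu_J
    sig_I, sig_J = np.sqrt(var_I) + eps, np.sqrt(var_J) + eps
    lcc = cov / (sig_I * sig_J)

    h = (2 * I * J + c1) / (I ** 2 + J ** 2 + c1)            # Eq. (9), pointwise intensity term
    dh = (2 * I / (I ** 2 + J ** 2 + c1)
          - 2 * J * (2 * I * J + c1) / (I ** 2 + J ** 2 + c1) ** 2)          # Eq. (10)
    dlcc = ((I - mu_I) - (J - mu_J) * cov / (var_J + eps)) / (sig_I * sig_J)  # cf. Eq. (4)

    energy = np.sum(1.0 - h * lcc)
    grad = -(lcc * dh + h * dlcc)                            # Eq. (11)
    return energy, grad
```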

2.3 Regularization

Regularization constraints are usually implemented by constraining the displacement relation between neighboring pixels, with the aim of smoothing the displacement field and improving registration accuracy. Typical regularization constraints include \(\hbox {L}_1\) regularization (Lasso) and \(\hbox {L}_2\) regularization (Ridge). Take 2D space as an example, as shown in Fig. 1.

Fig. 1
figure 1

Illustration of displacement vectors

The displacement vector at pixel A is assumed to be \(\vec {{{\textbf {A}}}}=(d_1,d_2,\dots ,d_N)\), where \(d_i, i=1,2,\dots ,N\) denotes the displacement change in dimension i, and the remaining pixels are described similarly. Then, it follows that the gradient vector of the displacement along the direction \(i=1\) can be approximated by

$$\begin{aligned} \vec {\Phi }_1\approx {\vec {{{\textbf {B}}}}-\vec {{{\textbf {A}}}}} =(\nabla _1{d_1},\nabla _1{d_2},\dots ,\nabla _1{d_N}) \end{aligned}$$
(12)

Similarly, the displacement gradient vector along the i-th dimension is approximated as follows:

$$\begin{aligned} \vec {\Phi }_i\approx {(\nabla _i{d_1},\nabla _i{d_2},\dots ,\nabla _i{d_N})} \end{aligned}$$
(13)

Different from \(\hbox {L}_1\) and \(\hbox {L}_2\), which regularize each component of the gradient vector, we adopt a novel regularization approach that constrains the displacement of neighboring pixels by constraining the modulus of \(\vec {\Phi }_i\) in each direction [expressed as Eq. (14)], thereby smoothing the displacement field. Unlike the isopTV regularization which directly applies the \(\hbox {L}_{2,1}\) norm, the proposed regularization can effectively constrain the variation of the displacement field while making it more physically meaningful.

$$\begin{aligned} E_R(d)=\sum _{p\in \Omega }{\sum _{j=1}^{N}{{||\vec {\Phi }_j(d_p)||_2}}} \end{aligned}$$
(14)
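A minimal 3D sketch of Eq. (14) follows: forward differences of each displacement component along each direction form the vectors \(\vec {\Phi }_j\), whose Euclidean moduli are summed over the image domain. The small eps, which avoids the non-differentiability at zero, is an illustrative choice.

```python
import numpy as np

def vector_modulus_regularization(d, eps=1e-8):
    """E_R of Eq. (14) for a displacement field d of shape (N, X, Y, Z),
    where d[i] holds the displacement component d_i on the voxel grid."""
    ndim = d.shape[0]
    energy = 0.0
    for j in range(ndim):                      # direction of differencing
        # forward difference of every component along axis j: (nabla_j d_1, ..., nabla_j d_N)
        phi_j = np.stack([np.diff(d[i], axis=j, append=0) for i in range(ndim)])
        # ||Phi_j||_2 at every voxel, summed over the image domain
        energy += np.sum(np.sqrt(np.sum(phi_j ** 2, axis=0) + eps))
    return energy
```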

2.4 Energy Function Gradient

To reduce computation, interpolation of the control point displacement field k is often used instead of dense displacement field d. Referring to the chain rule used in [20], we can obtain the gradient of \({E_D(I,J(d(k)))}\) with respect to the control point displacement field k. The first term in Eq. (15) is the image metric derivative. The second term is the gradient of the warped image in the i-th direction. The third term is the Jacobian of the displacement parametrization, which describes the volume change of the voxel block affected by the interpolation of the control points.

$$\begin{aligned} \frac{\partial {E_D(I,J(d(k)))}}{\partial {k_i}}\approx {\frac{\partial {E_D(I,J(d))}}{\partial {J(d)}}\cdot \frac{\partial {J(d)}}{\partial {d}}\cdot \frac{\partial {d(k)}}{\partial {k_i}}} \end{aligned}$$
(15)

where \(\frac{\partial {d(k)}}{\partial {k_i}}\approx {\prod _{n\le {N}}{\left[ {1-\frac{|\alpha ^n|}{K_n}}\right] _+}}\), \(\alpha ^n\) denotes the distance between the pixel and the control point in the n-th dimension, and \(K_n\) denotes the grid spacing of the control points. Correspondingly, the gradient of the regularization \({E_R(d(k))}\) at the control point can be expressed by

$$\begin{aligned} \frac{\partial {E_R}}{\partial {k_i}}\approx {\left( {\sum _{j=1}^{N} {\frac{\nabla _j^T\nabla _j{k_i}}{||\vec {\Phi }_j(k)||_2}}}\right) \cdot {\left( {\prod _{n\le {N}}{K_n}}\right) }} \end{aligned}$$
(16)
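To make the chain rule of Eqs. (15)-(16) concrete, a didactic (non-vectorized) sketch of the separable interpolation weight and the resulting control-point gradient is given below; the function names and the per-voxel loop are illustrative, not the actual implementation.

```python
import numpy as np

def tent_weight(voxel, control_point, grid_spacing):
    """Separable interpolation weight in Eq. (15):
    prod_n [1 - |alpha^n| / K_n]_+ , with alpha^n the voxel-to-control-point distance."""
    alpha = np.abs(np.asarray(voxel, dtype=float) - np.asarray(control_point, dtype=float))
    w = np.clip(1.0 - alpha / np.asarray(grid_spacing, dtype=float), 0.0, None)
    return float(np.prod(w))

def control_point_gradient(dissim_grad, warped_grad_i, control_point, grid_spacing):
    """Chain-rule gradient dE_D/dk_i of Eq. (15), accumulated over voxels.

    dissim_grad   : dE_D/dJ(d) on the voxel grid (e.g., from Eq. (11))
    warped_grad_i : spatial gradient of the warped image J(d) along direction i
    """
    g = 0.0
    for voxel in np.ndindex(dissim_grad.shape):
        w = tent_weight(voxel, control_point, grid_spacing)
        if w > 0.0:                     # voxel lies inside the control point's support
            g += dissim_grad[voxel] * warped_grad_i[voxel] * w
    return g
```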

2.5 Distributed Alternating Direction Method of Multiplier

Referring to [25], by introducing redundant variables \(z_p\), we equate the image registration model to the following model and optimize it with DADMM. Compared to isopTV, the proposed method allows for distributed computation, enables parallel processing, and has a more concise update step.

$$\begin{aligned} \begin{aligned}&\text {minimize}\quad {E_D\left( \sum _{p\in \Omega }{z_p}\right) +\lambda \sum _{p\in \Omega }{E_R^p\left( k_p\right) }}\\&\text {subject to}\quad {z_p=E_S^p{\left( {p;k_p}\right) }} \end{aligned} \end{aligned}$$
(17)

By using the method in [19] to solve the optimization problem of the model introduced in Eq. (17), the displacement field that minimizes the energy function can be obtained by the following steps:

$$\begin{aligned} {k_p^{j+1}} =\,&\underset{k_p,p\in \Omega }{argmin}{E_R^p(k_p) +\frac{\rho }{2\lambda }{||E_S^p(p;k_p)-E_S^p(p;k_p^j)+\bar{E}_S^j-\bar{z}^j+u^j||_2^2}} \end{aligned}$$
(18)
$$\begin{aligned} \bar{z}^{j+1} =\,&\underset{\bar{z}}{argmin}{E_D(L\bar{z}) +\frac{\rho {L}}{2}{||\bar{z}-u^j-\bar{E}_S^{j+1}||_2^2}}\end{aligned}$$
(19)
$$\begin{aligned} u^{j+1} =&u^j+\bar{E}_S^{j+1}-\bar{z}^{j+1} \end{aligned}$$
(20)

where \(\bar{E}_S^{j}=\frac{1}{L}{\sum _{p\in \Omega }{E_S^p(k_p^j)}}\) and \(\bar{z}^{j}=\frac{1}{L}{\sum _{p\in \Omega }{z_p^j}}\), and L is the total number of pixels in the image domain \(\Omega\). The penalty factor \(\rho\), introduced by the augmented Lagrangian transformation, initially takes a constant value greater than 1 and is updated in subsequent iterations to accelerate convergence.

  • \(\bar{z}\)-update:

\(E_D\) can be described in terms of redundant variables \(\bar{z}\), and can be reduced to the following form:

$$\begin{aligned} E_D=\sum _{p\in \Omega }{\left( 1-z_p\right) }=L-L\bar{z} \end{aligned}$$
(21)

Moreover, the gradient of \(\frac{\rho {L}}{2}||\bar{z}-u^j-\bar{E}_S^{j+1}||_2^2\) with respect to \(\bar{z}\) equals \(\rho {L}{\left( {\bar{z}-u^j-\bar{E}_S^{j+1}}\right) }\), and the gradient of \(E_D\) with respect to \(\bar{z}\) is \(-L\). Since the solution is optimal when the gradient of the objective vanishes, the \(\bar{z}\)-update admits the analytical solution:

$$\begin{aligned} \bar{z}^{j+1}=\frac{1}{\rho }+u^{j}+\bar{E}_S^{j+1} \end{aligned}$$
(22)
  • u-update:

Substituting the optimal \(\bar{z}\) into the u-update, we find that \(u^{j+1}\) also takes the fixed form \(\frac{1}{\rho }\), and therefore changes only when \({\rho }\) is updated.

  • k-update:

Combining the gradients obtained in Sect. 2, we can obtain the gradient of the function in Eq. (18) with respect to \(k_p\). Therefore, the optimal solution of \(k_p\) can be obtained by some gradient-based optimization algorithms. In this paper, we use the quasi-Newton limited-memory BFGS algorithm in the minFunc [26] package to find the optimal solution.

  • stopping criterion:

The algorithm satisfies the stopping criterion when both the primal residual and the dual residual reach very small values. The residuals can be expressed in terms of \(\bar{z}\) as follows.

$$\begin{aligned} \begin{aligned} ||r^{j+1}||_2^2&=L\cdot ||\bar{z}^{j+1}-\bar{E}_S^{j+1}||_2^2 \\ ||s^{j+1}||_2^2&=\rho \cdot ||\bar{z}^{j+1}-\bar{z}^{j}||_2^2 \end{aligned} \end{aligned}$$
(23)
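Putting the updates above together, a schematic sketch of the outer DADMM loop is given below. The k-update is delegated to a placeholder solve_k_subproblem standing in for the L-BFGS solve of Eq. (18), and the tolerance, iteration cap, and penalty schedule for \(\rho\) are illustrative.

```python
import numpy as np

def dadmm_registration(k, E_S, solve_k_subproblem, L,
                       rho=2.0, lam=0.1, tol=1e-4, max_iter=100):
    """Schematic DADMM outer loop for Eqs. (18)-(23).

    k                 : control-point displacement field
    E_S               : callable returning the per-pixel similarities E_S^p(k_p)
    solve_k_subproblem: placeholder for the L-BFGS solve of Eq. (18)
    L                 : number of pixels in the image domain Omega
    """
    u = 0.0
    z_bar = np.mean(E_S(k))
    for _ in range(max_iter):
        # k-update, Eq. (18): regularization-prioritised subproblem
        k = solve_k_subproblem(k, z_bar, u, rho, lam)
        E_bar = np.mean(E_S(k))
        # z-update, Eq. (22): closed form from the linear E_D of Eq. (21)
        z_prev = z_bar
        z_bar = 1.0 / rho + u + E_bar
        # u-update, Eq. (20)
        u = u + E_bar - z_bar
        # residuals and stopping criterion, Eq. (23)
        r2 = L * (z_bar - E_bar) ** 2
        s2 = rho * (z_bar - z_prev) ** 2
        if r2 < tol and s2 < tol:
            break
        rho *= 1.05            # illustrative penalty schedule to speed convergence
    return k
```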

Finally, the registration process is shown in Algorithm 1. Meanwhile, the algorithm can be enhanced by introducing an image pyramid to exploit image information at multiple scales. The specific registration process is shown in Fig. 2.

Fig. 2
figure 2

Illustration of the registration process. Registration starts at the highest level of the pyramid and the deformation field is estimated at the highest level. Subsequently, the deformation field at the highest level is used as initialization for image registration at the next level. This method iteratively propagates the registration process from top to bottom and applies a series of similar operations. Finally, a warped image of the same size as the original image and the corresponding deformation field are obtained
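A minimal coarse-to-fine sketch of the pyramid strategy in Fig. 2 is given below; register_single_level is a placeholder for one run of Algorithm 1, and the number of levels and the scale factor are illustrative.

```python
import numpy as np
from scipy.ndimage import zoom

def pyramid_registration(I, J, register_single_level, levels=3):
    """Coarse-to-fine registration: estimate the field at the coarsest level,
    then upsample it as the initialization for the next finer level."""
    d = None
    for level in reversed(range(levels)):          # levels-1 (coarsest) ... 0 (full resolution)
        scale = 0.5 ** level
        I_l = zoom(I, scale, order=1)
        J_l = zoom(J, scale, order=1)
        if d is None:
            d = np.zeros((I.ndim,) + I_l.shape)    # zero initialization at the coarsest level
        else:
            # upsample the previous field to the new grid and rescale its magnitude
            factors = np.array(I_l.shape) / np.array(d.shape[1:])
            d = np.stack([zoom(d[i], factors, order=1) * factors[i] for i in range(I.ndim)])
        d = register_single_level(I_l, J_l, d)     # one run of Algorithm 1 at this level
    return d
```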

Algorithm 1
figure a

Image Registration Process

3 Experiments

To validate the effectiveness of the proposed registration method, we conduct experiments on various image datasets [27, 28], including the 4D-CT image dataset and the COPD image dataset. Details of the datasets, experimental results and analysis are described below.

3.1 4D-CT Dataset

The 4D-CT dataset comprises sequences of chest CT images with landmarks across the respiratory cycle. It contains ten cases, each consisting of 3D CT images of the same resolution acquired at ten different phases of the respiratory cycle of the same patient. The dataset provides 75 expert landmark points for each image over the full respiratory cycle, and an additional 300 expert landmark points for the extreme expiratory and extreme inspiratory phase images as references. These landmark points are used to assess registration accuracy.

In the image registration experiments on the 4D-CT dataset, the extreme expiratory phase image (T50) is chosen as the moving image and the extreme inspiratory phase image (T00) is chosen as the fixed image. The images are cropped to the appropriate size by extending the range of the given landmark points, and then the voxels are resized to \(1\times {1}\times {1}\, {\text{mm}}^3\). At the same time, the intensity values of the images are restricted to [80, 900], followed by the normalization operation. After comparing the experimental results for values of \(\lambda\) ranging from 0 to 1, the coefficient that minimizes the mean TRE of the entire 4D-CT dataset is considered optimal. The registration results with an optimal coefficient value of 0.14 are chosen for display.
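A minimal sketch of this preprocessing is given below, assuming the volume has already been cropped to the extended landmark bounding box; spacing is the original voxel spacing from the CT header, and min-max normalization to [0, 1] is an assumption, as the exact normalization scheme is not specified.

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess_ct(volume, spacing, lo=80.0, hi=900.0):
    """Resample to 1x1x1 mm voxels, clip intensities to [80, 900], then normalize.

    spacing : original voxel spacing in mm per axis (assumed read from the CT header)
    """
    volume = zoom(volume, np.asarray(spacing, dtype=float), order=1)  # isotropic 1 mm grid
    volume = np.clip(volume, lo, hi)
    return (volume - lo) / (hi - lo)        # assumed min-max normalization to [0, 1]
```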

The 3D displacement field for Case 8 is visualized in Fig. 3. It can be observed that the voxels in the lower lobe of the lung move in an upward direction, while the voxels in the upper lobe move in a cyclonic direction, which verifies that the obtained displacement field can accurately match the motion during inspiration. It is also possible to verify the accuracy of the displacement field by comparing the expected results given in the dataset with those obtained in this paper, see Fig. 4. In order to show the displacement field of the 4D-CT dataset more clearly, the rendering of lung images with the displacement field is also shown in Fig. 5.

Fig. 3
figure 3

Visualization of displacement field in 4D-CT Case 8. Anterior (a) and posterior (b) perspectives are color-coded with the displacement magnitude in [0,15] mm

Fig. 4
figure 4

CT image slices of the 4D-CT Case 8 with displacement vectors. Coronal, axial, and sagittal views from left to right, with the displacement magnitude in [0,15] mm

Fig. 5
figure 5

Rendering image of the displacement field for 4D-CT Case 8. The displacement vector magnitude range is [0,15] mm

To visually demonstrate the accuracy of the proposed method, we select Case 5 for detailed analysis (see Fig. 6). Figure 6a and b show the fixed and moving images, respectively. Figure 6c illustrates the overlay difference between these images. The registration results of DADMMreg and isopTV are depicted in Fig. 6d and e, respectively. To emphasize the disparity between the results, we visualize the difference between the two warped images in Fig. 6f. We also present the difference between the warped and fixed images for the two methods in Fig. 6g and h, respectively. By comparing the degrees of difference, the registration quality can be assessed.

Figure 6g and h show that the difference between the warped and fixed images is smaller for DADMMreg than for isopTV, which is particularly evident at the lung parenchyma boundary. Furthermore, Fig. 6 illustrates the close similarity between the two methods, as seen in the large proportion of regions whose colors tend toward blue. Closer analysis reveals the superior accuracy of our method, evident in the fewer non-zero difference values in the images. Remarkably, this distinction is particularly clear at the lung boundary, affirming the reliability of the proposed method and highlighting its advantage at sliding interfaces.

In addition, we compare the obtained TRE results with those of other methods in Table 1 to quantify the accuracy of the proposed registration method. The mask-based classical registration method NLR [13], the MRF-based methods LO-MRF and HO-MRF [17], and the regularization-based methods aTV, isopTV, and pTVreg [18, 20, 21] are selected for comparison. In the table, bold values indicate the best result obtained for each Case across the compared methods.
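For reference, a minimal sketch of the mean TRE computation from paired landmarks and the estimated displacement field is given below; the assumed convention is that the field maps fixed-image voxel coordinates (in mm on the resampled 1 mm grid) toward the moving image, and the nearest-voxel lookup is a simplification of the usual interpolation.

```python
import numpy as np

def mean_tre(fixed_pts, moving_pts, displacement):
    """Mean target registration error (TRE) in mm.

    fixed_pts, moving_pts : (M, 3) landmark coordinates in voxel units (1 mm grid)
    displacement          : (3, X, Y, Z) field, assumed to map fixed-image voxels
                            toward the moving image
    """
    errors = []
    for p_f, p_m in zip(fixed_pts, moving_pts):
        idx = tuple(np.round(p_f).astype(int))               # nearest-voxel lookup
        d_p = np.array([displacement[i][idx] for i in range(3)])
        errors.append(np.linalg.norm(p_f + d_p - p_m))
    return float(np.mean(errors))
```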

The results in Table 1 show that the proposed method achieves accurate registration on the 4D-CT dataset, with a mean TRE of about 0.91 mm. Compared with the other methods, it reduces the registration error, obtains higher registration accuracy, and achieves the best overall results. Moreover, its performance is balanced across Cases, indicating a degree of robustness.

Table 1 Average TRE results on 4D-CT dataset image registration experiments

3.2 COPD Dataset

The Chronic Obstructive Pulmonary Disease (COPD) dataset, provided by the DIR database, also contains 10 Cases with landmark points. However, unlike the 4D-CT dataset, the COPD dataset only provides 3D CT lung images at the extreme phases, with 300 landmark points per image as the final reference.

In the experiments on the COPD dataset, the end-expiratory phase image (T50) is again set as the moving image, and the end-inspiratory phase image (T00) as the fixed image. The images are cropped to the corresponding size by extending the labeled range, and the image voxels are resized to \(1\times {1}\times {1}\,{\text{mm}}^3\). Additionally, the intensity values of the images are constrained to \(\left[ 80,900\right]\), followed by a normalization operation. After comparing the experimental results for \(\lambda \in \left[ 0,1\right]\), the registration results at \(\lambda =0.04\), the value that minimizes the mean TRE over the entire COPD dataset, are selected for display.

Similarly, the 3D displacement field visualization for Case 8 of the COPD dataset is shown in Fig. 7. To show the motion of the displacement field more clearly, lung slice images with the displacement field are shown in Fig. 8, and the rendered lung image with the displacement field is visualized in Fig. 9. The accuracy of the obtained displacement field can be verified by examining its direction and comparing it with the displacement field images supplied with the dataset.

Fig. 6
figure 6

Comparison results for 4D-CT Case 5. a Fixed image, b moving image, c overlay difference between a and b, d warped image of DADMMreg, e warped image of isopTV, f difference value between d and e, g, h are the difference results between warped and fixed images using DADMMreg and isopTV, respectively

Fig. 7
figure 7

Visualization of the 3D displacement field in registered COPD Case 8. Results of anterior (a) and posterior (b) perspectives. Color-coded vector map with the displacement magnitude range from [0,60] mm

Fig. 8
figure 8

CT image slices of COPD Case 8 with overlaid displacement maps. Coronal, axial, and sagittal views from left to right. The magnitude of the displacement vector ranges from 0 to 60 mm

Fig. 9
figure 9

Rendering image of the displacement field for COPD Case 8. The displacement vector magnitude range is [0,60] mm

At the same time, using the isopTV method for comparison, Case 6 is selected for detailed analysis, and the effectiveness of the proposed method is illustrated visually in Fig. 10. As shown in Fig. 10, the direction of the displacement field obtained by DADMMreg is approximately the same as that obtained by isopTV. In the lower-right region of the images, the displacement field obtained by DADMMreg exhibits a smaller error than that obtained by isopTV, as evidenced by the bluer color tones and the reduced range of color fluctuation. These observations collectively highlight the superior registration quality achieved by DADMMreg.

Moreover, the corresponding quantitative registration results are shown in Table 2. The mask-based classical method NLR [13], the MRF-based methods MRF, LO-MRF, and HO-MRF [16, 17], and the regularization-based methods isopTV and pTVreg [20, 21] are selected for comparison.

The results in Table 2 show that DADMMreg achieves a mean TRE of about 0.92 mm, demonstrating favorable registration performance on the COPD dataset. Although the proposed method does not perform best, it still holds an advantage over several methods and attains the second-best registration results. Compared with the best-performing pTVreg, the difference in mean TRE is only about 0.07 mm. These results further demonstrate that DADMMreg achieves competitive registration quality and accuracy.

Table 2 Average TRE results on COPD dataset image registration experiments

3.3 Verification of Regularization Term

To assess the smoothing effect of the proposed regularization term, comparison experiments are performed on the COPD dataset, with Case 1 selected for the experiments. We compare the displacement field registered using the SSIM variant alone with the displacement field obtained when the regularization term is incorporated; the results are presented in Fig. 11. Moreover, to clarify the effect of the proposed regularization term on the displacement field at the sliding interface, we compare it with two other regularization terms, the \(\hbox {L}_2\) regularization term and the isopTV regularization term, as shown in Fig. 12.

The experimental results in Fig. 11 demonstrate that incorporating the proposed regularization term effectively enhances the smoothness of the displacement field in the registration results: displacement folding is notably reduced and the extent of displacement distortion is mitigated. Parameter tuning is also performed for reference. The results show that larger values of \(\lambda\) generally yield stronger smoothing, but excessive values may compromise the accuracy of the displacement prediction. After empirical tests, a final value of 0.04 is chosen to balance enhanced smoothness against accurate displacement prediction.

In addition, as the landmark in Fig. 12 sits at the sliding interface, the surrounding displacement field tends to be non-smooth. From Fig. 12, it is evident that the proposed regularization term outperforms \(\hbox {L}_2\) regularization, notably seen in the small and uniform displacement on the right side of the landmark. In comparison with isopTV regularization, the proposed regularization term effectively avoids over-smoothing, which is evident in the displacement field curve around the landmark.

Fig. 10
figure 10

Displacement fields at 300 landmark points in COPD Case 6. Results of isopTV (a) and DADMMreg (b). The color-coded modulus difference between the estimated and the true displacement, ranging from 0 to 16 mm

Fig. 11
figure 11

Comparison of the image registration displacement field. a Fixed image, b moving image, ce are the displacement field after registration with \(\lambda =0\), \(\lambda =0.04\), \(\lambda =0.1\), respectively

Fig. 12
figure 12

Comparison of displacement fields with various regularization terms. Left: fixed image; middle: moving image; right: displacements of the pixel at the horizontal position of the red landmark. The green line is used to indicate the position

3.4 Verification of Optimization Algorithm

To test the proposed optimization algorithm, we compare DADMMreg with isopTV and pTVreg for the same problem formulation. Figure 13 compares the image registration energy E(d(k)) during iterations and Fig. 14 compares the mean TRE for one paired image. The experiment uses the same deformation field initialization and is performed at the same pyramid level (i.e., the single layer with the original resolution) to facilitate visualization of the results.

Fig. 13
figure 13

Comparison of the image registration energy. Experiment results on 4D-CT Case 1 (a) and COPD Case 8 (b)

Fig. 14
figure 14

Comparison of the mean TRE value. Experiment results on 4D-CT Case 1 (a) and COPD Case 8 (b)

From the figures, it can be seen that, similar to isopTV and pTVreg, the energy and TRE values obtained by DADMMreg decrease over the iterations. However, unlike isopTV, DADMMreg converges to lower energy values with lower error, indicating better optimization performance. Furthermore, DADMMreg converges in fewer steps than pTVreg while achieving comparable results, suggesting an enhanced ability to avoid local extrema during optimization.

4 Discussions

A thorough understanding of the impact of attention allocation strategies is important for improving registration performance and evaluating the contribution of each metric. In DADMMreg, the decomposition-based optimization framework is maintained and the attention allocation strategy is adjusted by prioritizing the optimization of the regularization term, leading to better convergence behavior. Unlike the prevailing deep learning trend, DADMMreg does not rely on learned convolutional features or extensive training datasets. Compared with existing methods, DADMMreg achieves the lowest average error on the 4D-CT dataset and second-best performance on the COPD dataset. Nonetheless, several points remain worth discussing.

First, the experimental results show that DADMMreg exhibits lower registration performance in images with high mean intensity and significant organ displacement (i.e., COPD dataset). This is mainly attributed to the use of consistent parameters across different motion estimates. In general, the use of homogeneous smoothing priors poses a challenge when approximating non-smooth motions [20], leading to mis-registration issues. Future research should focus on optimizing the constraints on the regularization term to take into account the trade-off between the non-smooth interface (e.g., organ boundary) and the smooth part (e.g., organ parenchyma).

Additionally, excessive interpolation can negatively impact registration performance. In the case of DADMMreg, its pyramid strategy involves frequent interpolation. Although DADMMreg can achieve satisfactory performance at a single resolution level, the frequent interpolation may result in the loss of original image details, thereby affecting registration accuracy. In turn, reducing the frequency of interpolation may pose challenges in learning complex features. Therefore, future research should focus on finding a strategy to strike a balance between interpolation frequency and registration accuracy.

Since the study [29] emphasizes the importance of accuracy and reliability of registration methods in clinical applications, further research into enhancing these aspects is warranted. Although image quality can affect registration performance and may cause failures on low-quality images, DADMMreg introduces innovative concepts and a distinctive problem-solving approach, and its methodological innovations contribute to advancing research in this area.

5 Conclusion

In this paper, the DADMM algorithm is employed to optimize image registration, enabling parallel processing and demonstrating satisfactory convergence performance. Additionally, we introduce the vector-modulus-based regularization metric and combine it with the modified SSIM similarity metric for the registration experiments. Experimental results on lung medical image datasets demonstrate the effectiveness of DADMMreg. At the same time, the results show that modifying the optimization order to adjust the attention allocation enables the algorithm to converge in fewer steps, although the impact of this change on registration accuracy may be limited. These findings are useful for evaluating registration performance and for developing implementation strategies in future studies.