
1 Introduction

Given a pair of images of the same object taken at different times or acquired by different devices, image registration aims either to find differences between them or to fuse their complementary information, which is otherwise not possible with a single modality. In either case, the key is to find a reasonable spatial geometric transformation between the two images. Although the task arises in diverse fields such as astronomy, optics, biology, chemistry and remote sensing, and particularly in medical imaging, and although much work has been done, building a robust model for the task remains a challenge. For an overview of image registration methodologies and approaches, especially for registering images acquired by the same modality (e.g. CT-CT), we refer to [17, 18, 33, 35, 40]. For a more recent survey, see [8]. This Chapter is mainly concerned with registering two images from different modalities (e.g. CT-MRI or digital-infrared) and focuses on one important question: how to impose a constraint so that the underlying transformation is diffeomorphic.

The image registration problem can be described as follows: given a fixed image R (the reference) and a moving image T (the template), both represented by scalar functions mapping \(\varOmega \subset \mathbb {R}^d\) to \(\mathbb {R}\), find a suitable geometric transformation \(\boldsymbol{\varphi }(\boldsymbol{x}) = \boldsymbol{x} + \boldsymbol{u}(\boldsymbol{x}),\ \boldsymbol{u} : \mathbb {R}^d \longrightarrow \mathbb {R}^d\), such that

$$\begin{aligned} G_1(T [\boldsymbol{\varphi }])=G_1(T(\boldsymbol{x}+\boldsymbol{u}(\boldsymbol{x})))\approx G_2(R), \end{aligned}$$
(1)

where \(G_1, G_2\) must be chosen suitably in the multi-modality scenario, because only features or patterns in \(T, R\) visually resemble each other, not their given intensities. In contrast, in mono-modality registration, where intensities as well as features in \(T, R\) resemble each other, we have \(G_i(\cdot )=I_d,\ (i=1,2)\), i.e. \(T\approx R\) pixel-wise. In the special case of parametric models, the solution \(\boldsymbol{u}\) (or \(\boldsymbol{\varphi }\)) is assumed to belong to a space spanned by known Ansatz functions and depending on a few parameters (e.g. affine, with 6 parameters in 2D or 12 parameters in 3D). However, not all problems can be solved by parametric models.

Here, we focus on variational models for deformable non-parametric image registration, where the unknown \(\boldsymbol{u}\), sought in a properly chosen functional space, is not assumed to have any parametric form. The reconstruction problem based on model (1) is an ill-posed inverse problem, and thus regularization techniques are needed to overcome the ill-posedness [7, 11, 13, 14, 21, 30, 31, 47]. Generally speaking, a regularization technique turns the ill-posed problem (1) into the well-posed optimization model

$$\begin{aligned} \min _{ \mathbf{u}\in \mathcal {H}} \Big \lbrace \mathcal {J}(\mathbf{u})=S(\boldsymbol{u})+\frac{\lambda }{2} D(T(\boldsymbol{x}+\boldsymbol{u}), R)\Big \rbrace \end{aligned}$$
(2)

where the displacement \(\boldsymbol{u}\) is a minimizer of the above joint energy functional and \(\lambda \) is a positive weight which controls the trade-off between the two terms.

In (2), the first term \(S(\boldsymbol{u})\) is a regularization term which controls the smoothness of \(\boldsymbol{u}\) and reflects our expectations by penalizing unlikely transformations. Various regularizers have been proposed: first-order derivative based models using total variation [10, 23], diffusion [15] and elastic regularization; higher-order derivative based models using linear curvature [16], mean curvature [12] and Gaussian curvature [24]; and models based on fractional-order derivatives [50]; refer also to [11, 31, 44, 51, 52].

The second term \(D(T (\boldsymbol{x}+\boldsymbol{u}),R)\) is a fidelity measure, which quantifies the distance or similarity between the transformed template image \(T(\boldsymbol{x}+\boldsymbol{u})\) and the reference R. For mono-modal registration, a widely-used data fidelity term is the sum of squared differences \(D=\Vert T (\boldsymbol{x}+\boldsymbol{u})-R\Vert ^2_2\equiv \mathrm {SSD}(T (\boldsymbol{x}+\boldsymbol{u}),R)\). However, for multi-modality registration the choice of \(D(T (\boldsymbol{x}+\boldsymbol{u}),R)\) is more challenging. The main issue is how to design the right (or rather better) similarity measures that can accommodate the differences (in features, colours, gradients, illumination etc.) between images from different modalities (e.g. SSD no longer makes sense). Various measures have been proposed and tested in the literature. Designing a measure based on geometric information such as the gradients of the images is a good choice. See for instance the normalized gradient field (\(\mathbf {NGF}\)) [22, 26, 39], edge-sketching registration [1], normalized gradient fitting (\(\mathbf {GT}\)) [22, 43] and Mutual Information [29, 37, 46]. Recently [9] proposed a cross-correlation similarity measure based on reproducing kernel Hilbert spaces and found advantages over Mutual Information.

Many models of type (2) in the literature do not contain constraints ensuring that \(\boldsymbol{\varphi }(\boldsymbol{x})\) is a diffeomorphic map, even for mono-modal registration, and even fewer theoretical or experimental studies deal with diffeomorphic maps for multi-modal registration. Yet non-diffeomorphic maps cause phenomena such as folding or tearing, which are usually seen as non-natural transformations between the two images, unless \(\lambda \) is small (implying a poor registration fidelity error). Over the last decade, more and more researchers have focused on diffeomorphic image registration, in which folding, measured by the local invertibility quantity \(\det (J_{\boldsymbol{\varphi }})\), the Jacobian determinant of \(\boldsymbol{\varphi }\), is reduced or avoided. Under suitable assumptions, obtaining a one-to-one mapping is a natural choice, see [7, 14, 19, 20].

After surveying a few models of type (2) for multi-modal images, this Chapter shows how to incorporate a suitable constraint into a model so that it can deliver a diffeomorphic map. We illustrate our idea by a specific model: minimizing a new functional based on using reformulated normalized gradients of the images as the fidelity term [43], higher-order derivatives and a new Beltrami coefficient based term [28, 48]. An effective, iterative scheme is also presented and numerical experimental results show that the new registration model has a good performance.

2 Review of Related Models

For a variational image registration model (2), while there exist many choices for a regularizer \(S(\boldsymbol{u})\), such as the diffusion operator or the Laplacian [8], below we briefly review a few choices of \(D(T(\boldsymbol{x}+\boldsymbol{u}), R)\) for registering a pair of multi-modal images \(T, R\).

Normalized Gradient Field (NGF) and its variants. The basic idea of NGF [22, 26, 39] is to use information derived from the image intensity, namely the gradient. Similarity measures depending on the gradients or geometry of the images, which naturally encode information about shape, can be better suited to multi-modal data. The aim is to align the gradients \(\nabla T(\boldsymbol{x}+\boldsymbol{u})\) and \(\nabla R\) by minimizing the cosine distance between them. More precisely, at each point \(\boldsymbol{x} \in \varOmega \) one tries to find a displacement \(\boldsymbol{u}(\boldsymbol{x})\) such that \(\cos \Theta =1\), where \(\Theta \) is the angle between \(\nabla T(\boldsymbol{x}+\boldsymbol{u})\) and \(\nabla R\), which leads to minimizing the similarity term:

$$\begin{aligned} D^{NGF}(T(\boldsymbol{x}+\boldsymbol{u}),R)=\int \limits _\varOmega (1- (\cos \Theta )^2)\,\mathrm {d}\boldsymbol{x}=\int \limits _\varOmega (1- (\nabla _n T(\boldsymbol{x}+\boldsymbol{u})\cdot \nabla _n R)^2)\,\mathrm {d}\boldsymbol{x}, \end{aligned}$$
(3)

where \(\nabla _n T(\boldsymbol{x}+\boldsymbol{u}) =\nabla T(\boldsymbol{x}+\boldsymbol{u}) / |\nabla T(\boldsymbol{x}+\boldsymbol{u})| \) and \(\nabla _n R= \nabla R / |\nabla R|\) are normalized unit vectors. An alternative form of the NGF, which avoids the terms \(\nabla _n T(\boldsymbol{x}+\boldsymbol{u}) \) and \(\nabla _n R \) that degenerate in homogeneous regions (where gradients vanish), is the reformulation

$$\begin{aligned} D^{NGF}(T(\boldsymbol{x}+\boldsymbol{u}),R)=\int \limits _\varOmega (|\nabla T(\boldsymbol{x}+\boldsymbol{u})|^2 |\nabla R|^2- (\nabla T(\boldsymbol{x}+\boldsymbol{u})\cdot \nabla R)^2)\,\mathrm {d}\boldsymbol{x}. \end{aligned}$$
(4)
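To make (4) concrete, the following is a minimal Matlab sketch of the reformulated NGF measure on a discrete grid; the central-difference scheme (via the built-in gradient) and the midpoint quadrature are illustrative choices, not necessarily those used later in this Chapter.

```matlab
% Minimal sketch of the reformulated NGF measure (4).
% T, R: 2D arrays of equal size; h: grid spacing.
function d = ngf_measure(T, R, h)
  [Tx, Ty] = gradient(T, h);            % discrete gradient of T
  [Rx, Ry] = gradient(R, h);            % discrete gradient of R
  nT2 = Tx.^2 + Ty.^2;                  % |grad T|^2
  nR2 = Rx.^2 + Ry.^2;                  % |grad R|^2
  tr  = Tx.*Rx + Ty.*Ry;                % grad T . grad R
  d   = h^2 * sum(nT2(:).*nR2(:) - tr(:).^2);   % midpoint quadrature
end
```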

Mutual Information (MI). MI was first proposed in [46] and has been studied extensively in the literature (see [29, 37]), showcasing both its great capability and its limitations. The basic idea is to compare the histograms of the images by exploiting the following quantity

$$\begin{aligned} D^{MI}(T(\boldsymbol{x}+\boldsymbol{u}),R)=-\int \limits _{\mathbb {R}^2} p_{T,R}(t,r) \log \dfrac{p_{T,R}(t,r) }{p_{T}(t) p_{R}(r) }\,\mathrm {d}t\mathrm {d}r, \end{aligned}$$
(5)

where \(p_R, p_T\) are probability distributions of the gray values in R and T, while \(p_{T,R}\) is the joint probability of the gray values, which can be derived from the joint histogram. The main drawbacks of \(\mathbf {MI}\) are its sensitivity to image quantization and the difficulty of estimating the joint probability density function (PDF). In addition, the measure also fails when two features with different intensities in one image have similar intensities in the other [27].
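As an illustration, the following Matlab sketch evaluates the MI quantity of (5) from a joint histogram; the function name mi_measure and the bin count nb are our own illustrative choices (histcounts2 requires R2015b or later).

```matlab
% Sketch of the MI quantity in (5) via a joint histogram.
% T, R: equal-size arrays; nb: number of gray-level bins (illustrative).
function mi = mi_measure(T, R, nb)
  pTR = histcounts2(T(:), R(:), nb, 'Normalization', 'probability');
  pT  = sum(pTR, 2);                    % marginal distribution of T
  pR  = sum(pTR, 1);                    % marginal distribution of R
  pI  = pT * pR;                        % product of the marginals
  idx = pTR > 0;                        % skip empty bins (log of 0)
  mi  = sum(pTR(idx) .* log(pTR(idx) ./ pI(idx)));  % so D^MI = -mi
end
```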

Maximum Correlation Coefficient (MCC). MCC extends the well-known normalized cross-correlation (\(\mathbf {CC}\)) measure, which is only effective for mono-modal images [6, 33], to a measure that can handle multi-modal images [9]. The similarity measure is defined by

$$D^{MCC}(T(\boldsymbol{x}+\boldsymbol{u}),R)=(1- \mathbf {MCC}(T,R))^p:= (1- \max _{f,g}\mathbf {CC}(M,N))^p,\; 0<p<1,$$

where \(M(\boldsymbol{x}) = f(T(\boldsymbol{x} + \boldsymbol{u}))\), \(N(\boldsymbol{x}) = g(R(\boldsymbol{x}))\), and f and g are two measurable functions. This \(\mathbf {MCC}\) formulation does not require estimation of the continuous joint PDF and offers a powerful alternative to models based on maximizing \(\mathbf {MI}\). However, the computation of the maximum over all functions f and g is a big challenge. The approach recommended in [9] is to approximate it using the theory of reproducing kernel Hilbert spaces (\(\mathbf {RKHS}\)) [2, 5].

3 The New Model

We aim to design a variational model building on an energy of the form (2):

$$\begin{aligned} \min _{ \mathbf{u}\in \mathcal {H}} \Big \lbrace \mathcal {J}(\mathbf{u})=S(\boldsymbol{u})+D(T(\boldsymbol{x}+\boldsymbol{u}), R) + \gamma C(\boldsymbol{u})\Big \rbrace \end{aligned}$$
(6)

which comprises three building blocks: a data fidelity term with similarity measure D, a regularization term S and a control term C. The emphasis of this Chapter is on how to choose C. To make this concrete, we now specify our choice of all three terms.

3.1 Data Fitting

We consider a similarity measure based on gradient information [43]. This measure is motivated by the standard NGF [22, 32] and explores the potential of normalized gradients beyond their standard form. We shall consider normalized gradient fitting combined with a measure based on the triangle inequality. More precisely, we consider the following fitting term

$$\begin{aligned} D(T(\boldsymbol{x}+\boldsymbol{u}), R)= D^{GF}(\boldsymbol{u})+\alpha D^{TM}(\boldsymbol{u}) \end{aligned}$$
(7)

where GF stands for ‘gradient field difference’ and TM for ‘triangular measure’, with

$$\begin{aligned} \begin{aligned} D^{GF}(\boldsymbol{u})&=\int \limits _{\varOmega }|\nabla _{n}T(\boldsymbol{x}+\boldsymbol{u})-\nabla _{n}R|^{2}\mathrm {d}\boldsymbol{x},\\ D^{TM}(\boldsymbol{u})&=\int \limits _{\varOmega }(|\nabla T(\boldsymbol{x}+\boldsymbol{u})|+|\nabla R|-|\nabla T(\boldsymbol{x}+\boldsymbol{u})+\nabla R|)^{2}\mathrm {d}\boldsymbol{x}. \end{aligned} \end{aligned}$$

3.2 Regularization

A regularizer controls the smoothness of the displacement. Our primary choice is the diffusion model [15], which uses first-order derivatives to promote smoothness. Since affine linear transformations are not contained in the kernel of this \(H^1\)-regularizer, they would be penalized unnecessarily; we therefore desire a regularizer that does not penalize such transformations. To this end, we add a regularizer based on second-order derivatives (LLT), whose kernel contains affine maps, which removes the need for any affine pre-registration step. The second-order derivatives also yield smooth transformations [52]. Our adopted regularizer is given by

$$\begin{aligned} S(\boldsymbol{u})=\frac{\beta _{1}}{2}S_{1}(\boldsymbol{u})+\frac{\beta _{2}}{2}S_{2}(\boldsymbol{u}) \end{aligned}$$
(8)

where

$$\begin{aligned} \begin{aligned} S_{1}(\boldsymbol{u})&=\int \limits _{\varOmega }|\nabla \boldsymbol{u}|^{2}\mathrm {d}\boldsymbol{x}, \;\;\;\;\ S_{2}(\boldsymbol{u})=\int \limits _{\varOmega }|\nabla ^{2} \boldsymbol{u}|^{2}\mathrm {d}\boldsymbol{x}.\end{aligned} \end{aligned}$$

3.3 Invertibility

A diffeomorphic map ensures local invertibility, and this is achievable by a control term C that imposes the constraint \(\det (J_{\boldsymbol{\varphi }})>0\) at every \(\boldsymbol{x}\in \varOmega \). This idea is much used in the literature, with somewhat limited success, because either strong assumptions on \(T, R\) or a compromised fidelity error are required; see the tests and remarks in [48]. Here, instead of controlling \(\det (J_{\boldsymbol{\varphi }})\) directly, we control the Beltrami coefficient [48] to obtain a diffeomorphic map, and propose the use of

$$\begin{aligned} C(\boldsymbol{u}) = \!\int \limits _{\varOmega }\!\phi (|\mu (\boldsymbol{u})|^{2})\mathrm {d}\boldsymbol{x}, \end{aligned}$$
(9)

where \(\phi (v)=\frac{v^{2}}{(v-1)^{2}}\) and \(|\mu (\boldsymbol{u})|^{2}=\frac{(\partial _{x_{1}}u_{1}-\partial _{x_{2}}u_{2})^{2} +(\partial _{x_{2}}u_{1}+\partial _{x_{1}}u_{2})^{2}}{(\partial _{x_{1}}u_{1}+\partial _{x_{2}}u_{2}+2)^{2} +(\partial _{x_{2}}u_{1}-\partial _{x_{1}}u_{2})^{2}}\).
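For intuition, the following Matlab sketch evaluates \(|\mu (\boldsymbol{u})|^{2}\) and \(\phi \) pointwise from the four first derivatives of \(\boldsymbol{u}\); the inputs (e.g. finite-difference arrays) and the function name are illustrative assumptions.

```matlab
% Pointwise sketch of the Beltrami quantities in (9).
% u1x = d(u1)/dx1, u1y = d(u1)/dx2, u2x = d(u2)/dx1, u2y = d(u2)/dx2.
function [mu2, phival] = beltrami_control(u1x, u1y, u2x, u2y)
  num    = (u1x - u2y).^2 + (u1y + u2x).^2;      % numerator of |mu|^2
  den    = (u1x + u2y + 2).^2 + (u1y - u2x).^2;  % denominator of |mu|^2
  mu2    = num ./ den;                 % |mu|^2 < 1 corresponds to det(J) > 0
  phival = mu2.^2 ./ (mu2 - 1).^2;     % phi(v) = v^2/(v-1)^2, blows up at v = 1
end
```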

One notes that our choice of the first two terms \(S, D\) for (6) is quite common, while the third term [48] is relatively new to readers. This is the key idea of this Chapter: an old, non-diffeomorphic variational model of form (2) can be converted to a diffeomorphic model by adding a control term such as C from (9). This can be done in 2D and also in 3D following our recent work. It should be remarked that model (6) is non-convex, so its solutions are not unique (as is true for all registration models). However, we can show that the model admits at least one solution in the space \(W^{2,2}(\varOmega )\), following the idea of [49].

4 The Solution Algorithm

Here, we choose the first-discretize-then-optimize approach: we directly discretize the variational model to obtain a discrete optimization problem and then use optimization methods to solve it. In this section we focus on a Gauss-Newton (G-N) method; in the next section we briefly introduce an alternative, alternating iteration method just before numerical results are shown.

4.1 Discretization

In the implementation, we employ the nodal grid and define a spatial partition

$$\varOmega _{h}^{n} = \{\boldsymbol{x}^{i,j}\in \varOmega | \boldsymbol{x}^{i,j} =(x_{1}^{i},x_{2}^{j})=(ih,jh), 0 \le i \le n , 0 \le j \le n\},$$

where \(h = \frac{1}{n}\) and the discrete domain consists of \(n^{2}\) cells of size \(h \times h\). We discretize the displacement field \(\boldsymbol{u}\) on the nodal grid, namely \(\boldsymbol{u}^{i,j} = (u_{1}^{i,j},u_{2}^{i,j}) = (u_{1}(x_{1}^{i},x_{2}^{j}), u_{2}(x_{1}^{i},x_{2}^{j}))\). By lexicographical ordering, we reshape the four coordinate and displacement arrays into two long vectors in \(\mathbb {R}^{2(n+1)^{2}\times 1}\):

$$\begin{aligned} X&= (x_{1}^{0},x_{1}^{1},...,x_{1}^{n},\ldots , x_{1}^{0},x_{1}^{1},...x_{1}^{n}, x_{2}^{0},x_{2}^{0},...,x_{2}^{0},\ldots ,x_{2}^{n},x_{2}^{n},...x_{2}^{n})^{T}, \\ U&= (u_{1}^{0,0}, ..., u_{1}^{n,0},\ldots ,u_{1}^{0,n}, ..., u_{1}^{n,n}, u_{2}^{0,0}, ..., u_{2}^{n,0},\ldots ,u_{2}^{0,n}, ..., u_{2}^{n,n})^{T}. \end{aligned}$$
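As a small illustration of this ordering, the following Matlab sketch builds the nodal grid and the vectors X and U (the tiny n is just for demonstration).

```matlab
% Sketch of the nodal grid and lexicographic ordering of Sect. 4.1.
n = 4;  h = 1/n;                        % n cells per direction
[x2, x1] = meshgrid(0:h:1, 0:h:1);      % (n+1) x (n+1) nodal coordinates
X = [x1(:); x2(:)];                     % x1-part then x2-part, 2(n+1)^2 x 1
U = zeros(2*(n+1)^2, 1);                % displacement in the same ordering
```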

4.1.1 Discretization of Fitting Term

Firstly, set \(\mathbf {R} = \mathbf {R}(PX) \in \mathbb {R}^{n^{2}\times 1}\) as the discretized reference image and \(\mathbf {T}(PX+PU) \in \mathbb {R}^{n^{2}\times 1}\) as the discretized deformed template image, where \(P \in \mathbb {R}^{2n^{2} \times 2(n+1)^{2}}\) is an averaging matrix from the nodal grid to the cell-centered grid. To discretize \(\nabla T\) and \(\nabla R\), we introduce two discrete operators: \(D_{1}= I_{n}\otimes \partial _{h}^{1}\) and \(D_{2}=\partial _{h}^{1}\otimes I_{n}\), where

$$\begin{aligned} \partial _{h}^{1} = \frac{1}{2h}\begin{bmatrix} -1 & 1 \\ -1 & 0 & 1\\ & \ddots & \ddots & \ddots \\ & & -1 & 0 & 1 \\ & & & -1 & 1 \end{bmatrix}\in \mathbb {R}^{n\times n}. \end{aligned}$$

Hence, the discretized \(\nabla T\) and \(\nabla R\) are \([D_{1}\mathbf {T}, D_{2}\mathbf {T}]\) and \([D_{1}\mathbf {R}, D_{2}\mathbf {R}]\) respectively. Set \(\mathrm {LT} = (\sum _{i=1}^{2}D_{i}\mathbf {T}\odot D_{i}\mathbf {T}+\epsilon )^{.1/2}\), \(\mathrm {LR} = (\sum _{i=1}^{2}D_{i}\mathbf {R}\odot D_{i}\mathbf {R}+\epsilon )^{.1/2}\) and \(\mathrm {LTR} = (\sum _{i=1}^{2}D_{i}(\mathbf {T}+\mathbf {R})\odot D_{i}(\mathbf {T}+\mathbf {R})+\epsilon )^{.1/2}\), where \(\odot \) indicates component-wise product and \((\cdot )^{.1/2}\) indicates the component-wise square root.

Then for \(D^{GF}(\boldsymbol{u})\) and \(D^{TM}(\boldsymbol{u})\), we have the following discretizations:

$$\begin{aligned} D^{GF}(\boldsymbol{u})\approx h^{2}p_{1}^{T}p_{1}, \quad D^{TM}(\boldsymbol{u})\approx h^{2}p_{2}^{T}p_{2}, \end{aligned}$$
(10)

where (using ./ to indicate the component-wise division)

$$\begin{aligned} p_{1}&= [D_{1}\mathbf {T}./\mathrm {LT}-D_{1}\mathbf {R}./\mathrm {LR}; D_{2}\mathbf {T}./\mathrm {LT}-D_{2}\mathbf {R}./\mathrm {LR}]\\ p_{2}&= \mathrm {LT}+\mathrm {LR}-\mathrm {LTR}. \end{aligned}$$
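Putting Sect. 4.1.1 together, a minimal Matlab sketch of the discrete operators and of \(p_1, p_2\) could read as follows; the sparse stencil mirrors \(\partial _{h}^{1}\) above, and Tv, Rv, epsval are assumed to be the image vectors and the smoothing parameter \(\epsilon \).

```matlab
% Sketch of D1, D2 and the fitting quantities in (10).
e  = ones(n,1);
d1 = spdiags([-e, zeros(n,1), e], -1:1, n, n) / (2*h);
d1(1,1:2)   = [-1, 1] / (2*h);          % first row of partial_h^1
d1(n,n-1:n) = [-1, 1] / (2*h);          % last row of partial_h^1
D1 = kron(speye(n), d1);                % derivative in x1
D2 = kron(d1, speye(n));                % derivative in x2
LT  = sqrt((D1*Tv).^2 + (D2*Tv).^2 + epsval);            % smoothed |grad T|
LR  = sqrt((D1*Rv).^2 + (D2*Rv).^2 + epsval);            % smoothed |grad R|
LTR = sqrt((D1*(Tv+Rv)).^2 + (D2*(Tv+Rv)).^2 + epsval);  % |grad(T+R)|
p1 = [D1*Tv./LT - D1*Rv./LR; D2*Tv./LT - D2*Rv./LR];     % as defined above
p2 = LT + LR - LTR;
DGF = h^2 * (p1'*p1);  DTM = h^2 * (p2'*p2);             % discretized (10)
```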

4.1.2 Discretization of Regularization Term

The first-order regularization term can be discretized into the following form:

$$\begin{aligned} S_{1}(\boldsymbol{u}) \approx h^{2}\sum _{i=0}^{n-1}\sum _{j=0}^{n-1}\sum _{l=1}^{2} \big (\frac{u_{l}^{i+1,j}-u_{l}^{i,j}}{h} \big )^{2} +\big (\frac{u_{l}^{i,j+1}-u_{l}^{i,j}}{h}\big )^{2} \end{aligned}$$
(11)

by using the forward difference and mid-point rule.

Define \(B_{1} = I_{n+1}\otimes \partial _{h}^{2} \in \mathbb {R}^{(n+1)^{2}\times (n+1)^{2}}\), \(C_{1} = \partial _{h}^{2}\otimes I_{n+1} \in \mathbb {R}^{(n+1)^{2}\times (n+1)^{2}}\),

$$\begin{aligned} \partial _{h}^{2} = \frac{1}{h}\begin{bmatrix} -1 & 1 \\ & \ddots & \ddots \\ & & -1 & 1 \\ & & & 0 \end{bmatrix}\in \mathbb {R}^{(n+1)\times (n+1)}, \quad A_{1} = \begin{bmatrix} B_{1} & 0\\ C_{1} & 0\\ 0 & B_{1}\\ 0 & C_{1} \end{bmatrix}\in \mathbb {R}^{4(n+1)^{2}\times 2(n+1)^{2}}, \end{aligned}$$

where \(\otimes \) denotes the Kronecker product. Then (11) can be rewritten into the following form (noting \(U\in \mathbb {R}^{2(n+1)^2\times 1}\))

$$\begin{aligned} S_{1}(\boldsymbol{u}) \approx h^{2}U^{T}A_{1}^{T}A_{1}U. \end{aligned}$$
(12)
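A possible Matlab construction of \(A_1\) and of the quadratic form (12) is sketched below, following the stencil \(\partial _{h}^{2}\) above.

```matlab
% Sketch of the first-order regularizer matrices B1, C1, A1 and S1 in (12).
np = n + 1;
d2 = spdiags([-ones(np,1), ones(np,1)], 0:1, np, np) / h;
d2(np,:) = 0;                           % zero last row of partial_h^2
B1 = kron(speye(np), d2);               % difference in x1
C1 = kron(d2, speye(np));               % difference in x2
Z  = sparse(np^2, np^2);
A1 = [B1, Z; C1, Z; Z, B1; Z, C1];      % 4(n+1)^2 x 2(n+1)^2
S1 = h^2 * (U' * (A1' * (A1 * U)));     % discretized S1(u)
```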

The second-order regularization term can be discretized into the following:

$$\begin{aligned} S_{2}(\boldsymbol{u})&\approx h^{2}\sum _{i=0}^{n-1}\sum _{j=0}^{n-1}\sum _{l=1}^{2} \Big (\frac{u_{l}^{i+1,j}-2u_{l}^{i,j}+u_{l}^{i-1,j}}{h^{2}}\Big )^{2} +\Big (\frac{u_{l}^{i,j+1}-2u_{l}^{i,j}+u_{l}^{i,j-1}}{h^{2}}\Big )^{2} \nonumber \\&+ 2h^{2}\sum _{i=0}^{n-1}\sum _{j=0}^{n-1}\sum _{l=1}^{2}\Big (\frac{u_{l}^{i,j}-u_{l}^{i+1,j}-u_{l}^{i,j+1}+u_{l}^{i+1,j+1}}{h^{2}}\Big )^{2} \end{aligned}$$
(13)

by using the central difference, mid-point rule and Neumann boundary conditions (\(l=1,2\)): \( u_{l}^{i,0} = u_{l}^{i,-1}, u_{l}^{i,n} = u_{l}^{i,n+1}, u_{l}^{0,j} = u_{l}^{-1,j}, u_{l}^{n,j} = u_{l}^{n+1,j}. \)

Further define \(B_{21} = I_{2}\otimes (I_{n+1}\otimes \partial _{h}^{3})\), \(B_{22} = I_{2}\otimes (\partial _{h}^{3}\otimes I_{n+1})\), \(C_{2} = I_{2}\otimes (E\otimes E)\), \(\tau _1=(n+1)\times (n+1)\), \(\tau _2=n\times (n+1)\), where

$$\begin{aligned} \partial _{h}^{3} = \frac{1}{h^{2}}\begin{bmatrix} -1 & 1 \\ 1 & -2 & 1\\ & \ddots & \ddots & \ddots \\ & & 1 & -2 & 1 \\ & & & 1 & -1 \end{bmatrix}\in \mathbb {R}^{\tau _1},\quad E = \frac{1}{h}\begin{bmatrix} -1 & 1 \\ & -1 & 1\\ & & \ddots & \ddots \\ & & & -1 & 1 \end{bmatrix}\in \mathbb {R}^{\tau _2}. \end{aligned}$$

Then (13) can be rewritten into the following form

$$\begin{aligned} S_{2}(\boldsymbol{u}) \approx h^{2}U^{T}A_{2}U,\quad A_{2} = B_{21}^{T}B_{21}+B_{22}^{T}B_{22}+2C_{2}^{T}C_{2}. \end{aligned}$$
(14)
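Analogously, \(A_2\) of (14) can be assembled as in the following sketch, using the stencils \(\partial _{h}^{3}\) and E above.

```matlab
% Sketch of the second-order regularizer matrix A2 in (14).
np = n + 1;
d3 = spdiags(ones(np,1) * [1, -2, 1], -1:1, np, np) / h^2;
d3(1,1:2)      = [-1, 1] / h^2;         % first row of partial_h^3
d3(np,np-1:np) = [ 1,-1] / h^2;         % last row of partial_h^3
Ed = spdiags([-ones(n,1), ones(n,1)], 0:1, n, np) / h;  % E, n x (n+1)
B21 = kron(speye(2), kron(speye(np), d3));
B22 = kron(speye(2), kron(d3, speye(np)));
C2  = kron(speye(2), kron(Ed, Ed));
A2  = B21'*B21 + B22'*B22 + 2*(C2'*C2);
S2  = h^2 * (U' * (A2 * U));            % discretized S2(u)
```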

4.1.3 Discretization of Control Term

Note that \(\phi (|\mu ( \boldsymbol{u})|^{2})\) involves only first-order derivatives and all \(\boldsymbol{u}^{i,j}\) are available at the nodal (vertex) points. Thus it is convenient first to obtain approximations at all cell centers (e.g. at \(V_5\) in Fig. 1) and then to use local linear elements to evaluate the first-order derivatives. We divide each cell (Fig. 1) into 4 triangles. In each triangle, we construct two linear interpolation functions to approximate \(u_{1}\) and \(u_{2}\). Consequently, all partial derivatives are locally constant, i.e. \(\phi (|\mu ( \boldsymbol{u})|^{2})\) is constant in each triangle.

Fig. 1 Partition of a cell, nodal points \(\square \) and center point \(\circ \). \(\triangle V_{1}V_{2}V_{5}\) is \(\varOmega _{i,j,k}\)

Set \(\mathbf{L} ^{i,j,k}(\boldsymbol{x})= (L_{1}^{i,j,k}(\boldsymbol{x}),L_{2}^{i,j,k}(\boldsymbol{x}))= (a^{i,j,k}_{1}x_{1}+a^{i,j,k}_{2}x_{2}+a^{i,j,k}_{3}, a^{i,j,k}_{4}x_{1}+a^{i,j,k}_{5}x_{2}+a^{i,j,k}_{6})\), the linear interpolant of \(\boldsymbol{u}\) on \(\varOmega _{i,j,k}\). Note that \(\partial _{x_{1}} L^{i,j,k}_{1} = a^{i,j,k}_{1}, \partial _{x_{2}} L^{i,j,k}_{1} = a^{i,j,k}_{2},\partial _{x_{1}} L^{i,j,k}_{2} = a^{i,j,k}_{4}\) and \(\partial _{x_{2}} L^{i,j,k}_{2} = a^{i,j,k}_{5}\). Then, according to the partition in Fig. 1, we have

$$\begin{aligned} \begin{aligned} C(\boldsymbol{u})=&\int \limits _{\varOmega }\phi (|\mu (\boldsymbol{u})|^{2})\mathrm {d}\boldsymbol{x}\\ \approx&\frac{h^{2}}{4}\sum _{i=1}^{n}\sum _{j=1}^{n}\sum _{k=1}^{4} \phi \Big (\frac{(a^{i,j,k}_{1}-a^{i,j,k}_{5})^{2} +(a^{i,j,k}_{2}+a^{i,j,k}_{4})^{2}}{(a^{i,j,k}_{1}+a^{i,j,k}_{5}+2)^{2} +(a^{i,j,k}_{2}-a^{i,j,k}_{4})^{2}}\Big ). \end{aligned} \end{aligned}$$
(15)

To simplify (15), define 3 vectors \(\mathbf { r}(U), \mathbf { r}^{1}(U), \mathbf {r}^{2}(U)\) \(\in \mathbb {R}^{4n^{2}}\) by \(\mathbf {r}(U)_{\ell }=\mathbf { r}^{1}(U)_{\ell } \mathbf { r}^{2}(U)_{\ell }\), \(\mathbf { r}^{1}(U)_{\ell }=(a^{i,j,k}_{1}-a^{i,j,k}_{5})^{2} +(a^{i,j,k}_{2}+a^{i,j,k}_{4})^{2}\), \(\mathbf { r}^{2}(U)_{\ell }=1\big /[(a^{i,j,k}_{1}+a^{i,j,k}_{5}+2)^{2} +(a^{i,j,k}_{2}-a^{i,j,k}_{4})^{2}]\) where \(\ell = (k-1)n^{2}+(j-1)n+i\ \in [1, 4n^2]\).

Hence, (15) becomes

$$\begin{aligned} C(\boldsymbol{u}) \approx \frac{h^{2}}{4}\boldsymbol{\phi }(\mathbf { r}(U))e^{T} \end{aligned}$$
(16)

where \(\boldsymbol{\phi }(\mathbf { r}(U)) = (\phi (\mathbf { r}(U)_{1}),...,\phi (\mathbf { r}(U)_{4n^{2}}))\) denotes the component-wise application of \(\phi \) to \(\mathbf { r}(U)\) over all triangles, and \(e = (1,...,1)\in \mathbb {R}^{4n^{2}}\).

Finally, combining the three parts above, i.e. (10), (12), (14) and (16), we obtain the discretized formulation of model (6):

$$\begin{aligned} \min _{U} J(U):= h^{2}p_{1}^{T}p_{1}+\alpha h^{2} p_{2}^{T}p_{2}+\frac{\beta _{1}h^{2}}{2}U^{T}A_{1}^{T}A_{1}U+\frac{\beta _{2}h^{2}}{2}U^{T}A_{2}U+ \frac{\gamma h^{2}}{4}\boldsymbol{\phi }(\mathbf { r}(U))e^{T}. \end{aligned}$$
(17)

Remark 1

According to the definition of \(\phi \) and \(\mathbf { r}(U)_{\ell } \ge 0\), each component of \(\boldsymbol{\phi }(\mathbf { r}(U))\) is non-negative and differentiable.

4.2 Optimization Method for the Discretized Problem (17)

In the numerical implementation, we choose a line search method to solve the resulting unconstrained optimization problem (17). Here, the basic iterative scheme is

$$\begin{aligned} U^{i+1} = U^{i}+\theta \delta U^{i}, \end{aligned}$$
(18)

where \(\delta U^{i}\) is the search direction and \(\theta \) is the step length. To guarantee a descent search direction, we employ a Gauss-Newton method, since the standard Newton method may not generate a descent direction: our exact Hessian can be indefinite.

4.2.1 Gradient and Approximated Hessian of (17)

Firstly, we compute the gradient and approximated Hessian of the discretized fitting term \(h^{2}p_{1}^{T}p_{1}+\alpha h^{2} p_{2}^{T}p_{2}\); they are, respectively:

$$\begin{aligned} \left\{ \begin{array}{lcl} d_{1} & = & 2h^{2}P^{T}(\mathrm {d}p_{1}^{T}p_{1}+\alpha \mathrm {d}p_{2}^{T}p_{2})\in {\mathbb R}^{2(n+1)^{2}\times 1},\\ \hat{H}_{1} & = & h^{2}P^{T}(\mathrm {d}p_{1}^{T}\mathrm {d}p_{1}+\alpha \mathrm {d}p_{2}^{T}\mathrm {d}p_{2})P \in \mathbb {R}^{2(n+1)^{2}\times 2(n+1)^{2}}, \end{array} \right. \end{aligned}$$
(19)

where \(\mathrm {d}p_{1} = [\Lambda D_{1}-\mathrm {diag}(D_{1}\mathbf {T}./t)\Gamma ; \Lambda D_{2}-\mathrm {diag}(D_{2}\mathbf {T}./t)\Gamma ]\), \(\mathrm {d}p_{2} = \sum _{i=1}^{2}\mathrm {diag}(D_{i}\mathbf {T}./\mathrm {LT}-D_{i}(\mathbf {T}+\mathbf {R})./\mathrm {LTR})D_{i}\), \(\Lambda = \mathrm {diag}(1./\mathrm {LT})\), \(t = \mathrm {LT}^{.3}\), \(\Gamma = \sum _{i=1}^{2}\mathrm {diag}(D_{i}\mathbf {T})D_{i}\) and \(\mathrm {diag}(v)\) is a diagonal matrix with v on its main diagonal.

Remark 2

Evaluating the deformed template image \( \mathbf {T} \) must involve interpolation, because the points \(PX+PU\) are not in general pixel points. In our implementation, we choose B-splines for the interpolation.

For the discretized regularization term \( \frac{\beta _{1} h^{2}}{2} U^{T}A_{1}^{T}A_{1}U+\frac{\beta _{2} h^{2}}{2} U^{T}A_{2}U, \) the gradient and Hessian are, respectively,

$$\begin{aligned} \left\{ \begin{array}{lcl} d_{2} & = & h^{2}(\beta _{1}A_{1}^{T}A_{1}+\beta _{2}A_{2})U \in {\mathbb R}^{2(n+1)^{2}\times 1},\\ H_{2} & = & h^{2}(\beta _{1}A_{1}^{T}A_{1}+\beta _{2}A_{2}) \in {\mathbb R}^{2(n+1)^{2}\times 2(n+1)^{2}}. \end{array} \right. \end{aligned}$$
(20)

Finally, for the discretized Beltrami term \(\frac{\gamma h^{2}}{4}\boldsymbol{\phi }(\mathbf { r}(U))e^{T}\), the gradient and approximated Hessian are as follows:

$$\begin{aligned} \left\{ \begin{array}{lcl} d_{3} & = & \frac{\gamma h^{2}}{4} \mathrm {d}\mathbf { r}^{T}\mathrm {d}\boldsymbol{\phi }(\mathbf { r}) \in {\mathbb R}^{2(n+1)^{2}\times 1},\\ \hat{H}_{3} & = & \frac{\gamma h^{2}}{4} \mathrm {d}\mathbf { r}^{T}\mathrm {d}^{2}\boldsymbol{\phi }(\mathbf { r})\mathrm {d}\mathbf { r}, \end{array} \right. \end{aligned}$$
(21)

where \(\mathrm {d}\boldsymbol{\phi }(\mathbf { r})= (\phi '(\mathbf { r}_{1}),...,\phi '(\mathbf { r}_{4n^{2}}))^{T}\) collects the derivatives of \(\phi \) at all components of \(\mathbf { r}\),

$$\begin{aligned} \left\{ \begin{array}{lcl} \mathrm {d}\mathbf { r} & = & \text {diag}(\mathbf { r}^{1})\mathrm {d}\mathbf { r}^{2}+\text {diag}(\mathbf { r}^{2})\mathrm {d}\mathbf { r}^{1}, \\ \mathrm {d}\mathbf { r}^{1} & = & 2\text {diag}(A_{31}U)A_{31} + 2\text {diag}(A_{32}U)A_{32}, \\ \mathrm {d}\mathbf { r}^{2} & = & -\text {diag}(\mathbf { r}^{2}\odot \mathbf { r}^{2})[2\text {diag}(A_{33}U+2)A_{33} + 2\text {diag}(A_{34}U)A_{34}], \end{array} \right. \end{aligned}$$
(22)

\(\odot \) denotes a Hadamard product, \(\mathrm {d}\mathbf { r}, \mathrm {d}\mathbf { r}^{1}, \mathrm {d}\mathbf { r}^{2}\) are the Jacobian of \(\mathbf { r}, \mathbf { r}^{1}, \mathbf { r}^{2}\) with respect to U respectively, \( [\mathrm {d}\boldsymbol{\phi }(\mathbf { r})]_{\ell }\) is the \(\ell \)th component of \(\mathrm {d}\boldsymbol{\phi }(\mathbf { r})\) and \(\mathrm {d}^{2}\boldsymbol{\phi }(\mathbf { r})\) is the Hessian of \(\boldsymbol{\phi }\) with respect to \(\mathbf { r}\), which is a diagonal matrix whose ith diagonal element is \(\phi ''(\mathbf { r}_{i}),\ 1\le i \le 4n^{2}\). More details about \(\mathbf { r}^{1}\), \(\mathbf { r}^{2}\), \(A_{31}\), \(A_{32}\), \(A_{33}\) and \(A_{34}\) are shown in Appendix 1.

Therefore, combining the above results for the three terms, we obtain the gradient

$$\begin{aligned} d_{J} = d_{1}+d_{2}+d_{3} \end{aligned}$$
(23)

and the approximated Hessian of (17):

$$\begin{aligned} H = \hat{H}_{1}+H_{2}+\hat{H}_{3}. \end{aligned}$$
(24)

4.2.2 Search Direction

With the above approximated Hessian (24), in each outer (nonlinear) iteration, we solve the Gauss-Newton system

$$\begin{aligned} H\delta U=-d_{J} \end{aligned}$$
(25)

to obtain the search direction \(\delta U\) for (17). Because H is symmetric positive semi-definite, we choose MINRES with diagonal preconditioning as the numerical solver in our implementation [4, 36].

4.2.3 Step Length

Here, we choose a popular inexact line search condition, Armijo condition, which determines a step length \(\theta \) that satisfies the following sufficient decrease condition:

$$\begin{aligned} J(U + \theta \delta U) < J(U) + \theta \eta \, d_{J}^{T}\delta U. \end{aligned}$$
(26)

Here, we set \(\eta = 10^{-4}\) and use backtracking to find a suitable \(\theta \). In addition, we need to check that each component of \(\mathbf { r}(U)\), the squared modulus of the discretized Beltrami coefficient, stays smaller than 1. For more details, please refer to [25, 34, 41].

4.2.4 Stopping Criteria

In the implementation, we choose the stopping criteria used in [33]:

(1.a) \(\Vert J( U^{i+1})-J( U^{i})\Vert \le \tau _{J}(1+\Vert J( U^{0})\Vert )\),

(1.b) \(\Vert U^{i+1}-U^{i}\Vert \le \tau _{W}(1+\Vert X+U^{0}\Vert )\),

(1.c) \(\Vert d_{J}\Vert \le \tau _{G}(1+\Vert J( U^{0})\Vert )\),

(2) \(\Vert d_{J}\Vert \le \) eps,

(3) \(i \ge \) MaxIter.

Here, eps is the machine precision and MaxIter is the maximal number of outer iterations. We set \(\tau _{J} = 10^{-3}\), \(\tau _{W} = 10^{-2}\) and \(\tau _{G} = 10^{-2}\). If (1) (i.e. (1.a)-(1.c) all hold), (2) or (3) is satisfied, the iterations are terminated. Hence, a Gauss-Newton numerical scheme with Armijo line search can be developed, as summarized in Algorithm 1.

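As the pseudocode of Algorithm 1 is not reproduced here, the following Matlab sketch shows one plausible realization of the Gauss-Newton loop with MINRES and Armijo backtracking; the handle Jfun, assumed to return the energy (17) with its gradient (23) and approximated Hessian (24), and the simplified stopping test are illustrative assumptions.

```matlab
% Sketch of a Gauss-Newton scheme for (17) (cf. Algorithm 1).
function U = gauss_newton(Jfun, U0, maxIter)
  U = U0;  [J0, dJ, H] = Jfun(U);  Jold = J0;  eta = 1e-4;
  for i = 1:maxIter
    m  = size(H,1);
    M  = spdiags(max(abs(diag(H)), eps), 0, m, m);  % diagonal preconditioner
    dU = minres(H, -dJ, 1e-6, 200, M);              % solve (25)
    theta = 1;                                      % Armijo backtracking (26)
    while Jfun(U + theta*dU) >= Jold + theta*eta*(dJ'*dU)
      theta = theta/2;
      if theta < 1e-10, return; end                 % line search failed
    end
    U = U + theta*dU;  [Jnew, dJ, H] = Jfun(U);
    if abs(Jnew - Jold) <= 1e-3*(1 + abs(J0)) && ...
       norm(dJ) <= 1e-2*(1 + abs(J0))               % simplified criteria (1)
      break;
    end
    Jold = Jnew;
  end
end
```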

4.2.5 Multi-level Strategy

A multi-level strategy is a standard technique in image registration. We first coarsen the template T and the reference R by L levels. We then obtain \(U_{1}\) by solving our model (6) on the coarsest level. To provide a good initial guess for the next finer level, we interpolate \(U_{1}\) to obtain \(U_{2}^{0}\) as the initial guess on that level. Repeating this process yields the final registration on the finest level. The most important advantage of the multi-level strategy is that it saves computation time, because there are fewer variables on the coarser levels than on the fine levels. In addition, it helps to avoid getting trapped in a local minimum. A sketch of this coarse-to-fine loop is given below.
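In the following Matlab sketch, restrict, prolong and register_level are hypothetical helpers (image downsampling, displacement interpolation and a single-level solve of (17), respectively); prolong is assumed to return zeros for an empty input.

```matlab
% Sketch of the multi-level strategy of Sect. 4.2.5 (coarse to fine).
function U = multilevel_register(T, R, L)
  Ts = cell(L,1);  Rs = cell(L,1);  Ts{1} = T;  Rs{1} = R;
  for l = 2:L                          % build the image pyramids
    Ts{l} = restrict(Ts{l-1});  Rs{l} = restrict(Rs{l-1});
  end
  U = [];                              % zero initial guess on coarsest level
  for l = L:-1:1                       % coarsest level first
    U0 = prolong(U, size(Ts{l}));      % interpolate the previous solution
    U  = register_level(Ts{l}, Rs{l}, U0);  % solve (17) on this level
  end
end
```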

4.2.6 Convergence Result

Algorithm 1 described above converges to a stationary point of our new model; details are given in Theorem 1 of Appendix 2 below.

5 Numerical Results

In this section, we show numerical results illustrating the performance of our proposed model (6) solved by the Gauss-Newton method, referred to as GNR. We compare with the standard NGF [32] and with an Augmented Lagrangian approach for solving a similar model [43], referred to as ALMR, which uses the same regularization and fitting terms; there, the local invertibility of the map is instead guaranteed by imposing an inequality constraint on the model. For more details about the augmented Lagrangian method, we refer to [3, 38, 42] and the references therein.

ALMR. Alternating iteration is another popular method which might be applied to (6). However, below we consider it for a related model [43] formulated as a constrained optimization problem (different from (6)):

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \min _{\boldsymbol{u} \in \mathcal {H}} \lbrace \mathcal J_1(\boldsymbol{u})=S(\boldsymbol{u}) + \frac{\lambda }{2} D^{GF}(\boldsymbol{u}) + \frac{\lambda }{2}D^{TM}(\boldsymbol{u}) \rbrace ,\\ \text {s.t.}\;\;\;\mathcal {C}_\epsilon (\boldsymbol{u})=\det \, (I + \nabla \boldsymbol{u})\ge \epsilon , \end{array}\right. } \end{aligned}$$
(27)

where imposing the constraint is a competing way of ensuring a diffeomorphic transformation.

To reformulate (27), we introduce auxiliary variables K, \(\mathbf{p}\), \(\mathbf{n}\) and \(\mathbf{m}\), and solve the following constrained minimization problem:

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \min _{\boldsymbol{u},K,\mathbf{p},\mathbf{n}}\lbrace S(\boldsymbol{u}) + \frac{\lambda }{2} \int \limits _\varOmega ( \mathbf{n}-\nabla _n R)^2\mathrm {d}\boldsymbol{x}+ \frac{\lambda }{2} \int \limits _\varOmega (|\mathbf{p}|+ |\nabla R | -|\mathbf{m}|)^2\,\mathrm {d}\boldsymbol{x}\rbrace ,\\ \text {s.t.}\;\;\; K=T(\boldsymbol{x}+\boldsymbol{u}),\;\; \mathbf{p}=\nabla K,\;\; |\mathbf{p}| \mathbf{n}=\mathbf{p},\;\; \mathbf{m}=\mathbf{p}+\nabla R,\; \;\mathcal {C}>0. \end{array}\right. } \end{aligned}$$
(28)

Then, the augmented Lagrangian functional corresponding to the constrained optimization problem (28) is defined as follows:

$$\begin{aligned} \begin{aligned}&\mathcal {L}_1(\boldsymbol{u},K,\mathbf{p},\mathbf{n},\mathbf{m}, \lambda _1,\lambda _2,\lambda _3,\lambda _4,\lambda _5) \\&= S(\boldsymbol{u}) + \frac{\lambda }{2} \int \limits _\varOmega ( \mathbf{n}-\nabla _n R)^2\mathrm {d}\boldsymbol{x}+ \frac{\lambda }{2} \int \limits _\varOmega (|\mathbf{p}|+ |\nabla R | -|\mathbf{m}|)^2\,\mathrm {d}\boldsymbol{x}\\&\ \ + \frac{r_2}{2} \int \limits _\varOmega (\mathbf{p}-\nabla K)^2\mathrm {d}\boldsymbol{x}+\frac{r_3}{2} \int \limits _\varOmega (\mathbf{p}-|\mathbf{p}| \mathbf{n})^2 \mathrm {d}\boldsymbol{x}+\frac{r_4}{2} \int \limits _\varOmega (\mathbf{p}+ \nabla R-\mathbf{m})^2 \mathrm {d}\boldsymbol{x}\\&\ \ +\int \limits _\varOmega (T(\boldsymbol{x}+\boldsymbol{u})-K)\lambda _1 \mathrm {d}\boldsymbol{x}+ \int \limits _\varOmega (\mathbf{p}- \nabla K)\cdot \lambda _2 \mathrm {d}\boldsymbol{x}+ \int \limits _\varOmega (\mathbf{p}-|\mathbf{p}| \mathbf{n})\cdot \lambda _3 \mathrm {d}\boldsymbol{x}\\&\ \ + \int \limits _\varOmega (\mathbf{p}+ \nabla R-\mathbf{m})\cdot \lambda _4\, \mathrm {d}\boldsymbol{x}+\frac{r_1}{2}\int \limits _\varOmega (T(\boldsymbol{x}+\boldsymbol{u})-K)^2\mathrm {d}\boldsymbol{x}+\frac{1}{2\sigma } \int \limits _\varOmega \mathcal {C}_s(\boldsymbol{u},\lambda _5)\,\mathrm {d}\boldsymbol{x}, \end{aligned} \end{aligned}$$
(29)

where

$$\begin{aligned} \mathcal {C}_s(\boldsymbol{u},\lambda _5)=[\min \lbrace 0,\sigma (\mathcal {C}(\boldsymbol{u})-\epsilon ) - \lambda _5 \rbrace ]^2-\lambda _5^2, \end{aligned}$$
(30)

\(\epsilon >0\) is a small parameter, \(\sigma >0\) is a penalty parameter and \(\lambda _i\ (i=1,\ldots ,5)\) are the Lagrange multipliers. The augmented Lagrangian algorithm is shown in Algorithm 2.


In practice, the minimization problem (29) is decomposed into a number of sub-problems, each of which can be solved quickly. However, the convergence of the augmented Lagrangian iterations is not guaranteed in this case, due to the non-convexity of the overall registration problem. Currently this is a major weakness of ALMR, while the convergence of GNR (even if a bit slower) can be proved; hence GNR is recommended.

In order to reduce the number of parameters to tune, we set \(\lambda =15\), \(\beta _1=0.005\), \(\beta _2=0.1\times \beta _1\), \(r_1 = 5\), \(r_2=10\) and \(r_3=r_4=100\) in all numerical experiments unless stated otherwise. We take \(N_{max}=70\) as the maximum number of iterations for ALMR in Algorithm 2 and stop the iterations before reaching \(N_{max}\) if the stopping criterion

$$ \frac{\Vert \mathbf{p}^k + \nabla R -\mathbf{m}^k\Vert _{L^1}}{\sqrt{l\times c}} \le \tau $$

is satisfied for a given tolerance \(\tau =10^{-3}\), where l and c are the numbers of rows and columns in the image.

For all compared methods, we set the zero vector as the initial guess \(U^{0}\). To measure the quality of the registered images, we use the following quantities

$$\begin{aligned} \mathrm {GFer} = \frac{D^{GF}(\boldsymbol{u})}{D^{GF}(\boldsymbol{u}^{0})}, \end{aligned}$$
(31)
$$\begin{aligned} \mathrm {NGFer} = \frac{D^{NGF}(\boldsymbol{u})}{D^{NGF}(\boldsymbol{u}^{0})}, \end{aligned}$$
(32)

and

$$\begin{aligned} \mathrm {MIer} = -D^{MI}(\boldsymbol{u}). \end{aligned}$$
(33)

A good result corresponds to small GFer, small NGFer and large MIer. All the codes are implemented in Matlab R2019b on a PC with a 3.4 GHz Intel(R) Core(TM) i5-3570 processor and 12 GB RAM.
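For reproducibility, the three measures could be evaluated along the lines of the following sketch, reusing ngf_measure and mi_measure from Sect. 2; dgf_measure (a \(D^{GF}\) evaluator) and the bin count are hypothetical.

```matlab
% Sketch of the quality measures (31)-(33); Tdef is T(x+u), T0 is T(x+u0).
GFer  = dgf_measure(Tdef, R) / dgf_measure(T0, R);        % (31)
NGFer = ngf_measure(Tdef, R, h) / ngf_measure(T0, R, h);  % (32)
MIer  = mi_measure(Tdef, R, 64);                          % (33), MIer = -D^MI
```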

Fig. 2 Example 1 without the Beltrami control term: the first row shows the reference, template and overlay of the reference and template. The second and third rows show the deformed templates and transformations obtained by two pairs of parameters \((\beta _{1},\beta _{2})=(50,2)\) and \((\beta _{1},\beta _{2})=(50,5)\), respectively. The results are visually similar but the transformations are not both one-to-one. The first choice leads to a mesh with folding because the minimum of the Jacobian determinant of the transformation is negative

Fig. 3 Example 1: the deformed template and transformation are generated by \((\beta _{1},\beta _{2},\gamma )=(50,2,10)\). The results are visually satisfactory and the transformation is one-to-one. Second row: the deformed template obtained by ALMR and its overlay with the reference R

5.1 Example 1

In this example, we consider the pair of images displayed in Fig. 2a, b, of resolution \(256\times 256\). To simplify parameter selection, we fix \(\alpha =0.01\) in this example.

Firstly, we consider the model without the Beltrami control term, namely \(\gamma =0\). For the regularization parameters, we test two pairs \((\beta _{1},\beta _{2})=(50,2)\) and \((\beta _{1},\beta _{2})=(50,5)\). The corresponding deformed templates and transformations are shown in Fig. 2d, e, g, h. From Fig. 2f, i, we find that the deformed templates generated by these two pairs of parameters are visually satisfactory. In addition, the two choices give similar measurements: GFer \( = 0.82\), NGFer \( = 0.81\), MIer \( = 0.58\) and GFer \( = 0.83\), NGFer \( = 0.84\), MIer \( = 0.57\), respectively. However, the first choice leads to a transformation containing folding, because the minimum of the Jacobian determinant of the transformation is negative, whereas the second choice produces a smooth transformation without folding, because the minimum of the Jacobian determinant is positive.

Since first- and second-order regularizers only control smoothness, to overcome this drawback we keep \((\beta _{1}, \beta _{2})=(50,2)\) unchanged and choose a suitable \(\gamma \); here, we set \(\gamma =10\). Figure 3a, b shows the corresponding deformed template and transformation. From Fig. 3c, the deformed template is visually similar to the previous one obtained without controlling the Beltrami coefficient, and the measurements are also similar (GFer \(= 0.82\), NGFer \(= 0.82\) and MIer \(= 0.57\)). But the minimum of the Jacobian determinant of the transformation is now positive, which shows that the transformation is diffeomorphic. In the same figure, we also give the result of the ALMR model, which shows again, from the overlay of \(T(\boldsymbol{\varphi })\) and the reference R, that the template image T is well registered to R.

Now, we investigate the sensitivity to \(\gamma \). From Table 1, we find that when we fix \(\alpha ,\beta _{1}\) and \(\beta _{2}\) and change \(\gamma \), the values of GFer, NGFer and MIer remain stable, and at the same time the minima of the Jacobian determinants of the transformations are all positive. This indicates that the model is not sensitive to the weight of the Beltrami control term.

Table 1 Example 1: measurements obtained by using \(\alpha =10^{-2},\beta _{1}=50\) and \(\beta _{2}=2\)

In addition, we also investigate the convergence of the algorithm for our model. Here, we force the relative norm of the gradient of the approximated solution to reach \(10^{-3}\), although only a few iterations are run under the practical stopping criteria. According to Fig. 4, the algorithm for our model is convergent.

Hence, this example illustrates that our new control term can effectively control the transformation and lead to an accurate registration. Meanwhile, the new control term makes the model more robust.

Fig. 4 Example 1: relative norm of the gradient and relative norm of the function value for the parameters \((\alpha ,\beta _{1},\beta _{2},\gamma ) = (0.01,50,2,10)\), showing that our algorithm is convergent

5.2 Example 2

In this example, we consider another pair of \(256\times 256\) images (Fig. 5a, b). Again, to reduce the complexity of choosing parameters, we fix \(\alpha =10^{-1}\) in this example.

Fig. 5 Example 2 by the new model GNR without using the control term C: the resulting transformation is not diffeomorphic although the deformed template is visually satisfactory

Firstly, we set \(\beta _{1}=50,\beta _{2}=10\) and \(\gamma =0\). From Fig. 5d-f, although the deformed template is visually satisfactory, the resulting transformation has folding, since the minimum of the Jacobian determinant is negative.

As a comparison, we also test the standard NGF model [32] with the same first- and second-order regularizers. Here, we test three pairs of \((\beta _{1},\beta _{2})\); the corresponding results are shown in Fig. 6. We find that with NGF as the fitting term it is very hard to choose suitable parameters giving a good registration, namely one which simultaneously yields a diffeomorphic transformation and a visually satisfactory deformed template. To overcome this difficulty, we keep \(\beta _{1}\), \(\beta _{2}\) unchanged and set \(\gamma \) to 1, 10 and 100 in turn. Figure 7 shows that all these choices generate visually satisfactory deformed templates and diffeomorphic transformations. Specifically, according to Fig. 7, the measurements obtained by these choices are very similar, which again demonstrates that the model becomes more robust when the Beltrami control term is included. We also give the result of the ALMR model in Fig. 8. We observe from the overlays of the registered and reference images that all models produce acceptable registration results.

In summary, when the ALMR, NGF and GNR models all work, the latter has the largest MIer similarity (indicating better quality). However, NGF (or ALMR and GNR with the extra control term removed) can fail to deliver a valid result (with negative \(\det \nabla \boldsymbol{y}\)) if the parameters are not chosen correctly. Although ALMR is competitive with GNR (and takes less time to converge in practice), only the convergence of GNR can be proved. Hence our model GNR is robust and can be recommended for multi-modal registration.

Fig. 6 Example 2 by the GNR without imposing a control term. Each column shows results for a different choice of \((\beta _{1},\beta _{2})\) balancing first- and second-order regularizers: the deformed template, overlay of \(T(\boldsymbol{\varphi })\) and R, and the transformation. Clearly the last column obtains an incorrect \(\boldsymbol{\varphi }\)

Fig. 7 Example 2 by the new model GNR. Using the control term, for each choice of \(\gamma \) (by column), the resulting transformation is diffeomorphic and the deformed template is also visually pleasing

Fig. 8 Example 2 by the ALMR model. The deformed template is also visually close to the reference R

6 Conclusions

Image registration is an increasingly important and often challenging image processing task, and the quality of the transformation requires suitable control. In this Chapter, to improve multi-modality registration models, we proposed a novel term motivated by the Beltrami coefficient, which leads to a diffeomorphic transformation. The advantage of the term is that it imposes no bias on the Jacobian determinant of the transformation. Employing the first-discretize-then-optimize approach, we designed an effective solver for the proposed model. Experimental tests confirm that the proposed model performs well in multi-modality image registration. In addition, with the help of the Beltrami control term, the proposed model is more robust with respect to the parameters. Future work will investigate extending this work to a deep learning framework [45].