A Novel Diffeomorphic Model for Image Registration and Its Algorithm

Zhang, Daoping; Chen, Ke

doi:10.1007/s10851-018-0811-3

Open access
Published: 10 April 2018

Volume 60, pages 1261–1283, (2018)
Journal of Mathematical Imaging and Vision

Abstract

In this work, we investigate image registration by mapping one image to another in a variational framework and focus on both model robustness and solver efficiency. We first propose a new variational model with a special regularizer, based on the quasi-conformal theory, which can guarantee that the registration map is diffeomorphic. It is well known that when the deformation is large, many variational models including the popular diffusion model cannot ensure diffeomorphism. One common observation is that the fidelity error appears small while the obtained transform is incorrect because of mesh folding. However, direct reformulation from the Beltrami framework does not lead to effective models; our new regularizer is constructed based on this framework and added to the diffusion model to get a new model, which can achieve diffeomorphism. The idea is, moreover, applicable to a wide class of models. We then propose an iterative method to solve the resulting nonlinear optimization problem and prove the convergence of the method. Numerical experiments demonstrate that the new model not only yields a diffeomorphic registration even when the deformation is large, but also attains accuracy comparable to that of the best current models.

1 Introduction

Image registration aims to find a transformation that maps corresponding image data, taken at different times, from different sensors, or from different viewpoints, for the purpose of identifying differences or merging information. Nowadays, image registration is widely used in many areas, such as computer vision, biological imaging, remote sensing and medical imaging [6, 21, 26, 32, 36, 38, 40, 47, 57].

In practice, according to the specific application, image registration can be classified into two categories: mono-modal registration and multi-modal registration. For multi-modal registration, finding a suitable distance measure is the most essential step [22, 35, 36, 47, 57]. The idea of this paper is applicable to the multi-modal registration framework, but we focus on mono-modal registration in this work.

In dealing with the mono-modal registration, there are many choices of a data fidelity term [33] and a common approach for computing this transformation is to use the sum of squared differences (SSD) to measure the difference between the reference image R and the deformed template image T [11]. However, minimization of SSD alone in image registration is an ill-posed problem in the sense of Hadamard since it may have many solutions. In order to overcome this difficulty, regularization is indispensable [38, 52]. However, the choice of the regularization term, which needs some prior information about physical properties and helps to avoid the local minima, depends on the specific application.

All registration models are nonlinear but they can be classified into two main categories according to the way deformation mapping is represented: linear registration and nonlinear registration. In linear registration, the deformation model is linear and global, including rotation, translation, shearing and scaling [11, 38]. Although a linear model is fast to compute since it contains few variables, it is commonly used only as a pre-registration step for initializing a more sophisticated model. This is mainly because linear models cannot accommodate local details (differences). In contrast, nonlinear registration models inspired by physical processes of transformations [47] such as the elastic model [5], fluid model [9], diffusion model [16], TV (total variation) model [19], MTV (modified TV) model [12], linear curvature model [17, 18], mean curvature model [14], Gaussian curvature model [27] and total fractional-order variation model [56] are proposed to account for localized variation in details, by allowing many degrees of freedom. The particular free-form deformation models based on B-splines, lying between the above two types, possess simplicity, smoothness, efficiency and the ability to describe local deformation with few degrees of freedom [44, 45, 47]. For relatively small deformation, all models can be effective, but for large deformation, not all models are effective and in particular few models can guarantee a one-to-one mapping unless one fine-tunes the coupling parameters to reduce the deformation magnitude allowed (since the mapping quality is perfect if deformation is zero), which in turn loses the ability to model large deformation.

Over the last decade, more and more researchers have focused on diffeomorphic image registration where folding measured by the local invertibility quantity $\det (J_{\mathbf{y }})$ is reduced or avoided. Here, $\mathbf{y }$ denotes the transformation in the registration model and $\det (J_{\mathbf{y }})$ is the Jacobian determinant of $\mathbf{y }$. Under desired assumptions, obtaining a one-to-one mapping is a natural choice as reviewed in [47].

In 2004, Haber and Modersitzki [23] proposed an image registration model imposing volume preserving constraints, by ensuring $\det (J_{\mathbf{y }})$ is close to 1. Although volume preservation is very important in some applications where some underlying (e.g. anatomical) structure is known to be incompressible [47], it is not required or reasonable in others. In a later work, the same authors [25] relaxed the constraint to allow $\det (J_{\mathbf{y }})$ to lie in a specific interval. Yanovsky et al. [55] applied the symmetric Kullback–Leibler distance to quantify $\det (J_\mathbf{y })$ to achieve a diffeomorphic mapping. Burger et al. [7] designed a volume penalty term that ensured that shrinkage and growth had the same cost in their variational functional. The constrained hierarchical parametric approach [41] ensures that the mapping is globally one-to-one and thus preserves topology in the deformed image. Sdika [46] introduced a regularizer to penalize the non-invertible transformation. In [51], Vercauteren et al. proposed an efficient non-parametric diffeomorphic image registration algorithm based on Thirion’s demons algorithm [49]. In addition, a framework called large deformation diffeomorphic metric mapping (LDDMM) can generate the diffeomorphic transformation for image registration [3, 15, 37, 50]. An entirely different framework proposed by Lam and Lui [30] obtains diffeomorphic registrations by constraining Beltrami coefficients of a quasi-conformal map ${\varvec{f}}=y_1({\mathbf{x }} )+ {\varvec{i}}y_2({\mathbf{x }})$, instead of controlling the map $\mathbf{y }({\mathbf{x }} )$ directly.

In this paper, we aim to reformulate the Lam and Lui Beltrami measure as a direct regularizer for controlling $\det (J_\mathbf{y })$ and to assess the effectiveness of the resulting variational models; though the idea applies to any commonly used models, we apply it to the diffusion model as one simple example. Our contributions are twofold:

We propose a new Beltrami coefficient-based regularizer that is explicitly expressed in terms of $\det (J_{\mathbf{y }})$. This establishes a link between the Beltrami coefficient of the transformation and the quantity $\det (J_{\mathbf{y }})$.
An effective iterative scheme is presented, and numerical experimental results show that the new registration model performs well and produces a diffeomorphic mapping while remaining competitive with the state-of-the-art models from non-Beltrami frameworks.

We remark that several interesting works that are concerned with reversible transformations (such as [8, 54]) may also benefit from this study.

The rest of the paper is organized as follows. Section 2 briefly reviews the basic mathematical formulation of image registration modelling, several typical regularization terms and how to get a diffeomorphic transformation for image registration. In Sect. 3, we propose a new regularizer and a new registration model. The effective discretization and numerical scheme are discussed in Sect. 4. Numerical experiment results are shown in Sect. 5, and finally conclusions are given in Sect. 6.

2 Preliminaries, Regularization and Diffeomorphic Transformation

In general, image registration aims to compare, in space $\mathbb {R}^{d}$, two or more images or image sequences in a video. In this work, we consider the case of a pair of images $T, R:\Omega \subset \mathbb {R}^{d}\rightarrow \mathbb {R}$ and $d = 2$. Here by convention, R is the Reference image and T is the (moving) Template image.

The aim of image registration is to find a transformation $\mathbf y (\mathbf x )$ such that

$$\begin{aligned} T\circ \mathbf{y }(\mathbf x ) = T(\mathbf{y }(\mathbf x )) \approx R, \end{aligned}$$

where $\mathbf x =(x_{1},x_{2})$ and $\mathbf{y }(\mathbf x ) = (y_{1}(\mathbf x ),y_{2}(\mathbf x ))$. That is, the transformation $\mathbf{y }(\mathbf x )$ moves T to match R. If we define $\mathbf{y }(\mathbf x ) =\mathbf x +\mathbf{u }(\mathbf x )$, then $\mathbf{u }(\mathbf x ) = (u_{1}(\mathbf x ),u_{2}(\mathbf x ))$ indicates how much T moves, i.e. $\mathbf{u }(\mathbf x )$ is the displacement. Thus, the determination of the transformation $\mathbf y (\mathbf x )$ is equivalent to the determination of the displacement field $\mathbf{u }(\mathbf x )$.

2.1 Data Fidelity

One way to ensure that $T(\mathbf{y })$ can approximate R is to minimize the difference $T(\mathbf{y }) - R$. A commonly used difference measure is the sum of squared differences (SSD) defined by

$$\begin{aligned} {\mathcal {D}}[\mathbf{y }]= & {} \frac{1}{2}\int _{\Omega }(T(\mathbf y )-R)^{2}\hbox {d}{} \mathbf x \nonumber \\= & {} \frac{1}{2}\Vert T(\mathbf{y })-R\Vert ^{2}\nonumber \\= & {} \frac{1}{2}\Vert T(\mathbf x +\mathbf u )-R\Vert ^{2} = {\mathcal {D}}[\mathbf u ] \end{aligned}$$

(1)

where $\Vert \cdot \Vert ^{2}$ denotes the squared $L_{2}$-norm. Of course, there are some other typical distance measures, including normalized cross-correlation [38], mutual information [35, 38], normalized gradient fields [24, 39] and mass-preserving measure [7].

2.2 Regularization

Minimizing any of the above-mentioned measures alone is insufficient to obtain a unique transformation $\mathbf{y }$ for image registration, because $\min {\mathcal {D}}[\mathbf{y }]$ is ill-posed [38, 39]. In order to overcome this problem, regularization is necessary. Combining distance measure and regularization gives the variational model for image registration:

$$\begin{aligned} \min _{\mathbf{u }} J(\mathbf{u }) = {\mathcal {D}}[\mathbf{u }] + \alpha S[\mathbf{u }], \end{aligned}$$

(2)

where ${\mathcal {D}}[\mathbf{u }]$ is the distance measure from (1), $S[\mathbf{u }]$ is the regularizer to be discussed and $\alpha $ is a positive parameter to balance these two terms.

There exist many regularizers and we can classify them into three categories:

First-order regularizers involving $|\nabla \mathbf{u }|$ or $|\nabla \cdot \mathbf{u }|$. The diffusion regularizer [16] and the TV regularizer [19] are well-known first-order regularizers. The former one aims to control smoothness of the displacement and the latter one can preserve the discontinuity.
Fractional-order regularizer $\nabla ^\alpha \mathbf{u }$ with $\alpha \in (1,2)$. In [56], a fractional-order regularizer is used for image registration. Because the fractional-order regularizer is a global regularizer, its implementation must exploit the structured Toeplitz matrices. This regularizer can not only produce accurate and smooth solutions but also allow for a large rigid alignment [56].
Second-order regularizers involving $\nabla ^2 \mathbf{u }$ or $\nabla \cdot (\nabla \mathbf{u }/|\nabla \mathbf{u }|)$. These include the linear curvature regularizer [17, 18], mean curvature regularizer [14] and Gaussian curvature regularizer [27].

The first two categories of models require an affine linear transformation in an initial pre-registration step, while the third category does not need a linear transformation in pre-registration.

Differing from the above three categories, an important class of fluid-like models based on partial differential equations was developed to capture large deformations. Christensen et al. [10] proposed an effective viscous fluid model characterized by a spatial smoothing of the velocity field. For the viscous fluid model, the deformation is governed by the Navier–Stokes equation:

$$\begin{aligned} \eta \nabla ^{2}{} \mathbf v +(\eta +\lambda )\nabla (\nabla \cdot \mathbf v )+\mathbf F =0, \quad \mathbf v =\partial _{t}{} \mathbf u +\mathbf v \cdot \nabla \mathbf u . \end{aligned}$$

(3)

Here, $\eta $ and $\lambda $ are the viscosity coefficients, the term $\nabla ^{2}{} \mathbf v $ constrains the velocity field to vary smoothly, the term $\nabla (\nabla \cdot \mathbf v )$ allows structures in the template to change in mass and $\mathbf F $ is the nonlinear deformation force field, which can be defined by $(T(\mathbf x +\mathbf u )-R)\nabla {T}$. The velocity field $\mathbf v $ is initialized as $\mathbf 0 $ in implementation. In [10], the condition $|\det (J_\mathbf{y })|\ge 0.5$ is checked at each iteration and if not satisfied, restarting the numerical solver is initiated so that a diffeomorphic transform is obtained; see also [38]. Further in [55], the model is enhanced by incorporating a volume preservation idea relating to minimizing $|\det (J_{\mathbf{y }})-1|$ again to ensure diffeomorphism without restarting.

Next, we review the Diffusion model [16]

$$\begin{aligned} \min _{\mathbf{u }} J(\mathbf{u })= & {} {\mathcal {D}}[\mathbf{u }] + \alpha S[\mathbf{u }] \nonumber \\= & {} \frac{1}{2}\int _{\Omega }(T({\mathbf{x }+\mathbf u })-R)^{2}\hbox {d}{} \mathbf x \nonumber \\&+\, \frac{\alpha }{2}\int _{\Omega }\sum _{\ell =1}^{2} |\nabla u_{\ell }|^{2}\hbox {{d}}{} \mathbf x . \end{aligned}$$

(4)

It leads to the Euler–Lagrange equation:

$$\begin{aligned}&(T(\mathbf{x}+\mathbf{u})-R)\nabla _{\mathbf{u }} T(\mathbf{x}+\mathbf{u}) - \alpha \Delta \mathbf{u }= 0\\&\qquad \ \hbox {i.e.} \begin{array}{l} (T(\mathbf{x}+\mathbf{u})-R)\partial _{u_1} T(\mathbf{x}+\mathbf{u}) - \alpha \Delta u_1 = 0, \\ (T(\mathbf{x}+\mathbf{u})-R)\partial _{u_2} T(\mathbf{x}+\mathbf{u}) - \alpha \Delta u_2 = 0, \end{array} \end{aligned}$$

subject to $\langle \nabla u_{\ell },\mathbf{n } \rangle = 0$ on $\partial \Omega $ for $\ell = 1, 2$. In particular, there exists a fast implementation based on the so-called additive operator splitting (AOS) scheme [38, 53]. In [13], a fast solver was developed for this model.

However, as with other models reviewed in the three categories, the obtained solution $\mathbf{u }$ or $\mathbf{y }$ is mathematically correct but often physically incorrect. This is because there is no guarantee that the mesh does not fold; non-folding is characterized by $\det (J_{\mathbf{y }})>0$, i.e. a positive determinant of the local Jacobian matrix $J_\mathbf{y }$ of the transform $\mathbf{y }$.
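The folding criterion can be illustrated with a minimal sketch that evaluates $\det (J_{\mathbf{y }})$ cell by cell for $\mathbf{y }=\mathbf{x }+\mathbf{u }$ on a nodal grid. The function name `cell_jacobian_dets` and the edge-averaged difference stencil are assumptions made for illustration only; they are not taken from the models reviewed here.

```python
def cell_jacobian_dets(u1, u2, h):
    """Cell-wise det(J_y) for y = x + u on an (n+1) x (n+1) nodal grid.

    u1[i][j], u2[i][j] hold the displacement components at node (i*h, j*h).
    Each derivative is the average of the differences along the two
    opposite edges of a cell (an illustrative stencil, not the paper's).
    """
    n = len(u1) - 1

    def dx1(u, i, j):
        return (u[i + 1][j] - u[i][j] + u[i + 1][j + 1] - u[i][j + 1]) / (2 * h)

    def dx2(u, i, j):
        return (u[i][j + 1] - u[i][j] + u[i + 1][j + 1] - u[i + 1][j]) / (2 * h)

    dets = []
    for i in range(n):
        row = []
        for j in range(n):
            a = 1.0 + dx1(u1, i, j)   # d y1 / d x1
            b = dx2(u1, i, j)         # d y1 / d x2
            c = dx1(u2, i, j)         # d y2 / d x1
            d = 1.0 + dx2(u2, i, j)   # d y2 / d x2
            row.append(a * d - b * c)
        dets.append(row)
    return dets


def has_folding(u1, u2, h):
    """True if any cell has a non-positive Jacobian determinant."""
    return any(v <= 0.0 for row in cell_jacobian_dets(u1, u2, h) for v in row)
```

With a zero displacement every cell gives a determinant of 1, whereas a displacement such as $u_1=-2x_1$, $u_2=0$ (so that $y_1=-x_1$) gives $-1$ in every cell and is flagged as folded.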

2.3 Models of Diffeomorphic Transformation

To achieve $\det (J_{\mathbf{y }})>0$, one can find several recent works that impose this constraint in some direct ways. We review a few of such models before we present our new constraint. In the form of (4), the idea is to choose $S_1[\cdot ]$ in the following (note ${\mathbf{y }=\mathbf{x }+ \mathbf{u }}$)

$$\begin{aligned} \min _{\mathbf{u }} J(\mathbf{u }) = {\mathcal {D}}[\mathbf{u }] + \alpha S[\mathbf{u }] + \beta S_1[ \mathbf{y } ]. \end{aligned}$$

(5)

2.3.1 Volume Control

In 2004, Haber and Modersitzki [23] used volume preserving constraint (area in 2D) for image registration, namely

$$\begin{aligned} \det (J_{\mathbf{y }}) = 1. \end{aligned}$$

As a consequence, we can ensure that the transformation is diffeomorphic. However, volume preservation is not desirable when the anatomical structure is compressible in medical imaging.

2.3.2 Slack Constraint

In [25], improving on [23], the constraint $\det (J_{\mathbf{y }})=1$ is relaxed and a slack constraint is proposed

$$\begin{aligned} M_{a}\le \det (J_{\mathbf{y }}) \le M_{b}, \end{aligned}$$

where a positive interval $[M_{a},M_{b}]$ is provided by the user as prior information in the specific application e.g. $[M_{a},M_{b}]=[0.1, 2]$.

2.3.3 Unbiased Transform

In [55], according to the information theory, $\det (J_{\mathbf{y }})$ is controlled by the symmetric Kullback–Leibler distance

$$\begin{aligned} \int _{\Omega }|\det (J_{\mathbf{y }})-1|\log (|\det (J_\mathbf{y })|)\hbox {d}{} \mathbf x . \end{aligned}$$

It can help to get an unbiased diffeomorphic transformation. This idea was tested with the fluid regularizer (first order).

2.3.4 Balance of Shrinkage and Growth

Geometrically $\det (J_{\mathbf{y }})=1$ implies volume preservation. Similarly $\det (J_{\mathbf{y }})<1$ implies shrinkage while $\det (J_\mathbf{y })>1$ implies growth. A function that treats the cases of shrinkage and growth identically is $\phi (x)=((x-1)^2/x)^2$ since $\phi (1/x)=\phi (x)$. A volume penalty

$$\begin{aligned} \int _{\Omega }\left( \frac{(\det (J_\mathbf{y })-1)^{2}}{\det (J_{\mathbf{y }})}\right) ^{2}\hbox {d}{} \mathbf x \end{aligned}$$

(6)

is used in the hyperelastic model [7], which ensures that shrinkage and growth have the same price.
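The symmetry $\phi (1/x)=\phi (x)$ behind the penalty (6) can be checked numerically. The snippet below is only a sanity check with an illustrative function name, `volume_penalty`, and sample determinant values chosen for the example.

```python
def volume_penalty(x):
    """phi(x) = ((x - 1)^2 / x)^2, evaluated for x = det(J_y) > 0."""
    return ((x - 1.0) ** 2 / x) ** 2


# Shrinkage by a factor and growth by the same factor cost the same.
for det in (0.25, 0.5, 2.0, 4.0):
    print(det, volume_penalty(det), volume_penalty(1.0 / det))
```

The penalty vanishes at $x=1$ (volume preservation) and grows without bound as $x\rightarrow 0^{+}$ or $x\rightarrow \infty $.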

2.3.5 LDDMM Framework

In the LDDMM framework, the deformation is modelled by considering its velocity over time according to the transport equation. We can write its variational formulation as follows:

$$\begin{aligned} \begin{aligned}&\min _{{\mathcal {T}},v} {\mathcal {D}}({\mathcal {T}}(\cdot ,1),R) + \alpha {\mathcal {S}}(v)\\&\hbox {{s.t.}}\quad \partial _{t}{\mathcal {T}}(\mathbf x ,t)+v(\mathbf x ,t)\cdot \nabla {\mathcal {T}}(\mathbf x ,t)=0 \ \text{ and } \ {\mathcal {T}}(\mathbf x ,0) = T, \end{aligned} \end{aligned}$$

where $v:\Omega \times [0,1]\rightarrow \mathbb {R}^{2}$ is the velocity and ${\mathcal {T}}:\Omega \times [0,1]\rightarrow \mathbb {R}$ is a series of images. For more details, please see [3, 15, 37, 47, 50].

2.3.6 Beltrami Indirect Control

In 2014, Lam and Lui [30] presented a novel approach in a Beltrami framework to obtain diffeomorphic registrations with large deformations using landmark and intensity information via quasi-conformal maps. Before introducing this model, we first describe some basic theory of quasi-conformal maps and Beltrami coefficients.

A complex map $z=x_1+\mathbf{i}x_2 \longmapsto f(z)=y_1(x_1,x_2)+ \mathbf{i}y_2(x_1,x_2)$ from a domain in $\mathbb {C}$ onto another domain is quasi-conformal if it has continuous partial derivatives and satisfies the following Beltrami equation:

$$\begin{aligned} \frac{\partial f}{\partial {\bar{z}}} = \mu (f)\frac{\partial f}{\partial z}, \end{aligned}$$

(7)

for some complex-valued Lebesgue measurable $\mu $ [4] satisfying $\Vert \mu \Vert _{\infty } < 1$. Here $\mu =\mu ({\mathbf{y }})\equiv f_{{\bar{z}}}/f_z$ is called the Beltrami coefficient explicitly computable from ${\mathbf{y }}$ since

$$\begin{aligned} \left\{ \begin{aligned} f_z=\frac{\partial f}{\partial z}&\equiv \frac{1}{2}\left( \frac{\partial f}{\partial x_1} - \mathbf{i}\frac{\partial f}{\partial x_2}\right) = \frac{(y_1)_{x_1}+(y_2)_{x_2}}{2} + \mathbf{i} \frac{(y_2)_{x_1}-(y_1)_{x_2}}{2}, \\ f_{{\bar{z}}}=\frac{\partial f}{\partial {\bar{z}}}&\equiv \frac{1}{2}\left( \frac{\partial f}{\partial x_1} + \mathbf{i}\frac{\partial f}{\partial x_2}\right) = \frac{(y_1)_{x_1}-(y_2)_{x_2}}{2} + \mathbf{i} \frac{(y_2)_{x_1}+(y_1)_{x_2}}{2}, \end{aligned}\right. \end{aligned}$$

(8)

where $(y_1)_{x_{1}}=\partial y_1/\partial x_1$. Conversely $\mathbf y =\mathbf y ^{\mu }$ can be computed for a given $\mu $ through solving $\mu (\mathbf y ) = \mu $.

A quasi-conformal map is a homeomorphism (i.e. one-to-one) and its first-order approximation takes small circles to small ellipses of bounded eccentricity [20]. As a special case, $\mu =0$ means that the map f is holomorphic and conformal, characterized by $f_{ {\bar{z}}}=0$ or $y_1, y_2$ satisfying the Cauchy–Riemann equations $(y_1)_{x_1} = (y_2)_{x_2}, \ (y_1)_{x_2} =- (y_2)_{x_1}$.
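As a small worked illustration (an example chosen here, not one from the cited references), consider the linear stretch $\mathbf{y }(\mathbf x )=(2x_1, x_2)$. Since $(y_2)_{x_1}=(y_1)_{x_2}=0$, the imaginary parts in (8) vanish and

$$\begin{aligned} f_z = \frac{2+1}{2} = \frac{3}{2},\quad f_{{\bar{z}}} = \frac{2-1}{2} = \frac{1}{2},\quad \mu = \frac{f_{{\bar{z}}}}{f_z} = \frac{1}{3},\quad \frac{1+|\mu |}{1-|\mu |} = 2. \end{aligned}$$

The last quantity is the ratio of the major to the minor axis of the ellipse onto which a small circle is mapped, which agrees with the stretch factor 2 in the $x_1$ direction.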

Thus in the context of image registration, enforcing $\Vert \mu \Vert _{\infty } < 1$ provides the control for the transform f and ensures homeomorphism. The quasi-conformal hybrid registration model (QCHR) in [30] is

$$\begin{aligned} \min _\mathbf{y }\int _{\Omega }|\nabla \mu |^{2}+\alpha \int _{\Omega }|\mu |^{p}+\beta \int _{\Omega }(T( \mathbf{y })-R)^{2} \end{aligned}$$

(9)

subject to $\mathbf{y }=(y_1,y_2)$ satisfying

(1) $\mu = \mu ( \mathbf{y })$;
(2) $ \mathbf{y }(p_{j})=q_{j}$ for $1\le j\le m$ (landmark constraints);
(3) $\Vert \mu ( \mathbf{y })\Vert _{\infty }<1$ (bijectivity),

which indirectly controls $\det (J_{\mathbf{y }})$ via Beltrami coefficient, where $\mu ( \mathbf{y })$ is the Beltrami coefficient of the transformation $ \mathbf{y }$. The above model is solved by a penalty splitting method. It minimizes the following functional:

$$\begin{aligned}&\int _{\Omega }|\nabla \nu |^{2}+\alpha \int _{\Omega }|\nu |^{p}+\sigma \int _{\Omega }|\nu -\mu |^{2}\nonumber \\&\quad +\beta \int _{\Omega }(T( \mathbf{y }^{\mu })-R)^{2} \end{aligned}$$

(10)

subject to the constraints that $\Vert \nu \Vert _{\infty }<1$ and $\mathbf y ^{\mu }$ be the quasi-conformal map with Beltrami coefficient $\mu $ satisfying $\mathbf y ^{\mu }(p_{j}) = q_{j}$ for $1\le j\le m$. Then in each iteration, it needs to solve the following two subproblems alternately:

$$\begin{aligned} \begin{aligned}&\mu _{n+1} = \arg \min \sigma \int _{\Omega }|\mu -\nu _{n}|^{2}+\beta \int _{\Omega }(T( \mathbf{y }^{\mu })-R)^{2}\\&\hbox {{s.t.}} \quad \mathbf y ^{\mu }(p_{j}) = q_{j} \quad \hbox {{for}} \ 1\le j\le m \end{aligned} \end{aligned}$$

(11)

and

$$\begin{aligned} \nu _{n+1}= & {} \arg \min \int _{\Omega }|\nabla \nu |^{2}\nonumber \\&+\alpha \int _{\Omega }|\nu |^{p}+\sigma \int _{\Omega }|\nu -\mu _{n+1}|^{2}. \end{aligned}$$

(12)

In addition, it also solves the equation $\mu (\mathbf y )=\mu $ by the linear Beltrami solver (LBS) [34] to find $\mathbf y $ and ensures that $\mathbf y $ matches the landmark constraints.

Thus, instead of controlling the Jacobian determinant of the transformation directly, controlling Beltrami coefficient is also a good alternative providing the same but indirect control. However, since their algorithm [30] has to deal with two main unknowns (the transformation $\mathbf y $ and its Beltrami coefficient $\mu $) and one auxiliary unknown (the coefficient $\nu $) in a non-convex formulation, the increased cost, practical implementation and convergence are real issues; for challenging problems, one cannot observe convergence and therefore the full capability of the model is not realized.

We are motivated to reduce the unknowns and simplify their algorithm. Our solution is to reformulate the problem in the space of the primary variable ${\mathbf{y }}$ or $\mathbf{u}$, not in the transformed space of variables $\mu , \nu $. We make use of the explicit formula of $\mu =\mu ({\mathbf{y }})$. Working with primal mapping ${\mathbf{y }}$ enables us to introduce the advantages of minimizing a Beltrami coefficient to the above reviewed variational framework (2), effectively unifying the two frameworks.

Hence, we propose a new Beltrami coefficient-based regularizer and, in the numerical part, we find that it is easy to implement. Moreover, the reformulated control regularizer can potentially be applied to a large class of variational models.

3 The Proposed Image Registration Model

In this section, we aim to present a new regularizer based on Beltrami coefficient, which can help to get a diffeomorphic transformation. Then combining the new regularizer with the diffusion model, we present a novel model. Of course, combining with other models may be studied as well since the idea is the same.

For $f(z) = y_1(x_{1},x_{2})+\mathbf{i}y_2(x_{1},x_{2})$, according to the Beltrami equation (7) and the definitions (8), we have

$$\begin{aligned} \mu (f)= & {} \frac{\partial f}{\partial {\bar{z}}}\Big /\frac{\partial f}{\partial z}\nonumber \\= & {} \frac{((y_1)_{x_{1}}-(y_2)_{x_{2}})+\mathbf{i}((y_2)_{x_{1}}+(y_1)_{x_{2}})}{((y_1)_{x_{1}}+(y_2)_{x_{2}})+ \mathbf{i}((y_2)_{x_{1}}-(y_1)_{x_{2}})}, \end{aligned}$$

(13)

$$\begin{aligned} |\mu (f)|^{2}= & {} \displaystyle \frac{((y_1)_{x_{1}}-(y_2)_{x_{2}})^{2}+((y_2)_{x_{1}}+(y_1)_{x_{2}})^{2}}{((y_1)_{x_{1}}+(y_2)_{x_{2}})^{2}+((y_2)_{x_{1}}-(y_1)_{x_{2}})^{2}} \nonumber \\= & {} \displaystyle \frac{\Vert J_{f}\Vert _2^2 - 2 \det (J_{f})}{\Vert J_{f}\Vert _2^2 + 2 \det (J_{f})}. \end{aligned}$$

(14)

Note $(y_1)_{x_{1}}(y_2)_{x_{2}}-(y_2)_{x_{1}}(y_1)_{x_{2}} = \det (J_{f})$. So $\det (J_{f})$ can be represented by the Beltrami coefficient $\mu (f)$

$$\begin{aligned} \det (J_{f}) = |f_{z}|^{2}(1-|\mu (f)|^{2}) \end{aligned}$$

(15)

Clearly $\det (J_{f})>0$ if $|\mu (f)|<1$, and by the inverse function theorem, the map f is locally bijective. We conclude that f is a diffeomorphism if we assume that $\Omega $ is bounded and simply connected.

For more details about quasi-conformal theory, the readers can refer to [1, 20, 31].
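The identities (14) and (15) can be checked numerically for a single Jacobian matrix. The sketch below is only a sanity check: the helper name `beltrami` and the sample entries are illustrative, and $\Vert J_{f}\Vert _2^2$ is read as the sum of squared entries of $J_f$, which is what the component expression in (14) gives.

```python
def beltrami(a, b, c, d):
    """Return (mu, f_z) for a map whose Jacobian is [[a, b], [c, d]],

    with a = (y1)_x1, b = (y1)_x2, c = (y2)_x1, d = (y2)_x2, following (8).
    """
    f_z = complex(a + d, c - b) / 2.0
    f_zbar = complex(a - d, c + b) / 2.0
    return f_zbar / f_z, f_z


# Sample Jacobian entries (arbitrary illustrative values).
a, b, c, d = 1.3, 0.2, -0.1, 0.9
mu, f_z = beltrami(a, b, c, d)

det = a * d - b * c
frob2 = a * a + b * b + c * c + d * d          # sum of squared entries

lhs = abs(mu) ** 2                              # |mu|^2 from the definition
rhs = (frob2 - 2.0 * det) / (frob2 + 2.0 * det)  # right-hand side of (14)
det_from_mu = abs(f_z) ** 2 * (1.0 - lhs)       # right-hand side of (15)

print(lhs, rhs, det, det_from_mu)
```

For an orientation-reversing Jacobian such as `a, b, c, d = 1.0, 0.0, 0.0, -0.5`, the same helper returns $|\mu |=3>1$ together with a negative determinant, consistent with (15).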

3.1 New Regularizer

Our new regularizer based on $|\mu (f)|<1$ to control the transformation to get a diffeomorphic mapping is

$$\begin{aligned} S_1[ \mathbf{y } ] = \int _{\Omega } \phi ( |\mu |^2 ) d\mathbf{x },\quad |\mu |^2=\frac{\Vert J_{\mathbf{y }}\Vert _2^2 - 2 \det (J_{\mathbf{y }})}{\Vert J_{\mathbf{y }}\Vert _2^2 + 2 \det (J_{\mathbf{y }})} \end{aligned}$$

(16)

which clearly involves the Jacobian determinant $\det (J_\mathbf{y })$ in a non-trivial way and we explore the choices of $\phi $ below.

Remark

Our new regularizer has two advantages: one is that the obtained transformation $\mathbf y $ does not need to satisfy $\det (J_\mathbf{y })\rightarrow 1$; the other is that we only compute the transformation and do not need to compute its Beltrami coefficient or introduce another auxiliary unknown as in [30]. In addition, the numerical experiments show that our new regularizer is easy to implement and yields accurate and diffeomorphic transformations.

3.2 The Proposed Model

The above regularizer (16) providing a constraint on $\mathbf{y }$ is ready to be combined with an existing model. In the framework (5), using (16), the first version of our new model takes the form

$$\begin{aligned} \min _{\mathbf{y }} \frac{1}{2}\Vert T(\mathbf{y })-R \Vert ^2_2 +\frac{\alpha }{2}\Vert \ |\nabla {\mathbf{u }}|\ \Vert ^{2}_2 + \beta \int _{\Omega } \phi ( |\mu |^2 ) d\mathbf{x } \end{aligned}$$

(17)

where $\mathbf{u } = \mathbf{y }(\mathbf x )-\mathbf{x } =(y_{1}(\mathbf x ),y_{2}(\mathbf x ))-\mathbf{x }$ is the displacement field, $|\nabla {\mathbf{u }}|^{2}= |\nabla u_1|^{2}+|\nabla u_2|^{2}$ and $\mu =\mu ({\mathbf{y }})$. To promote $|\mu (f)|<1$, our first and simple choice is $\phi (v)=\phi _{1}(v)=\frac{1}{(v-1)^{2}}$. Starting from the initial guess $\mathbf{u}=\mathbf{0}$, for which $v=|\mu |^{2}=0$, minimizing (17) with this $\phi $ tends to reduce $v$ and keep it away from 1, since $\phi _{1}(v)\rightarrow \infty $ when $v\rightarrow 1$.

Remark

From (9) and (17), we see that the QCHR model focuses on obtaining a smooth Beltrami coefficient and our model focuses on the diffeomorphic transformation itself. There are major differences between the regularizer in QCHR model and our new regularizer: the former is characterized by the Beltrami coefficient $\mu $ directly and gradient of this Beltrami coefficient $\mu $, while the latter is characterized by the Beltrami coefficient indirectly in terms of the transformation $\mathbf y $ and the gradient of $\mathbf u $. Since $\mathbf y =\mathbf x + \mathbf u $ is our desired transformation, our direct regularizers such as $|\nabla \mathbf u |^2$ make more sense than indirect regularizers such as $|\nabla \mu |^2$.

However, as long as $|\mu (f)|<1$, we would not give a preference to forcing $|\mu (f)|\rightarrow 0$. To put some control on bias, similarly to [7], we are led to two more choices of a less biased function to modify $S_1[ \mathbf{y } ]$:

$\phi (v) =\phi _2(v)= \frac{v}{(v-1)^{2}}$: balance $|\mu (f)|$ between 0 and 1 as $\phi _2(v)=\phi _2(1/v)$;
$\phi (v) =\phi _3(v)=\frac{v^2}{(v-1)^{2}}$: encourage $|\mu (f)|\rightarrow 0$ and $|\mu (f)|\not =1$;

Below, we list first-order derivatives and second-order derivatives for the above different $\phi (v)$:

${\phi }'_{1}(v)=-\frac{2}{(v-1)^{3}}$ and ${\phi }''_{1}(v)=\frac{6}{(v-1)^{4}}$;
${\phi }'_{2}(v)=-\frac{v+1}{(v-1)^{3}}$ and ${\phi }''_{2}(v)=\frac{2v+4}{(v-1)^4}$;
${\phi }'_{3}(v)=-\frac{2v}{(v-1)^{3}}$ and ${\phi }''_{3}(v)=\frac{4v+2}{(v-1)^4}$

which will be used in subsequent solutions. With a general $\phi (v)$, the second version of our proposed model takes the form:

$$\begin{aligned}&\min _{\mathbf{u }} \frac{1}{2}\int _{\Omega }(T(\mathbf{x}+\mathbf{u})-R)^{2}\hbox {d}{} \mathbf x \nonumber \\&+\, \frac{\alpha }{2}\int _{\Omega }\sum _{\ell =1}^{2}|\nabla u_{\ell }|^{2}\hbox {d}{} \mathbf x + \beta \int _{\Omega }\phi (|\mu |^{2})\hbox {d}{} \mathbf x , \end{aligned}$$

(18)

where $|\mu |^{2} = \frac{(\partial _{x_{1}}u_{1}-\partial _{x_{2}}u_{2})^{2} +(\partial _{x_{1}}u_{2}+\partial _{x_{2}}u_{1})^2}{(\partial _{x_{1}}u_{1} +\partial _{x_{2}}u_{2}+2)^{2}+(\partial _{x_{1}}u_{2}-\partial _{x_{2}}u_{1})^2}$ is written in component form ready for discretization, using $y_1=x_{1}+u_1(x_1,x_2), \ y_2=x_{2}+u_2(x_1,x_2)$, and $\partial _{x_{1}}u_{1}=\partial u_1/\partial x_1$.
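A minimal sketch of the ingredients of the Beltrami term in (18) is given below: the component form of $|\mu |^{2}$ in terms of the displacement gradients, and the three choices of $\phi $ with first and second derivatives obtained by the quotient rule. The names `mu_squared`, `PHI` and `check_derivatives` are illustrative, and the finite-difference comparison is only a sanity check of the derivative expressions, not part of any solver.

```python
def mu_squared(u1_x1, u1_x2, u2_x1, u2_x2):
    """|mu|^2 in component form, with y = x + u (so y-derivatives are 1 + u-derivatives)."""
    num = (u1_x1 - u2_x2) ** 2 + (u2_x1 + u1_x2) ** 2
    den = (u1_x1 + u2_x2 + 2.0) ** 2 + (u2_x1 - u1_x2) ** 2
    return num / den


# (phi, phi', phi'') for the three choices; derivatives follow from the quotient rule.
PHI = {
    1: (lambda v: 1.0 / (v - 1.0) ** 2,
        lambda v: -2.0 / (v - 1.0) ** 3,          # note the minus sign
        lambda v: 6.0 / (v - 1.0) ** 4),
    2: (lambda v: v / (v - 1.0) ** 2,
        lambda v: -(v + 1.0) / (v - 1.0) ** 3,
        lambda v: (2.0 * v + 4.0) / (v - 1.0) ** 4),
    3: (lambda v: v * v / (v - 1.0) ** 2,
        lambda v: -2.0 * v / (v - 1.0) ** 3,
        lambda v: (4.0 * v + 2.0) / (v - 1.0) ** 4),
}


def check_derivatives(v=0.3, eps=1e-5):
    """Largest gap between analytic derivatives and central differences at v."""
    errs = {}
    for k, (phi, d1, d2) in PHI.items():
        fd1 = (phi(v + eps) - phi(v - eps)) / (2.0 * eps)
        fd2 = (d1(v + eps) - d1(v - eps)) / (2.0 * eps)
        errs[k] = max(abs(fd1 - d1(v)), abs(fd2 - d2(v)))
    return errs


print(mu_squared(0.0, 0.0, 0.0, 0.0))   # identity map: |mu|^2 = 0
print(check_derivatives())
```

At the initial guess $\mathbf{u}=\mathbf{0}$ all displacement gradients vanish, so $|\mu |^{2}=0$ and $\phi _{1}(0)=1$, $\phi _{2}(0)=\phi _{3}(0)=0$.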

Remark

The existence and uniqueness of a solution of (18) are beyond the scope of the present work and will be considered in our forthcoming work.

4 The Numerical Algorithm

In this section, we present a numerical algorithm to solve model (18). We choose the discretize-then-optimize approach: directly discretizing this variational model gives rise to a finite-dimensional optimization problem, which we then solve by optimization methods.

4.1 Discretization

We use finite differences to discretize model (18) on a unit square domain $\Omega =[0,1]^2$. In implementation, we employ the nodal grid and define a spatial partition $\Omega _{h} = \{\mathbf{x }^{i,j}\in \Omega \ |\ \mathbf{x }^{i,j}=(x_{1}^{i},x_{2}^{j})=(ih,jh), 0 \le i \le n, 0 \le j \le n\}$, where $h = \frac{1}{n}$ and the discrete domain consists of $n^{2}$ cells of size $h \times h$. We discretize the displacement field $\mathbf{u }$ on the nodal grid, namely $\mathbf u ^{i,j} = (u_{1}^{i,j},u_{2}^{i,j}) = (u_{1}(x_{1}^{i},x_{2}^{j}), u_{2}(x_{1}^{i},x_{2}^{j}))$. For ease of presentation, according to the lexicographical ordering, we reshape

$$\begin{aligned}&X = \left( x_{1}^{0},\ldots ,x_{1}^{n},\ldots ,x_{1}^{0},\ldots ,x_{1}^{n},x_{2}^{0},\ldots ,x_{2}^{0},\ldots ,x_{2}^{n},\ldots ,x_{2}^{n}\right) ^{T}\\&\qquad \in {{\mathbb {R}}}^{2(n+1)^{2}\times 1}, \end{aligned}$$

and

$$\begin{aligned}&U = \left( u_{1}^{0,0},\ldots ,u_{1}^{n,0},\ldots ,u_{1}^{0,n},\ldots ,u_{1}^{n,n}, u_{2}^{0,0},\ldots ,u_{2}^{n,0},\ldots ,u_{2}^{0,n},\ldots ,u_{2}^{n,n}\right) ^{T} \\&\qquad \in {{\mathbb {R}}}^{2(n+1)^{2}\times 1}. \end{aligned}$$
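The lexicographical ordering of $X$ and $U$ can be sketched as follows, with the index $i$ running fastest and the first-component block stacked above the second-component block. The helper names `nodal_grid_vector` and `node_index` are illustrative and not part of the paper.

```python
def nodal_grid_vector(n):
    """Return X as a flat list of length 2*(n+1)^2 on the unit square.

    First block: x1-coordinates of all nodes (i fastest, then j);
    second block: x2-coordinates in the same node order.
    """
    h = 1.0 / n
    x1_block = [i * h for j in range(n + 1) for i in range(n + 1)]
    x2_block = [j * h for j in range(n + 1) for i in range(n + 1)]
    return x1_block + x2_block


def node_index(i, j, n, component):
    """Zero-based position of u_component^{i,j} in U (component is 1 or 2)."""
    return (component - 1) * (n + 1) ** 2 + j * (n + 1) + i


X = nodal_grid_vector(2)
print(len(X))   # 2 * (2 + 1)^2 entries
print(X[:9])    # x1-coordinates, i running fastest
print(X[9:])    # x2-coordinates
```

With this layout, entry `node_index(i, j, n, 1)` of $X$ is $x_1^i$ and entry `node_index(i, j, n, 2)` is $x_2^j$, so $X$ and $U$ share one indexing rule.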

4.1.1 Discretization of Term 1 in (18)

According to the cell-centred partition in Fig. 1a and mid-point rule, we get

$$\begin{aligned} \begin{aligned} {\mathcal {D}}[\mathbf{u }]:&= \frac{1}{2}\int _{\Omega }(T(\mathbf{x }+\mathbf{u }(\mathbf{x }))-R(\mathbf{x }))^{2}\hbox {d}\mathbf{x }\\&= \frac{h^{2}}{2}\sum _{i=0}^{n-1}\sum _{j=0}^{n-1}(T(\mathbf{x }^{i+\frac{1}{2},j+\frac{1}{2}}\\&\quad +\mathbf{u }(\mathbf{x }^{i+\frac{1}{2},j+\frac{1}{2}}))-R(\mathbf{x }^{i+\frac{1}{2},j+\frac{1}{2}}))^{2}. \end{aligned} \end{aligned}$$

(19)

Set $\vec {R} = R(PX) \in {{\mathbb {R}}}^{n^{2}\times 1}$ as the discretized reference image and $\vec T(PX+PU) \in {\mathbb R}^{n^{2}\times 1}$ as the discretized deformed template image, where $P \in {{\mathbb {R}}}^{2n^{2} \times 2(n+1)^{2}}$ is an averaging matrix for the transfer from the nodal grid representation of U to the cell-centred positions.

Consequently, for SSD, we obtain the following discretization:

$$\begin{aligned} {\mathcal {D}}[\mathbf{u }] \approx \frac{h^2}{2} (\vec T(PX+PU)-\vec {R})^{T}(\vec T(PX+PU)-\vec {R}).\nonumber \\ \end{aligned}$$

(20)
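The discretization (20) can be sketched as follows. The four-point average used for the nodal-to-centre transfer is an assumption about the averaging matrix P (the text only states that P averages), and the interpolation of T at the deformed points is left out: the function `ssd` takes the already interpolated values as input. All names are illustrative.

```python
import numpy as np

def nodal_to_centre(v):
    """Average the 4 nodal values of each cell (assumed action of P)."""
    return 0.25 * (v[:-1, :-1] + v[1:, :-1] + v[:-1, 1:] + v[1:, 1:])

def ssd(T_deformed, R_centre, h):
    """Mid-point rule for (20): h^2/2 * sum of squared differences."""
    diff = T_deformed - R_centre
    return 0.5 * h ** 2 * np.sum(diff ** 2)

n = 4
h = 1.0 / n
x1_nodal = np.repeat((h * np.arange(n + 1))[:, None], n + 1, axis=1)
x1_centre = nodal_to_centre(x1_nodal)   # equals (i + 1/2) * h
```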

4.1.2 Discretization of Term 2 in (18)

For the diffusion regularizer,

$$\begin{aligned} {\mathcal {S}}_{\mathrm{diff}}[\mathbf{u }] := \frac{\alpha }{2}\int _{\Omega }\sum _{\ell =1}^{2}|\nabla u_{\ell }|^{2}\hbox {d}{} \mathbf x , \end{aligned}$$

(21)

according to the partition in Fig. 1b and mid-point rule, we have

$$\begin{aligned} \int _{\Omega _{i,j}^{x_{1}}}\vert \partial _{x_{1}} u_{\ell }\vert ^{2}\hbox {d}{} \mathbf x \approx h^{2}(\partial ^{i+\frac{1}{2},j}_{x_{1}}u_{\ell })^{2} \qquad 1 \le j\le n-1, \end{aligned}$$

(22)

or at the boundary half-boxes

$$\begin{aligned} \int _{\Omega _{i,j}^{x_{1}}}\vert \partial _{x_{1}} u_{\ell }\vert ^{2}\hbox {d}{} \mathbf x \approx \frac{h^{2}}{2}(\partial ^{i+\frac{1}{2},j}_{x_{1}}u_{\ell })^{2} \qquad j=0,n. \end{aligned}$$

(23)

And for $\int _{\Omega _{i,j}^{x_{2}}}\vert \partial _{x_{2}} u_{\ell }\vert ^{2}\hbox {d}{} \mathbf x ,\ \ell =1,2$, we have similar results.

As designed, we use compact (short) difference schemes to compute the $\partial _{x_{1}}u_{\ell }$ and $\partial _{x_{2}}u_{\ell },\ \ell =1,2$:

$$\begin{aligned} \partial ^{i+\frac{1}{2},j}_{x_{1}}u_{\ell }\approx & {} \frac{u_{\ell }^{i+1,j}-u_{\ell }^{i,j}}{h}, \nonumber \\ \partial ^{i,j+\frac{1}{2}}_{x_{2}}u_{\ell }\approx & {} \frac{u_{\ell }^{i,j+1}-u_{\ell }^{i,j}}{h}. \end{aligned}$$

(24)

Then (21) can be rewritten in the following formulation:

$$\begin{aligned} {\mathcal {S}}_{\mathrm{diff}}[\mathbf{u }] \approx \frac{\alpha h^2}{2}U^{T}A^{T}GAU. \end{aligned}$$

(25)

See “Appendix A” for details on A and G.

Remark

Note that here the matrix A is the discretized gradient matrix. So $A^{T}GA$ is the discretized Laplace matrix.
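As a matrix-free illustration of (22)–(25), the sketch below evaluates the discretized diffusion energy directly from the short differences (24), with half weights on the boundary half-boxes as in (23). It does not build A and G explicitly; the function name and the array layout `u[i, j]` are assumptions.

```python
import numpy as np

def diffusion_energy(u1, u2, h, alpha):
    """alpha/2 * integral of |grad u1|^2 + |grad u2|^2 on the nodal grid."""
    n = u1.shape[0] - 1
    w = np.ones(n + 1)
    w[0] = w[-1] = 0.5                      # boundary half-boxes, cf. (23)
    total = 0.0
    for u in (u1, u2):
        d1 = (u[1:, :] - u[:-1, :]) / h     # x1-differences, shape (n, n+1)
        d2 = (u[:, 1:] - u[:, :-1]) / h     # x2-differences, shape (n+1, n)
        total += np.sum(w[None, :] * d1 ** 2) + np.sum(w[:, None] * d2 ** 2)
    return 0.5 * alpha * h ** 2 * total
```

For a linear displacement the quadrature is exact: with $u_{1}=x_{1}+2x_{2}$ and $u_{2}=0$ the integral of $|\nabla u_{1}|^{2}$ over $\Omega $ is 5, so the energy is $2.5\alpha $.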

4.1.3 Discretization of Term 3 in (18)

For simplicity, denote $|\mu ( \mathbf{y })| =|\mu ( \mathbf{x } +\mathbf u )|$ by $|\mu ( \mathbf{u })|$. From (18), note that $\phi (|\mu ( \mathbf{u })|^{2})$ involves only first-order derivatives and all $\mathbf{u }^{i,j}$ are available at vertex pixels. Thus it is convenient first to obtain approximations at all cell centres (e.g. at $V_5$ in Fig. 2) and second to use local linear elements to facilitate first-order derivatives. We shall divide each cell (Fig. 2) into 4 triangles. In each triangle, we construct two linear interpolation functions to approximate the $u_{1}$ and $u_{2}$. Consequently, all partial derivatives are locally constants or $\phi (|\mu ( \mathbf{u })|^{2})$ is constant in each triangle.

According to the partition in Fig. 2, we get

$$\begin{aligned} {\mathcal {S}}_{\mathrm{Beltrami}}[\mathbf u ]:= & {} \beta \int _{\Omega }\phi (|\mu ( \mathbf u )|^{2})\hbox {d}{} \mathbf x \nonumber \\= & {} \beta \sum _{i=1}^{n}\sum _{j=1}^{n}\sum _{k=1}^{4}\int _{\Omega _{i,j,k}}\phi (|\mu (\mathbf u )|^{2})\hbox {d}{} \mathbf x . \end{aligned}$$

(26)

Set $\mathbf{L }^{i,j,k}(\mathbf{x })= (L_{1}^{i,j,k}(\mathbf x ),L_{2}^{i,j,k}(\mathbf{x }))= (a^{i,j,k}_{1}x_{1}+a^{i,j,k}_{2}x_{2}+a^{i,j,k}_{3}, a^{i,j,k}_{4}x_{1}+a^{i,j,k}_{5}x_{2}+a^{i,j,k}_{6})$, which is the linear interpolation for $\mathbf{u }$ in $\Omega _{i,j,k}$. Note that $\partial _{x_{1}} L^{i,j,k}_{1} = a^{i,j,k}_{1}, \partial _{x_{2}} L^{i,j,k}_{1} = a^{i,j,k}_{2},\partial _{x_{1}} L^{i,j,k}_{2} = a^{i,j,k}_{4}$ and $\partial _{x_{2}} L^{i,j,k}_{2} = a^{i,j,k}_{5}$. According to (18), the discretization of the Beltrami regularizer can be written as follows:

$$\begin{aligned}&{\mathcal {S}}_{\mathrm{Beltrami}}[\mathbf{u }] \approx \frac{\beta h^{2}}{4}\sum _{i=1}^{n}\sum _{j=1}^{n}\sum _{k=1}^{4} \phi \nonumber \\&\left( \frac{\left( a^{i,j,k}_{1}-a^{i,j,k}_{5}\right) ^{2} +\left( a^{i,j,k}_{2}+a^{i,j,k}_{4}\right) ^{2}}{\left( a^{i,j,k}_{1}+a^{i,j,k}_{5}+2\right) ^{2} +\left( a^{i,j,k}_{2}-a^{i,j,k}_{4}\right) ^{2}}\right) . \end{aligned}$$

(27)

To simplify (27), define 3 vectors $\vec {\mathbf{r }}(U), \vec {\mathbf{r }}^{1}(U), \vec {\mathbf{r }}^{2}(U)$ $\in \mathbb {R}^{4n^{2}}$ by $\vec {\mathbf{r }}(U)_{\ell }=\vec {\mathbf{r }}^{1}(U)_{\ell } \vec {\mathbf{r }}^{2}(U)_{\ell }$, $\vec {\mathbf{r }}^{1}(U)_{\ell }=(a^{i,j,k}_{1}-a^{i,j,k}_{5})^{2} +(a^{i,j,k}_{2}+a^{i,j,k}_{4})^{2}$, $\vec {\mathbf{r }}^{2}(U)_{\ell }=1\big /[(a^{i,j,k}_{1}+a^{i,j,k}_{5}+2)^{2} +(a^{i,j,k}_{2}-a^{i,j,k}_{4})^{2}]$ where $\ell = (k-1)n^{2}+(j-1)n+i\ \in [1, 4n^2]$. Hence, (27) becomes

$$\begin{aligned} {\mathcal {S}}_{\mathrm{Beltrami}}[\mathbf{u }] \approx \frac{\beta h^{2}}{4}{\varvec{\phi }}(\vec {\mathbf{r }}(U))e^{T} \end{aligned}$$

(28)

where ${\varvec{\phi }}(\vec {\mathbf{r }}(U)) = (\phi (\vec {\mathbf{r }}(U)_{1}),\ldots ,\phi (\vec {\mathbf{r }}(U)_{4n^{2}}))$ collects the values of $\phi $ on all $4n^{2}$ triangles, and $e = (1,\ldots ,1)\in \mathbb {R}^{4n^{2}}$. Here, $\vec {\mathbf{r }}(U)$ is the square of the discretized Beltrami coefficient; we rewrite it in a compact form in “Appendix B”.
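The triangle-wise construction leading to (27) and (28) can be sketched as below. Several choices are assumptions, since Fig. 2 is not reproduced here: the cell-centre value is taken as the average of the four vertex values, the four triangles are ordered bottom, right, top, left, and $\phi $ is passed in as a callable rather than fixed to one of $\phi _{1},\phi _{2},\phi _{3}$. The result array is kept in the shape (4, n, n) instead of being flattened by the index $\ell $.

```python
import numpy as np

def triangle_gradients(u, h):
    """Gradients of the piecewise-linear interpolant of nodal u.

    Returns (dx1, dx2), each of shape (4, n, n): one constant gradient
    per triangle (bottom, right, top, left) of every cell.
    """
    v1, v2 = u[:-1, :-1], u[1:, :-1]       # lower-left, lower-right
    v3, v4 = u[:-1, 1:], u[1:, 1:]         # upper-left, upper-right
    v5 = 0.25 * (v1 + v2 + v3 + v4)        # assumed cell-centre value
    dx1 = np.stack([v2 - v1, v2 + v4 - 2 * v5, v4 - v3, 2 * v5 - v1 - v3]) / h
    dx2 = np.stack([2 * v5 - v1 - v2, v4 - v2, v3 + v4 - 2 * v5, v3 - v1]) / h
    return dx1, dx2

def beltrami_r(u1, u2, h):
    """Squared discrete Beltrami coefficient r = r1 * r2 on each triangle."""
    a1, a2 = triangle_gradients(u1, h)
    a4, a5 = triangle_gradients(u2, h)
    r1 = (a1 - a5) ** 2 + (a2 + a4) ** 2
    r2 = 1.0 / ((a1 + a5 + 2.0) ** 2 + (a2 - a4) ** 2)
    return r1 * r2

def beltrami_energy(u1, u2, h, beta, phi):
    """Discrete Beltrami term: beta * h^2 / 4 * sum of phi(r)."""
    return beta * h ** 2 / 4.0 * np.sum(phi(beltrami_r(u1, u2, h)))
```

For the stretch $u_{1}=x_{1}$, $u_{2}=0$ every triangle gives $r=1/9$, which agrees with the continuous formula for $|\mu |^{2}$.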

Finally, combining the above three parts (20), (25) and (28), we get the discretization formulation for model (18):

$$\begin{aligned} \begin{aligned}&\min _{U} J(U):= \frac{h^2}{2}(\vec T(PX+PU)-\vec {R})^{T}(\vec T(PX+PU)-\vec {R})\\&\quad + \frac{\alpha h^2}{2}U^{T}A^{T}GAU+ \frac{\beta h^{2}}{4}{\varvec{\phi }}(\vec {\mathbf{r }}(U))e^{T}. \end{aligned} \end{aligned}$$

(29)

Remark

According to the definition of $\phi $ and $\vec {\mathbf{r }}(U)_{\ell } \ge 0$, each component of ${\varvec{\phi }}(\vec {\mathbf{r }}(U))$ is nonnegative and differentiable.

4.2 Optimization Method for the Discretized Problem (29)

In the numerical implementation, we choose a line search method to solve the resulting unconstrained optimization problem (29). To guarantee that the search direction is a descent direction, we employ the Gauss–Newton direction, since the standard Newton direction with an indefinite Hessian need not be a descent direction. Moreover, the Gauss–Newton approach has two advantages: first, we do not need to compute the second-order term, which saves computation time; second, the Gauss–Newton matrix is often more important than the second-order term, either because of small second-order derivatives or because of small residuals [42].

Let $J(U): \mathbb {R}^{2(n+1)^{2}}\rightarrow \mathbb {R}$ be twice continuously differentiable, $U^{i}\in \mathbb {R}^{2(n+1)^{2}}$ and the approximated Hessian $H(U^{i})$ positive definite. We model J at the current point $U^{i}$ by the quadratic approximation $q^{i}(s)$,

$$\begin{aligned} J(U^{i}+s)\approx q^{i}(s)= & {} J(U^{i})+d_{J}(U^{i})^{T}s \nonumber \\&+ \frac{1}{2}s^{T}H(U^{i})^{T}s, \end{aligned}$$

(30)

where $s= U-U^{i}$ and $d_{J}(U^{i}) = \nabla J(U^{i})$. Minimizing $q^{i}(s)$ yields

$$\begin{aligned} U^{i+1} = U^{i}-[H(U^{i})]^{-1}d_{J}(U^{i}). \end{aligned}$$

(31)

In order to guarantee the global convergence of the Gauss–Newton method, we employ the line search and its iteration is as follows:

$$\begin{aligned} U^{i+1} = U^{i}-\theta _{i}[H(U^{i})]^{-1}d_{J}(U^{i}). \end{aligned}$$

(32)

where $\theta _{i}$ is a step length.

Next, we will investigate the details about the approximated Hessian $H(U^{i})$, step length $\theta _{i}$, stopping criteria and multilevel strategy.

4.2.1 Approximated Hessian H

We consider each of the three terms in J(U) from (29) separately.

Firstly, we consider the discretized SSD

$$\begin{aligned} \frac{h^{2}}{2}(\vec T(PX+PU)-\vec {R})^{T}(\vec T(PX+PU)-\vec {R}). \end{aligned}$$

(33)

Its gradient and Hessian are, respectively,

$$\begin{aligned} \left\{ \begin{array}{lcl} d_{1} &{}=&{} h^{2}P^{T}\vec {T}_{\tilde{\mathbf{U }}}^{T}(\vec T(\tilde{\mathbf{U }})-\vec {R})\in {{\mathbb {R}}}^{2(n+1)^{2}\times 1},\\ H_{1} &{}=&{} h^{2}P^{T}(\vec {T}_{\tilde{\mathbf{U }}}^{T}\vec {T}_{\tilde{\mathbf{U }}} + \sum _{\ell = 1}^{n^{2}}(\vec T(\tilde{\mathbf{U }})-\vec {R})_{\ell }\nabla ^{2}(\vec T(\tilde{\mathbf{U }})-\vec {R})_{\ell })P \end{array} \right. \end{aligned}$$

(34)

where $\tilde{\mathbf{U }} =PX+PU$ and $\vec {T}_{\tilde{\mathbf{U }}} = \frac{\partial \vec T(\tilde{\mathbf{U }}) }{\partial \tilde{\mathbf{U }}}$ as the Jacobian of $\vec T$ with respect to $ \tilde{\mathbf{U }}$.

For $H_{1}$, we cannot ensure that it is positive semi-definite. If it is not positive definite, we may not get a descent direction. So we omit the second-order term of $H_{1}$ to obtain the approximated Hessian of (33):

$$\begin{aligned} {\hat{H}}_{1} = h^{2}P^{T}(\vec {T}_{\tilde{\mathbf{U }}}^{T}\vec {T}_{\tilde{\mathbf{U }}})P. \end{aligned}$$

(35)
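A dense toy version of (34) and (35) is sketched below. It assumes that the interpolated values $\vec T(\tilde{\mathbf{U }})$ and the Jacobian $\vec {T}_{\tilde{\mathbf{U }}}$ have already been computed elsewhere (for instance by a spline interpolation routine) and are passed in as arrays. The names are illustrative; a practical code would use sparse matrices.

```python
import numpy as np

def ssd_gauss_newton(T_vals, R_vals, T_jac, P, h):
    """Gradient d1 and Gauss-Newton Hessian of the discretized SSD.

    T_vals, R_vals : (m,) deformed template and reference at cell centres
    T_jac          : (m, 2m) Jacobian of T with respect to the positions
    P              : (2m, N) nodal-to-centre averaging matrix
    """
    residual = T_vals - R_vals
    JP = T_jac @ P                       # chain rule through U~ = PX + PU
    d1 = h ** 2 * JP.T @ residual        # first line of (34)
    H1_hat = h ** 2 * JP.T @ JP          # (35): second-order term dropped
    return d1, H1_hat
```

Because `H1_hat` has the form $B^{T}B$, it is symmetric positive semi-definite by construction, which is the property used to obtain a descent direction.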

Remark

Evaluation of the deformed template image T must involve interpolation because the points $\tilde{\mathbf{U }}$ do not in general correspond to pixel points; in our implementation, as with [39], we use B-spline interpolation to get $\vec T(\tilde{\mathbf{U }})$.

Secondly, for the discretized diffusion regularizer $\frac{\alpha h^{2}}{2} U^{T}A^{T}GAU$,

its gradient and Hessian are the following:

$$\begin{aligned} \left\{ \begin{array}{lcl} d_{2} &{}=&{} \alpha h^{2}A^{T}GAU \in {{\mathbb {R}}}^{2(n+1)^{2}\times 1},\\ H_{2} &{}=&{} \alpha h^{2}A^{T}GA \in {{\mathbb {R}}}^{2(n+1)^{2}\times 2(n+1)^{2}}. \end{array} \right. \end{aligned}$$

(36)

Since $H_{2}$ is positive definite when U is applied with Dirichlet boundary conditions, we do not approximate it.

Finally, for the discretized Beltrami term

$$\begin{aligned} \frac{\beta h^{2}}{4}{\varvec{\phi }}(\vec {\mathbf{r }}(U))e^{T}, \end{aligned}$$

(37)

the gradient and the Hessian are as follows:

$$\begin{aligned} \left\{ \begin{array}{lcl} d_{3} &{}=&{} \frac{\beta h^{2}}{4} \hbox {d}\vec {\mathbf{r }}^{T}\hbox {d}{\varvec{\phi }}(\vec {\mathbf{r }}) \in {{\mathbb {R}}}^{2(n+1)^{2}\times 1},\\ H_{3} &{}=&{} \frac{\beta h^{2}}{4} (\hbox {d}\vec {\mathbf{r }}^{T}\hbox {d}^{2}{\varvec{\phi }}(\vec {\mathbf{r }})\hbox {d}\vec {\mathbf{r }} + \sum _{\ell =1}^{4n^{2}}[\hbox {d}{\varvec{\phi }}(\vec {\mathbf{r }})]_{\ell }\nabla ^{2}\vec {\mathbf{r }}_{\ell }) \in {\mathbb R}^{2(n+1)^{2}\times 2(n+1)^{2}} \end{array} \right. \nonumber \\ \end{aligned}$$

(38)

where $\hbox {d}{\varvec{\phi }}(\vec {\mathbf{r }})= (\phi '(\vec {\mathbf{r }}_{1}),\ldots ,\phi '(\vec {\mathbf{r }}_{4n^{2}}))^{T}$ is the vector of derivatives of ${\varvec{\phi }}$ at all cell centres,

$$\begin{aligned} \left\{ \begin{array}{lcl} \hbox {d}\vec {\mathbf{r }}\ &{} = &{} {{\mathrm{diag}}}(\vec {\mathbf{r }}^{1})\hbox {d}\vec {\mathbf{r }}^{2}+{{\mathrm{diag}}}(\vec {\mathbf{r }}^{2})\hbox {d}\vec {\mathbf{r }}^{1}, \\ \hbox {d}\vec {\mathbf{r }}^{1} &{} =&{} 2{{\mathrm{diag}}}(A_{1}U)A_{1} + 2{{\mathrm{diag}}}(A_{2}U)A_{2}, \\ \hbox {d}\vec {\mathbf{r }}^{2} &{} =&{} -{{\mathrm{diag}}}(\vec {\mathbf{r }}^{2}\odot \vec {\mathbf{r }}^{2})[2{{\mathrm{diag}}}(A_{3}U+2)A_{3} + 2{{\mathrm{diag}}}(A_{4}U)A_{4}], \end{array} \right. \end{aligned}$$

(39)

Here $\odot $ denotes the Hadamard product, $\hbox {d}\vec {\mathbf{r }}, \hbox {d}\vec {\mathbf{r }}^{1}, \hbox {d}\vec {\mathbf{r }}^{2}$ are the Jacobians of $\vec {\mathbf{r }}, \vec {\mathbf{r }}^{1}, \vec {\mathbf{r }}^{2}$ with respect to U, respectively, $ [\hbox {d}{\varvec{\phi }}(\vec {\mathbf{r }})]_{\ell }$ is the $\ell $th component of $\hbox {d}{\varvec{\phi }}(\vec {\mathbf{r }})$ and $\hbox {d}^{2}{\varvec{\phi }}(\vec {\mathbf{r }})$ is the Hessian of ${\varvec{\phi }}$ with respect to $\vec {\mathbf{r }}$, which is a diagonal matrix whose ith diagonal element is $\phi ''(\vec {\mathbf{r }}_{i}),\ 1\le i \le 4n^{2}$. Here ${{\mathrm{diag}}}(v)$ is a diagonal matrix with v on its main diagonal. More details about $\vec {\mathbf{r }}^{1}$, $\vec {\mathbf{r }}^{2}$, $A_1$, $A_2$, $A_3$ and $A_4$ are shown in “Appendix B” and some illustration of our notation is given in “Appendix C”.

To extract a positive semi-definite part out of (38), we omit the second-order term and obtain the approximated Hessian as

$$\begin{aligned} {\hat{H}}_{3} = \frac{\beta h^{2}}{4} \hbox {d}\vec {\mathbf{r }}^{T}\hbox {d}^{2}{\varvec{\phi }}(\vec {\mathbf{r }})\hbox {d}\vec {\mathbf{r }}. \end{aligned}$$

(40)
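The Jacobian formulas (39) and the resulting $d_{3}$ and ${\hat{H}}_{3}$ from (38) and (40) can be written in a few lines of NumPy. In this sketch the matrices $A_{1},\ldots ,A_{4}$ are treated as given dense arrays (their actual structure is in “Appendix B”), and $\phi '$, $\phi ''$ are passed as callables; all function names are assumptions. Row scaling by a vector replaces the explicit diag(·) products.

```python
import numpy as np

def r_and_jacobian(U, A1, A2, A3, A4):
    """Return r(U) = r1 * r2 and its Jacobian dr, following (39)."""
    p, q = A1 @ U, A2 @ U
    s, t = A3 @ U + 2.0, A4 @ U
    r1 = p ** 2 + q ** 2
    r2 = 1.0 / (s ** 2 + t ** 2)
    dr1 = 2.0 * p[:, None] * A1 + 2.0 * q[:, None] * A2
    dr2 = -(r2 ** 2)[:, None] * (2.0 * s[:, None] * A3 + 2.0 * t[:, None] * A4)
    dr = r1[:, None] * dr2 + r2[:, None] * dr1
    return r1 * r2, dr

def beltrami_grad_hess(U, A, beta, h, dphi, d2phi):
    """Gradient d3 of (38) and Gauss-Newton Hessian H3_hat of (40)."""
    r, dr = r_and_jacobian(U, *A)
    d3 = beta * h ** 2 / 4.0 * dr.T @ dphi(r)
    H3_hat = beta * h ** 2 / 4.0 * dr.T @ (d2phi(r)[:, None] * dr)
    return d3, H3_hat
```

A finite-difference comparison of `dr` against perturbed evaluations of `r` is a cheap way to validate an implementation of (39) before it is used inside the optimizer.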

Therefore for functional J(U) in (29) with any choice of $\phi $, we obtain its gradient

$$\begin{aligned} d_{J} = d_{1}+d_{2}+d_{3} \end{aligned}$$

(41)

and approximated Hessian:

$$\begin{aligned} H = {\hat{H}}_{1}+H_{2}+{\hat{H}}_{3}. \end{aligned}$$

(42)

4.2.2 Search Direction

At each iteration, using (41) and (42), we need to solve the Gauss–Newton system to find the search direction of (29):

$$\begin{aligned} H\delta U=-\,d_{J}, \end{aligned}$$

(43)

where $\delta U$ is the search direction. In our implementation, we use MINRES with diagonal preconditioning to solve this linear system [2, 43].

4.2.3 Step Length

We use the standard Armijo strategy with backtracking to find a suitable step length $\theta $. In the implementation, we also need to check that each component of $\vec {\mathbf{r }}(U)$ (54) is smaller than 1. Recall that $\vec {\mathbf{r }}(U)$ is the squared modulus of the discretized Beltrami coefficient. As a safeguard, we choose T0 = $10^{-8}$ and Tol = $10^{-12}$ as the lower bounds of the step length $\theta $ and of $\theta \Vert \delta U\Vert $, respectively [7, 28, 42, 48]. The algorithm is summarized in Algorithm 1.
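A minimal sketch of a backtracking Armijo search with the additional feasibility check on $\vec {\mathbf{r }}(U)$ is given below. It is not a transcription of Algorithm 1: the halving factor, the sufficient-decrease constant `eta` and the function signature are assumptions, while the two lower bounds T0 and Tol are the values quoted in the text.

```python
import numpy as np

def armijo_step(J, r_of, U, dU, dJ, theta_min=1e-8, tol=1e-12, eta=1e-4):
    """Backtracking line search along dU.

    J    : callable returning the objective value
    r_of : callable returning the vector r(U); all entries must stay < 1
    dJ   : gradient of J at U
    Returns (theta, U_new); theta == 0.0 signals that no step was accepted.
    """
    J0 = J(U)
    slope = float(dJ @ dU)               # negative for a descent direction
    theta = 1.0
    while theta >= theta_min and theta * np.linalg.norm(dU) >= tol:
        U_trial = U + theta * dU
        feasible = np.max(r_of(U_trial)) < 1.0
        if feasible and J(U_trial) <= J0 + eta * theta * slope:
            return theta, U_trial
        theta *= 0.5                     # assumed backtracking factor
    return 0.0, U
```

Rejecting trial points with some $\vec {\mathbf{r }}(U)_{\ell }\ge 1$ keeps the iterates in the region where the discrete transformation has no folded triangles.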

4.2.4 Stopping Criteria

Here, we adopt the stopping criteria as in [39]:

(1.a)
$\Vert J( U^{i+1})-J( U^{i})\Vert \le \tau _{J}(1+\Vert J( U^{0})\Vert )$,
(1.b)
$\Vert \mathbf{y }^{i+1}-\mathbf{y }^{i}\Vert \le \tau _{W}(1+\Vert \mathbf{y }^{0}\Vert )$,
(1.c)
$\Vert d_{J}\Vert \le \tau _{G}(1+\Vert J( U^{0})\Vert )$,
(2)
$\Vert d_{J}\Vert \le $ eps,
(3)
$i \ge $ MaxIter.

Here, eps is the machine precision and MaxIter is the maximal number of outer iterations. We set $\tau _{J} = 10^{-3}$, $\tau _{W} = 10^{-2}$, $\tau _{G} = 10^{-2}$ and MaxIter$ = 500$. If any one of (1), (2) and (3) is satisfied, the iterations are terminated. The resulting Gauss–Newton scheme with Armijo line search is summarized in Algorithm 2.
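The stopping test can be coded as a small predicate. The sketch below reads criterion (1) as the conjunction of (1.a), (1.b) and (1.c); this reading is an assumption, as are the function name and argument order. The tolerances are the values stated above.

```python
import numpy as np

def should_stop(J_new, J_old, J0, y_new, y_old, y0, dJ, it,
                tau_J=1e-3, tau_W=1e-2, tau_G=1e-2, max_iter=500):
    """Return True when the outer iteration should terminate."""
    eps = np.finfo(float).eps
    grad_norm = np.linalg.norm(dJ)
    c1a = abs(J_new - J_old) <= tau_J * (1.0 + abs(J0))
    c1b = np.linalg.norm(y_new - y_old) <= tau_W * (1.0 + np.linalg.norm(y0))
    c1c = grad_norm <= tau_G * (1.0 + abs(J0))
    c2 = grad_norm <= eps                # gradient at machine precision
    c3 = it >= max_iter                  # iteration budget exhausted
    return bool((c1a and c1b and c1c) or c2 or c3)
```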

Next, we discuss the global convergence of Algorithm 2 for our reformulated problem (29). Firstly, we review a relevant theorem.

Theorem 1

([28]) For the unconstrained optimization problem

$$\begin{aligned} \min _{U} J(U) \end{aligned}$$

let an iterative sequence be defined by $U^{i+1}=U^{i}+\theta \delta U^{i}$, where $\delta U^{i}=-(H^{i})^{-1}d_{J}(U^{i})$ and $\theta $ is obtained by Algorithm 1. Assume that three conditions are met: (i) $d_{J}$ is Lipschitz continuous; (ii) the matrices $H^{i}$ are SPD; (iii) there exist constants ${\bar{\kappa }}$ and $\lambda $ such that the condition number $\kappa (H^{i})\le {\bar{\kappa }}$ and the norm $||H^{i}||\le \lambda $ for all i. Then either $J(U^{i})$ is unbounded from below or

$$\begin{aligned} \lim _{i\rightarrow \infty } d_{J}(U^{i})=0 \end{aligned}$$

(44)

and hence any limit point of the sequence of iterates is a stationary point.

Remark

In the above discretization leading to (29), we do not need to introduce a boundary condition. However, for theoretical purposes, in the following we prove our convergence result under the Dirichlet boundary condition (namely, the boundary is fixed); this condition is needed to prove the symmetric positive definite (SPD) property of the approximated Hessians. In practical implementation, such a condition is not required, as confirmed by experiments.

In addition, define an important set ${\mathcal {X}}:=\{U \ | \ \vec {\mathbf{r }}(U)_{\ell }\le 1-\epsilon , 1 \le \ell \le 4n^{2}\}$ for small $\epsilon $. So $U\in {\mathcal {X}}$ means that the transformation is diffeomorphic. For a suitable $\beta $, we assume that each $U^{i}$ generated by Algorithm 2 is in ${\mathcal {X}}$.

Secondly, we state a simple lemma that is needed shortly for studying $H^i$.

Lemma 2

Let a matrix H be the sum of three matrices, $H = H_{1}+H_{2}+H_{3}$. If $H_{1}$ and $H_{2}$ are symmetric positive semi-definite and $H_{3}$ is SPD, then H is SPD with $\lambda _{h_{3}}\le \lambda _{h}$, where $\lambda _{h_{3}}$ and $\lambda _{h}$ are the minimum eigenvalues of $H_{3}$ and H, respectively.

Proof

According to the Rayleigh quotient, we can find a vector v such that

$$\begin{aligned} \lambda _{h} = \frac{v^{T}Hv}{v^{T}v}. \end{aligned}$$

(45)

Then we have

$$\begin{aligned} \lambda _{h_{3}}\le \frac{v^{T}H_{1}v}{v^{T}v}+\frac{v^{T}H_{2}v}{v^{T}v}+\frac{v^{T}H_{3}v}{v^{T}v} = \frac{v^{T}Hv}{v^{T}v} = \lambda _{h}, \end{aligned}$$

(46)

which completes the proof. $\square $
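Lemma 2 is easy to check numerically on random matrices. The sketch below is only an illustration under arbitrary choices (matrix size 5, ranks 2 and 3, a shift of 0.1 to make $H_{3}$ strictly positive definite); it is not part of the proof.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_psd(n, rank):
    """Random symmetric positive semi-definite matrix of the given rank."""
    B = rng.standard_normal((rank, n))
    return B.T @ B

H1 = random_psd(5, 2)                      # semi-definite, rank deficient
H2 = random_psd(5, 3)                      # semi-definite, rank deficient
H3 = random_psd(5, 5) + 0.1 * np.eye(5)    # strictly positive definite
H = H1 + H2 + H3

lam_h3 = np.linalg.eigvalsh(H3)[0]         # smallest eigenvalue of H3
lam_h = np.linalg.eigvalsh(H)[0]           # smallest eigenvalue of H
```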

Theorem 3

Assume that T and R are twice continuously differentiable. For (29), when $\phi =\phi _{1},\phi _{2}$ or $\phi _{3}$, by using Algorithm 2, we obtain

$$\begin{aligned} \lim _{i\rightarrow \infty }d_{J}(U^{i})=0 \end{aligned}$$

(47)

and hence any limit point of the sequence of iterates produced by Algorithm 2 is a stationary point.

Proof

It suffices to show that Algorithm 2 satisfies the requirements of Theorem 1. Recall that $\vec {\mathbf{r }}(U)$ is continuous. Here, we use the Dirichlet boundary condition and we can assume that $\Vert U\Vert $ is bounded. Then $\vec {\mathbf{r }}(U)$ is a continuous mapping from a compact set to $\mathbb {R}^{4n^{2}\times 1}$ and $\vec {\mathbf{r }}(U)$ is proper. So for some small $\epsilon >0$, ${\mathcal {X}}$ is compact.

Firstly, we show that in ${\mathcal {X}}$, $d_{J}$ of (29) is Lipschitz continuous. When $\phi =\phi _{1},\phi _{2}$ or $\phi _{3}$, the term ${\varvec{\phi }}(\vec {\mathbf{r }}(U))e^{T}$ in (29) is twice continuously differentiable with respect to $U \in {\mathcal {X}}$. In addition, T and R are twice continuously differentiable. So (29) is twice continuously differentiable with respect to $U \in {\mathcal {X}}$ and $d_{J}$ is Lipschitz continuous.

Secondly, we show that in ${\mathcal {X}}$, $H^{i}={\hat{H}}^{i}_{1}+ H^{i}_{2}+{\hat{H}}^{i}_{3}$ is SPD. By the construction of $\hat{H}^{i}_{1}$ and ${\hat{H}}^{i}_{3}$, they are symmetric positive semi-definite. $H^{i}_{2}$ is symmetric positive definite under the Dirichlet boundary condition. Consequently $H^{i}$ is SPD.

Thirdly, we show that both $\kappa (H^i)$ and $\Vert H^{i}\Vert $ are bounded. We notice that in each iteration, $H^{i}_{2}=\alpha h^{2}A^{T}GA$ is constant and we can set $\Vert H^{i}_{2}\Vert = M_{2}$. For ${\hat{H}}^{i}_{1} = h^{2}P^{T}(\vec {T}_{\tilde{\mathbf{U }}}^{T} \vec {T}_{\tilde{\mathbf{U }}})P$, we get its upper bound $M_{1}$ because T is twice continuously differentiable and ${\mathcal {X}}$ is compact. For $\phi =\phi _{1},\phi _{2}$ or $\phi _{3}$, $\phi $ is twice continuously differentiable with respect to $U \in {\mathcal {X}}$, then we have $\Vert {\hat{H}}^{i}_{3}\Vert \le \frac{\beta h^{2}}{4}\Vert \hbox {d}\vec {\mathbf{r }}^{T}\Vert \Vert \hbox {d}^{2}{\varvec{\phi }}(\vec {\mathbf{r }})\Vert \Vert \hbox {d}\vec {\mathbf{r }}\Vert \le M_{3}$. Hence, we have

$$\begin{aligned} \Vert H^{i}\Vert \le \Vert {\hat{H}}^{i}_{1}\Vert +\Vert H^{i}_{2}\Vert +\Vert {\hat{H}}^{i}_{3}\Vert \le M_{1} + M_{2} + M_{3}. \end{aligned}$$

(48)

So set $M=M_{1}+M_{2}+M_{3}$ and $\Vert H^{i}\Vert \le M$. Let $\sigma $ be the minimum eigenvalue of $H^{i}_{2}$. According to Lemma 2, the smallest eigenvalue $\lambda _{min}$ of $H^{i}$ is no smaller than $\sigma $. The largest eigenvalue $\lambda _{max}$ of $H^{i}$ is no larger than M due to $\lambda _{max}\le \Vert H^{i}\Vert $. So the condition number of $H^{i}$ is no larger than $\frac{M}{\sigma }$.

Finally, we can find that (29) has lower bound 0. So by applying Theorem 1, we finish the proof. $\square $

4.3 Multilevel Strategy

In practice, we employ a multilevel strategy. We first coarsen the template T and the reference R by L levels. Here, we set $T_{L} = T$ and $R_{L} = R$ on the finest level and $T_{1}$ and $R_{1}$ on the coarsest level. Then we obtain $U_{1}$ by solving our model (18) on the coarsest level. To give a good initial guess for the finer level, we apply an interpolation operator to $U_{1}$ to obtain $U_{2}^{0}$ as the initial guess for the next level. We repeat this process and get the final registration on the finest level. A multilevel strategy has several advantages: on the coarse level, only important patterns are considered, which is a standard technique to avoid getting trapped in a meaningless local minimum; the computation is fast because there are fewer variables than on the fine level; and the solution on the coarse level can be a good initial guess for the fine level.

The multilevel scheme representing our main algorithm is summarized below where $I_{H}^{h}$ is an interpolation operator based on bi-linear interpolation techniques and $I_{h}^{H}$ is a restriction operator for transferring information to a coarser level.
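The coarse-to-fine loop described in this subsection can be sketched as follows. The restriction by 2×2 block averaging, the zero initial guess on the coarsest level and the callback `solve_level` (standing in for the single-level Gauss–Newton solver) are assumptions made for illustration; the bilinear prolongation on the nodal grid follows the description of $I_{H}^{h}$.

```python
import numpy as np

def restrict(img):
    """Coarsen a cell-centred image by averaging 2x2 blocks (assumed I_h^H)."""
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2]
                   + img[0::2, 1::2] + img[1::2, 1::2])

def prolong(u):
    """Bilinear interpolation of nodal values from n+1 to 2n+1 points."""
    n = u.shape[0] - 1
    fine = np.empty((2 * n + 1, 2 * n + 1))
    fine[0::2, 0::2] = u
    fine[1::2, 0::2] = 0.5 * (u[:-1, :] + u[1:, :])
    fine[0::2, 1::2] = 0.5 * (u[:, :-1] + u[:, 1:])
    fine[1::2, 1::2] = 0.25 * (u[:-1, :-1] + u[1:, :-1] + u[:-1, 1:] + u[1:, 1:])
    return fine

def multilevel(T, R, levels, solve_level):
    """Solve from the coarsest to the finest level, reusing each solution."""
    pyramid = [(T, R)]
    for _ in range(levels - 1):
        T, R = restrict(T), restrict(R)
        pyramid.append((T, R))
    pyramid.reverse()                        # coarsest level first
    n = pyramid[0][0].shape[0]
    u1 = np.zeros((n + 1, n + 1))
    u2 = np.zeros((n + 1, n + 1))
    for level, (Tl, Rl) in enumerate(pyramid):
        if level > 0:
            u1, u2 = prolong(u1), prolong(u2)   # initial guess for this level
        u1, u2 = solve_level(Tl, Rl, u1, u2)
    return u1, u2
```

Since the displacement is measured in the coordinates of $\Omega =[0,1]^{2}$ on every level, its values need no rescaling when passing to a finer grid.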

5 Numerical Results

In this section, we will give some numerical results to illustrate the performance of our model (18). We hope to achieve 3 aims:

(1) Which choice of $\phi $ is the best for our model (18)?
(2) We wish to compare with the current state-of-the-art methods (with codes listed for readers’ benefit) in the literature for good diffeomorphic mapping:

(a)
Hyperelastic Model [7]: code from http://www.siam.org/books/fa06/
(b)
LDDMM [37]: code from https://github.com/C4IR/FAIR.m/tree/master/add-ons/LagLDDMM
(c)
Diffeomorphic Demons (DDemons) [51]: code from http://www.insight-journal.org/browse/publication/154
(d)
QCHR [30]: code provided by the author Dr. Kam Chu Lam.

(3) Most importantly, we would like to test and highlight the advantages of our new model.

All of the tests are performed on a PC with 3.40 GHz Intel(R) Core(TM) i7-4770 microprocessor, and with installed memory (RAM) of 32 GB.

Let $\mathbf y $ be the final transform obtained by a particular model for registering two given images T, R. We use the following three measures to quantify the performance of this model and use them for later comparisons:

(i):

Re_SSD (the relative Sum of Squared Differences) which is given by

$$\begin{aligned} \mathrm{Re}\_\mathrm{SSD} = \frac{\Vert T(\mathbf{y })-R\Vert ^2}{\Vert T-R\Vert ^2}; \end{aligned}$$

(49)

(ii):

$\min \det (J_{\mathbf{y }})$ and $\max \det (J_{\mathbf{y }})$ that are the minimum and the maximum of the Jacobian determinant of this transformation;

(iii):

Jaccard similarity coefficient (JSC) as defined by

$$\begin{aligned} \mathrm{JSC} = \frac{|DT_{r}\cap R_{r}|}{|DT_{r}\cup R_{r}|}, \end{aligned}$$

(50)

where $DT_{r}$ and $R_{r}$ represent, respectively, the segmented regions of interest (e.g. certain image feature such as an organ) in the deformed template (after registration) and the reference. Hence, JSC is the ratio of the intersection of $DT_{r}$ and $R_{r}$ to the union of $DT_{r}$ and $R_{r}$ [29]. JSC = 1 indicates a perfect alignment of the segmentation boundary and JSC = 0 indicates that the segmented regions have no overlap after registration. Before computing JSC, in the first three examples below, we have employed a segmentation algorithm to segment the main features in both T and R but for the 4th example, the segmentation was manually done for both T and R.
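The three measures can be computed with a few NumPy helpers. In this sketch the function names are illustrative, the segmentations are assumed to be given as boolean masks, and the Jacobian determinant is approximated at cell centres by averaged short differences of the nodal displacement, which is an assumption since the exact scheme used for $\det (J_{\mathbf{y }})$ is not specified here.

```python
import numpy as np

def re_ssd(T_deformed, T, R):
    """Relative SSD (49): ||T(y) - R||^2 / ||T - R||^2."""
    return np.sum((T_deformed - R) ** 2) / np.sum((T - R) ** 2)

def jaccard(mask_a, mask_b):
    """JSC (50) of two boolean masks; the union is assumed non-empty."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    return np.logical_and(a, b).sum() / np.logical_or(a, b).sum()

def det_jacobian(u1, u2, h):
    """Cell-centred det(J_y) for y = x + u from nodal u1, u2."""
    def dx1(u):
        return 0.5 * ((u[1:, :-1] - u[:-1, :-1]) + (u[1:, 1:] - u[:-1, 1:])) / h
    def dx2(u):
        return 0.5 * ((u[:-1, 1:] - u[:-1, :-1]) + (u[1:, 1:] - u[1:, :-1])) / h
    return (1.0 + dx1(u1)) * (1.0 + dx2(u2)) - dx2(u1) * dx1(u2)
```

The minimum and maximum of the array returned by `det_jacobian` then give the second measure; a negative minimum signals mesh folding.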

In practice, we scale the intensity value of T and R to [0, 255]. Here, we state a strategy for choosing the parameters. For our model (18), $\alpha $ should be related to energy ${\mathcal {D}}[\mathbf u _{0}]$ where $\mathbf u _{0}$ is the initial guess for the displacement, and $\beta $ should be related to $\alpha $. Empirically, we set $\alpha \in [\alpha _{1},\alpha _{2}]$, where $\alpha _{1}=0.5{\mathcal {D}}[\mathbf u _{0}]10^{-2}$ and $\alpha _{2}=2{\mathcal {D}}[\mathbf u _{0}]10^{-2}$. Respectively, for $\phi =\phi _{1}$, $\phi _{2}$, $\phi _{3}$, we set $\beta \in [3\alpha ,5\alpha ],\ [0.5\alpha ,2\alpha ]$ and $[\alpha ,5\alpha ]$. For simplicity, we denote by New 1, New 2 and New 3 the model (18) with $\phi _{1}$, $\phi _{2}$ and $\phi _{3}$, respectively.

It should be noted that a good registration result should produce a small Re_SSD, be diffeomorphic and yield a large JSC value for a region of interest.

5.1 Example 1—Improvement Over the Diffusion Model

In this example, we test a pair of real medical images, X-ray Hands of resolution $128 \times 128$. Figure 3a, b shows the template and the reference. We compare our model with the diffusion model and study the improvement over it. In implementation, for both models, we use a five-step multilevel strategy.

We conduct two experiments using different parameters:

i) Fixed parameters. Our first choice uses fixed parameters. For New 1–3, we set $\beta =7$, $\beta =1$ and $\beta =9$, respectively, and fix $\alpha =2$. To be fair, we also choose $\alpha =2$ for the diffusion model. In this case, Fig. 3 shows the deformed templates $T(\mathbf{y })$ from 4 models. From it, we can see that all four models can produce visually satisfactory results. To differentiate them, we have to check the quantitative measures from Table 1. We can notice that the transformation obtained by the diffusion model is non-diffeomorphic due to $\min \det (J_{\mathbf{y }}) <0$ (i.e. mesh folded, though visually pleasing and the Re_SSD is small). Figure 4 illustrates the transform $\mathbf{y }=\mathbf{x } + \mathbf{u }$ locally at its folding point. In contrast, our New 1–3 can generate diffeomorphic transformations.

ii) Optimized parameters. The second choice uses the fine-tuned parameters for the diffusion model. We tested $\alpha \in [1,500]$ and found the smallest $\alpha =430$ with which the diffusion model generates a diffeomorphic transformation. Then for our model, we also set $\alpha =430$ (which is not optimized in order to favour the former) and set $\beta =5$ for New 1–3 (to test the robustness of our model). Table 1 shows the detailed results for this second test. From it, we can see that the Re_SSD and JSC of our model are similar to the diffusion model. And the transformations obtained by New 1–3 are all diffeomorphic while the diffusion model is only diffeomorphic with the help of an optimized $\alpha $. This shows that our model possesses the robustness (in the sense of not requiring optimized $\alpha $) with the help of a positive $\beta $.

Hence, this example demonstrates that our New 1–3 are robust and can all help to get an accurate and diffeomorphic transformation.

Table 1 Test example 1—Comparison of the new model (New 1–3) with the diffusion model based on a fixed $\alpha $ and an optimized $\alpha $ for the latter


5.2 Example 2—Test of Large Deformation and Comparison of Models

If the underlying deformation is small, it is generally believed that most models can deliver diffeomorphic transformations. This belief is true if one keeps increasing $\alpha $, which in turn compromises the registration quality by increasing Re_SSD (as seen in the two tests of $\alpha $ in Example 1, where the larger $\alpha =430$ achieves diffeomorphism for the diffusion model with a worse Re_SSD value).

Therefore, to test the capability of a registration model, we need to take an example that requires large deformation. To this end, we consider Example 2—a classic synthetic example consisting of a Disc and a C shape of resolution $128 \times 128$ as shown in Fig. 5a, b. We compare our 3 models (New 1–3) with 5 other models: the hyperelastic model, LDDMM, DDemons, QCHR and the diffusion model in registration quality and performance. For this example, we use a five-step multilevel strategy for our model, the hyperelastic model and the diffusion model. For LDDMM and QCHR, we use a three-step multilevel strategy. We use a one-step multilevel strategy for DDemons as we find that multilevel does not improve the results.

Following our stated strategy for choosing the parameter for our model, we set $\beta =80, 120, 100$ for New 1–3, respectively, and fix $\alpha =70$. To be consistent, we also set $\alpha =70$ for the diffusion model. For the hyperelastic model, LDDMM and QCHR, we set, respectively, $\{\alpha _{l}=100, \alpha _{s}=0, \alpha _{v}=18\}$, $\alpha =400$ and $\{\alpha =0.1,\beta =1\}$ as used in the literature [7, 30, 37] for the same example. For the parameters of DDemons, we tried to optimize the parameters $\{\sigma _{s},\sigma _{g}\}$ in the domain $[0.5,5]\times [0.5,5]$ and took the optimal choice $\{\sigma _{s}=1.5,\sigma _{g}=3.5\}$.

We now present the comparative results. Figure 5c–j shows that, except for the diffusion model, all the other models produce acceptable registration results. In particular, our model and LDDMM are slightly better than the hyperelastic model, DDemons and QCHR. It is pleasing to see that the new model produces equally good results for this challenging example. From Table 2, we see that our New 1–3, hyperelastic model, LDDMM, DDemons and QCHR produce $\min \det (J_{\mathbf{y }})>0$, i.e. the transformations obtained by these five models are diffeomorphic but the diffusion model fails again with $\min \det (J_{\mathbf{y }})<0$.

Because New 1–3 are motivated by the QCHR model, we now discuss the results of these two types of models. On the one hand, according to Table 2, our model takes less time. This is because, as we have mentioned, the algorithm for QCHR needs to alternately solve two subproblems (including several linear systems) in each iteration, and its convergence cannot be guaranteed. In contrast, our model only needs to solve one linear system in each iteration. In addition, we employ the Gauss–Newton method, which can be superlinearly convergent under appropriate conditions. As we have also remarked, the QCHR algorithm can have convergence problems. This is illustrated in Fig. 6, where we plot the relative residual of our model (New 3) and the relative residual of QCHR. We observe that the relative residual of New 3 decreases to below $10^{-2}$, though not monotonically, whereas the relative residual of QCHR does not decrease and stays above 0.1.

On the other hand, we can compare the obtained solutions’ quality by checking the energy functionals. Using the same QCHR functional, the QCHR solution for Example 2 has the value 1042 while the transformation obtained by New 3 gives the value 147 which is much smaller. This indicates that the result obtained by the QCHR algorithm is not accurate. This is consistent with the fact that the Re_SSD and JSC of New 3 are also better than QCHR. Both discussions reach the same conclusion: the QCHR algorithm cannot obtain the minimizer of the original QCHR functional.

Table 2 Test example 2—Comparison of the new model (New 1–3) with 5 other models


5.3 Example 3—Comparison of Models for a Challenging Test

Here, we illustrate the fact that area preservation between images can become unnecessary and trying to enforce it (as in the hyperelastic model) can fail to register an image. We choose the particular template and reference images, as shown in Fig. 7a, b, having significantly different areas in their main features—here the area of ’Disc’ is much larger than ’C’. The resolution of the images is $512 \times 512$. We test the performance of New 1–3 and other models. In this example, we use a seven-step multilevel strategy for New 1–3, the hyperelastic model and the diffusion model. For LDDMM and QCHR, we use a five-step multilevel strategy. We use a single level for DDemons (since multilevels do not help).

In choosing the parameters for all the models in this example, we first follow our strategy and set $\beta =250, 50, 100$ for New 1–3, respectively, with $\alpha =50$ fixed. To be consistent, we also set $\alpha =50$ for the diffusion model. For the hyperelastic model, we also set $\alpha _{l} = 50$ because it contains the diffusion term, and take $\alpha _{s}=0$. For the third parameter $\alpha _{v}$ in the hyperelastic model, we test values in the range [55, 150] and choose the optimal value $\alpha _{v}=75$. For LDDMM and QCHR, we use the default values $\alpha =400$ and $\{\alpha =0.1,\beta =1\}$, respectively, as in the previous example. For DDemons, we test the parameters $\{\sigma _{s},\sigma _{g}\}$ in the domain $[0.5,5]\times [0.5,5]$ and choose the optimal pair $\{\sigma _{s}=2,\sigma _{g}=5\}$. Hence we would expect the hyperelastic model and DDemons to perform well.

The test results for Example 3 are presented in Table 3 and Fig. 7. Although all models except for the diffusion model produce diffeomorphic transformations, we can see visually that only 3 models (our New 2–3 and LDDMM) produce acceptable results, also confirmed by the table:

The badly deformed template generated by our New 1 shows that the model lacks robustness;
The hyperelastic model, though producing a diffeomorphic transform, fails (despite using an optimized parameter) because this model, which includes the regularization term $(\det (J_\mathbf{y })-1)^{4}/(\det (J_\mathbf{y }))^{2}$, tends to preserve area. If we do not optimize the parameters of the hyperelastic model, our tests show that its results are even worse.
In the previous example, we have pointed out that QCHR needs more computing time and, from Table 3, we see that the time for QCHR is about 20 times as long as our New 3;
DDemons is trapped in a local minimum and its CPU time is also excessive ($>5000$ s). We also tried applying a multilevel strategy to DDemons, but for this example the result was not satisfactory. The Re_SSD, JSC and CPU time of our New 3 are all slightly better than those of the second best model, LDDMM;
Both Tables 2 and 3 show that the diffusion model produces solutions having a negative Jacobian (folding), which might be viewed as non-physical; this model is included only for reference.

Hence, our model has advantages over other models for large deformation registrations that do not require area preservation.
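
The folding criterion used in the comparison above (a non-positive Jacobian determinant somewhere in the domain) can be checked numerically. The following is a minimal sketch, assuming the displacement $\mathbf u$ is sampled on a uniform grid with spacing `h` and that derivatives are approximated with `numpy.gradient` rather than with the triangle-based discretization of this paper; the function names `jacobian_determinant` and `has_folding` are hypothetical.

```python
import numpy as np


def jacobian_determinant(u1, u2, h):
    """Pointwise det(J_y) for y = x + u on a uniform grid of spacing h.

    u1, u2 are 2D arrays indexed [i, j], axis 0 along x1 and axis 1 along x2.
    Derivatives use numpy.gradient (an assumption, not the paper's scheme).
    """
    du1_dx1, du1_dx2 = np.gradient(u1, h)
    du2_dx1, du2_dx2 = np.gradient(u2, h)
    # J_y = I + grad(u), so det = (1 + u1_x1)(1 + u2_x2) - u1_x2 * u2_x1
    return (1.0 + du1_dx1) * (1.0 + du2_dx2) - du1_dx2 * du2_dx1


def has_folding(u1, u2, h):
    """True if the determinant is non-positive at some grid point."""
    return bool(jacobian_determinant(u1, u2, h).min() <= 0.0)
```

A zero displacement gives a determinant of one everywhere, while a displacement such as $u_{1}=-2x_{1}$ reverses orientation and is flagged as folding.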

Table 3 Example 3—Comparison of the new model (New 1–3) with 5 other models

We now give two remarks on comparing New 3 (or New 2) with QCHR. As remarked, QCHR regularizes only the Beltrami coefficient, and the landmarks supplied to QCHR can severely affect its results, whereas our model regularizes the deformation rather than the Beltrami coefficient. Both points are tested further below.

(i). On the first point, regularizing only the Beltrami coefficient leads to a smooth Beltrami coefficient. To compare the smoothness of the solutions obtained by New 3 and QCHR, we compute three smoothness measures $\Vert \nabla \mathbf u \Vert _{L^{2}}$, $\Vert \mu (\mathbf y )\Vert _{L^{2}}$, $\Vert \nabla \mu (\mathbf y )\Vert _{L^{2}}$ and present them in Table 4. The table indicates that, for both Examples 2 and 3, QCHR does generate a smoother Beltrami coefficient than our model New 3, but not a smoother deformation field. Hence, a model that regularizes only the Beltrami coefficient rather than the deformation is not sufficient to produce an accurate deformed template.
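
The three smoothness measures can be approximated on a grid as follows. This is a sketch under stated assumptions: the transformation is $\mathbf y = \mathbf x + \mathbf u$ with $\mathbf u$ sampled on a uniform grid of spacing `h`, derivatives come from `numpy.gradient`, the Beltrami coefficient is $\mu = f_{\bar{z}}/f_{z}$ for $f = y_{1} + \mathrm{i}\,y_{2}$, and the $L^{2}$ norms are plain Riemann sums. The function names are hypothetical and the values in Table 4 were not produced with this code.

```python
import numpy as np


def beltrami_coefficient(u1, u2, h):
    """Complex Beltrami coefficient mu = f_zbar / f_z of y = x + u."""
    du1_dx1, du1_dx2 = np.gradient(u1, h)
    du2_dx1, du2_dx2 = np.gradient(u2, h)
    y1_x1, y1_x2 = 1.0 + du1_dx1, du1_dx2
    y2_x1, y2_x2 = du2_dx1, 1.0 + du2_dx2
    f_zbar = 0.5 * ((y1_x1 - y2_x2) + 1j * (y2_x1 + y1_x2))
    f_z = 0.5 * ((y1_x1 + y2_x2) + 1j * (y2_x1 - y1_x2))
    return f_zbar / f_z


def l2_norm(components, h):
    """Riemann-sum approximation of the L2 norm of a list of grid fields."""
    stacked = np.stack(components)
    return float(np.sqrt(h * h * np.sum(np.abs(stacked) ** 2)))


def smoothness_measures(u1, u2, h):
    """Return approximations of ||grad u||, ||mu(y)|| and ||grad mu(y)||."""
    mu = beltrami_coefficient(u1, u2, h)
    grad_u = list(np.gradient(u1, h)) + list(np.gradient(u2, h))
    grad_mu = list(np.gradient(mu, h))
    return {
        "grad_u": l2_norm(grad_u, h),
        "mu": l2_norm([mu], h),
        "grad_mu": l2_norm(grad_mu, h),
    }
```

For a pure stretch $u_{1} = 0.5x_{1}$, $u_{2}=0$ the coefficient is the constant $0.2$, so the third measure vanishes while the first two do not.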

Table 4 Comparison of smoothness measures for solutions obtained by New 3 and QCHR. The Beltrami coefficient $\mu $ obtained by QCHR is smoother than that of New 3, and the displacement $\mathbf u $ obtained by New 3 is smoother than that of QCHR

(ii). On the second point, we now illustrate the importance of landmarks for QCHR, although for other problems the model can yield good results without any landmarks. Figure 8 shows three landmark sets of increasing size for Examples 2–3. We observe that more landmarks lead to better results in terms of JSC values.

As a final comparison of New 3 with LDDMM and QCHR, Fig. 9 plots the magnitudes of the Jacobian determinants of their transformations. It can be seen that New 3 and LDDMM give a similar pattern but both are different from QCHR.

5.4 Example 4—Comparison of the New Model with Other Models

In the final test, we use a pair of anonymized CT images of resolution $512 \times 512$ from the Royal Liverpool University Hospital. Figure 10a, b shows the template and the reference. The template was taken in September 2016 and the reference was taken in May 2016. We want to compare the changes over these 4 months in our regions of interest, namely the abdominal aortic aneurysm with stents inserted inside (with cross sections shown as two white ‘circles’ in Fig. 10a, b). In addition, the region of interest is used to compute JSC. The small white region at the top of the images helps us to identify the correct slice to compare.

Here, following the previous example, we use the same multilevel strategy: a seven-step multilevel strategy for our model, the hyperelastic model and the diffusion model, a five-step multilevel strategy for LDDMM and QCHR and a one-step multilevel strategy for DDemons.

Following our strategy for choosing the parameters of our model, we set $\alpha =20$ and $\beta =100, 40, 75$ for New 1–3, respectively. For the diffusion model and LDDMM, we test $\alpha $ in [100, 2000] and set the optimal values 1300 and 500, respectively. For the hyperelastic model, we set $\{\alpha _{l}=20, \alpha _{s}=0, \alpha _{v}=50\}$. We use the default value $\{\alpha =0.1,\beta =1\}$ for QCHR. For DDemons, we test the parameters $\{\sigma _{s},\sigma _{g}\}$ in the domain $[0.5,5]\times [0.5,5]$ and choose $\{\sigma _{s}=4,\sigma _{g}=4.5\}$.

With the optimized parameters, all the models in this example generate diffeomorphic transformations, as seen from Table 5. DDemons and QCHR are not as good as the other models for this example because they give worse Re_SSD and JSC. A worse JSC means that the regions of interest obtained by these two methods differ significantly from the reference (Fig. 10h, i). The diffusion model obtains a good JSC; however, its deformed template is overall somewhat far from the reference (Re_SSD = 10.02%). The other 2 models (hyperelastic, LDDMM) generate good Re_SSD and JSC. However, our models produce the lowest Re_SSD and the best JSC. Hence, for this example of real images, our model is competitive with the state-of-the-art methods. Though there is broad agreement between Re_SSD and JSC, one has to combine registration with segmentation models to ensure strict agreement.
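
The two quality measures quoted above can be computed along the following lines. This is a sketch that assumes the usual definitions, which this section does not restate: Re_SSD as the ratio of the sum of squared differences after and before registration, and JSC as the Jaccard ratio (intersection over union) of two binary region masks. The function names and the way the masks are obtained are hypothetical.

```python
import numpy as np


def relative_ssd(deformed, template, reference):
    """Assumed Re_SSD: SSD(deformed, reference) / SSD(template, reference)."""
    after = np.sum((np.asarray(deformed, float) - reference) ** 2)
    before = np.sum((np.asarray(template, float) - reference) ** 2)
    return float(after / before)


def jaccard(mask_a, mask_b):
    """Assumed JSC: |A intersect B| / |A union B| for two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # two empty regions are treated as identical
    return float(np.logical_and(a, b).sum() / union)
```

A perfect registration gives a relative SSD of 0 and a JSC of 1; an unchanged template gives a relative SSD of 1.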

Table 5 Example 4—Comparison of New 1–3 with 5 other models

Remark

According to the above four examples, our New 1 is not robust, while New 2–3 both generate accurate and diffeomorphic transformations. We recommend New 3 as the first choice because it requires the least computing time and gives the best quality, and New 2 as the second choice.

We also tested these four examples with the Dirichlet boundary condition. Similar results are obtained for Examples 1 and 4. However, for Examples 2 and 3, the transformations are different since the boundary is better modelled by the Neumann condition.

6 Conclusions

Controlling mesh folding is a key issue in image registration models to ensure local invertibility. Many existing models either impose no further control on the underlying transformation beyond smoothness (so potentially generating unrealistic or non-physical transformations) or impose a direct control (often strongly biased, e.g. towards area or volume preservation) on some explicit function of the measure $\det (J_{\mathbf{y }})$. This paper introduces a novel, unbiased and robust regularizer, reformulated from the Beltrami coefficient framework, to ensure a diffeomorphic transformation. Moreover, we find that a direct approach (our New 1) based on this Beltrami reformulation provides an alternative but less competitive method, whereas further refinements of this new regularizer (especially our New 3) can give rise to more robust models than the existing methods. We highly recommend our model New 3, i.e. (18) with $\phi =\phi _3$.

In designing optimization methods for solving the resulting highly nonlinear variational model, we give a suitable approximation of the exact Hessian matrix, which is necessary to derive a convergent iterative method. Our test results show that the new model (New 1–3, especially New 3) is competitive with the state-of-the-art models. The main advantage lies in robustness. Our future work will include extensions to 3D problems, multi-modality models and the development of faster iterative solvers.

References

Ahlfors, L.V., Earle, C.J.: Lectures on Quasiconformal Mappings. Van Nostrand, Princeton (1966)
Barrett, R., Berry, M.W., Chan, T.F., Demmel, J., Donato, J., Dongarra, J., Eijkhout, V., Pozo, R., Romine, C., Van der Vorst, H.: Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, vol. 43. SIAM, Philadelphia (1994)
Beg, M.F., Miller, M.I., Trouvé, A., Younes, L.: Computing large deformation metric mappings via geodesic flows of diffeomorphisms. Int. J. Comput. Vis. 61, 139–157 (2005)
Bers, L.: Quasiconformal mappings, with applications to differential equations, function theory and topology. Bull. Am. Math. Soc. 83, 1083–1100 (1977)
Broit, C.: Optimal Registration of Deformed Images, Ph.D. thesis, University of Pennsylvania, USA, (1981)
Brown, L.G.: A survey of image registration techniques. ACM Comput. Surv. 24, 325–376 (1992)
Burger, M., Modersitzki, J., Ruthotto, L.: A hyperelastic regularization energy for image registration. SIAM J. Sci. Comput. 35, 132–148 (2013)
Chen, Y., Ye, X.: Inverse consistent deformable image registration. in The Legacy of Alladi Ramakrishnan in the Mathematical Sciences, Springer, pp. 419–440 (2010)
Christensen, G. E.: Deformable shape models for anatomy, Ph.D. thesis, Washington University Saint Louis, USA, (1994)
Christensen, G.E., Rabbitt, R.D., Miller, M.I.: Deformable templates using large deformation kinematics. IEEE Trans. Image Process. 5, 1435–1447 (1996)
Chumchob, N., Chen, K.: A robust affine image registration method. Int. J. Numer. Anal. Model. 6, 311–334 (2009)
Chumchob, N., Chen, K.: A variational approach for discontinuity-preserving image registration, East-West Journal of Mathematics, pp. 266–282 (2010)
Chumchob, N., Chen, K.: A robust multigrid approach for variational image registration models. J. Comput. Appl. Math. 236, 653–674 (2011)
Chumchob, N., Chen, K., Brito, C.: A fourth-order variational image registration model and its fast multigrid algorithm. Multiscale Model. Simul. 9, 89–128 (2011)
Dupuis, P., Grenander, U., Miller, M. I.: Variational problems on flows of diffeomorphisms for image matching, Quarterly of applied mathematics, pp. 587–600 (1998)
Fischer, B., Modersitzki, J.: Fast diffusion registration. Contemp. Math. 313, 117–128 (2002)
Fischer, B., Modersitzki, J.: Curvature based image registration. J. Math. Imaging Vis. 18, 81–85 (2003)
Fischer, B., Modersitzki, J.: A unified approach to fast image registration and a new curvature based registration technique. Linear Algebra Appl. 380, 107–124 (2004)
Frohn-Schauf, C., Henn, S., Witsch, K.: Multigrid based total variation image registration. Comput. Vis. Sci. 11, 101–113 (2008)
Gardiner, F.P., Lakic, N.: Quasiconformal Teichmüller Theory, vol. 76. American Mathematical Society, Providence (2000)
Goshtasby, A.A.: 2-D and 3-D Image Registration: for Medical, Remote Sensing, and Industrial Applications. Wiley, Hoboken (2005)
Goshtasby, A.A.: Image Registration: Principles Tools and Methods. Springer, Berlin (2012)
Haber, E., Modersitzki, J.: Numerical methods for volume preserving image registration. Inverse Probl. 20, 1621 (2004)
Haber, E., Modersitzki, J.: Intensity gradient based registration and fusion of multi-modal images. In: Medical image computing and computer-assisted intervention-MICCAI. Springer, Berlin. pp. 726–733 (2006)
Haber, E., Modersitzki, J.: Image registration with guaranteed displacement regularity. Int. J. Comput. Vis. 71, 361–372 (2007)
Hill, D.L.G., Batchelor, P.G., Holden, M., Hawkes, D.J.: Medical image registration. Phys. Med. Biol. 46, R1–45 (2001)
Ibrahim, M., Chen, K., Brito-Loeza, C.: A novel variational model for image registration using Gaussian curvature. Geom. Imaging Comput. 1, 417–446 (2014)
Kelley, C.T.: Iterative Methods for Optimization. SIAM, Philadelphia (1999)
Klein, A., Andersson, J., Ardekani, B.A., Ashburner, J., Avants, B., Chiang, M.-C., Christensen, G.E., Collins, D.L., Gee, J., Hellier, P., et al.: Evaluation of 14 nonlinear deformation algorithms applied to human brain mri registration. Neuroimage 46, 786–802 (2009)
Lam, K.C., Lui, L.M.: Landmark-and intensity-based registration with large deformations via quasi-conformal maps. SIAM J. Imaging Sci. 7, 2364–2392 (2014)
Lehto, O., Virtanen, K.I.: Quasiconformal Mappings in the Plane, vol. 126. Springer, New York (1973)
Lester, H., Arridge, S.R.: A survey of hierarchical non-linear medical image registration. Pattern Recognit. 32, 129–149 (1999)
Liu, M. Z.: Total Bregman divergence, a robust divergence measure, and its applications. Ph.D. thesis, University of Florida, USA. ISBN: 978-1-267-37783-8, (2011)
Lui, L.M., Lam, K.C., Wong, T.W., Gu, X.: Texture map and video compression using beltrami representation. SIAM J. Imaging Sci. 6, 1880–1902 (2013)
Maes, F., Collignon, A., Vandermeulen, D., Marchal, G., Suetens, P.: Multimodality image registration by maximization of mutual information. IEEE Trans. Med. Imaging 16, 187–198 (1997)
Maintz, J.A., Viergever, M.A.: A survey of medical image registration. Med. Image Anal. 2, 1–36 (1998)
Mang, A., Ruthotto, L.: A Lagrangian Gauss–Newton–Krylov solver for mass- and intensity-preserving diffeomorphic image registration. SIAM J. Sci. Comput. 39, B860–B885 (2017)
Modersitzki, J.: Numerical Methods For Image Registration. Oxford University Press, Oxford (2004)
Modersitzki, J.: FAIR: Flexible Algorithms for Image Registration. SIAM, Philadelphia (2009)
Mohamed, A., Zacharaki, E.I., Shen, D., Davatzikos, C.: Deformable registration of brain tumor images via a statistical model of tumor-induced deformation. Med. Image Anal. 10, 752–763 (2006)
Musse, O., Heitz, F., Armspach, J.P.: Topology preserving deformable image matching using constrained hierarchical parametric models. IEEE Trans. Image Process. 10, 1081–1093 (2001)
Nocedal, J., Wright, S.: Numerical Optimization. Springer, Berlin (2006)
Paige, C.C., Saunders, M.A.: Solution of sparse indefinite systems of linear equations. SIAM J. Numer. Anal. 12, 617–629 (1975)
Rohlfing, T., Maurer Jr., C.R., Bluemke, D.A., Jacobs, M.A.: Volume-preserving nonrigid registration of mr breast images using free-form deformation with an incompressibility constraint. IEEE Trans. Med. Imaging 22, 730–741 (2003)
Rueckert, D., Sonoda, L.I., Hayes, C., Hill, D.L., Leach, M.O., Hawkes, D.J.: Nonrigid registration using free-form deformations: application to breast MR images. IEEE Trans. Med. Imaging 18, 712–721 (1999)
Sdika, M.: A fast nonrigid image registration with constraints on the Jacobian using large scale constrained optimization. IEEE Trans. Med. Imaging 27, 271–281 (2008)
Sotiras, A., Davatzikos, C., Paragios, N.: Deformable medical image registration: a survey. IEEE Trans. Med. Imaging 32, 1153–1190 (2013)
Sun, W., Yuan, Y.-X.: Optimization Theory and Methods: Nonlinear Programming, vol. 1. Springer, Berlin (2006)
Thirion, J.-P.: Image matching as a diffusion process: an analogy with Maxwell’s demons. Med. Image Anal. 2, 243–260 (1998)
Trouvé, A.: Diffeomorphisms groups and pattern matching in image analysis. Int. J. Comput. Vis. 28, 213–221 (1998)
Vercauteren, T., Pennec, X., Perchant, A., Ayache, N.: Diffeomorphic demons: efficient non-parametric image registration. NeuroImage 45, S61–S72 (2009)
Vogel, C.R.: Computational Methods for Inverse Problems. SIAM, Philadelphia (2002)
Weickert, J., Romeny, B., Viergever, M.: Efficient and reliable schemes for nonlinear diffusion filtering. IEEE Trans. Image Process. 7, 398–410 (1998)
Yang, X., Pei, J. H., Shi, J. L.: Inverse consistent non-rigid image registration based on robust point set matching, BioMedical Engineering OnLine, 13 (2014)
Yanovsky, I., Thompson, P., Osher, S., Leow, A.: Large deformation unbiased diffeomorphic nonlinear image registration: theory and implementation. In: IEEE conference CVPR’ 2007 (see also UCLA CAM Report 06-71), 71 (2007)
Zhang, J., Chen, K.: Variational image registration by a total fractional-order variation model. J. Comput. Phys. 293, 442–461 (2015)
Zitova, B., Flusser, J.: Image registration methods: a survey. Image Vis. Comput. 21, 977–1000 (2003)

Author information

Authors and Affiliations

EPSRC Liverpool Centre for Mathematics in Healthcare, Centre for Mathematical Imaging Techniques and Department of Mathematical Sciences, The University of Liverpool, Peach Street, Liverpool, L69 7ZL, UK
Daoping Zhang & Ke Chen

Corresponding author

Correspondence to Ke Chen.

Additional information

This work was partly funded by UK EPSRC Grants EP/K036939/1 and EP/N014499/1. DPZ also acknowledges the studentship support of China Scholarship Council.

Appendices

Appendix A: Computation of Matrices A and G in §4.1.2

Set $B = I_{2}\otimes I_{n+1}\otimes \partial _{n}^{1,h} \in {\mathbb R}^{2n(n+1)\times 2(n+1)^{2}}$, $C = I_{2}\otimes \partial _{n}^{1,h}\otimes I_{n+1} \in {\mathbb R}^{2n(n+1)\times 2(n+1)^{2}}$,

$$\begin{aligned} \partial _{n}^{1,h}= & {} \frac{1}{h^{2}}\begin{bmatrix} -1 & 1 & & & \\ & -1 & 1 & & \\ & & \ddots & \ddots & \\ & & & -1 & 1 \end{bmatrix}\in {{\mathbb {R}}}^{n\times (n+1)},\\ A= & {} \begin{bmatrix} B\\ C \end{bmatrix}\in {{\mathbb {R}}}^{4n(n+1)\times 2(n+1)^{2}}, \end{aligned}$$

where $\otimes $ denotes a Kronecker product. To represent the difference between interior and boundary pixels, we need to introduce a diagonal matrix

$$\begin{aligned} G = \begin{bmatrix} G_{1}&0&0&0\\ 0&G_{1}&0&0\\ 0&0&G_{2}&0\\ 0&0&0&G_{2} \end{bmatrix}\in {{\mathbb {R}}}^{4n(n+1)\times 4n(n+1)}, \end{aligned}$$

(51)

where $G_{1}$ and $G_{2}$ are diagonal matrices. For $G_{1}$, the entry $G_{1_{i+1+jn,i+1+jn}}$ equals 1 if $0\le i \le n-1, 1\le j \le n-1$, and $\frac{1}{2}$ if $0\le i \le n-1, j=0,n$. Similarly, for $G_{2}$, the entry $G_{2_{i+1+j(n+1),i+1+j(n+1)}}$ equals 1 if $1\le i \le n-1, 0\le j \le n-1$, and $\frac{1}{2}$ if $i=0,n, 0\le j \le n-1$.
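
The construction of $A$ and $G$ can be sketched with dense Kronecker products. The code below is illustrative only: the function names are hypothetical, the difference matrix keeps the scaling factor exactly as printed in this appendix, and the diagonal blocks of $G$ are ordered to follow the rows of $A=[B;C]$ (the two blocks belonging to $B$ first), which is an assumption about the intended pairing of weights and differences. A practical implementation would use sparse matrices.

```python
import numpy as np


def forward_difference(n, h):
    """n x (n+1) matrix with rows (-1, 1), scaled as printed in Appendix A."""
    d = np.zeros((n, n + 1))
    rows = np.arange(n)
    d[rows, rows] = -1.0
    d[rows, rows + 1] = 1.0
    return d / h**2


def build_A_and_G(n, h):
    """Dense A = [B; C] and diagonal weight matrix G for an (n+1) x (n+1) grid."""
    d = forward_difference(n, h)
    eye2, eye_n1 = np.eye(2), np.eye(n + 1)
    B = np.kron(eye2, np.kron(eye_n1, d))  # differences along the fast index
    C = np.kron(eye2, np.kron(d, eye_n1))  # differences along the slow index
    A = np.vstack([B, C])

    # Half weights on boundary lines, full weights in the interior.
    w = np.ones(n + 1)
    w[0] = w[-1] = 0.5
    g1 = np.kron(w, np.ones(n))  # diagonal of G1, index i + j*n
    g2 = np.kron(np.ones(n), w)  # diagonal of G2, index i + j*(n+1)
    # Block order follows the rows of A (assumption, see lead-in).
    G = np.diag(np.concatenate([g1, g1, g2, g2]))
    return A, G
```

For $n=2$ this gives $A\in {\mathbb R}^{24\times 18}$ and $G\in {\mathbb R}^{24\times 24}$, matching the dimensions stated above, and $A$ annihilates constant displacement fields.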

Appendix B: Computation of the Vector $\vec {\mathbf{r }}(U)$ in § 4.1.3

We demonstrate how to build the linear interpolation $\mathbf L $ in $\triangle V_{1}V_{2}V_{5}$, in Fig. 2.

First of all, denote the 3 vertices of this triangle by $V_{1} = \mathbf{x }^{1,1}$, $V_{2} = \mathbf{x }^{2,1}$ and $V_{5} = \mathbf x ^{1.5,1.5}$. Set $\mathbf{L }(V_{1}) = (u_{1}^{1,1},u_{2}^{1,1})$, $\mathbf{L }(V_{2}) = (u_{1}^{2,1},u_{2}^{2,1})$ at the vertex pixels, and $\mathbf{L }(V_{5}) = (u_{1}^{1.5,1.5},u_{2}^{1.5,1.5})$ at the cell centre (approximated values). Here the linear approximations are $\mathbf{L }(x_{1},x_{2}) = (a_{1}x_{1}+a_{2}x_{2}+a_{3},a_{4}x_{1}+a_{5}x_{2}+a_{6})$.

After substituting $V_{1},V_{2}$ and $V_{5}$ into $\mathbf{L }$, we get

$$\begin{aligned} \begin{pmatrix} x_{1}^{1}-x_{1}^{1.5} &{} x_{2}^{1}-x_{2}^{1.5} \\ x_{1}^{2}-x_{1}^{1.5} &{} x_{2}^{1}-x_{2}^{1.5} \end{pmatrix} \begin{pmatrix} a_{1} \\ a_{2} \end{pmatrix} = \begin{pmatrix} u_{1}^{1,1}-u_{1}^{1.5,1.5}\\ u_{1}^{2,1}-u_{1}^{1.5,1.5} \end{pmatrix}, \end{aligned}$$

$$\begin{aligned} \begin{pmatrix} x_{1}^{1}-x_{1}^{1.5} &{} x_{2}^{1}-x_{2}^{1.5} \\ x_{1}^{2}-x_{1}^{1.5} &{} x_{2}^{1}-x_{2}^{1.5} \end{pmatrix} \begin{pmatrix} a_{4} \\ a_{5} \end{pmatrix} = \begin{pmatrix} u_{2}^{1,1}-u_{2}^{1.5,1.5}\\ u_{2}^{2,1}-u_{2}^{1.5,1.5} \end{pmatrix}. \end{aligned}$$

Then

$$\begin{aligned} \begin{pmatrix} a_{1} \\ a_{2}\end{pmatrix} = \frac{1}{\det } \begin{pmatrix} x_{2}^{1}-x_{2}^{1.5} &{} -x_{2}^{1}+x_{2}^{1.5} \\ -x_{1}^{2}+x_{1}^{1.5} &{} x_{1}^{1}-x_{1}^{1.5} \end{pmatrix} \begin{pmatrix} u_{1}^{1,1}-u_{1}^{1.5,1.5}\\ u_{1}^{2,1}-u_{1}^{1.5,1.5} \end{pmatrix},\nonumber \\ \end{aligned}$$

(52)

$$\begin{aligned} \begin{pmatrix} a_{4} \\ a_{5}\end{pmatrix} = \frac{1}{\det } \begin{pmatrix} x_{2}^{1}-x_{2}^{1.5} &{} -x_{2}^{1}+x_{2}^{1.5} \\ -x_{1}^{2}+x_{1}^{1.5} &{} x_{1}^{1}-x_{1}^{1.5} \end{pmatrix} \begin{pmatrix} u_{2}^{1,1}-u_{2}^{1.5,1.5}\\ u_{2}^{2,1}-u_{2}^{1.5,1.5} \end{pmatrix},\nonumber \\ \end{aligned}$$

(53)

where $ \det = \begin{vmatrix} x_{1}^{1}-x_{1}^{1.5}&x_{2}^{1}-x_{2}^{1.5} \\ x_{1}^{2}-x_{1}^{1.5}&x_{2}^{1}-x_{2}^{1.5} \end{vmatrix}$.
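
As a consistency check, the formulas simplify considerably on a uniform grid. Assuming a mesh size $h$ (a symbol not fixed in this appendix), so that $x_{1}^{2}-x_{1}^{1}=h$ and $x_{l}^{1.5}=x_{l}^{1}+h/2$ for $l=1,2$, the entries of the coefficient matrix are $\mp h/2$ and (52) reduces to

$$\begin{aligned} \det = \frac{h^{2}}{2},\qquad a_{1} = \frac{u_{1}^{2,1}-u_{1}^{1,1}}{h},\qquad a_{2} = \frac{2u_{1}^{1.5,1.5}-u_{1}^{1,1}-u_{1}^{2,1}}{h}, \end{aligned}$$

with the same expressions for $a_{4}$ and $a_{5}$ from (53) after replacing $u_{1}$ by $u_{2}$. In this triangle, $a_{1}$ is the forward difference along the bottom edge, and $a_{2}$ compares the cell-centre value with the mean of the two bottom vertex values.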

According to (52) and (53), we can formulate two matrices $D1\in {{\mathbb {R}}}^{4n^{2}\times (n+1)^{2}}$ and $D2 \in {{\mathbb {R}}}^{4n^{2}\times (n+1)^{2}}$ such that

$\mathbf{a }_{1}-\mathbf{a }_{5} = [D1 ,-D2]U = A_{1}U$, $\mathbf{a }_{4}+\mathbf{a }_{2} = [D2,D1]U=A_{2}U$, $\mathbf{a }_{1}+\mathbf{a }_{5} = [D1,D2]U=A_{3}U$ and $\mathbf{a }_{4}-\mathbf{a }_{2} = [-D2,D1]U=A_{4}U$, each in ${{\mathbb {R}}}^{4n^{2}\times 1}$. Here, $\mathbf a _{\theta } = (a_{\theta }^{1},\ldots ,a_{\theta }^{4n^{2}})^{T},\theta = 1,2,4,5$, where $a_{\theta }^{l} = a_{\theta }^{i,j,k}$ and $l = (k-1)n^{2}+(j-1)n+i$.

Next using the Hadamard product $\odot $, we get a compact form for

$$\begin{aligned} \left\{ \begin{array}{ll} \vec {\mathbf{r }}^{1}(U) &{} = A_{1}U\odot A_{1}U+A_{2}U\odot A_{2}U,\\ \vec {\mathbf{r }}^{2}(U) &{} = 1/((A_{3}U+2)\odot (A_{3}U+2)+A_{4}U\odot A_{4}U), \\ \vec {\mathbf{r }}(U) &{} = \vec {\mathbf{r }}^{1}\odot \vec {\mathbf{r }}^{2} \ \ \in {{\mathbb {R}}}^{4n^{2}\times 1}. \end{array}\right. \nonumber \\ \end{aligned}$$

(54)
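
A sketch of how $D1$, $D2$ and the vector $\vec {\mathbf{r }}(U)$ of (54) might be assembled is given below. It rests on assumptions that are not spelled out in this appendix: the domain is the unit square so that $h=1/n$, each cell is split into four triangles ordered bottom, right, top, left, the cell-centre value is the average of the four vertex values, and nodes are numbered with the first index running fastest. The names `build_D1_D2` and `beltrami_residual` are hypothetical.

```python
import numpy as np

# Stencils on the cell vertices (lower-left, lower-right, upper-left,
# upper-right), in units of 1/h, for the triangles bottom, right, top, left.
X1_STENCILS = np.array([[-1.0, 1.0, 0.0, 0.0],
                        [-0.5, 0.5, -0.5, 0.5],
                        [0.0, 0.0, -1.0, 1.0],
                        [-0.5, 0.5, -0.5, 0.5]])
X2_STENCILS = np.array([[-0.5, -0.5, 0.5, 0.5],
                        [0.0, -1.0, 0.0, 1.0],
                        [-0.5, -0.5, 0.5, 0.5],
                        [-1.0, 0.0, 1.0, 0.0]])


def build_D1_D2(n):
    """Per-triangle x1- and x2-derivative matrices of size 4n^2 x (n+1)^2."""
    h = 1.0 / n  # assumes the unit square as domain
    D1 = np.zeros((4 * n * n, (n + 1) ** 2))
    D2 = np.zeros_like(D1)
    for j in range(n):
        for i in range(n):
            ll = i + j * (n + 1)  # column of the lower-left vertex
            cols = [ll, ll + 1, ll + n + 1, ll + n + 2]
            for k in range(4):
                row = k * n * n + j * n + i  # zero-based version of index l
                D1[row, cols] = X1_STENCILS[k] / h
                D2[row, cols] = X2_STENCILS[k] / h
    return D1, D2


def beltrami_residual(U, n):
    """Vector r(U) of (54): squared modulus of the Beltrami coefficient."""
    D1, D2 = build_D1_D2(n)
    N = (n + 1) ** 2
    u1, u2 = U[:N], U[N:]
    a1, a2 = D1 @ u1, D2 @ u1
    a4, a5 = D1 @ u2, D2 @ u2
    r1 = (a1 - a5) ** 2 + (a4 + a2) ** 2
    r2 = 1.0 / ((a1 + a5 + 2.0) ** 2 + (a4 - a2) ** 2)
    return r1 * r2
```

With $U=0$ every entry of $\vec {\mathbf{r }}$ vanishes, as expected for the identity map, and for the stretch $u_{1}=0.5x_{1}$, $u_{2}=0$ every entry equals $0.04$.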

Appendix C: Computing the Gradient and Approximated Hessian of the term (37)

Here, as an example, we set $n=2$ and $\phi =\phi _{1}$ to compute the gradient and approximated Hessian of the discretized Beltrami term (37).

Because of $n=2$, we have

$$\begin{aligned}&U = (u_{1}^{0,0},\ldots ,u_{1}^{2,0},\ldots ,u_{1}^{0,2},\ldots ,u_{1}^{2,2}, u_{2}^{0,0},\ldots ,u_{2}^{2,0},\\&\qquad \quad \ldots ,u_{2}^{0,2},\ldots ,u_{2}^{2,2})^{T} \in {{\mathbb {R}}}^{18\times 1}. \end{aligned}$$

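From (52)–(53), we can formulate the two matrices $D1, D2\in \mathbb {R}^{16\times 9}$, respectively, as

$$\begin{aligned} D1 = \begin{bmatrix} -2&2&0&0&0&0&0&0&0\\ 0&-2&2&0&0&0&0&0&0\\ 0&0&0&-2&2&0&0&0&0\\ 0&0&0&0&-2&2&0&0&0\\ -1&1&0&-1&1&0&0&0&0\\ 0&-1&1&0&-1&1&0&0&0\\ 0&0&0&-1&1&0&-1&1&0\\ 0&0&0&0&-1&1&0&-1&1\\ 0&0&0&-2&2&0&0&0&0\\ 0&0&0&0&-2&2&0&0&0\\ 0&0&0&0&0&0&-2&2&0\\ 0&0&0&0&0&0&0&-2&2\\ -1&1&0&-1&1&0&0&0&0\\ 0&-1&1&0&-1&1&0&0&0\\ 0&0&0&-1&1&0&-1&1&0\\ 0&0&0&0&-1&1&0&-1&1 \end{bmatrix}, \quad D2 = \begin{bmatrix} -1&-1&0&1&1&0&0&0&0\\ 0&-1&-1&0&1&1&0&0&0\\ 0&0&0&-1&-1&0&1&1&0\\ 0&0&0&0&-1&-1&0&1&1\\ 0&-2&0&0&2&0&0&0&0\\ 0&0&-2&0&0&2&0&0&0\\ 0&0&0&0&-2&0&0&2&0\\ 0&0&0&0&0&-2&0&0&2\\ -1&-1&0&1&1&0&0&0&0\\ 0&-1&-1&0&1&1&0&0&0\\ 0&0&0&-1&-1&0&1&1&0\\ 0&0&0&0&-1&-1&0&1&1\\ -2&0&0&2&0&0&0&0&0\\ 0&-2&0&0&2&0&0&0&0\\ 0&0&0&-2&0&0&2&0&0\\ 0&0&0&0&-2&0&0&2&0 \end{bmatrix}. \end{aligned}$$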

Then we can build $A_{1},A_{2},A_{3}$ and $A_{4}$ and compute $\vec {\mathbf{r }}^{1} ,\vec {\mathbf{r }}^{2}$ and $\vec {\mathbf{r }}$ by (54). According to (39), we have $\hbox {d}\vec {\mathbf{r }}\in \mathbb {R}^{16\times 18}$.

When $\phi (v)=\phi _{1}(v)$, we have ${\phi }'_{1}(v)=\frac{-2}{(v-1)^{3}}$, ${\phi }''_{1}(v)=\frac{6}{(v-1)^{4}}$ and so $\hbox {d}{\varvec{\phi }}(\vec {\mathbf{r }})=(\frac{-2}{(\vec {\mathbf{r }}_{1}-1)^3},\ldots ,\frac{-2}{(\vec {\mathbf{r }}_{16}-1)^3})^{T}$ in (38). In (40) the $i$th diagonal element is $[\hbox {d}^{2}{\varvec{\phi }}(\vec {\mathbf{r }})]_{ii}= \frac{6}{(\vec {\mathbf{r }}_{i}-1)^{4}},\ 1\le i\le 16$. Similarly, when $\phi (v)=\phi _{2}$, $\hbox {d}{\varvec{\phi }}(\vec {\mathbf{r }})=(\frac{-\vec {\mathbf{r }}_{1}-1}{(\vec {\mathbf{r }}_{1}-1)^3},\ldots ,\frac{-\vec {\mathbf{r }}_{16}-1}{(\vec {\mathbf{r }}_{16}-1)^3})^{T}$ and $[\hbox {d}^{2}{\varvec{\phi }}(\vec {\mathbf{r }})]_{ii}=\frac{2\vec {\mathbf{r }}_{i} +4}{(\vec {\mathbf{r }}_{i}-1)^{4}}$. When $\phi (v)=\phi _{3}$, $\hbox {d}{\varvec{\phi }}(\vec {\mathbf{r }})= (\frac{-2\vec {\mathbf{r }}_{1}}{(\vec {\mathbf{r }}_{1}-1)^3},\ldots ,\frac{-2\vec {\mathbf{r }}_{16}}{(\vec {\mathbf{r }}_{16}-1)^3})^{T}$ and $[\hbox {d}^{2}{\varvec{\phi }}(\vec {\mathbf{r }})]_{ii}= \frac{4\vec {\mathbf{r }}_{i} +2}{(\vec {\mathbf{r }}_{i}-1)^{4}}$.

Hence, we can get $d_{3}$ in (38) and ${\hat{H}}_{3}$ in (40).
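
Closed-form derivatives of this kind are easy to get wrong, so a finite-difference cross-check is worthwhile. The sketch below assumes the forms $\phi _{1}(v)=1/(v-1)^{2}$, $\phi _{2}(v)=v/(v-1)^{2}$ and $\phi _{3}(v)=v^{2}/(v-1)^{2}$, which are the functions whose second derivatives agree with the diagonal entries quoted in this appendix; their definitions are given earlier in the paper and are not restated here. All function names are hypothetical.

```python
def phi(k, v):
    """Assumed phi_k(v), k = 1, 2, 3, for v in [0, 1)."""
    numerator = {1: 1.0, 2: v, 3: v * v}[k]
    return numerator / (v - 1.0) ** 2


def dphi(k, v):
    """First derivative of phi_k in closed form."""
    numerator = {1: -2.0, 2: -v - 1.0, 3: -2.0 * v}[k]
    return numerator / (v - 1.0) ** 3


def d2phi(k, v):
    """Second derivative of phi_k in closed form."""
    numerator = {1: 6.0, 2: 2.0 * v + 4.0, 3: 4.0 * v + 2.0}[k]
    return numerator / (v - 1.0) ** 4


def central_difference(f, v, eps=1e-5):
    """Second-order accurate finite-difference approximation of f'(v)."""
    return (f(v + eps) - f(v - eps)) / (2.0 * eps)


def max_relative_error(k, points=(0.0, 0.3, 0.6)):
    """Largest mismatch between closed forms and finite differences."""
    worst = 0.0
    for v in points:
        e1 = abs(central_difference(lambda t: phi(k, t), v) - dphi(k, v))
        e2 = abs(central_difference(lambda t: dphi(k, t), v) - d2phi(k, v))
        worst = max(worst,
                    e1 / max(1.0, abs(dphi(k, v))),
                    e2 / max(1.0, abs(d2phi(k, v))))
    return worst
```

Calling `max_relative_error(k)` for `k = 1, 2, 3` should return values close to rounding level; a sign or exponent slip in a closed form shows up as an error of order one.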

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Zhang, D., Chen, K. A Novel Diffeomorphic Model for Image Registration and Its Algorithm. J Math Imaging Vis 60, 1261–1283 (2018). https://doi.org/10.1007/s10851-018-0811-3

Received: 22 March 2017
Accepted: 29 March 2018
Published: 10 April 2018
Issue Date: October 2018
DOI: https://doi.org/10.1007/s10851-018-0811-3
