1 Introduction

Image registration is a common problem that arises in many applications of medical image analysis. The problem is to find a non-rigid transformation which aligns two images. In this paper, we focus on the variational formulation of this problem, which has found broad acceptance in the literature [4, 5, 33]. The sought transformation corresponds to the minimum of a functional, which trades off the image similarity of the transformed target and the reference image \(I_T,I_R:\Omega \rightarrow \mathrm{I\!R}\), defined on some domain \(\Omega \), against the smoothness of the image transformation \(\varvec{\varphi }:\Omega \rightarrow \Omega \). Formally, the problem can be written as

$$\begin{aligned} \mathcal {J}[\varvec{\varphi }] := \mathcal {D}[I_R,I_T,\varvec{\varphi }]+\mathcal {S}[\varvec{\varphi }], \end{aligned}$$
(1)

where \(\mathcal {D}\) is an image distance measure and \(\mathcal {S}\) a regularizer. Using the calculus of variations, a minimizer of (1) can be found by solving a system of partial differential equations.

An elegant framework to minimize Eq. (1) was proposed by Beuthien et al. [4], where a minimum is reached using a convolution-based approach. Different regularization properties can be achieved by choosing different convolution kernels. However, their approach only addresses stationary kernels and only works efficiently if the kernel is separable.

In this paper, we present a generalization of this framework in which the regularization kernels may be nonseparable or even nonstationary. By employing low-rank tensor decomposition [18], we approximate nonseparable convolution kernels by separable 1D kernels in order to perform the convolution separately in each space dimension by successive 1D convolutions. Compared to nonseparable convolution, this dramatically reduces the computational complexity while accurately retaining the regularization properties. For instance, in 3D the separable Gaussian kernel requires three 1D convolutions, while a rank-4 nonseparable kernel needs \(4\times 3=12\) 1D convolutions.

Furthermore, we extend the framework to nonstationary regularizers, i.e. regularizers which depend on the spatial location. The corresponding spatially varying local filter kernels require efficient handling, which we address with a caching scheme. In addition, we use an extension of the framework to ensure that the resulting transformations are diffeomorphic. As in [28, 37, 38], in each iteration, we compute an efficient approximation of the exponential mapping that keeps the transformations diffeomorphic.

While this generalization is useful in a wide variety of registration tasks, our original motivation was to integrate landmarks into the regularization. In Fig. 1, we illustrate the practical importance of hybrid landmark and image registration. In this example, the landmarks help to greatly improve the registration accuracy of the patellar surface of a human femur. Using our nonstationary filtering approach, we can efficiently compute a solution to the hybrid registration problem as formulated by Lüthi et al. [21], which is conceptually appealing, but has so far been computationally infeasible. The idea is to integrate landmarks directly into the regularization, which in our terms means to minimize a slightly different functional

$$\begin{aligned} \hat{\mathcal {J}}[\varvec{\varphi }] := \mathcal {D}[I_R,I_T,\varvec{\varphi }]+\hat{\mathcal {S}}[\varvec{\varphi },X], \end{aligned}$$
(2)

where \(X\) is a list of landmark displacements. Since the regularizer \(\hat{\mathcal {S}}\) considers the given displacements at the landmark positions, the resulting kernel is nonstationary. Therefore, simple convolution approaches are not applicable, since the filter kernel varies for each image location. However, in our framework, we are able to efficiently handle the regularizer’s local dependency. Hence, performing hybrid image registration in terms of Eq. (2) becomes computationally feasible even for 3D images.

Fig. 1

The first two figures from the left show the reference and target femur surfaces, with the landmark points depicted in red and green, respectively. The patellar surface is marked in dark gray. In the right two figures, the reference is transformed to the target with two different methods: first using the standard diffeomorphic Demons [39], which does not incorporate the landmarks and clearly misaligns the patellar surface; in the last figure, the registration was performed with our hybrid approach, which considers the landmarks and results in an accurate patellar surface alignment (Color figure online)

Non-rigid image registration has been extensively studied in the literature, and several attempts have been made to reach a general framework for different regularizers [5, 33]. For an overview of image registration methods, we refer to the survey paper of Sotiras et al. [30], and more specifically to the book of Modersitzki [23] for a deeper discussion of variational methods for image registration. For diffeomorphic regularization we refer to [13, 28, 37, 38]. Nonstationary filtering methods have been used e.g. in [6, 31], where the regularization is locally adjusted depending on local image features such as curvature, or on local transformation properties like stiffness. Different hybrid methods, which combine landmarks and image features, have been proposed in the literature. For example, in [14, 16, 26] the landmarks are treated as additional constraints. These methods require a perfect interpolation of the landmarks, which causes numerical problems during optimization. Other methods do not enforce the landmark constraints strictly, but add the landmark differences as another cost term to the functional in Eq. (1) [2, 17, 20, 25, 29]. In our work, we were inspired by the Bayesian approach of Lüthi et al. [21], where these two kinds of methods are combined by integrating the landmarks into the regularization. This makes the uncertainty on the landmarks independent of the data term and reduces the search space to transformations that agree with the landmark displacements. Contrary to the approach of Lüthi et al. [21], however, our nonstationary filtering technique is able to efficiently handle large 3D images.

Our paper is structured as follows: in the background section, we briefly introduce the variational image registration framework and present the ideas of the hybrid image registration framework of Lüthi et al. [21] adapted to our method. In the subsequent method section, we present our separable and nonstationary filtering approach. In our experiments, we show registration results using different positive definite kernel functions such as Gaussian and exponential kernels. Using low-rank approximations of nonseparable kernels, we show an accuracy gain as well as improved convergence properties during the registration process. Applying a nonstationary kernel, we demonstrate the memory savings achieved by our caching scheme based on tensor decomposition. Furthermore, we discuss in more detail the introductory patellar surface example, where the incorporation of the landmarks leads to better registration performance. Finally, we discuss the advantages and challenges related to the landmark based transformation.

2 Background

2.1 Variational Image Registration Framework

Considering dense image features, for instance gray scale values of a CT image, formulating the registration problem as a variational optimization problem has proven very useful in the literature [4, 5, 28, 33, 37, 38]. A mapping that registers the two images \(I_R\) and \(I_T\) is sought as the minimum of the joint functional (1). Using methods from the calculus of variations, the functional is differentiated with respect to the mapping \({\varvec{\varphi }}\). If we denote the space of all admissible mappings as \(\Phi \), the functional derivatives of the two terms are \(d\mathcal {D}=:\mathbf {f}:\Phi \rightarrow \mathrm{I\!R}^d\) and \(d\mathcal {S}=:\mathcal {A}:\Phi \rightarrow \Phi \). \(\mathcal {A}\) is typically a differential operator, and a minimum of the functional has to satisfy the system of partial differential equations

$$\begin{aligned} \mathcal {A}[\varvec{\varphi }]=\mathbf {f}(\varvec{\varphi }). \end{aligned}$$
(3)

Many different strategies to solve this PDE and the associated minimization problem have been put forward, including finite difference methods, finite element methods and B-spline based methods.

The fastest method to solve (3), on which our work is based, is a convolution approach [28, 33]. This approach is possible if the fundamental solution or "Green's function" of the operator \(\mathcal {A}\) is known. The Green's function is then a positive definite kernel function \(k:\Omega \times \Omega \rightarrow \mathrm{I\!R}\), and (3) can be solved by the iteration scheme

$$\begin{aligned} \varvec{\varphi }^{i+1}=k*(\varvec{\varphi }^i \circ \mathbf {f}(\varvec{\varphi }^i)), \end{aligned}$$
(4)

where \(*\) denotes the convolution operation.

The classical example is the Demons algorithm, where \(\mathcal {A}=-\Delta \) and \(k\) is the Gaussian kernel. In principle, it is also possible to choose the kernel directly, without actually specifying the corresponding operator. Long et al. [19] proposed to use the exponential kernel \(k(x,y)=\frac{1}{C_d}\cdot e^{-\frac{\Vert x-y\Vert }{\alpha }}\), where \(C_d\) is a normalization constant for the number of dimensions \(d\): \(C_1=2\alpha \), \(C_2=2\pi \alpha ^2\), \(C_3=8\pi \alpha ^3\). They showed that, compared to the Gaussian kernel, this kernel better approximates the linear elasticity regularization, which leads to better transformation properties (see also Steinke and Schölkopf [32]). Since the kernel is not separable, applying it directly in this framework is computationally prohibitive. In our experiments later in this paper, we also show registration results using this regularizer.
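For illustration, the following NumPy sketch discretizes this kernel on an \(m^3\) voxel neighborhood, as needed for the filtering discussed in Sect. 3. The function name is ours and the default width \(m=23\) is merely the value used in our experiments in Sect. 4.1; our actual implementation extends ITK in C++ (Sect. 4).

```python
import numpy as np

def exponential_kernel(m=23, alpha=1.0, spacing=1.0):
    """Discrete unit impulse response of the exponential kernel on an m^3
    neighborhood: k(0, x) = exp(-||x|| / alpha) / C_3 with C_3 = 8*pi*alpha^3."""
    assert m % 2 == 1, "an odd width keeps the kernel centered"
    ax = spacing * (np.arange(m) - m // 2)
    X, Y, Z = np.meshgrid(ax, ax, ax, indexing="ij")
    dist = np.sqrt(X**2 + Y**2 + Z**2)
    return np.exp(-dist / alpha) / (8.0 * np.pi * alpha**3)

H0 = exponential_kernel()  # the discrete tensor H_0 used in Sect. 3.1
```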

2.2 Neighborhood Preservation

Common simple regularizers, such as the ones discussed in the previous section, penalize non-smooth transformations. They can neither avoid foldings nor guarantee invertible transformations. In medical applications, however, invertible transformations are preferred. Various authors [1, 8, 28, 35, 37, 38] have therefore proposed an additional restriction of the mappings \(\varvec{\varphi }^i\) by modeling \(\varvec{\varphi }\) using geodesic flows of diffeomorphisms [3]. Following the diffeomorphic Demons approach [37], the restriction of an optimal displacement field to be diffeomorphic, \(\varvec{\varphi }\in Diff (\Omega )\), can be achieved by mapping the current transformation \(\varvec{\varphi }^i\) back onto the Lie group of diffeomorphisms [1, 28, 37, 38]. This ensures that the transformation is invertible and neighborhood preserving. A possible way to do so, which only marginally changes the optimization scheme in Eq. (4), is to apply the group exponential map \(\exp \) of the Lie group of diffeomorphisms to the field update

$$\begin{aligned} \varvec{\varphi }^{i+1}=k*(\varvec{\varphi }^i \circ \exp (\mathbf {f}(\varvec{\varphi }^i))). \end{aligned}$$
(5)

The exponential mapping can be efficiently approximated by a scaling and squaring algorithm [1, 37].
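A minimal sketch of this approximation, assuming the velocity/update field is given as a displacement field sampled on the voxel grid and using linear interpolation for the composition (SciPy-based; function names are ours):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def compose(u, v):
    """Displacement-field composition: (Id+u) o (Id+v) = Id + v + u(x + v(x)).
    Fields have shape (3, X, Y, Z) in voxel units; linear interpolation."""
    grid = np.indices(u.shape[1:], dtype=float)
    warped = grid + v
    u_at = np.stack([map_coordinates(u[d], warped, order=1, mode="nearest")
                     for d in range(3)])
    return v + u_at

def field_exp(f, n=6):
    """Scaling and squaring: exp(f) ~ (Id + f / 2^n) self-composed n times."""
    u = f / 2.0**n
    for _ in range(n):
        u = compose(u, u)
    return u
```

The initial scaling makes \(\mathrm{Id}+\mathbf {f}/2^n\) a good approximation of \(\exp (\mathbf {f}/2^n)\); the \(n\) self-compositions then recover \(\exp (\mathbf {f})\).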

2.3 Hybrid Image Registration

In hybrid image registration, both dense image features and landmarks are available. In addition to the images \(I_R\) and \(I_T\), we are given lists \(X_R=\{x_i^R\}_{i=1}^n\) and \(X_T=\{x_i^T\}_{i=1}^n\) of \(n\) corresponding landmark points for each image. The displacements induced by these landmarks are given by

$$\begin{aligned} X&= \{(x_1^R,x_1^T-x_1^R),\ldots ,(x_n^R,x_n^T-x_n^R)\}\\&= :\{(x_1,y_1),\ldots ,(x_n,y_n)\}. \end{aligned}$$

Further, let \(u:\Omega \rightarrow \mathrm{I\!R}^d\) be an unknown displacement field, such that the warp is \(\varvec{\varphi }(x)=x+u(x)\) for \(x\in \Omega \), and the target image is warped as \(I_T(x+u(x))\). In [21], Lüthi et al. modeled the prior knowledge about \(u\) using a Gaussian process \(\mathcal {GP}(\mu ,k)\), which is defined by a mean function \(\mu :\Omega \rightarrow \mathrm{I\!R}^d\) and a covariance function \(k:\Omega \times \Omega \rightarrow \mathrm{I\!R}\). In our terms, this means to minimize

$$\begin{aligned} \mathcal {J}_{GP}[\varvec{\varphi }] := \mathcal {D}_{GP}[I_R,I_T,\mu ,u]+\mathcal {S}_{GP}[k,u], \end{aligned}$$

where the target image is warped by \(I_T(x+\mu (x)+u(x))\). Assuming a zero mean, this functional is equivalent to the functional (1) introduced at the beginning of this paper. However, the strength of this interpretation is that we can now formulate the hybrid registration problem by conditioning the Gaussian process on the \(n\) given landmark displacements. The resulting posterior process \(\mathcal {GP}_X(\mu _{X},k_{X})\) is given in closed form by

$$\begin{aligned} \mu _{X}(x)&= \mu (x) + K_X(x)^T(K+\sigma ^2Id)^{-1}Y \\ k_{X}(x,y)&= k(x,y) - K_X(x)^T(K+\sigma ^2Id)^{-1}K_X(y),\nonumber \end{aligned}$$
(6)

where \(K_X(x)=(k(x,x_i))_{i=1}^n\in \mathrm{I\!R}^n\), \(K\in \mathrm{I\!R}^{n\times n}\) is the kernel matrix with entries \(K_{i,j}=k(x_i,x_j)\), \(Y=(y_1-\mu (x_1),\ldots ,y_n-\mu (x_n))^T\in \mathrm{I\!R}^n\) are the mean-free landmark displacements, and \(\sigma ^2\) models the uncertainty about the matching accuracy of the landmarks (see also e.g. Rasmussen [27, Chap. 2.2]). Hence, the functional we minimize becomes

$$\begin{aligned} \mathcal {J}_{GP_X}[\varvec{\varphi }] := \mathcal {D}_{GP_X}[I_R,I_T,\mu _{X},u]+\mathcal {S}_{GP_X}[k_{X},u], \end{aligned}$$
(7)

using \(\mu _X\) as landmark based mean transformation and the kernel function \(k_X\) for regularization.
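The conditioning in Eq. (6) is standard Gaussian process regression and is straightforward to express in code. The following NumPy sketch (scalar-valued kernel, zero prior mean as assumed above; the function name and vectorization are ours) returns the posterior mean and kernel as callables:

```python
import numpy as np

def condition_gp(k, X_lm, Y, sigma2):
    """Condition a zero-mean GP with scalar kernel k on the n landmark
    displacements Y (array of shape (n, d)) at positions X_lm (Eq. 6)."""
    n = len(X_lm)
    K = np.array([[k(xi, xj) for xj in X_lm] for xi in X_lm])
    W = np.linalg.inv(K + sigma2 * np.eye(n))         # (K + sigma^2 Id)^{-1}
    K_X = lambda x: np.array([k(x, xi) for xi in X_lm])

    mu_X = lambda x: K_X(x) @ W @ Y                   # posterior mean, in R^d
    k_X = lambda x, y: k(x, y) - K_X(x) @ W @ K_X(y)  # posterior kernel
    return mu_X, k_X
```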

Since \(k_X\) depends on the landmark displacements, it is not stationary and cannot directly be handled by the optimization scheme (4) introduced above. In the following, we present the necessary adjustments to still benefit from the advantages of the framework.

3 Methods

In this paper, we generalize the variational image registration framework to kernel functions which may be nonseparable and nonstationary. While the optimization scheme in (4) is conceptually not restricted to separable convolution kernels, nonseparable filtering is computationally prohibitive. In Sect. 3.1, we present a separable 1D filter approximation for nonseparable filters, based on low-rank tensor approximation, which enables us to perform the convolution efficiently.

For the generalization to nonstationary kernels, which vary with the spatial location, Eq. (4) has to be rewritten. We write the convolution integral explicitly, but with a kernel which is not stationary (cf. the work of McOwen in [22, Chap. 2.3] and Evans [11, Chap. 2.3]). This yields the integral equation

$$\begin{aligned} \varvec{\varphi }^{i+1}=\int \limits _\Omega k_{X}(\cdot ,s)(\varvec{\varphi }^i \circ \mathbf {f}(\varvec{\varphi }^i))(s)\,\mathrm{{d}}s. \end{aligned}$$
(8)

Similarly, with \(\varvec{\varphi }\) restricted to be diffeomorphic, Eq. (5) becomes

$$\begin{aligned} \varvec{\varphi }^{i+1}=\int \limits _\Omega k_{X}(\cdot ,s)(\varvec{\varphi }^i \circ \exp (\mathbf {f}(\varvec{\varphi }^i)))(s)\,\mathrm{{d}}s. \end{aligned}$$
(9)

In Sect. 3.2, we further introduce a nonstationary filtering approach, which makes the approximation of the integrals (8) and (9) computationally feasible.

3.1 Separable Filter Decomposition

Applying the proposed optimization scheme to image registration requires the discretization of the formulation in Eq. (9). To make this concrete, we start by writing the spatially discretized version of Eq. (5), where the kernel \(k\) becomes a 3D tensor \(H_0\):

$$\begin{aligned} \varvec{\varphi }^{i+1}(x)= H_0*(\varvec{\varphi }^i \circ \exp (\mathbf {f}(\varvec{\varphi }^i)))_x. \end{aligned}$$
(10)

\(H_0\) is the discrete unit impulse response of \(k\) with elements \(H_{0_{ijq}}=k(0,(i,j,q)^T)\) and \(i,j,q\) cover the neighborhood around \(0\), while the subscript \(x\) of the second term indicates the equally large discrete neighborhood around the point \(x\).

If the kernel is separable, the iteration scheme of Eq. (10) can be accelerated greatly by performing the convolution separately in each space dimension by successive 1D convolutions. The Gaussian kernel has this property of separability, so the convolution with it can be performed separately without further effort. To still benefit from this performance gain for nonseparable kernels, like the exponential kernel [19], their separability has to be approximated. In 2D, this can be achieved by standard singular value decomposition. In 3D, however, this leads to mathematical challenges beyond standard linear algebra, since a filter kernel in 3D is a third-order tensor. In contrast to 2D matrices, determining the rank of a specific given higher-order tensor is NP-hard (see Kolda and Bader [18]). Hence, the rank \(R\) becomes a parameter which has to be estimated. Nevertheless, we are able to compute the approximation using a CANDECOMP/PARAFAC (CP) decomposition model [18]. This gives us separable 1D approximations of the discrete unit impulse response \(H_0\). Fig. 2 illustrates the decomposition model. Such a decomposition can be formulated as a minimization problem

$$\begin{aligned} \min _{\tilde{H}_0}\Vert H_0 - \tilde{H}_0\Vert \quad \mathrm{{ with}}\quad \tilde{H}_0=\sum _{r=1}^R a_r\otimes b_r\otimes c_r, \end{aligned}$$
(11)

where the operation \(\otimes \) denotes the three-way outer product \(\tilde{H}_{0_{ijk}}=\sum _r^R a_{ir}b_{jr}c_{kr}\). Standard algorithms to optimize (11) are based on the alternating least squares (ALS) method [15], which is explained in more detail in the following section. The parameter \(R\) is estimated by testing the approximation performance for different ranks (see Sect. 4.1).

Fig. 2

CP tensor decomposition model of the third-order tensor \(H\).

Once the decomposition is performed, the distributivity (12) and the associativity (13) of the convolution operation can be exploited to perform the convolution separately in each dimension with \(a_r,b_r\) and \(c_r\)

$$\begin{aligned} H_0*\mathbf {f}\approx \tilde{H}_0*\mathbf {f}&= \left[ \sum _r^R a_r\otimes b_r\otimes c_r\right] *\mathbf {f}\nonumber \\&= \sum _r^R\left[ a_r\otimes b_r\otimes c_r*\mathbf {f}\right] \end{aligned}$$
(12)
$$\begin{aligned}&= \sum _r^R a_r*\left( b_r*\left( c_r*\mathbf {f}\right) \right) . \end{aligned}$$
(13)

For cubic filter kernels with filter width \(m\), the computational cost per output pixel thus reduces significantly from \(m^3\) to \(3Rm\).
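Concretely, Eq. (13) amounts to \(R\) passes of three successive 1D convolutions, summed up. A NumPy/SciPy sketch, where the factor matrices \(A,B,C\) hold the vectors \(a_r,b_r,c_r\) as columns (the boundary handling is our assumption):

```python
import numpy as np
from scipy.ndimage import convolve1d

def separable_convolve(f, A, B, C):
    """Rank-R separable convolution, Eq. (13): sum_r a_r*(b_r*(c_r*f)).
    f is a 3D array; A, B, C are (m, R) factor matrices whose columns
    are the 1D kernels a_r, b_r, c_r."""
    out = np.zeros_like(f, dtype=float)
    for r in range(A.shape[1]):
        tmp = convolve1d(f, C[:, r], axis=2, mode="nearest")
        tmp = convolve1d(tmp, B[:, r], axis=1, mode="nearest")
        out += convolve1d(tmp, A[:, r], axis=0, mode="nearest")
    return out
```

In the registration loop, this replaces the full 3D convolution in Eq. (10), applied to each component of the displacement field.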

3.1.1 Alternating Least Squares Method

The low-rank approximation of \(H_0\) is calculated by solving the optimization problem (11). The minimizers \(a_r,b_r\) and \(c_r\) are obtained using the alternating least squares (ALS) method [15]. For this purpose, we introduce a notation to represent a tensor in matrix form.

Let \(H\in \mathrm{I\!R}^{P\times Q \times R}\) be a third-order tensor. By fixing one index the tensor is sliced into two-dimensional sections which have horizontal (mode-1), lateral (mode-2) and frontal (mode-3) orientation for the indices \(\{1,2,3\}\) respectively. The mode-\(n\) unfolding denoted as \(H_{(n)}\) concatenates the mode-\(n\) slices horizontally to a matrix.

Following Kolda and Bader [18], the CANDECOMP/PARAFAC model can be expressed as

$$\begin{aligned} H = [\![A,B,C]\!] \equiv \sum _{r=1}^R a_r\otimes b_r\otimes c_r, \end{aligned}$$

while \(H_{(1)} = A(C\odot B)^T\), with \(A = (a_1, a_2, \ldots , a_R)\) and likewise \(B\) and \(C\). Here, \(\odot \) denotes the Khatri-Rao product, i.e. the column-wise Kronecker product.

The matrices \(A,B\) and \(C\) which minimize (11) can be calculated by alternately fixing all but one matrix, e.g. \(A\), and minimizing

$$\begin{aligned} \min _A \Vert H_{(1)} - A(C\odot B)^T\Vert _F, \end{aligned}$$

which has the optimum at

$$\begin{aligned} A = H_{(1)}[(C\odot B)^T]^{\dagger }. \end{aligned}$$

Using the special property that

$$\begin{aligned} (A\odot B)^{\dagger }=(A^{T}A \star B^{T}B)^{\dagger }(A\odot B)^T, \end{aligned}$$

where \(\star \) is the Hadamard (element-wise) product and \(A^{\dagger }\) the Moore-Penrose pseudo-inverse, the equations can be iteratively solved for \(A, B\) and \(C\)

$$\begin{aligned} A&= H_{(1)} (C\odot B) (B^{T}B \star C^{T}C)^{\dagger }\\ B&= H_{(2)} (C\odot A) (A^{T}A \star C^{T}C)^{\dagger }\\ C&= H_{(3)} (B\odot A) (A^{T}A \star B^{T}B)^{\dagger } \end{aligned}$$

until the values of \(A,B\) and \(C\) converge. The convergence speed depends on the initialization of the fixed matrices. A common choice for the initialization is to use the Higher-order SVD [10] discussed in Sect. 3.2.2.
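The following NumPy sketch implements these updates (mode-\(n\) unfolding in the column ordering of Kolda and Bader [18]; random initialization is used for brevity where an HOSVD initialization would be the common choice):

```python
import numpy as np

def unfold(H, n):
    """Mode-n unfolding H_(n) in the column ordering of Kolda and Bader."""
    return np.reshape(np.moveaxis(H, n, 0), (H.shape[n], -1), order="F")

def khatri_rao(A, B):
    """Khatri-Rao product: column-wise Kronecker, (I,R),(J,R) -> (I*J,R)."""
    return np.einsum("ir,jr->ijr", A, B).reshape(-1, A.shape[1])

def cp_als(H, R, n_iter=100, seed=0):
    """ALS iterations for the rank-R CP model H ~ [[A, B, C]] (Sect. 3.1.1)."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((H.shape[i], R)) for i in range(3))
    for _ in range(n_iter):
        A = unfold(H, 0) @ khatri_rao(C, B) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = unfold(H, 1) @ khatri_rao(C, A) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = unfold(H, 2) @ khatri_rao(B, A) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C  # columns are the a_r, b_r, c_r of Eq. (11)
```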

Since \(H_0\) is now decomposed, the convolution in (10) can be performed separately.

3.2 Efficient Nonstationary Filtering

While stationary kernels \(k(x-y)\) only depend on the difference of \(x\) and \(y\), nonstationary kernels \(k(x,y)\) depend on both arguments. For such kernels, separable filtering is therefore not possible, since the associativity no longer holds. If we spatially discretize the integral equation (9)

$$\begin{aligned} \varvec{\varphi }^{i+1}(x)= H_x*(\varvec{\varphi }^i \circ \exp (\mathbf {f}(\varvec{\varphi }^i)))_x, \end{aligned}$$
(14)

where \(H_x\) is the discrete impulse response of \(k\) at location \(x\), i.e. \(H_{x_{ijq}}=k(x,(i,j,q)^T)\), we see that \(H\) now depends on \(x\), which makes the problem nonstationary. In general, the calculation of all the local impulse responses makes the problem computationally intractable. However, in the particular case where we minimize the hybrid functional (7), we can exploit the following properties of the landmark kernel \(k_X\) to reach an algorithm which is computationally feasible.

3.2.1 Landmark Kernel Properties

The landmark kernel \(k_X\) consists of the kernel \(k\) minus a landmark-dependent term. The difference between \(k\) and the full landmark kernel \(k_X\) becomes negligible if

$$\begin{aligned} \forall x_i\in X\quad k(x,x_i)<\xi \end{aligned}$$
(15)

i.e. if \(x\) is not in the neighborhood of any landmark. We exploit this property to approximate the integral in Eq. (9) by only considering \(k\), the first part of the landmark kernel \(k_X\), wherever the value of its second part vanishes. We perform the approximation in two steps:

1. At first, the whole image is filtered separately using the stationary part \(k\).

2. Subsequently, the nonseparable and nonstationary filtering with the full kernel \(k_{X}\) is performed, but only for pixels where (15) is not fulfilled (see the sketch below).
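Selecting the pixels for the second step amounts to masking the voxels that violate condition (15). A NumPy sketch, assuming the stationary kernel \(k\) is vectorized over its first argument (the example Gaussian and all names are ours):

```python
import numpy as np

def landmark_support_mask(shape, landmarks, k, xi):
    """Boolean mask of voxels where condition (15) fails, i.e. where
    k(x, x_i) >= xi for at least one landmark x_i. Only these voxels
    need the full nonstationary filter k_X."""
    d = len(shape)
    coords = np.stack(np.indices(shape), axis=-1).reshape(-1, d).astype(float)
    mask = np.zeros(coords.shape[0], dtype=bool)
    for x_i in landmarks:
        mask |= k(coords, x_i) >= xi
    return mask.reshape(shape)

# example stationary part: a Gaussian, vectorized over the first argument
gauss = lambda X, c, s=5.0: np.exp(-np.sum((X - c)**2, axis=-1) / (2 * s**2))
```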

The second step is the most expensive one, because the discrete local impulse response \(H_x\) has to be calculated for each point in the vicinity of the landmarks. This requires a number of kernel evaluations that grows cubically with the filter width, covering the neighborhood of all points having landmark support, in each iteration of Eq. (9). To reduce the computational demands, we propose the following caching scheme.

3.2.2 Local Filter Caching

Since the landmark kernel is nonstationary but still time-invariant, it is reasonable to keep the computed filter kernels in memory to save computational time in the following iterations. This, however, only shifts the problem: the amount of memory needed to cache all the filter kernels grows rapidly with the filter width and the number of landmarks. Therefore, we compress these local filter kernels, again taking advantage of tensor decomposition, before we cache them in memory.

As we saw in Sect. 3.1.1, the CP decomposition is obtained by the ALS method, which is quite costly due to its iterative nature. Because \(H_0\) has to be decomposed only once, ALS is still well suited for approximating the separability of the stationary filter. However, it is too slow to decompose all the local impulse responses \(H_x\).

Compared to the CP decomposition, the Tucker decomposition [34] (see Fig. 3) is significantly faster. It is an alternative model to decompose a tensor: as in the CP model, the tensor is decomposed into triplets of vectors, but these are weighted by a dense, so-called “core” tensor.

$$\begin{aligned} \tilde{H}_x = [\![G;A,B,C]\!] \equiv \sum _{p=1}^P\sum _{q=1}^Q\sum _{r=1}^Rg_{pqr}a_p\otimes b_q\otimes c_r,\nonumber \\ \end{aligned}$$
(16)

where \(g_{pqr}\) are the elements of the core tensor \(G\) and \(P,Q,R\) are the ranks for each space dimension. In this model the unfolded tensor \(H\) is represented as

$$\begin{aligned} H_{(1)} = AG_{(1)}(C\bullet B)^T, \end{aligned}$$

where \(\bullet \) denotes the Kronecker product. Using the Higher-order SVD algorithm of De Lathauwer et al. [10],

$$\begin{aligned} \min \Vert H_{(1)} - AG_{(1)}(C\bullet B)^T\Vert \end{aligned}$$

can be very efficiently minimized by setting \(A,B\) and \(C\) to the leading left singular vectors of the corresponding mode-\(n\) unfolding \(H_{(n)}\)

$$\begin{aligned} A = U_P^{(1)},\, B = U_Q^{(2)},\, C = U_R^{(3)}, \end{aligned}$$

where \(U_l^{(n)}\) is the matrix consisting of the leading \(l\) left singular vectors of \(H_{(n)}\), and the core is given by \(G_{(1)}=A^T H_{(1)}(C\bullet B)\).
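A compact sketch of this truncated HOSVD in NumPy (the einsum computes the core \(G = H \times _1 A^T \times _2 B^T \times _3 C^T\); the function name is ours):

```python
import numpy as np

def hosvd(H, ranks):
    """Truncated higher-order SVD: factor n holds the leading left singular
    vectors of the mode-n unfolding H_(n); G is the core tensor of Eq. (16)."""
    U = []
    for n in range(3):
        Hn = np.reshape(np.moveaxis(H, n, 0), (H.shape[n], -1), order="F")
        u, _, _ = np.linalg.svd(Hn, full_matrices=False)
        U.append(u[:, :ranks[n]])
    A, B, C = U
    G = np.einsum("ijk,ip,jq,kr->pqr", H, A, B, C)  # G = H x1 A^T x2 B^T x3 C^T
    return G, A, B, C
```

In contrast to the iterative ALS, this requires only three SVDs per local kernel, which is what makes caching all \(H_x\) practical.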

Fig. 3

Tucker tensor decomposition model

Compared to the CP model, the Tucker decomposition is a less restricted model, where the core \(G\) can be dense, while in the CP model the core is a superdiagonal tensor with ones on the diagonal. Although it cannot be used for separable filter approximation due to the weighting with the dense \(G\), the memory savings are similar to the CP model. Setting \(P=Q=R\) and having a filter length \(m\), the memory consumption reduces from \(m^3\) to \(R^3+3Rm\) per voxel in the support of the landmarks. In this paper, we chose \(P,Q\) and \(R\) by testing the resulting approximation performance.
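As a concrete illustration with values that appear in our experiments (filter width \(m=23\) as in Sect. 4.1 and ranks \(P=Q=R=5\) as in Sect. 4.2), the per-voxel memory for a cached kernel drops from

$$\begin{aligned} m^3 = 23^3 = 12167 \qquad \text {to}\qquad R^3+3Rm = 125+345 = 470 \end{aligned}$$

values, a reduction by roughly a factor of \(26\).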

Listing 1 (the full hybrid registration algorithm)

3.3 Multi-Resolution Versus Multi-Scale

We presented a method to minimize the registration functional (7). It is mainly based on the local iterative minimization scheme (4). As such, it relies on a reasonable initialization and is prone to getting “stuck” in local minima. To deal with that, we adopt a multi-resolution strategy [40]. The support in voxels of the kernel function \(k\) implicitly increases towards the lower resolution levels. Therefore, in combination with the posterior mean function \(\mu _X\), we use a multiscale kernel \(\tilde{k}\) (cf. Opfer [24]), which combines kernels with different support, to compute the landmark based mean transformation \(\mu _X\) (6):

$$\begin{aligned} \tilde{k}(x,y) = \sum _{l=0}^L{\lambda _l k^l(x,y)}, \end{aligned}$$

where \(\lambda _l\) are positive weights, \(k^l\) corresponds to \(k\) with kernel parameters adjusted to scale level \(l\), and \(L\) is the number of scale levels. For the Gaussian kernel, for example, the parameter becomes \(\sigma _g^l=\sigma _g\cdot 2^{L-l}\). We have set the weights \(\lambda _l\) to \(10^{-l}\).

3.4 The Algorithm

By joining all the previously described building blocks, we have obtained a non-rigid image registration framework in which different regularizers can be implemented by simply exchanging the regularization kernel, even if it is nonseparable or nonstationary. In particular, the landmark kernel is supported by our framework. The diffeomorphic regularization is also approximated, as shown in Eq. (14). Moreover, we presented a multiscale approach that brings the landmark mean together with the image-based optimization on different resolution levels. The full algorithm, which minimizes Eq. (7) by joining all the presented concepts, is provided in Listing 1.

In the following, the performance of our filter approximation techniques is evaluated in detail, while we also provide a qualitative hybrid registration example.

4 Results

We presented a method which enables an efficient approximation of the optimization scheme in Eq. (9). In this section, we perform registration experiments to validate our method. First, we provide a detailed study of the separable filter approximation and discuss its performance in terms of accuracy and computational aspects. Second, we analyze the local filter compression with respect to memory consumption, computational demands and approximation accuracy. We compare our method with Elastix [17], where the landmarks are incorporated as an additional cost term in the functional (1). This is followed by a qualitative result for the introductory patellar surface example, which we likewise compare with Elastix. In an additional section, we discuss the landmark based mean transformation \(\mu _X\) in more detail.

As quality measures, we use the target registration error (TRE), the Dice coefficient (DICE), and the number of singularities of the displacement fields, i.e. the number of voxels where the determinant of the Jacobian is smaller than zero. To compare two displacement fields \(A\) and \(B\), we consider, for each vector pair, the difference in magnitude and the directional discrepancy. Based on this, we define the accuracy loss:

$$\begin{aligned} \tau (A,B)&= \int \limits _\Omega \left( \Vert A(x)\Vert -\Vert B(x) \Vert \right) ^2 \\&+\left\| \frac{A(x)}{\Vert A(x)\Vert }\times \frac{B(x)}{\Vert B(x)\Vert }\right\| ^2 \mathrm{{d}}x,\nonumber \end{aligned}$$
(17)

which satisfies \(\tau (A,A)=0\) and is greater than zero for dissimilar displacement fields.
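On discrete displacement fields, Eq. (17) becomes a sum over all voxels. A NumPy sketch (the small \(\epsilon \) guarding against division by zero-length vectors is our addition):

```python
import numpy as np

def accuracy_loss(A, B, eps=1e-12):
    """Discrete accuracy loss of Eq. (17) for fields of shape (X, Y, Z, 3):
    squared magnitude difference plus the squared cross product of the
    normalized vectors, summed over all voxels."""
    nA = np.linalg.norm(A, axis=-1)
    nB = np.linalg.norm(B, axis=-1)
    cross = np.cross(A / (nA[..., None] + eps), B / (nB[..., None] + eps))
    return float(np.sum((nA - nB)**2 + np.sum(cross**2, axis=-1)))
```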

Since we only compare different regularizers and their approximations, we use the sum of squared differences similarity measure for all experiments:

$$\begin{aligned} \mathcal {D}[I_R,I_T,\varvec{\varphi }] := \int \limits _\Omega \left( I_T\circ \varvec{\varphi }(x)-I_R(x)\right) ^2\mathrm{{d}}x. \end{aligned}$$

Following Thirion [33], we perform second order gradient descent on \(\mathcal {D}\) and obtain the forces

$$\begin{aligned} \mathbf {f}(\varvec{\varphi }(x))= -\frac{(I_T\circ \varvec{\varphi }(x)-I_R(x))\nabla I_T\circ \varvec{\varphi }(x)}{\kappa ^2\left( I_T\circ \varvec{\varphi }(x)-I_R(x)\right) ^2+\Vert \nabla I_T\circ \varvec{\varphi }(x)\Vert ^2}, \end{aligned}$$

with \(\kappa ^2\) the reciprocal of the mean squared image spacing.
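For reference, these forces are cheap to evaluate once the warped target \(I_T\circ \varvec{\varphi }\) is available. A NumPy sketch (gradients by finite differences; the \(\epsilon \) in the denominator is our addition):

```python
import numpy as np

def demon_forces(I_R, I_T_warped, kappa2, eps=1e-12):
    """Second-order gradient descent forces on the SSD measure; I_T_warped
    is I_T o phi, kappa2 the reciprocal mean squared image spacing."""
    diff = I_T_warped - I_R
    grad = np.stack(np.gradient(I_T_warped), axis=-1)   # grad(I_T o phi)
    denom = kappa2 * diff**2 + np.sum(grad**2, axis=-1) + eps
    return -(diff[..., None] * grad) / denom[..., None]
```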

We always set the prior mean function \(\mu \) to a rigid pre-alignment of the images.

We implemented our algorithm by extending the finite difference solver framework of the Insight Toolkit [39] and performed the experiments on an Intel Xeon CPU @ 3 GHz with 12 cores.

Elastix Configuration For the registration with Elastix [17] we used the B-spline transformation model combined with the mean squares metric and an LBFGS optimizer. For the landmark examples we combined the mean squares metric with the “Corresponding Points Euclidean Distance Metric” which is equivalent to the target registration error.

4.1 POPI Breathing Thorax Model

In this first experiment, we show quantitative results for different approximation ranks of the nonseparable exponential kernel \((\alpha =1)\) without considering landmarks. The filter \(H_0\) has been discretized on a \(23^3\) voxel neighborhood. We compare the results to the exact method, which is obtained with the same kernel, but without separable filtering. We used the POPI dataset provided by the Léon Bérard Cancer Center & CREATIS lab, Lyon, France [36], which contains \(10\) CT images of a breathing lung. The images have a resolution of \(482\times 360\times 141\) voxels and a spacing of \(0.98\times 0.98\times 2\) mm. For our experiment, the images have been resampled to \(235\times 175\times 141\) voxels and scaled to an isotropic spacing of 2 mm. We chose image number 0 as the reference image and ran the experiments on a single scale level. We repeated the experiment, increasing the rank \(R\) of the separable filter approximation from one to four. \(R=1\) corresponds to the rank-one approximation used in Beuthien et al. [4], which serves as our baseline. The exact method corresponds to the algorithm of Long et al. [19] extended to 3D.

Fig. 4 shows, for each experiment, the image error averaged over the nine registrations as it evolves during the optimization. In the first three experiments, the convergence rate decreases with increasing rank \(R\), while the resulting image error becomes smaller. One can also observe that for \(R\ge 3\) the image error stays nearly the same and is close to the exact method. Moreover, the variance of the image error narrows with higher \(R\). It can be assumed that for \(R>4\) no significantly improved approximation can be achieved. For a better comparison, all mean curves are shown together in the last plot. Furthermore, in Table 1, the results of the experiments are summarized in numerical terms. Evaluating the accuracy loss (17) between the approximations and the exact method, higher rank approximations reach greater accuracy. The CPU time is considerably higher for the exact method. With a third of the effort, we achieve a good approximation, accepting only a very small loss of accuracy. For a more detailed comparison, we repeated the whole experiment on different scale levels. The results are listed in Table 2 and the upper part of Table 3. Note that all quantities are averaged over the nine experiments.

Fig. 4

This figure shows the image error averaged over all nine experiments for approximation ranks one to four as well as for the exact method. For each experiment, the mean error is plotted together with \(\pm \) one standard deviation in solid style and the max/min as dashed curves. The last subfigure shows the averages of all variants in one plot

Table 1 This table provides information about convergence properties with different approximation ranks
Table 2 This table provides information about the computational time needed for the different experiments
Table 3 In this table we show the performance evaluation of the POPI experiment with and without landmarks

The results show that for nonseparable kernels, a rank-one approximation is not accurate enough to capture the filter's regularization properties. The calculation time grows linearly with the rank \(R\). Since the resulting image error as well as the convergence properties for \(R=4\) do not significantly differ from the exact method, we conclude that \(4\) ranks are sufficient to approximate the exponential kernel separably.

For a meaningful comparison with Elastix, we performed three experiments. First, the smoothness parameter \(\sigma _\mathrm{{B-spline}}\) of the B-spline transform was tuned for a small TRE (\(\sigma _\mathrm{{B-spline}}=4\)). Second, \(\sigma _\mathrm{{B-spline}}\) was tuned such that no singularities are present in the result, while keeping the TRE as small as possible (\(\sigma _\mathrm{{B-spline}}=16\)). Finally, the parameter was chosen such that the resulting transformations are approximately as smooth as those obtained by our method (\(\sigma _\mathrm{{B-spline}}=64\)). To quantify the smoothness of a displacement field \(A\), we integrate over the local displacement changes:

$$\begin{aligned} \upsilon (A) = \int \limits _{\Omega }\int \limits _{\mathfrak {B}_x}\Vert A(s)-A(x)\Vert \mathrm{{ d}}s\mathrm{{d}}x, \end{aligned}$$

where \(\mathfrak {B}_x\) is the neighborhood around \(x\) with radius \(1\). The results in Table 3 show that one cannot have it both ways: in Elastix, there is a trade-off between the TRE and the smoothness of the transformation. \(\sigma _\mathrm{{B-spline}}\) can be tuned for a small TRE at the cost of a less smooth transformation, or it can be chosen such that the resulting transformation is smooth, but with a higher TRE. Our method, in contrast, reaches significantly smoother transformations than Elastix at a similar TRE. Since we regularize for diffeomorphic transformations, it was expected that, unlike Elastix with a small smoothness parameter, our results contain no singularities. As soon as \(\sigma _\mathrm{{B-spline}}\) is increased such that the transformations are as smooth as in our method, the TRE and DICE performance of Elastix drops dramatically.

To quantify the efficiency of our filter caching approach, we performed the experiments once more, but included the \(21\) landmarks provided with the POPI dataset. The landmark uncertainty was set to \(\sigma =0.02\). For comparison, the exact method, which combines the separable filtering and the landmarks, performs no compression. In Table 4, the average resources needed for each experiment are listed. As expected, the Tucker decomposition needs slightly more CPU time. However, this is negligible compared to the memory savings reached with this compression. Furthermore, the approximation of the local filter kernels is nearly perfect, resulting in a very small loss of accuracy. The most CPU-intensive part of each experiment is the first iteration, because initially all local filter responses have to be calculated. Without the caching scheme, the overall CPU time would therefore explode to CPU weeks.

Table 4 The table provides information about resources needed to perform the experiments

To compare our hybrid results with Elastix, we performed the hybrid B-spline registration twice, using a small and a large weight \(w\) for the landmark cost term, respectively (see Table 3). A large weight results in a smaller TRE, while the overall smoothness decreases. Several singularities are present in the Elastix results, while the singularities in our method are negligible. The major advantage of our method becomes apparent in the overall smoothness: despite the landmark consideration, it is much higher than in the Elastix experiment.

The TRE could be decreased further, but given the small uncertainty on the landmarks, we would have expected a smaller landmark error. This discrepancy originates from the discretization of the mean transformation \(\mu _X\), which in this experiment leads to a TRE drift of \(0.264 \pm 0.121\):

$$\begin{aligned} \mathrm{{TRE}}_\mathrm{{drift}} = \mathrm {TRE}({\mu _X}_\mathrm{{discretized}}) - \mathrm{{TRE}}({\mu _X}_\mathrm{{exact}}) \end{aligned}$$

In the experiment in Sect. 4.2, where the resolution is about twice as high, the discretization error results in a TRE drift of only \(0.056 \pm 0.002\), which is negligible. Therefore, the mean transformation should be discretized on a finer grid.

A note on parallelization Since our method is based on image filtering, it is well suited to performing the filtering for each voxel in parallel. Hence, the standard parallelization framework of ITK could directly be used to speed up the calculations. We performed the experiments with \(24\) processes and reached an average speedup between \(15\) and \(18\), as listed in Table 5. Because the landmarks are not evenly distributed over the image domain, the work load is not evenly distributed over the processes either; therefore, we reached a lower speedup in the hybrid registration experiment. The actual time needed to perform the experiments is the CPU time listed in Tables 1 and 2 divided by the speedups listed in Table 5. For example, calculating the 9 registrations on level 2 with the exact method took us \(2.5\) days instead of \(8.5\) weeks.

Table 5 The speedups were calculated by \(\dfrac{\hbox {Time (1 process)} - \hbox {Time (24 processes)}}{\hbox {Time (24 processes)}}\)

4.2 Patellar Surface Registration

We further performed a 3D experiment registering two femur shapes. The challenge with this kind of data is that the border of the patellar surface is potentially hard to recognize and its variation can be quite large, such that an accurate registration of the patellar surfaces using fully automatic algorithms is difficult. We obtained the patellar surfaces of the target and reference bone from an expert. By incorporating well-chosen landmarks, we can force our algorithm to register even the patellar surface correctly. The shapes were represented as signed distance images of \(353\times 327\times 491\) voxels (isotropic spacing 0.57 mm) and registered on 5 scale levels. For \(k\), we used the Gaussian kernel with \(\sigma _g=1\) and a landmark uncertainty of \(\sigma =0.3\times 10^{-3}\). We approximated the landmark kernel \(k_X\) with \(P=Q=R=5\). For illustration, we performed the experiment once without and once with the landmarks.

In Fig. 5, the warp fields are shown as arrows, resampled on the bone surface. Especially at the upper border of the patellar surface, one can see the strong impact of the landmarks. In Fig. 6, we plot the warped reference shape including the part marked in dark gray. Without the landmarks, the border of the patellar surface is clearly misaligned, while it is correctly registered when the landmarks are incorporated.

Fig. 5

The figures show the warp field resampled on the reference's surface, depicted as arrows. First row: registration without landmarks. Second row: registration including the landmarks (Color figure online)

Fig. 6

The figures show the warped reference shape including the colored patellar surface. First row: registration without landmarks. Second row: registration including the landmarks. Third row: ground-truth target shape

We performed the same experiment with Elastix and summarize the results in Table 6. The parameters of the hybrid B-spline registration were tuned with respect to the TRE and DICE performance measures. While Elastix brings the TRE down to nearly zero, a very large number of singularities is present in the resulting transformation and the Dice coefficient is rather low. Our method reaches a small TRE as well; furthermore, the singularity count is very low, the DICE quite high, and the displacement field smooth.

Table 6 Quantitative measures of the femur example

4.3 Smooth Mean Displacement

Since we can force multiple reference landmarks to match one single target landmark by setting \(\sigma \) equal to zero, \(\mu _X\) is not guaranteed to be invertible. In Fig. 7, an artificial example is shown where a grid is transformed by the mean displacement using different values of \(\sigma \). Setting \(\sigma \) to zero, or too small, results in unfavorable folds and barely makes sense in real-world medical problems. Therefore, in our patellar surface experiment, we have chosen the parameters such that folds in \(\mu _X\) hardly ever occur. The mean-transformed reference shape is shown in Fig. 8, where no holes can be identified on the surface.

Fig. 7

Transformed grid (\(200^2\) px, isotropic spacing of 0.1 mm) with mean displacement using the Gaussian kernel \((\sigma _g=6)\). There are three landmarks defined as reference and target points (red reference, green target; the yellow ones are equal for both). The uncertainty on the landmarks increases from left to right \((\sigma =0, 0.5\times 10^{-3}, 0.75\times 10^{-3}, 0.1\times 10^{-2}, 0.25\times 10^{-2})\). The arrows illustrate to which location a point is transformed by the displacement field (Color figure online)

Fig. 8

The figures show the reference shape warped by the mean transformation using the multiscale kernel \(\tilde{k}\) with five scale levels. Overall, the shape looks the same as the reference except in the regions of the landmarks, where there is a smooth transformation towards the target

Nevertheless, an inverse transformation can be obtained using the fixed-point approach of Chen et al. [7], where the inverse is iteratively approximated. An entirely different approach would be to perform diffeomorphic point matching [2, 9, 12, 13] to obtain an invertible mean displacement. This will be addressed in future work.

5 Conclusion

In this paper, we implemented an efficient variational image registration framework in which a large variety of positive definite kernels can be used for regularization. Compared to standard approaches, we are able to accurately approximate nonseparable regularizers by separable filters in order to relax the computational demands. With less than a third of the computational effort, we approximate the true regularization with a very small loss of accuracy, while the rank-1 approximation is three times faster but results in an accuracy loss which is one order of magnitude larger. Furthermore, using an efficient nonstationary filtering scheme, we allow for location-dependent regularization. This enables us to perform hybrid landmark and image registration by utilizing the landmark kernel, which incorporates landmark displacements as prior knowledge. For this purpose, at the cost of slightly more computational time, we reduce the memory usage by at least one order of magnitude. Additionally, we added the diffeomorphic constraint on the resulting transformation; its approximation does not significantly change the optimization scheme. The comparison with the hybrid B-spline registration shows that our method results in smoother displacement fields even when landmark displacements are incorporated. We also discussed challenges associated with the invertibility of the landmark based transformation. An additional prior on this transformation which ensures invertibility, similar to [2, 12], would further improve the registration. This will be addressed in future work.