1 Introduction

Diffusion magnetic resonance imaging (dMRI) is an advanced medical imaging technique based on magnetic resonance that describes the diffusion of water particles in biological tissues. The basic mathematical description of diffusion is a second-order tensor \( \mathbf {D}\in \mathbb {R}^{3 \times 3} \), represented by a \( 3 \times 3 \) symmetric and positive definite (SPD) matrix whose diagonal elements \( D_{ii} \) (where \( i = x,y,z \)) represent the diffusion along the main directions and whose off-diagonal elements \( D_{ij} \) represent the correlation between them. The tensor also allows internal structures of living organisms to be described through derived scalar measures obtained from \(\mathbf {D}\), such as fractional anisotropy maps [1]. Clinical applications of this type of image include the diagnosis of neurological diseases (e.g., Parkinson's disease and epilepsy), fiber tracking [2], and detection of brain tumors [3], among others. A collection of spatially related diffusion tensors is known as a diffusion tensor (DT) field or diffusion tensor imaging (DTI). A DT field is obtained from a dMRI study by solving the Stejskal-Tanner formulation [4]. However, the use of DT fields estimated from dMRI is limited because the images are acquired with low spatial resolution. Technological limitations and clinical acquisition protocols restrict dMRI to a poor spatial resolution: from 1 to 2 \(\mathrm{mm}^3\). In some clinical applications, the studied structures (e.g., gray matter, tumors, tissue fibers) must be analyzed in detail before a medical procedure is performed. In these scenarios, the low spatial resolution becomes a considerable difficulty. For this reason, researchers have proposed methodologies for enhancing the spatial resolution of dMRI studies, such as those presented in [5,6,7,8].

Interpolation of diffusion tensors is a feasible way to obtain high-resolution images. Nevertheless, interpolation is challenging because the tensors of a field have different characteristics. Moreover, a DT must satisfy mandatory restrictions: for example, tensors must be SPD matrices, and the determinants of neighboring tensors must change monotonically to avoid the swelling effect [7]. Another relevant factor is the spatial correlation among neighboring tensors. Some DT fields have smooth spatial transitions; in others, the shape, size, and orientation of the tensors change strongly, and such fields are complex to interpolate. Several methodologies for interpolation of diffusion tensors have been proposed. A straightforward one is Euclidean interpolation [5], where each component of the tensor is interpolated linearly and independently. Its drawback is a swelling effect in the interpolated tensors [7]. To solve this issue, a logarithmic transformation of the tensor components was introduced, which ensures a monotonic variation of the determinants, avoids the swelling effect, and preserves the SPD constraint. An important limitation of this technique is that it modifies relevant clinical information extracted from DTI, namely fractional anisotropy (FA) and mean diffusivity (MD). Additionally, a framework based on Riemannian geometry was proposed in [6], comprising two methods: rotational and geodesic interpolation. However, these methods are computationally expensive and also modify the FA and MD information. Alternative approaches based on the interpolation of tensor features decompose the tensors into shape and orientation features (eigenvalues and Euler angles). The first attempt was presented in [7], where the features are interpolated linearly and separately. In [8], a probabilistic method based on multi-output Gaussian processes is applied; unlike the method of [7], the features are interpolated jointly. Because this tensor decomposition is not unique, the tensor reconstruction is ambiguous. A recent probabilistic technique proposed in [9] interpolates the tensors using generalized Wishart processes (GWP). This method keeps the properties and constraints of diffusion tensors, but it performs poorly over fields with strong transitions (non-stationary fields). Hence, there remain unsolved problems related to the low spatial resolution of DT fields, as well as drawbacks and limitations in the interpolation methods proposed so far.
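The swelling effect mentioned above can be illustrated with a small numerical sketch. It assumes that the logarithmic transformation corresponds to the usual log-Euclidean scheme (matrix logarithm, linear blend, matrix exponential); the two tensors and all function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def spd_log(D):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(D)
    return (V * np.log(w)) @ V.T

def spd_exp(S):
    """Matrix exponential of a symmetric matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.T

def euclidean_interp(D1, D2, t):
    """Component-wise linear interpolation of two tensors."""
    return (1.0 - t) * D1 + t * D2

def log_euclidean_interp(D1, D2, t):
    """Linear interpolation in the matrix-logarithm domain (assumed log-Euclidean scheme)."""
    return spd_exp((1.0 - t) * spd_log(D1) + t * spd_log(D2))

# Two hypothetical anisotropic tensors with equal determinant but different orientation
D1 = np.diag([4.0, 1.0, 0.25])
D2 = np.diag([0.25, 1.0, 4.0])

mid_eu = euclidean_interp(D1, D2, 0.5)
mid_le = log_euclidean_interp(D1, D2, 0.5)
det_eu = np.linalg.det(mid_eu)   # larger than both endpoint determinants (swelling)
det_le = np.linalg.det(mid_le)   # stays at the common endpoint determinant
print(det_eu, det_le)
```

In this example both endpoints have unit determinant; the Euclidean midpoint has a larger determinant, whereas the log-domain midpoint keeps it at one, at the cost of collapsing to an isotropic tensor, which also changes FA.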

In this work, we are interested in characterizing, describing, and representing non-stationary DT fields. To do this, we introduce a non-stationary GWP (NGWP) that combines different kernels. Introducing a non-stationary function means that the statistical properties of a DT field (mean, variance, and covariance) are not assumed to be constant over the spatial coordinates. For validation, we test the performance of the proposed model on three different datasets. We compare against log-Euclidean (LogEu) [5], feature-based linear interpolation (FBLI) [7], and GWP [9], evaluating two metrics defined over SPD matrices: the Frobenius (Frob) and Riemann (Riem) distances. Finally, we evaluate morphological properties by computing the mean squared error (MSE) for FA and MD.

The paper is arranged as follows: Sect. 2 presents the mathematical formulation of the NGWP model and the procedure for parameter estimation. Sect. 3 describes the datasets and the experimental setup. In Sect. 4, we show interpolation results for three different datasets. In Sect. 5, we present the main conclusions about the significance of the obtained results. Finally, in the acknowledgments, we thank the organizations that funded this work.

2 Non-stationary Generalized Wishart Processes

A generalized Wishart process (GWP) is a collection of symmetric and positive definite random matrices \(\left\{ \mathbf {D}_{n}(\mathbf {z}) \right\} _{n=1}^{N}\), where \(\mathbf {D} \in \mathbb {R}^{P \times P} \), indexed by an arbitrary input variable \( \mathbf {z} \in \mathbb {R}^{M} \) [10]. The idea is to assume a GWP as a prior over a DT field. Here, \( P = 3 \) is the dimensionality of the tensors and \( \mathbf {z} = \left[ x,y\right] ^{\top } \) corresponds to the coordinates of each voxel in an image. A GWP is constructed through a superposition of outer products of Gaussian processes (GPs), weighted by a \(P \times P\) scale matrix \(\mathbf {V}=\mathbf {LL}^\top \),

$$\begin{aligned} \mathbf {D}(\mathbf {z}) = \sum _{i =1}^{\nu } \mathbf {L} \mathbf {\hat{u}}_{i}(\mathbf {z}) \mathbf {\hat{u}}_{i}^{\top }(\mathbf {z}) \mathbf {L}^{\top } \sim \mathcal {GWP}_{P}(\nu , \mathbf {V}, k(\mathbf {z},\mathbf {z}')), \end{aligned}$$
(1)

where \( \mathbf {\hat{u}}_{i}(\mathbf {z}) = \left( u_{i1}(\mathbf {z}), u_{i2}(\mathbf {z}), u_{i3}(\mathbf {z}) \right) ^{\top } \), with \( u_{ip}(\mathbf {z}) \sim \mathcal {GP}(0,k) \), \( i = 1,...,\nu \) and \( p = 1, 2, 3 \); \( k(\mathbf {z},\mathbf {z}') \) is the kernel function of the GPs, and \( \mathbf {L} \) is the lower Cholesky factor of \( \mathbf {V}\). For each function \( u_{ip}(\mathbf {z}) \), the values at a set of inputs \( \left\{ \mathbf {z}_{n} \right\} _{n=1}^{N} \) follow a joint Gaussian distribution, \( \left( u_{ip}(\mathbf {z}_{1}), u_{ip}(\mathbf {z}_{2}),..., u_{ip}(\mathbf {z}_{N}) \right) ^{\top } \sim \mathcal {N}(\mathbf {0},\mathbf {K}) \), where \( \mathbf {K} \) is an \( N \times N \) Gram matrix with entries \(K_{m,n} = k(\mathbf {z}_{m},\mathbf {z}_{n}) \). This parametrization separates the contributions of the shape parameters (\(\mathbf {L}\)) and the spatial dynamics parameters (\(\mathbf {\hat{u}}_{i}\)). In particular, the parameter \( \mathbf {L} \) describes the expected tensor of \(\mathbf {D}(\mathbf {z})\), the degrees of freedom \( \nu \) control the model flexibility, and the kernel parameters \( \varvec{\theta } \) in \( k(\mathbf {z},\mathbf {z}') \) determine how the matrices change over the spatial coordinates \(\mathbf {z}\). Unlike the work in [9], where the diffusion tensors are modeled using a GWP with a stationary kernel, we introduce a non-stationary kernel function. The purpose is to model spatially the statistical properties (mean, variance, covariance) of a DT field. The kernel function applied in this approach was proposed in [11], and it makes it possible to describe changes of multidimensional surfaces. We call this model the non-stationary generalized Wishart process (NGWP).
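As an illustration of the construction in Eq. (1), the following minimal sketch draws one tensor field from a GWP prior. It uses a stationary RBF kernel for simplicity; the grid, the scale matrix, the jitter term, and the function names (`rbf_kernel`, `sample_gwp`) are assumptions made for this example only.

```python
import numpy as np

def rbf_kernel(Z1, Z2, gamma=5.0):
    """Unit-variance squared exponential kernel; gamma is an inverse width."""
    d2 = ((Z1[:, None, :] - Z2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def sample_gwp(Z, L, nu, kernel, rng, jitter=1e-6):
    """Draw one field D(z_n), n = 1..N, following Eq. (1)."""
    N, P = Z.shape[0], L.shape[0]
    K = kernel(Z, Z) + jitter * np.eye(N)      # jitter added for numerical stability
    C = np.linalg.cholesky(K)
    # u[i, p, :] is one draw of the GP u_ip evaluated at all N inputs
    u = np.einsum("nm,ipm->ipn", C, rng.standard_normal((nu, P, N)))
    # D(z_n) = sum_i L u_i(z_n) u_i(z_n)^T L^T
    Lu = np.einsum("pq,iqn->ipn", L, u)
    return np.einsum("ipn,iqn->npq", Lu, Lu)

rng = np.random.default_rng(0)
xs = np.linspace(0.0, 1.0, 5)
Z = np.array([[x, y] for x in xs for y in xs])   # 25 hypothetical voxel coordinates
V = np.diag([1.0, 0.5, 0.2])                     # hypothetical scale matrix
L = np.linalg.cholesky(V)
D = sample_gwp(Z, L, nu=4, kernel=rbf_kernel, rng=rng)
print(D.shape)
```

Each tensor is a sum of outer products, so it is symmetric and positive semi-definite by construction; with \( \nu \ge P \) it is positive definite with probability one.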

The non-stationary kernel is constructed by combining a set of different kernels \( \left\{ k_{i}(\mathbf {z},\mathbf {z'}) \right\} _{i=1}^{r} \), or the same kernel with different hyper-parameters [11]:

$$\begin{aligned} k(\mathbf {z},\mathbf {z'} ) = \sum _{i=1}^{r} \sigma (w_{i}(\mathbf {z}))k_{i}(\mathbf {z},\mathbf {z'})\sigma (w_{i}(\mathbf {z'})), \end{aligned}$$
(2)

where \(w_{i} (\mathbf {z}):\mathbb {R}^{M} \rightarrow \mathbb {R}^{1}\) is the weighting function, with \( M=3 \) the input dimension, \(w_{i}(\mathbf {z}) = \sum _{j=1}^{v} a_{j}\cos (\varvec{\omega }_{j}^{\top }\mathbf {z} + b_{j})\). The warping function \( \sigma : \mathbb {R}^{1}\rightarrow [0,1]\) is computed as a convex combination over the weighting functions, \( \sigma (w_{i}(\mathbf {z})) = \exp (w_{i}(\mathbf {z})) / \sum _{l=1}^{r} \exp (w_{l}(\mathbf {z}))\), so that \(\sum _{i=1}^{r} \sigma (w_{i}(\mathbf {z})) = 1 \), inducing a partial discretization over each kernel. This function induces non-stationarity, since it does not depend on the distance between the input variables \((\mathbf {z},\mathbf {z}')\). In the context of DT interpolation, we employ \(r=3\) squared exponential (RBF) kernels with different inverse width \( (\gamma ) \) hyper-parameters.
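A minimal sketch of the kernel in Eq. (2) is given next. It assumes that each of the \(r\) weighting functions has its own coefficients (stored as arrays `a[i, j]`, `omega[i, j]`, `b[i, j]`), that the inputs are two-dimensional voxel coordinates, and that the hyper-parameter values are drawn at random; none of these choices come from the original experiments.

```python
import numpy as np

def rbf(Z1, Z2, gamma):
    """Unit-variance squared exponential kernel with inverse width gamma."""
    d2 = ((Z1[:, None, :] - Z2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def weights(Z, a, omega, b):
    """w_i(z) = sum_j a[i, j] * cos(omega[i, j]^T z + b[i, j]); returns shape (r, N)."""
    phase = np.einsum("ijm,nm->ijn", omega, Z) + b[:, :, None]
    return (a[:, :, None] * np.cos(phase)).sum(axis=1)

def warp(W):
    """Softmax over the r kernels, so the warped weights sum to one at every input."""
    E = np.exp(W - W.max(axis=0, keepdims=True))
    return E / E.sum(axis=0, keepdims=True)

def nonstationary_kernel(Z1, Z2, a, omega, b, gammas):
    """Eq. (2): sum_i sigma(w_i(z)) k_i(z, z') sigma(w_i(z'))."""
    S1 = warp(weights(Z1, a, omega, b))   # shape (r, N1)
    S2 = warp(weights(Z2, a, omega, b))   # shape (r, N2)
    K = np.zeros((Z1.shape[0], Z2.shape[0]))
    for i, gamma in enumerate(gammas):
        K += S1[i][:, None] * rbf(Z1, Z2, gamma) * S2[i][None, :]
    return K

rng = np.random.default_rng(1)
r, v, M = 3, 2, 2                              # hypothetical sizes
a = rng.normal(size=(r, v))
omega = rng.normal(size=(r, v, M))
b = rng.uniform(0.0, 2.0 * np.pi, size=(r, v))
gammas = [0.5, 5.0, 50.0]                      # hypothetical inverse widths
Z = rng.uniform(size=(20, M))
K = nonstationary_kernel(Z, Z, a, omega, b, gammas)
```

Each term has the form \( \mathrm{diag}(\mathbf{s}_i)\,\mathbf{K}_i\,\mathrm{diag}(\mathbf{s}_i) \), so the resulting Gram matrix remains symmetric and positive semi-definite, while its diagonal is no longer constant across the inputs.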

2.1 Parameters Estimation

The scheme for parameter estimation is similar to the one used in previous works [9, 10], which is based on Bayesian inference employing Markov chain Monte Carlo (MCMC) algorithms. The aim is to compute the posterior distribution of the variables in the model, given a set of data \( \mathcal {D} = (\mathbf {S}(\mathbf {z}_{1}),...,\mathbf {S}(\mathbf {z}_{N})) \): a vector \( \mathbf {u} \) whose elements are the values of the GP functions, and the kernel parameters \( \varvec{\theta } = \left\{ a_{j}, \varvec{\omega }_{j}, b_{j}, \gamma _{i} \right\} \), where \( j=1,...,v \) and \( i=1,...,r \). As pointed out before, we assume that a DT field follows an NGWP prior, where the likelihood function is given by

$$\begin{aligned} p(\mathcal {D} | \mathbf {u}, \varvec{\theta }, \mathbf {L}, \nu ) \propto \prod _{i=1}^{N} \exp \left\{ -\dfrac{1}{2\beta ^2} \Vert \mathbf {S}(\mathbf {z}_{i}) - \mathbf {D}(\mathbf {z}_{i}) \Vert _{f}^{2} \right\} , \end{aligned}$$
(3)

where \(\mathbf {S}(\mathbf {z}_{i})\) are the tensors from the training set, \( \mathbf {D}(\mathbf {z}_{i})\) are the estimated tensors, \( ||\cdot ||_{f} \) is the Frobenius norm, and \(\beta ^2\) is a variance parameter. Previous works [9, 10] show that inference over the values of \( \mathbf {L} \) and the degrees of freedom \( \nu \) increases the computational cost and does not contribute significantly to the model performance. Instead, they suggest fixing \( \mathbf {L} \) as the average tensor of the training data and setting \( \nu = P+1 \). Accordingly, we sample from the posterior distribution of the parameters (\(\mathbf {u}\) and \(\varvec{\theta }\)) with an iterative procedure based on Gibbs sampling. The full conditional distributions are given by

$$\begin{aligned} p(\mathbf {u} | \varvec{\theta }, \mathbf {L}, \nu , \mathcal {D})&\propto p(\mathcal {D} | \mathbf {u}, \varvec{\theta }, \mathbf {L}, \nu ) p(\mathbf {u} | \varvec{\theta }) , \end{aligned}$$
(4)
$$\begin{aligned} p(\varvec{\theta }| \mathbf {u}, \mathbf {L}, \nu , \mathcal {D})&\propto p(\mathbf {u} | \varvec{\theta }) p(\varvec{\theta }) , \end{aligned}$$
(5)

where \( p(\mathbf {u} | \varvec{\theta }) = \mathcal {N}(\mathbf {0}, \mathbf {K}_{B}) \), \( \mathbf {K}_{B} \) is an \( NP\nu \times NP\nu \) block diagonal covariance matrix formed by \( P\nu \) blocks of \( N \times N \) matrices \( \mathbf {K} \), and \( p(\varvec{\theta }) \) is the prior for the kernel hyper-parameters. To sample from (4) and (5), we employ elliptical slice sampling [12] and the Metropolis-Hastings algorithm, respectively. Finally, we set the variance parameter \( \beta ^2 \) as the median of the squared Frobenius norm computed from the training data.
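The update of \( \mathbf {u} \) in Eq. (4) can be sketched with a generic elliptical slice sampling step combined with the likelihood of Eq. (3). This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the one-dimensional inputs, the stationary kernel, the identity scale factor, the value of `beta2`, and all function names are hypothetical.

```python
import numpy as np

def tensors_from_u(u, L):
    """Eq. (1): D(z_n) = sum_i L u_i(z_n) u_i(z_n)^T L^T, with u of shape (nu, P, N)."""
    Lu = np.einsum("pq,iqn->ipn", L, u)
    return np.einsum("ipn,iqn->npq", Lu, Lu)

def log_lik(u, S, L, beta2):
    """Logarithm of Eq. (3), up to an additive constant."""
    return -0.5 / beta2 * ((S - tensors_from_u(u, L)) ** 2).sum()

def elliptical_slice(u, sample_prior, log_lik_fn, rng):
    """One elliptical slice sampling update for a zero-mean Gaussian prior."""
    aux = sample_prior()                                   # auxiliary prior draw
    log_y = log_lik_fn(u) + np.log(1.0 - rng.uniform())    # slice height
    theta = rng.uniform(0.0, 2.0 * np.pi)
    lo, hi = theta - 2.0 * np.pi, theta
    while True:
        prop = u * np.cos(theta) + aux * np.sin(theta)
        if log_lik_fn(prop) > log_y:
            return prop
        # Shrink the angle bracket towards the current state (theta = 0)
        if theta < 0.0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)

rng = np.random.default_rng(0)
nu, P, N = 4, 3, 6
z = np.arange(N, dtype=float)
K = np.exp(-0.5 * (z[:, None] - z[None, :]) ** 2) + 1e-6 * np.eye(N)
C = np.linalg.cholesky(K)
L = np.eye(P)       # hypothetical scale factor
beta2 = 0.1         # hypothetical variance parameter

def sample_prior():
    # Equivalent to a draw from N(0, K_B): every u_ip shares the same Gram matrix K
    return np.einsum("nm,ipm->ipn", C, rng.standard_normal((nu, P, N)))

S = tensors_from_u(sample_prior(), L)   # synthetic "training" tensors
u = sample_prior()
trace = [log_lik(u, S, L, beta2)]
for _ in range(200):
    u = elliptical_slice(u, sample_prior, lambda w: log_lik(w, S, L, beta2), rng)
    trace.append(log_lik(u, S, L, beta2))
```

Because \( \mathbf {K}_{B} \) is block diagonal with identical blocks, the prior draw is generated block by block from a single Cholesky factor instead of building the full \( NP\nu \times NP\nu \) matrix.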

2.2 DTI Field Interpolation

The aim is to estimate a matrix \( \mathbf {D}(\mathbf {z}_{*}) \) at a test point \( \mathbf {z}_{*} \), employing the parameters learned during the training stage. We compute the conditional distribution of \( \mathbf {u}_{*} \) given \( \mathbf {u} \) from the joint distribution of \( \left[ \mathbf {u}, \mathbf {u}_{*} \right] ^{\top } \). This distribution is given by [10],

$$\begin{aligned} \mathbf {u}_{*}|\mathbf {u} \sim \mathcal {N}(\mathbf {A}\mathbf {K}_{B}^{-1}\mathbf {u}, \mathbf {I} - \mathbf {A}\mathbf {K}_{B}^{-1}\mathbf {A}^{\top }), \end{aligned}$$
(6)

where \( \mathbf {A} \) is the covariance matrix between the spatial coordinates \( \mathbf {z}_{*} \) of the test data and \( \mathbf {z} \) of the training data. Once we obtain the values of \( \mathbf {u}_{*} \) from Eq. (6), we compute the matrix \( \mathbf {D}(\mathbf {z}_{*}) \) using Eq. (1).
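The prediction step can be sketched as follows. The example plugs the conditional mean of Eq. (6) into Eq. (1) instead of sampling \( \mathbf {u}_{*} \), which is a simplifying assumption; it also exploits the block structure of \( \mathbf {K}_{B} \) by applying the same weights to every GP. The kernel, the grid, the stand-in values of `u`, and the function names are hypothetical.

```python
import numpy as np

def rbf(Z1, Z2, gamma=1.0):
    """Unit-variance squared exponential kernel."""
    d2 = ((Z1[:, None, :] - Z2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def predict_tensor(u, Z, z_star, L, kernel, jitter=1e-6):
    """Plug-in tensor at z_star from GP values u of shape (nu, P, N)."""
    K = kernel(Z, Z) + jitter * np.eye(Z.shape[0])
    a = kernel(z_star[None, :], Z)                 # (1, N) cross-covariance
    w = np.linalg.solve(K, a.T).ravel()            # K^{-1} a^T
    u_mean = u @ w                                 # (nu, P): conditional mean of each u_ip(z*)
    # Conditional variance; equals 1 - a K^{-1} a^T for a unit-variance kernel
    var = kernel(z_star[None, :], z_star[None, :])[0, 0] - a.ravel() @ w
    Lu = u_mean @ L.T                              # row i holds (L u_i(z*))^T
    return Lu.T @ Lu, var

rng = np.random.default_rng(0)
Z = np.array([[x, y] for x in range(3) for y in range(3)], dtype=float)
nu, P = 4, 3
u = rng.standard_normal((nu, P, Z.shape[0]))       # stand-in for posterior GP values
L = np.linalg.cholesky(np.diag([1.0, 0.5, 0.2]))   # hypothetical scale factor
D_star, var_star = predict_tensor(u, Z, np.array([0.5, 0.5]), L, rbf)
```

The returned matrix is a sum of outer products, so the interpolated tensor stays symmetric and positive semi-definite regardless of the test location.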

3 Datasets and Experimental Procedure

We test the proposed model on three datasets. First, a 2D toy DT field of \(41\times 41\) voxels obtained from a generative NGWP. Second, a synthetic field of \(29 \times 29\) tensors computed from a simulation of crossing fibers using the FanDTasia Toolbox [13]. Third, a real dMRI study acquired from the head of a healthy male subject aged between 20 and 30 years, on a General Electric Signa HDxt 3.0T MR scanner using a body coil for excitation, employing 25 gradient directions with a b-value of \( 1000\,\mathrm{s/mm}^{2} \). The DT field is a region of interest of \( 41 \times 41 \) voxels from a slice centered in the corpus callosum. We downsample the original datasets by a factor of two to obtain the training sets. The rest of the data are used as ground truth (gold standard). For validation, we compare against log-Euclidean (LogEu) [5], feature-based linear interpolation (FBLI) [7], and GWP [9]. We compute two error metrics: the Frobenius distance \((\mathrm {Frob})\) and the Riemann distance \((\mathrm {Riem})\) [5],

$$\begin{aligned} \mathrm {Fr}(\mathbf D _{1}, \mathbf D _{2})&= \sqrt{\mathrm {trace}\left[ \left( \mathbf D _{1} - \mathbf D _{2} \right) ^{\top } \left( \mathbf D _{1} - \mathbf D _{2} \right) \right] },\end{aligned}$$
(7)
$$\begin{aligned} \mathrm {Ri}(\mathbf D _{1}, \mathbf D _{2})&= \sqrt{\mathrm {trace}\left[ \log (\mathbf D _{1}^{-1/2} \mathbf D _{2} \mathbf D _{1}^{-1/2} )^{\top } \log (\mathbf D _{1}^{-1/2} \mathbf D _{2} \mathbf D _{1}^{-1/2} ) \right] }, \end{aligned}$$
(8)

where \(\mathbf {D}_{1}\) and \(\mathbf {D}_{2}\) are the interpolated and the ground-truth tensors, respectively. Additionally, we evaluate morphological properties of the estimated tensors using fractional anisotropy (FA) error maps and by computing the mean squared error (MSE) for the FA and mean diffusivity (MD). The reader can find detailed information about FA and MD in [7].
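For reference, the metrics in Eqs. (7) and (8) and the scalar maps can be computed as in the sketch below. It assumes the standard eigenvalue-based definitions of FA and MD (the precise definitions used here are those of [7]); the function names and the test tensors are illustrative.

```python
import numpy as np

def frobenius_distance(D1, D2):
    """Eq. (7): Frobenius norm of the difference."""
    return np.linalg.norm(D1 - D2, ord="fro")

def riemann_distance(D1, D2):
    """Eq. (8): norm of log(D1^{-1/2} D2 D1^{-1/2}), computed from its eigenvalues."""
    w1, V1 = np.linalg.eigh(D1)
    D1_inv_sqrt = (V1 / np.sqrt(w1)) @ V1.T
    M = D1_inv_sqrt @ D2 @ D1_inv_sqrt
    lam = np.linalg.eigvalsh(0.5 * (M + M.T))   # symmetrize against round-off
    return np.sqrt((np.log(lam) ** 2).sum())

def mean_diffusivity(D):
    """MD: average of the three eigenvalues, i.e. trace / 3."""
    return np.trace(D) / 3.0

def fractional_anisotropy(D):
    """FA: normalized dispersion of the eigenvalues around their mean."""
    lam = np.linalg.eigvalsh(D)
    md = lam.mean()
    return np.sqrt(1.5 * ((lam - md) ** 2).sum() / (lam ** 2).sum())

def mse(estimated, reference):
    """Mean squared error between two arrays of scalar measures."""
    estimated, reference = np.asarray(estimated), np.asarray(reference)
    return ((estimated - reference) ** 2).mean()

I3 = np.eye(3)
D2 = np.diag([np.e, 1.0, 1.0])          # hypothetical tensor
print(frobenius_distance(I3, D2), riemann_distance(I3, D2))
print(fractional_anisotropy(I3), mean_diffusivity(np.diag([3.0, 2.0, 1.0])))
```

An isotropic tensor has FA equal to zero, and FA approaches one as the diffusion concentrates along a single direction.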

4 Results and Discussion

4.1 Synthetic Crossing Fibers Data

We test the proposed model (NGWP) and the comparison methods (LogEu, FBLI, and GWP) on a simulation of crossing fibers. Graphical results of the interpolation are illustrated in Fig. 1: (a) ground truth, (b) training data, (c) LogEu, (d) FBLI, (e) GWP, and (f) NGWP. We also report the MSE maps of MD in Fig. 2. Finally, Table 1 reports the numerical error results.

Fig. 1. Crossing fibers interpolation for the comparison methods: (a) Ground-truth, (b) Training data, (c) LogEu, (d) FBLI, (e) GWP, and (f) NGWP.

Fig. 2. MSE maps of MD of the interpolated crossing fibers DT field, (a) LogEu, (b) FBLI, (c) GWP and (d) NGWP. (Color figure online)

Table 1. Error metrics for the interpolation methods: Frobenius distance (Frob), Riemann distance (Riem), and MSE of the FA and MD.

Crossing fibers pose a challenging interpolation problem because the properties of the tensors vary abruptly across space. The MSE maps of MD shown in Fig. 2 indicate that the proposed method preserves the clinical information of the diffusion tensors with lower error (blue) than the comparison methods, mainly over the abrupt transition regions. We attribute this to the fact that the non-stationary kernel used in the NGWP model is constructed by combining different kernels, each of which can describe a particular region of the whole field. Finally, in Table 1, we report the mean and standard deviation of the Frob distance, the Riem distance, and the MSE of the FA and MD. Statistically, there are no significant differences among the compared methods. Nevertheless, our proposal is a suitable methodology for describing, representing, and interpolating complex tensor data such as crossing fibers. Also, the NGWP preserves the clinical information (FA and MD) with high accuracy.

4.2 Toy Data

Second, we evaluate the performance on a toy DT field obtained by sampling from the NGWP model. This field has smooth regions and abrupt changes. Figure 3(a) and (b) correspond to the ground truth and the training data. Figure 3(c), (d), (e), and (f) show the fields interpolated with LogEu, FBLI, GWP, and NGWP, respectively.

Fig. 3. Interpolation of a toy DT field, (a) Ground truth, (b) Training data. Interpolated fields: (c) LogEu, (d) FBLI, (e) GWP, and (f) NGWP.

Figure 4 shows the MSE maps of the MD for each interpolation method. We also evaluate the error metrics of the interpolated fields and their clinical information; these results are shown in Table 2.

Fig. 4. MSE maps of MD of the interpolated toy DT field, (a) LogEu, (b) FBLI, (c) GWP and (d) NGWP.

Table 2. Error metrics of the interpolation methods: Frobenius distance (Frob), Riemann distance (Riem), and MSE of the FA and MD properties.

The toy DT field has regions where the tensor properties (size, shape, and orientation) change slowly and other areas where the changes are abrupt. The proposed NGWP demonstrates that a single model can adapt to different types of tensor fields, whether smooth or complex, as we show in Fig. 3. Additionally, the proposed method is able to preserve the clinical information when a DT field is interpolated: a closer look at Fig. 4 shows that the MD is preserved with higher accuracy than with the comparison methods. The error metrics reported in Table 2 show that NGWP can interpolate the toy data accurately, with a precision similar to that of the state-of-the-art methods. We believe that the non-stationary kernel of our model provides adaptability to the different transitions (smooth or strong) inherent to diffusion tensor data.

4.3 Real Data

Finally, we evaluate the performance of the interpolation methods on a real DT field. Figure 5(a) corresponds to a region of interest of \( 41 \times 41 \) tensors from a slice centered in the corpus callosum. Figure 5(b) is the training data. Figure 5(c), (d), (e), and (f) are the fields interpolated with LogEu, FBLI, GWP, and NGWP, respectively. Also, Fig. 6 shows the MSE maps of MD for each method. In Table 3, we report the numerical results of the error metrics.

Fig. 5. Interpolation of a real DT field, (a) Ground truth, (b) Training data. Interpolated fields: (c) LogEu, (d) FBLI, (e) GWP, and (f) NGWP.

Fig. 6. MSE maps of MD of the interpolated real DT field, (a) LogEu, (b) FBLI, (c) GWP and (d) NGWP.

Table 3. Error metrics of the interpolation methods: Frobenius distance (Frob), Riemann distance (Riem), and MSE of the FA and MD properties.

A real DT field has tensors with different sizes, shapes, and orientations. These properties make accurate interpolation difficult. The graphical results of Figs. 5 and 6 and the error metrics of Table 3 show that NGWP can describe, represent, and interpolate non-stationary tensor fields obtained from real dMRI studies. The proposed method reaches a performance similar to that of LogEu, FBLI, and GWP. Again, the NGWP preserves the clinical information derived from the dMRI, as we show in Fig. 6, where we computed the MSE maps of MD. Moreover, the error metrics are reported in Table 3. From these results, we can establish that NGWP is a competitive methodology for the interpolation of DT fields.

5 Conclusions and Future Work

In this work, we presented a probabilistic methodology for the interpolation of diffusion tensor fields. Specifically, we model a DT field as a stochastic process defined over SPD matrices, called the non-stationary generalized Wishart process (NGWP). The idea is to describe non-stationary properties of DT fields. To do this, we introduce a non-stationary kernel that combines different functions. In particular, we combine \(r=3\) squared exponential (RBF) kernels with different length-scale hyper-parameters. We evaluated the performance of the proposed method using the Frobenius and Riemann distances over synthetic and real DT fields. We also evaluated the clinical information using error maps of mean diffusivity (MD) and by reporting the mean and standard deviation of the MSE of fractional anisotropy (FA). The outcomes demonstrate that NGWP is a competitive methodology for interpolating DT fields in comparison with state-of-the-art methods.

As future work, we would like to extend non-stationary kernel functions to more complex models such as tractography procedures where the interpolation of diffusion tensors is used.