1 Introduction

Diffusion Tensor Imaging (DTI) [1, 2] has become the de facto standard today in diffusion MRI (dMRI) for investigating the complex microstructure of the cerebral white matter in-vivo and non-invasively. Its tremendous popularity is due to its simplicity in acquisition requisites and elegance in interpretation, which makes it easy to implement the technique and infer the white matter microstructure, in particular the underlying fiber orientations. Based on Fick’s phenomenological anisotropic diffusion equation, the DTI signal for the diffusion gradient G, is described by the modified Stejskal-Tanner equation parameterized by the second order diffusion tensor D [3]:

$$\displaystyle{ S = S_{0}\exp \left (-b\mathbf{g}^{T}\mathbf{D}\mathbf{g}\right ), }$$
(1)

where \(b =\gamma ^{2}\delta ^{2}g^{2}\left (\varDelta -\frac{\delta }{3}\right )\), \(g = \vert \mathbf{G}\vert \), and \(\mathbf{g} = \mathbf{G}/\vert \mathbf{G}\vert \). Here \(\gamma \) is the gyromagnetic ratio, \(\delta \) is the duration of the diffusion gradient pulses, and \(\varDelta \) is the time between them. In DTI, the apparent diffusion coefficient (ADC) is modeled by the spherical function \(D(\mathbf{g}) = \mathbf{g}^{T}\mathbf{D}\mathbf{g}\). However, in spite of its usefulness, it is well known that DTI is inherently limited in regions with heterogeneous fiber distributions, such as fiber crossings. In such regions DTI can neither accurately model the complex shape of the resulting ADC, nor correctly infer the underlying fiber bundle layout.

Generalized DTI (GDTI) [4] was proposed to overcome this limitation by modeling the complex shaped ADC with greater accuracy using Cartesian tensors of order higher than two, the so-called higher order (diffusion) tensors (HOTs). GDTI, like DTI, is based on Fick’s phenomenological laws of diffusion, but the diffusion tensor is replaced by a spherical diffusion function parameterized by a HOT, namely the projection of the HOT on to the unit sphere. The GDTI signal for the diffusion gradient G is similarly described by:

$$\displaystyle{ S = S_{0}\exp \left (-bD(\mathbf{g})\right ),\quad D(\mathbf{g}) =\sum _{ j_{1}=1}^{3}\sum _{ j_{2}=1}^{3}\ldots \sum _{ j_{k}=1}^{3}D_{ j_{1},j_{2}\ldots j_{k}}g_{j_{1}}g_{j_{2}}\ldots g_{j_{k}}, }$$
(2)

where, \(D_{j_{1},j_{2}\ldots j_{k}}\) are the coefficients of the kth order, three dimensional, diffusion HOT \(\mathcal{D}^{(k)}\), and \(g_{j_{i}}\) are the components of the unit gradient vector g. The complex shaped ADC is described in GDTI by D(g). Since g is a unit norm vector, it can also be described by the two parameters θ ∈ [0, π] and ϕ ∈ [0, 2π) as \(\mathbf{g} = [\sin \theta \cos \phi,\sin \theta \sin \phi,\cos \theta ]^{T} = [g_{x},g_{y},g_{z}]^{T}\), which shows that the ADC or the spherical diffusion function is the projection of \(\mathcal{D}^{(k)}\) on to the unit sphere.

This form of the diffusion function helps derive certain properties of the diffusion HOT which greatly simplify the GDTI model [4]. First, when k is odd, D(−g) = −D(g). Since negative diffusion is non-physical, this implies that k must be even, i.e. only even ordered HOTs are of interest in modeling the ADC. Second, although a kth order 3D HOT can have \(3^{k}\) independent coefficients, only its projection along a vector g is of interest, so \(\mathcal{D}^{(k)}\) can be taken to be symmetric: its coefficients are equal under any permutation σ of the coefficient indices, \(D_{j_{1},j_{2}\ldots j_{k}} = D_{\sigma (j_{1},j_{2}\ldots j_{k})}\). This reduces the number of independent coefficients of the kth order HOT to a more tractable:

$$\displaystyle{ N_{k} = \frac{(k + 1)(k + 2)} {2}. }$$
(3)

In other words, to describe the ADC more accurately using GDTI, it is required to estimate from the diffusion signal the coefficients of a 3D symmetric HOT of even rank, such that the diffusion function or the estimated ADC is positive.
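The two properties above can be checked numerically. The following is a minimal sketch, assuming NumPy and randomly generated tensors; the helper names `adc` and `n_independent` are hypothetical and not taken from [4]. It evaluates Eq. (2) by contracting every index with g, confirms the sign behaviour for odd and even k, and counts the index multisets that remain independent under full symmetry.

```python
import itertools
import numpy as np

def adc(D, g):
    """Evaluate Eq. (2): contract all k indices of the order-k array D with g."""
    out = D
    for _ in range(D.ndim):
        out = out @ g  # matmul with a 1-D array contracts the last axis
    return float(out)

def n_independent(k):
    """Count index multisets j1 <= ... <= jk drawn from three axis labels."""
    return sum(1 for _ in itertools.combinations_with_replacement(range(3), k))

rng = np.random.default_rng(0)
g = rng.normal(size=3)
g /= np.linalg.norm(g)  # unit gradient direction

D3 = rng.normal(size=(3, 3, 3))     # arbitrary order-3 tensor (illustrative)
D4 = rng.normal(size=(3, 3, 3, 3))  # arbitrary order-4 tensor (illustrative)

odd_flips = np.isclose(adc(D3, -g), -adc(D3, g))  # D(-g) = -D(g) for odd k
even_keeps = np.isclose(adc(D4, -g), adc(D4, g))  # D(-g) =  D(g) for even k
counts = {k: n_independent(k) for k in (2, 4, 6)}
```

For k = 2, 4, 6 the multiset count agrees with Eq. (3), giving 6, 15 and 28 coefficients respectively.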

The independent coefficients of the kth order diffusion HOT are in practice estimated using the least squares (LS) approach [4], in a fashion almost identical to the approach for estimating the six coefficients of the diffusion tensor in DTI. The LS approach is fast, since it involves only linear operations, but it does not guarantee that the estimated HOT will result in a positive diffusion function, even when k is even. In other words, the very reason for restricting k to even values, namely that the estimated ADC should be positive, is not satisfied by the LS estimation.

In this chapter we present two approaches for estimating, in particular 4th order, diffusion HOTs from the diffusion signal that guarantee that the estimated ADC or the diffusion function is positive. In the first method, we take recourse to the fact that 3D symmetric 4th order tensors can be rewritten through a mapping as 6D symmetric 2nd order tensors. This makes it possible to reformulate the problem of estimating a 4th order tensor with a positive diffusion profile, to a problem of estimating a 2nd order tensor with a positive diffusion profile, albeit in 6D. We solve this problem by applying the Riemannian framework developed for symmetric positive definite (SPD) tensors of order 2, for estimating DTI diffusion tensors with positive diffusion profiles.

In the second method, we base ourselves on the polynomial interpretation of HOTs. Therefore, the diffusion function D(g) is re-interpreted as a homogeneous polynomial in the components of the unit norm gradient vector g. This allows for a powerful parameterization of the diffusion signal, which ensures that the estimation process guarantees a 4th order HOT with a positive diffusion profile. This parameterization comes from the properties of ternary quartics, which was first pointed out in [5, 6]. Also it has been proposed in [7] that the affine invariant Riemannian metric may not be well suited for diffusion data. The polynomial parameterization, therefore, provides an alternative approach for estimating 4th order diffusion tensors with positive diffusion profiles, which employs the Euclidean metric that is better suited for handling diffusion data [7].

We note that solutions to the problem of estimating arbitrary even ordered HOTs with the positivity constraint have also been proposed in [20] and [8]. These methods and the contents of this chapter are briefly summarized in chapter “Higher-Order Tensors in Diffusion Imaging”. However, in this chapter we present in greater detail the particular problem of estimating 4th order tensors with the positivity constraint, since 4th order tensors commonly appear in many problems, such as Diffusion Kurtosis Imaging (DKI: see again chapter “Higher-Order Tensors in Diffusion Imaging”). The importance of the methods presented here is highlighted by the fact that they have recently been used to estimate 4th order kurtosis tensors with a positivity constraint [9].

This chapter is structured as follows. Section 2 is devoted to the Riemannian approach. Sections 2.1 and 2.2 present the algebra of 2nd and 4th order tensors which allow us to formulate the Riemannian framework. The Riemannian estimation scheme is put together in Sect. 2.3. Section 3 is devoted to the ternary quartic approach, with first the theory and then the algorithm in Sect. 3.3. Experiments and results are described and discussed in Sect. 4. We conclude in Sect. 5.

2 A Riemannian Approach for Symmetric Positive Definite Fourth Order Diffusion Tensors

The problem of estimating, from the signal, a diffusion tensor with a positive diffusion profile has been extensively considered in DTI. Negative diffusion, which is non-physical, can also arise while estimating a 2nd order diffusion tensor D; this happens when the DTI-ADC \(\mathbf{g}^{T}\mathbf{D}\mathbf{g} < 0\) for some gradient direction g. This can occur since the LS estimation process doesn’t guarantee that the diffusion tensor will have a positive diffusion profile. Enforcing positivity requires a dedicated mathematical framework which constrains the estimation process to only diffusion tensors D such that \(\mathbf{g}^{T}\mathbf{D}\mathbf{g} > 0,\ \forall \mathbf{g} \in S^{2}\).

An adequate framework for such an estimation was proposed by identifying the appropriate set of 2nd order tensors that satisfy the positive quadratic form, namely \(\mathcal{S}\mathit{ym}_{n}^{+}\), the set of SPD matrices \(\varSigma \in \mathcal{S}\mathit{ym}_{n}^{+}\), which satisfy \(\mathbf{x}^{T}\varSigma \mathbf{x} > 0,\ \forall \mathbf{x} \in \mathbf{R}^{n}\setminus \{\mathbf{0}\}\). In other words, if the estimation process were to only operate in the space of \(\mathcal{S}\mathit{ym}_{3}^{+}\) (in the case of DTI, n = 3), then the estimated diffusion tensor would satisfy the positive diffusion profile. The mathematical framework that was proposed to do this consists of an affine invariant metric on \(\mathcal{S}\mathit{ym}_{n}^{+}\), the Riemannian metric [10–13], and a similarity invariant metric on \(\mathcal{S}\mathit{ym}_{n}^{+}\), the Log-Euclidean metric [14], which naturally confine operations on SPD matrices to the space of \(\mathcal{S}\mathit{ym}_{n}^{+}\).

Deriving an equivalent Riemannian metric for the space of 4th order diffusion tensors would, however, be far more involved due to the increase in order or the multi-linear property of HOTs. Nonetheless, such a metric would be the right framework to use in the estimation process of the 4th order diffusion tensor, since it would ensure that the estimated HOT satisfies the positive diffusion profile. However, given the symmetry condition of a diffusion HOT, this problem can be simplified by reformulating the diffusion profile of a 4th order HOT (Eq. (2)) to a bilinear form dependent on a 2nd order tensor. Mathematically, this would convert the problem to the case of estimating a 2nd order tensor in \(\mathcal{S}\mathit{ym}_{n}^{+}\), like in DTI. However, the conversion from a symmetric 4th order 3D tensor results in a symmetric 2nd order tensor in 6D [15–17]. Therefore, we would have to consider the space of \(\mathcal{S}\mathit{ym}_{6}^{+}\) instead of the space of \(\mathcal{S}\mathit{ym}_{3}^{+}\).

In this section, we propose to use this approach of transforming a symmetric 3D 4th order Cartesian diffusion tensor to a symmetric 6D 2nd order tensor, and of applying the Riemannian metric of the space \(\mathcal{S}\mathit{ym}_{6}^{+}\), to estimate a 4th order diffusion tensor from the signal with a positive diffusion profile in GDTI [18].

2.1 Algebra of Second Order Tensors

To understand the algebra of 4th order tensors, which is required to manipulate these entities, and to transform them to isometrically equivalent 2nd order tensors, we start with 2nd order tensors, which are well studied and intuitively easy to understand. Much of the following formulation of Cartesian 2nd and 4th order tensors in an Euclidean space can be found in [15, 16], where, essentially a tensor is used interchangeably with the matrix of a linear transformation.

Given an n dimensional inner product space (vector space with an inner product) V, an nD 2nd order tensor \(\mathbf{A} = \mathcal{A}^{(2)}\) is defined as the n × n matrix of the linear transformation:

$$\displaystyle{ A: V \rightarrow V,\quad \mathrm{st}\quad \mathbf{x} \rightarrow \mathbf{A}\mathbf{x},\ \mathbf{x} \in V. }$$
(4)

The transpose of the linear transformation, with matrix \(\mathbf{A}^{T}\), can be defined from the inner product of V as \(\left < \mathbf{x},\mathbf{A}^{T}\mathbf{y}\right > = \left < \mathbf{A}\mathbf{x},\mathbf{y}\right >,\quad \forall \mathbf{x},\mathbf{y} \in V.\) The space of linear transformations from V to V itself forms a vector space, which can be called Lin(V ) = { A: V → V }. The transpose of A can be used to define a natural inner product on Lin(V ) (summation over repeated indices over their whole range):

$$\displaystyle{ \left < \mathbf{A},\mathbf{B}\right >:= \mathit{tr}(\mathbf{A}^{T}\mathbf{B}) = A_{\mathit{ ij}}B_{\mathit{ij}},\quad A,B \in \mathrm{ Lin}(V ). }$$
(5)

If V is \(\mathbf{R}^{n}\), then Lin(V ) is \(\mathbf{R}^{n\times n}\), and it is isomorphic to \(\mathbf{R}^{n^{2} }\). Therefore a tensor A in \(\mathbf{R}^{n\times n}\) can be written as a vector a in \(\mathbf{R}^{n^{2} }\). Furthermore, the isomorphism is an isometry, since \(\left < \mathbf{a},\mathbf{b}\right > = \left < \mathbf{A},\mathbf{B}\right >,\) where the first inner product is the natural inner product of the vector space \(\mathbf{R}^{n^{2} }\), and the second inner product is the newly defined inner product of Lin(V ) = \(\mathbf{R}^{n\times n}\).

A symmetric linear transformation A from V to V, can be defined from the transpose of its corresponding 2nd order tensor, as A = A T, which in terms of its components can be described by \(A_{\mathit{ij}} = A_{\mathit{ji}}\). It is then possible to decompose a 2nd order tensor (or linear transformation) into its symmetric and skew-symmetric parts by \(\mathbf{A}^{s} = (\mathbf{A} + \mathbf{A}^{T})/2\) and \(\mathbf{A}^{a} = (\mathbf{A} -\mathbf{A}^{T})/2\) respectively, such that \(\mathbf{A} = \mathbf{A}^{s} + \mathbf{A}^{a}\).

Finally the space of symmetric linear transformations \(\mathrm{Sym}(V ) =\{ A \in \mathrm{ Lin}(V )\vert \mathbf{A} = \mathbf{A}^{T}\}\) forms a subspace of Lin(V ). Since an nD symmetric 2nd order tensor has n(n + 1)∕2 independent coefficients, if V is \(\mathbf{R}^{n}\), then Sym(V ) is isomorphic to \(\mathbf{R}^{n(n+1)/2}\), and this mapping can be established in such a fashion that it is also an isometry, just like in the case of Lin(V ), or \(\left < \mathbf{a}_{s},\mathbf{b}_{s}\right > = \left < \mathbf{A}^{s},\mathbf{B}^{s}\right >\), for \(\mathbf{a}_{s},\mathbf{b}_{s} \in \mathbf{R}^{n(n+1)/2}\) and \(A^{s},B^{s} \in \mathrm{ Sym}(V )\). An example of such an isometric mapping when n = 3 can be established between a symmetric 3D 2nd order tensor B, and b, a vector or a 6D 1st order tensor:

$$\displaystyle{ \mathbf{b} = [B_{11},B_{22},B_{33},\sqrt{2}B_{12},\sqrt{2}B_{13},\sqrt{2}B_{23}]^{T}, }$$
(6)

where \(B_{\mathit{ij}}\) are the coefficients of B.
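The role of the \(\sqrt{2}\) factors in Eq. (6) is easy to verify numerically. The sketch below, assuming NumPy and two randomly generated symmetric matrices (the function name `sym_to_vec6` is hypothetical), compares the inner product of Eq. (5) with the ordinary dot product of the 6D vectors.

```python
import numpy as np

def sym_to_vec6(B):
    """Eq. (6): map a symmetric 3x3 tensor to a 6D vector."""
    r2 = np.sqrt(2.0)
    return np.array([B[0, 0], B[1, 1], B[2, 2],
                     r2 * B[0, 1], r2 * B[0, 2], r2 * B[1, 2]])

rng = np.random.default_rng(1)
M, N = rng.normal(size=(2, 3, 3))
A_s = 0.5 * (M + M.T)  # symmetric part of a random matrix
B_s = 0.5 * (N + N.T)

frob = np.trace(A_s.T @ B_s)                # inner product of Eq. (5)
dot6 = sym_to_vec6(A_s) @ sym_to_vec6(B_s)  # inner product in R^6
```

Each off-diagonal coefficient appears twice in \(A_{\mathit{ij}}B_{\mathit{ij}}\) but only once in the 6D vector, which is what the factor \(\sqrt{2}\) compensates for.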

2.2 Algebra of Fourth Order Tensors

The background for understanding the algebra of 4th order tensors is formed by the definition of the inner product, the isometric mapping to vectors (1st order tensors) of higher dimensions, and the symmetry properties, in particular Sym(V ), of the space of 2nd order tensors or Lin(V ). In an analogous way, we will define 4th order tensors as linear transformations from a vector space onto itself, define an inner product for the vector space of these linear transformations, study their symmetries, and establish an isometric mapping from the linear transformations to a vector space of lower order and higher dimension, which will allow us to manipulate 4th order tensors as 2nd order tensors.

The algebra of 4th order tensors can be described by proceeding in exactly the same way as above for 2nd order tensors, but with Lin(V ) as the vector space in place of V. Let an nD 4th order tensor \(\hat{\mathcal{A}} = \mathcal{A}^{(4)}\) be defined as the n × n × n × n transformation array of the linear transformation (summation over repeated indices over their whole range):

$$\displaystyle{ \mathrm{A}:\mathrm{ Lin}(V ) \rightarrow \mathrm{ Lin}(V ),\quad \mathrm{st}\quad \mathbf{C} \rightarrow \hat{\mathcal{A}}\mathbf{C} = A_{\mathit{ijkl}}C_{\mathit{kl}},\ \mathbf{C} \in \mathrm{ Lin}(V ). }$$
(7)

Since an inner product for Lin(V ) exists, it can be used to define the transpose of the linear transformation, with the transformation array \(\hat{\mathcal{A}}^{T}\), as:

$$\displaystyle{ \left < \mathbf{D},\hat{\mathcal{A}}^{T}\mathbf{C}\right > = \left < \hat{\mathcal{A}}\ \mathbf{D},\mathbf{C}\right >,\quad \forall \mathbf{C},\mathbf{D} \in \mathrm{ Lin}(V ). }$$
(8)

Again the space of linear transformations from Lin(V ) to Lin(V ) forms a vector space, which can be called \(\mathcal{L}\mathit{in}(V ) =\{\mathrm{ A}:\mathrm{ Lin}(V ) \rightarrow \mathrm{ Lin}(V )\}\), and again the transpose of A can be used to define an inner product on \(\mathcal{L}\mathit{in}(V )\) (summation over repeated indices over their whole range):

$$\displaystyle{ \left < \hat{\mathcal{A}},\hat{\mathcal{B}}\right >:= \mathit{tr}(\hat{\mathcal{A}}^{T}\hat{\mathcal{B}}) = A_{\mathit{ ijkl}}B_{\mathit{ijkl}},\quad \hat{\mathcal{A}},\hat{\mathcal{B}}\in \mathcal{L}\mathit{in}(V ). }$$
(9)

If V is \(\mathbf{R}^{n}\), then Lin(V ) is \(\mathbf{R}^{n\times n}\), and \(\mathcal{L}\mathit{in}(V )\) is \(\mathbf{R}^{n\times n\times n\times n}\), which is isomorphic to \(\mathbf{R}^{n^{4} }\). Therefore an nD 4th order tensor can be written as a vector in \(\mathbf{R}^{n^{4} }\). However, of greater interest is that \(\mathcal{L}\mathit{in}(V )\) is also isomorphic to \(\mathbf{R}^{n^{2}\times n^{2} }\), which implies that an nD 4th order tensor \(\mathcal{A}\) can be written as an \(n^{2}\)D 2nd order tensor A. Furthermore, this isomorphism is also an isometry \(\left < \mathbf{A},\mathbf{B}\right > = \left < \hat{\mathcal{A}},\hat{\mathcal{B}}\right >.\)

Symmetries of 4th order tensors present a richer set of possibilities than the symmetry of 2nd order tensors, since a number of symmetries can be defined by applying different “symmetry rules” on the four coefficient indices. Indeed, we shall present the major symmetry, the minor symmetry and the total symmetry. Total symmetry is, however, the symmetry of interest to us, which in the mathematical approach to tensors is the definition of symmetry of a HOT, where the coefficients of the HOT remain unchanged under any permutation of the coefficient indices. This is also the symmetry condition required by the diffusion HOT in GDTI, as implied by its properties. However, this symmetry is best called total symmetry (or complete symmetry), to differentiate it from the other possible symmetries that are derived from physics and that carry important physical interpretations.

We shall, however, not present such physical interpretations here, but content ourselves with counting the number of independent coefficients of a 4th order tensor under the various symmetries. To do this we will require the formula for counting the number of ways of choosing m elements from n elements without order and with repetition (combination) \(S_{m,n} = \left (\begin{array}{c} n + m - 1\\ m \end{array} \right ).\)

Major symmetry of an nD 4th order tensor \(\mathcal{A}\) is defined by the index symmetry rule \(A_{\mathit{ij},\mathit{kl}} = A_{\mathit{kl},\mathit{ij}}\). To count the number of independent coefficients of \(\mathcal{A}\), which satisfies major symmetry, we consider the isometrically equivalent \(n^{2}\)D 2nd order tensor A, which has only two indices I = ij and J = kl. Therefore, major symmetry of \(\mathcal{A}\) can be translated as the index symmetry rule of A as \(\hat{A}_{\mathit{IJ}} =\hat{ A}_{\mathit{JI}}\), where \(\hat{A}_{o_{1}o_{2}}\) are the coefficients of A, which implies that \(\mathbf{A} = \mathbf{A}^{T}\). Therefore, the number of independent coefficients of \(\mathcal{A}\), which satisfies major symmetry, is (\(\overline{M}\) is used to indicate major symmetry):

$$\displaystyle{ N_{\overline{M}} = \frac{n^{2}(n^{2} + 1)} {2}. }$$
(10)

Note that major symmetry for \(\mathcal{A}\), corresponds to the regular notion of symmetry for the 2nd order tensor A. Therefore, symmetry properties of A, such as decomposition into a symmetric part and a skew symmetric part and eigen-decomposition, can be attributed to the 4th order tensor \(\mathcal{A}\) by isomorphism. Major symmetry also corresponds to the notion of symmetry induced by the definition of the transpose of a 4th order tensor, or a linear transformation from Lin(V ) to Lin(V ).

Minor symmetry of an nD 4th order tensor \(\mathcal{A}\) is defined by the index symmetry rule \(A_{\mathit{ij},\mathit{kl}} = A_{\mathit{ji},\mathit{kl}} = A_{\mathit{ij},\mathit{lk}}\). To count the number of independent coefficients of \(\mathcal{A}\), which satisfies minor symmetry, the index rule can be seen as first choosing 2 index values {ij} from n index values without order and with repetition, and then again choosing 2 index values {lk} under the same condition. However, since {ij} and {lk} don’t swap, their mutual order is important. Therefore, the number of independent coefficients of \(\mathcal{A}\), which satisfies minor symmetry is (\(\underline{M}\) is used to indicate minor symmetry):

$$\displaystyle{ N_{\underline{M}} = \left (\begin{array}{c} n + 2 - 1\\ 2 \end{array} \right )^{2} = \frac{n^{2}(n + 1)^{2}} {4}. }$$
(11)

The number of independent coefficients of an nD 4th order tensor with combined major and minor symmetries can be computed by combining the reasonings of the individual counts. First choose 2 index values \(\{\mathit{ij}\} = I\) or {lk} = J from n index values without order and with repetition, which gives \(\sqrt{ N_{\underline{M}}}\). Then choose 2 index values {IJ} from these \(\sqrt{N_{\underline{M}}}\) index values without order and with repetition. Therefore, the number of independent coefficients of \(\mathcal{A}\), which satisfies both major and minor symmetries is:

$$\displaystyle{ N_{(\overline{M}+\underline{M})} = \left (\begin{array}{c} \sqrt{N_{\underline{M}}} + 2 - 1\\ 2 \end{array} \right ). }$$
(12)

Total symmetry, or just symmetry, is defined for an nD 4th order tensor \(\mathcal{A}\) by the index symmetry rule \(A_{\mathit{ijkl}} = A_{\sigma (\mathit{ijkl})}\), where σ(ijkl) is any permutation of the indices {ijkl}. This is the symmetry satisfied by any HOT in the GDTI model, which implies from Eq. (3) that the number of independent coefficients for a 3D kth order GDTI HOT is \(N_{k}\). However, the number of independent coefficients of an nD 4th order tensor \(\mathcal{A}\), which satisfies total symmetry, can also be counted as the number of ways of choosing 4 index values from n possible index values without order and with repetition, therefore:

$$\displaystyle{ N_{T} = \left (\begin{array}{c} n + 4 - 1\\ 4 \end{array} \right ). }$$
(13)

If we consider k = 4, it implies \(N_{k} = 15\), and if we consider n = 3, it implies \(N_{T} = 15\). This establishes the consistency between \(N_{k}\) and \(N_{T}\).
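The counts in Eqs. (10) to (13) can be tabulated directly from the combination formula \(S_{m,n}\). The following is a small sketch using only the Python standard library; the function names are hypothetical.

```python
from math import comb, isqrt

def multiset(m, n):
    """S_{m,n}: ways to choose m of n items without order, with repetition."""
    return comb(n + m - 1, m)

def symmetry_counts(n):
    """Independent coefficients of an nD 4th order tensor per symmetry class."""
    major = n**2 * (n**2 + 1) // 2    # Eq. (10)
    minor = multiset(2, n) ** 2       # Eq. (11)
    both = multiset(2, isqrt(minor))  # Eq. (12)
    total = multiset(4, n)            # Eq. (13)
    return {"none": n**4, "major": major, "minor": minor,
            "major+minor": both, "total": total}

counts3 = symmetry_counts(3)
N_k = (4 + 1) * (4 + 2) // 2  # Eq. (3) with k = 4
```

For n = 3 this gives 81 coefficients without symmetry, 45 with major symmetry, 36 with minor symmetry, 21 with both, and 15 with total symmetry, the last agreeing with \(N_{k}\) for k = 4.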

Any 4th order tensor \(\mathcal{A}\) satisfying major and minor symmetries can be decomposed in a unique manner into a totally symmetric 4th order tensor \(\mathcal{A}^{s}\) and its asymmetric part \(\mathcal{A}^{a}\) such that \(\mathcal{A} = \mathcal{A}^{s} + \mathcal{A}^{a}\). The coefficients of the totally symmetric part and the asymmetric part can be computed from [15]:

$$\displaystyle{ \begin{array}{@{}cl@{}} A_{\mathit{ijkl}}^{s} = &\frac{1} {3}\left (A_{\mathit{ijkl}} + A_{\mathit{ikjl}} + A_{\mathit{ilkj}}\right ) \\ A_{\mathit{ijkl}}^{a} =&\frac{1} {3}\left (2A_{\mathit{ijkl}} - A_{\mathit{ikjl}} - A_{\mathit{ilkj}}\right ). \end{array} }$$
(14)

These, along with the definition of the inner product between two 4th order tensors can be used to show that \(\left < \mathcal{A}^{s},\mathcal{B}^{a}\right > = \mathit{tr}(\mathcal{A}^{s}\mathcal{B}^{a}) = 0.\)
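The decomposition of Eq. (14) and the orthogonality of its two parts can be checked on random data. This is a sketch under stated assumptions: NumPy, randomly generated tensors that are symmetrized to satisfy major and minor symmetries, and hypothetical helper names.

```python
import itertools
import numpy as np

def random_major_minor(rng, n=3):
    """Random 4th order tensor with both major and minor symmetries."""
    T = rng.normal(size=(n,) * 4)
    T = T + np.einsum('jikl->ijkl', T)  # symmetric in (i, j)
    T = T + np.einsum('ijlk->ijkl', T)  # symmetric in (k, l)
    T = T + np.einsum('klij->ijkl', T)  # symmetric under (ij) <-> (kl)
    return T

def split(A):
    """Eq. (14): totally symmetric part and the remaining asymmetric part."""
    A_s = (A + np.einsum('ikjl->ijkl', A) + np.einsum('ilkj->ijkl', A)) / 3.0
    return A_s, A - A_s

rng = np.random.default_rng(2)
A = random_major_minor(rng)
B = random_major_minor(rng)
A_s, A_a = split(A)
B_s, B_a = split(B)

# A_s must be unchanged by every permutation of its four indices.
totally_symmetric = all(np.allclose(A_s, A_s.transpose(p))
                        for p in itertools.permutations(range(4)))
inner_sa = np.sum(A_s * B_a)  # <A^s, B^a> from the inner product of Eq. (9)
```

The three terms averaged in Eq. (14) correspond to the three ways of pairing four indices, which is why the average is invariant under all 24 permutations once major and minor symmetries hold.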

These symmetries greatly reduce the number of independent coefficients of an nD 4th order tensor from the total number of possible independent coefficients, which is \(n^{4}\). Of particular interest are the 4th order tensors which satisfy both major and minor symmetries. These form a subspace of \(\mathcal{L}\mathit{in}(V ),\) called:

$$\displaystyle{ \mathcal{S}\mathit{ym}_{(\overline{M}+\underline{M})}(V ) =\{\mathrm{ A} \in \mathcal{L}\mathit{in}(V )\ \vert \ \mathcal{A}\ \mathrm{satisfies\ major\ \&\ minor\ symmetries}\}, }$$
(15)

which is isometrically isomorphic to \(\mathbf{R}^{N_{ (\overline{M}+\underline{M})} }\).

When n = 3, \(N_{\underline{M}} = 36\), and \(N_{(\overline{M}+\underline{M})} = 21\). Therefore, \(\mathcal{S}\mathit{ym}_{(\overline{M}+\underline{M})}(V )\) is isomorphic to \(\mathbf{R}^{21}\), which is the space of symmetric 6D 2nd order tensors. An example of an isometric isomorphism that can be established in this case between a 3D 4th order tensor \(\mathcal{A}_{(\overline{M}+\underline{M})}\) and a 6D 2nd order tensor A is [19]:

$$\displaystyle{ \mathbf{A} = \left (\begin{array}{@{}cccccc@{}} A_{\mathit{xxxx}} & A_{\mathit{xxyy}} & A_{\mathit{xxzz}} & \sqrt{2}A_{\mathit{xxxy}} & \sqrt{2}A_{\mathit{xxxz}} & \sqrt{2}A_{\mathit{xxyz}} \\ A_{\mathit{xxyy}} & A_{\mathit{yyyy}} & A_{\mathit{yyzz}} & \sqrt{2}A_{\mathit{yyxy}} & \sqrt{2}A_{\mathit{yyxz}} & \sqrt{2}A_{\mathit{yyyz}} \\ A_{\mathit{xxzz}} & A_{\mathit{yyzz}} & A_{\mathit{zzzz}} & \sqrt{2}A_{\mathit{zzxy}} & \sqrt{2}A_{\mathit{zzxz}} & \sqrt{2}A_{\mathit{zzyz}} \\ \sqrt{2}A_{\mathit{xxxy}} & \sqrt{2}A_{\mathit{yyxy}} & \sqrt{2}A_{\mathit{zzxy}} & 2A_{\mathit{xyxy}} & 2A_{\mathit{xyxz}} & 2A_{\mathit{xyyz}} \\ \sqrt{2}A_{\mathit{xxxz}} & \sqrt{2}A_{\mathit{yyxz}} & \sqrt{2}A_{\mathit{zzxz}} & 2A_{\mathit{xyxz}} & 2A_{\mathit{xzxz}} & 2A_{\mathit{xzyz}} \\ \sqrt{2}A_{\mathit{xxyz}} & \sqrt{2}A_{\mathit{yyyz}} & \sqrt{2}A_{\mathit{zzyz}} & 2A_{\mathit{xyyz}} & 2A_{\mathit{xzyz}} & 2A_{\mathit{yzyz}} \end{array} \right ), }$$
(16)

where \(A_{\mathit{ijkl}}\) are the independent coefficients of \(\mathcal{A}_{(\overline{M}+\underline{M})}\). This map, along with the map in Eq. (6), which transforms a symmetric 2nd order tensor to a vector or a 1st order tensor, allows us to isometrically rewrite the effects of a linear transformation \(\mathrm{A}_{(\overline{M}+\underline{M})}\) in \(\mathcal{S}\mathit{ym}_{(\overline{M}+\underline{M})}(V )\) on a symmetric linear transformation \(\mathbf{B}^{s}\) in Sym(V ), as a matrix vector product when n = 3:

$$\displaystyle\begin{array}{rcl} \mathcal{A}_{(\overline{M}+\underline{M})}\mathbf{B}^{s}& =& \mathbf{A}_{ (\overline{M}+\underline{M})}\mathbf{b}^{s},{}\end{array}$$
(17)
$$\displaystyle\begin{array}{rcl} \left < \mathcal{D}^{s},\mathcal{A}_{ (\overline{M}+\underline{M})}\mathbf{B}^{s}\right >& =& \mathbf{d}^{s^{T} }\mathbf{A}_{(\overline{M}+\underline{M})}\mathbf{b}^{s}.{}\end{array}$$
(18)
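One way to see where the entries of Eq. (16) come from is to express both maps in an orthonormal basis of Sym(V ). The sketch below assumes NumPy, a randomly generated tensor with major and minor symmetries, and hypothetical helper names; it builds the 6D matrix from such a basis, compares a few entries with Eq. (16), and checks Eqs. (17) and (18).

```python
import numpy as np

R2 = np.sqrt(2.0)
PAIRS = [(0, 0), (1, 1), (2, 2), (0, 1), (0, 2), (1, 2)]  # xx, yy, zz, xy, xz, yz

# Orthonormal basis of Sym(3) for <A, B> = tr(A^T B), ordered as in Eq. (6).
E = np.zeros((6, 3, 3))
for a, (i, j) in enumerate(PAIRS):
    if i == j:
        E[a, i, j] = 1.0
    else:
        E[a, i, j] = E[a, j, i] = 1.0 / R2

def to_vec6(B):
    """Eq. (6): coordinates of a symmetric B in the basis E."""
    return np.einsum('aij,ij->a', E, B)

def to_mat6(A4):
    """Eq. (16): matrix of the map C -> A4 C in the basis E."""
    return np.einsum('aij,ijkl,bkl->ab', E, A4, E)

rng = np.random.default_rng(3)
T = rng.normal(size=(3, 3, 3, 3))
T = T + np.einsum('jikl->ijkl', T)
T = T + np.einsum('ijlk->ijkl', T)
A4 = T + np.einsum('klij->ijkl', T)  # major and minor symmetries

M, N = rng.normal(size=(2, 3, 3))
Bs, Ds = 0.5 * (M + M.T), 0.5 * (N + N.T)

A6 = to_mat6(A4)
image = np.einsum('ijkl,kl->ij', A4, Bs)  # A4 applied to B^s
lhs17, rhs17 = to_vec6(image), A6 @ to_vec6(Bs)
lhs18, rhs18 = np.sum(Ds * image), to_vec6(Ds) @ A6 @ to_vec6(Bs)
```

The factors \(\sqrt{2}\) and 2 in Eq. (16) are the products of the basis normalizations, so they need not be entered by hand in this formulation.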

However, since diffusion HOTs from the GDTI model have to satisfy total symmetry, we are interested in the space of 3D 4th order tensors, which satisfy total symmetry. These also form a subspace of \(\mathcal{L}\mathit{in}(V ),\) called:

$$\displaystyle{ \mathcal{S}\mathit{ym}_{T}(V ) =\{\mathrm{ A} \in \mathcal{L}\mathit{in}(V )\ \vert \ \mathcal{A}\ \mathrm{satisfies\ total\ symmetry}\}, }$$
(19)

which is isometrically isomorphic to \(\mathbf{R}^{15}\), since \(N_{T} = 15\) when n = 3. Although \(\mathbf{R}^{15}\) corresponds to the space of symmetric 5D 2nd order tensors, the isometry to symmetric 6D 2nd order tensors (Eq. (16)) can be modified to represent \(\mathcal{S}\mathit{ym}_{T}(V )\), with the added equalities:

$$\displaystyle{ \begin{array}{@{}ccc@{}} A_{\mathit{xxyy}} = A_{\mathit{xyxy}};\quad & A_{\mathit{xxzz}} = A_{\mathit{xzxz}};\quad & A_{\mathit{yyzz}} = A_{\mathit{yzyz}} \\ A_{\mathit{xxyz}} = A_{\mathit{xyxz}};\quad &A_{\mathit{yyxz}} = A_{\mathit{xyyz}};\quad &A_{\mathit{zzxy}} = A_{\mathit{xzyz}}. \end{array} }$$
(20)

Applying these equalities to A in Eq. (16), is equivalent to decomposing the 3D 4th order tensor \(\mathcal{A}_{(\overline{M}+\underline{M})}\), with major and minor symmetries, into its totally symmetric part \(\mathcal{A}_{(\overline{M}+\underline{M})}^{s}\) [15]. In other words, an isometry from \(\mathcal{S}\mathit{ym}_{T}(V )\) to the space of symmetric 6D 2nd order tensors can be established by considering the totally symmetric part of the equivalent 3D 4th order tensor with only major and minor symmetries.

The final isometry between \(\mathcal{S}\mathit{ym}_{T}(V )\) and the space of symmetric 6D 2nd order tensors is the transformation that converts a 3D 4th order diffusion tensor from the GDTI model to an isometrically equivalent symmetric 6D 2nd order tensor. This allows us to use the Riemannian metric on the space of \(\mathcal{S}\mathit{ym}_{6}^{+},\) to estimate the 4th order diffusion tensor with a positive diffusion profile.

2.3 Estimating a SPD Fourth Order Diffusion Tensor

First we re-write the diffusion function in Eq. (2), which is written in terms of the coefficients of the kth order tensor \(\mathcal{D}^{(k)}\) and of the unit gradient vector g, in the tensor terminology when k = 4:

$$\displaystyle\begin{array}{rcl} D(\mathbf{g})& =& \left < \mathcal{D}^{(4)},\mathcal{G}\right >,\quad \quad \quad \quad \quad \quad \mathrm{where}\ \mathcal{G} = \mathbf{g} \otimes \mathbf{g} \otimes \mathbf{g} \otimes \mathbf{g}{}\end{array}$$
(21)
$$\displaystyle\begin{array}{rcl} & =& \left < \mathbf{B},\mathcal{D}^{(4)}\mathbf{B}\right >,\quad \quad \quad \quad \quad \mathrm{where}\ \mathbf{B} = \mathbf{g} \otimes \mathbf{g}{}\end{array}$$
(22)
$$\displaystyle\begin{array}{rcl} & =& \left < \mathbf{b},\hat{\mathbf{D}}\mathbf{b}\right > = \mathbf{b}^{T}\hat{\mathbf{D}}\mathbf{b},{}\end{array}$$
(23)

where \(\mathcal{D}^{(4)}\) is the 4th order diffusion HOT in GDTI, \(\mathcal{G}\) is a totally symmetric 4th order tensor computed from the outer products “⊗” of the gradient vector, similarly B is a symmetric 2nd order tensor computed from the outer product of g with itself, b is the vector form of B using the isometric map from Eq. (6), and \(\hat{\mathbf{D}}\) is the symmetric 6D matrix form of \(\mathcal{D}^{(4)}\) using the isometric map from Eq. (16). The first two equalities can be derived from the coefficients’ equation in Eq. (2), and the third equality can be derived from Eqs. (17) and (18). Therefore, the diffusion signal from the GDTI model (Eq. (2)) when k = 4, can be written in tensor form as:

$$\displaystyle{ S = S_{0}\exp \left (-b\mathbf{b}^{T}\hat{\mathbf{D}}\mathbf{b}\right ). }$$
(24)

In this form, the problem of estimating the 4th order diffusion tensor \(\mathcal{D}^{(4)}\), from the signal, with a positive diffusion profile can be solved by estimating the 2nd order tensor \(\hat{\mathbf{D}}\), from the signal, in \(\mathcal{S}\mathit{ym}_{6}^{+}\).
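The chain of equalities in Eqs. (21) to (23) can be confirmed numerically. The sketch below assumes NumPy and a random totally symmetric tensor; the helpers `vec6`, `mat6` and `symmetrize` are hypothetical names, and here Eq. (16) is filled in entry by entry with explicit weights.

```python
import itertools
import numpy as np

R2 = np.sqrt(2.0)
PAIRS = [(0, 0), (1, 1), (2, 2), (0, 1), (0, 2), (1, 2)]  # xx, yy, zz, xy, xz, yz
W = np.array([1.0, 1.0, 1.0, R2, R2, R2])  # weights of Eq. (6)

def vec6(B):
    """Eq. (6) for a symmetric 3x3 tensor B."""
    return W * np.array([B[i, j] for i, j in PAIRS])

def mat6(D4):
    """Eq. (16): entry (I, J) is D4[i, j, k, l] scaled by both weights."""
    M = np.array([[D4[i, j, k, l] for (k, l) in PAIRS] for (i, j) in PAIRS])
    return M * np.outer(W, W)

def symmetrize(T):
    """Average over all 24 index permutations to get a totally symmetric tensor."""
    perms = list(itertools.permutations(range(4)))
    return sum(T.transpose(p) for p in perms) / len(perms)

rng = np.random.default_rng(4)
D4 = symmetrize(rng.normal(size=(3, 3, 3, 3)))
g = rng.normal(size=3)
g /= np.linalg.norm(g)

B = np.outer(g, g)
b = vec6(B)
D6 = mat6(D4)

d21 = np.einsum('ijkl,i,j,k,l->', D4, g, g, g, g)  # Eq. (21)
d22 = np.einsum('ij,ijkl,kl->', B, D4, B)          # Eq. (22)
d23 = b @ D6 @ b                                   # Eq. (23)
```

Because the map of Eq. (6) is an isometry and \(\mathbf{B} = \mathbf{g} \otimes \mathbf{g}\) has unit Frobenius norm, b is itself a unit vector in 6D. A positive definite \(\hat{\mathbf{D}}\) therefore gives a strictly positive value in Eq. (23) for every g.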

The objective function we minimize to estimate \(\hat{\mathbf{D}}\) from N diffusion weighted images (DWIs) is the linearized form of the modified GDTI Stejskal-Tanner equation:

$$\displaystyle{ E(\hat{\mathbf{D}}) = \frac{1} {2}\sum _{i=1}^{N}\left (\frac{1} {b}\ln \left ( \frac{S_{i}} {S_{0}}\right ) + \mathbf{b}_{i}^{T}\hat{\mathbf{D}}\mathbf{b}_{ i}\right )^{2}. }$$
(25)

To estimate \(\hat{\mathbf{D}}\) in \(\mathcal{S}\mathit{ym}_{6}^{+}\), we have to consider the Riemannian manifold of \(\mathcal{S}\mathit{ym}_{6}^{+}\), and the appropriate gradient descent in that manifold. These can be derived from the details of the Riemannian framework presented in [10–13]. It requires computing the gradient of \(E(\hat{\mathbf{D}})\) in that manifold, which at every point in \(\mathcal{S}\mathit{ym}_{6}^{+}\) is defined from the directional derivatives in the corresponding tangent plane.

The Riemannian gradient of \(E(\hat{\mathbf{D}})\) at \(\hat{\mathbf{D}}\) in the manifold \(\mathcal{S}\mathit{ym}_{6}^{+}\) is [13]:

$$\displaystyle{ \nabla E(\hat{\mathbf{D}}) =\hat{ \mathbf{D}}\left [\sum _{i=1}^{N}\left (\frac{1} {b}\ln \left ( \frac{S_{i}} {S_{0}}\right ) + \mathbf{b}_{i}^{T}\hat{\mathbf{D}}\mathbf{b}_{ i}\right ) \cdot (\mathbf{b}_{i}\mathbf{b}_{i}^{T})\right ]\hat{\mathbf{D}}. }$$
(26)

This allows us to design the appropriate gradient descent algorithm, with step length ε, in the Riemannian manifold \(\mathcal{S}\mathit{ym}_{6}^{+}\):

$$\displaystyle\begin{array}{rcl} \hat{\mathbf{D}}_{t+1} =\hat{ \mathbf{D}}_{t}^{\frac{1} {2} }\exp \left (-\epsilon \cdot \hat{\mathbf{D}}_{t}^{\frac{1} {2} }\left [\sum _{i=1}^{N}\left (\frac{1} {b}\ln \left ( \frac{S_{i}} {S_{0}}\right ) + \mathbf{b}_{i}^{T}\hat{\mathbf{D}}_{t}\mathbf{b}_{ i}\right ) \cdot (\mathbf{b}_{i}\mathbf{b}_{i}^{T})\right ]\hat{\mathbf{D}}_{ t}^{\frac{1} {2} }\right )\hat{\mathbf{D}}_{t}^{\frac{1} {2} }.\quad & &{}\end{array}$$
(27)

Minimizing the objective function \(E(\hat{\mathbf{D}})\) in this way, it is possible to estimate \(\hat{\mathbf{D}}\) in \(\mathcal{S}\mathit{ym}_{6}^{+}\) from the diffusion signal. Since \(\hat{\mathbf{D}}\) is isometrically equivalent to a 4th order tensor \(\mathcal{D}^{(4)}\) with major and minor symmetries, \(\mathcal{D}^{(4)}\) is guaranteed to have a positive diffusion profile. Finally we extract the totally symmetric part of \(\mathcal{D}^{(4)}\) to compute the totally symmetric 4th order GDTI diffusion tensor \(\mathcal{D}^{(4)s}\), which is then also guaranteed to have a positive diffusion profile.
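The iteration of Eq. (27) can be sketched in a few lines. The example below is illustrative only and rests on several assumptions that are not part of the method described above: synthetic noise-free signals generated from a random SPD matrix, a single hypothetical b-value in arbitrary units, the identity as starting point, and a simple step-halving rule added to keep the energy of Eq. (25) decreasing. Matrix square roots and exponentials are computed through the eigendecomposition of symmetric matrices.

```python
import numpy as np

R2 = np.sqrt(2.0)

def vec6(g):
    """Eq. (6) applied to B = g g^T."""
    x, y, z = g
    return np.array([x * x, y * y, z * z, R2 * x * y, R2 * x * z, R2 * y * z])

def sym_fun(M, f):
    """Apply a scalar function to a symmetric matrix through its eigenvalues."""
    w, U = np.linalg.eigh(M)
    return (U * f(w)) @ U.T

def energy(D, Bv, y):
    """Eq. (25), with y_i = (1/b) ln(S_i / S_0); also returns the residuals."""
    r = y + np.einsum('ni,ij,nj->n', Bv, D, Bv)
    return 0.5 * np.sum(r**2), r

def riemannian_step(D, Bv, y, eps):
    """One update of Eq. (27) with step length eps."""
    _, r = energy(D, Bv, y)
    G = np.einsum('n,ni,nj->ij', r, Bv, Bv)  # sum_i r_i b_i b_i^T
    R = sym_fun(D, np.sqrt)                  # D^(1/2)
    D_new = R @ sym_fun(-eps * (R @ G @ R), np.exp) @ R
    return 0.5 * (D_new + D_new.T)           # remove rounding asymmetry

# Synthetic data (all values below are assumptions for illustration).
rng = np.random.default_rng(5)
g = rng.normal(size=(60, 3))
g /= np.linalg.norm(g, axis=1, keepdims=True)
Bv = np.array([vec6(gi) for gi in g])
A = rng.normal(size=(6, 6))
D_true = A @ A.T / 6.0 + 0.5 * np.eye(6)     # random SPD ground truth
b_value, S0 = 1.0, 1.0
S = S0 * np.exp(-b_value * np.einsum('ni,ij,nj->n', Bv, D_true, Bv))
y = np.log(S / S0) / b_value

D = np.eye(6)
E_init, _ = energy(D, Bv, y)
E, eps = E_init, 0.05
for _ in range(200):
    D_try = riemannian_step(D, Bv, y, eps)
    E_try, _ = energy(D_try, Bv, y)
    if E_try < E:
        D, E = D_try, E_try
    else:
        eps *= 0.5  # step halving: an added safeguard, not part of Eq. (27)

E_final = E
min_eig = np.linalg.eigvalsh(D).min()
profile = np.einsum('ni,ij,nj->n', Bv, D, Bv)  # estimated ADC per direction
```

Every iterate has the form \(\hat{\mathbf{D}}^{1/2}\exp (\cdot )\hat{\mathbf{D}}^{1/2}\) with a symmetric argument, so it stays in \(\mathcal{S}\mathit{ym}_{6}^{+}\) regardless of the step length; the step halving only affects how quickly the energy decreases.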

3 A Ternary Quartic Approach for Symmetric Positive Semi-definite Fourth Order Diffusion Tensors

In this section, we revisit the problem of estimating a symmetric higher order Cartesian tensor with a positive diffusion profile from the GDTI model, using a polynomial approach. In this approach we consider the polynomial interpretation of HOTs instead of considering the algebra of HOTs, and look at a polynomial solution to the positivity problem. In particular, we consider 4th order GDTI diffusion tensors, where the diffusion function of such tensors can be seen as trivariate homogeneous polynomials of degree 4 in the coefficients of the gradient vector. Such polynomials are known as ternary quartics.

Polynomials form an alternate way of expressing the multi-linear form of HOTs. This expression was indicated in the original GDTI paper [4], but was used for applying the positivity constraint in [5]. To make the relationship between the coefficients of a HOT and the coefficients of a homogeneous polynomial more evident, the diffusion function of GDTI (Eq. (2)) was rewritten in [5] as:

$$\displaystyle{ D(\mathbf{g}) =\sum _{m+n+p=k}D_{m,n,p}\ g_{1}^{m}g_{ 2}^{n}g_{ 3}^{p}, }$$
(28)

where \(D_{m,n,p}\) are the coefficients of the kth order tensor \(\mathcal{D}^{(k)}\) by a re-arrangement of the indices.

In this form, it is clear that the diffusion function, which was considered as the projection of a kth order HOT on to the unit sphere, is a trivariate homogeneous polynomial of degree k in the three coefficients of the unit gradient vector \(\mathbf{g} = [g_{1},g_{2},g_{3}]^{T}\), where the coefficients of the polynomial are the coefficients of the HOT. Since \(D(\mathbf{g})\) is a homogeneous polynomial of even degree, \(D(t\mathbf{x}) = t^{k}D(\mathbf{x})\) with \(t^{k} > 0\) for every \(t\neq 0\). The problem of a positive diffusion profile on the unit sphere, \(D(\mathbf{g}) > 0,\ \forall \mathbf{g} \in \mathbf{R}^{3}\) s.t. \(\Vert \mathbf{g}\Vert = 1\), is therefore equivalent to the problem of finding a polynomial \(D(\mathbf{x}) > 0,\ \forall \mathbf{x} \in \mathbf{R}^{3}/\{\mathbf{0}\}\). This is exactly the same equivalence that was used in DTI, where the problem of positive diffusion from a second order tensor, \(\mathbf{g}^{T}\mathbf{D}\mathbf{g} > 0,\ \forall \mathbf{g} \in S^{2}\), was recast as the problem of finding a positive definite second order tensor, \(\mathbf{x}^{T}\mathbf{D}\mathbf{x} > 0,\ \forall \mathbf{x} \in \mathbf{R}^{3}/\{\mathbf{0}\}\), which entailed the Riemannian framework for \(\mathcal{S}\mathit{ym}_{3}^{+}\). Therefore, in this section we consider a method of estimating the coefficients of a positive polynomial from the diffusion signal, to estimate a GDTI HOT with a positive diffusion profile.
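The polynomial form of Eq. (28) and the scaling argument can be checked numerically with a short sketch. The dictionary layout of the coefficients and the function names are assumptions made for illustration only.

```python
import numpy as np

def monomial_exponents(k):
    """All exponent triples (m, n, p) with m + n + p = k."""
    return [(m, n, k - m - n) for m in range(k + 1) for n in range(k + 1 - m)]

def diffusion_profile(coeffs, x):
    """D(x) = sum over m+n+p=k of D_{m,n,p} x1^m x2^n x3^p (Eq. (28)).

    coeffs is a dict {(m, n, p): D_{m,n,p}}.
    """
    x = np.asarray(x, dtype=float)
    return sum(c * x[0] ** m * x[1] ** n * x[2] ** p
               for (m, n, p), c in coeffs.items())

# Hypothetical 4th order example with random coefficients.
rng = np.random.default_rng(0)
exps = monomial_exponents(4)                     # 15 monomials for k = 4
coeffs = {e: rng.normal() for e in exps}

x = rng.normal(size=3)
t = 2.5
lhs = diffusion_profile(coeffs, t * x)           # D(t x)
rhs = t ** 4 * diffusion_profile(coeffs, x)      # t^4 D(x)
```

Since `lhs` and `rhs` agree, the sign of \(D(\mathbf{x})\) at any non-zero \(\mathbf{x}\) equals the sign of \(D\) at the unit vector \(\mathbf{x}/\Vert \mathbf{x}\Vert\), which is the equivalence used in the text.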

3.1 Riemannian vs. Ternary Quartics: A Comparison

It is interesting to note at this juncture, when k = 4, how the Riemannian approach presented in the previous section compares to the polynomial formulation. When k = 4, the goal of the polynomial formulation, as we have just seen, is to find a trivariate homogeneous polynomial of degree 4, \(D_{4}(\mathbf{x})\), where the coefficients of the polynomial are the coefficients of the 4th order GDTI diffusion tensor \(\mathcal{D}^{(4)}\), such that:

$$\displaystyle{ D_{4}(\mathbf{x}) > 0,\quad \forall \mathbf{x} \in \mathbf{R}^{3}/\{\mathbf{0}\}. }$$
(29)

In comparison, the Riemannian approach, using an isometric map, tries to find a symmetric 6D 2nd order tensor \(\hat{\mathbf{D}}\) in \(\mathcal{S}\mathit{ym}_{6}^{+}\):

$$\displaystyle{ \mathbf{c}^{T}\hat{\mathbf{D}}\mathbf{c} > 0,\quad \forall \mathbf{c} \in \mathbf{R}^{6}/\{\mathbf{0}\}, }$$
(30)

where the coefficients of the totally symmetric 4th order GDTI diffusion tensor can be extracted from the coefficients of \(\hat{\mathbf{D}}\). However, although this quadratic form resembles the diffusion profile from a totally symmetric 4th order tensor, \(\mathbf{b}^{T}\hat{\mathbf{D}}\mathbf{b}\) (Eq. (23)), estimating \(\hat{\mathbf{D}}\) in \(\mathcal{S}\mathit{ym}_{6}^{+}\) isn’t equivalent to the problem of computing a 4th order GDTI diffusion tensor \(\mathcal{D}^{(4)}\) with a positive diffusion profile. This can be seen from the isometrically equivalent inner product of the quadratic form:

$$\displaystyle{ \left < \mathbf{C},\mathcal{D}^{(4)}\mathbf{C}\right > > 0,\quad \forall \mathbf{C} \in \mathcal{S}\mathit{ym}_{ 3}/\{\mathbf{0}\}. }$$
(31)

The positive diffusion profile constraint on the other hand only implies the condition:

$$\displaystyle{ \left < \mathbf{B},\mathcal{D}^{(4)}\mathbf{B}\right > > 0,\quad \mathrm{where}\ \mathbf{B} = \mathbf{g} \otimes \mathbf{g}, }$$
(32)

which can be seen in Eq. (22). Since the 2nd order tensor \(\mathbf{B}\) in the diffusion profile is only of rank one, it is rank deficient, whereas in general the 2nd order tensor \(\mathbf{C}\) in the quadratic form includes both full rank and rank deficient tensors. In other words, the positive quadratic form condition is much stronger than the positive diffusion profile constraint. Therefore, although the positive quadratic form constraint entails the positive diffusion profile constraint, the solutions found by the Riemannian approach only belong to a subset of all the solutions possible under the positive diffusion profile constraint alone.

This can also be seen through examples, shown in [5, 6], by inspecting the isometric map in Eq. (16) which transforms a 4th order tensor into a 2nd order tensor. When this 6 × 6 matrix is positive definite it cannot represent valid totally symmetric 4th order tensors whose homogeneous polynomials are of the type \(P(\mathbf{g}) = \mathit{ag}_{1}^{4} + \mathit{bg}_{2}^{4} + \mathit{cg}_{3}^{4}\), or \(P(\mathbf{g}) = (\mathit{ag}_{1}^{2} + \mathit{bg}_{2}^{2})^{2} + \mathit{cg}_{3}^{4}\), etc., because these require the matrix to be semi-definite [6]. Since the Riemannian framework pushes such matrices away to an infinite distance from the estimation tensor \(\hat{\mathbf{D}}\), the solutions found by the Riemannian estimation only form a subset of all possible solutions.
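The first example can be reproduced numerically. The sketch below assumes that the isometric map of Eq. (16) is of the usual Mandel (Kelvin) type, so that the totally symmetric tensor of \(P(\mathbf{g}) = g_{1}^{4} + g_{2}^{4} + g_{3}^{4}\) maps to the 6 × 6 matrix diag(1, 1, 1, 0, 0, 0). The ordering of the last three components is an arbitrary choice here.

```python
import numpy as np

def b_vector(g):
    """6-vector of g (x) g in an assumed Mandel-type basis."""
    g1, g2, g3 = g
    s = np.sqrt(2.0)
    return np.array([g1 * g1, g2 * g2, g3 * g3, s * g2 * g3, s * g1 * g3, s * g1 * g2])

# Image of the totally symmetric tensor of P(g) = g1^4 + g2^4 + g3^4:
# the only non-zero tensor coefficients are D_xxxx = D_yyyy = D_zzzz = 1.
D_hat = np.diag([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])

# Diffusion profile b^T D_hat b on random unit vectors (rank-one tensors g (x) g).
rng = np.random.default_rng(1)
dirs = rng.normal(size=(1000, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
profile = np.array([b_vector(g) @ D_hat @ b_vector(g) for g in dirs])

# A symmetric 3x3 tensor with only a shear entry (rank two, not of the form g (x) g).
c = np.array([0.0, 0.0, 0.0, 1.0, 0.0, 0.0])
quad_form = c @ D_hat @ c
```

The profile is strictly positive on the sphere (it is bounded below by 1/3), while the quadratic form vanishes for `c`. The matrix is therefore only semi-definite and lies on the boundary of \(\mathcal{S}\mathit{ym}_{6}^{+}\), out of reach of the Riemannian estimation.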

3.2 Hilbert’s Theorem on Non-negative Ternary Quartics

We now return to the problem of estimating a non-negative trivariate homogeneous polynomial of degree k from the signal. A particular aspect of this problem has been addressed in [20], which describes a framework for estimating symmetric GDTI HOTs of any even order k and with a positive diffusion profile on a unit sphere. This paper proposes that any polynomial (the GDTI HOTs) that is non-negative on a unit sphere can be written as sums of squares of polynomials of lower order:

$$\displaystyle{ P^{(k)}(\mathbf{x}) =\sum _{i=1}^{M}\left (Q_{i}^{(k/2)}(\mathbf{x})\right )^{2}, }$$
(33)

where k is even, \(P^{(k)}(\mathbf{x})\) denotes a multi-variate polynomial of degree k, \(\{Q_{i}^{(k/2)}(\mathbf{x})\}\) denote M multi-variate polynomials of degree \(k/2\), and only an upper bound is known for M. Therefore, in [20], the authors propose to estimate the coefficients of the polynomials \(Q_{i}^{(k/2)}(\mathbf{x})\) from the signal to estimate a polynomial \(P^{(k)}(\mathbf{x})\) (or a GDTI HOT) with a non-negative diffusion profile.

Since M is not known exactly, the authors in [20] proceed by oversampling M, or rather densely sampling the space of possible polynomials of lower order \(Q_{i}^{(k/2)}(\mathbf{x})\). It is claimed that increasing the density of the sampling increases the accuracy of the decomposition of \(P^{(k)}(\mathbf{x})\). However, it also increases the number of unknown coefficients of the set \(\{Q_{i}^{(k/2)}(\mathbf{x})\}\), which need to be estimated from the signal. The authors then propose heuristically measured approximations M′ for M, for different values of k, from tests on synthetic data.

The problem of estimating non-negative trivariate polynomials when k = 4, or of estimating non-negative ternary quartics, presents a very interesting problem with a “complete” solution. In the case of ternary quartics, it can be shown that the entire space of non-negative polynomials over the whole of \(\mathbf{R}^{3}\) (and not only over the unit sphere) can be described by sums of squares of quadratic polynomials. Examples in [21] of non-negative polynomials of degree k > 4 that cannot be written as sums of squares of lower order polynomials indicate that not all non-negative polynomials of arbitrary degree k can be decomposed into sums of squares of lower order polynomials. Hilbert’s theorem, which identifies all the classes of non-negative multi-variate polynomials that can always be decomposed as sums of squares of lower order polynomials, is also presented in [21].

In fact, Hilbert’s theorem states that degree 4 trivariate polynomials that are non-negative and homogeneous, can always be written as a sum of squares of quadratic homogeneous polynomials, where the number of terms in the sum is also known and is exactly three (M = 3) [21]:

Theorem (Hilbert): If P(x, y, z) is homogeneous, of degree 4, with real coefficients and P(x, y, z) ≥ 0 at every \((x,y,z) \in \mathbf{R}^{3}\), then there are quadratic homogeneous polynomials f, g, h with real coefficients, such that:

$$\displaystyle{ P = f^{2} + g^{2} + h^{2}. }$$
(34)

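Beyond ternary quartics, the only other classes of non-negative homogeneous polynomials that can always be decomposed into sums of squares of lower order polynomials are binary forms (two variables) of any even degree and quadratic forms in any number of variables [21].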
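As a simple illustration of Eq. (34) (our own example, not one taken from [21]), consider the non-negative ternary quartic

$$\displaystyle{ P(x,y,z) = x^{4} + 2x^{2}y^{2} + y^{4} + z^{4} = (x^{2} + y^{2})^{2} + (z^{2})^{2} + 0^{2}, }$$

so that one may take \(f = x^{2} + y^{2}\), \(g = z^{2}\) and \(h = 0\). The decomposition is not unique: for any angle \(\theta\), replacing \((f,g)\) by \((f\cos \theta + g\sin \theta,\ -f\sin \theta + g\cos \theta )\) leaves \(f^{2} + g^{2}\), and hence \(P\), unchanged. This rotational freedom is the same ambiguity that has to be handled when the quadratic forms are estimated from the signal.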

In this section, we, therefore, turn to Hilbert’s theorem on non-negative, or positive semi-definite (PSD) ternary quartics, for a parameterization of the GDTI HOT when it is of order 4, to estimate diffusion HOTs with a non-negative diffusion profile. Since such tensors are symmetric and non-negative, these are known as symmetric positive semi-definite (SPSD) tensors. Based on Hilbert’s theorem, [5] and [6] have proposed two different parameterizations of the 4th order tensor. A third parameterization was proposed in [22]. In this chapter, we review all three parameterizations, but follow through mainly with the method in [22].

As a final remark, we note that by adopting the polynomial formulation for the GDTI HOT, we have gained over the Riemannian framework proposed in the previous section from the fact that we address the exact problem of estimating a diffusion HOT with a positive diffusion profile, whereas the Riemannian approach addressed a more constrained problem. However, given the results on polynomials, namely Hilbert’s theorem on ternary quartics, we concede to the Riemannian approach by the fact that we can only address the problem of a non-negative diffusion profile with the polynomial formulation, whereas the Riemannian approach addressed the positive definite diffusion profile constraint. However, we shall consider this a “negligible” loss, since in practice, due to numerical computations, we have never come across a diffusion profile that is exactly zero even along a single direction.

3.3 Estimating a SPSD Fourth Order Diffusion Tensor

The basic approach behind all three “ternary quartic” methods, [5, 6, 22], is the same. The idea is to consider the diffusion profile of a 4th order GDTI tensor as a homogeneous trivariate polynomial in the coefficients of the gradient vector g (Eq. (28)), and to apply Hilbert’s theorem on non-negative ternary quartics to rewrite it as a sum of squares of three quadratic homogeneous polynomials. Therefore, by estimating the coefficients of these quadratic homogeneous polynomials from the signal, it is possible to reconstruct the 4th order diffusion tensor by computing its coefficients from the coefficients of the quadratic forms, a process also known as the Gram-matrix approach [5, 23], such that the estimated 4th order tensor has a PSD diffusion profile. The three methods differ from each other in the way they parameterize the quadratic homogeneous polynomials to estimate their coefficients from the diffusion signal.

In [5], the diffusion profile of a 4th order GDTI tensor is written as:

$$\displaystyle\begin{array}{rcl} D(\mathbf{g})& =& (\mathbf{v}^{T}\mathbf{c}_{ 1})^{2} + (\mathbf{v}^{T}\mathbf{c}_{ 2})^{2} + (\mathbf{v}^{T}\mathbf{c}_{ 3})^{2},{}\end{array}$$
(35)
$$\displaystyle\begin{array}{rcl} & =& \mathbf{v}^{T}\mathbf{C}\mathbf{C}^{T}\mathbf{v},{}\end{array}$$
(36)
$$\displaystyle\begin{array}{rcl} & =& \mathbf{v}^{T}\mathbf{G}\mathbf{v},{}\end{array}$$
(37)

where \(\mathbf{v} = [g_{1}^{2},g_{2}^{2},g_{3}^{2},g_{1}g_{2},g_{1}g_{3},g_{2}g_{3}]^{T}\) contains the monomials formed by the coefficients of the gradient vector \(\mathbf{g}\), \(\mathbf{v}^{T}\mathbf{c}_{i}\) are the three quadratic forms from Hilbert’s theorem, and \(\mathbf{G}\) is known as the Gram matrix. The column vectors \(\mathbf{c}_{i}\) contain the coefficients of the quadratic forms, which have to be estimated from the signal. The 6 × 3 matrix \(\mathbf{C} = [\mathbf{c}_{1}\vert \mathbf{c}_{2}\vert \mathbf{c}_{3}]\) assembles these coefficients and yields the rank deficient, PSD 6 × 6 Gram matrix \(\mathbf{G} = \mathbf{C}\mathbf{C}^{T}\), from which the coefficients of the 4th order diffusion tensor are computed.

The authors in [5] use Eq. (36) to parameterize the ternary quartic decomposition given by Hilbert’s theorem: they estimate \(\mathbf{C}\) from the DWIs, and compute the 4th order tensor from \(\mathbf{G}\). However, this parameterization is problematic since it produces an infinite solution space, which can be seen by decomposing \(\mathbf{C}\) into two blocks \(\mathbf{C} = [\mathbf{A},\mathbf{B}]^{T}\) where \(\mathbf{A}\) and \(\mathbf{B}\) are 3 × 3 matrices. Then \(\mathbf{CO}\), for any 3 × 3 orthogonal matrix \(\mathbf{O}\), also results in the same Gram matrix, since \(\mathbf{CO}(\mathbf{CO})^{T} = \mathbf{C}\mathbf{O}\mathbf{O}^{T}\mathbf{C}^{T} = \mathbf{CC}^{T} = \mathbf{G}\). In other words, in this parameterization, \(\mathbf{C}\) is unique only up to right multiplication by an element of the orthogonal group O(3).
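The ambiguity is easy to reproduce numerically. In the sketch below the matrix `C` and the orthogonal matrix `O` are random stand-ins, not quantities estimated from data.

```python
import numpy as np

rng = np.random.default_rng(2)
C = rng.normal(size=(6, 3))                    # stand-in for the 6x3 coefficient matrix
O, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # a random 3x3 orthogonal matrix

G1 = C @ C.T                                   # Gram matrix from C
G2 = (C @ O) @ (C @ O).T                       # Gram matrix from C O
```

`G1` and `G2` coincide although `C` and `C @ O` differ, and the Gram matrix has rank three, as expected from a sum of three squares.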

In [5], the authors overcome this degenerate subspace issue by considering the QR-decomposition (or RQ-decomposition) of the 3 × 3 submatrix A of C, where Q is an orthogonal matrix and R is an upper triangular matrix. This implies that \(\mathbf{C} = [\mathbf{RQ},\mathbf{B}]^{T} = [\mathbf{R},\mathbf{B}\mathbf{Q}^{T}]^{T}\mathbf{Q}\). Therefore, \(\mathbf{C}\mathbf{C}^{T} = [\mathbf{R},\mathbf{B}\mathbf{Q}^{T}]^{T}\mathbf{Q} \cdot \mathbf{Q}^{T}[\mathbf{R},\mathbf{B}\mathbf{Q}^{T}] = [\mathbf{R},\mathbf{B}\mathbf{Q}^{T}]^{T} \cdot [\mathbf{R},\mathbf{B}\mathbf{Q}^{T}]\), which effectively quotients out the orthogonal group from the computation of the Gram matrix G.

In [6], the authors overcome this same issue in Eq. (36) by applying certain constraints on \(\mathbf{C}\) from the properties of the Gram matrix, to remove the ambiguity of the class of orthogonal matrices O(3). Since the rank of the Gram matrix is known a priori from Hilbert’s theorem to be three, they identify and isolate the positive definite part of the PSD Gram matrix using a modified Iwasawa decomposition [24], which is then parameterized uniquely by a Cholesky decomposition. In other words, they first collect the rank-3 positive definite part of \(\mathbf{G}\) into a 3 × 3 matrix \(\mathbf{W}\), and then decompose \(\mathbf{W}\) using a Cholesky decomposition as \(\mathbf{W} = \mathbf{L}\mathbf{L}^{T}\). This effectively equates the 3 × 3 matrix \(\mathbf{A}\), from the paragraph above, where \(\mathbf{C} = [\mathbf{A},\mathbf{B}]^{T}\), to the triangular matrix \(\mathbf{L}\) with positive diagonal elements. In short, this procedure determines a unique \(\mathbf{C}\) in the infinite space of solutions \(\{\mathbf{CO}\}\) from the previous approach, and removes the ambiguity of the class of orthogonal matrices O(3). Therefore, the authors in [6] effectively estimate \(\mathbf{C} = [\mathbf{L},\mathbf{B}]^{T}\) from the DWIs. Furthermore, the Cholesky decomposition also distinguishes \(\mathbf{C}\) from \(-\mathbf{C}\), although both result in the same Gram matrix. The authors then use this uniqueness property of \(\mathbf{C}\) to design a spatial regularization of the field of estimated 4th order diffusion tensors.
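The effect of fixing the top block to a Cholesky factor can be sketched as follows. This is a simplified stand-in for the procedure in [6], not the modified Iwasawa decomposition itself: it assumes that the top 3 × 3 block of `C` is invertible, and the function name `canonical_C` is hypothetical.

```python
import numpy as np

def canonical_C(C):
    """Pick the representative of {C O} whose top 3x3 block is a Cholesky factor."""
    A = C[:3, :]                      # top block, assumed invertible
    L = np.linalg.cholesky(A @ A.T)   # W = A A^T = L L^T, L lower triangular, diag > 0
    Q = np.linalg.solve(L, A)         # Q = L^{-1} A is orthogonal
    return C @ Q.T                    # top block becomes L, Gram matrix is unchanged

rng = np.random.default_rng(3)
C = rng.normal(size=(6, 3))
O, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # arbitrary orthogonal matrix

C1 = canonical_C(C)
C2 = canonical_C(C @ O)              # same equivalence class, same representative
C3 = canonical_C(-C)                 # -C = C (-I) is also in the class
```

All members of the class \(\{\mathbf{CO}\}\), including \(-\mathbf{C}\), are mapped to the same matrix, which is what makes a comparison of neighbouring voxels meaningful.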

Finally, we follow up in greater detail the third parameterization [22] using the ternary quartic decomposition. Essentially, using Eq. (35) to parameterize the Hilbert decomposition, we estimate the \(\mathbf{c}_{i}\) directly from the DWIs and assemble these afterward to reconstruct \(\mathbf{C}\). From there we follow the same procedure as the two other methods, and reconstruct the Gram matrix and compute the coefficients of the 4th order diffusion tensor.

From Hilbert’s theorem on non-negative ternary quartics we write the diffusion function of a 4th order diffusion tensor as \(D(\mathbf{g}) =\psi _{ 1}^{2}(\mathbf{g}) +\psi _{ 2}^{2}(\mathbf{g}) +\psi _{ 3}^{2}(\mathbf{g})\), where:

$$\displaystyle\begin{array}{rcl} \psi _{i}(\mathbf{g})& =& a_{i}g_{1}^{2} + b_{ i}g_{2}^{2} + c_{ i}g_{3}^{2} + 2\alpha _{ i}g_{1}g_{2} + 2\beta _{i}g_{1}g_{3} + 2\gamma _{i}g_{2}g_{3},{}\end{array}$$
(38)
$$\displaystyle\begin{array}{rcl} & =& [a_{i},b_{i},c_{i},\sqrt{2}\alpha _{i},\sqrt{2}\beta _{i},\sqrt{2}\gamma _{i}]{}\end{array}$$
(39)
$$\displaystyle\begin{array}{rcl} & & \quad \quad \times [g_{1}^{2},g_{ 2}^{2},g_{ 3}^{2},\sqrt{2}g_{ 1}g_{2},\sqrt{2}g_{1}g_{3},\sqrt{2}g_{2}g_{3}]^{T},{}\end{array}$$
(40)
$$\displaystyle\begin{array}{rcl} & =& \mathbf{x}_{i}^{T}\mathbf{v}{}\end{array}$$
(41)

are the quadratic forms. Note that we have modified the form of the vector \(\mathbf{v}\) by multiplying certain terms by \(\sqrt{2}\); this is a minor difference in notation from [5, 6]. Each quadratic form is determined once its six unknown coefficients in \(\mathbf{x}_{i}\) have been estimated from the DWIs. Therefore, the diffusion profile can be written as a function of the unknowns to be estimated as:

$$\displaystyle\begin{array}{rcl} D(\mathbf{x}_{1},\mathbf{x}_{2},\mathbf{x}_{3})& =& \mathbf{x}_{1}^{T}\mathbf{v}\mathbf{v}^{T}\mathbf{x}_{ 1} + \mathbf{x}_{2}^{T}\mathbf{v}\mathbf{v}^{T}\mathbf{x}_{ 2} + \mathbf{x}_{3}^{T}\mathbf{v}\mathbf{v}^{T}\mathbf{x}_{ 3},{}\end{array}$$
(42)
$$\displaystyle\begin{array}{rcl} & =& [\mathbf{x}_{1}^{T},\mathbf{x}_{ 2}^{T},\mathbf{x}_{ 3}^{T}]\left [\begin{array}{ccc} \mathbf{vv}^{T}& \mathbf{0} & \mathbf{0} \\ \mathbf{0} &\mathbf{vv}^{T}& \mathbf{0} \\ \mathbf{0} & \mathbf{0} &\mathbf{vv}^{T} \end{array} \right ]\left [\begin{array}{c} \mathbf{x}_{1} \\ \mathbf{x}_{2} \\ \mathbf{x}_{3} \end{array} \right ]{}\end{array}$$
(43)
$$\displaystyle\begin{array}{rcl} & =& \mathbf{X}^{T}\mathbf{V}\mathbf{X}.{}\end{array}$$
(44)
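A short sketch confirms that the block-diagonal quadratic form of Eqs. (42)–(44) is the same quantity as the sum of three squared quadratic forms. The function names are illustrative, and `X` is a random stand-in for the 18 stacked unknowns.

```python
import numpy as np

def v_vector(g):
    """Monomial vector v with the sqrt(2) convention of Eqs. (39)-(41)."""
    g1, g2, g3 = g
    s = np.sqrt(2.0)
    return np.array([g1 ** 2, g2 ** 2, g3 ** 2, s * g1 * g2, s * g1 * g3, s * g2 * g3])

def profile_blockdiag(X, g):
    """D = X^T V X with V = blockdiag(v v^T, v v^T, v v^T), Eq. (44)."""
    v = v_vector(g)
    V = np.kron(np.eye(3), np.outer(v, v))     # 18x18 block-diagonal matrix
    return X @ V @ X

def profile_sos(X, g):
    """D = psi_1^2 + psi_2^2 + psi_3^2 with psi_i = x_i^T v."""
    v = v_vector(g)
    return sum((X[6 * i:6 * i + 6] @ v) ** 2 for i in range(3))

rng = np.random.default_rng(4)
X = rng.normal(size=18)                        # [x_1; x_2; x_3]
g = rng.normal(size=3)
g /= np.linalg.norm(g)

d_block = profile_blockdiag(X, g)
d_sos = profile_sos(X, g)
```

Whatever values the unknowns take, the profile is a sum of squares and therefore non-negative, so the constraint is built into the parameterization and needs no separate enforcement.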

To estimate the unknown coefficients \(\mathbf{x}_{i}\) of the homogeneous quadratic forms from a set of DWIs, we minimize the objective function based on the modified and linearized Stejskal-Tanner equation:

$$\displaystyle{ E(\mathbf{X}) = \frac{1} {2}\sum _{i=1}^{N}\left (\frac{1} {b}\log \left ( \frac{S_{i}} {S_{0}}\right ) + \mathbf{X}^{T}\mathbf{V}_{ i}\mathbf{X}\right )^{2}, }$$
(45)

where N is the number of DWIs and \(\mathbf{V}_{i}\) is the matrix \(\mathbf{V}\) built from the monomials of the gradient direction \(\mathbf{g}_{i}\). Although here we use the linearized form of the Stejskal-Tanner equation, it is equally possible to use the non-linear form. The gradient of the objective function with respect to the unknowns \(\mathbf{X}\) is computed to be:

$$\displaystyle{ \nabla E(\mathbf{X}) =\sum _{ i=1}^{N}\left (\frac{1} {b}\log \left ( \frac{S_{i}} {S_{0}}\right ) + \mathbf{X}^{T}\mathbf{V}_{ i}\mathbf{X}\right )\left (\mathbf{V}_{i} + \mathbf{V}_{i}^{T}\right )\mathbf{X}. }$$
(46)

To minimize \(E(\mathbf{X})\), we use the well known Broyden-Fletcher-Goldfarb-Shanno (BFGS) method [25], a quasi-Newton optimization algorithm for non-linear problems.
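The objective of Eq. (45), its gradient from Eq. (46) and a BFGS fit can be sketched as below. The use of SciPy's `scipy.optimize.minimize` is our assumption for illustration; the chapter does not state which BFGS implementation was used. The helper names are hypothetical, and `y` denotes the values \(\frac{1}{b}\log (S_{i}/S_{0})\).

```python
import numpy as np

def design(gradients):
    """Rows are v(g_i) with the sqrt(2) convention; shape (N, 6)."""
    g = np.asarray(gradients, dtype=float)
    s = np.sqrt(2.0)
    return np.column_stack([g[:, 0] ** 2, g[:, 1] ** 2, g[:, 2] ** 2,
                            s * g[:, 0] * g[:, 1], s * g[:, 0] * g[:, 2],
                            s * g[:, 1] * g[:, 2]])

def energy_and_grad(X, Vrows, y):
    """Return E(X) of Eq. (45) and its gradient, Eq. (46), for X = [x_1; x_2; x_3]."""
    Xm = X.reshape(3, 6)
    q = Vrows @ Xm.T                           # q[i, j] = psi_j(g_i) = x_j^T v_i
    r = y + np.sum(q ** 2, axis=1)             # residual: y_i + X^T V_i X
    E = 0.5 * np.sum(r ** 2)
    # (V_i + V_i^T) X = 2 V_i X, and block j of V_i X is v_i * psi_j(g_i)
    grad = 2.0 * (q * r[:, None]).T @ Vrows    # shape (3, 6)
    return E, grad.ravel()

def fit_ternary_quartic(gradients, y, X0):
    """Minimize E(X) with BFGS; SciPy is an assumed dependency, imported lazily."""
    from scipy.optimize import minimize
    Vrows = design(gradients)
    res = minimize(energy_and_grad, X0, args=(Vrows, y), jac=True, method="BFGS")
    return res.x.reshape(3, 6)                 # rows are x_1, x_2, x_3
```

The problem is non-convex in \(\mathbf{X}\), so the result may depend on the initial guess `X0`. A non-zero start is needed in any case, because the gradient vanishes at \(\mathbf{X} = \mathbf{0}\).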

Finally we compute the 15 independent coefficients \(A_{ijkl}\) of the 4th order tensor \(\mathcal{A}^{(4)}\) from the coefficients of the Gram matrix \(\mathbf{G}\), by using Eq. (37), which equates \(D(\mathbf{g})\), the multi-linear form of \(\mathcal{A}^{(4)}\), to the quadratic form of the Gram matrix. We use a mapping very similar to the one presented in [5, 23], where the inverse mapping, i.e. \(\mathbf{G}\) in terms of \(A_{ijkl}\), is given by:

$$\displaystyle\begin{array}{rcl} \mathbf{G} = \left (\begin{array}{cccccc} A_{\mathit{xxxx}} & a & b & \frac{1} {4}A_{\mathit{xxxy}} & \frac{1} {4}A_{\mathit{xxxz}} & d \\ a & A_{\mathit{yyyy}} & c & \frac{1} {4}A_{\mathit{yyxy}} & e & \frac{1} {4}A_{\mathit{yyyz}} \\ b & c & A_{\mathit{zzzz}} & f & \frac{1} {4}A_{\mathit{zzxz}} & \frac{1} {4}A_{\mathit{zzyz}} \\ \frac{1} {4}A_{\mathit{xxxy}} & \frac{1} {4}A_{\mathit{yyxy}} & f &\frac{1} {4}(A_{\mathit{xyxy}} - 2a)& \frac{1} {8}(A_{\mathit{xyxz}} - 4d) & \frac{1} {8}(A_{\mathit{xyyz}} - 4e) \\ \frac{1} {4}A_{\mathit{xxxz}} & e &\frac{1} {4}A_{\mathit{zzxz}} & \frac{1} {8}(A_{\mathit{xyxz}} - 4d) & \frac{1} {4}(A_{\mathit{xzxz}} - 2b) &\frac{1} {8}(A_{\mathit{xzyz}} - 4f) \\ d &\frac{1} {4}A_{\mathit{yyyz}} & \frac{1} {4}A_{\mathit{zzyz}} & \frac{1} {8}(A_{\mathit{xyyz}} - 4e) &\frac{1} {8}(A_{\mathit{xzyz}} - 4f)& \frac{1} {4}(A_{\mathit{yzyz}} - 2c) \end{array} \right ),& &{}\end{array}$$
(47)

where {a, b, c, d, e, f} are six free parameters that determine the rank of the matrix. In this case, since the rank of \(\mathbf{G}\) is known to be three, the free parameters are determined from the construction of the Gram matrix, i.e. \(\mathbf{G} = \mathbf{C}\mathbf{C}^{T}\). Therefore these can be used to compute the coefficients \(A_{ijkl}\).
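The passage from a Gram matrix to the 15 polynomial coefficients can be sketched by expanding \(\mathbf{v}^{T}\mathbf{G}\mathbf{v}\) monomial by monomial. The sketch assumes the unscaled monomial vector of Eq. (35) and returns the coefficients \(D_{m,n,p}\) of Eq. (28); it does not attempt to reproduce the exact scaling convention of Eq. (47).

```python
import numpy as np
from collections import defaultdict

# Exponents (m, n, p) of the entries of v = [g1^2, g2^2, g3^2, g1g2, g1g3, g2g3].
V_EXP = [(2, 0, 0), (0, 2, 0), (0, 0, 2), (1, 1, 0), (1, 0, 1), (0, 1, 1)]

def quartic_coefficients(G):
    """Collect v^T G v into {(m, n, p): coefficient of g1^m g2^n g3^p}."""
    coeffs = defaultdict(float)
    for i, ei in enumerate(V_EXP):
        for j, ej in enumerate(V_EXP):
            e = tuple(a + b for a, b in zip(ei, ej))
            coeffs[e] += G[i, j]
    return dict(coeffs)

def evaluate(coeffs, g):
    """Evaluate the quartic from its monomial coefficients."""
    return sum(c * g[0] ** m * g[1] ** n * g[2] ** p for (m, n, p), c in coeffs.items())

rng = np.random.default_rng(5)
C = rng.normal(size=(6, 3))
G = C @ C.T                                    # rank-3 PSD Gram matrix
coeffs = quartic_coefficients(G)

# Shifting weight between G[0, 1] and G[3, 3] changes G but not the polynomial:
# both entries contribute to the g1^2 g2^2 coefficient. This is one of the six
# free parameters (21 entries of a symmetric 6x6 matrix minus 15 coefficients).
G_alt = G.copy()
G_alt[0, 1] += 0.3
G_alt[1, 0] += 0.3
G_alt[3, 3] -= 0.6
coeffs_alt = quartic_coefficients(G_alt)
```

Different Gram matrices can thus represent the same quartic; only the particular \(\mathbf{G} = \mathbf{C}\mathbf{C}^{T}\) built from the estimated quadratic forms is guaranteed to be PSD with rank at most three.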

In comparison to the approach in [6], since we estimate all the coefficients of the three quadratic forms without any constraints, in effect we estimate 18 unknowns from which we recover the 15 unknowns of the 4th order diffusion tensor. This actually leaves us three degrees of freedom that can be applied as suitable constraints. Also this approach doesn’t distinguish between C and −C. However, since we only deal with the estimation problem of the 4th order diffusion tensor, this isn’t important, since both C and −C give the same Gram matrix, and hence the same 4th order tensor. But if such were desired, the three degrees of freedom could be explored, to distinguish between C and −C.

4 Experiments and Results

4.1 Synthetic Dataset

We conduct experiments on three datasets. First we consider a synthetic dataset based on a multi-tensor model (to represent multi-fiber crossings). For a single fiber profile we use the diagonal tensor \(\mathbf{D} = \mathrm{diag}(1700, 300, 300) \times 10^{-6}\ \mathrm{mm}^{2}/\mathrm{s}\) and generate synthetic signals at a b-value of 3,000 s/mm². We estimate 4th order HOTs using the Riemannian and the “Ternary Quartic” (TQ) approaches and plot their ADCs. Further, since the maxima of the ADCs don’t correspond to the fiber directions, we also compute the diffusion ensemble average propagators (EAPs) \(P(\mathbf{r}) =\int (S_{i}(\mathbf{q})/S_{0})\exp \left (-2\pi i\mathbf{q}^{T}\mathbf{r}\right )d\mathbf{q}\), from the estimated 4th order tensors [26, 27].
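A multi-tensor signal of this kind can be generated with a few lines. The sketch reads the single-fiber tensor as diag(1700, 300, 300) × 10⁻⁶ mm²/s; the 90° crossing angle, the equal weights and the absence of noise are assumptions for illustration, since the chapter does not specify them here.

```python
import numpy as np

D_FIBER = np.diag([1700.0, 300.0, 300.0]) * 1e-6   # mm^2/s, fiber along x
B_VALUE = 3000.0                                   # s/mm^2

def rotation_z(theta):
    """Rotation by theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def multi_tensor_signal(gradients, tensors, weights, b):
    """S/S0 = sum_k w_k exp(-b g^T D_k g) for each unit gradient g."""
    quad = np.einsum("ni,kij,nj->nk", gradients, tensors, gradients)
    return np.exp(-b * quad) @ weights

# Assumed configuration: two equally weighted fibers crossing at 90 degrees.
R = rotation_z(np.pi / 2)
tensors = np.stack([D_FIBER, R @ D_FIBER @ R.T])
weights = np.array([0.5, 0.5])

rng = np.random.default_rng(6)
gradients = rng.normal(size=(81, 3))
gradients /= np.linalg.norm(gradients, axis=1, keepdims=True)
signal = multi_tensor_signal(gradients, tensors, weights, B_VALUE)
```

Noise (for example Rician) would normally be added to `signal` before estimation; it is left out to keep the sketch short.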

We visually compare the Riemannian approach, which guarantees a positive definite diffusion profile but solves a more constrained problem, to the Ternary Quartic approach, which guarantees only a positive semi-definite diffusion profile but solves the problem in the correct space. The diffusion profiles of the estimated 4th order GDTI tensors and the EAPs computed thereof are presented in Fig. 1. We notice that the ADCs and the EAPs of the Ternary Quartic approach are somewhat sharper than the Riemannian counterparts. We surmise that this is due to the fact that the Riemannian approach cannot estimate certain types of 4th order tensors that can have non-negative diffusion profiles, since these tensors require a semi-definite representation in the symmetric 6D 2nd order tensor formulation used by the Riemannian estimation. Such semi-definite 6D 2nd order tensors are, however, pushed to an infinite distance from the estimation tensor by the Riemannian metric. Nonetheless, the overall angular structure of the two methods remains comparable.

Fig. 1

Synthetic dataset. Comparing the diffusion profiles and the EAPs from the Riemannian approach and the Ternary Quartic approach. (a) ADC Riemannian. (b) ADC Ternary Quartic. (c) EAP Riemannian. (d) EAP Ternary Quartic. The Riemannian approach guarantees positive diffusion, but solves a more constrained problem. The Ternary Quartic approach guarantees only a positive semi-definite diffusion, but solves the problem in the correct space

4.2 Biological Phantom Dataset

Next we conduct an experiment on a biological phantom dataset produced from excised rat spinal cords. The biological phantom [28] was created at the McConnell Brain Imaging Center (BIC), McGill University, Montréal, Canada, from two excised Sprague-Dawley rat spinal cords embedded in 2 % agar and arranged to form a fiber crossing configuration with known physical directions. MR images were acquired on a 1.5T Sonata MR scanner using a knee coil. The acquisition was done with a single-shot spin-echo planar sequence with twice-refocused balanced gradients, designed to reduce eddy current effects. The dataset was acquired with 90 gradient directions, on a single q-shell with a b-value of 3,000 s/mm², \(q = 0.35\ \upmu \mathrm{m}^{-1}\), TR = 6.4 s, TE = 110 ms, FOV 360 × 360 mm², 128 × 128 matrix, 2.8 mm isotropic voxels and four signal averages per direction. The SNR of the \(S_{0}\) image was estimated to be approximately 70 for the averaged phantom, and around 10 for the cord at a b-value of 3,000 s/mm².

In this experiment we estimate 4th order GDTI diffusion tensors from the phantom dataset using both the Riemannian approach and the TQ approach. We then compute the EAPs from the tensors using the methods in [26, 27] to validate the coherence of their geometry with the known layout of the phantom and to see if it is possible to infer the underlying fiber bundle directions. For the sake of comparison we also present the result of the orientation distribution function (ODF) computed from the analytical q-ball estimation technique in [29], which is an angular marginal distribution of the true and unknown EAP under a mono-exponential decay model that corresponds to the GDTI model. The ODFs were directly estimated from the signal.

Fig. 2

Biological phantom dataset. (a) The layout of the phantom created using two excised rat spinal cords. (b) ODFs estimated from the signal as reference geometry. (c) EAPs computed from 4th order tensors estimated using the Riemannian approach. (d) EAPs computed from 4th order tensors estimated using the Ternary Quartic approach. The EAPs were evaluated at the constant probability radius of \(\vert \mathbf{r}\vert = 17\ \upmu \mathrm{m}\)

The results are presented in Fig. 2. The geometry of the EAPs computed from the 4th order tensors estimated using both methods is coherent with the underlying phantom model, and also agrees with the geometry of the ODFs. It is interesting to note that since the ODFs are angular marginal distributions of the true EAPs, the radial information of the true EAPs has been marginalized out by a radial integration. Therefore, although the ODFs’ angular structures resemble the angular structures of the EAPs computed from the 4th order tensors, the ODFs do not reveal anything about the magnitude of diffusion due to the heterogeneous structure of the underlying tissue. In the EAPs computed from the tensors, this magnitude is visible in the size or volume of the displacement probability at a constant displacement radius. Also, by comparing (c) and (d) in Fig. 2, again the EAPs from the TQ method look sharper than the Riemannian counterparts.

4.3 In Vivo Human Dataset

Finally we conduct experiments on an in vivo human cerebral dataset. This dataset was acquired on a 1.5T scanner using 41 gradient directions, with a b-value of 700 s/mm², TR = 1.9 s, TE = 93.2 ms, 128 × 128 image matrix, 60 slices, and voxel dimensions of 1.875 × 1.875 × 2 mm. This dataset is from a public HARDI database that can be found in [30].

We consider two regions of interest (ROIs) with 249,352 and 987 voxels respectively. For the 249,352 voxels we compute the diffusion profiles of the tensors and test for positive/non-negative diffusion along 81 directions distributed evenly on a hemisphere. For the 987 voxels we compare the estimation time of the methods, since the positivity constraint implies increased computational complexity.
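The positivity test can be sketched as a count of sampled directions along which the diffusion profile is negative. The chapter does not say how the 81 directions were generated, so the Fibonacci-spiral sampling below is an assumed scheme, and the function names are hypothetical.

```python
import numpy as np

def hemisphere_directions(n=81):
    """Roughly evenly spread unit vectors on the upper hemisphere (assumed scheme)."""
    i = np.arange(n) + 0.5
    z = i / n                                  # heights in (0, 1)
    phi = np.pi * (1.0 + 5.0 ** 0.5) * i       # golden-ratio azimuth increments
    r = np.sqrt(1.0 - z ** 2)
    return np.column_stack([r * np.cos(phi), r * np.sin(phi), z])

def count_negative_directions(profile, dirs):
    """Number of directions g for which the diffusion profile D(g) is negative."""
    values = np.array([profile(g) for g in dirs])
    return int(np.sum(values < 0.0))

dirs = hemisphere_directions(81)

# Two toy profiles: one positive everywhere, one negative everywhere off the equator.
n_pos = count_negative_directions(lambda g: g[0] ** 4 + g[1] ** 4 + g[2] ** 4, dirs)
n_neg = count_negative_directions(lambda g: -g[2] ** 4, dirs)
```

A hemisphere suffices because an even order diffusion profile is antipodally symmetric, \(D(-\mathbf{g}) = D(\mathbf{g})\). A finite sample can of course only detect negative values, not prove their absence.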

For the positive/non-negative diffusion experiment, we test four approaches. First we consider the standard Euclidean least squares approach (LS). Then we also test a method based on spherical harmonics (SH). Since SHs of the same rank are bijective to Cartesian tensors of the same order, we first estimate real and symmetric SHs of rank 4 from the signal and then transform them to the tensor basis to obtain 4th order HOTs. And finally we consider the two proposed methods of this chapter, namely the Riemannian approach and the TQ approach. The LS approach and the SH to HOT approach don’t consider any constraints, although the SH approach includes Laplace-Beltrami regularization [31] to account for some signal noise.
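For reference, the unconstrained LS estimate is a single linear solve on the linearized signal. The sketch below fits the 15 monomial coefficients \(D_{m,n,p}\) of Eq. (28) for k = 4; the function names and the ordering of the monomials are our own choices.

```python
import numpy as np

# The 15 exponent triples (m, n, p) with m + n + p = 4.
EXPONENTS = [(m, n, 4 - m - n) for m in range(5) for n in range(5 - m)]

def monomial_matrix(gradients):
    """Design matrix with one row of quartic monomials per gradient direction."""
    g = np.asarray(gradients, dtype=float)
    return np.column_stack([g[:, 0] ** m * g[:, 1] ** n * g[:, 2] ** p
                            for m, n, p in EXPONENTS])

def fit_ls(gradients, signal_ratio, b):
    """Unconstrained LS: solve M d = -(1/b) ln(S/S0) for the 15 coefficients."""
    M = monomial_matrix(gradients)
    adc = -np.log(signal_ratio) / b
    d, *_ = np.linalg.lstsq(M, adc, rcond=None)
    return d
```

Nothing in this solve constrains the sign of the fitted profile, which is why LS estimates from noisy signals can have negative diffusion directions.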

Table 1 Real dataset. The estimated diffusion functions from 249,352 4th order GDTI tensors, checked for a positive diffusion profile on a set of 81 pairs of directions distributed evenly on a sphere. The Ternary Quartic and the Riemannian approaches are the only methods that guarantee positive diffusion

The results of this experiment are displayed in Table 1. The Riemannian (RM) approach and the TQ approach are the only two that estimate 4th order diffusion tensors with positive diffusion profiles. The LS approach, as known, estimates tensors with lots of negative diffusion directions. Although the SH to HOT method includes regularization, clearly that is insufficient to guarantee positive diffusivity. Positive diffusivity is only achieved when it is applied explicitly by either the Riemannian approach or the TQ approach. In this experiment, we also tested for zero diffusion and found that both the Riemannian method and the TQ method always estimated tensors with strict positive diffusion profiles. Although the TQ method only applies a non-negative constraint, clearly due to numerical computations it is highly improbable to estimate tensors with exactly zero diffusion.

Although the positivity constraint, applied using either the Riemannian approach or the TQ approach, clearly performs well, it also implies an added computational load. To get an idea of the additional computational complexity, we compare the estimation time of the two approaches (Riemannian and TQ) with the standard, linear LS approach on an ROI of the in vivo dataset with 987 voxels. The estimation times are displayed in Table 2. The computations were conducted on a Dell D630 Latitude laptop with Intel(R) Core(TM)2 Duo CPU @ 2.20 GHz and 2 GB RAM. The linearity and efficiency of the LS method is in fact one of its main supporting factors. However, the increased estimation time due to the complexity of the positivity constraint is still tractable.

Table 2 Real dataset. Comparison of the time for estimating 987 4th order diffusion tensors that are visualized in Fig. 3.

Fig. 3

In-vivo human cerebral dataset. Effects of the non-negative and the positive definite constraints that are guaranteed by the Ternary Quartic approach and the Riemannian approach are evaluated on the EAPs computed from the estimated tensors. EAPs computed from tensors estimated using the Euclidean LS approach, which doesn’t consider any constraints, are shown for comparison. No spatial regularization was used. The improvement in the results is only due to the non-negativity constraints

Finally, we conclude the experiments by computing the EAPs from tensors estimated using both the Riemannian method and the TQ method from the in vivo human dataset (using [26, 27]). For comparison we include the EAPs computed from tensors estimated using the LS method (using [26, 27]). The results are presented in Fig. 3, where a region of interest on an axial slice is shown. What stands out prominently from Fig. 3 is the increased spatial regularity in the results of the Riemannian and the TQ methods when compared to the LS method, even though no spatial regularization was used. Only the positivity constraint was employed, using the two methods, for estimating the 4th order tensors. Clearly, the positivity constraint renders the estimation of the tensors much more robust to signal noise and improves the results. This indicates the importance of the positivity constraint.

5 Discussion and Conclusion

In this chapter we considered the problem of estimating 4th order diffusion tensors with a positivity constraint from the GDTI model. In GDTI, Cartesian tensors of order higher than two are used to attain greater accuracy in modeling complex shaped ADCs. GDTI HOTs of order k were assumed to be symmetric, since only their projections along vectors are used in the ADC modeling, and of even order, since negative diffusion is non-physical (the profile of an odd order tensor changes sign when \(\mathbf{g}\) is replaced by \(-\mathbf{g}\)). However, in spite of this design, the standard method for estimating GDTI HOTs from the signal, namely the least squares approach, does not guarantee an estimated HOT with a positive diffusion profile. Least squares estimation, although linear and efficient, can result in HOTs with negative diffusion profiles.
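
The unconstrained least squares step can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: it assumes the GDTI signal has the form \(S = S_{0}\exp (-b\,D(\mathbf{g}))\), by analogy with Eq. (1), and writes the 4th order profile \(D(\mathbf{g})\) in a basis of the 15 quartic monomials. The names `EXPONENTS`, `design_matrix`, `fit_ls` and `min_profile` are hypothetical.

```python
import numpy as np

# Exponent triples (i, j, k) with i + j + k = 4: the 15 monomials of a
# homogeneous quartic in three variables (assumed basis for the profile).
EXPONENTS = [(i, j, 4 - i - j) for i in range(5) for j in range(5 - i)]


def design_matrix(g):
    """Evaluate the 15 quartic monomials at each direction in g (shape (n, 3))."""
    g = np.asarray(g, dtype=float)
    return np.stack(
        [g[:, 0] ** i * g[:, 1] ** j * g[:, 2] ** k for i, j, k in EXPONENTS],
        axis=1,
    )


def fit_ls(signal, s0, b, g):
    """Unconstrained linear LS fit of the quartic coefficients (hypothetical helper)."""
    # Log-linearize the assumed signal model: D(g) = -log(S / S0) / b.
    adc = -np.log(np.asarray(signal, dtype=float) / s0) / b
    coeffs, *_ = np.linalg.lstsq(design_matrix(g), adc, rcond=None)
    return coeffs


def min_profile(coeffs, n=2000, seed=0):
    """Smallest profile value over n random unit directions (a sampled check only)."""
    rng = np.random.default_rng(seed)
    d = rng.normal(size=(n, 3))
    d /= np.linalg.norm(d, axis=1, keepdims=True)
    return float((design_matrix(d) @ coeffs).min())
```

Nothing in `fit_ls` restricts the sign of the fitted profile, which is the limitation discussed above. A negative value returned by `min_profile` flags a non-physical estimate, whereas a non-negative value obtained from sampling alone is not a proof of positivity.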

We reviewed two different approaches for estimating 4th order GDTI diffusion tensors, with positive diffusion profiles and non-negative diffusion profiles respectively. In the first method, we considered the algebra of 4th order tensors to map symmetric 3D 4th order tensors to symmetric 6D 2nd order tensors. We then applied the Riemannian framework for the space \(\mathcal{S}\mathit{ym}_{6}^{+}\) to estimate 4th order diffusion tensors with strictly positive, i.e. positive definite, diffusion profiles. In the second method, we considered the polynomial interpretation of the multi-linear form of HOTs to reformulate the problem of estimating a HOT as a problem of estimating a polynomial. In the case of 4th order diffusion tensors, we were able to use Hilbert’s theorem on non-negative ternary quartics to parameterize 4th order tensors as a sum of squares of quadratic forms. By estimating the coefficients of the quadratic forms, we were able to reconstruct 4th order diffusion tensors with non-negative diffusion profiles from the signal.
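
The sum of squares parameterization of the second method can be illustrated with a short sketch. It assumes three quadratic forms, as in Hilbert’s theorem on ternary quartics, and an ordering of the quadratic monomials chosen here for illustration. The names `monomials2`, `tq_profile` and `gram_matrix` are hypothetical, and the sketch only evaluates the profile; it does not perform the estimation.

```python
import numpy as np


def monomials2(g):
    """Quadratic monomials [x^2, y^2, z^2, xy, xz, yz] for each row of g (n, 3)."""
    g = np.asarray(g, dtype=float)
    x, y, z = g[:, 0], g[:, 1], g[:, 2]
    return np.stack([x * x, y * y, z * z, x * y, x * z, y * z], axis=1)


def tq_profile(q, g):
    """Profile D(g) as a sum of squares of three quadratic forms.

    q has shape (6, 3): each column holds the coefficients of one quadratic
    form in the monomial ordering used by monomials2 (assumed ordering).
    """
    forms = monomials2(g) @ q          # value of each quadratic form, shape (n, 3)
    return (forms ** 2).sum(axis=1)    # non-negative by construction


def gram_matrix(q):
    """6x6 positive semi-definite matrix q q^T, of rank at most three."""
    q = np.asarray(q, dtype=float)
    return q @ q.T
```

Because the profile is a sum of squares for any real `q`, an unconstrained search over the 18 entries of `q` can never produce a negative diffusion profile, although it can produce one that touches zero.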

The Riemannian method we proposed ensures a positive definite diffusion profile, but solves a problem more constrained than the one implied by the model. This can be understood from the fact that the 3D 4th order tensors were estimated in \(\mathcal{S}\mathit{ym}_{6}^{+}\), as 6D 2nd order tensors, which implies that the Riemannian method ensures that the multi-linear form of the 4th order tensor is positive definite for all symmetric 3D 2nd order tensors. However, the GDTI model only requires the multi-linear form of the 4th order tensor to be positive for symmetric 3D 2nd order tensors of rank one. Therefore, the Riemannian method ensures a positive diffusion profile, but its solution space is smaller than the true solution space.
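
A small worked example may make this gap concrete. The notation is introduced here for illustration only: \(\mathbf{A}\) denotes a totally symmetric 3D 4th order tensor with components \(A_{ijkl}\), and \(\mathbf{X}\) a symmetric 3D 2nd order tensor. The two conditions then read

$$\displaystyle{ \sum _{i,j,k,l}A_{ijkl}X_{ij}X_{kl} > 0\ \text{for all symmetric}\ \mathbf{X}\neq \mathbf{0},\qquad \sum _{i,j,k,l}A_{ijkl}g_{i}g_{j}g_{k}g_{l} > 0\ \text{for all}\ \vert \mathbf{g}\vert = 1, }$$

where the first is the condition enforced by the Riemannian method and the second, which is the first restricted to \(\mathbf{X} = \mathbf{g}\mathbf{g}^{T}\), is the one required by the GDTI model. Consider, for \(0 <\varepsilon < 1\), the profile

$$\displaystyle{ D(\mathbf{g}) = \left (g_{1}^{2} - g_{2}^{2}\right )^{2} + g_{3}^{4} +\varepsilon \left (g_{1}^{2} + g_{2}^{2} + g_{3}^{2}\right )^{2}, }$$

which satisfies \(D(\mathbf{g}) \geq \varepsilon > 0\) on the unit sphere. The coefficient of \(g_{1}^{2}g_{2}^{2}\) in \(D\) is \(2\varepsilon - 2\), and six index arrangements of the totally symmetric tensor contribute to this monomial, so that \(A_{1122} = A_{1212} = (\varepsilon - 1)/3\). For the rank two tensor \(\mathbf{X} = \mathbf{e}_{1}\mathbf{e}_{2}^{T} + \mathbf{e}_{2}\mathbf{e}_{1}^{T}\), only four terms of the first sum are non-zero, and

$$\displaystyle{ \sum _{i,j,k,l}A_{ijkl}X_{ij}X_{kl} = 4A_{1212} = \frac{4(\varepsilon - 1)} {3} < 0. }$$

Under these assumptions, this strictly positive profile satisfies the GDTI requirement but not the condition enforced over all symmetric \(\mathbf{X}\), and therefore lies outside the solution space of the Riemannian method.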

The second method we proposed, the Ternary Quartic method, solves the problem in the correct space owing to the appropriate polynomial parameterization. However, since the known polynomial results, i.e. Hilbert’s theorem on ternary quartics, only guarantee non-negativity, this method addresses the theoretically weaker problem of a positive semi-definite diffusion profile. On the other hand, this method uses the Euclidean metric, which, as has been suggested in [7], is perhaps better suited for computing with diffusion data than the affine invariant Riemannian metric.

From the implementation and the results, we found the shape of the ADCs and EAPs computed from tensors estimated using the Riemannian method to be similar to that of the ADCs and EAPs computed from tensors estimated using the Ternary Quartic method. We did, however, observe a swelling in the shapes of the tensors estimated using the Riemannian method, which we suspect was the result of the over-constraint. A more detailed analysis is therefore necessary to identify the sub-space spanned by the Riemannian approach, and to quantify the impact of this sub-space on the estimated results. Finally, in the tests for negative diffusion profiles, we never came across zero diffusion from tensors estimated using the Ternary Quartic method, which is probably due to the numerical nature of the computations. We therefore concluded that the concession of the weaker non-negativity constraint is negligible in practice.

We conducted tests on a biological phantom with a known layout to evaluate whether it was possible to infer the underlying fiber directions from the geometry of the EAPs computed from the tensors estimated using the two approaches. Our experiments indicated that this could be answered in the affirmative and that the geometry of the EAPs computed from the tensors estimated using the Riemannian framework and the Ternary Quartic approach could reveal the underlying fiber directions. We also experimented on in-vivo human cerebral data using both the Riemannian framework and the Ternary Quartic approach to motivate the need for a positive or non-negative diffusion profile constraint. The experiments clearly indicated the gains of applying such constraints. Finally, we also presented the computation time to evaluate the increased complexity, and found this to be tractable.