1 Introduction

The analysis of three-dimensional images has become increasingly important in recent years. In particular, new acquisition techniques in the medical and biological sciences produce enormous amounts of 3D data calling for automated analysis. In this article, we show how the harmonic analysis of the 3D rotation group offers a convenient framework for rotation covariant image processing and analysis.

A typical rotation covariant processing pipeline may be considered as a kind of image filter. An example is illustrated in Fig. 1. First, local image features are extracted. These features are then processed in order to gather, combine and generate relevant information. Finally, one or more resulting images are created. We call such a pipeline rotation covariant if the same filter, applied to a rotated version of the input image, leads to a rotated version of the output. Since Cartesian tensors are rotation covariant by definition, they are widely used as the basic tool to design covariant filters.

Up to now, most algorithms rely on “low”-order features from Cartesian tensors like local intensities, intensity gradients or second-order derivatives and their products. For example, consider a lesion detection/segmentation problem in a \(T_1\)-weighted magnetic resonance image. A typical approach would be to gather certain kinds of features for each voxel, for example from a Laplacian or a Gaussian pyramid, and to determine the distribution of such features in a healthy state. This distribution can then be used to estimate the probability that a given voxel contains a lesion. Instead of using zero-order features, like the Laplacian pyramid, higher-order features, such as the smoothed intensity gradient magnitudes (first-order features) or the eigenstructure of the Hessian matrix and the structure tensor (second-order features), can improve the performance. However, features of order 3 and beyond are rarely used because it is not trivial to identify linearly independent higher-order features [105, 118]. The redundancies in the Cartesian tensor representation are cumbersome and hard to handle operationally [105].

This article proposes a unified framework based on spherical tensors, which incorporates higher-order features in a systematic way. Spherical tensors are a common, exhaustively studied object in angular momentum theory. However, they are, so far, rarely used in the 3D image processing community. A reason for this might be that, in contrast to Cartesian tensors, the Cartesian directional information of spherical tensors is obscured by a complex-valued unitary coordinate transform between the Cartesian and the spherical representation. Moreover, unlike tensors in Cartesian space, which can simply be extended to arbitrary dimensions, spherical tensors are strongly connected to the representation of the 3D rotation group and thus only exist in 3D space.

Fig. 1

The typical pipeline consists of three steps: (1) feature extraction: local image features are represented in an angular-dependent manner in terms of spherical tensors. (2) Computations are performed in the spherical tensor domain (here an anisotropic smoothing; we see the crossing in the center). (3) The results are transformed back into an interpretable image

The main difference between spherical and 3D Cartesian tensors is that spherical tensors have a fixed number of indices regardless of their order. In the Cartesian representation, the number of indices is determined by the order of the tensor: A Cartesian tensor of order n has n indices, each of fixed dimension 3. In the spherical representation, there is only one index. With growing order, the dimension of this index grows as well: A spherical tensor of order n has one index of dimension \(2n+1\). From an algorithmic point of view, this property is a strong advantage. We can always deal with high-order tensors in the same way as with low-order tensors. This eases an optimized implementation.

Cartesian tensors, with their many indices, are reducible in the sense that indices can be folded together to form other indices, which still have a valid rotation behavior. An example is the trace: It eliminates two indices of a Cartesian tensor, and the result is a new tensor with a reduced number of indices. For instance, the rotation behavior of the trace of a second-order Cartesian tensor is trivial: It is the identity transformation.

For spherical tensors, since they only have one index, such operations, like the trace, do not exist. There exists no linear transformation (apart from the orthogonal/unitary ones) that can qualitatively change the rotation behavior of a spherical tensor. Spherical tensors are called irreducible. A consequence of the irreducibility is that spherical tensors are a kind of Fourier coefficient of spherical functions. Depending on their order, they are associated with attributes like rotation symmetry, sharpness and richness in detail, but also with attributes like noise or less relevant (high-frequency) details, the typical characteristics of image Fourier coefficients. We discuss the details later in this article.

In summary, compared with ordinary Cartesian tensor analysis, the algorithms and their handling are operationally much clearer for spherical tensors. The combinatorial issues arising with Cartesian tensors are eliminated by the group representation theory of 3D rotations, though the involved representation theory is not easily accessible for the non-experienced. However, it allows the creation of efficient algorithms, particularly when higher-order tensors are involved.

In this work, we propose an open-source toolbox which covers all basic operations involved in spherical tensor algebra. The focus of the toolbox lies on feature and object detection in large volumetric, biomedical images, and on the processing of tensor-valued images like diffusion-weighted MRI. The toolbox is written in C and has a C++ and MATLAB/Octave API. A parallel CPU and GPU implementation is available as well. A repository with the source code is publicly available [98].

The article is divided into five sections. First, in Sect. 2, we introduce the basic theoretical concepts. We show the relationship between spherical and ordinary Cartesian tensors. We introduce the notion and properties of spherical tensors and address their relation to Fourier analysis.

In Sect. 3, we introduce orientation and spherical tensor fields. We propose two fundamental operations: spherical products and spherical derivatives, both important for image feature extraction and image filtering. Further, we introduce tensor-valued basis functions for the efficient computation of rotation covariant and invariant features. In focus are a Gauss–Laguerre basis and a Gabor basis. Both are known to be important in pattern analysis [13, 16, 24, 27, 39, 40, 47, 56, 57, 63, 66, 67, 69, 74, 77, 89, 92, 104, 118, 120].

Section 3 comes with two novel contributions. (1) We transfer knowledge about symmetries of angular momentum states from angular momentum theory to a feature description problem: We show how to avoid redundancies in spherical bi-spectra using a set of associativity rules in tensor products; to our knowledge, this problem has not been addressed before. Since this saves both memory and computation time, it is, from an application point of view, an important aspect. (2) We also show how the diffusion equation in the position/orientation space can be efficiently solved via STA. Details about this topic have been presented in a technical report, which is available online [87].

Implementation concepts and implementation details are explained in Sect. 4. Finally, Sect. 5 reviews applications of STA and, in the case of steerable deconvolution [86], extends existing algorithms from 2D to 3D. In this section, we give implementation examples, which can directly serve as a skeleton for biologically or medically relevant feature detection or image processing tasks.

1.1 Related Work

In two dimensions, the representation of orientation and tensor fields in terms of circular harmonics (or, the irreducible representations of SO(2)) is relatively simple and quite frequent in the literature [25, 45, 53,54,55, 64, 79, 80, 96, 119]. Complex calculus offers a well-founded background: The ordinary Cartesian partial derivatives \(\partial _x,\partial _y\) are replaced by the complex derivatives \(\partial _z = (\partial _x - {\mathbf {i}}\partial _y)/2\) and \(\partial _{ \overline{z} } = (\partial _x + {\mathbf {i}}\partial _y)/2\). In [82, 103], three-dimensional derivative operators are introduced that behave similarly to complex derivatives, that is, they are compliant with the rotation behavior of spherical harmonics in 3D. In Refs. [22, 23], the Fourier transform of SE(3) is used in the context of engineering applications. For the efficient computation of the SE(3)-convolution, functions are expressed in terms of the unitary irreducible representations (UiR) of SE(3). In the context of line and contour enhancement in 2D, there are various works about orientation fields [30, 31, 113, 114], which can be used to set up a scale-space theory. More recently, extensions of these concepts to 3D appeared [29]. While the applications in 2D are typically related to feature detection and image enhancement, the 3D extension offers a new application field: the processing of diffusion-weighted magnetic resonance images (DWI). In DWI, the acquired measurements are already functions on \({\mathbb {R}}^3\times S_2\). Based on the directional dependency of water diffusivity in the fibrous tissue of the human brain, it is possible to reveal underlying connectivity information. One of the main challenges in DWI is the estimation of so-called fiber/diffusion orientation distributions. There are numerous methods for estimating orientation distributions: classical Q-ball imaging [111], constrained spherical deconvolution [108], proper probability density estimation [2, 11, 18, 109] and spatially regularized density estimations for tensor-valued images [9, 17, 49, 84, 90, 110]. Most of the employed algorithms rely on a tensorial or spherical harmonic representation of the orientation distributions. However, most of the algorithms for orientation distribution estimation that consider the local surrounding of a voxel, i.e., using inter-voxel information, rely on a discretization of the two-sphere [10, 28, 29, 84].

The work on classical, rotation invariant 3D features, like 3D extensions of SIFT and SURF, is limited in scope and focused on solving point matching and registration problems. References [20, 21] have proposed the N-D SIFT descriptor, which has been used for the co-registration of volumetric medical 3D and 3D+time images. This includes the 3D-SIFT descriptor of [4]. The proposed 3D-SIFT descriptors have also been used for the registration of volumetric spectral OCT images of the retina [73], or the co-registration of 3D+time CT scans of lung vessel trees [75, 76]. Further applications on object recognition include the scanning of volumetric CT scans of bags in airports for security reasons [38] and a voting-based classification of objects in volumetric images [65] based on 3D SURF. In contrast, the usage of STA for rotation invariant feature detection has increased remarkably in recent years; see, for instance, [34,35,36,37, 59, 60, 70, 81, 91, 93, 94, 99, 102, 103]. We will introduce examples later in the application section.

2 From Cartesian to Spherical Tensors

Cartesian tensors are often used to describe spatial properties of physical systems. In image analysis, Cartesian tensors are an indispensable tool for representing image characteristics with respect to the Cartesian coordinate system. Typical 3D Cartesian tensors are image gradients, local Hessian matrices or structure tensors [3, 12, 41, 52]. 3D Cartesian tensors clearly exhibit directional information with respect to the Cartesian coordinates. This is particularly true for low-rank Cartesian tensors or tensors with certain symmetries. For instance, the eigensystem of a Hessian matrix directly represents the local main image curvatures in terms of directions and magnitudes, which are widely used features for the numerical computation of maxima of lower-order steerable image filters [3, 5, 42].

Every Cartesian tensor is associated with an order \(n\in {\mathbb {N}}_{0}\). A Cartesian tensor \(\mathbf {T}^n\) in 3D of order n is a mathematical object with \(3^n\) independent values \(t^{(n)}_{j_1,\ldots ,j_n}\in {\mathbb {R}}\) with \(j_k \in \{0,1,2\}\).

We say that it has \(3^n\) degrees of freedom (DOF). An order 0 tensor is a scalar. Tensors of order one and two are often written as vectors or matrices, respectively:

$$\begin{aligned} \mathbf {T}^1=\begin{pmatrix} t_0\\ t_1\\ t_2 \end{pmatrix} \quad \text {and}\quad \mathbf {T}^2=\begin{pmatrix} t_{00} & t_{01} & t_{02}\\ t_{10} & t_{11} & t_{12}\\ t_{20} & t_{21} & t_{22} \end{pmatrix}. \end{aligned}$$
(1)

Let \(\mathbf {R}(g)\in {\mathbb {R}}^{3\times 3}\) be the standard representation of the rotation group. With g, we denote an element of the 3D rotation group SO(3). Just think of it as a given triple of Euler angles \((\theta ,\phi ,\psi )\). The characteristic of a Cartesian tensor is its behavior with respect to rotations. If a rotation acts on the numbers \(t^{(n)}_{j_1,\ldots ,j_n}\) in the specific way described below, we call the object a tensor. For first- and second-order tensors, these actions can be expressed in ordinary matrix calculus by

$$\begin{aligned}&(g \mathbf {T}^1):=\mathbf {R}(g)^T\mathbf {T}^1&\text { and} \\&(g \mathbf {T}^2):=\mathbf {R}(g)^T\mathbf {T}^2 \mathbf {R}(g) . \end{aligned}$$

Here \(g\mathbf {T}\) denotes the “action” of the rotation group. For the general case, we need index representations:

$$\begin{aligned} (g t^{(n)})_{i_1,\ldots ,i_n} = \sum _{j_1,\ldots ,j_n} R(g)_{i_1,j_1}\ldots R(g)_{i_n,j_n} t^{(n)}_{j_1,\ldots ,j_n}, \end{aligned}$$
(2)

that is, all components \(t^{(n)}_{j_1,\ldots ,j_n}\) do “mix” under a rotation.
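To make the rotation behavior concrete, the following minimal sketch (not part of the toolbox) checks it numerically with NumPy/SciPy. The Euler angles and test tensors are arbitrary example values; following the matrix expressions above, each index is contracted with \(\mathbf {R}(g)^T\).

```python
# A minimal sketch of the Cartesian rotation behavior using NumPy/SciPy.
# The Euler angles and the test tensors are arbitrary examples.
import numpy as np
from scipy.spatial.transform import Rotation

R = Rotation.from_euler("ZYZ", [0.3, 0.7, 1.1]).as_matrix()  # R(g)
Rt = R.T

T1 = np.random.rand(3)        # order-1 tensor (vector)
T2 = np.random.rand(3, 3)     # order-2 tensor (matrix)
T3 = np.random.rand(3, 3, 3)  # order-3 tensor

gT1 = Rt @ T1                 # (g T^1) = R(g)^T T^1
gT2 = Rt @ T2 @ R             # (g T^2) = R(g)^T T^2 R(g)

# General case, cf. (2): contract every index with one rotation factor.
gT3 = np.einsum("ia,jb,kc,abc->ijk", Rt, Rt, Rt, T3)
```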

In Cartesian, as well as in spherical tensor calculus, there are two basic operations that combine tensors or create new tensors: derivatives and products.

Differentiation is a natural way to map 3D image information to 3D Cartesian tensors. For instance, given an image \(I:{\mathbb {R}}^3 \rightarrow {\mathbb {R}}\), successively differentiating along the Cartesian X-, Y- and Z-axes creates an \((n+1)\)-th-order derivative tensor, given by

$$\begin{aligned} t^{(n+1)}_{i_1,\cdots ,i_{n+1}}:=\textstyle \frac{\partial }{\partial x_{i_1}} \cdots \frac{\partial }{\partial x_{i_{n+1}}} I . \end{aligned}$$
(3)

The results of (3) transform according to (2) and thus are 3D Cartesian tensors of order \((n+1)\).

Derivatives of different orders can be combined with tensor products to form features like the inner product of a gradient with itself (an indicator for the presence of edges), or the trace of a Hessian matrix (a measure for blob-like structures).

In Cartesian tensor calculus, several ways exist to combine tensors. The outer product (the Kronecker product) multiplies all elements of a tensor \(\mathbf {T}^n\) with the elements of a tensor \(\mathbf {T}^m\). This results in a new tensor of order \(n+m\). For example, the outer product \(t^{(2)}_{ij}:=t^{(1)}_{i}t^{(1)}_{j}\) creates a matrix out of two vectors.

There are two tensors which are fixed points under rotations: the symmetric delta tensor \(\delta _{ij}:=\delta (i-j)\), which corresponds to the identity matrix, and the antisymmetric epsilon tensor \(\epsilon _{ijk}\) (see Definition 6 in appendix). Due to their special rotation behavior, they can be used to build tensors out of existing ones. With the delta tensor, we can compute the sum over a pair of indices \((i,j)\). This operation is called a tensor contraction, or the trace of a tensor. It reduces the tensor order by 2. An example is the trace of a second-order tensor \(tr(\mathbf {T}^2):=\sum _{i,j}\delta _{i,j} t^{(2)}_{i,j}\). On the other hand, the combination of the product and the delta tensor increases the tensor rank by two. With \(t^{(2)}_{ij}:=\delta _{ij} t^{0}\), we obtain the second-order tensor \(\mathbf {T}^0 \mathbf {I}_{3\times 3}\) out of the zero-order tensor \(\mathbf {T}^0\), where \(\mathbf {I}_{3\times 3}\) is the 3D identity matrix. Note that increasing the order in this way “embeds” lower-order tensor information into higher-order tensors.

In 3D space, the cross product creates a vector orthogonal to two existing vectors. The so-called epsilon tensor (or Levi-Civita symbol) underlies the cross product. It is a traceless, antisymmetric tensor of order three. It can be used in a similar way to the delta tensor to contract tensor indices, or to increase the tensor rank. In terms of the epsilon tensor \(\epsilon _{ijk}\), the cross product is written as \(u^{(1)}_{i}:=\sum _{jk}\epsilon _{ijk} s^{(1)}_{j}t^{(1)}_{k}\). Or think of the matrix \(\mathbf {T}^2_\text {anti}=\begin{pmatrix} 0 & t_3 & -t_2\\ -t_3 & 0 & t_1\\ t_2 & -t_1 & 0 \end{pmatrix}\), which is the result of \(t^{(2)}_{jk}:=\sum _{i}\epsilon _{ijk} t^{(1)}_{i}\) and represents the matrix whose application is a cross product with the vector \(t^{(1)}\).

Similar to derivatives, tensor products can be used to successively create higher-order tensors. With an order one tensor as an example, we get

$$\begin{aligned} t^{(n+1)}_{i_1,\cdots ,i_{n},i_{n+1}}:=t^{(n)}_{i_1,\cdots ,i_{n}} t^{(1)}_{i_{n+1}}. \end{aligned}$$
(4)

With \(3^n\), the DOF grows drastically with increasing order n. For \(n=5\), for example, we already have a DOF of 243. However, in most applications, tensors, like the derivatives and tensors based on outer products as in (4) and (3), have symmetries of the form \(t^{(n)}_{i,j,\cdots }=t^{(n)}_{j,i,\cdots }\), or antisymmetries of the form \(t^{(n)}_{i,j,\cdots }=-t^{(n)}_{j,i,\cdots }\). So the actual DOF is often far lower than the maximum.

For general higher-order Cartesian tensors, however, it might be tricky to identify symmetries and redundancies, particularly when it comes to implementation and real data.

Whether a tensor has certain symmetries or not, and thus can be represented by a smaller number of components, depends on whether a tensor of order n contains vanishing lower-order tensor information. Let us consider a second-order tensor \(\mathbf {T}^2\) with a maximum of 9 DOF, which can be uniquely decomposed into three components:

$$\begin{aligned} \textstyle \mathbf {T}^2&= \underset{\mathbf {T}_{\text {tr}}^2}{\underbrace{({tr(\mathbf {T}^2)}/{3}) \mathbf {I}_{3\times 3}}}+ \underset{ \mathbf {T}_{\text {anti}}^2}{\underbrace{(\mathbf {T}^2-(\mathbf {T}^2)^T)/2}}\nonumber \\&+ \underset{\mathbf {T}_{\text {sym}}^2}{\underbrace{(\mathbf {T}^2+(\mathbf {T}^2)^T)/2-\mathbf {T}_{\text {tr}}^2}}~. \end{aligned}$$
(5)

The first component \(\mathbf {T}_{\text {tr}}^2\) represents the trace of \(\mathbf {T}^2\). Since it has only one DOF, it can be represented by a scalar; \(\mathbf {T}^2_{\text {anti}}\), an antisymmetric matrix with three DOF, can be written as a first-order tensor lifted by the epsilon tensor; and finally, \(\mathbf {T}_{\text {sym}}^2\) is a traceless symmetric matrix, a second-order tensor with only five DOF (DOF = 6 minus one for the vanishing trace: \(tr(\mathbf {T}_{\text {sym}}^2)=0\)). Under a rotation, the elements \(\mathbf {T}_{\text {tr}}^2\), \(\mathbf {T}^2_{\text {anti}}\) and \(\mathbf {T}_{\text {sym}}^2\) of \(\mathbf {T}^2\) do not mix and hence form invariant subspaces. An invariant subspace which cannot be decomposed further into even smaller invariant subspaces is called irreducible. This brings us now to spherical tensors, which are just representations of the tensor within these irreducible subspaces.
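As a quick numerical illustration of (5), the following NumPy sketch decomposes an arbitrary \(3\times 3\) example matrix into its three invariant parts:

```python
# A sketch of the decomposition (5); T is an arbitrary example tensor.
import numpy as np

T = np.random.rand(3, 3)

T_tr = (np.trace(T) / 3.0) * np.eye(3)  # trace part, 1 DOF
T_anti = (T - T.T) / 2.0                # antisymmetric part, 3 DOF
T_sym = (T + T.T) / 2.0 - T_tr          # traceless symmetric part, 5 DOF

assert np.allclose(T, T_tr + T_anti + T_sym)
assert np.isclose(np.trace(T_sym), 0.0)
# Under a rotation R, each part stays in its subspace: R.T @ T_anti @ R
# remains antisymmetric, and R.T @ T_sym @ R remains traceless symmetric.
```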

2.1 The Irreducible Spherical Tensors

With the decomposition into irreducible subspaces, we can separate independent components by their rotation behavior. This helps to decrease memory requirements due to eliminated redundancies and can lead to more efficiency.

As an example, consider a fully traceless, symmetric Cartesian tensor \(\mathbf {T}^2_\text {sym}\) of rank 2. Rotating a second-order tensor according to (2) requires \(3^2 \times 3 \times 2 = 54\) multiplications. However, we already know that such a tensor has just 5 DOF. The rotation acts in a fully linear manner on these five numbers. So there has to be a \(5\times 5\) matrix acting on these five numbers which accomplishes the same task with just \(5^2 = 25\) multiplications. This vectorized representation of the symmetric, traceless matrix is exactly its spherical tensor representation. In fact, this idea can be generalized to arbitrary order. For higher-order Cartesian tensors, the number of irreducible components becomes large quite quickly [15]: A general order five tensor can already be decomposed into up to 51 irreducible parts; see Table 1 in appendix. The family of these irreducible components is, just like in the order two case, the family of fully symmetric, traceless Cartesian tensors. The DOF of such a tensor of order n is only \(2n+1\) (making it symmetric reduces the DOF from \(3^n\) to \(\binom{n+2}{2}=(n+1)(n+2)/2\), and then removing all traces leads to \(2n+1\)). Figure 2 illustrates the rotation of the irreducible components in comparison with the reducible representation of an order two tensor. We refer to [6, 106] for further details.

Let us now be more explicit: A spherical tensor \(\mathbf {a}^j\) of order \(j\in {\mathbb {N}}_{0}\) is represented by a vector with \(2j+1\) complex-valued elements \((a^j_{-j},\cdots ,a^j_{j})^T\in {\mathbb {C}}^{2j+1}\). Note that due to the irreducibility, a contraction of tensors does not exist, which is reflected by the fact that there is just a single subindex m. Spherical tensors are, like their Cartesian counterparts, rotation covariant. Spherical tensors of order \(j\) are rotated by the so-called Wigner-D rotation matrices \(\mathbf {D}^j(g)\in {\mathbb {C}}^{(2j+1)\times (2j+1)}\) according to

$$\begin{aligned} (g \mathbf {a}^j) = \mathbf {D}^j(g) \mathbf {a}^j. \end{aligned}$$
(7)

The Wigner-D rotation matrices (or spherical group representations) constitute all irreducible unitary representations of the rotation group SO(3).

2.2 Clebsch–Gordan Coefficients

The explicit connection between spherical and Cartesian tensors is, except for the order one tensor, not trivial. Let \(\mathbf {S}\in {\mathbb {C}}^{3\times 3}\) be the unitary transformation that maps the ordinary rotation matrix to the first-order Wigner-D matrix, where

$$\begin{aligned} \mathbf {D}^1(g) = \mathbf {S} \mathbf {R}(g) \mathbf {S}^\top \text { and } \mathbf {S} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -{\mathbf {i}} & 0 \\ 0 & 0 & \sqrt{2} \\ -1 & -{\mathbf {i}} & 0 \end{pmatrix}. \end{aligned}$$
(8)

With the matrix \(\mathbf {S}\), we have a one-to-one mapping between Cartesian and spherical tensors of order one.
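A minimal sketch of this mapping: the code below builds \(\mathbf {S}\) from (8) and converts an order-one Cartesian tensor (e.g., an image gradient) to its spherical representation and back. The component ordering of the resulting spherical vector follows the rows of \(\mathbf {S}\) as printed in (8); conventions differ between references.

```python
# A sketch of the Cartesian <-> spherical mapping for order-1 tensors.
import numpy as np

S = (1.0 / np.sqrt(2.0)) * np.array([
    [ 1.0, -1.0j, 0.0],
    [ 0.0,  0.0,  np.sqrt(2.0)],
    [-1.0, -1.0j, 0.0],
])
assert np.allclose(S @ S.conj().T, np.eye(3))  # S is unitary

v_cart = np.array([1.0, 2.0, 3.0])  # e.g., a gradient (g_x, g_y, g_z)
a1 = S @ v_cart                     # spherical tensor of order 1
v_back = S.conj().T @ a1            # inverse mapping (unitarity of S)
assert np.allclose(v_back.real, v_cart)
```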

For higher orders, the connection is determined by the so-called Clebsch–Gordan coefficients. They form the basis of the representation theory of 3D rotations. The Clebsch–Gordan coefficients are combinatorial coefficients, which couple two group representations \(\mathbf {D}^{j_1}\) and \(\mathbf {D}^{j_2}\) to form a new representation. The general law is

$$\begin{aligned}&D^j_{mn} \nonumber \\&\quad =\sum _{\begin{array}{c} m_1+m_2 = m \\ n_1+n_2 = n \end{array}} D^{j_1}_{m_1 n_1}D^{j_2}_{m_2 n_2} \langle j m | j_1 m_1, j_2 m_2 \rangle \langle j n | j_1 n_1, j_2 n_2 \rangle . \end{aligned}$$
(9)

With \(\langle j m \mid j_1 m_1, j_2 m_2 \rangle \in {\mathbb {R}}\), we denote the Clebsch–Gordan coefficients; see Sect. 1 in appendix for further details.
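For experimentation outside the toolbox, the Clebsch–Gordan coefficients can be evaluated symbolically, for example with SymPy (a sketch; the toolbox ships its own optimized routines):

```python
# Evaluating <j m | j1 m1, j2 m2> with SymPy's CG class (a sketch).
from sympy import S
from sympy.physics.quantum.cg import CG

# <1 0 | 1 1, 1 -1>: coupling two order-1 tensors to an order-1 tensor
c = CG(S(1), S(1), S(1), S(-1), S(1), S(0)).doit()
print(c)  # sqrt(2)/2
```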

Fig. 2

Cartesian tensors (left) can be reduced into irreducible representations which do not mix with each other under rotations (right). Irreducible representations foster the development of efficient algorithms

Equations (8) and (9) implicitly define the connection between the Cartesian and spherical representations. For instance, let \(\mathbf {T}^2\) be a second-order Cartesian tensor. Further, for convenience, let \(\mathbf {T}^s = \mathbf {S} \mathbf {T}^2 \mathbf {S}^\top \). Then the components of the corresponding spherical tensors \(\mathbf {b}^j\), \(j=0,1,2\), are

$$\begin{aligned} b^j_m = \sum _{m_1+m_2=m} \langle 1 m_1, 1 m_2 | j m \rangle (-1)^{m_2}t^s_{m_1(-m_2)}, \end{aligned}$$
(10)

where \(\mathbf {b}^0\) corresponds to \(\mathbf {T}^2_{\text {tr}}\), \(\mathbf {b}^1\) to \(\mathbf {T}^2_{\text {anti}}\) and \(\mathbf {b}^2\) to \(\mathbf {T}^2_{\text {sym}}\); see (5). The inverse of this “Cartesian to spherical” transformation is

$$\begin{aligned} t^s_{m_1 m_2}= \sum _{j=0}^{2} \sum _{m=-j}^{j} \langle 1 m_1, 1 (-m_2) | j m \rangle (-1)^{m_2} b^j_m. \end{aligned}$$
(11)

In “Cartesian Tensors” section in appendix, we give the explicit expressions of (10) and (11).

2.3 Relation to Fourier Analysis

In contrast to Cartesian tensors, the directional information of spherical tensors is rather obscured. However, we can always interpret spherical tensors as expansion coefficients of a spherical Fourier expansion of a square-integrable orientation function \(f:S_2\rightarrow {\mathbb {C}}\), an angular-dependent function on the sphere. In contrast to the tensors themselves, such orientation functions can indeed be interpreted in an intuitive manner. They can be meaningfully visualized in 3D space in tandem with the original image. In STA, the design and interpretation of orientation functions f is, in our opinion, the central objective; the spherical tensors are just the tool to achieve these goals in a numerical manner. The Fourier expansion connecting the tensors with f is given by

$$\begin{aligned} f(\mathbf {n}):=\sum _{j=0}^L \sum _{m=-j}^{m=j} \overline{a^j_m} Y^j_m(\mathbf {n})=\sum _{j=0}^L (\mathbf {a}^j)^T \mathbf {Y}^j(\mathbf {n}), \end{aligned}$$
(12)

with \(L=\infty \). The vector \(\mathbf {n}\in S_2\) is a direction (unit) vector in 3D space. The \(\mathbf {Y}^j:S_2\rightarrow {\mathbb {C}}^{2j+1}\) are vectors of \(2j+1\) orthogonal spherical Fourier basis functions \(Y^j_m\) of order \(j\), the so-called spherical harmonics (see “Spherical Harmonics” section of Appendix for definition). Figure 3 visualizes spherical harmonics up to order 3.
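As a sketch, (12) can be evaluated with SciPy's spherical harmonics. Note that scipy.special.sph_harm uses the convention sph_harm(m, j, theta, phi) with theta the azimuthal and phi the polar angle; phase conventions may differ from the ones used in this article.

```python
# A sketch of the band-limited expansion (12) using SciPy.
import numpy as np
from scipy.special import sph_harm

def eval_orientation_function(coeffs, theta, phi):
    """coeffs[j] holds the order-j tensor (a^j_{-j}, ..., a^j_j)."""
    f = 0.0
    for j, a in enumerate(coeffs):
        for idx, m in enumerate(range(-j, j + 1)):
            f += np.conj(a[idx]) * sph_harm(m, j, theta, phi)
    return f

# example: L = 2 with random coefficients
rng = np.random.default_rng(0)
coeffs = [rng.standard_normal(2 * j + 1) + 1j * rng.standard_normal(2 * j + 1)
          for j in range(3)]
print(eval_orientation_function(coeffs, theta=0.3, phi=1.2))
```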

An example of an orientation function is the angular-dependent distribution of gradient directions in a local Gaussian-windowed neighborhood of a voxel. Typical examples are the structure tensor [12, 41, 52] or, with higher-order tensors, spherical histograms of oriented gradients (SHOG) [101]. Figure 4 depicts such local orientation functions. In this example, they represent the local gradient orientation distribution at different image locations on top of the corresponding image.

Fig. 3

Spherical harmonic expansion for functions on the sphere \(f:S_2\rightarrow {\mathbb {C}}\). The upper row visualizes the real parts of the harmonics, the bottom row the imaginary parts

Fig. 4

Orientation functions representing the local distribution of gradient orientations in a Gaussian neighborhood around a crossing. We show local expansions at each voxel, each up to order \(L=8\); see (12). For \(L=2\), the tensors of the expansion are the spherical counterparts of the 3D structure tensor. Only with higher orders do the orientation functions become sharp enough to capture the crossing in the center correctly. For better visualization, we have removed the mean (\(j=0\)) from the expansions

With (12), the spherical tensors gain some nice properties known from Cartesian Fourier analysis:

Symmetry and DOF Fourier coefficients encode the real and imaginary parts of complex-valued signals in a separable manner. As a consequence, the Fourier coefficients of real-valued functions \(f:S_2\rightarrow {\mathbb {R}}\) have the characteristic symmetry

$$\begin{aligned} a^j_{-m}=(-1)^{m}\overline{a^j_{m}}. \end{aligned}$$
(13)

We call the space of such tensors the “real” linear subspace \( V_j \). We call the orthogonal complement \({\mathbf {i}} V_j \) of \( V_j \) the “imaginary” subspace which fulfills \(a^j_m = (-1)^{m+1} \overline{a^j_{-m}} \). The space \({\mathbf {i}} V_j \) corresponds to the expansion coefficients of purely imaginary functions \(f:S_2\rightarrow {\mathbb {C}}\); see “Real and Imaginary Tensor Fields” section in Appendix for details.

As a result, in conformity with irreducible Cartesian tensors, the DOF of the “real” and “imaginary” spherical tensors of order \(j\) shrink from \(2 (2j+1)\) to \(2j+1\): the first \(j\) components are complex-valued, and the center component \(a^j_0\) is solely real or solely imaginary valued, respectively. Note that for any tensor \(\mathbf {a}^j \in V_j\), there exists the tensor \({\mathbf {i}}\mathbf {a}^j \in {\mathbf {i}} V_j \). Tensors in \( V_j \) can be associated with irreducible symmetric Cartesian tensors and tensors in \({\mathbf {i}} V_j \) with irreducible antisymmetric Cartesian tensors. Hence, an arbitrary irreducible symmetric Cartesian tensor can be expressed by an irreducible antisymmetric Cartesian tensor and vice versa.

Finite Signal Representation Fourier coefficients represent image details in a coarse-to-fine order. We obtain a finite representation of a function f with infinite support by cutting off higher-order frequencies. For this, we set L to a finite number in our applications.

Products The product of two Fourier coefficients is again a Fourier coefficient of an orientation function. We call these products spherical products. They will be discussed later in this manuscript.

Rotations In the Cartesian Fourier domain, cyclic translations along the Cartesian axes can be accomplished with a rotation (a phase shift) of the corresponding Fourier coefficients. The corresponding transformation for functions on the sphere is the rotation. An orientation function f can be rotated according to the coordinate transform \((gf)(\mathbf {n}):=f(\mathbf {R}(g)^T \mathbf {n})\). Similar to a cyclic translation, f can be rotated in the spherical Fourier domain by rotating the Fourier coefficients according to

$$\begin{aligned} (gf)(\mathbf {n}):=\sum _{j=0}^L (\mathbf {D}^j(g)^T\mathbf {a}^j)^T \mathbf {Y}^j(\mathbf {n}). \end{aligned}$$
(14)

That is, the coefficients of (gf) are the rotated coefficients of f. This relationship is illustrated in Fig. 5.

Fig. 5

A rotation of an orientation function in the spatial domain is accomplished with a coordinate transform. The corresponding operation in Fourier domain is a rotation of the expansion coefficients

Axially Symmetric Functions The spherical harmonic expansion of any axially symmetric orientation function of the form \(f(\theta ,\phi )=f(\phi )\) (aligned with the z-axis) has only scalar-valued expansion coefficients (all remaining tensor components vanish). The expansion simplifies to

$$\begin{aligned} f(\mathbf {n}):=\sum _{j=0}^L \overline{a^j} Y^j_0(\mathbf {n}). \end{aligned}$$
(15)

In Fig. 21, we show three examples with relevance for image processing applications.

Convolution and Correlation With spherical tensors as Fourier coefficients, spherical convolutions are just products between tensors. The spherical convolution between a spherical function f, with expansion coefficients \(\mathbf {a}^j\), and an axially symmetric orientation function \(f'\), with scalar-valued expansion coefficients \(b^j\) (see (15)), is the simple product

$$\begin{aligned} (f * f')(\mathbf {n}):= \sum _{j=0}^L (b^j\mathbf {a}^j)^T \mathbf {Y}^j(\mathbf {n}). \end{aligned}$$
(16)

3 Spherical Tensor Fields

For images, a tensor typically changes with the location within the image. We call a tensor which changes with respect to the position a tensor field, and the corresponding field of spherical functions (12) an orientation field. The creation, processing and interpretation of orientation fields in terms of spherical tensors is the basis of all algorithms introduced here. In this section, we first explore the rotation of orientation fields, the key property of rotation covariant algorithms. Then we introduce the theoretical background of two major operations of the toolbox: the spherical counterparts of tensor products (4) and derivatives (3). They are used to map images to spherical tensor fields and vice versa.

Orientation fields are functions \(f:{\mathbb {R}}^3\times S_2 \mapsto {\mathbb {C}}\) that assign to each point \(\mathbf {r}\in {\mathbb {R}}^3\) in 3-space an orientation function (12). We can write any square-integrable orientation field f according to (12) with respect to its second argument (the direction) as an expansion

$$\begin{aligned} f(\mathbf {r}, \mathbf {n}) = \sum _{j=0}^\infty \mathbf {a}^j(\mathbf {r})^\top \mathbf {Y}^j({\mathbf {n}}). \end{aligned}$$
(17)

The expansion coefficients \(\mathbf {a}^j(\mathbf {r})\) are tensors which vary with respect to their location in 3D space.

Any orientation field can be rotated with \((gf)(\mathbf {r},\mathbf {n}) := f(\mathbf {R}(g)^\top \mathbf {r},\mathbf {R}(g)^\top \mathbf {n})\) in a “classical” way. The first argument is a coordinate transformation, and the second argument rotates the local spherical function accordingly. With (14), the rotation can be accomplished in the Fourier domain according to

$$\begin{aligned} (gf) (\mathbf {r}, \mathbf {n}) = \sum _{j=0}^\infty \left( \mathbf {D}^j(g) \mathbf {a}^j\left( \mathbf {R}(g)^\top \mathbf {r} \right) \right) ^\top \mathbf {Y}^j({\mathbf {n}}). \end{aligned}$$
(18)

That is, if an orientation field f is rotated, the underlying expansion fields \(\mathbf {a}^j\) of expansion coefficients \(\mathbf {a}^j(\mathbf {r})\) undergo the transformation

$$\begin{aligned} \mathbf {a}^j(\mathbf {r}) \mapsto \left( \mathbf {D}^j(g) \mathbf {a}^j\left( \mathbf {R}(g)^\top \mathbf {r} \right) \right) . \end{aligned}$$
(19)

We will call any function with this kind of transformation a spherical tensor field of order j.

Definition 1

(Spherical Tensor Field) A function \(\mathbf {f}^j: {\mathbb {R}}^3 \mapsto {\mathbb {C}}^{2j+1}\) is called a spherical tensor field of order j if it transforms with respect to rotations as

$$\begin{aligned} (g \mathbf {f}^j)(\mathbf {r}) := \mathbf {D}^j(g) \mathbf {f}^j\left( \mathbf {R}(g)^\top \mathbf {r}\right) \end{aligned}$$
(20)

for all \(g\in SO(3)\). The space of all spherical tensor fields of rank j is denoted by \({\mathcal {T}}_j\).

In this context, it is important to note that an image \(I:{\mathbb {R}}^3 \rightarrow {\mathbb {R}}\) is a spherical tensor field of order 0.

3.1 Spherical Tensor Coupling

With the delta and epsilon tensors, there exist various ways to combine Cartesian tensors. Since Cartesian and spherical tensors can basically express the same quantities, counterparts of these products must exist in the spherical tensor domain as well. Thanks to the irreducibility, there exists neither an operator for tensor contraction (there exists no trace) nor a way to represent lower-order tensors in terms of higher-order tensors. As a consequence, there exists only one single inner product-like bilinear form in the spherical tensor domain. We call this operation the spherical product.

For example, both a Cartesian and a spherical tensor of order one are irreducible. In this case, there exists a one-to-one relation between the Cartesian and spherical products. Let \(\mathbf {U}^1\) and \(\mathbf {S}^1\) be two Cartesian tensors of order one. The inner product \(t^{(0)}=\sum _{ij}\delta _{ij}u^{(1)}_i s^{(1)}_j\), the cross product \(t^{(1)}_{i}=\sum _{jk} \epsilon _{ijk}u^{(1)}_j s^{(1)}_k\) and the traceless outer product \(t^{(2)}_{ij}=u^{(1)}_i s^{(1)}_j - \delta _{ij} t^{(0)}/3\) combine the two tensors and create new irreducible, Cartesian tensors of order zero, one and two.
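These three products are easily reproduced numerically; a small NumPy sketch with arbitrary example vectors:

```python
# The three irreducible Cartesian products of two order-1 tensors.
import numpy as np

u = np.array([1.0, 2.0, 3.0])   # arbitrary example vectors
s = np.array([0.5, -1.0, 2.0])

t0 = np.dot(u, s)                            # order 0: inner product
t1 = np.cross(u, s)                          # order 1: cross product
t2 = np.outer(u, s) - np.eye(3) * t0 / 3.0   # order 2: traceless outer product
assert np.isclose(np.trace(t2), 0.0)
```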

In the spherical tensor domain, with spherical tensors \(\mathbf {u}^1,\mathbf {s}^1\in V_1 \), the corresponding products can all be performed with one spherical product \((\mathbf {u}^1\circ _{\!j}\mathbf {s}^1)\), where \(j \in \{0,1,2\}\). It is defined by a family of bilinear forms:

Definition 2

(Spherical Products) For every \(j \ge 0\), we define a family of bilinear forms, a tensor product

$$\begin{aligned} \circ _{\!j} : {\mathbb {C}}^{2j_1+1} \times {\mathbb {C}}^{2j_2+1} \mapsto {\mathbb {C}}^{2j+1}~, \end{aligned}$$
(21)

where \(j_1,j_2 \in {\mathbb {N}}\) have to be chosen according to the triangle inequality \(|j_1 - j_2| \le j \le j_1 + j_2\). The product is defined by

$$\begin{aligned} (\mathbf {e}^j_m)^\top (\mathbf {v}\circ _{\!j}\mathbf {w}) := \sum _{m = m_1+m_2} \langle j m \mid j_1 m_1, j_2 m_2 \rangle v_{m_1} w_{m_2}~. \end{aligned}$$

With \(\mathbf {e}^j_m\), we denote the unit vectors in \({\mathbb {C}}^{2j+1}\), where \(e^j_{mn}=\delta (m-n)\).
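A direct, unoptimized sketch of Definition 2 in Python, with the Clebsch–Gordan coefficients taken from SymPy; tensors are stored as arrays indexed by \(m+j\) (an indexing convention we assume here for illustration, not the toolbox API):

```python
# A sketch of the spherical product of Definition 2 (slow, for checking).
import numpy as np
from sympy import S
from sympy.physics.quantum.cg import CG

def spherical_product(v, w, j):
    """Spherical product v o_j w; v, w indexed as v[m + j1], w[m + j2]."""
    j1, j2 = (len(v) - 1) // 2, (len(w) - 1) // 2
    assert abs(j1 - j2) <= j <= j1 + j2  # triangle inequality
    out = np.zeros(2 * j + 1, dtype=complex)
    for m in range(-j, j + 1):
        for m1 in range(-j1, j1 + 1):
            m2 = m - m1
            if abs(m2) <= j2:
                cg = float(CG(S(j1), S(m1), S(j2), S(m2), S(j), S(m)).doit())
                out[m + j] += cg * v[m1 + j1] * w[m2 + j2]
    return out
```

For \(j_1=j_2=1\), the products \(\circ _{\!0}\), \(\circ _{\!1}\) and \(\circ _{\!2}\) correspond, up to normalization, to the Cartesian inner product, cross product and traceless outer product from above.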

The spherical products are, like their Cartesian counterparts, rotation covariant.

Proposition 1

Let \(\mathbf {v} \in {\mathbb {C}}^{2j_1+1}\) and \(\mathbf {w} \in {\mathbb {C}}^{2j_2+1}\), then for any \(g\in SO(3)\)

$$\begin{aligned} (\mathbf {D}^{j_1}(g)\mathbf {v})\circ _{\!j}(\mathbf {D}^{j_2}(g)\mathbf {w}) = \mathbf {D}^{j}(g) (\mathbf {v}\circ _{\!j}\mathbf {w}) \end{aligned}$$
(22)

holds.

That is, it does not matter whether we first rotate the factors or the final product.

The spherical products are nonlinear transformations that combine whole spherical tensor fields in a point-by-point manner. For example, given two tensor fields \(\mathbf {v} \in {\mathcal {T}}_{j_1}\) and \(\mathbf {w} \in {\mathcal {T}}_{j_2}\) and j chosen such that \(|j_1 - j_2| \le j \le j_1 + j_2\), then \(( \mathbf {v}\circ _{\!j}\mathbf {w})(\mathbf {r}) := \mathbf {v}(\mathbf {r})\circ _{\!j}\mathbf {w}(\mathbf {r}) \) is in \({\mathcal {T}}_{j}\); it is a spherical tensor field of order j. Later in this article, products are used in convolutions as well. Convolutions can be used to map local image patches to spherical tensor-valued basis functions, an important step to generate local descriptors.

The spherical product naturally commutes between the “real” and “imaginary” spaces \( {V_{j}} \) and \({\mathbf {i}} {V_{j}}\). It reflects the parity by its (anti-)symmetry. The toolbox makes use of these properties in order to save memory and computation time.

Proposition 2

Let \(\mathbf {v} \in {V_{j_1}}\) and \(\mathbf {w} \in {V_{j_2}}\), then

$$\begin{aligned} j+j_1+j_2 \text { is even } &\Rightarrow \mathbf {v}\circ _{\!j}\mathbf {w} = \mathbf {w}\circ _{\!j}\mathbf {v} \in {V_{j}} \\ j+j_1+j_2 \text { is odd } &\Rightarrow \mathbf {v}\circ _{\!j}\mathbf {w} = -(\mathbf {w}\circ _{\!j}\mathbf {v}) \in {\mathbf {i}} {V_{j}}. \end{aligned}$$

With the spherical products, we can define a rotation covariant spatial convolution between tensor fields of different orders.

Definition 3

(Tensor Convolution) For two tensor fields \(\mathbf {v} \in {\mathcal {T}}_{j_1}\) and \(\mathbf {w} \in {\mathcal {T}}_{j_2}\), the operation

$$\begin{aligned} (\mathbf {v}{\widetilde{\circ }}_{\!j}\mathbf {w})(\mathbf {x}) : = \int _{{\mathbb {R}}^3} (\mathbf {v}(\mathbf {r})\circ _{\!j}\mathbf {w}(\mathbf {x}-\mathbf {r})) \hbox {d} \mathbf {r} \end{aligned}$$
(23)

defines the tensor convolution.

Proposition 3

(Convolution Theorem) The convolution theorem

$$\begin{aligned} (\mathbf {v}{\widetilde{\circ }}_{\!j}\mathbf {w})=\text {FT}^{-1}(\text {FT}(\mathbf {v})\circ _{\!j}\text {FT}(\mathbf {w})) \end{aligned}$$
(24)

holds for the tensor convolution. With \(\text {FT}\), we denote the ordinary spatial 3D Fourier transform which maps tensor fields to their Fourier counterpart in a component-by-component manner.
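A sketch of (23) computed via (24) with NumPy FFTs; tensor fields are stored here as complex arrays of shape \((2j+1, X, Y, Z)\), an assumption for this illustration. Note that np.fft computes circular convolutions, so appropriate zero-padding is needed for a linear convolution.

```python
# A sketch of the tensor convolution (23) via the convolution theorem (24).
import numpy as np
from sympy import S
from sympy.physics.quantum.cg import CG

def cg(j1, m1, j2, m2, j, m):
    return float(CG(S(j1), S(m1), S(j2), S(m2), S(j), S(m)).doit())

def tensor_convolve(v, w, j):
    """v, w: complex arrays of shape (2j1+1, X, Y, Z) and (2j2+1, X, Y, Z)."""
    j1, j2 = (v.shape[0] - 1) // 2, (w.shape[0] - 1) // 2
    V = np.fft.fftn(v, axes=(1, 2, 3))  # component-wise 3D FFT
    W = np.fft.fftn(w, axes=(1, 2, 3))
    P = np.zeros((2 * j + 1,) + v.shape[1:], dtype=complex)
    for m in range(-j, j + 1):          # voxel-wise spherical product
        for m1 in range(-j1, j1 + 1):
            m2 = m - m1
            if abs(m2) <= j2:
                P[m + j] += cg(j1, m1, j2, m2, j, m) * V[m1 + j1] * W[m2 + j2]
    return np.fft.ifftn(P, axes=(1, 2, 3))
```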

3.2 Covariant Feature Extraction

In many cases, we start with a scalar-valued input image, a spherical tensor field of order zero. An initial step is the rotation covariant feature extraction. This procedure maps local image patches to tensor-valued coefficients. In the Cartesian domain, the image derivatives (see (3)) often build the basis for covariant or invariant feature extraction. Image derivatives encode, with increasing order, local image appearance in a coarse-to-fine manner. Spatially smoothing the derivatives with a kernel, like a Gaussian \(G_\sigma \), steers the level-of-detail, or the local image patch size. Since the convolution commutes with differentiation, such a scale-dependent analysis can be realized with only one initial smoothing; i.e.,

$$\begin{aligned} \underset{\text {one convolution per derivative}}{\underbrace{G_\sigma * \left( \textstyle \frac{\partial }{\partial x_{i_1}} \cdots \frac{\partial }{\partial x_{i_{n+1}}} I \right) }} = \underset{\text {one convolution in total (fast)}}{\underbrace{\textstyle \frac{\partial }{\partial x_{i_1}} \cdots \frac{\partial }{\partial x_{i_{n+1}}} (G_\sigma * I)}}. \end{aligned}$$
(25)

This is a big advantage, particularly for large images, because image derivatives can be implemented efficiently in terms of finite differences. In our context, (25) is the connection between scalar-valued images and tensor-valued features. In fact, (25) can be interpreted as a projection onto Cartesian tensor-valued basis functions \(\mathbf {K}^{(n)}\), where

$$\begin{aligned} t^{(n)}_{i_1,\cdots ,i_n}(\mathbf {x})&:=\Bigl (\underset{\text {basis function }k^{(n)}_{i_1,\cdots ,i_n}}{\underbrace{\left( \textstyle \frac{\partial }{\partial x_{i_1}} \cdots \frac{\partial }{\partial x_{i_{n}}} G_\sigma \right) }} * I \Bigr )(\mathbf {x})\nonumber \\&=\Bigl (k^{(n)}_{i_1,\cdots ,i_n}* I \Bigr )(\mathbf {x}). \end{aligned}$$
(26)

Equation (26) is also called a sliding dot product. This projection maps images in a sliding window manner to tensor-valued feature fields. Each tensor at each voxel represents local image features in a rotation covariant manner. The basis functions in the example (26) are known as the Gaussian-windowed Hermite polynomials; see, e.g., [67]. A Gaussian kernel \(G_\sigma \), however, is not the only possible way to build basis functions with such a rotation covariant differential relationship. In fact, any differentiable, radially symmetric kernel is a possible candidate.

In this section, we introduce two examples of spherical tensor-based basis functions with a differential relationship: on the one hand, the spherical counterpart of the 3D Hermite polynomials, and on the other hand, a spherical tensor-valued Gabor basis.

3.2.1 Spherical Derivatives

The homogeneous polynomials \(R^j_m(\mathbf {r}) = r^j Y^j_m(\mathbf {n})\) of order j are called the “solid harmonics”; see Sect. 1 for details. With \(r = |\mathbf {r}|\), we denote the distance to the center, and with \(\mathbf {n} = \mathbf {r}/r\) the direction. You may consider them as the bridge between the spherical harmonics, which only exist on the sphere (\(S_2\)), and the 3D image space (\({\mathbb {R}}^3\)). With the operator transformation \(R^j_m(\nabla )\), we map the Cartesian gradient operator \(\nabla =(\partial _x,\partial _y,\partial _z)^T\) to the spherical domain. We define the spherical differential operator \({\varvec{\partial }}^j:=(\partial ^j_{-j}, \cdots , \partial ^j_{j})^T\) by

$$\begin{aligned} \partial ^j_m :=R^j_m(\nabla ). \end{aligned}$$
(27)

The operator \({\varvec{\partial }}^1\), with \({\varvec{\partial }}^1=\mathbf {R}^1(\nabla )\), is, as its Cartesian counterpart, a 3D vector, namely \((\frac{1}{\sqrt{2}}(\partial _x-i \partial _y),\partial _z,\frac{1}{\sqrt{2}}(-\partial _x-i \partial _y))^T\).
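As a sketch, \({\varvec{\partial }}^1\) can be applied to a 3D image with central finite differences (np.gradient). The axis convention I[z, y, x] and the component ordering used below are assumptions for this illustration:

```python
# A sketch of the first-order spherical derivative of a 3D image.
import numpy as np

def spherical_derivative1(I, spacing=1.0):
    """Returns the three components of (d^1 I), shape (3,) + I.shape."""
    dz, dy, dx = np.gradient(I, spacing)  # assumes I indexed as I[z, y, x]
    s = 1.0 / np.sqrt(2.0)
    return np.stack([
        s * (dx - 1j * dy),   # first component, cf. the vector above
        dz.astype(complex),   # central component
        s * (-dx - 1j * dy),  # last component
    ])
```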

We can treat \({\varvec{\partial }}^j\) like a spherical tensor of order \(j\). Similar to its Cartesian counterpart, higher-order derivative operators can be built out of lower-order ones with

$$\begin{aligned} {\varvec{\partial }}^j = \textstyle \sqrt{\frac{j!}{(2j-1)!!}}\ {\varvec{\partial }}^1\circ _{\!j}( \ldots \circ _{\!3}({\varvec{\partial }}^1\circ _{\!2}{\varvec{\partial }}^1)\ldots ). \end{aligned}$$
(28)

This property is of particular importance from a computational perspective, because we can compute tensor features in increasing order \(j=0,1,2,\cdots \) in a successive manner:

Fig. 6
figure 6

Finite differences are faster than convolutions based on the Fourier transform. For a kernel with a differential relationship, a projection up to order L can be accomplished in two manners: (1) upper row: projection with \(((L+1)(L+2))/2\) convolutions; (2) bottom row: one convolution followed by differentiation. The latter significantly reduces the computation time

Let \(\varvec{{\mathcal {K}}}^j\in {\mathcal {T}}_{j}\) be a spherical tensor-valued basis function with the differential relationship \(\varvec{{\mathcal {K}}}^j={\varvec{\partial }}^j \varvec{{\mathcal {K}}}^0\). Then, the sliding dot product between an image I and the basis functions \(\varvec{{\mathcal {K}}}^j, j=\{0,1,\dots \}\) can be successively computed, similar to (25) and (26), with one convolution in combination with finite differences:

$$\begin{aligned} \mathbf {a}^j=(\varvec{{\mathcal {K}}}^j* I)= {\varvec{\partial }}^j (\varvec{{\mathcal {K}}}^0 * I). \end{aligned}$$
(29)

This relationship is visualized in Fig. 6.

Fig. 7

Images showing the center slice of the real-valued coefficient (\(m=0\)) of Gauss–Laguerre and Gabor kernels. For these coefficients, the angular patterns around the Z-axis are 2D circular harmonics of order \(j\)

Gauss–Laguerre Gaussian derivatives play an important role in the context of scale space analysis and feature extraction in many image processing tasks; e.g., [39, 40, 66, 67, 92, 104, 118]. Cartesian Gaussian derivatives correspond to the Hermite polynomials. The spherical counterparts of the 3D Hermite polynomials are the Laguerre polynomials [67]. Gaussian-windowed Laguerre polynomials are, like their Cartesian counterparts, optimal for locally smooth processes; see section 5.1.2 in [97].

The Gaussian-windowed Laguerre basis functions \( \varvec{{\mathcal {L}}}^j_n\) of order \(j\) are spherical tensor fields of order \(j\). They are defined by

$$\begin{aligned} \varvec{{\mathcal {L}}}^j_n(\mathbf {r})= \textstyle \sqrt{\frac{j!}{(2j-1)!!}} \frac{(-\sigma ^2)^{n+j}}{n! 2^n} \mathbf {R}^j(\mathbf {r}) L^{j+\frac{1}{2}}_n(\frac{r^2}{(2\sigma ^2)})e^{-\frac{r^2}{2\sigma ^2}}, \end{aligned}$$
(30)

with the differential relationship

$$\begin{aligned} \varvec{{\mathcal {L}}}^j_n=\Delta ^n {\varvec{\partial }}^j\varvec{{\mathcal {L}}}^0_0; \end{aligned}$$
(31)

see [103] for details. Note that \(\varvec{{\mathcal {L}}}^0_0\) is the 3D Gaussian function. With \(\Delta \), we denote the Laplace operator, and with \(L^\alpha _n\) the associated Laguerre polynomial [1].
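A sketch of the radial profile of (30) with SciPy's associated Laguerre polynomials; the solid-harmonic factor \(\mathbf {R}^j(\mathbf {r})\) is omitted for brevity, and the double factorial is computed manually (with \((-1)!!:=1\)):

```python
# A sketch of the radial part of the Gauss-Laguerre basis (30).
import math
import numpy as np
from scipy.special import genlaguerre

def double_factorial(k):  # (2j-1)!!, with (-1)!! := 1
    out = 1
    while k > 1:
        out, k = out * k, k - 2
    return out

def gauss_laguerre_radial(r, j, n, sigma):
    norm = math.sqrt(math.factorial(j) / double_factorial(2 * j - 1))
    pref = (-sigma**2) ** (n + j) / (math.factorial(n) * 2.0**n)
    x = r**2 / (2.0 * sigma**2)
    return norm * pref * genlaguerre(n, j + 0.5)(x) * np.exp(-x)
```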

Gabor Functions The functions

$$\begin{aligned} \varvec{{\mathcal {B}}}^j(\mathbf {r},k) =(-k)^j \sqrt{\frac{j!}{(2j-1)!!}} \mathbf {Y}^j(\mathbf {n}) J_j(kr) \end{aligned}$$
(32)

are the natural radial basis functions appearing in the spherical expansion of the plane wave; see (70) in appendix. With \(J_j\), we denote the spherical Bessel functions [1]. With

$$\begin{aligned} \varvec{{\mathcal {B}}}^j(k)={\varvec{\partial }}^j J_0(kr) \end{aligned}$$
(33)

we have a differential relationship similar to the Gauss–Laguerre functions. In our applications, we use a Gaussian-windowed version \(\varvec{{\mathcal {B}}}^j(k) G_\sigma \) with local support, a Gabor wavelet. See [103] for the exact expression and differential formulation in terms of Gabor wavelets. Figure 7 shows examples for \(\varvec{{\mathcal {B}}}^j(k)\) and \(\varvec{{\mathcal {L}}}^j_n\).

3.3 Rotation Invariant Features

We call any zero-order tensor \(\mathbf {a}^0\) rotation invariant in the sense that with

$$\begin{aligned} (g\mathbf {a}^0)=\mathbf {D}^0(g) \mathbf {a}^0 = \mathbf {a}^0, \end{aligned}$$
(34)

the transformation is the identity transform. The corresponding tensor fields \(\mathbf {f} \in {\mathcal {T}}_0\) transform like ordinary 3D images; i.e., they rotate according to \((g \mathbf {f})(\mathbf {x}):=\mathbf {f}(\mathbf {R}(g)^\top \mathbf {x})\). Hence, the quantities, the tensors themselves, are not altered. We call this property local rotation invariance. All 3D biological images with scalar-valued intensities are fields of locally rotation invariant zero-order tensors.

In 3D biological or medical feature detection tasks, it is often desired to detect objects, or specific structures, in a rotation invariant manner. With STA, it is possible to create a set of zero-order tensor fields. Each of them contains discriminative, mutually exclusive features of local image structures in a rotation invariant manner. The idea is that even if the location or the orientation of an object varies, only the location of the set of corresponding invariant features undergoes a coordinate transform; the features themselves stay constant.

Fig. 8

Power spectrum and the (even) bi-spectrum features of an image. The image has been projected to a Gaussian derivative basis with coefficients up to order three

A set of locally rotation invariant features is a signature of local image structures. Any state-of-the-art classifier can be used in this context in a sliding window approach to identify the objects at any location and in any orientation. In Fig. 8, we show two examples of rotation invariant features based on the spherical power spectrum and the spherical bi-spectrum of an image.

Given Fourier coefficients, the power spectrum might be the most commonly used invariant feature. The power spectrum represents the energy distribution, or power, with respect to the frequencies, in our case the tensor orders.

Definition 4

(Power Spectrum) The power of an order-\(j\) tensor \(\mathbf {a}^j\) can be computed with the spherical product

$$\begin{aligned} p\{\mathbf {a}^j\}=(\mathbf {a}^{j} \circ _{0} \mathbf {a}^{j}) . \end{aligned}$$
(35)
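Since the order-0 Clebsch–Gordan coefficients have the closed form \(\langle 0 0 \mid j m, j (-m) \rangle = (-1)^{j-m}/\sqrt{2j+1}\), the power (35) reduces to a weighted sum; a sketch (tensors again indexed as a[m + j], our illustrative convention):

```python
# A sketch of the power spectrum feature (35).
import numpy as np

def power(a):
    a = np.asarray(a)
    j = (len(a) - 1) // 2
    m = np.arange(-j, j + 1)
    # <0 0 | j m, j -m> = (-1)^(j-m) / sqrt(2j+1)
    return np.sum((-1.0) ** (j - m) * a[m + j] * a[j - m]) / np.sqrt(2 * j + 1)
```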

The power spectrum is often sufficient when objects clearly differ in symmetry. It reduces the characteristics of a function to a small number of coefficients. However, this might sometimes not be enough. For instance, although the functions

$$\begin{aligned}&f(\mathbf {n}) = \sum _{j=0}^\infty (\mathbf {a}^j)^\top \mathbf {Y}^j({\mathbf {n}}) \text { and }\nonumber \\&f'(\mathbf {n}) = \sum _{j=0}^\infty (\mathbf {b}^j)^\top \mathbf {Y}^j({\mathbf {n}}), \quad \mathbf {b}^j:={\left\{ \begin{array}{ll} \mathbf {a}^j & \text{ if } j\ne 2 \\ \mathbf {D}^j(g) \mathbf {a}^j & \text{ else } \end{array}\right. } \end{aligned}$$
(36)

differ in the orientation of the order-two tensor coefficient, their power spectra are identical; see Proposition 1. This is because the power spectrum does not preserve any information about the relative orientations of the coefficients. This is a disadvantage compared to the bi-spectrum.

Definition 5

(Bi-Spectrum) Let \(\mathbf {a}^j\in {\mathbb {C}}^{2j+1}\), \(j\in \{0,1,2,\ldots \}\), be the spherical expansion coefficients of some spherical function. The bi-spectrum (we refer to [61] for details) is formed by all possible spherical products of order 3 that return a scalar (a tensor of rank 0):

$$\begin{aligned} b\{\mathbf {a}^{j_1},\mathbf {a}^{j_2},\mathbf {a}^{j_3}\} = (\mathbf {a}^{j_1} \circ _{j_3} \mathbf {a}^{j_2}) \circ _{0} \mathbf {a}^{j_3}. \end{aligned}$$
(37)
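Reusing the spherical_product sketch from Sect. 3.1, the bi-spectrum can be written in two lines (a hypothetical helper for illustration, not the toolbox API):

```python
# Bi-spectrum (37) in terms of the spherical_product sketch above.
def bispectrum(a1, a2, a3):
    j3 = (len(a3) - 1) // 2
    return spherical_product(spherical_product(a1, a2, j3), a3, 0)[0]
```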
Fig. 9

Odd bi-spectrum features can discriminate point reflections. In the feature images, positive contributions are visualized in red, negative contributions in blue

The bi-spectrum has a variety of nice properties. For example, contrary to the power spectrum, it can discriminate point reflections: The odd products (\(j_1+j_2+j_3\) is odd) change their signs if the underlying object undergoes a point reflection (or any other orthogonal transformation with negative determinant); see Fig. 9 for an example.

The spherical bi-spectrum is a specific case of the bi-spectrum over the rotation group SO(3) (functions over all three Euler angles \((\theta ,\phi ,\psi )\), whereas we only deal with spherical functions over the two angles \((\theta ,\phi )\)). For SO(3), the bi-spectrum is complete [61, 62]. This is a powerful property: It means the object can be fully recovered from the bi-spectrum (up to rotation). However, in the case of SO(3), it is a rather large object, and many redundancies exist. Restricted to spherical functions, the bi-spectrum becomes a rather simple object, and redundancies can be avoided easily by following some specific coupling rules for triple products, which we discuss in the next subsection. Further details regarding its properties are found in [61].

3.3.1 Symmetries in Triple Products

Computing all permutations of products between three spherical tensors, as in the bi-spectrum, leads to a set of linearly dependent tensors. Fortunately, thanks to the broad usage of spherical tensors in angular momentum theory, the symmetry relationships of angular momentum states have been studied exhaustively [15, 88]. In our case, the relationship between three states is important; see (98) in appendix.

Considering the symmetries resolves these linear dependencies. In fact, the bi-spectrum is independent of the ordering of the arguments \(j_1,j_2,j_3\), that is, there are only \(\binom{L}{3}\) independent bi-spectral invariants if the spherical signal is band-limited by L. The following corollary summarizes the most important cases. Theorem 1, covering the general case together with the proofs, is found in appendix.

Corollary 1

(Associativity of Tensor Products) For the triple tensor products, there exist coupling rules that are one-to-one, see [97]. These rules are:

  • (Bi-spectrum) If \({ |j_1-j_2 |}\le j_3 \le {j_1+j_2}\), then

    $$\begin{aligned}&\left( \left( \mathbf {u}^{j_1} \circ _{j_3} \mathbf {v}^{j_2}\right) \circ _{0} \mathbf {w}^{j_3}\right) =\left( \left( \mathbf {w}^{j_3} \circ _{j_2} \mathbf {u}^{j_1}\right) \circ _{0} \mathbf {v}^{j_2}\right) \nonumber \\&\quad =\left( \left( \mathbf {v}^{j_2} \circ _{j_1} \mathbf {w}^{j_3}\right) \circ _{0} \mathbf {u}^{j_1}\right) . \end{aligned}$$
    (38)
  • (Upper Bound) If \(j_1,j_2,j_3\in {\mathbb {N}}\), then

    $$\begin{aligned}&\left( \left( \mathbf {u}^{j_1} \circ _{(j_1+j_2)} \mathbf {v}^{j_2}\right) \circ _{(j_1+j_2+j_3)} \mathbf {w}^{j_3}\right) \nonumber \\&\quad =\left( \left( \mathbf {u}^{j_1} \circ _{(j_1+j_3)} \mathbf {w}^{j_3}\right) \circ _{(j_1+j_2+j_3)} \mathbf {v}^{j_2}\right) \nonumber \\&\quad =\left( \left( \mathbf {v}^{j_2} \circ _{(j_2+j_3)} \mathbf {w}^{j_3}\right) \circ _{(j_1+j_2+j_3)} \mathbf {u}^{j_1}\right) . \end{aligned}$$
    (39)
  • (Lower Bound) If additionally \((j_3-j_2-j_1)\ge 0\), then

    $$\begin{aligned}&\left( \left( \mathbf {w}^{j_3} \circ _{(j_3-j_1)} \mathbf {u}^{j_1}\right) \circ _{(j_3-j_2-j_1)} \mathbf {v}^{j_2}\right) \nonumber \\&\quad =\left( \left( \mathbf {w}^{j_3} \circ _{(j_3-j_2)} \mathbf {v}^{j_2}\right) \circ _{(j_3-j_2-j_1)} \mathbf {u}^{j_1}\right) \nonumber \\&\quad =\left( \left( \mathbf {u}^{j_1} \circ _{(j_1+j_2)} \mathbf {v}^{j_2}\right) \circ _{(j_3-j_2-j_1)} \mathbf {w}^{j_3}\right) . \end{aligned}$$
    (40)

3.4 Steerable Voting

In Sect. 3.2, we mapped local image patches to spherical tensor-valued basis functions \(\varvec{{\mathcal {K}}}^j\). The result was a set of expansion coefficients \(\{\mathbf {a}^j\}\). In this section, we introduce the dual operation, which “renders” patches into a scalar-valued image. Applications are steerable filters, where an elongated filter kernel is rendered into a target image in consistency with tubular structures in a source image; see, e.g., [50]. Or, as we will see in the application section, the creation of object-specific saliency maps, which we call voting maps, for the generic detection of 3D objects.

In Refs. [71, 81, 91, 101], this idea was used in the spirit of a generalized Hough transform [8]. Let \(\{\mathbf {a}^j\}\), with \(j\in \{0,1,\cdots \}\), be the set of expansion fields of a local voting function. Then, the process of “rendering” the votes into an image can be accomplished with a convolution-like operation:

$$\begin{aligned} V(\mathbf {r})=\int _{{\mathbb {R}}^3}\sum _{j=0}^\infty (\mathbf {a}^j(\mathbf {x}))^T \mathbf {\varvec{{\mathcal {K}}}}^j(\mathbf {r}- \mathbf {x}) d \mathbf {x}, \end{aligned}$$
(41)

where \(V \in {\mathcal {T}}_0\) is a scalar-valued image. This operation is rather computationally expensive. However, if \(\varvec{{\mathcal {K}}}^j\) is a kernel with a differential relationship, see Sect. 3.2, then (41) can be computed in the following manner:

$$\begin{aligned} V= \sum _{j=0}^L (\mathbf {a}^j{\widetilde{\circ }}_{\!0} \mathbf {\varvec{{\mathcal {K}}}}^j) = \varvec{{\mathcal {K}}}^0 * \sum _{j=0}^L ({\varvec{\partial }}^j\circ _{\!0}\mathbf {a}^j); \end{aligned}$$
(42)

see (23) for the definition of the tensor convolution. By computing the derivatives in a top-down manner, we only need L first-order spherical derivatives in combination with an ordinary, scalar-valued convolution to compute the final result. This makes both feature computation and voting computationally efficient. Figure 10 illustrates the workflow.
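To make the top-down scheme concrete, a minimal MATLAB-style sketch might look as follows; sta_add is a hypothetical element-wise addition of two stafields (not part of the documented API), and the normalization constants hidden in \({\varvec{\partial }}^j\) are deliberately omitted:

    % hypothetical sketch: evaluate the sum in (42) with L first-order
    % down-derivatives; a{j+1} holds the order-j coefficient field a^j
    c = a{L+1};
    for j = L-1:-1:0
        c = sta_add(a{j+1}, sta_deriv(c, -1));  % c^j = a^j + (partial^1 o_j c^{j+1})
    end
    % V is then obtained by one ordinary, scalar-valued convolution of c
    % with the order-0 kernel K^0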

Fig. 10
figure 10

Finite differences are faster than convolutions based on the Fourier transform. For a kernel with a differential relationship, steerable voting can be accomplished in two ways: (1) upper row: many expensive projections via convolutions; (2) bottom row: differentiation followed by a single convolution. The latter significantly reduces the computation time

3.5 Diffusion on \({\mathbb {R}}^3\times S_2\) with STA

Image enhancement/restoration schemes are often based on diffusion/convection schemes. In this section, we briefly describe the generators of diffusion and convection on \({\mathbb {R}}^3 \times S_2\), and we show how they can be implemented in terms of spherical tensor algebra.

Most implementations solving \({\mathbb {R}}^3 \times S_2\)-diffusion equations [10, 28, 29, 84] rely on an equiareal discretization of the two-sphere \(S_2\). There are implementations [29] that use spherical harmonics as an intermediate \(S_2\)-interpolation scheme, but by design they cannot benefit from the advantages of the spherical harmonic representation, like the efficient computation of \(S_2\)-convolutions and the closedness under rotations. Below, we express the common differential operators acting on functions \({\mathbb {R}}^3 \times S_2 \mapsto {\mathbb {C}}\) in terms of spherical harmonics. In this way, we can benefit from the advantages of the harmonic representation. A discretization of \(S_2\) is avoided, and we are able to implement diffusion on \({\mathbb {R}}^3 \times S_2\) with reasonable memory consumption. In general, we are interested in solving or propagating a partial differential equation of the form

$$\begin{aligned} \partial _t f(\mathbf {r},\mathbf {n},t) = H f(\mathbf {r},\mathbf {n},t)~, \end{aligned}$$
(43)

where f is a time-dependent orientation field, and H a linear differential operator in \(\mathbf {n}\) and \(\mathbf {r}\). In [29], it is shown that if H generates a diffusion/convection, then it is a quadratic form in \(\mathbf {n}^\top \nabla \) (convection and anisotropic diffusion), and linear in \(\Delta =\partial _x^2+\partial _y^2+\partial _z^2\) and \({\mathcal {J}}^2\), the Laplace–Beltrami operator on the two-sphere. Our goal is to understand the action of the generator H when the field f is written in terms of a spherical harmonic expansion, see (17). The spherical tensor fields \(\mathbf {a}^j\), the expansion coefficients, are obtained by the projections \(\mathbf {a}^j(\mathbf {r}) = \frac{1}{2j+1}\langle \mathbf {Y}^j, f(\mathbf {r}) \rangle \). Hence, we are interested in the matrix elements \({{\hat{H}}}^{jm}_{j'm'} = \langle Y^{j'}_{m'}, H Y^{j}_{m} \rangle \) of H in the spherical harmonic representation, such that the propagation equation can be written in the form

$$\begin{aligned} \partial _t a^j_m(\mathbf {r},t) = \sum _{j'=0}^\infty \sum _{m'=-j'}^{j'} {{\hat{H}}}^{jm}_{j'm'} a^{j'}_{m'}(\mathbf {r},t) \end{aligned}$$
(44)

where \({{\hat{H}}}^{jm}_{j'm'}\) is a differential operator in \(\mathbf {r}\), but purely algebraic in the orientation coordinate. The spherical Laplace operator is well known in this representation. It is defined by

$$\begin{aligned} \big \langle Y^j_m, {\mathcal {J}}^2 f \big \rangle = -j(j+1) a^j_m. \end{aligned}$$
(45)

For the spatial Laplacian, the result is \(\langle Y^j_m, \Delta f\rangle = \Delta a^j_m\). For the directed convection generator \(\mathbf {n} \cdot \nabla \) and the directed diffusion generator \((\mathbf {n} \cdot \nabla )^2\), the result is less trivial. In [87], we give the general result for SE(3) diffusion, but mention here the simpler case of \({\mathbb {R}}^3\times S_2\). For the convection generator, one finds

$$\begin{aligned}&\big \langle Y^j_m, (\mathbf {n} \cdot \nabla ) f \big \rangle \nonumber \\&\quad = \sum _{\begin{array}{c} j'=j-1,j+1\\ m=m'+q \end{array}} \frac{2j'+1}{2 j + 1} \langle j m | j' m', 1 q \rangle \langle j 0 | j' 0 ,1 0 \rangle \ \partial ^1_{q} a^{j'}_{m'} \nonumber \\&\quad = \sum _{j'=j-1,j+1} \frac{2j'+1}{2 j + 1} \langle j 0 | j' 0 ,1 0 \rangle \, {\varvec{\partial }}^1\circ _{\!j}\mathbf {a}^{j'}~. \end{aligned}$$
(46)

On the other hand, the diffusion generator takes the form

$$\begin{aligned}&\big \langle Y^j_m, (\mathbf {n} \cdot \nabla )^2 f \big \rangle = \frac{\Delta }{3} a^j_m + \frac{2}{3} \nonumber \\&\quad \times \sum _{\begin{array}{c} j'=j-2,j,j+2\\ m=m'+q \end{array}} \frac{2j'+1}{2 j + 1} \langle j m | j' m', 2 q \rangle \langle j 0 | j' 0 ,2 0 \rangle \ \partial ^2_{q} a^{j'}_{m'}\nonumber \\&\quad = \frac{\Delta }{3} \mathbf {a}^j + \frac{2}{3}\sum _{j'=j-2,j,j+2} \frac{2j'+1}{2 j + 1} \langle j 0 | j' 0 ,2 0 \rangle \, {\varvec{\partial }}^2\circ _{\!j}\mathbf {a}^{j'}. \end{aligned}$$
(47)

The diffusion generator is an ideal candidate for regularizing inverse problems where line-like structures are of interest. In [84], this regularizer is called ‘fiber continuity’ (FC).
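To make the band-wise action of these generators concrete, the following MATLAB-style sketch propagates (43) with \(H = \mathbf {n}\cdot \nabla \) by one explicit Euler step, using the toolbox functions introduced later in Sect. 4.3; cg and sta_add are hypothetical placeholders for a Clebsch–Gordan routine \(\langle j_1 m_1 | j_2 m_2, j_3 m_3\rangle \) and element-wise stafield addition, and are not part of the documented API:

    % hypothetical sketch: one Euler step of the convection generator (46)
    % a{j+1} holds the order-j coefficient field a^j; tau is the step size
    for j = 0:L
        acc = [];
        for jp = [j-1, j+1]                    % only j' = j-1 and j' = j+1 contribute
            if jp < 0 || jp > L, continue; end
            w = (2*jp+1)/(2*j+1) * cg(j,0, jp,0, 1,0);     % <j 0 | j' 0, 1 0>
            t = sta_mult(sta_deriv(a{jp+1}, j-jp), w);     % (partial^1 o_j a^{j'})
            if isempty(acc), acc = t; else, acc = sta_add(acc, t); end
        end
        anew{j+1} = sta_add(a{j+1}, sta_mult(acc, tau));   % Euler step of (43)
    end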

4 The STA-Toolbox

The STA-toolbox provides a set of operations and procedures which ease the handling of spherical tensors. This includes spherical tensor products, derivatives, as well as a Fourier transformation respecting symmetries of spherical tensors. It further provides higher-level API functions for feature extraction, object detection or anisotropic filtering. For performance reasons, the toolbox is mostly written in C and C++, but provides a high-level API in MATLAB. The toolbox has been successfully tested on a 64 bit Windows system and a 64 bit Linux system. On Linux, multi-threaded tensor operations are enabled by default. Tensor operations are also available on the GPU, written in CUDA.

Fig. 11
figure 11

Representation of stafields in memory. It is an interleaved array of the real and imaginary values of the tensor coefficients

A stafield class encapsulates the data structure of orientation fields. Most of the functions of the STA-toolbox have a simple API in MATLAB, or in its open-source alternative OCTAVE. In this section, we introduce the basic functionality of the toolbox using the MATLAB interface. References to the C++ and C interface can be found online on the project webpage.

4.1 Spherical Tensor Fields

The basis of all calculations is a data container that stores the expansion coefficients of an orientation field \(f:{\mathbb {R}}^3 \times S_2 \rightarrow {\mathbb {C}}\); see (17) for the definition. The expansion coefficients are a (band-) limited number of spherical tensor fields \(\{\mathbf {a}^0,\mathbf {a}^1, \cdots ,\mathbf {a}^L\}\), where \(\mathbf {a}^j\in {\mathcal {T}}_j\).

We call this data container a stafield. It is a multi-dimensional array with attributes describing the properties of the orientation field in terms of its expansion coefficients. The data are stored in a five-dimensional, real-valued array of size \(2 \times N \times X \times Y \times Z\). Figure 11 illustrates the memory alignment of the data. The first dimension is always two. It represents the real and imaginary part of the tensor field. The second dimension represents all tensor field components \(a_m^j\); we will give details about the storage order below. The third, fourth and fifth dimensions are the image dimensions. A tensor field is associated with the four attributes: storage, L, type and element_size:

1. The element_size is a three-dimensional vector which defines the extents of a voxel (e.g., in micrometers). The default is \((1,1,1)^T\). The voxel size is taken into account by both the tensor derivatives and the convolution kernel sizes. We recommend using this attribute to account for an anisotropic image resolution.

2. The attribute L defines the upper limit of the expansion.

3. A stafield may contain only a single spherical tensor field, or only a certain subset of coefficients with odd or even order. The toolbox distinguishes between four different field types: “single,” “odd,” “even” and “full.” “Single” is the standard type and defines an orientation field that is defined by a single spherical tensor field of order L. With “full,” the array contains L+1 tensor fields, ranging from order 0 up to order L. “Odd” and “even” are basically “full” tensor fields; however, in order to save memory, all even, or odd tensors, respectively, are omitted in the array. Figure 12 illustrates these four types.

4. In most applications, spherical tensor fields contain only real tensors, see Sect. 2.3. In this case, the toolbox stores only the lower \(l+1\) parts \((a^l_{-l},\ldots ,a^l_0)^T\) of a tensor field of order l and sets the storage attribute to “STA_FIELD_STORAGE_R”. Otherwise, the full tensor is stored (“STA_FIELD_STORAGE_C”).

A stafield in MATLAB is represented by the stafieldStruct MATLAB structure.

Fig. 12
figure 12

The toolbox provides four different types of orientation field representation. The full field contains all coefficients up to a specified order, while the single field contains only a single spherical tensor field

The stafieldStruct constructor can turn any real-valued image into an orientation field. Below is an example for a real-valued image called “img.”

figure c
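The original listing is shown in figure c; a minimal sketch of such a constructor call, with the default behavior as an assumption, might look like:

    % hypothetical sketch: wrap a real-valued volume into an orientation field
    img    = rand(64, 64, 64);     % any real-valued 3D image
    ifield = stafieldStruct(img);  % assumed: an order-0 stafield of type "single"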

Basic tensor operations which may alter the tensor order, like products and derivatives, can only be applied to stafields of the type “single,” that is, to single spherical tensor fields. In order to apply such operations to all kinds of stafields, the toolbox provides an interface to access the individual components, the spherical tensor fields, of a stafield container. With the stafieldStruct, we can access single stafields from full stafields. For example,

figure d

extracts the order-three field component from ifield, and we can overwrite a single component in a full stafield with a given single field via

figure e

Note that in C and C++, these operations work without the need for making expensive memory copies.
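Based on this access pattern (the exact listings are figures d and e), such get/set operations might be sketched as follows; the accessor syntax via the constructor is an assumption:

    % hypothetical sketch: component access on a "full" stafield
    single3 = stafieldStruct(ifield, 3);           % assumed: extract the order-3 field
    ifield  = stafieldStruct(ifield, 3, single3);  % assumed: overwrite it again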

The stafieldStruct constructor can also initialize specific tensor fields for feature extraction. In addition to the Gauss–Laguerre and Gabor kernels \(\varvec{{\mathcal {L}}}^j_n\) and \(\varvec{{\mathcal {B}}}^j(k)\), see Sect. 3.2, it supports Gaussian kernels [81] and Gaussian smoothed spheres [101]. For example,

figure f

initializes a stafield of order 0 using a Gaussian function with a standard deviation of 5. By default, a kernel is centered with respect to \((0,0,0)^T\) so that it can directly be used in a tensor convolution. Optional parameters change the tensor field order, define the storage type, or set the element_size.
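A sketch of the kernel construction from figure f, with the argument order as an assumption, is:

    % hypothetical sketch: an order-0 Gaussian kernel field with sigma = 5,
    % centered at (0,0,0) so it can be used directly in a tensor convolution
    kernel = stafieldStruct([64, 64, 64], 'gauss', 5);  % argument order is assumed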

The stafieldStruct offers many further ways to construct or initialize empty tensor fields. Typing help stafieldStruct lists detailed information about further constructors.

4.1.1 Cartesian to Spherical Tensors and Vice Versa

In most applications, it is sufficient to work either with Cartesian tensors or with spherical tensors. In our opinion, mixing both worlds is, from an implementation and computational point of view, generally not recommended. Recall that Table 1 gives some examples of the large number of different possible irreducible components of a higher-order Cartesian tensor. On the other hand, tensors of order one or two play an important role in many existing algorithms, so that forward and backward transformations are indeed in demand. According to (5), a second-order tensor can always be decomposed into a unique triple of irreducible order-two, order-one and order-zero tensors.

The toolbox provides functions to connect first- and second-order Cartesian tensor fields with their spherical counterparts, in both directions. The Cartesian-to-spherical transformation can be accomplished with the sta_c2s function; for the back transformation, there is the sta_s2c function. The two functions sta_Grad and sta_Hessian serve as a reference for the storage convention of Cartesian tensor fields of order one and two, respectively. The following example shows how to transform a Cartesian gradient and Hessian field into their spherical counterparts.

figure g
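With the call signatures as assumptions, the round trip of figure g might be sketched as:

    % hypothetical sketch: Cartesian gradient/Hessian to spherical tensors and back
    g  = sta_Grad(img);     % Cartesian order-1 field (reference storage convention)
    H  = sta_Hessian(img);  % Cartesian order-2 field
    sg = sta_c2s(g);        % spherical counterpart of the gradient
    sH = sta_c2s(H);        % spherical counterpart of the Hessian, cf. the triple in (5)
    g2 = sta_s2c(sg);       % back-transformation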

4.2 HARDI to Spherical Tensors and Vice Versa

Diffusion-weighted magnetic resonance imaging (MRI) can noninvasively visualize the fibrous structure of the human brain's white matter [58]. Based on the directional dependency of water diffusivity in fibrous tissue, it is possible to directly acquire orientation information. The high angular resolution diffusion imaging (HARDI) acquisition scheme, where more than 60 diffusion directions per voxel are acquired, allows the estimation of the orientation distributions of local fiber bundles. Such a measurement is essential for in vivo fiber tracking [85]. The measurement is an orientation field, which is often represented in terms of spherical tensor expansion coefficients for further processing steps like feature detection or deconvolution. In its raw form, it is an orientation field represented by a four-dimensional array whose fourth dimension corresponds to the diffusion direction. The toolbox provides the functions

figure h

which map from a discrete HARDI signal to an “even” stafield and vice versa.
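The exact function names are given in figure h; with hypothetical placeholder names and arguments, the round trip might look like:

    % hypothetical sketch: hardi is a 4D array, dirs the diffusion directions
    ifield = sta_hardi2stafield(hardi, dirs, L);  % placeholder name and arguments
    hardi2 = sta_stafield2hardi(ifield, dirs);    % back to the discrete representation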

4.3 Spherical Tensor Operations

The toolbox provides the following basic tensor operations, which act on the stafieldStruct container:

Operation             | MATLAB function   | Interpretation
----------------------|-------------------|------------------------------------------------------------
Spherical products    | sta_prod          | \((\mathbf {f}\circ _{\!j}\mathbf {g}),~\mathbf {f}\in {\mathcal {T}}_j,\mathbf {g}\in {\mathcal {T}}_k \)
Spherical derivatives | sta_deriv         | \(({\varvec{\partial }}^n\circ _{\!n+m} \mathbf {f}),~n\in \{-2,-1,0,1,2\},~\mathbf {f}\in {\mathcal {T}}_m\)
Laplace               | sta_lap           | \( \Delta \mathbf {f},~\mathbf {f}\in {\mathcal {T}}_m\)
Tensor FFT            | sta_fft, sta_ifft | \(\widetilde{\mathbf {f}}=\text {FT}(\mathbf {f}),~\mathbf {f}=\text {FT}^{-1}(\widetilde{\mathbf {f}});~\mathbf {f},\widetilde{\mathbf {f}}\in {\mathcal {T}}_m\)
Multiplication        | sta_mult          | \(\alpha \mathbf {f},~\alpha \in {\mathbb {C}},~\mathbf {f}\in {\mathcal {T}}_m\)

Spherical Products Given two stafields ifield1 and ifield2 of the type “single,” the spherical product can be computed according to

figure i

where L is the tensor order of the output ofield3. Optional parameters can switch between a standard and a normalized tensor product, or can weight the product by a given scalar.
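A sketch of the call in figure i, with the argument order as an assumption, is:

    % hypothetical sketch: spherical product of two "single" fields
    ofield3 = sta_prod(ifield1, ifield2, L);  % (f o_L g); argument order is assumed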

Spherical Derivatives Given a stafield ifield1 of the type “single,” the spherical derivative operator can be applied with

figure j

where n can be -2 (two-fold down-derivative), -1 (single down-derivative), 0 (a curl-like operation, which does not change the tensor order), 1 (single up-derivative) or 2 (two-fold up-derivative).
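A sketch of the call in figure j, with the argument order as an assumption, is:

    % hypothetical sketch: spherical up-derivative of a "single" field
    ofield = sta_deriv(ifield1, 1);  % n = 1 raises the tensor order by one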

Laplace Operator The Laplace operator can be applied to any type of stafield. The output is a field with the same attributes as the input field.

Tensor FFT The FFT operator can be applied to any type of stafield.

Multiplication The multiplication operator is a wrapper that allows (complex-valued) scalar multiplications with stafields. While for real-valued factors \(\alpha \in {\mathbb {R}}\)

figure k

is a valid solution, the C-style representation of complex numbers requires the usage of

figure l

for complex-valued factors.
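Sketches of the two calls (figures k and l), with the complex factor passed as a real/imaginary pair as an assumption, are:

    % hypothetical sketches of scalar multiplication
    ofield = sta_mult(ifield, 2);       % real-valued factor alpha = 2
    ofield = sta_mult(ifield, [2, 1]);  % assumed C-style pair for alpha = 2 + 1i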

4.3.1 An Introductory Example

A fundamental step in many applications involving STA is the covariant feature extraction, see Sect. 3.2. For steerable filters, for instance, as well as for the computation of rotation invariant features, an image is locally expanded in terms of a Gauss–Laguerre or Gabor basis.

That is, we compute a stafield \({\texttt {ofield}}:=(\mathbf {a}^0,\mathbf {a}^1,\cdots ,\mathbf {a}^L)\), with \(\mathbf {a}^j:=(\varvec{{\mathcal {K}}}^j* I)\), where \(\varvec{{\mathcal {K}}}^j\) obeys the differential relationship \(\varvec{{\mathcal {K}}}^j={\varvec{\partial }}^j\varvec{{\mathcal {K}}}^0\). The projection can be performed entirely in MATLAB using the basic STA-toolbox API. For instance,

figure m

computes a full stafield containing all the expansion coefficients up to order 3.
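Under the assumptions that the kernel constructor accepts a size argument and that the tensor convolution is realized via the tensor FFT and an order-0 product, the listing in figure m might be approximated as follows:

    % hypothetical sketch: Gaussian smoothing followed by repeated up-derivatives
    sigma  = 3;                                              % kernel width (assumed)
    a      = stafieldStruct(img);                            % order-0 field
    G      = stafieldStruct(size(img), 'gauss', sigma);      % assumed kernel constructor
    a      = sta_ifft(sta_prod(sta_fft(a), sta_fft(G), 0));  % a^0 = (K^0 * I) via the FFT
    ofield = {a};
    for j = 1:3
        a           = sta_deriv(a, 1);  % a^j = (partial^1 o_j a^{j-1})
        ofield{j+1} = a;
    end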

Particularly for large images of several gigabytes, the bottleneck is memory allocation and de-allocation rather than CPU performance. The same operations can be performed using the sta_steerFilt function, which has been implemented in a memory-friendly way directly in C++. The example above can be shortened to

figure n

The sta_steerFilt function can be used to compute Gaussian derivatives, Gauss–Laguerre expansion coefficients as well as Gauss–Bessel coefficients. There are further optimized functions for specific applications, which will be introduced in the application section.

It is worth mentioning that for MATLAB users, there exists a convenient stafield class which mimics the C++ interface. The stafield class implements all tensor operations as member functions. For instance, \({\texttt {ofield}}={\varvec{\partial }}^1 (G_3 * I) \) can be written as

figure o

Unfortunately, this interface is currently not supported in Octave.

The toolbox can be extended with further optimized functions using the C++ or C interface of the stafield class. The C++ stafield class can be used in a similar manner to the MATLAB stafield class. The C interface is similar to the stafieldStruct interface in MATLAB and Octave. Helper functions for the exchange of data between MATLAB and C are provided as well. The C++ stafield class also provides member functions to transfer stafields from CPU to GPU and vice versa.

5 Applications

In the following, we give several application examples which make use of STA. The focus lies on applications with value for biological and medical image analysis. As already outlined in the introduction, STA bridges the gap between low-level feature detection of ridge- and plane-like structures and higher-order structures. In this section, we show examples of covariant higher-order feature extraction. We further show how to compute rotation invariant descriptor maps for invariant feature detection. We also introduce an example of a trainable polynomial filter and demonstrate how to utilize polynomial filters to detect complicated 3D image structures. Another example uses a model-based deconvolution approach regularized by the \(S_2\times {\mathbb {R}}^3\)-diffusion generator to enhance and detect tubular structures in 3D. We finally use STA to estimate neural fiber orientation distributions from magnetic resonance images (MRI).

At the end of each subsection, you may find a simple skeleton that serves as a copy&paste example which runs even on lightweight computer systems. The skeletons together with the data can be found in the skel toolbox directory in the repository as well.

5.1 Covariant Feature Extraction

Fig. 13
figure 13

In these images, the orientation fields are representations of local image features, like edges, crossings and curvature. The orientation functions are visualized on top of the source image. From left to right, we show orientation fields based on the expansion coefficients of three different types of features: a Gauss–Laguerre, b Gabor and c SHOG

The STA-toolbox provides two categories of covariant features: projection-/convolution-like features and distribution-/histogram-like features. The first group of features is introduced in Sect. 3.2. Details for the latter features will be given in this section. All features are computed densely on the whole volume. Dense feature maps occupy a lot of memory, and their computation is expensive. However, the features we propose here can be computed efficiently with the local spherical derivative and spherical product operators.

5.1.1 Projection Features

Projections onto spherical tensor-valued kernels with a differential relationship can be realized efficiently via spherical derivatives, see Sect. 3.2.1. By default, the toolbox uses finite differences of second-order accuracy for the spherical derivatives. They offer the best trade-off between computation time and accuracy. For higher-order harmonics, we provide higher-order approximations as well; see Fig. 15 for a comparison regarding time and accuracy.

The function \({\texttt {sta\_steerFilt}}\) is an efficient implementation for projections of the kind \(\mathbf {a}^j:=(\varvec{{\mathcal {K}}}^j* I)\), where \(\varvec{{\mathcal {K}}}^j\) is a kernel with differential relationship according to Sect. 3.2.1.

Gaussian Derivatives Simple spherical Gaussian derivatives can be computed with

figure p

The parameter L is the cutoff of the expansion, see Sect. 2.3. The parameter sigma determines the width of the Gaussian. For further details regarding spherical Gaussian derivatives, see [81].
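A sketch of figure p, assuming the kernel is selected by a name string:

    % hypothetical sketch: spherical Gaussian derivatives up to order L
    L = 3; sigma = 2;                                % cutoff and kernel width
    ofield = sta_steerFilt(img, 'gauss', L, sigma);  % kernel name string is assumed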

Gauss–Laguerre The Gauss–Laguerre coefficients can be computed with

figure q

The parameter pair [sigma, n] determines the width of the Gaussian and the polynomial order of the radial Laguerre polynomial, respectively.

Gabor Functions For the Gabor coefficients, we just change the kernel parameter. For instance,

figure r

computes the projection onto a Gabor basis with radial frequency k. The parameter sigma determines the scale of the kernel. The third parameter, s, determines the Gaussian window size of the kernel with respect to the wave frequency k. For details, see [97].
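By analogy, sketches of the Gauss–Laguerre and Gabor calls (figures q and r), with the kernel name strings and the packing of the radial parameters as assumptions, are:

    % hypothetical sketches of the two radial kernel families
    ofield = sta_steerFilt(img, 'gauss_laguerre', L, [sigma, n]);  % width, radial order
    ofield = sta_steerFilt(img, 'gabor', L, [k, sigma, s]);        % frequency, scale, window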

5.1.2 Histogram Features

Histogram of oriented gradients features (HOG) [26] are a widely used family of discriminative image patch descriptors. HOG features represent the gradient orientation distribution in a local image patch. In their original form, they are not rotation covariant. An extension to rotation covariant 3D representations, which we call SHOG, has been proposed in [70, 101].

SHOG (Spherical HOG) Let \(\mathbf {g} = \nabla I\) be the gradient image of an image I. Let further \(G_s:{\mathbb {R}}^3 \rightarrow {\mathbb {R}}\) be a radially symmetric window function (here a Gaussian). Then, the orientation functions of the orientation field

$$\begin{aligned} f\{I\}(\mathbf {x},\mathbf {n}) := \int _{{\mathbb {R}}^3} { ||\mathbf {g}(\mathbf {r}) ||}^\gamma \delta ^2_{\mathbf {n}}(\hat{\mathbf {g}}(\mathbf {r}))G_s(\mathbf {x}-\mathbf {r}) d \mathbf {r} \end{aligned}$$
(48)

represent the local occurrence of gradient orientations, weighted by their magnitude. The function \(\hat{\mathbf {g}}:={\mathbf {g}}/{{ ||\mathbf {g} ||}}\) is the normalized gradient direction field of \(\mathbf {g}\). With the Dirac delta function \(\delta ^2_{\mathbf {n}}:S_2\rightarrow {\mathbb {R}}\) (see (69) in appendix), we mask out all gradients in the gradient image \(\mathbf {g}\) with an orientation different from \(\mathbf {n}\). With the parameter \(\gamma \le 1\), the influence of large outliers in the gradient magnitudes can be reduced, and edges with low signal can be enhanced.

The fields of spherical tensor expansion coefficient of \(f\{I\}\) (see (17)) can be computed with

$$\begin{aligned} \mathbf {a}^j=\left( \left( { ||\mathbf {g}(\mathbf {r}) ||}^\gamma {({\mathbf {Y}^j(\hat{\mathbf {g}})})}\right) * G_s\right) ; \end{aligned}$$
(49)

for details, see Sect. 5.6.1 in [97].

The function \(f\{I\}\) maps gradient orientation information onto a spherical function in a nonlinear fashion. Hence, a fast projection in terms of derivatives, like for the kernels with differential relationship, is not possible. However, based on the fact that higher-order spherical harmonics can be computed in terms of lower order harmonics, see (64) in appendix, we have developed a recursive algorithm for an efficient mapping, see Fig. 14.

The corresponding toolbox function is the \({\texttt {sta\_shog}}\) function. A function call is

figure s

This example computes the expansion coefficients up to order L of local SHOG features in a Gaussian window of width s.
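A sketch of this call (figure s), with the argument order as an assumption:

    % hypothetical sketch: dense SHOG expansion coefficients
    L = 4; s = 3;                  % expansion cutoff and Gaussian window width
    ofield = sta_shog(img, L, s);  % argument order is assumed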

Fig. 14
figure 14

Based on the gradient field \(\mathbf {g}:{\mathbb {R}}^3\rightarrow {\mathbb {R}}^3\) of an image, the expansion coefficients of local gradient orientation distributions can be computed in a recursive manner. In this sketch, \(\hat{\mathbf {g}}:=\frac{\mathbf {g}}{{ ||\mathbf {g} ||}}\) is the gradient direction field of \(\mathbf {g}\); also note that \({ ||\mathbf {g} ||}\mathbf {Y}^1(\hat{\mathbf {g}})\) coincides with the spherical derivative \(({\varvec{\partial }}\circ _{\!1}I)\), which can be computed with the toolbox using the spherical derivative operator. (i) Higher-order representations of the orientation fields, (ii) local weighting of the higher-order direction fields with the gradient magnitudes, and (iii) aggregation of contributions from neighboring voxels using a radially symmetric window function \(w:{\mathbb {R}}^3\rightarrow {\mathbb {R}}\)

Copy&Paste Example The following example shows how to compute covariant features and how to visualize the corresponding orientation field. Figure 13 shows examples of the visualization of orientation fields for various features.

figure t

5.2 Rotation Invariant Descriptor Fields

In volumetric images, objects like cells, neurites, blood vessels, parts of tissue or bones may appear in any orientation. In contrast to 2D, where a planar rotation has only one degree of freedom, a 3D rotation has three of them. As a consequence, operations that are feasible in 2D, like sliding matched filters, are expensive in 3D so that invariant features play an important role in 3D. Rotation invariance can either be achieved by normalization or by group integration. In fact, both operations can be reduced to spherical tensor products [97]. As mentioned, the STA-toolbox provides functionality to densely compute spherical tensor products. With tensor products, the toolbox can compute angular power- and bi-spectra from local image patches, which we have introduced in Sect. 3.3.

Descriptor fields of spherical spectra have been used in applications for cell detection [93, 103], but also for fetal brain and thorax analysis in MRI [59, 60] and feature-based tissue classification in HARDI [94, 99]. An n-dimensional descriptor image is a mapping \(\mathbf {d}: {\mathbb {R}}^3 \rightarrow {\mathbb {R}}^n\) whose values transform under rotation with the identity transform, namely

$$\begin{aligned} (g \mathbf {d})(\mathbf {x}):=\mathbf {d}(\mathbf {U}(g)^T \mathbf {x}) ~. \end{aligned}$$
(50)

The n-dimensional vectors \(\mathbf {d}(\mathbf {x})\) in \(\mathbf {d}\) are called local (rotation invariant) image descriptors. Given an image descriptor \(\mathbf {d}(\mathbf {x})\), any kind of classifier is able to perform segmentation or classification tasks in a rotation invariant manner. The toolbox provides the optimized function sta_invrts to compute descriptor fields based on either the power spectra,

$$\begin{aligned} d_j:=p\{\mathbf {a}^j\},&\end{aligned}$$
(51)

or the bi-spectra

$$\begin{aligned} d_k:=b\left\{ \mathbf {a}^{j_1},\mathbf {a}^{j_2},\mathbf {a}^{j_3}\right\} , \end{aligned}$$
(52)

from the expansion coefficients of an orientation field. This function automatically takes care of product associativities, see Sect. 3.3.1.

HARDI Power-Spectrum Descriptors Consider a high angular resolution diffusion image (HARDI) [112]. Such a measurement is essential for in vivo fiber tracking [85], see Sect. 4.2. For fiber tracking, it is important to identify voxels which correspond to white matter or gray matter tissue. A typical approach is to co-register with mask images that have been registered to a T1 image. A problem is that typical HARDI images have a low spatial resolution and suffer from strong noise and distortion artifacts. This makes the co-registration error-prone. The HARDI signal is a symmetric orientation field with an equidistant sampling of the angular space. The symmetry implies that only the even coefficients of a spherical harmonic expansion contribute to the expansion. For HARDI signals, the expansion (17) simplifies to

$$\begin{aligned} f(\mathbf {r}, \mathbf {n}) = \sum _{j=0}^\infty \mathbf {a}^{2j}(\mathbf {r})^\top \mathbf {Y}^{2j}({\mathbf {n}}). \end{aligned}$$
(53)
Fig. 15
figure 15

Accuracy and computation time of tensor derivatives. For further details regarding the experiment, we refer to Sect. 4.2.1 in [97]. a Qualitative comparison of the second-order and the fourth-order approximation; GT is the explicitly computed harmonic (the center slice of a \(112^3\) volume is shown). b Accuracy: normalized cross-correlation with the explicitly computed harmonics (GT). c Computation speed (Intel Quad Core i5-2400 CPU @ 3.1 GHz)

Fig. 16
figure 16

A voxel classification of a HARDI signal into four classes: background (BG), cerebrospinal fluid (CSF) and brain gray/white matter tissue (GM/WM). We show results for four different test images. On the left, we show the classification results based on the features proposed by Schnell et al. On the right, the results of an extension with local derivatives

Invariant features based on a tensor decomposition of the HARDI signal have been explored quite frequently in the literature; see, for instance, [46, 48, 51, 72, 95]. In [94], a learning-based approach was introduced: The power spectrum of the spherical harmonic expansion coefficients of the HARDI signal has been used to classify tissue into gray matter, white matter and background. With the STA-toolbox, these features can be computed in only two steps: first, the conversion of the HARDI array representation into a stafieldStruct, and then the computation of the invariants.

figure u
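With the hypothetical HARDI conversion from Sect. 4.2 and an assumed selector for the power spectrum, the two steps might be sketched as:

    % hypothetical sketch: HARDI signal to a power-spectrum descriptor field
    ifield = sta_hardi2stafield(hardi, dirs, 4);  % placeholder name, see Sect. 4.2
    D      = sta_invrts(ifield, 'power');         % 'power' selector is assumed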

On the left side of Fig. 16, we show the label predictions of a random forest [14] on a test set. The forest has been trained on five images using the power spectrum of the HARDI signal.

As shown in Fig. 16, this approach leads to a noisy, discontinuous segmentation of the HARDI images. The HARDI features can easily be enriched with local wavelet features, which improves the results. The idea is that not only the raw signal itself, but also rotation invariant representations of local neighborhood descriptors are included in the feature vector. For this example, we use the function sta_wavelet. In contrast to sta_steerFilt, sta_wavelet can project not only images, but also higher-order spherical tensor fields \(\mathbf {f}^i\in {\mathcal {T}}_i\) onto expansion coefficients of an orientation function. In our example, we map the coefficients from (53) to new coefficients \(\mathbf {b}^{\ell (j)}\), where

$$\begin{aligned} \mathbf {b}^{\ell (j)}:=\left( {\varvec{\partial }}^{{ |\ell -j |}} \circ _{\ell } (\mathbf {a}^j* G_\sigma )\right) . \end{aligned}$$
(54)

We then form power spectrum-based descriptor images from these coefficients. For details regarding this projection, see Sect. 5.1.3.1 in [97].

Fig. 17
figure 17

With only two lines of code, we can compute the band-limited power spectrum features \(p\{\mathbf {a}^j\}\) of the HARDI signal (left column). This can be used for tissue classification [94]. We can improve the classification performance by including voxel neighborhood information in the descriptor image. To this end, we compute the power-spectrum features of the neighborhood descriptor fields \(\mathbf {b}^{\ell (j)}\) at three scales \(\sigma =\{0.5,3,6\}\). The green rectangle indicates the power spectrum feature images with \(j=\ell \). Note that we masked out the air (black) for a clearer visualization

In the following example, we extend the HARDI features with the orientation fields of these wavelet features at three different scales (\(\sigma =0.5, 1\) and 6).

figure v
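A sketch of the enriched feature computation (figure v), with sta_wavelet's signature and the stacking of the descriptor images as assumptions:

    % hypothetical sketch: append neighborhood descriptors at three scales
    D = sta_invrts(ifield, 'power');            % raw HARDI power spectrum
    for sigma = [0.5, 1, 6]
        bfield = sta_wavelet(ifield, sigma);    % coefficients b^{l(j)} of (54); assumed
        D = cat(4, D, sta_invrts(bfield, 'power'));  % stacking dimension is assumed
    end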

Figure 17 shows the feature images. Thanks to this additional information, particularly the gray matter could be identified more precisely, as shown on the right-hand side of Fig. 16.

Copy&Paste Example The following example shows how to extract rotation covariant features, create rotation invariant features to form a descriptor image, and finally train and apply a classifier \(\Gamma \) to predict labels in an unseen image. The procedure is illustrated in Fig. 18.

Fig. 18
figure 18

Pipeline for a label prediction task based on rotation invariant descriptor fields. (See the copy & paste example of Sect. 5.2 for an implementation example). On the left the pair of image and labels for training. On the right-hand side: the prediction of an unseen image

figure w
Fig. 19
figure 19

A filter has been trained on two images A and B to detect four types of landmarks in T1-weighted images of human brains. Each column corresponds to one landmark. The top row shows the binary-valued label images, which mark the landmarks, on top of the input training images. The landmarks are shown in X-, Y- and Z-slices, centered at the landmark center. The two rows at the bottom show results on unseen test images C and D. We show the voting map of the filter on top of the input image. In blue, we show the maximum intensity projections of the voting image V

5.3 Steerable Voting

The STA-toolbox implements the voting, dual to sta_steerFilt, by sta_voteFilt (see Sect. 3.4).

There exists a high-level API which combines the projection and the voting into a trainable filter. It implements the filters presented in [81, 91, 100–102]. Trainable filters are a family of steerable filters. A common characteristic of these filters is that the shape of the voting kernel can be adapted to a specific 3D object detection task in an image context-dependent manner. The input is an image \(I\in {\mathcal {T}}_0\), the output a saliency map \(V\in {\mathcal {T}}_0\) for the presence of 3D objects or shapes.

Consider, for example, a training step where we are given pairs of images and binary label images of the form \(\{\)img1, label_img1\(\}\). With these training data, we can train a harmonic filter [81] with

figure x

We have trained the filter on a landmark detection task in T1-weighted MR images, similar to [100], where HARDI data were used. For the filter, we used second-order covariant features as well as odd third-order covariant features. Due to the reflection symmetry of the brain, odd third-order features are necessary to distinguish the right and the left hemisphere. We applied the filter to unseen images with

figure y

The results are shown in Fig. 19.
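The listings in figures x and y are specific to the toolbox's high-level filter API; with hypothetical placeholder names, the train/apply pair might be sketched as:

    % hypothetical sketch with placeholder function names for the trainable filter
    model = sta_filt_train({img1, img2}, {label_img1, label_img2});  % placeholder
    V     = sta_filt_apply(model, unseen_img);  % scalar voting map, V in T_0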

Copy&Paste Example The following example shows how to train and apply a voting filter. The work flow of the voting is illustrated in Fig. 20.

Fig. 20
figure 20

The trainable voting filter is a kind of trainable steerable filter. The coefficients \(\mathbf {a}^j\) represent local image features of an image I. The coefficients \(\mathbf {b}^j\) determine the shape of local voting kernels that are rendered into a saliency map V in an image content-dependent manner. The weights \(w^j_{(j_1,j_2)}\) are the free parameters. In a training step, the weights are determined so that local patches determine the shape of V. The processing step in the middle, here the products, has to be rotation covariant and, in order to get good results, also nonlinear. For the copy&paste example, we trained three sets of parameters for the three different kinds of objects

figure z

5.4 Steerable Deconvolution

Usually, for low-level feature detection, a bank of steerable filters is correlated with an image to obtain an orientation field [3, 5, 42]. The orientation field represents the evidence for the presence of simple structures like lines, ridges or edges in a position- and direction-dependent manner. The filter kernels are typically axially symmetric. In Fig. 21, we visualize examples of such kernels. The STA-toolbox implements the sta_steerFilt command, which can correlate an image with a variety of predefined kernels. In the STA-toolbox, a kernel is represented in terms of spherical tensor-valued basis functions \(\varvec{{\mathcal {K}}}^j\). As a consequence, the correlation coincides with projections of the form \( \mathbf {a}^j=(\varvec{{\mathcal {K}}}^j* I)\); it is an extraction of covariant features (Sects. 5.1.1 and 3.2).

Fig. 21
figure 21

The toolbox provides a large set of predefined kernels for image filtering and low-level feature detection. This includes kernels for tube detection, 3D ridge detection and surface detection. The bottom row shows the corresponding parameters for the steerable filter function sta_steerFilt

In many low-level feature detection tasks, simple structures like tube-shaped blood vessels or neurites are corrupted by noise and artifacts. In such cases, a simple linear filter provides a noisy and corrupted orientation field as well. Steerable deconvolution provides a way to obtain a smooth, edge-preserving orientation field from corrupted image data. For the toolbox, we have extended the steerable deconvolution of [86] to 3D.

Steerable filters and steerable deconvolution use the same kind of filter bank. The difference between a steerable filter and steerable deconvolution is the way we obtain the underlying orientation field. In the case of the steerable filter, it is a projection (Sect. 3.2). In the case of steerable deconvolution, we use the dual operation, the voting (Sect. 3.4). Let \(V_f\) be a voting field \(V_f(\mathbf {r})=\int _{{\mathbb {R}}^3}\sum _{j=0}^\infty (\mathbf {a}^j(\mathbf {x}))^T \varvec{{\mathcal {K}}}^j(\mathbf {r}- \mathbf {x}) d \mathbf {x}\), as defined in (41). Then, in steerable deconvolution, we search for the coefficients \(\mathbf {a}^j\) of the orientation field f (see (17)) which minimize the energy function

$$\begin{aligned} J_\text {vote}(f):=\int _{{\mathbb {R}}^3} | V_f(\mathbf {r}) - I(\mathbf {r}) |^2 \,\hbox {d}\mathbf {r}. \end{aligned}$$
(55)

That is, steerable deconvolution interprets low-level feature detection as an inverse problem: A steerable filter bank maps a hidden ‘orientation’ function f onto an observed intensity image I. The corresponding optimization is a deconvolution problem. As the problem is highly underdetermined, regularization is necessary. The spherical diffusion generator, which we introduced in Sect. 3.5, is an ideal candidate for this. That is, we want to find an orientation function f such that

$$\begin{aligned} J(f)=J_\text {vote}(f)+ \lambda \iint \limits _{{\mathbb {R}}^3 \times S_2} \left( \mathbf {n}\cdot \nabla f(\mathbf {r},\mathbf {n})\right) ^2\ \hbox {d}\mathbf {r} \hbox {d}\mathbf {n} \end{aligned}$$
(56)

is minimized. An advantage of this ‘inverse’ approach is that spurious correlations can be suppressed. For example, in a line detection task, crossings can be cleanly resolved, and contributions at intermediate angles are avoided. The STA-toolbox provides a simple API to apply steerable deconvolution to lines, edges and planes.

Fig. 22
figure 22

Varying intensities, gaps, blob-like structures and crossings of presynaptic sites in an image of dopaminergic neurites in a Drosophila brain make the tracing of neurites difficult. The orientation field from higher-order steerable deconvolution smoothes the neurites in an anisotropic manner and provides the information to resolve the crossings. The images show the estimation of the orientation field of the neurites using steerable deconvolution. Upper row: the orientation field based on a band-limited expansion up to order two, four and six. Bottom row: a maximum intensity projection of the orientation field's maximum response along the z-direction of a \(128 \times 128 \times 40\) fraction of a 3D micrograph (left), and a projection along y of a \(128 \times 40 \times 128\) fraction (right)

In Figs. 24 and 25, we show quantitative and qualitative results for both steerable filters and the steerable deconvolution approach.

Figure 22 shows an example obtained for neurites in a Drosophila brain. The results are obtained via

figure aa

which uses the even part of a Gauss–Laguerre expansion \(\varvec{{\mathcal {L}}}^j_n\), restricted to \(n=0\), up to order \(L=6\). One can nicely see how, with increasing expansion order L, the resulting orientation functions become sharper, particularly at crossings.
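The call in figure aa is specific to the deconvolution API; a placeholder sketch consistent with the description above is:

    % hypothetical sketch with a placeholder name for steerable deconvolution;
    % the kernel family (even Gauss-Laguerre, n = 0) and L = 6 follow the text
    lambda = 0.1;                                            % assumed regularization weight
    ofield = sta_steerDeconv(img, 'gauss_laguerre', 6, lambda);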

Copy&Paste Example This example loads and performs the steerable deconvolution on the image from Fig. 1.

figure ab
Fig. 23
figure 23

Application of spatially regularized spherical deconvolution: a, b examples on a numerical phantom, without regularization (a) and with regularization (b), for low-quality data (DTI). d–g other, simpler numerical phantoms for a crossing, with the raw data in d, f and deconvolution results regularized by FC (e) and AFC (g). Finally, in c an in vivo example is given. The background shows the so-called fractional anisotropy (normalized standard deviation of the eigenvalues of the diffusion tensor) in gray scale

5.5 High Angular Resolution Diffusion Imaging and Spherical Deconvolution

In diffusion-weighted magnetic resonance imaging (MRI), there exist high-quality acquisition schemes, like high angular resolution diffusion imaging (HARDI), where more than 60 directions per voxel are acquired, which allow the estimation of so-called fiber orientation distributions (FOD). There also exist low-quality datasets, suitable for clinical routine, that allow just the estimation of a diffusion tensor (DTI). To obtain the FOD, a physical diffusion model has to be inverted. The diffusion generator used in the steerable deconvolution is perfectly suited for the regularization of this inversion. It is used quite frequently, for example in [32, 33, 43, 44, 84]. Basically, the goal is to find a FOD f such that

$$\begin{aligned} J(f) =&\iint \limits _{{\mathbb {R}}^3 \times S_2} \left| (\mathbf {H}f)(\mathbf {x}, \mathbf {n}) - S(\mathbf {x},\mathbf {n}) \right| ^2 \ \hbox {d}\mathbf {x} \hbox {d}\mathbf {n} \nonumber \\&+ \lambda \iint \limits _{{\mathbb {R}}^3 \times S_2} (\mathbf {n}\cdot \nabla f)^2\ \hbox {d}\mathbf {x} \hbox {d}\mathbf {n} \end{aligned}$$
(57)

is minimized, where S denotes the MRI measurement. \(\mathbf {H}\) denotes the physical diffusion model, which is usually a spherical convolution, i.e., a convolution on the sphere with a rotationally symmetric kernel. The second term is the regularizer, termed contour enhancement kernel in [33] and fiber continuity (FC) in [84]. The STA-toolbox provides easy access to the commonly available kernels and methods. It further provides a positivity-constrained spherical deconvolution similar to [107]. The following simple example, contained in the toolbox, creates test data of a 45\(^{\circ }\) fiber crossing configuration and solves the inversion problem:

figure ac

The result of this example is shown in Fig. 23. Several different spatial and spherical regularization kernels are possible, like the Laplace–Beltrami operator, or the asymmetric fiber continuity (AFC) proposed in [83], which is an extension of the diffusion generator used in the example above. It introduces asymmetric orientation features due to intra-voxel curvature of fibers. It can be used via

Fig. 24
figure 24

We randomly created 850 crossings, equally divided into 17 categories. Each category contains two tubular lines which cross at a certain angle. The angles range from 10\(^{\circ }\) to 90\(^{\circ }\) with a step width of 5\(^{\circ }\). We corrupted the images with a strong salt-and-pepper-like noise which removed parts of the lines. We additionally added 15% Gaussian noise to the data. We evaluated the orientation functions of three types of tube detection filters at the center. The two largest local maxima of the orientation function were extracted. The error was the mean displacement of both maxima in orientation space compared with the “true” line directions. The plot shows the mean error over all samples with respect to the crossing angle. We also show the standard deviation as error bars. SDL8 is a steerable deconvolution filter of order eight, SFL8 a steerable filter of order eight, and SFL2 a standard steerable filter of order two [3]. As expected, an order-two filter cannot resolve crossings

Fig. 25
figure 25

We show the maximum intensity projection (MIP) of a generated 3D tubular network as it appears in images of neurites or blood vessels. We have corrupted the image with noise. We then applied steerable deconvolution and a steerable filter [3]. The bottom row shows the best results (MIP) we could obtain for both approaches

figure ad

In Fig. 23, we show qualitative results. Figure 23a, b shows results on a phantom used in the ISBI HARDI reconstruction challenge 2013. It was created with the Numerical Fiber Generator described in [19]. We show results of a non-regularized spherical deconvolution [107] in comparison with an FC-regularized deconvolution on a low-quality DTI dataset (SNR \(= 10\), #directions \(= 32\)). Figure 23d, e shows a simpler crossing example, and Fig. 23f, g a bending example using AFC. Finally, Fig. 23c shows an in vivo example (61 gradient directions at a b-value of 1 ms/\(\upmu \)m\(^2\), at a resolution of 2 mm\(^3\), acquired on a Siemens Tim Trio 3T).

6 Conclusion

The main contribution of this article is the provision of an open-source toolbox which implements all the proposed operations involved in spherical tensor algebra. We highlighted the relationship between Cartesian and spherical tensors and introduced the fundamental spherical tensor operations which, in our opinion, are of high value for the image processing community. We complemented the theory with new, implementation-relevant insights into the symmetries of spherical bi-spectra. In a survey, we have shown several applications where STA offers implementational advantages over common Cartesian tensor implementations and discretizations of the two-sphere.

With this toolbox, we want to foster the usage of STA in the context of 3D image processing. The user can avoid the gritty details of the mathematically intricate implementation of spherical tensor operations. The fundamental operations, differentiation and multiplication of spherical tensors, are efficiently implemented in C++, OpenMP and CUDA. The API is easily accessible via C++ or MATLAB/OCTAVE. New applications and/or extensions from the community are always welcome!