Keywords

1 Introduction

Manifold-valued data arises in many medical applications, for example as image data or in the form of 2D/3D shapes, and sophisticated tools for its analysis have become increasingly important. Regression methods are central to modern statistics and research for their applicability to nonlinear spaces is fuelled by an ever-growing number of large longitudinal studies  [11]. Consequently, geodesic regression  [9, 20] was introduced as a generalization of linear regression. It allows to test whether given instances in a Riemannian manifold can be well approximated by a generalized straight line. Nevertheless, there are processes that cannot be accurately described by a geodesic, e.g., periodic motion or processes with saturation which slow down after some time. In order to handle these cases, both non-parametric  [8, 17, 23] and parametric models have been studied. In the latter category, Riemannian polynomials  [14] and splines  [22] have been considered for nonlinear regression. They are defined, for example by employing variational principles, as solutions to differential equations involving curvature terms. Therefore, evaluation and optimization is complicated and numerically expensive since there are no closed-form solutions available in general.

As an alternative, we propose to use manifold-valued Bézier curves  [18, 21]. They coincide with polynomial curves in Euclidean space, are intrinsic to the manifold (i.e., independent of a choice of coordinates) and more flexible than geodesics. In contrast to Riemannian polynomials, Bézier curves allow for explicit formulas, which enables us to evaluate them directly without time-discretization. This can improve computational speed without suffering a loss of accuracy. Furthermore, we can combine two such curves to a differentiable spline independently of the degrees of the Bézier segments. This is again an advantage over polynomial curves. While variational spline models allow for piecewise composition, there is no clear way to define them for even degrees  [14]. Therefore, the introduction of flexible, intrinsic splines is a key contribution of this work. Our model features closed-form, numerically stable and efficient expressions for the gradient of the regression objective in terms of concatenated adjoint Jacobi fields  [6]. In particular, we derive an algorithm that only requires basic Riemannian operations: the exponential and logarithmic map as well as certain Jacobi fields. Notably, closed-form expressions for these operations are available for many manifolds; in particular, they are known for Kendall’s shape space  [19] and shape models based on differential  [25] and fundamental  [3] coordinates.

While the method can be generally applied to data on any manifold, we provide two specific examples from shape analysis. First, we regress the data of 100 highly resolved femur geometries with different severeness of osteoarthritis against their grade in the Kellgren Lawrence grading system. Second, we reconstruct the full motion cycle of a mitral valve from 3D geometries that were derived from ultrasound images. To the best of our knowledge, we are the first to present intrinsic regression results of such a periodic process.

2 Spline Regression

Tools from Riemannian Geometry. Before we introduce Bézier curves on manifolds we recall some important facts from Riemannian geometry; for more information see for example [7]. As is often done, we use “smooth” synonymously with “infinitely often differentiable”.

A Riemannian manifold is a differentiable manifold M together with a Riemannian metric \(\langle \cdot ,\cdot \rangle _p\) that assigns to each tangent space \(T_p M\) a smoothly varying scalar product. As a result, a distance function d is induced on M. Every Riemannian manifold comes with a unique connection \(\nabla \) called Levi-Civita connection. Given two vector fields XY on M it yields a natural way to differentiate Y along X; we denote the resulting vector field by \(\nabla _X Y\).

A geodesic \(\gamma \) is a generalized straight line and its defining property is vanishing of acceleration, i.e., \(\nabla _{\gamma '}\gamma ' = 0\), where \(\gamma ' := \frac{{\text {d}}}{{\text {d}}t} \gamma \). An important fact is that every point in M has a so-called convex neighbourhood U. Each pair \(p,q \in U\) can be joined by a unique length-minimizing geodesic \([0,1] \ni t \mapsto \gamma (t;p,q)\) that lies completely in U. In the following, we always assume to work in a convex neighbourhood. Then, \(\gamma \) is also differentiable with respect to its starting and end point. Explicit formula of these differentials involve the Riemannian curvature tensor R (which intuitively measures local deviation from flat space; see [7, Ch. 4]). It determines Jacobi fields J along \(\gamma \) as solutions to the linear second order differential equation \(\nabla _{\gamma '} \nabla _{\gamma '}J + R(J,\gamma ')\gamma ' = 0.\) Considering the boundary value problem \(J(0) = X\), \(J(1) = 0\), we denote its solution by \(J_X\). Then, the derivative of \(\gamma \) w.r.t. its starting point p in direction \(X \in T_p M\) is given by \(J_X\), i.e., \(d_p\gamma (t;\cdot ,q)(X) = J_X(t)\) for all \(t\in [0,1]\). Furthermore, since \(\gamma (t;p,q) = \gamma (1-t;q,p)\), endpoint variations are given analogously  [6, Sect. 3.1].

Another important map is the Riemannian exponential. Let \(X \in T_pM\) such that there is a geodesic \([0,1] \ni t \mapsto \gamma (t;p,q)\) in U with \(X = \gamma '(0;p,q)\). The exponential map at p is then defined by \(\exp _p(X) := q\). Its inverse is the Riemannian logarithm \(\log _p\). In particular, we have \(\log _p(q) = \gamma '(0;p,q)\).

The adjoint \(A^*\) of a linear operator \(A:T_pM \rightarrow T_qM\) is given, as usual, by the linear operator from \(T_qM\) to \(T_pM\) that conserves the scalar product, i.e., \(\langle AX,Y \rangle _q = \langle X,A^*Y \rangle _p\) for all \(X \in T_pM,\ Y \in T_qM\).

Later, we want to calculate the gradient of a composition of functions. If \(f:M \rightarrow M\) and \(g:M \rightarrow \mathbb {R}\) are smooth, then the chain rule for gradients reads \(\text {grad}_p (g \circ f) = {d_p f}^*(\text {grad}_{f(p)} g)\), i.e., the gradient of g at f(p) is “transported” to the tangent space at p by the adjoint differential of f.

Bézier Curves. In the following we restrict the domain of definition to [0, 1] for clarity. This does not influence generality as reparametrizations are always possible. In particular, geodesics \(\gamma \) can be defined on arbitrary intervals by changing the speed of travel, i.e., the length of the velocity vector.

Fig. 1.
figure 1

Left: A cubic Bézier curve \(\beta \) on the sphere \(\mathcal {S}^2\) and the construction of \(\beta (2/3)\) by the de Casteljau algorithm. Right: A Bézier spline with 3 cubic segments on \(\mathcal {S}^2\).

A set of \(k+1\) control points \(p_0,\dots ,p_k \in U\) defines a Bézier curve \(\beta :[0,1] \rightarrow M\) of order k according to the generalized de Casteljau algorithm

$$\begin{aligned} \beta _i^0(t)&:= p_i, \nonumber \\ \beta _i^l(t)&:= \gamma (t;\beta _i^{l-1}(t),\beta _{i+1}^{l-1}(t)), \quad l=1,\dots ,k, \quad i=0,\dots ,k-l, \end{aligned}$$
(1)

by \(\beta (t) := \beta _0^k(t).\)

Note that \(\beta (0) = p_0\) and \(\beta (1) = p_k\). Furthermore, the velocities of \(\beta \) at these points are

$$\begin{aligned} \beta '(0) = k \log _{p_0}(p_1) \text { and } \beta '(1) = -k \log _{p_k}(p_{k-1}); \end{aligned}$$
(2)

see [21, Thm. 1]. The algorithm is visualized on the left of Fig. 1. Whenever of interest, we will make the dependence of \(\beta \) on its control points explicit by writing \(\beta (t;p_0,\dots ,p_k)\). Note that if there are only 2 control points \(p_0,p_1\), then \(\beta \) is just the geodesic from \(p_0\) to \(p_1\). In Euclidean space, the above algorithm is the ordinary de Casteljau algorithm (because there geodesics are straight lines) and it is a well known fact that then \(\beta \) is a curve with polynomials of order at most k as entries.

Property (2) allows us to fit Bézier curves of possibly different orders together to a differentiable spline. For \(i=0,\dots ,L-1\) let \(p^{(i)}_0,\dots ,p^{(i)}_{k_i}\) be the control points of L Bézier curves such that

$$\begin{aligned} p^{(i)}_{k_i} = p^{(i+1)}_0 \quad \text {and} \quad \gamma \left( \frac{k_i}{k_i + k_{i+1}};p^{(i)}_{k_i-1},p^{(i+1)}_1 \right) = p^{(i+1)}_0 \end{aligned}$$
(3)

for all \(i=1,\dots ,L-2\). Then, we define the Bézier spline B by

$$\begin{aligned} B(t) := {\left\{ \begin{array}{ll} \beta (t;p^{(0)}_0,\dots ,p^{(0)}_{k_0}), &{}\quad t \in [0,1],\\ \beta (t-i;p^{(i)}_0,\dots ,p^{(i)}_{k_i}), &{}\quad t \in (i,i+1], \quad i=1,\dots ,L-1. \end{array}\right. } \end{aligned}$$
(4)

From (2) it follows that B is \(C^1\), i.e., we can make B differentiable by aligning the three control points at the connections thereby removing one degree of freedom. For more details see [13, Sect. 2.3]. Note that we could add further restrictions to ensure that B is \(C^2\) [21, p. 119].

If \(L > 1\) and the first and last segment of B are at least cubic, we can consider closed Bézier splines. Then, B is \(C^1\) and closed if and only if (3) extends cyclically, that is, we also have

$$p^{(L-1)}_{k_{L-1}} = p^{(0)}_0 \quad \text {and} \quad \gamma \left( \frac{k_{L-1}}{k_{L-1} + k_{0}};p^{(L-1)}_{k_{L-1}-1},p^{(0)}_1 \right) = p^{(0)}_0.$$

In the following, we set

$$\begin{aligned} K := {\left\{ \begin{array}{ll} k_0+k_1+\cdots +k_{L-2}+k_{L-1}, &{}\quad B \text { non-closed,} \\ k_0+k_1+\cdots +k_{L-2}+k_{L-1}-1, &{}\quad B \text { closed,} \end{array}\right. } \end{aligned}$$

and denote the set of \(K+1\) distinct control points of B by \(p_0,\dots ,p_K\). In the non-closed case this means

$$(p_0,\dots ,p_K) := \left( p^{(0)}_0,\dots ,p^{(0)}_{k_0},p^{(1)}_1,\dots ,p^{(1)}_{k_1},\dots ,p^{(L-1)}_{1},\dots ,p^{(L-1)}_{k_{L-1}} \right) \in M^{K+1},$$

while \(p_0^{(0)}\) is left out for closed B. An example of a \(C^1\) spline with three cubic segments and 10 distinct control points is shown on the right of Fig. 1.

The Model. Let N data points \(q_i \in U\) with corresponding scalar parameter values \(t_i\) (for example points in time) be given. We suppose that the data points \(q_i\) are realizations of an M-valued random variable Q that depends on the deterministic variable \(t \in \mathbb {R}\) according to the model

$$Q(t) = \exp _{B(t;p_0,\dots ,p_K)}(\epsilon ).$$

Here, \(\epsilon \) is a random variable that takes values in the tangent space \(T_{B(t)}M\). The control points \(p_0,\dots ,p_K\) are the unknown parameters. In Euclidean space it reduces to polynomial spline regression since Bézier curves and polynomials coincide. Note that our model is a generalization of geodesic regression  [9], which it reduces to when B consists of a single segment with 2 control points.

Least Squares Estimation. Given N realizations \((t_j,q_j) \in \mathbb {R} \times U\), the sum-of-squared error is defined by

$$\begin{aligned} \mathcal {E}(p_0,\dots ,p_K) := \frac{1}{2} \sum ^N_{j=1} d \Big (B(t_j;p_0,\dots ,p_K),q_j \Big )^2. \end{aligned}$$
(5)

Then, we can formulate a least squares estimator of the Bézier spline model as the minimizer of this error, which under certain conditions agrees with the maximum likelihood estimation  [9]. We would like to emphasize that none of the control points agrees with the data points as is the case in spline interpolation.

In general, minimizers of (5) are not known analytically, which makes iterative schemes necessary. Therefore, we apply Riemannian gradient descent. (For optimization on manifolds see [1].) The gradient of \(\mathcal {E}\) can be computed w.r.t. each control point individually. We write \(B_t(p_i)\) for the map \(p_i \mapsto B_t(p_i) := B(t;p_0,\dots ,p_i,\dots ,p_k)\) and define the functions \(p \mapsto \tau _j(p) := d(p,q_j)^2\). It is known that \(\text {grad}_p \tau _j = -2\log _p(q_j)\) for each \(p \in U\). When we consider the j-th summand on the right-hand side of (5), the chain rule implies that its gradient w.r.t. the i-th control point is given by

$$\text {grad}_{p_i} (\tau _j \circ B_{t_j}) = d_{p_i}B_{t_j}^* \left( \text {grad}_{B_{t_j}(p_i)} \tau _j \right) = -2\ d_{p_i}B_{t_j}^* \left( \log _{B_{t_j}(p_i)}(q_j) \right) .$$

Using (5) then gives the gradient of \(\mathcal {E}\). The operator \(d_{p_i}B_t^*\) “mirrors” the construction of the segment of B to which \(p_i\) belongs by transporting the vector \(\log _{B_{t_i}(p_i)}(q_j)\) backwards along the “tree of geodesics” defined by the de Casteljau algorithm (1). More precisely, the result is a sum of vectors in \(T_{p_i}M\) that are values of concatenated adjoint differentials of geodesics w.r.t. starting and end point. In symmetric spaces, for example, they are known in closed form. For a detailed inspection of \(d_{p_i}B_t^*\) we refer to  [6, Sect. 4].

As initial guess for the gradient descent, we choose \((p_0,\dots ,p_K)\) along the geodesic polygon whose corners interpolate the data points that are closest to knot points w.r.t. time.

3 Experiments

Although physical objects themselves are embedded in Euclidean space, their shape features are best described by more general manifolds requiring Riemannian geometric tools for statistical analysis thereon; see for example  [4, 16, 25]. To test our regression method for shape analysis, we apply it to two types of 3D data: (i) distal femora and (ii) mitral valves given as triangulated surfaces. We perform the analysis in the shape space of differential coordinates  [25]. That is, for homogeneous objects given as triangular meshes in correspondence, we choose their intrinsic mean  [10] as reference template and view all objects as deformations thereof. (We assume that the meshes are rigidly aligned, e.g., by generalized Procrustes alignment [12].) On each face of a mesh, the corresponding deformation gradient is constant and, therefore, can be encoded as a pair of a rotation and a stretch, i.e., as an element of the Lie group of 3 by 3 rotation and symmetric positive definite matrices \(\text {SO}(3) \times \text {Sym}^+(3)\). Denoting the Frobenius norm by \(\Vert \cdot \Vert _F\), metrics are chosen such that the distance functions become \(d_{SO}(R_1,R_2) := \Vert \log (R_1^TR_2) \Vert _F\) and \(d_{\text {Sym}^+}(S_1,S_2) := \Vert \log (S_2)-\log (S_1) \Vert _F\), respectively. Suppose the number of triangles per object is m, then the full shape space is the product space \((\text {SO}(3) \times \text {Sym}^+(3))^m\). Using the product metric, statistical analysis thereon can be done face-wise and separately for rotations and stretches. We implemented a prototype of our method in MATLAB using the MVIRT toolbox  [5].

Distal Femora. Osteoarthritis (OA) is a degenerative disease of the joints that is, i.a., characterized by changes of the bone shape. To evaluate our model, we regress the 3D shape of distal femora against OA severity as determined by the Kellgren-Lawrence (KL) grade  [15]—an ordinal scale from 0 to 4 based on radiographic features. Our data set comprises 100 shapes (20 per grade) of randomly selected subjects from the OsteoArthritis Initiative (a longitudinal, prospective study of knee OA) for which segmentations of the respective magnetic resonance images are publicly available (https://doi.org/10.12752/4.ATEZ.1.0)  [2]. In a supervised post-process, the quality of segmentations as well as the correspondence of the extracted triangle meshes (8,988 vertices/17,829 faces) were ensured.

Table 1. The computed \(R^2_{\text {rel}}\) and \(R^2\) statistics of the regressed (w.r.t. KL grade) geodesic, quadratic and cubic Bézier curve for data of distal femora

For \(i=0,\dots ,4\), the shapes with grade i are associated with the value . We use our method to compute the best-fitting geodesic, quadratic and cubic Bézier curve. In order to compare their explanatory power, we calculate for each the corresponding manifold-valued \(R^2\) statistic that, for \(r_1,\dots ,r_N \in M\) and total variance , is defined by  [10, p. 56]

The statistic measures how much of the data’s total variance is explained by \(\beta \).

For \(j=1,\dots ,20\) and \(l=0,\dots ,4\), let \(q_j^{(l)}\) be the j-th femur shape with KL grade l. Note that, for the described setup, the unexplained variance is bounded from below by the sum of the per-grade variances, i.e., \(\sum _{l=0}^4\text {var}\{q_1^{(l)},\dots ,q_{20}^{(l)}\}.\) In particular, for our femur data this yields an upper bound for the \(R^2\) statistic of \(R^2_{\textit{opt}} \approx 0.0962\). Hence, we also provide relative values for comparison. The results are shown in Table 1.

Fig. 2.
figure 2

Cubic regression of distal femora. Healthy regressed shape (\(\text {KL}=0\)) together with subsequent grades overlaid wherever the distance is larger than 0.5 mm, colored accordingly (0.5   3.0).

The computed cubic Bézier curve is displayed in Fig. 2. The obtained shape changes consistently describe OA related malformations of the femur, viz., widening of the condyles and osteophytic growth. Furthermore, we observe only minute bone remodeling for the first half of the trajectory, while accelerated progression is clearly visible for the second half. The substantial increase in \(R_{\text {rel}}^2\) suggests that there are nontrivial higher order phenomena involved which are captured poorly by the geodesic model. Moreover, as time-warped geodesics are contained in the search space we can inspect time dependency. Indeed, for the cubic femoral curve the control points do not belong to a single geodesic, confirming higher order effects beyond reparametrization.

Mitral Valve. Diseases of the mitral valve such as mitral valve insufficiency (MI) are often characterized by a specific motion pattern and the resulting shape anomalies can be observed (at least) at some point of the cardiac cycle. In patients with MI, the valve’s leaflets do not close fully or prolapse into the left atrium during systole. Blood then flows back lowering the heart’s efficiency.

We compute regression with Bézier splines for the longitudinal data of a diseased patient’s mitral valve. Sampling the first half of the cycle (closed to fully open) at equidistant time steps, 5 meshes (1,331 vertices/2,510 faces) were extracted from a 3D+t transesophageal echocardiography (TEE) sequence as described in  [24]. Let \(q_1,\dots ,q_5\) be the corresponding shapes in the space of differential coordinates. In order to approximate the full motion cycle we use the same 5 shapes in reversed order as data for the second half of the curve. Because of the periodic behaviour, we choose a closed spline with two cubic segments as model and assume an equidistant distribution of the data points along the spline, i.e., we employ as the full set.

Fig. 3.
figure 3

Reconstructed meshes from regression of longitudinal mitral valve data covering a full cardiac cycle. The spline consists of two cubic segments.

The regressed cardiac trajectory is shown in Fig. 3. Our method successfully estimates the valve’s cyclic motion capturing the prolapsing posterior leaflet. It shows the potential for improved reconstruction of mitral valve motion in presence of image artifacts like TEE shadowing and signal dropout. This, in turn, facilitates quantification of geometric indices of valve function such as orifice area or tenting height.

4 Conclusion

We presented a parametric regression model that combines high flexibility with efficient and exact evaluation. In practice, it can be used for many types of manifold-valued data as it relies only on three basic differential geometric tools that can be computed explicitly in many important spaces. In particular, we have presented two applications to shape data where we could model higher order effects and cyclic motion. A remaining question is, which Bézier spline (number of segments and their order) to choose for the analysis of a particular data set. This problem of model selection poses an interesting avenue for future work. Moreover, we plan to extend the proposed framework to a hierarchical statistical model for the analysis of longitudinal shape data, where subject-specific trends are viewed as perturbations of a population-average trajectory represented as Bézier spline.