1 Introduction

One major limitation of Magnetic Resonance Imaging (MRI) is its slow data acquisition, which is constrained by hardware. To accelerate acquisition, k-space is often undersampled, which causes aliasing artifacts in the image domain. Reconstructing high-quality images from undersampled k-space data is therefore crucial for the clinical application of MRI.

MRI image reconstruction is an inverse problem in which undersampling leads to information loss in the forward model, and directly recovering the fully sampled image from undersampled data is intractable. Compressed sensing (CS) provides a theoretical foundation for solving such inverse problems by assuming that the reconstructed image is sparse, either in the image domain or in certain transform domains. With the ability to learn complex distributions from data, deep learning has been applied to MRI reconstruction to learn optimal sparse transformations adaptively [17, 19, 27]. Methods such as SToRM [24], GANCS [21], and DAGAN [36] follow this strategy and learn the prior distribution of the image from training data. Other studies tackle the inverse problem by learning a direct mapping from undersampled data to fully sampled data in the image domain [18], the k-space domain [2, 13], or across domains [42]. Recent works extend this idea by learning such mappings iteratively using cascaded networks [1, 3, 8, 10, 16, 29, 30, 31, 35], convolutional RNNs [26, 34], or invertible recurrent models [25]. Many studies design the networks based on the iterative optimization algorithms used in CS [6, 7, 11, 33, 39].

Ordinary differential equations (ODEs) are commonly used to describe how a system changes over time. Solving an ODE involves evaluating an integral that, in most cases, has no analytic solution, so numerical methods are routinely employed. For example, the Euler method, a first-order method, is one of the most basic numerical solvers for ODEs with a given initial value. Runge–Kutta (RK) methods, a family of higher-order ODE solvers, are more accurate and widely used in practice.
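As a brief illustration of this accuracy difference (a toy example, not part of the proposed method), the following Python snippet integrates dx/dt = -x from t = 0 to t = 1 with the Euler and classical RK4 step functions and compares both against the exact solution e^{-1}:

```python
# Toy illustration only: integrate dx/dt = -x, x(0) = 1, over [0, 1]
# with the Euler and classical RK4 step functions and the same step size.
import math

def f(t, x):
    return -x  # dynamics of the toy ODE

def euler_step(f, t, x, h):
    return x + h * f(t, x)

def rk4_step(f, t, x, h):
    k1 = h * f(t, x)
    k2 = h * f(t + h / 2, x + k1 / 2)
    k3 = h * f(t + h / 2, x + k2 / 2)
    k4 = h * f(t + h, x + k3)
    return x + (k1 + 2 * k2 + 2 * k3 + k4) / 6

def solve(step, n_steps=10):
    t, x, h = 0.0, 1.0, 1.0 / n_steps
    for _ in range(n_steps):
        x = step(f, t, x, h)
        t += h
    return x

print(f"Euler: {solve(euler_step):.6f}  RK4: {solve(rk4_step):.6f}  exact: {math.exp(-1.0):.6f}")
# With the same step size, the RK4 result is far closer to the exact value.
```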

Neural ODE [4] was introduced to model the continuous dynamics of hidden states with neural networks and to optimize such models by solving ODEs. With the adjoint sensitivity method, neural ODE models can be optimized without backpropagating through the operations of the ODE solver, so memory consumption does not depend on network depth [4]. ANODE [9, 41] further improves the training stability of neural ODEs by using Discretize-Then-Optimize (DTO) differentiation methods. [37] applies ANODE to MRI reconstruction, where the residual blocks in ResNet [14] are replaced with ODE layers. Neural ODE based methods have also been applied to image classification [22] and image super-resolution [15].

In this paper, we formulate MRI image reconstruction as an optimization problem and model the optimization trajectory as a continuous dynamic process using ODEs. The dynamics of the ODE are represented by a neural network, and the reconstructed image is obtained by solving the ODE with off-the-shelf solvers (fixed solvers). Furthermore, borrowing ideas from existing ODE solvers, we design network structures that incorporate the knowledge of these solvers and implicitly learn the coefficients and step sizes of the original solver formulations (learned solvers). We investigate several models based on three ODE solvers and compare neural ODE models with fixed solvers and learned solvers. This work presents a new direction for MRI reconstruction by modeling the continuous optimization dynamics with neural ODEs.

2 Method

MRI reconstruction is an inverse problem whose forward model is

$$\begin{aligned} y = Ex + \epsilon , \end{aligned}$$
(1)

where \(x \in \mathbb {C}^M\) is the fully sampled image to be reconstructed, \(y \in \mathbb {C}^N\) is the observed undersampled k-space and \(\epsilon \) is the noise. E is the measurement operator that transforms the image into k-space with the Fourier transform followed by undersampling. Since the inverse process is ill-posed, the following regularized objective function is often used:

$$\begin{aligned} \min _{x} ~ ||y-Ex||_2^2 + R(x), \end{aligned}$$
(2)

where \(||y-Ex||_2^2\) is the data fidelity term and R(x) is the regularization term. Equation 2 can be optimized with gradient descent based algorithms,

$$\begin{aligned} x^{(n+1)} = x^{(n)} - \eta [E^T (Ex^{(n)}-y) + \nabla R(x^{(n)})], ~~ \text {for }n=1,\ldots ,N , \end{aligned}$$
(3)

where \(\eta \) is the learning rate.
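For illustration, a minimal Python sketch of this update is given below. It is not the paper's implementation: it assumes a single-coil acquisition in which E is an orthonormal Fourier transform followed by multiplication with a binary sampling mask, and the regularizer gradient is supplied by the user.

```python
# Hedged sketch of the gradient descent update in Eq. 3 for single-coil MRI.
# E is modeled as an orthonormal 2D FFT followed by undersampling with `mask`;
# grad_R is a user-supplied callable returning the gradient of the regularizer.
import numpy as np

def E(x, mask):
    return mask * np.fft.fft2(x, norm="ortho")        # image -> undersampled k-space

def E_adj(y, mask):
    return np.fft.ifft2(mask * y, norm="ortho")       # adjoint map: k-space -> image

def reconstruct(y, mask, grad_R, eta=1.0, n_iters=50):
    x = E_adj(y, mask)                                 # zero-filled initialization
    for _ in range(n_iters):
        grad_fidelity = E_adj(E(x, mask) - y, mask)    # data fidelity gradient
        x = x - eta * (grad_fidelity + grad_R(x))      # Eq. 3 update
    return x
```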

2.1 ReconODE: Neural ODE for MRI Reconstruction

The iterative optimization algorithm in Eq. 3 can be rewritten as

$$\begin{aligned} x^{(n+1)} - x^{(n)}&= f(x^{(n)},y,\theta ), ~~\text {for }n=1,\ldots ,N . \end{aligned}$$
(4)

The left-hand side of Eq. 4 is the change of the reconstructed image between two adjacent optimization iterations; the equation essentially describes how the reconstructed image changes over the N optimization iterations. The right-hand side of Eq. 4 specifies this change through the function f. However, the change is described in discrete states defined by the number of iterations N. If we instead consider the optimization process as a continuous flow in time, it can be formulated as

$$\begin{aligned} \dfrac{\mathrm {d}x(t)}{\mathrm {d}t}&= f(x(t),t,y,\theta ). \end{aligned}$$
(5)

Eq. 5 is an ordinary differential equation, which describes the dynamic optimization trajectory (Fig. 1A). MRI reconstruction can then be regarded as an initial value problem in ODEs, where the dynamics f can be represented by a neural network. The initial condition is the undersampled image and the final condition is the fully sampled image. During model training, given the undersampled image and fully sampled image, the function f is learned from data (Fig. 1B). During inference, given the undersampled image as the initial condition at \(t_0\) and the estimated function f, the fully sampled image can be predicted by evaluating the ODE at the last time point \(t_N\),

$$\begin{aligned} x(t_N) = x(t_0) + \int _{t_0}^{t_N} f(x(t),t) dt, \end{aligned}$$
(6)

where y and \(\theta \) are omitted for brevity. Following [4], the time interval is arbitrarily set to [0, 1]. Evaluating Eq. 6 requires solving the integral, which has no analytic solution due to the complex form of f; the integral therefore needs to be approximated by numerical ODE solvers.
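As a minimal sketch of this initial value problem in code (assuming the torchdiffeq package released with [4]; the small CNN used here for the dynamics is an illustrative placeholder, not the exact architecture of Fig. 2):

```python
# Hedged sketch: reconstruction as an initial value problem solved with an
# off-the-shelf ODE solver. The complex image is stored as two real channels.
import torch
from torchdiffeq import odeint  # solver library accompanying [4]

class DynamicsCNN(torch.nn.Module):
    """Illustrative stand-in for the dynamics f(x(t), t)."""
    def __init__(self, channels=32):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Conv2d(2 + 1, channels, 3, padding=1), torch.nn.ReLU(),
            torch.nn.Conv2d(channels, 2, 3, padding=1),
        )

    def forward(self, t, x):
        t_map = torch.ones_like(x[:, :1]) * t          # time as an extra feature map
        return self.net(torch.cat([x, t_map], dim=1))

f = DynamicsCNN()
x0 = torch.randn(1, 2, 320, 320)                       # stand-in for the zero-filled image
t = torch.tensor([0.0, 1.0])                           # integrate from t0 = 0 to tN = 1
x_recon = odeint(f, x0, t, method="rk4",
                 options={"step_size": 0.1})[-1]       # solution at t = 1
```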

Next, we will introduce two models that either use the off-the-shelf solvers (fixed solvers) or learn the solvers by neural networks (learned solvers).

Fig. 1.

An illustration of MRI reconstruction via modeling the optimization dynamics using neural ordinary differential equations. (A) MRI reconstruction is formulated as an optimization problem. The dynamic optimization trajectory is described by an ordinary differential equation. (B) We can model the dynamics with a neural network and perform image reconstruction as solving the ODE with the off-the-shelf solver. Alternatively, we can replace both the dynamics and the ODE solver with neural networks to implicitly learn the solver and perform image reconstruction.

2.2 ReconODE with Fixed Solvers

The Euler solver is a first-order ODE solver that approximates the integral iteratively with a step size h,

$$\begin{aligned} x_{n+1}&= x_{n} + hf(x_{n}, t_n), \end{aligned}$$
(7)

which is often called the step function of an ODE solver. More sophisticated methods, such as higher-order RK solvers, provide better accuracy. The step function of a general s-stage RK method is

$$\begin{aligned} x_{n+1}&= x_{n} + \sum _{i=1}^{s}a_i F_i \end{aligned}$$
(8)
$$\begin{aligned} F_1&= hf(x_{n}, t_n) \end{aligned}$$
(9)
$$\begin{aligned} F_i&= hf(x_{n} + \sum _{j=1}^{i-1}b_{ij}F_j, ~t_n + c_i h), ~~ \text {for }i=2,\ldots ,s, \end{aligned}$$
(10)

where \(a_i\), \(b_{ij}\) and \(c_i\) are pre-specified coefficients. As an example, for the RK2 and RK4 solvers, the coefficients are \(s=2, a_1=a_2=\frac{1}{2}, b_{21}=c_2=1\) and \(s=4, a_1=a_4=\frac{1}{6}, a_2=a_3=\frac{2}{6}, b_{21}=b_{32}=c_2=c_3=\frac{1}{2},b_{43}=c_4=1 \), respectively (all unspecified coefficients are zero).
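The generic step function in Eqs. 8–10 can be written compactly in code. The following sketch (plain Python/NumPy, provided for illustration only) parameterizes an explicit RK step by its Butcher coefficients; the tableaux listed above recover the Euler, RK2 and RK4 steps:

```python
# Generic explicit Runge-Kutta step (Eqs. 8-10), parameterized by a_i, b_ij, c_i.
import numpy as np

def rk_step(f, x, t, h, a, b, c):
    F = []
    for i in range(len(a)):
        # F_i = h * f(x_n + sum_j b_ij F_j, t_n + c_i h)   (Eqs. 9-10)
        xi = x + sum(b[i][j] * F[j] for j in range(i))
        F.append(h * f(xi, t + c[i] * h))
    return x + sum(a[i] * F[i] for i in range(len(a)))      # Eq. 8

# Coefficients from the text (unspecified entries are zero).
euler = dict(a=[1.0], b=[[]], c=[0.0])
rk2   = dict(a=[0.5, 0.5], b=[[], [1.0]], c=[0.0, 1.0])
rk4   = dict(a=[1/6, 2/6, 2/6, 1/6],
             b=[[], [0.5], [0.0, 0.5], [0.0, 0.0, 1.0]],
             c=[0.0, 0.5, 0.5, 1.0])

# usage: x_next = rk_step(f, x, t, h, **rk4)
```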

By using one of the off-the-shelf ODE solvers to evaluate Eq. 6 numerically, the MR image can be reconstructed by solving a neural ODE. In our experiments, we model the dynamics f as a CNN (Fig. 2B) with time-dependent convolutions (Fig. 2A), in which the time information is incorporated into each convolutional layer by concatenating the scaled time with the input feature maps [4]. To train such models, we can either backpropagate through the operations of the solver (ReconODE-FT) or avoid this with the adjoint sensitivity method [4, 9] to compute the gradient (ReconODE-FA). “F” stands for fixed solvers; “T” and “A” indicate backpropagating through the solver and using the adjoint method, respectively. A data consistency layer [29] is added after the output of the ODE solver.
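For reference, a hedged sketch of a data consistency layer in the spirit of [29] is shown below (hard data consistency without noise-level weighting; the helper names are illustrative, not from the paper's code):

```python
# Data consistency: at sampled k-space locations the acquired measurements
# replace the network prediction. Complex images are stored as two real channels.
import torch

def to_complex(x):                       # (B, 2, H, W) -> complex (B, H, W)
    return torch.complex(x[:, 0], x[:, 1])

def to_channels(z):                      # complex (B, H, W) -> (B, 2, H, W)
    return torch.stack([z.real, z.imag], dim=1)

def data_consistency(x, y, mask):
    """x: predicted image, y: acquired k-space, mask: binary sampling mask."""
    k = torch.fft.fft2(to_complex(x), norm="ortho")
    k = mask * y + (1 - mask) * k        # keep acquired samples where measured
    return to_channels(torch.fft.ifft2(k, norm="ortho"))
```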

2.3 ReconODE with Learned Solvers

We now extend the above idea further: instead of using the known ODE solvers directly, we incorporate the knowledge of the solvers into the network and let the network learn the coefficients and step size. We can write the step function in Eqs. 8–10 as

$$\begin{aligned} x_{n+1}&= x_{n} + G(F_1,...,F_s; \omega )\end{aligned}$$
(11)
$$\begin{aligned} F_1&= G_1(x_{n}, t_n, y; \theta _1)\end{aligned}$$
(12)
$$\begin{aligned} F_i&= G_i(x_{n},F_1,...,F_{i-1},t_n,y; \theta _i), ~~ \text {for }i=2,\ldots ,s, \end{aligned}$$
(13)

where G and \(G_i\) are neural networks with parameters \(\omega \) and \(\theta _i\). The basic building block \(G_i\) is a CNN with five time-dependent convolutional layers, which learns not only the dynamics f but also the coefficients and step size of the original solver. Furthermore, we observe that more high-frequency details are recovered as the optimization proceeds, and in the original solver \(F_{i+1}\) is evaluated at a later time point than \(F_i\); we therefore expect \(F_{i+1}\) to recover more details. To mimic this behavior, the network \(G_{i+1}\) uses a smaller dilation factor than \(G_{i}\), encouraging it to learn more detailed information. Based on the knowledge that \(a_i \in (0,1)\) in \(\sum _{i=1}^{s}a_iF_i\), we replace the weighted sum in Eq. 8 with an attention module G.
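A sketch of one learned RK2-style step is given below (the widths, dilation factors and attention design are illustrative assumptions, not the exact configuration of Fig. 2D):

```python
# Learned RK2-style step: two CNN blocks G1, G2 with decreasing dilation produce
# increments F1, F2, and an attention module yields weights in (0, 1) that replace
# the fixed coefficients a_i in Eq. 8. Time and k-space inputs are omitted here.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch=2, width=32, dilation=1):
    return nn.Sequential(
        nn.Conv2d(in_ch, width, 3, padding=dilation, dilation=dilation), nn.ReLU(),
        nn.Conv2d(width, out_ch, 3, padding=dilation, dilation=dilation),
    )

class LearnedRK2Step(nn.Module):
    def __init__(self):
        super().__init__()
        self.G1 = conv_block(in_ch=2, dilation=2)          # coarser receptive field
        self.G2 = conv_block(in_ch=4, dilation=1)          # finer details for F2
        self.attn = nn.Sequential(nn.Conv2d(4, 2, 1), nn.Sigmoid())

    def forward(self, x):
        F1 = self.G1(x)                                     # cf. Eq. 12
        F2 = self.G2(torch.cat([x, F1], dim=1))             # cf. Eq. 13
        a = self.attn(torch.cat([F1, F2], dim=1))           # attention weights in (0, 1)
        return x + a * F1 + (1 - a) * F2                    # learned combination, cf. Eq. 11
```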

We design three networks based on the step functions of the Euler, RK2 and RK4 solvers, respectively (Fig. 2C-E), named ReconODE-LT (“L” indicates learned solvers and “T” indicates backpropagation through the solver). The final network is a cascade of the solver step functions, with parameters shared across iterations.

One potential drawback of incorporating the ODE solver into the network is the increased GPU memory usage during training, especially for complicated ODE solvers. To alleviate this problem, we adopt the gradient checkpointing technique [5], in which intermediate activations are not saved in the forward pass but are recomputed during the backward pass, dramatically reducing memory consumption at the cost of a small increase in training time.
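A minimal sketch of this technique using PyTorch's built-in utility is shown below (step_fn stands for one solver step network; the cascade length is illustrative):

```python
# Gradient checkpointing: activations inside each checkpointed call are discarded
# in the forward pass and recomputed during the backward pass.
import torch
from torch.utils.checkpoint import checkpoint

def forward_cascade(step_fn, x, n_iters=5):
    for _ in range(n_iters):
        x = checkpoint(step_fn, x)       # recompute this step's activations on backward
    return x
```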

Fig. 2.

The building blocks and overall structure of the proposed ReconODE networks. (A) In the time-dependent convolutional layer, the time after a linear transformation is concatenated with the input feature maps and then fed into the convolution operation. (B) The dynamics of the ODE are represented by a CNN with five time-dependent convolutional layers. Using off-the-shelf solvers, the ODE can be solved to obtain reconstructed images. (C-E) We extend this idea and propose three neural networks that learn the step functions of the Euler, RK2 and RK4 solvers based on the original solver formulations. Different dilation factors are utilized to learn multi-scale features. The ReconODE-LT network is a cascade of the corresponding solver step functions. The parameters are shared across cascade iterations.

2.4 Model Training and Evaluation

We used the single-coil fastMRI knee data [38], which contain 34,742 2D slices for training and 7,135 slices for validation. The fully sampled k-space data were undersampled with acceleration factors (AF) of 4 and 8, respectively. We compared our models with UNet [28, 38], which we modified to take real and imaginary channels and augmented with a data consistency layer [29]. The cascade CNN model (D5C5) [29] and KIKI-Net [8] were also included for comparison. For the ReconODE-FA models, we adapted the code from [9]. For a fair comparison, we applied the default settings of channel = 32 from the fastMRI UNet and N = 5 from the cascade CNN to all models where applicable. All models were trained using \(loss = L1+0.5 \times SSIM\) and RAdam [20] with Lookahead [40]. Results were evaluated on the fastMRI validation data using PSNR and SSIM [38].
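As an illustration, the combined loss might be implemented as below (a hedged sketch: we interpret the SSIM part as a (1 − SSIM) term so that higher structural similarity lowers the loss, we assume an off-the-shelf SSIM such as the one in torchmetrics, and we assume images normalized to [0, 1]):

```python
# Hedged sketch of the training loss L1 + 0.5 * (1 - SSIM).
import torch.nn.functional as F
from torchmetrics.functional import structural_similarity_index_measure as ssim

def recon_loss(pred, target):
    l1 = F.l1_loss(pred, target)
    ssim_term = 1.0 - ssim(pred, target, data_range=1.0)  # assumes inputs in [0, 1]
    return l1 + 0.5 * ssim_term
```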

3 Result

Table 1 shows the reconstruction results on the fastMRI validation dataset. UNet, which learns a direct mapping from the undersampled image to the fully sampled image, has the largest number of parameters, about 20 to 100 times more than the ReconODE models. D5C5, which learns the mapping iteratively, has slightly better SSIM than UNet at 4X but worse SSIM at 8X. ReconODE-LT-Euler, with only \(0.9\%\) of the parameters of UNet, achieves results similar to UNet at 4X. ReconODE-LT-RK2, with \(64\%\) of the parameters of D5C5, performs similarly to D5C5 at 4X and 8X. ReconODE-LT-RK4 achieves the best PSNR and SSIM among all models at both 4X and 8X. Figure 3 shows examples of reconstructed images. ReconODE-LT-RK4 achieves the smallest error among all models, which is consistent with the quantitative results.

Using more sophisticated but fixed ODE solvers in the ReconODE-FT and ReconODE-FA models does not significantly improve the reconstruction results, which is in line with previous findings [9]. However, with more sophisticated learned ODE solvers, the performance of the ReconODE-LT models does improve. Moreover, ReconODE-LT models outperform ReconODE-FT models with the same ODE solver. These results indicate that learning the ODE solver with the network is beneficial. To demonstrate the effectiveness of the attention module in our network, we also trained a ReconODE-LT-RK4 model without attention; the SSIM at 4X drops from 0.733 (with attention) to 0.726 (without attention). ReconODE-FT models perform better overall than ReconODE-FA models, which suggests that backpropagation through the solver may be more accurate than the adjoint method. All the improvements described above are statistically significant (\(p<10^{-5}\)).

With gradient checkpointing, the GPU memory usage of ReconODE-LT-RK4 is reduced significantly (by \(73\%\)) with only a 0.13-second increase in training time. We did not observe any difference at test time when applying gradient checkpointing (Table 2).

We initially tested the original neural ODE model [4], but its performance was poor (4X SSIM 0.687) and training was very slow, which may be due to stability issues [9, 41].

Table 1. Quantitative results on the fastMRI validation dataset.
Table 2. Benchmark of the ReconODE-LT-RK4 model with and without the gradient checkpointing technique.
Fig. 3.

Examples of reconstructed images and corresponding error maps. Results of other models are omitted due to the space limit.

4 Discussion and Conclusion

In this paper, we propose an innovative idea for MRI reconstruction. We model the continuous optimization process in MRI reconstruction via neural ODEs. Moreover, our proposed ReconODE-LT models integrate the knowledge of the ODE solvers into the network design.

Since the fastMRI leaderboard (Footnote 1) was closed for submission, we only evaluated the results on the validation dataset. Our results on validation data are not directly comparable to the top leaderboard results on test data, which additionally train on the validation data with more epochs [32] and use model ensembles [12]. Moreover, compared to models such as E2E-VN (30M) [32], PC-RNN (24M) [34], and Adaptive-CS-Net (33M) [23], our ReconODE models are much smaller (\(\le 0.15\)M parameters). The proposed methods achieve comparable performance with a much smaller model size [25]. We expect further studies will lead to a better trade-off between performance and model size. Since we intend to propose a new framework rather than a specific model implementation, further performance gains can be expected from larger networks for the ODE dynamics and more sophisticated solvers. This paper provides potential guidance for further research and extensions using neural ODEs for MRI reconstruction.