Introduction

In the human body, diffusion is an important process because it carries molecules from micro-vessels into the cells and vice versa. Perfusion occurs simultaneously with diffusion; it refers to the transport of nutrients and oxygen through blood in order to maintain physiological homeostasis (Le Bihan et al. 1988). Many illnesses (e.g., stroke, glioma, cirrhosis, and renal tumors) drastically change the characteristics of these processes, so an appropriate technique to track changes in tissue perfusion and diffusion would help diagnosis (Koh et al. 2011; Federau 2017; Szubert-Franczak et al. 2020). This is mainly because T1- and T2-weighted images may present either contrast variability for radiological findings from the same illness or contrast overlap for radiological findings from different stages of the same disease (Ngo, Frank et al. 1985).

In 1986, Denis Le Bihan succeeded in modeling random microscopic motion within magnetic resonance (MR) voxels when he created the intravoxel incoherent motion (IVIM) technique, a diffusion-weighted imaging (DWI) method that captures the signal produced by randomly moving protons within a voxel under the action of diffusion gradients (Le Bihan et al. 1986). Le Bihan first characterized the signal S as a mono-exponential decay (Eq. 1), whose exponent is the product of D, the pure water diffusion coefficient measured in square millimeters per second, and b, the diffusion gradient factor measured in seconds per square millimeter; S0 is the maximum signal amplitude (at b = 0 s/mm2).

$$S\left(b\right)={S}_{0}{e}^{-bD}$$
(1)
$$S\left(b\right)={S}_{0}{e}^{-b\cdot ADC}$$
(2)

However, biological factors such as capillary network geometry, blood velocity, and cell walls constrain physiological diffusion to specific directions, so an apparent diffusion coefficient (ADC; Eq. 2) in a mono-exponential decay describes the phenomenon more appropriately (Le Bihan et al. 1986; Yamada et al. 1999; Luciani et al. 2008; Le Bihan 2018). Afterwards, Le Bihan proposed the alternative bi-exponential model (Eq. 3), where f is the perfusion fraction, D is the pure diffusion coefficient (similar to ADC in the mono-exponential model), and D* is the pseudo-diffusion coefficient, assumed to be approximately 10 times greater than D. This model separates diffusion from perfusion effects and assumes that the intravoxel incoherent motion (IVIM) signal involves two compartments: the intravascular (f and D*) and the extravascular (D). Many researchers continue to investigate the advantages and disadvantages of multi-exponential models; bi- and tri-exponential decays have been considered more efficient than the mono-exponential one (Cercueil et al. 2015; Barbieri et al. 2016; van Baalen et al. 2017; Chevallier et al. 2019).

$$\frac{S\left(b\right)}{{S}_{0}}=\left(1-f\right)\cdot {e}^{-b\cdot D}+f\cdot {e}^{-b\cdot {D}^{*}}$$
(3)
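As a minimal illustration of Eq. 3, the following Python sketch evaluates the normalized bi-exponential signal. The function name `ivim_signal`, the short b-value list, and the parameter values are assumptions chosen only for illustration (with D* set to 10 times D, as assumed in the text); the study itself used in-house MATLAB code.

```python
import math


def ivim_signal(b, f, d, d_star, s0=1.0):
    """Bi-exponential IVIM signal (Eq. 3) at b-value b in s/mm2.

    f is the perfusion fraction; d and d_star are the diffusion and
    pseudo-diffusion coefficients, both in mm2/s.
    """
    return s0 * ((1.0 - f) * math.exp(-b * d) + f * math.exp(-b * d_star))


# Illustrative values only (assumed for this sketch).
b_values = [0, 10, 50, 200, 1000]
signal = [ivim_signal(b, f=0.20, d=1.0e-3, d_star=1.0e-2) for b in b_values]
```

At low b-values both exponentials contribute to the decay, whereas at high b-values the perfusion term is negligible and the curve approaches (1 − f)·exp(−b·D).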

Some difficulties arise when using this technique. On the one hand, long b-value sequences provide more information about tissues and improve the accuracy of estimations, but they extend the exam time, which makes image artifacts more likely. On the other hand, short sequences shorten the exam time but make IVIM signals more susceptible to noise. In addition, three parameters govern a bi-exponential decay that may be sampled with short b-value sequences, so variability in the parameters causes large changes in signal decay. As a consequence, IVIM problems are usually classified as “ill-posed” problems, whose solution is not unique and is not a linear function of the input parameters. It is therefore difficult to establish standard fitting methods for IVIM data, because the effects of that variability change according to the method used. Evidence suggests that segmented least-squares methods give the best results with regard to accuracy and precision (Park et al. 2017; Meeus et al. 2017; Cho et al. 2015), although full non-linear and non-negative methods also appear frequently in IVIM studies (Barbieri et al. 2016; Keil et al. 2017; Paschoal et al. 2018).

One way to make IVIM signals less susceptible to noise is to use regions of interest (ROIs), which allow mean signals to be calculated inside a restricted area. Radiologists often use manually drawn square ROIs to investigate pathological tissues (Inoue et al. 2014). The advantage of this procedure is that signal averaging mitigates the effects of random noise in the region (Ma et al. 2016). However, depending on their position, ROIs might enclose more than one tissue and, consequently, partial volume effects would degrade the accuracy of the analysis (Bickel et al. 2017). ROI dimension is also an important factor, because the mean signal of large regions may blur the presence of small pathologies limited to a few voxels, even if the ROI encloses only one kind of tissue (Arponent et al. 2015).

Because of the aforementioned problems, it is crucial to understand exactly what roles the IVIM variables play during IVIM signal formation and processing. Many researchers have characterized IVIM signal behavior with in vivo data, but this limits the range of parameter variations that can be explored, since usually only a small number of patients is available and such variations depend on the capabilities of MRI systems (Luciani et al. 2008; Federau 2017; Huang 2020; Lévy et al. 2020). We believe that systematically investigating these variations with synthetic data can provide insight into good strategies for in vivo applications. To the best of our knowledge, a systematic analysis that jointly includes different b-value sequences, perfusion/diffusion parameters, noise conditions, fitting methods, and ROI dimension and placement is still lacking in the literature.

The aim of this study was to characterize the influence of b-value sequences, noise, the range of diffusion and perfusion parameters, and the placement and dimension of the region of interest (ROI) on the performance of fitting methods for the bi-exponential intravoxel incoherent motion MRI (IVIM-MRI) signal. This paper is divided into the following sections: introduction; methods, where we present the hypothesis, values, and computational tools for both the voxel-wise and the image simulations; results and discussion, where we discuss the influence of each variable on IVIM parameter estimation; and conclusion, where we summarize our findings and give some insights into future research.

Methods

Voxel simulation

Following the procedure represented in the flowchart of Fig. 1, we wrote in-house MATLAB R2015b (MathWorks, Natick, MA) code to simulate bi-exponential voxel signals with the following b-value sequences, SNR values, and IVIM parameters:

  • b1 = (0, 5, 10, 15, 20, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 1000 s/mm2);

  • b2 = (0, 10, 20, 30, 40, 50, 75, 100, 200, 400, 600, 1000 s/mm2);

  • b3 = (0, 5, 10, 15, 30, 60, 120, 250, 500, 1000 s/mm2);

  • b4 = (0, 20, 40, 80, 200, 400, 700, 1000 s/mm2);

  • 20 ≤ SNR ≤ 50, step = 5

  • [f; D; D*]s1 = [0.10 ≤ f ≤ 0.30; 3.00 × 10−3; 4.00 × 10−2], step = 0.05

  • [f; D; D*]s2 = [0.20; 1.00 × 10−3 ≤ D ≤ 7.00 × 10−3; 4.00 × 10−2], step = 5.00 × 10−4 mm2/s

  • [f; D; D*]s3 = [0.20; 3.00 × 10−3; 1.00 × 10−2 ≤ D* ≤ 7.00 × 10−2], step = 5.00 × 10−3 mm2/s.

Fig. 1

Flowchart of the voxel simulation procedure. First, we create the signal according to Eq. 3; then, we add Rician noise to it and estimate the IVIM parameters with the fitting methods; we repeat this 1000 times to obtain enough data to calculate the mean values and performance metrics

We used all possible combinations of IVIM parameters, SNR values, and b-value sequences to produce the signals. The parameter values in the structures were taken from Lemke et al. (2011), Zhang et al. (2013), Cho et al. (2015), Federau (2017), Meeus et al. (2018), and Huang (2020). Rician noise was added to the signal before fitting, at levels giving SNR values between 20 and 50.
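The noise step can be sketched as follows. This is a minimal Python illustration rather than the MATLAB code used in the study: the helper `add_rician_noise` is hypothetical, and the definition SNR = S0/σ at b = 0 is an assumption, since the text does not state how SNR was defined. Rician-distributed magnitudes are obtained by adding independent Gaussian noise to the real and imaginary channels and taking the modulus.

```python
import math
import random


def ivim_signal(b, f, d, d_star, s0=1.0):
    """Bi-exponential IVIM signal of Eq. 3."""
    return s0 * ((1.0 - f) * math.exp(-b * d) + f * math.exp(-b * d_star))


def add_rician_noise(clean, snr, rng, s0=1.0):
    """Magnitude signal after adding Gaussian noise to both channels.

    Assumption: SNR = s0 / sigma, i.e. it is defined at b = 0.
    """
    sigma = s0 / snr
    return [
        math.hypot(s + rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        for s in clean
    ]


B4 = [0, 20, 40, 80, 200, 400, 700, 1000]  # sequence b4, in s/mm2
rng = random.Random(0)  # fixed seed so that the sketch is repeatable
clean = [ivim_signal(b, f=0.20, d=3.0e-3, d_star=4.0e-2) for b in B4]
# 1000 noisy realisations of the same voxel, as in the repeated procedure.
noisy_runs = [add_rician_noise(clean, snr=20, rng=rng) for _ in range(1000)]
```

Each entry of `noisy_runs` would then be passed to a fitting method, and the 1000 resulting estimates summarized.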

We compared the estimations of the following fitting methods: Levenberg–Marquardt (LEV), trust-region-reflective (TRR), segmented nonlinear least squares (NLLS2), non-negative least squares (NNLS), segmented linear least squares (LLS), and segmented robust linear least squares (LLSR). The two latter methods need a threshold to establish signal segmentation (Fig. 2), so we used 200 s/mm2 as the threshold, in accordance with previous studies (Cohen et al. 2015; Federau et al. 2013; Sigmund et al. 2012). The procedure was repeated 1000 times, and the estimations that fell outside the following limits, named outliers, were set to zero:

  • 0 < f < 1

  • 0 < D < 5 × 10−2 mm2/s

  • 0 < D* < 5 × 10−1 mm2/s

Fig. 2

IVIM segmented signals. When 0 < b < b-threshold, we consider the whole equation with perfusion and diffusion compartments. However, when b > b-threshold, we neglect the perfusion term
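One possible form of a segmented log-linear fit with a 200 s/mm2 threshold is sketched below in Python. It is an assumption about how such a method can be built, not the authors' implementation: the helper names are hypothetical, points with b equal to the threshold are assigned to the high-b segment, and ordinary (non-robust) least squares is used. The function `within_limits` encodes the outlier limits listed above.

```python
import math


def ivim_signal(b, f, d, d_star, s0=1.0):
    """Bi-exponential IVIM signal of Eq. 3."""
    return s0 * ((1.0 - f) * math.exp(-b * d) + f * math.exp(-b * d_star))


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def segmented_fit(b_values, signal, threshold=200.0):
    """Two-step estimate of (f, D, D*).

    Assumes b_values[0] == 0, positive signal values, and at least two
    usable points in each segment.
    """
    s0 = signal[0]
    # Step 1: for b >= threshold the perfusion term is neglected, so
    # ln S = ln(S0 * (1 - f)) - b * D.
    high = [(b, math.log(s)) for b, s in zip(b_values, signal) if b >= threshold]
    slope, intercept = linear_fit([b for b, _ in high], [y for _, y in high])
    d = -slope
    f = 1.0 - math.exp(intercept) / s0
    # Step 2: remove the diffusion component below the threshold and fit
    # ln(residual) = ln(S0 * f) - b * D*; non-positive residuals are skipped.
    low = []
    for b, s in zip(b_values, signal):
        residual = s - s0 * (1.0 - f) * math.exp(-b * d)
        if b < threshold and residual > 0.0:
            low.append((b, math.log(residual)))
    slope_star, _ = linear_fit([b for b, _ in low], [y for _, y in low])
    return f, d, -slope_star


def within_limits(f, d, d_star):
    """True when an estimate lies inside the stated outlier limits."""
    return 0.0 < f < 1.0 and 0.0 < d < 5e-2 and 0.0 < d_star < 5e-1


B2 = [0, 10, 20, 30, 40, 50, 75, 100, 200, 400, 600, 1000]  # sequence b2
clean = [ivim_signal(b, f=0.20, d=3.0e-3, d_star=4.0e-2) for b in B2]
f_hat, d_hat, d_star_hat = segmented_fit(B2, clean)
```

On this noise-free signal the estimates stay close to the simulated values; the small remaining bias comes from the perfusion term that is neglected at b = 200 s/mm2.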

To evaluate the performance of methods, b-value sequences, and SNR values, we normalized the 1000 estimated parameters (Eqs. 4 and 5) so as to define an origin in a reference system composed of three axes: f, D, and D*. In Eq. 4, pe and ps are the estimated and the simulated parameters, respectively; in Eq. 5, ∆pmin and ∆pmax are, respectively, the minimum and maximum differences between estimated and simulated parameters among the 1000 estimations. We used the Euclidean distance Deu to the origin (Eq. 6) to assign a performance score to methods, b-value sequences, and SNR values. The lower the Deu, the higher the score, which allowed us to rank them. We performed Kruskal–Wallis and multiple-comparison tests with the Dunn–Šidák correction to differentiate results statistically at the 5% significance level.

$$\Delta p=\left|{p}_{e}-{p}_{s}\right|$$
(4)
$${p}_{norm}=\frac{\Delta p-\Delta {p}_{min}}{\Delta {p}_{max}-\Delta {p}_{min}}$$
(5)
$${D}_{eu}=\sqrt{\sum_{j=1}^{N}\left({D}_{norm,j}^{2}+{D}_{norm,j}^{*2}+{f}_{norm,j}^{2}\right)}$$
(6)
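Equations 4 to 6 can be sketched in Python as shown below. The function names and the three-sample example are illustrative assumptions; in the study, N = 1000 estimations were used. Returning zeros when all errors are equal is also an assumption, made here only to avoid division by zero.

```python
import math


def normalised_errors(estimates, truth):
    """Min-max normalised absolute errors (Eqs. 4 and 5)."""
    deltas = [abs(p_e - truth) for p_e in estimates]  # Eq. 4
    lo, hi = min(deltas), max(deltas)
    if hi == lo:
        # Assumption: identical errors are mapped to zero.
        return [0.0] * len(deltas)
    return [(delta - lo) / (hi - lo) for delta in deltas]  # Eq. 5


def euclidean_distance(f_norm, d_norm, d_star_norm):
    """Distance to the origin of the (f, D, D*) system (Eq. 6)."""
    total = sum(
        fn ** 2 + dn ** 2 + dsn ** 2
        for fn, dn, dsn in zip(f_norm, d_norm, d_star_norm)
    )
    return math.sqrt(total)


# Hypothetical estimates from three repetitions of one simulated voxel.
f_norm = normalised_errors([0.20, 0.25, 0.30], truth=0.20)
d_norm = normalised_errors([3.0e-3, 3.3e-3, 2.8e-3], truth=3.0e-3)
d_star_norm = normalised_errors([4.0e-2, 5.0e-2, 3.5e-2], truth=4.0e-2)
d_eu = euclidean_distance(f_norm, d_norm, d_star_norm)
```

Because each parameter is rescaled to the interval [0, 1] before the distance is computed, f, D, and D* contribute on a comparable scale despite their different units and magnitudes.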

Sequence b1 is expected to perform best among all the b-value sequences. However, it is too long to be feasible in real exams. Its performance will nevertheless be used as a reference to evaluate the remaining sequences.

Image simulation

We used the best method/b-value sequence combination from the previous simulation to simulate a set of voxels and calculate its parameters. This set reproduces an MR image with three different tissues characterized by the following parameters:

  • Tissue 1: [f; D; D*] = [0.10; 1.00 × 10−3; 1.00 × 10−2]

  • Tissue 2: [f; D; D*] = [0.20; 2.00 × 10−3; 2.00 × 10−2]

  • Tissue 3: [f; D; D*] = [0.30; 3.00 × 10−3; 3.00 × 10−2]

  • Image dimension: (Nx, Ny) = (160, 60)

  • Dimension of tissues: (∆x, ∆y) = (50, 50)

Note that we used a discrete morphology to create the tissues, meaning that there were no transition areas between them. The image simulation procedure is shown in the flowchart of Fig. 3, and Fig. 4 shows what the image looked like. We analyzed four dimensions of square ROIs: 2 × 2, 3 × 3, 4 × 4, and 5 × 5. The same range of SNR values as in the previous simulation was used to introduce noise into the image.
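A phantom of this kind could be built as in the following Python sketch. The layout is an assumption: the three 50 × 50 tissues are placed side by side with no gap, and the 5-pixel offsets `X0` and `Y0` are hypothetical, since the text gives only the image and tissue dimensions. Background pixels are set to zero before noise is added.

```python
import math

NX, NY = 160, 60   # image width and height in pixels
TISSUE_SIZE = 50   # each tissue is 50 x 50 pixels
X0, Y0 = 5, 5      # assumed position of the upper-left vertex of tissue 1
TISSUES = {        # label -> (f, D in mm2/s, D* in mm2/s)
    1: (0.10, 1.0e-3, 1.0e-2),
    2: (0.20, 2.0e-3, 2.0e-2),
    3: (0.30, 3.0e-3, 3.0e-2),
}


def build_label_map():
    """Label image: 0 for background, 1 to 3 for the tissues."""
    labels = [[0] * NX for _ in range(NY)]
    for k in TISSUES:
        x_start = X0 + (k - 1) * TISSUE_SIZE
        for y in range(Y0, Y0 + TISSUE_SIZE):
            for x in range(x_start, x_start + TISSUE_SIZE):
                labels[y][x] = k
    return labels


def noiseless_image(labels, b):
    """Normalised signal (Eq. 3) of every pixel at b-value b."""
    image = []
    for row in labels:
        out = []
        for k in row:
            if k == 0:
                out.append(0.0)  # null background before noise is added
            else:
                f, d, d_star = TISSUES[k]
                out.append((1.0 - f) * math.exp(-b * d) + f * math.exp(-b * d_star))
        image.append(out)
    return image


labels = build_label_map()
image_b0 = noiseless_image(labels, b=0)
```

One such image per b-value gives the stack to which Rician noise is added and over which the ROIs are then scanned.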

Fig. 3

Flowchart of the image simulation procedure. First, we create the signal according to Eq. 3; after that, we create the matrix that composes the image with null and non-null values; then, we add Rician noise to it and estimate the IVIM parameters with a fitting method by scanning the tissues with the ROIs

Fig. 4

A schematic representation of the image we created. It is intended to simulate a real MR image, so it has a noisy background, a height, a width, and three kinds of tissue of the same dimension characterized by their respective IVIM parameters

The initial position of the ROIs was the upper-left vertex of tissue 1 (Fig. 5); from there, they scanned the tissues until the final position around the lower-right vertex of tissue 3, following the track represented in Fig. 6. At some positions, the 3 × 3 and 4 × 4 ROIs enclosed two tissues at the same time, or tissue and image background (noise), in contrast with the 2 × 2 and 5 × 5 ROIs, which could scan the tissues with no partial volume effects. This allowed us to study the influence of placement on ROI estimations.
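The following Python sketch illustrates why only some ROI sizes straddle boundaries. It assumes that the ROIs tile the tissues without overlap (stride equal to the ROI side), which the text does not state explicitly, and it uses hypothetical 5-pixel offsets for tissue 1. Under these assumptions, ROI sides that divide the 50-pixel tissue side (2 and 5) never mix labels, whereas sides of 3 and 4 do.

```python
NX, NY = 160, 60   # image width and height in pixels
SIZE = 50          # side of each square tissue
X0, Y0 = 5, 5      # assumed upper-left vertex of tissue 1


def label_at(x, y):
    """Tissue label (1 to 3) at pixel (x, y), or 0 for background."""
    if Y0 <= y < Y0 + SIZE and X0 <= x < X0 + 3 * SIZE:
        return (x - X0) // SIZE + 1
    return 0


def count_mixed_rois(n):
    """Number of n x n ROIs containing more than one label.

    ROIs are tiled without overlap, starting at the upper-left vertex of
    tissue 1 (an assumption about the scan path).
    """
    mixed = 0
    for y in range(Y0, Y0 + SIZE, n):
        for x in range(X0, X0 + 3 * SIZE, n):
            found = {label_at(i, j) for j in range(y, y + n) for i in range(x, x + n)}
            if len(found) > 1:
                mixed += 1
    return mixed


mixed = {n: count_mixed_rois(n) for n in (2, 3, 4, 5)}
```

With this layout, `mixed[2]` and `mixed[5]` are zero, while `mixed[3]` and `mixed[4]` are positive because some ROIs cross a tissue boundary or reach the background.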

Fig. 5

The starting position of the ROIs. All of them start from the upper-left vertex of tissue 1. We tested square ROIs with 2 × 2, 3 × 3, 4 × 4, and 5 × 5 dimensions

Fig. 6

A schematic of the scan path of the ROIs as indicated by the black arrow

To assess the influence of dimension, we scanned three 42 × 46 pixel regions within the tissues separately, so that the ROIs enclosed neither background areas nor two tissues at the same time (Fig. 7). In both studies, the fitting algorithms operated on the mean signal of the ROIs. The program provided mean parameters, standard deviation, and relative error for statistical analysis.
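Averaging the signal over an ROI before fitting can be sketched as follows. The data layout (`stack[k][row][col]` for the image at the k-th b-value), the helper name, and the tiny example stack are all assumptions made for illustration.

```python
def roi_mean_signal(stack, x, y, n):
    """Mean signal per b-value of the n x n ROI with upper-left pixel (x, y).

    stack[k][row][col] holds the image acquired at the k-th b-value.
    """
    return [
        sum(image[j][i] for j in range(y, y + n) for i in range(x, x + n)) / (n * n)
        for image in stack
    ]


# Tiny hypothetical stack: two b-values, 4 x 4 pixels each.
stack = [
    [[1.0] * 4 for _ in range(4)],
    [[0.5, 0.7, 0.5, 0.7] for _ in range(4)],
]
mean_curve = roi_mean_signal(stack, x=0, y=0, n=2)
```

The resulting mean decay curve, one value per b-value, is what a fitting method would receive instead of the individual voxel signals.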

Fig. 7

The 42 × 46 pixel regions that we selected to scan with the 3 × 3 and 4 × 4 ROIs, so as to estimate parameters without interference from background noise and partial volume effects

Results

Voxel simulation

We produced more than 6000 plots of relative error vs. SNR values, normalized Deu vs. IVIM parameter values, column graphs of performance, and parametric maps. As it is not possible to show all of them here, we display the most representative ones.

The influence of SNR values

Figure 8 shows the general performance of SNR values and the performance with respect to methods and b-value sequences. There was no significant difference between the performances of SNR = 45 and SNR = 50 for the NNLS method (P = 1). As expected, the estimations become more accurate and precise as the SNR value increases; noise deteriorates signal quality and worsens estimation accuracy.

Fig. 8

a The general column graph of performance of each SNR value in the voxel simulation. b The performance of the SNR values with respect to each fitting method. c The performance of the SNR values with respect to each b-value sequence. Line graphs of Deu per SNR value for the NNLS method with the d b4 and e b3 sequences. The effects of sequences b3 and b4 on this method can be clearly seen when the parameter D varies and reaches values greater than 4 × 10−3 mm2/s

Surprisingly, the NNLS method performed poorly when estimating IVIM parameters for structure 2 when D values were between 4 × 10−3 mm2/s and 5 × 10−3 mm2/s and SNR > 35, which indicates the instability of the method. We did not analyze noise floor effects here because the b-values were not high enough for the IVIM signals to oscillate around zero.

NNLS also showed unusual behavior for structure 2 with sequences b3 and b4. Whenever D values were greater than 4 × 10−3 mm2/s, not only did Deu start to increase, but higher SNR values also gave poorer estimations (Fig. 8 d and e). In addition, the NNLS/b3 and NNLS/b4 combinations reached Deu > 0.5 when estimating parameters from [f; D; D*]s2 for SNR > 30.

The influence of fitting methods

The best method/b-value sequence combination was LEV/b2. Figure 9 shows that LEV and TRR had the best performance, with no significant difference between these two methods (P > 0.14). They yielded Deu < 0.5 when SNR > 20, and the distributions of estimated parameters were close to normal when SNR > 30. For lower SNR values, however, f and D* were overestimated and D was underestimated, in distinct distributions. For structure 2, NLLS2/b4 showed increasing Deu values from D = 4.00 × 10−3 mm2/s onward for all SNR values; for [f; D; D*]s1 and [f; D; D*]s3, Deu remained constant while the parameters varied (Fig. 9 a, c, and d).

Fig. 9

a The performance of the fitting methods with respect to each b-value sequence. b The performance of the fitting methods with respect to each SNR value. c The general column graph of performance of each fitting method in the voxel simulation. There is only a tiny difference between the TRR and LEV methods, which reached the best results. The segmented methods, apart from NLLS2, and NNLS had similar general performance

Nonetheless, LLS and LLSR estimations were often more precise and accurate than those from NNLS and NLLS2, which differed significantly (P < 0.001). Both NNLS and NLLS2 showed very unstable results, mainly in f and D* estimation; outliers exceeded 50% of their estimations for low SNR values, and the best performance of NNLS was reached at SNR = 45, whereas none of the other methods showed a drop in performance as SNR increased.

The influence of b-value sequences

Figure 10 shows the general performance of the b-value sequences and their performance with respect to the methods and the SNR values. The differences between the b2 and b3 sequences are not significant for SNR > 35 (P > 0.06). As expected, b1 yielded the best estimations, and b3 provided the second-highest performance. Notably, b3 performed very well for segmented methods, but b2 provided more accurate and precise estimations. Also, b3 performance decreased as the SNR value increased.

Fig. 10

a The general column graph of performance of each b-value sequence in the voxel simulation. b The performance of the b-value sequences with respect to each SNR value. c The performance of the b-value sequences with respect to each fitting method. d, e b-value sequence line graphs of Deu for the TRR method with respect to SNR = 50 and SNR = 20, respectively. f b-value sequence line graphs of Deu for the LLS method with respect to SNR = 50. b2 behaves far worse than b3 for segmented methods like LLS

Image simulation

The main results are presented in parametric maps (Figs. 11, 12, and 13), which show the voxel-wise estimations of the IVIM parameters for each ROI dimension. The D* maps contained many outliers, so the upper limit of the scale of those parametric maps was set to 0.03 mm2/s. In Fig. 11, only parametric maps for SNR = 20 are shown, because they best illustrate the estimation variability.

Fig. 11

Parametric maps of IVIM parameters for ROIs of size 2 × 2 (a), 3 × 3 (b), 4 × 4 (c), and 5 × 5 (d). All of them have SNR = 20, so one can see how harmful high noise levels can be for estimations and how ROI size might mitigate this. In the parametric map of D, for example, estimations improved greatly from the 2 × 2 size to the 4 × 4 size. Still, partial volume effects and noisy areas must be avoided

Fig. 12

Parametric maps estimated by a 3 × 3 square ROI for SNR values of 20 (a), 35 (b), and 50 (c). As the SNR value increases, the number of outliers drops. We also see that noise regions along the lower edge of the tissues cause overestimations

Fig. 13

Parametric maps estimated by a 5 × 5 square ROI for an SNR value of 20 (a) and a 2 × 2 square ROI for an SNR value of 40 (b). The D maps reveal that large ROIs may yield good results even when the SNR is low. Nonetheless, f and D* estimations would still lack precision

Figure 12 compares the performance of the 3 × 3 ROI in parametric maps with SNR = 20, 35, and 50. These maps display the influence of noise on the calculations when using ROIs, whose dimension may mitigate that influence (Fig. 13). Figures 11b, c and 12 a, b, and c also show the partial volume effects that arise when ROIs capture voxels with different tissue properties. These effects appear either on the edge between two tissues or on the lower edge of the tissues. Finally, we present the differences between the relative error (RE) and standard deviation (SD) of the estimations when scanning the whole image (including noise) with the ROIs and when scanning the 42 × 46 selected region (Fig. 14).

Fig. 14

RE and SD graphs comparing the estimations for the 42 × 46 pixel region with those for the whole region when SNR = 20. We see that f and D* estimations can be heavily affected by poor ROI positioning. Little difference between estimations was seen when comparing those circumstances for the 2 × 2 and 5 × 5 ROIs, which did not involve background areas

Discussion

One hypothesis for the phenomenon shown in Fig. 8 d and e is that the NNLS method does not take the number of diffusion components as input. Since we provided a synthetic bi-exponential signal, we expected two peaks in the D spectrum, with amplitudes f and 1 − f. Thus, if the method finds more than two peaks, its estimation is more likely to be inaccurate. For biological tissue in an exploratory analysis, however, this can be an advantage, since it might reveal tissue information that was not considered and should be taken into account.

This situation may have occurred when SNR values were higher, D was large, and the b-value sequences were not appropriate, so that the method behaved as if there were three or more diffusion components instead of two. As we have seen, noise can be a real problem when performing estimations with fitting methods, mainly with regard to f and D*. The latter parameter had the highest variability, in accordance with previous work (Le Bihan 2018).

Unexpectedly, LEV and TRR surpassed LLS and LLSR in terms of accuracy and precision, which contradicts many previous studies (Park et al. 2017; Meeus et al. 2017; While 2018). It seems that segmentation failed to facilitate estimations for structure 2, as LLS, LLSR, and NLLS2 had increasing distances as the SNR values increased. NLLS2 is rarely found in the literature; we used it to assess the influence of segmentation on fitting methods, and it appears unlikely to be a viable substitute for either segmented methods with linearizing steps or direct methods with no segmentation.

The distribution of values in the b-value sequences is an important factor in guaranteeing precise and accurate results. When D* is considered in the calculations, well-distributed low b-values (< 50 s/mm2) should be included; this does not harm the D and f estimations, as the linearizing process does not need many values to give satisfactory results. Perhaps this explains the lower Deu values obtained with segmented methods and b3. On the other hand, these results may be a consequence of poorly distributed high b-values in the b2 sequence, which might have yielded inaccurate values of D and f, so that the error propagated to the D* calculations.

Again, b1 would not be feasible in actual clinical application because it is too long. Still, its accurate and precise estimations are worth using for comparison: feasible sequences (such as b2, b3, and b4) that perform similarly to b1 are more likely to be used in real applications. However, there is no consensus on how long the sequences should be and how their values should be distributed; there is some evidence that optimized sequences with a large number of low b-values (< 100 s/mm2) can be useful for obtaining good estimations (Lemke et al. 2011; Zhu et al. 2019; Li et al. 2017).

In reality, this number depends on several factors, some of which were demonstrated here: tissue, noise conditions, hardware conditions, IVIM model, etc. Some studies recommend at least 16 well-distributed b-values (ter Voert et al. 2016); others say that 10 would be enough (Lemke et al. 2011). There are also works in which researchers used more than 20 values (Wurnig et al. 2018). This procedure is useful for standardizing tissue parameters (Orton et al. 2018).

With regard to the ROIs, the larger the ROI dimension, the better the estimation in terms of accuracy and precision, as long as the ROI does not include either more than one kind of tissue or all-noise regions together with biological tissue. Nonetheless, the partial volume effect is clearly a problem to be addressed, as Fig. 11 b and c show, because it harms parameter estimation by producing values that do not represent any of the tissues involved. Large hepatic blood vessels, for instance, may introduce bias into the estimations of perfusion parameters (Chevallier et al. 2019). Therefore, the homogeneity of regions within ROIs must be taken into account. Perhaps this is the cause of the discrepancies between our findings and some of those in the literature: large ROIs may also lead to poor estimations in real applications such as the differentiation of benign and malignant breast lesions (Gity et al. 2018), since they could include both healthy and necrotic tissue. In that sense, Gity et al. concluded that ROIs in the most restricted parts of breast lesions were more accurate than whole-lesion ROIs for differentiating benign from malignant tumors. Their explanation was that small ROIs in the most restricted part include only the most viable and cellular portion of the lesions and may result in better estimations.

We also see in the parametric maps that noise leads to overestimated parameters, as shown in Fig. 12 b and c, where a line of high values appears along the lowest areas of the tissues. This probably happens because of the lower mean signal amplitude in the ROI, which the fitting methods interpret as faster decay and which, as a consequence, yields higher parameter values. Moreover, poor noise conditions make outliers more likely (Fig. 12a). Evidence of this can be found by comparing the parametric maps of D for SNR = 20 and ROI 3 × 3 (Fig. 12a), SNR = 40 and ROI 2 × 2 (Fig. 13b), and SNR = 20 and ROI 5 × 5 (Fig. 13a). Depending on the noise level, it is possible to reach accurate and precise estimations by using larger ROIs (Figs. 13 and 14).

It must be said, however, that the simulation also has limitations. For example, it was done with a discrete morphology, so the boundaries between tissues are well defined, which is rarely the case with real tissues. Moreover, we assumed that there is only one kind of tissue per voxel, which may not be sufficiently realistic.

Conclusion

Non-segmented non-linear fitting methods may estimate IVIM parameters more precisely and accurately than either segmented two-step methods or non-negative methods, regardless of SNR value and b-value sequence. In this study, they also produced fewer outliers. Yet D* was the most difficult parameter to estimate, and even LEV and TRR performed poorly on it for intermediate SNR values. It is therefore worth taking noise conditions into account before using the IVIM technique. Conversely, we managed to estimate highly precise D values with both segmented and non-segmented methods. With regard to b-value sequences, the distribution of low b-values, rather than high ones, is crucial for obtaining accurate and precise parameters. Yet these sequences should not be too long, to avoid excessively long exams and image artifacts. In fact, the optimal b-value sequence depends on the biological tissue and the fitting method used; a single sequence will hardly work well for all kinds of tissue and methods. ROI placement plays a vital role in MRI diagnosis, as it provides noise mitigation, but it should be carefully considered in order to protect estimations from partial volume effects.

We hope to have shown here how complex IVIM parameter estimation is. It is an ill-posed problem whose behavior is related to signal processing and acquisition. Although our work involved conditions that could easily be reproduced during in vivo tests, one cannot be certain of obtaining the same results; nevertheless, we demonstrated important aspects to be considered when working with IVIM signals. IVIM is a promising technique, but clinical applications showing how fitting methods and bi-exponential models could contribute to characterizing pathological tissues and degenerative diseases are still lacking.