Introduction

Vesicoureteral reflux (VUR) is the abnormal retrograde flow of urine from the urinary bladder to the upper urinary tract [1, 2]. VUR is often diagnosed after a urinary tract infection (UTI) has been detected [3-7]. Renal scarring is more likely to develop following a UTI in children with VUR than in children without VUR, and it can lead to hypertension and chronic renal failure later in life [8, 9].

Voiding cystourethrography (VCUG) is the method of choice for the diagnosis of VUR. VCUG can reliably grade VUR and provide fine anatomic details of urinary systems, but it involves radiation exposure and invasive catheterization [1, 5].

Intravoxel incoherent motion (IVIM) is a concept introduced and developed by Le Bihan et al. for the quantitative assessment of the microscopic translational motions, including molecular diffusion and perfusion, that occur in each voxel in magnetic resonance (MR) imaging; it is both radiation-free and contrast media-free [10, 11]. Using IVIM diffusion-weighted imaging (DWI), pure molecular diffusion parameters and perfusion-related diffusion parameters can be separated [10]. IVIM DWI can detect cortical defects with sensitivity and specificity comparable to those of 99mTc-dimercaptosuccinic acid (DMSA) scintigraphy, which is widely used to detect renal scarring in children with UTI but involves radiation exposure. Moreover, IVIM DWI allows quantitative evaluation of the low-signal-intensity foci considered to be cortical defects [12].

Considering the invasiveness of VCUG, an alternative non-invasive imaging modality for the detection of VUR would be advantageous. We postulated that urine flow in the renal pelvis differs between the “reflux” and the “non-reflux” kidneys, and that this difference may affect the diffusion parameters of IVIM DWI. To our knowledge, no published study has used MR imaging to detect or predict VUR in children with UTI.

Accordingly, the purposes of our study were to compare the diffusion parameters of IVIM DWI between the “reflux” and the “non-reflux” kidneys, and to evaluate the feasibility of using IVIM DWI to predict VUR in children with UTI.

Materials and methods

Patients

This retrospective Health Insurance Portability and Accountability Act (HIPAA)-compliant study was approved by our institutional review board, and the requirement for informed consent was waived. Following a review of medical records and a radiology database from May 2011 to July 2013, we identified 397 pediatric patients who underwent VCUG for a clinically suspected first episode of febrile UTI with hydronephrosis confirmed by abdominal ultrasonography. Three hundred and forty patients who did not undergo MR urography including IVIM DWI were excluded. Additionally, 31 kidneys with a renal pelvis too narrow to be measured (area < 10 mm2) were excluded. Finally, 83 kidneys from 57 patients (37 boys and 20 girls; mean age = 9.51 months; age range = 1-90 months) with a first episode of febrile UTI were included in the study (Fig. 1). The mean intervals between onset of UTI and VCUG, onset of UTI and MR urography, and VCUG and MR urography were 5.35 (1 to 11), 5.42 (1 to 10), and 2.23 (-8 to 6) days, respectively.

Fig. 1 Flowchart of the study population

MR methods

MRI was performed using a 3.0-T whole-body MR scanner (Magnetom Trio, A Tim System; Siemens, Erlangen, Germany) with a combination of a body matrix coil and a spine matrix coil. If attempts to sedate the patient through breast feeding failed, chloral hydrate (50 mg/kg) was administered orally for sedation 30 min before MR imaging. The attending pediatric physician monitored the patients and their vital signs during imaging.

A coronal T2-weighted half-Fourier acquisition single-shot turbo spin-echo (HASTE) sequence, an axial T2-weighted turbo spin-echo sequence using a BLADE k-space trajectory (BLADE-T2WI), and a coronal BLADE-T2WI sequence were acquired. The pre-contrast axial radial 3D gradient-echo (GRE) T1-weighted sequence was performed during free breathing without trigger or navigation techniques. Parallel imaging was not used for the radial 3D GRE sequence.

IVIM DW imaging

Axial diffusion weighted single shot EPI (DWI) (work-in-progress sequence, provided by Siemens, Erlangen, Germany) and coronal DWI were obtained in a free breathing manner.

Diffusion data were calculated using a prototype IVIM post-processing tool provided by the manufacturer (Siemens, Erlangen, Germany) and by MATLAB (Mathworks, USA)-based software developed at the institution. Based on the IVIM theory, the relative signal was calculated as follows:

$$ {\mathrm{S}}_{\mathrm{b}}/{\mathrm{S}}_0=\left(1-\mathrm{P}\mathrm{F}\right) \exp \left(-\mathrm{b}\mathrm{D}\right)+\mathrm{P}\mathrm{F}\; \exp \left[-\mathrm{b}\left(\mathrm{D}+\mathrm{D}*\right)\right] $$
(A)

where b represents the strength of the diffusion gradient, S0 is the signal intensity without the diffusion gradient, Sb is the signal intensity at a given b value, PF is the microvascular volume fraction, D is the pure diffusion coefficient, and D* is the perfusion-related diffusion coefficient arising from incoherent microcirculation [13]. Both the Siemens IVIM post-processing approach and the MATLAB analysis approach used a fully bi-exponential nonlinear curve fitting for the unknown parameters (D, D*, and PF). Since D* is significantly greater than D, the effects of D* on the signal decay at high b values (> 200 s/mm2) can be ignored. Therefore, at high b values, Eq. (A) can be simplified into a linear fit equation to estimate D as follows:

$$ {\mathrm{S}}_{\mathrm{b}}={\mathrm{S}}_0\; \exp \left(-\mathrm{b}\mathrm{D}\right) $$
(B)

Using the value of D calculated using Eq. (B), PF and D* values could then be calculated using a partially constrained nonlinear regression algorithm based on Eq. (A) [14].
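The two-step estimation described above can be illustrated with a minimal Python sketch. It is not the Siemens prototype or the institutional MATLAB code. The b values, the synthetic signal, and the function names `ivim_signal` and `fit_ivim_segmented` are hypothetical, and a coarse grid search is swapped in for the partially constrained nonlinear regression to keep the example free of external dependencies.

```python
import math


def ivim_signal(b, pf, d, d_star):
    """Relative signal S_b/S_0 of Eq. (A); b in s/mm^2, d and d_star in mm^2/s."""
    return (1.0 - pf) * math.exp(-b * d) + pf * math.exp(-b * (d + d_star))


def fit_ivim_segmented(b_values, rel_signal, b_threshold=200.0):
    """Two-step IVIM estimate: D from high b values, then PF and D* with D fixed."""
    # Step 1 (Eq. B): least-squares line through ln(S_b/S_0) versus b for b > threshold.
    high = [(b, math.log(s)) for b, s in zip(b_values, rel_signal) if b > b_threshold]
    n = len(high)
    mean_b = sum(b for b, _ in high) / n
    mean_y = sum(y for _, y in high) / n
    sxx = sum((b - mean_b) ** 2 for b, _ in high)
    sxy = sum((b - mean_b) * (y - mean_y) for b, y in high)
    d = -sxy / sxx  # the slope of the fitted line is -D

    # Step 2 (Eq. A): hold D fixed and search PF and D* on a coarse grid.
    # ASSUMPTION: the grid search stands in for the nonlinear regression in the text.
    best = (float("inf"), None, None)
    for i in range(100):
        pf = i / 100.0
        for j in range(300):
            d_star = 10.0 ** (-3.0 + 3.0 * j / 299.0)  # 1e-3 to 1 mm^2/s, log-spaced
            sse = sum((ivim_signal(b, pf, d, d_star) - s) ** 2
                      for b, s in zip(b_values, rel_signal))
            if sse < best[0]:
                best = (sse, pf, d_star)
    return best[1], d, best[2]


# Hypothetical b values and noise-free synthetic data, for illustration only.
b_values = [0, 10, 20, 40, 80, 150, 300, 500, 700, 900]
signal = [ivim_signal(b, 0.30, 2.8e-3, 35e-3) for b in b_values]
pf, d, d_star = fit_ivim_segmented(b_values, signal)
print(f"PF={pf:.2f}  D={d:.2e} mm^2/s  D*={d_star:.2e} mm^2/s")
```

Fixing D before estimating PF and D* reduces the number of free parameters in the second step, which is the usual motivation for a segmented fit when the low-b data are sparse or noisy.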

Imaging analysis

All VCUG images were reviewed by a radiologist (C.M.P., 32 years of experience in abdominal imaging). VUR was graded according to the international classification system developed by the International Reflux Study Committee [15]. All MR images were reviewed by two radiologists (C.H.L., J.W.K., 19 and 3 years of experience in abdominal imaging, respectively) blinded to the VCUG results. The radiologists manually positioned a circular region of interest (ROI) in the renal pelvis on the apparent diffusion coefficient (ADC) maps. To place the ROI at the same position on the ADC and parametric maps, all images were displayed in tandem on the PACS monitor, and the ROI was copied from the ADC map and pasted onto the parametric maps for D, D* and PF. A circular ROI of at least 10 mm2 was drawn as close as possible to the center of the renal pelvis. For each ROI, the mean value of each parameter (ADC, D, D* and PF) derived from the pixel-by-pixel analysis was automatically computed. The radiologists measured the ADC and IVIM parameters five times at 1-week intervals, and the median of the five measurements was used as the representative value. Examples of ADC and IVIM parameter measurement in the renal pelvis are shown in Figs. 2 and 3.

Fig. 2 A 21-day-old boy with vesicoureteral reflux in the left kidney, which was confirmed by VCUG (a). A circular region of interest (ROI) measuring ≥ 10 mm2 was placed in the renal pelvis on the axial ADC map (b) of IVIM DW MR imaging. The circular ROI was then copied and pasted to the parametric maps for D (c), D* (d), and PF (e) for placement in the same position. The ADC and IVIM parameters (D, D*, and PF) were measured five times

Fig. 3 A 3-month-old boy with hydronephrosis in both kidneys on abdominal sonography and MR urography. ADC, D, D* and PF values of both kidneys were measured on the axial ADC map (b) and the parametric maps for D (c), D* (d), and PF (e). The PF/D index values of the right and the left kidneys were 8.84 and 5.60, respectively. Because the optimal cut-off value obtained in the ROC curve analysis was 8.7, we predicted VUR only in the right kidney. The VCUG result was concordant with this prediction (a)

Additionally, indices were devised using a combination of the ADC and IVIM parameters. Based on our experience, the D* and PF parameters tend to move in the opposite direction to the ADC and D parameters; therefore, the following four indices were selected to compare the two groups: D*/ADC, D*/D, PF/ADC, and PF/D.
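As a hedged illustration of how the four indices might be computed, the sketch below assumes that ADC, D, and D* are entered in units of 10^-3 mm2/sec and that PF is converted from a fraction to a percentage. This scaling is an assumption, not stated in the text; it is chosen only because it gives PF-based indices of the same order of magnitude as those reported in the Results. The function name `reflux_indices` is hypothetical.

```python
def reflux_indices(adc, d, d_star, pf):
    """Return the four indices.

    ASSUMPTION: adc, d, d_star are in units of 10^-3 mm^2/sec and pf is a
    fraction (0 to 1) that is rescaled to percent before division.
    """
    pf_percent = 100.0 * pf
    return {
        "D*/ADC": d_star / adc,
        "D*/D": d_star / d,
        "PF/ADC": pf_percent / adc,
        "PF/D": pf_percent / d,
    }


# Input values taken from the "reflux" group summary in the Results.
indices = reflux_indices(adc=2.92, d=2.79, d_star=36.7, pf=0.31)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```

Because a ratio of group summary values is not the same as the summary of per-kidney ratios, the output is expected to be close to, but not identical with, the index values reported for the group.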

Statistical analysis

The IVIM parameters and the four indices were compared between the “reflux” and “non-reflux” kidney groups using a general linear model with a compound symmetry covariance structure. Because both kidneys were included in approximately half of the patients, the analysis was adjusted for the possible correlation between the two kidneys of the same patient.

The degree of agreement of the median values measured by two observers was assessed using 95 % Bland-Altman limits of agreement and intraclass correlation coefficients (ICCs).
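The 95 % Bland-Altman limits of agreement can be sketched as the mean inter-observer difference ± 1.96 standard deviations of the differences. The example below is a minimal illustration with made-up observer values, not the study data; the function name `bland_altman_limits` is hypothetical, and the ICC calculation is not shown.

```python
import statistics


def bland_altman_limits(observer1, observer2):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(observer1, observer2)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical median ADC values (x10^-3 mm^2/sec) from two observers.
obs1 = [3.0, 2.9, 3.1, 2.8]
obs2 = [2.9, 3.0, 3.0, 2.9]
bias, lower, upper = bland_altman_limits(obs1, obs2)
print(f"bias={bias:.3f}, limits of agreement=({lower:.3f}, {upper:.3f})")
```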

To evaluate the diagnostic performance of the ADC and IVIM parameters and the four indices for predicting VUR, receiver operating characteristic (ROC) curve analyses were performed, and the resulting areas under the ROC curve (AUCs) were compared for each parameter. The optimal cut-off values representing the maximal sum of the sensitivity and the specificity (the Youden index) were calculated and the corresponding sensitivity and specificity were obtained. Diagnostic accuracy, positive predictive value (PPV), and negative predictive value (NPV) were calculated. The 95 % confidence limits for sensitivity, specificity, PPV, and NPV were also adjusted for the clustered data [16].
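The cut-off selection by the Youden index can be sketched as follows. This is a minimal illustration rather than the SAS or MedCalc procedure: the data are made up, the function name `youden_cutoff` is hypothetical, and a kidney is assumed to be called positive when its value is at or above the cut-off, which suits parameters that are higher in reflux (such as PF) but would need to be reversed for ADC and D.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J, sensitivity, specificity) maximizing J = sens + spec - 1.

    labels: True for a "reflux" kidney, False for a "non-reflux" kidney.
    ASSUMPTION: a value >= cutoff is classified as positive.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    best = None
    for cut in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y and v >= cut)
        tn = sum(1 for v, y in zip(values, labels) if not y and v < cut)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[1]:  # on ties, keep the lowest cut-off
            best = (cut, j, sens, spec)
    return best


# Hypothetical index values for six kidneys, for illustration only.
values = [5.0, 6.0, 9.0, 7.0, 10.0, 12.0]
labels = [False, False, False, True, True, True]
cutoff, j, sens, spec = youden_cutoff(values, labels)
print(f"cut-off={cutoff}, J={j:.3f}, sensitivity={sens:.3f}, specificity={spec:.3f}")
```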

A p-value < 0.05 indicated a statistically significant difference. All statistical analyses were performed using statistical software (SAS version 9.2, SAS Institute, Cary, NC; and MedCalc for Windows, version 9.3.2.0, MedCalc Software, Mariakerke, Belgium).

Results

Of the 83 kidneys from 57 patients analyzed, VUR was confirmed by VCUG in 21 kidneys (“reflux” kidneys) consisting of grade I (n = 0), II (n = 5), III (n = 7), IV (n = 4), and V (n = 5). The remaining 62 kidneys did not have VUR on VCUG (“non-reflux” kidneys).

Agreement between the two observers for the median values of ADC, D, D* and PF was excellent, with ICCs ranging from 0.931 to 0.967 (95 % confidence interval: 0.896-0.978). The 95 % limits of agreement on Bland-Altman plots are shown in Fig. 4.

Fig. 4 Bland-Altman plots. The horizontal axis (x-axis) represents the means of (a) ADC values, (b) D values, (c) D* values, and (d) PF values measured by observers 1 and 2. The vertical axis (y-axis) represents the differences between the values measured by observers 1 and 2. The solid and dashed lines represent the mean bias and the limits of agreement (mean bias ± 1.96 × standard deviation), respectively

ADC and IVIM parameters

The median values of ADC, D, D*, and PF in the “reflux” group were 2.92 ± 0.08 (×10−3 mm2/sec), 2.79 ± 0.08 (×10−3 mm2/sec), 36.7 ± 1.5 (×10−3 mm2/sec), and 0.31 ± 0.02, respectively. The median values of ADC, D, D*, and PF in the “non-reflux” group were 3.05 ± 0.05 (×10−3 mm2/sec), 2.99 ± 0.05 (×10−3 mm2/sec), 30.3 ± 0.8 (×10−3 mm2/sec), and 0.20 ± 0.01, respectively (Table 1).

Table 1 ADC and IVIM DWI-derived parameters compared in the “reflux” and the “non-reflux” kidneys

Among the resulting values, PF and D* were significantly higher in the “reflux” kidney group (p = 0.007 and 0.033, respectively). ADC and D tended to be lower in the “reflux” kidney group, although the differences were not statistically significant (p = 0.255 and 0.112, respectively) (Table 1).

Four indices (D*/ADC, D*/D, PF/ADC, and PF/D)

The values of D*/ADC, D*/D, PF/ADC, and PF/D in the “reflux” group were 13.06 ± 0.69, 13.76 ± 0.71, 11.06 ± 0.60 (sec/mm2), and 11.72 ± 0.66 (sec/mm2), respectively. The values of D*/ADC, D*/D, PF/ADC, and PF/D in the “non-reflux” group were 10.11 ± 0.39, 10.27 ± 0.40, 6.65 ± 0.35 (sec/mm2), and 6.80 ± 0.38 (sec/mm2), respectively (Table 2).

Table 2 Index values compared in the “reflux” and the “non-reflux” kidneys

Among the resulting values, PF/D, PF/ADC, D*/D, and D*/ADC were all significantly higher in the “reflux” kidney group (p = 0.007, 0.007, 0.024, and 0.033, respectively) (Table 2).

Diagnostic performance, PPV, NPV and diagnostic accuracy for predicting VUR

In the ROC curve analyses for each IVIM DWI-derived parameter, the PF value showed the highest AUC (Az = 0.862), with a corresponding sensitivity and specificity of 66.7 % and 90.3 %, respectively. When a PF value of 0.274 was used as the cut-off value for differentiating the “reflux” kidney from the “non-reflux” kidney, an accuracy of 84.34 % was obtained with a PPV of 70.00 % and an NPV of 88.89 % (Table 3).

Table 3 Comparison of the diagnostic capability of ADC and IVIM-derived parameters and four indices for predicting VUR

In the ROC curve analyses for each index, the PF/D index showed the highest AUC (Az = 0.864), with a corresponding sensitivity and specificity of 81.0 % and 77.4 %, respectively (Table 3). When a PF/D value of 8.7 (sec/mm2) was used as the cut-off value, an accuracy of 78.31 % was obtained with a PPV of 54.84 % and an NPV of 92.31 % (Table 3).
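The reported accuracy, PPV, and NPV can be checked against a 2 × 2 table. The per-cell counts below are not given in the text; they are back-calculated from the reported sensitivities and specificities and the 21 “reflux” and 62 “non-reflux” kidneys, so they should be read as an assumption. The function name `diagnostic_summary` is hypothetical.

```python
def diagnostic_summary(tp, fn, tn, fp):
    """Return diagnostic measures in percent from 2 x 2 table counts."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "accuracy": 100.0 * (tp + tn) / total,
        "ppv": 100.0 * tp / (tp + fp),
        "npv": 100.0 * tn / (tn + fn),
    }


# ASSUMPTION: counts inferred from the reported percentages (21 reflux, 62 non-reflux).
pf_result = diagnostic_summary(tp=14, fn=7, tn=56, fp=6)      # PF cut-off 0.274
pf_d_result = diagnostic_summary(tp=17, fn=4, tn=48, fp=14)   # PF/D cut-off 8.7

for name, result in (("PF", pf_result), ("PF/D", pf_d_result)):
    print(name, {key: round(value, 2) for key, value in result.items()})
```

With these counts, the computed values agree with the reported figures for both PF and PF/D. The comparison also makes the trade-off explicit: PF/D gains three true positives at the cost of eight additional false positives, which is why its PPV is lower while its NPV is higher.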

Comparison according to the VUR grade

The twenty-one “reflux” kidneys consisted of five kidneys with grade II, seven with grade III, four with grade IV, and five with grade V. The diffusion parameters and index values were compared between low grade (grade II/III) and high grade (grade IV/V) VUR. No parameter differed significantly between the two groups.

Discussion

In the present study, among the IVIM DWI-derived parameters, PF and D* were significantly higher in the “reflux” kidney than in the “non-reflux” kidney. PF showed the lowest p-value and the highest AUC. Based on these results, the perfusion parameters discriminated the “reflux” from the “non-reflux” kidneys better than the diffusion parameters did. This was in accord with our expectations, because the normal renal pelvis, which is lined with transitional epithelium, contains urine flow but no cellular components. Although the PF value was significantly higher in the “reflux” kidney group in this study, the ROC curve analysis for PF revealed only 66.7 % sensitivity. Therefore, we attempted to develop an index capable of predicting VUR by combining IVIM parameters. We devised four indices (D*/ADC, D*/D, PF/ADC, and PF/D) and compared their diagnostic performance. These indices differed significantly between the “reflux” and the “non-reflux” kidneys. Among them, PF/D exhibited the best diagnostic performance. When a PF/D of 8.7 (sec/mm2) was used as the cut-off value, an accuracy of 78.31 % was obtained with a sensitivity of 81.0 % and a specificity of 77.4 %.

The IVIM model was originally developed by Le Bihan et al. to explain capillary perfusion [10], but the perfusion fraction can be affected by bulk flow, such as ductal flow and glandular secretion [17]. Muller et al. questioned the use of IVIM in evaluating the true perfusion effect on the kidney and suggested that a high diffusion value in the kidney may be associated with a flow-dependent transport mechanism such as renal tubular flow [17]. Currently, there is no IVIM model that distinguishes between the signals from renal vascular flow and renal tubular flow [18]. In our study, we turned this limitation to our advantage. We postulated that the retrograde flow of urine in patients with VUR could affect IVIM parameters, especially the perfusion fraction. Unlike the renal parenchyma, the renal pelvis contains only urine flow, suggesting that the perfusion fraction measured in the renal pelvis represents the urine flow.

Several studies have reported the significance of bulk flow in DWI. In a study of patients who had undergone DWI and salivary gland scintigraphy before and after radiotherapy for head and neck cancer, ADC was significantly lower in the dysfunctional group than in the functional group [19]. Because the ADC value primarily reflected blood and saliva flows within the salivary gland, it decreased as the blood and saliva flows decreased following radiotherapy. Thus, measurement of the ADC value can noninvasively assess salivary gland function [19]. In another study, the ADC value in intraductal papillary mucinous neoplasms (IPMNs) was significantly higher than in mucinous cystic neoplasms (MCNs) [20]. Although both IPMNs and MCNs contain mucinous cystic contents, IPMNs tend to have free flow due to cystic dilatation of the pancreatic duct and communication with the pancreatic duct, while MCNs have restricted flow because they do not communicate with the pancreatic duct [20]. In our study, the ADC value tended to be lower in the “reflux” kidney, although the difference was not significant. The ADC value is affected by the combined effect of diffusion and perfusion factors, as seen in the following equation: ADC value ≒ D + (PF/b), where D is the diffusion coefficient and PF is the perfusion fraction. The increased effect of the perfusion factor in the renal pelvis due to bulk urine flow might explain why the difference in ADC value did not reach significance. Bidirectional bulk flow and infected urine in the “reflux” kidney also might have led to this result [21]. Consequently, in our study, the perfusion-related parameters, especially PF, differed more significantly than the diffusion-related parameters between the “reflux” and the “non-reflux” kidneys. If the PF values had been measured in the functional salivary gland and IPMN, they might have been higher than in the dysfunctional salivary gland and MCN, respectively.
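The relation ADC ≒ D + (PF/b) quoted above can be recovered from Eq. (A) as a sketch under two assumptions that are not stated in the text: the b value is high enough that the exp[-b(D + D*)] term is negligible, and PF is small enough that ln(1 - PF) ≈ -PF. With the ADC defined mono-exponentially from the signals at 0 and b:

$$ \mathrm{ADC}=-\frac{1}{\mathrm{b}} \ln \frac{{\mathrm{S}}_{\mathrm{b}}}{{\mathrm{S}}_0}\approx -\frac{1}{\mathrm{b}} \ln \left[\left(1-\mathrm{P}\mathrm{F}\right) \exp \left(-\mathrm{b}\mathrm{D}\right)\right]=\mathrm{D}-\frac{ \ln \left(1-\mathrm{P}\mathrm{F}\right)}{\mathrm{b}}\approx \mathrm{D}+\frac{\mathrm{P}\mathrm{F}}{\mathrm{b}} $$

Under these assumptions, a larger PF raises the ADC at a fixed D, which would tend to offset a lower D.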

We performed analyses according to the VUR grade; however, there was no significant difference between low and high grade VUR. This may have been due to the small number of kidneys in each grade. Therefore, a prospective study with a larger population is needed in the future.

Previous studies have attempted to evaluate renal parenchymal alterations and renal scars in children with VUR using non-invasive imaging modalities [22-24]. To our knowledge, no prior studies have utilized IVIM to predict VUR. The present study investigated IVIM parameters that could predict VUR. Often diagnosed after UTI, VUR increases the risk of pyelonephritis by facilitating the spread of bacteria from the urinary bladder to the upper urinary tract [1]. Additionally, scar formation resulting from pyelonephritis is a primary cause of hypertension and chronic renal failure [8, 9]. Therefore, early detection of VUR after diagnosis of UTI is critical to prevent subsequent renal scarring. VCUG is a long-established standard diagnostic method to detect VUR, but it involves radiation exposure and invasive catheterization [1, 5]. Similarly, DMSA scintigraphy, widely used to detect renal scarring in patients with UTI, entails radiation exposure. Lee et al. proposed that IVIM DWI had comparable sensitivity and specificity to DMSA scintigraphy for the detection of cortical defects [12]. Further, IVIM parameters measured in cortical defects can also be useful for quantitative evaluation of these lesions. IVIM DWI is both radiation-free and contrast media-free, and thus can be safely used as the initial imaging modality to detect and quantitate cortical defects, rather than a DMSA scan [12]. Therefore, we investigated the feasibility of using IVIM DWI to predict VUR. The results of our study indicated that PF/D was significantly higher in the “reflux” kidney than in the “non-reflux” kidney, with 81.0 % sensitivity and 77.4 % specificity in detecting VUR. Although this result must be validated in a larger population, it provides a new index for detecting VUR.

The present study had several limitations. First, our study was retrospective, and thus there may have been a selection bias. Furthermore, MR urography might not have been performed in the acute phase of UTI in some patients, because it was originally performed to evaluate anomalies of the urinary system in pediatric patients with UTI. However, the interval between MR urography and VCUG is thought to be more important because VCUG was used as the reference standard. In this study, the mean interval between VCUG and MR urography was 2.23 days. Second, only a small number of patients were included. Therefore, a prospective study with a larger number of patients is needed to validate the results of the study. Third, we could not measure ADC and IVIM parameters in a renal pelvis that had no visible hydronephrosis on MR imaging, even when hydronephrosis was apparent on abdominal ultrasonography. However, among the kidneys excluded from the study because of a narrow renal pelvis, there was no “reflux” kidney. Fourth, we did not use a breath-hold technique during IVIM DWI. However, a previous study reported no significant difference between breath-hold and free-breathing techniques [25]. Finally, although we evaluated VUR grade, no significant difference was found among the groups of different VUR grades. This may be attributed to the small number of kidneys in each grade. Therefore, further study with a larger population is needed.

In conclusion, the perfusion fraction (PF) was significantly higher in the renal pelvis of the “reflux” kidney than in the renal pelvis of the “non-reflux” kidney. A new VUR index, PF/D, can predict VUR with relatively high sensitivity. In the future, IVIM DWI, which is both radiation-free and contrast media-free, could serve as an initial imaging modality for the prediction of VUR in children with UTI.