Introduction

Postnatal investigation for antenatal hydronephrosis (AH), including voiding cystourethrography (VCUG) and diuretic renography (DR), is indicated if an abnormal ureter or bladder is detected on fetal or neonatal ultrasonography (USG) [1–3]. The imaging required in infants with isolated hydronephrosis (IH), characterized by renal pelvis dilatation without ureter and bladder abnormalities, is debated because the condition is common but prognostic information is limited [4–7]. It is crucial to differentiate significant hydronephrosis, such as ureteropelvic junction obstruction (UPJO), which could cause renal damage if left untreated, from insignificant hydronephrosis, a self-limited condition.

Several studies showed that the outcomes of AH correlated with its severity, as classified by fetal or neonatal USG findings [8–11]. However, the validity of neonatal USG in IH is unknown. We conducted this study to determine the diagnostic accuracy of two classification systems introduced to standardize neonatal USG results, anteroposterior renal pelvic diameter (APD) measurement and the Society for Fetal Urology (SFU) grading [12–14], for detecting pathology in healthy newborns with IH. Time to resolution and factors predicting resolution of insignificant hydronephrosis were also evaluated.

Patients and methods

We retrospectively reviewed healthy, full-term infants with AH who were referred to a university-based pediatric outpatient center from January 1, 2007 to December 31, 2012 and had at least 12 months of follow-up. Newborns were eligible if they underwent neonatal USG at age 7–30 days and VCUG, together with DR when the neonatal USG showed APD > 10 mm or SFU grade 3–4. Exclusion criteria were a single, cystic, dysplastic, or hyperechogenic kidney; hydroureter; or a dilated, thick-walled bladder on neonatal USG. Only newborns with IH were analyzed.

All kidney, ureter, and bladder USG examinations were performed in the supine position after adequate oral hydration by two radiologists who were aware of the AH but blinded to the VCUG and DR results. Severity of hydronephrosis was determined after voiding, based on SFU grading and on APD measured in the mid-renal transverse plane at the cortical-pelvic margins within the confines of the renal cortex.

VCUG was performed at age 0.5–3 months. Vesicoureteral reflux (VUR) was graded during filling and voiding using the International Reflux Study Committee classification [15]. DR was performed after adequate hydration using technetium-99m mercaptoacetyltriglycine (99mTc-MAG3) at the minimum age of 6–8 weeks. Differential renal function was assessed during the first few minutes. Intravenous furosemide was given, and renogram curves along with the half-time drainage were assessed at 20 min. Post-micturition films were taken at 120 min. In patients with residual contrast, bladder catheterization was performed.

Normal DR was characterized by an early peak at 2–5 min followed by complete emptying, either spontaneously, after furosemide, or after micturition. An obstructive pattern was defined as (1) a renogram that rose continuously over 20 min or appeared as a plateau; or (2) a half-time drainage >30 min, or a half-time drainage >15 min with relative renal function <40 %. Repeated DR was indicated for indeterminate results in the initial DR or for worsening dilatation in the subsequent USG.
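The obstructive-pattern criteria above can be written as a small decision rule. The following Python sketch is illustrative only: the function name and parameter names are hypothetical, the curve shape is assumed to have been judged by the reader beforehand, and indeterminate results (which prompted repeated DR) are not modeled.

```python
def is_obstructive(rising_or_plateau: bool,
                   half_time_min: float,
                   relative_function_pct: float) -> bool:
    """Apply the obstructive-pattern criteria described in the text.

    rising_or_plateau: True if the renogram rose continuously over 20 min
        or appeared as a plateau (criterion 1).
    half_time_min: half-time drainage in minutes.
    relative_function_pct: relative renal function of the kidney in percent.
    """
    # Criterion 1: curve shape alone is sufficient.
    if rising_or_plateau:
        return True
    # Criterion 2a: markedly prolonged drainage.
    if half_time_min > 30:
        return True
    # Criterion 2b: moderately prolonged drainage with reduced function.
    return half_time_min > 15 and relative_function_pct < 40


# Hypothetical kidneys, not data from the study.
print(is_obstructive(False, 12, 48))  # prompt drainage, preserved function
print(is_obstructive(False, 20, 35))  # slow drainage with reduced function
```

Keeping criterion 2b separate from 2a makes explicit that a half-time of 15–30 min counts as obstructive only when relative function is also below 40 %.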

Outcomes at the last visit were divided into significant and insignificant hydronephrosis, using the combined data of sequential USG, VCUG, and DR as the reference standard. Insignificant hydronephrosis was defined as normal VCUG and stable or decreasing APD in sequential USG if the APD was ≤10 mm in the neonatal USG, or normal VCUG and DR if the APD was >10 mm in the neonatal USG. In patients with insignificant hydronephrosis, sequential USG was performed at age 3, 6, 12, 18, and 24 months, and every year thereafter until resolution, defined as APD ≤ 5 mm and SFU grade ≤1 in two consecutive USG. Patients with significant abnormalities in USG, VCUG, or DR were classified as having significant hydronephrosis.
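The resolution criterion (APD ≤ 5 mm and SFU grade ≤1 in two consecutive USG) can be sketched as follows. This is a minimal illustration with hypothetical names and made-up scan values. It assumes, which the text does not specify, that the age at the second of the two consecutive qualifying scans is taken as the time of resolution.

```python
from typing import List, Optional, Tuple

# Each scan is (age_months, apd_mm, sfu_grade); the tuple layout is hypothetical.
Scan = Tuple[float, float, int]


def resolution_age(scans: List[Scan]) -> Optional[float]:
    """Return the age at which resolution is confirmed, or None if never.

    Resolution requires APD <= 5 mm and SFU grade <= 1 on two consecutive
    scans; the age of the second scan is returned (an assumption).
    """
    previous_met = False
    for age, apd, sfu in sorted(scans):
        met = apd <= 5 and sfu <= 1
        if met and previous_met:
            return age
        previous_met = met
    return None


# Hypothetical kidney followed at 3, 6, 12, 18, and 24 months.
example = [(3, 8, 2), (6, 5, 1), (12, 6, 1), (18, 4, 1), (24, 3, 0)]
print(resolution_age(example))
```

In this example the scan at 6 months meets the criterion but the scan at 12 months does not, so resolution is only confirmed by the pair at 18 and 24 months.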

Statistical analysis

Data analysis was conducted using Stata 12 (StataCorp, College Station, TX). p values of <0.05 were considered significant. Descriptive data are reported as mean ± standard deviation (SD) or median (interquartile range, IQR) as appropriate. Data were compared using the Chi-square test and Student’s t test for categorical and continuous variables, respectively. Receiver operating characteristic (ROC) plots were used to determine the optimal cutoffs for the APD and SFU grading. A formal comparison of the number of correctly classified cases, reflecting differences in sensitivity and specificity, was made by comparing area under the ROC curves (AUC). The Kaplan–Meier method was used to estimate time to resolution of physiologic hydronephrosis. Four potential predicting factors for resolution chosen a priori, namely gender, laterality, SFU grade, and APD were evaluated using Cox proportional hazard analysis. Assumptions about the linearity of SFU grading were checked, and the hazard ratios (HR) with 95 % confidence interval (CI) were examined. Adjacent categories were collapsed together into categorical groupings if appropriate when these assumptions were not met. Chi-square test for model fit and test of interaction were performed.
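For readers unfamiliar with the Kaplan–Meier method, the product-limit estimate can be sketched in a few lines of Python. This is not the analysis code used in the study (which was run in Stata); the function name and the toy follow-up data are hypothetical, and confidence intervals are omitted.

```python
from collections import Counter
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Product-limit estimate of the probability of persistent hydronephrosis.

    times: follow-up in months for each kidney.
    events: True if resolution was observed at that time, False if censored.
    Returns (time, probability of still having hydronephrosis) at each
    distinct time where at least one resolution occurred.
    """
    resolved_at = Counter(t for t, e in zip(times, events) if e)
    persistence = 1.0
    curve = []
    for t in sorted(resolved_at):
        # Kidneys still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        persistence *= 1 - resolved_at[t] / at_risk
        curve.append((t, persistence))
    return curve


# Toy data for eight hypothetical kidneys (not study data).
times = [3, 6, 6, 12, 12, 18, 24, 24]
events = [True, True, False, True, True, False, True, False]
for t, p in kaplan_meier(times, events):
    print(f"{t:>2} months: {p:.3f}")
```

The cumulative resolution rate at any time point is one minus the estimated persistence probability.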

Completed checklists of standards for reporting of diagnostic accuracy (the STARD initiative) are shown in Online Resource 1.

Results

Patient characteristics

Of 128 eligible healthy newborns with AH, 26 were excluded due to abnormal neonatal USG including dysplastic kidney (12), hydroureter (10), double collecting system with ureterocele (3), and abnormal urinary bladder (1). Six patients were incompletely investigated or lost to follow-up. Hence, 96 patients with IH (82.3 % male and 17.7 % female; 129 kidneys) were analyzed. The mean gestational age was 37.9 ± 1.1 weeks, and the mean birth weight was 3.18 ± 0.50 kg. The median follow-up time was 29.5 (17.7–49.3) months.

Postnatal radiological results

Neonatal USG was performed at the median age of 8 (7–16) days. Hydronephrosis had resolved in four patients (five kidneys). Sixty patients (62.5 %) had unilateral hydronephrosis (43 left kidneys; 17 right kidneys), and 32 patients (34.7 %) had bilateral hydronephrosis. The mean APD was 12.16 ± 10.44 mm. SFU grades 0, 1, 2, 3, and 4 were observed in 5 (3.9 %), 22 (17.1 %), 35 (27.1 %), 27 (20.9 %), and 40 (31.0 %) kidneys, respectively.

VCUG was performed at the median age of 38 (18–67) days. Abnormal VCUG findings were observed in five patients (5.2 %) or seven kidneys (5.4 %), consistent with VUR grade 1, 2, 3, and 5 in 2, 1, 1, and 1 patients, respectively. Each of these patients received continuous antibiotic prophylaxis. None underwent surgery. The patient with VUR grade 5 was exclusively breastfed and had not regained the birth weight when the initial USG was performed at the age of 7 days. A partially distended bladder secondary to neonatal oliguria could have contributed to the absence of hydroureter in the initial neonatal USG, as the refluxing ureter was observed in subsequent USG.

DR was performed in 69 of 96 patients (71.9 %) at the median age of 59 (42–88) days. UPJO was identified in 31 patients (32.3 %) or 37 kidneys (28.7 %). Pyeloplasty was performed in 20 patients (24 kidneys) at the mean age of 11.7 ± 5.9 months.

Outcomes of IH

Of the 129 kidneys, 85 (65.9 %) had insignificant and 44 (34.1 %) had significant hydronephrosis. The four patients whose hydronephrosis had resolved on neonatal USG had insignificant hydronephrosis. Gender, gestational age, birth weight, and laterality did not differ between the two groups (Table 1). The APD was larger and SFU grade 4 was more common in significant than in insignificant hydronephrosis (p < 0.001).

Table 1 Clinical characteristics of patients with isolated antenatal hydronephrosis

ROC curves were used to evaluate optimal cutoffs for determining sensitivity and specificity in detecting significant hydronephrosis (Fig. 1). The APD and SFU grading yielded AUC (95 % CI) of 0.86 (0.79–0.94) and 0.81 (0.73–0.89), respectively (p = 0.08). For the APD of ≥16 mm, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) with 95 % CI were 70.5 (60.0–83.9), 96.5 (92.5–100), 91.2 (81.6–100), and 86.3 (79.4–93.2) %, respectively, and 87.6 % of cases were correctly classified. For SFU grade 4, sensitivity, specificity, PPV, and NPV (95 % CI) were 65.9 (51.9–79.9), 87.1 (79.9–94.2), 72.5 (58.6–86.3), and 83.1 (75.4–90.9) %, respectively, and 79.8 % of cases were correctly classified. False positive rates for APD ≥ 16 mm and SFU grade 4 were 3.5 and 12.9 %, respectively. False negative rates for APD ≥ 16 mm and SFU grade 4 were 29.5 and 34.1 %, respectively (Table 1).
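The accuracy measures above follow directly from a 2 × 2 table of test result against reference standard. The sketch below shows the arithmetic in Python. The cell counts are not stated in the text; they are back-calculated here from the reported percentages and the totals of 44 significant and 85 insignificant kidneys, and should be read as an assumption. The function name is hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return standard diagnostic accuracy measures as percentages."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
        "accuracy": 100 * (tp + tn) / total,
        "false_positive_rate": 100 * fp / (fp + tn),
        "false_negative_rate": 100 * fn / (fn + tp),
    }


# Assumed counts, inferred from the reported percentages (129 kidneys).
apd_16mm = diagnostic_metrics(tp=31, fp=3, fn=13, tn=82)
sfu_grade4 = diagnostic_metrics(tp=29, fp=11, fn=15, tn=74)

for name, result in [("APD >= 16 mm", apd_16mm), ("SFU grade 4", sfu_grade4)]:
    summary = ", ".join(f"{k} {v:.1f} %" for k, v in result.items())
    print(f"{name}: {summary}")
```

With these assumed counts the computed values round to the reported figures, for example 31/44 = 70.5 % sensitivity and 113/129 = 87.6 % correctly classified for APD ≥ 16 mm.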

Fig. 1 Receiver operating characteristic (ROC) curves for detecting significant hydronephrosis based on APD measurement and SFU grading (APD anteroposterior renal pelvic diameter; AUC area under the ROC curve; SFU Society for Fetal Urology)

Resolution of insignificant hydronephrosis

Over the time of 1,360 months that 85 kidneys with insignificant hydronephrosis contributed to the study, 57 kidneys underwent spontaneous resolution, whereas 28 kidneys had persistent hydronephrosis. A Kaplan–Meier graph illustrating the probability of having hydronephrosis at select time points is shown in Fig. 2. Overall rates (95 % CI) of having hydronephrosis were 76.2 (65.6–83.9), 60.0 (48.5–69.6), and 31.6 (20.7–43.1) %, and resolution occurred in 23.8, 40.0, and 68.4 % at age 6, 12, and 24 months, respectively.

Fig. 2 Kaplan–Meier graph with 95 % CI, including the number at risk at select time points, illustrating probabilities of persistent hydronephrosis up to 36 months (CI confidence interval)

Multivariate analysis revealed that gender (female vs. male), laterality (bilateral vs. unilateral), and SFU grading (0–2 vs. 3–4) were not associated with resolution, with HR (95 % CI) of 1.30 (0.68–2.50), 0.93 (0.53–1.62), and 1.09 (0.53–2.23), respectively. By contrast, the APD was significantly associated with resolution (HR 0.83; 95 % CI 0.74–0.92, p = 0.001). The likelihood ratio test showed acceptable fit of the model (p < 0.001). We found no evidence of effect modification among the covariates.
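To make the hazard ratio for APD concrete, the sketch below scales it to other differences in APD. It assumes that the reported HR of 0.83 refers to a 1-mm increase in APD entered as a continuous covariate, so that under the Cox model the HR for a difference of d mm is 0.83 raised to the power d. The function name is hypothetical.

```python
def hazard_ratio_for_difference(hr_per_mm: float, delta_mm: float) -> float:
    """Scale a per-millimetre hazard ratio to a difference of delta_mm.

    Relies on the log-linear form of the Cox model:
    HR(delta) = exp(beta * delta) = hr_per_mm ** delta.
    """
    return hr_per_mm ** delta_mm


HR_PER_MM = 0.83  # reported HR for resolution, assumed per 1-mm increase in APD

for delta in (1, 5, -1):
    hr = hazard_ratio_for_difference(HR_PER_MM, delta)
    print(f"APD difference {delta:+d} mm: HR for resolution {hr:.2f}")
```

Under this assumption, a kidney with an APD 5 mm larger has roughly 0.39 times the hazard of resolution, while a kidney with an APD 1 mm smaller has roughly 1.20 times the hazard.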

Discussion

Neonatal USG, an initial diagnostic modality of choice for AH, provides better anatomic resolution due to fewer interposed tissues when compared to fetal USG. In this study, neonatal USG correctly identified dysplastic or duplex kidney in 12 % of patients diagnosed with AH by fetal USG. We focused on healthy, full-term infants having AH without abnormal ureter and bladder. UPJO was the most common pathologic finding. The optimal cutoffs of neonatal USG parameters were evaluated. The APD and SFU grading were good indicators for detecting significant hydronephrosis with similar overall diagnostic value based on the AUC.

Diagnostic performance of USG is influenced by study population and cutoff value [8, 16–19]. A large series of AH with long-term follow-up from the Great Ormond Street experience, reported by Dhillon [19], demonstrated that APD < 12 mm with no calyceal involvement posed no risk of surgery, whereas APD > 50 mm inevitably warranted surgery. Our study showed that APD ≥ 16 mm and SFU grade 4 yielded excellent specificity and PPV; thus, neonatal USG could be used as a diagnostic tool to detect pathology in IH, mainly UPJO. Additional postnatal imaging should be recommended for APD ≥ 16 mm or SFU grade 4. The modest sensitivity and NPV limited the accuracy of neonatal USG as a screening tool, and postnatal investigation should be considered on a case-by-case basis for APD < 16 mm and SFU grade <4. This recommendation could prevent a misdiagnosis of significant pathology and reduce the use of unneeded postnatal imaging, with its potentially harmful effects from radiation exposure and urinary catheterization.

The majority of patients with IH had insignificant hydronephrosis. Until spontaneous resolution occurs, the potential morbidity associated with hydronephrosis causes substantial parental concern. Comprehensive prognostic information could reduce ongoing parental distress and prevent repeated unnecessary imaging. Resolution was inversely correlated with the severity of hydronephrosis, and details of resolution in mild cases were previously described [5, 6]. Longpre et al. [20] showed that the resolution rate of AH was determined by the APD and SFU grading, which provides an excellent prognostic overview. Nevertheless, that information is only partially relevant to patients with insignificant hydronephrosis, as the prognosis of AH varies with the underlying pathology. Our study is the first to report the prognostic data on insignificant hydronephrosis. However, the long-term prognosis of insignificant hydronephrosis requires further elucidation, as spontaneous resolution occurred within 2 years in the majority of these patients and only one-third of them had a follow-up of more than 18 months. The APD was the only independent factor predicting resolution: each 1-mm increase in APD was associated with a 17 % lower chance of resolution (HR 0.83).

Selection bias toward the diagnosis of insignificant hydronephrosis as opposed to VUR was avoided as VCUG was performed in every patient. As patients with abnormalities of the ureter and bladder in neonatal USG were excluded and the incidence of VUR in IH was low, routine VCUG may yield limited benefit in IH. Moreover, a high spontaneous resolution and a low likelihood of urinary tract infection were noted in VUR associated with AH [21, 22]. Nevertheless, prompt recognition and management of urinary tract infection should be emphasized during parental counseling for IH.

There were a few limitations to this study. First, the diagnosis of insignificant hydronephrosis depended on the results of DR, whose interpretation is subjective. However, the serial imaging protocol and length of observation allowed for the diagnosis of UPJO in subsequent imaging and could minimize misclassification. Second, USG interpretation is operator dependent, and inter-rater variability was not evaluated in this study. Nevertheless, APD measurement is objective and could improve the comparison of USG results between different studies [23]. Third, there was a trend toward superior performance of APD measurement compared with SFU grading, although the difference did not reach statistical significance. A larger sample size may be required to detect a difference in diagnostic accuracy between the APD and SFU grading.

In conclusion, APD measurement and SFU grading in the USG performed at the age of 7–30 days were useful diagnostic tools for detecting pathology in IH. Prospective large-scale studies are required to validate our findings before the threshold for treatment decisions is recommended in clinical practice. Nevertheless, postnatal management and parental counseling for patients with IH could be individualized based upon neonatal USG parameters. Selective postnatal diagnostic procedures could reduce unnecessary investigation and utilization of limited healthcare resources. Our findings are applicable to healthy infants without ureter and bladder abnormalities, as neonates with prematurity, oligohydramnios, renal failure, or extrarenal structural anomalies were excluded.