Introduction

Unilateral ureteropelvic junction obstruction (UPJO) is defined as impeded urine outflow from the renal pelvis to the ureter, which may result in progressive damage to the kidney [1]. With a varying incidence of approximately 1 per 750–2,000, unilateral UPJO represents the most common obstructive uropathy [2]. Despite its epidemiological significance, the management of children with unilateral UPJO remains controversial and practice varies widely [3–5]. The current diagnostic approach with renal imaging cannot reliably determine whether patients have a significant obstruction and are at risk of permanent kidney damage [6–9]. Therefore, it is difficult to time intervention and to track potential deterioration of split renal function [10–13]. The management of unilateral UPJO has undergone a paradigm shift in recent decades [5, 14, 15]. The observation of deteriorating split renal function was taken as an indication for early surgical intervention [16, 17]. Animal models supported the surgical approach; however, the total obstruction produced by ligature of the ureter does not reflect the situation in humans [18–22]. It has become apparent that patients may be treated safely by an approach with close monitoring and serial imaging [23, 24]. Therefore, non-surgical management has become the favoured approach for most asymptomatic patients without diminished split renal function [4, 25, 26]. A recently published systematic review, however, emphasised the need to evaluate the outcome of the non-surgical management of patients with unilateral UPJO without the restriction to randomised controlled trials [27]. Therefore, the purpose of this systematic review was to determine the pooled prevalence of the effects of non-surgical management of these patients for the outcomes split renal function, secondary surgical intervention, hydronephrosis, and drainage pattern.

Materials and methods

All methods used for this systematic review were specified in advance and documented in a protocol available in the International Prospective Register of Systematic Reviews (PROSPERO; http://www.crd.york.ac.uk/PROSPERO) under registration number CRD42016034013. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (http://www.prisma-statement.org/) and items of the Cochrane Handbook for Systematic Reviews of Interventions were used for reporting.

Data sources and search strategies

We developed a search strategy to identify all studies reporting on outcomes of the non-surgical management of children younger than 18 years of age with unilateral UPJO (Table 1). We searched the electronic databases CENTRAL (Cochrane Central Register of Controlled Trials, Issue 12 2015), MEDLINE and EMBASE (from their inception to 21 December 2015) without language restriction. In addition, we searched the following clinical trials registries for ongoing or recently completed trials using keywords (“hydronephrosis,” “obstruction,” “junction,” “ureteric-pelvic,” “ureteropelvic,” “pelvi-ureteric,” “ureteral obstruction,” “pyeloplasty”): www.clinicaltrials.gov; www.controlled-trials.com; www.trialscentral.org; apps.who.int/trialsearch; www.drks.de; www.anzctr.org.au. Conference proceedings listed in Supplementary Material 1 were screened using the same keywords. The reference lists of relevant articles and reviews were also searched for trials and publications.

Table 1 Electronic search strategy used for the identification of potential studies

Study selection

We included all types of studies reporting any of the following outcomes: split renal function, secondary surgical intervention (children initially managed non-surgically, but later requiring surgery owing to worsening of the condition), hydronephrosis, and drainage pattern. Renal scintigraphy (with or without the administration of diuretics) or any other appropriate cross-sectional imaging method, such as magnetic resonance imaging, was required for the diagnosis of unilateral UPJO. For the purposes of this review we consistently use the term “obstruction” if any urinary outflow impairment is detected compared with the contralateral kidney. Therefore, the detection of obstruction does not in itself constitute an indication for surgical intervention. Studies were excluded if one of the following criteria was present: diminished/absent contralateral kidney function, acute clinical symptoms or distinct hydronephrosis leading to immediate surgery, nephrostomy, stenting, or crossing vessels. Case reports or case series including ≤ 3 patients were excluded.

If studies also included data on patients who did not meet the inclusion criteria, or if additional information was needed for decision-making, study authors were contacted to retrieve required data. Eligibility of study inclusion was determined independently by two review authors, and disagreements were resolved by a third author.

Data extraction

Data extraction was done independently by two authors using a standardised data extraction form. If a study reported on separate subpopulations (e.g., with different group characteristics), these were extracted separately. If more than one publication on a given study existed, reports were merged to form an exhaustive, but nonredundant data set with the most recent information available. Information required from the original authors was requested by written correspondence and subsequently included in the review. Any disagreement between two independent reviewers was resolved by a third author.

Quality assessment

The risk of bias assessment at the study level was defined according to the evidence-based medicine criteria and the Cochrane ACROBAT-NRSI (https://sites.google.com/site/riskofbiastool), with adaptations to the objective of this review [28–30]. The assessment was carried out for all the studies included by two independent reviewers using a standard assessment form (Supplementary Material 2) and summarised for each study for the developed categories to ensure comparability and integrative analysis.

Data synthesis and analysis

The main outcomes were measured as prevalences, calculated from the number of individuals and the total size of the population, with accompanying 95% confidence intervals (CI). Outcomes were grouped for the population with or without secondary surgical intervention, and for the combination of both. The given numbers for patients with and without secondary surgical intervention did not have to add up to the total number of patients, as not all studies report on both subpopulations equally. The inconsistency of effects (heterogeneity of variance across studies) was addressed by inspection of forest plots and calculation of Q and I2 for each outcome per group with a random effects model [31, 32]. I2 values of more than 25, 50, and 75% correspond to low, medium, and high levels of heterogeneity. Meta-analysis was performed using the random-effects model, thereby adjusting for a variation of the effect between studies. To include studies with a prevalence of 0.0 and 100.0%, a frequency increment of 0.00001 was added to these values. If pooling was not possible, we provided descriptive results of studies. All analyses were carried out using R 3.1.2 and the additional package meta 4.3.2 [33, 34]. We performed sensitivity analyses to explore the influence of the following factors on pooled effect size:

  1. Large/small studies (defined as ≥ 100 patients or ≤ 20 patients)

  2. Studies not specifying a split renal function > 40% at time of enrolment

  3. Studies defining less than 50% of the population as having a half-time drainage of > 20 min

  4. Studies whose participants do not have a hydronephrosis of SFU grade ≥ 3

  5. Studies that do not report on children < 1 year of age

  6. Studies in which most or all participants were followed up for at least 2 years
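The pooling described under "Data synthesis and analysis" can be illustrated with a minimal sketch. It assumes untransformed proportions and the DerSimonian-Laird estimator of between-study variance; the text does not state which transformation or estimator was used in the R package meta, so this is an illustration rather than a reproduction of the original analysis. The function names are hypothetical, and the handling of 0% and 100% prevalences (shifting the value inwards by the increment) is one possible reading of the increment described above.

```python
import math


def pooled_prevalence(events, totals, increment=0.00001):
    """Random-effects pooling of raw proportions (DerSimonian-Laird, assumed)."""
    props, variances = [], []
    for e, n in zip(events, totals):
        p = e / n
        # Assumption: boundary prevalences are shifted inwards by the
        # increment so that the binomial variance is non-zero.
        if p == 0.0:
            p += increment
        elif p == 1.0:
            p -= increment
        props.append(p)
        variances.append(p * (1 - p) / n)

    # Inverse-variance (fixed-effect) weights and estimate, needed for Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * pi for wi, pi in zip(w, props)) / sum(w)
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, props))
    df = len(props) - 1

    # Between-study variance tau^2 (method of moments), truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I2: percentage of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * pi for wi, pi in zip(w_re, props)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)
    return {"pooled": pooled, "ci": ci, "q": q, "i2": i2, "tau2": tau2}


def heterogeneity_label(i2):
    """Map I2 (in percent) to the thresholds of 25, 50, and 75% given in the text."""
    if i2 > 75:
        return "high"
    if i2 > 50:
        return "medium"
    if i2 > 25:
        return "low"
    return "below the low threshold"


# Made-up counts for three hypothetical studies, for illustration only.
example = pooled_prevalence(events=[5, 10, 30], totals=[50, 50, 60])
```

A sensitivity analysis in the sense of the list above would then amount to calling `pooled_prevalence` again on the subset of studies that satisfies the respective criterion and comparing the two pooled estimates.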

A multi-variable meta-regression was performed to determine the influence of the following predictors on the prevalence of secondary surgery: age at presentation (all participants < 1 year/ other), country of study origin (Anglosphere/not Anglosphere), split renal function at presentation (all patients ≥ 40%/other), drainage pattern at presentation (at least 50% of all patients with a drainage half-time > 20 min/other) and hydronephrosis at presentation (all patients with SFU grade ≥ 3/other). Publication bias was visualised with funnel plots [35].
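The principle of the meta-regression can be sketched in simplified form. The example below fits an inverse-variance weighted regression of the study-level prevalence on a single binary predictor (for example, all participants < 1 year versus other). It is a deliberately reduced, fixed-effect illustration: the analysis described above was multi-variable, and its exact model specification is not given in the text. The function name and all input values are hypothetical.

```python
def weighted_meta_regression(effects, variances, moderator):
    """Inverse-variance weighted least squares with one predictor.

    Returns (intercept, slope). With a binary moderator coded 0/1, the
    intercept is the weighted mean effect of group 0 and the slope is the
    difference between the weighted means of group 1 and group 0.
    """
    w = [1 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, moderator)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, moderator))
    if sxx == 0:
        raise ValueError("moderator does not vary between studies")
    sxy = sum(
        wi * (xi - x_bar) * (yi - y_bar)
        for wi, xi, yi in zip(w, moderator, effects)
    )
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up prevalences for four hypothetical studies, for illustration only.
# moderator = 1 if all participants were < 1 year of age, otherwise 0.
intercept, slope = weighted_meta_regression(
    effects=[0.10, 0.12, 0.30, 0.34],
    variances=[0.001, 0.001, 0.001, 0.001],
    moderator=[1, 1, 0, 0],
)
```

A negative slope in this coding would correspond to a lower secondary surgery prevalence in studies restricted to children < 1 year of age.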

Results

The initial search yielded 53,266 records, none of them from clinical trials registries. After removal of duplicates, 43,080 titles or abstracts were screened and 42,818 were found to be irrelevant to the review. After full-text assessment, another 242 records were excluded with reason (Supplementary Material 3), among them 16 records with missing/unclear information on study population or outcomes for which authors did not provide details upon request, and two records whose study population was later described in more detail [36, 37]. The remaining 20 records [24, 26, 38–55] were included in this review (Fig. 1) and are characterised in Supplementary Material 4.
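The record counts in the preceding paragraph can be checked with simple arithmetic. The short sketch below uses only the numbers stated in the text; the variable names are illustrative, and the intermediate counts (duplicates removed, full texts assessed) are derived values that are not reported explicitly.

```python
# Counts stated in the text.
records_identified = 53_266
screened_after_duplicates = 43_080
irrelevant_on_screening = 42_818
excluded_after_full_text = 242

# Derived counts (not stated explicitly in the text).
duplicates_removed = records_identified - screened_after_duplicates
full_text_assessed = screened_after_duplicates - irrelevant_on_screening
included = full_text_assessed - excluded_after_full_text

# The derived number of included records matches the 20 records reported.
assert included == 20
```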

Fig. 1 Study flow diagram according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [77]

Study characteristics

The studies included were mostly monocentric and initiated between 1980 and 1995, predominantly in the Anglosphere. Most of them were published in the 1990s and all but two were observational, either case series or cohort studies [24, 51]. Many studies had to be excluded owing to reporting of mixed populations (uni- and bilateral UPJO) or missing functional renal imaging.

Across all studies, the selection criteria for non-surgical management were often not reported. Approximately 50% of all children in the studies included showed a renal function ≥ 40% at enrolment. Although the diagnostic methods used were comparable, the interpretation was not consistent, especially regarding the evaluation of the drainage pattern. Only three studies provided a precise definition of obstruction [41, 49, 50]. Follow-up times were either not specified at all or reported inconsistently across studies, as a mean, a median or a range [38, 44, 46–48]. Patients were followed up from 2.5 months to 19.6 years. Only 3 studies reported an observational period of more than 2 years for all patients [26, 45, 55]; an additional 5 studies had follow-up times of at least 1 year for each patient [39, 40, 42, 43, 49].

Quality assessment

We summarised the customised quality assessment as risk of bias judgements for each study and across all studies (Figs. 2, 3). For almost all studies, “measurement of interventions” and “measurement of outcomes” had to be classified as “high risk.” Approximately one third of the studies were biased because of the selection of the reported results. Missing data were apparent in less than 50% of all studies. The two randomised controlled trials included [24, 51] did not show an overall superior quality for the assessed risk of bias categories compared with the observational studies (Fig. 3).

Fig. 2 Risk of bias summary: review authors’ judgements on each risk of bias category presented as percentages across all the studies included (green low risk, yellow unclear risk, red high risk, white not applicable)

Fig. 3 Risk of bias: review authors’ judgements on each risk of bias category for each study included (green low risk, yellow unclear risk, red high risk, white not applicable)

Split renal function deterioration

Split renal function was the most commonly reported outcome of the studies included, usually only in the descriptive manner “number of patients with improvement, stabilisation or deterioration”. Only a few studies reported mean values at enrolment and at the last follow-up for the entire population [39, 45, 48, 51]. The pooled prevalence of the total population with worsened split renal function (Fig. 4) showed significant heterogeneity (p < 0.0001), high variance across studies (I2 = 96.6%), and amounted to 21.0% (95% CI = 4.7–37.4%) for 367 patients. The subgroups with and without secondary surgery (89 and 39 patients respectively) revealed a higher prevalence for the secondary surgical subgroup (28.4%, 95% CI = 0.0–66.7%, I2 = 91.6%) than for the solely non-surgical group (18.9%, 95% CI = 0.0–59.1%, I2 = 98.5%). Studies showed an increased pooled prevalence of 26.7% (95% CI = 3.4–49.9%, 295 patients) for patients with a split renal function of ≥ 40% at enrolment and a pooled prevalence of 23.1% (95% CI = 12.7–33.5%) for all patients < 1 year of age at enrolment, both with similar heterogeneity and inconsistency.

Fig. 4 Pooled prevalence of split renal function deterioration in all children with initial non-surgical management (total), and the subgroups of children with and without secondary surgery. Each study is marked according to the information available for split renal function at enrolment. A all patients ≥ 40%, B additionally patients with < 40% included, C unknown

Secondary surgery

The prevalence of children with a secondary surgical intervention was reported in 18 studies with 1,076 patients and showed statistically significant heterogeneity (p < 0.0001) and true variance between studies (I2 = 97.7%) (Fig. 5). Prevalences ranged from 0% [47] to 73.4% [46], with a pooled prevalence of 27.9% (95% CI = 17.7–38.2%). Even when inclusion criteria for split renal function at enrolment were restricted to patients with values ≥ 40%, the prevalence of patients with secondary surgical intervention was highly variable. The pooled prevalence was 28.3% (95% CI = 18.1–38.5%, 485 patients) with significant heterogeneity (p < 0.0001) and high inconsistency (I2 = 96.3%). Studies exclusively reporting on children < 1 year yielded a lower pooled prevalence of 23.1% (95% CI = 12.7–33.5%, 905 patients); the heterogeneity was significant (p < 0.0001) and the true variance between studies remained high (I2 = 97.5%). Removal of large (≥100 patients) and small (≤20 patients) studies had no effect on the pooled prevalence (29.8%, 95% CI = 12.5–47.1%, 424 patients), with retained significant heterogeneity (p < 0.0001) and considerable inconsistency (I2 = 97.1%). If only studies with follow-up times of more than 2 years for most or all patients were considered (18.6%, 95% CI = 3.5–33.7%, 203 patients), heterogeneity remained significant (p < 0.0001) and variance among studies was high (I2 = 93.4%).

Fig. 5 Pooled prevalence of secondary surgery. Data visualised as described in Fig. 4

We aimed to model the secondary surgery prevalence using meta-regression. Significant portions of variability were explained by the predictors age at presentation (negative correlation with age < 1 year) and study country (higher rates outside of the Anglosphere). Overall, residual variance could be reduced to 73.5% through regression analysis. A stratification of studies on children < 1 year revealed a lower pooled secondary surgery prevalence if all patients presented with a split renal function ≥ 40% (7.1%, 95% CI = 1.2–12.9%, 235 patients), compared with studies that also included patients presenting with a split renal function < 40% (27.2%, 95% CI = 0–54.5%, 463 patients).

The criteria for secondary surgical intervention were usually not precisely defined in the 18 studies available. In all but 2 studies [43, 50] the criterion “deterioration of split renal function” was reported, but only 5 studies [24, 39, 42, 48, 52] specified the deterioration as a decrease of 5 or 10%. The criteria progressive hydronephrosis, drainage pattern, or clinical associated symptoms were used less frequently, and only defined for hydronephrosis in one study [39].

Hydronephrosis deterioration

A progression of hydronephrosis was reported in 4 studies (173 patients), 2 of which used a grading system (Society for Fetal Urology, SFU) [26, 55]. The pooled prevalence of hydronephrosis deterioration (Fig. 6) was not heterogeneous (p = 0.1436) and showed moderate variance across studies (I2 = 44.6%). It amounted to 3.2% (95% CI = 0–8.1%) for a total of 173 patients. The separate analysis of patients with and without secondary surgical intervention showed the same trend, without heterogeneity (p = 0.5647 and p = 0.5104) or inconsistency (I2 = 0%), and a pooled prevalence of 0% (95% CI = 0.0–0.0%) each.

Fig. 6 Pooled prevalence of hydronephrosis deterioration. Each study is marked according to the information available for urinary tract dilatation at enrolment. A all patients with SFU grade 3 or 4, B additionally patients with lower SFU grades, C unknown. Data visualised as described in Fig. 4

Drainage pattern improvement

For the outcome “drainage pattern” we used improvement as the criterion owing to difficulties in the evaluation of an already progressively obstructed drainage pattern. Only 2 studies defined the outcome, either as a change to a non-obstructed state or as a decrease to 15–25 min drainage half-time [45, 50]. The outcome was reported for a total of 112 patients (Fig. 7) and amounted to a pooled prevalence of 82.2% (95% CI = 69.2–95.2%), with significant heterogeneity and moderate inconsistency (I2 = 64.7%). For the 76 children with split renal function ≥ 40% at enrolment, the prevalence was almost identical (83.1%, 95% CI = 64.3–100%, p = 0.1794, I2 = 41.8%). Subgroup analysis showed no variation in this outcome for patients with secondary surgery (n = 55, pooled prevalence: 100%, 95% CI = 100.0–100.0%, p = 1, I2 = 0%), but significant heterogeneity (p = 0.0005) and inconsistency (I2 = 83%) for the solely non-surgical patients (n = 193, pooled prevalence 47.6%, 95% CI = 26.9–68.4%).

Fig. 7 Pooled prevalence of drainage pattern improvement. Each study is marked according to the information available for the drainage pattern at enrolment. A at least 50% of study patients with drainage half-time ≥ 20 min, B additionally patients with shorter drainage half-times, C unknown. Data visualised as described in Fig. 4

Discussion

To the best of our knowledge we present the first systematic review assessing the non-surgical management of unilateral UPJO in children. Previous systematic reviews and meta-analyses have summarised the sonographic outcome of patients with prenatal hydronephrosis [56, 57]. Knowledge concerning the aetiology of hydronephrosis, however, is indispensable for treatment in the daily clinical routine. Our systematic review focuses on patients with unilateral UPJO to determine if there is sufficient evidence supporting the current trend towards non-surgical management and to highlight areas of ambiguity and inconsistency that need further clarification.

The applied literature search was extensive and broad enough to detect any current evidence of unilateral UPJO in children. During study selection, several important aspects in the evaluation of the literature attracted attention. First, the distinction between a dilated upper urinary tract that is obstructed and a dilated but unobstructed system is of fundamental importance [26, 58, 59]. Nevertheless, a considerable number of publications report cohorts of patients as having unilateral UPJO based on the sonographic detection of hydronephrosis, without presenting any evidence of impaired urinary outflow (Supplementary Material 3). Furthermore, the therapy decision for upper urinary tract dilatation is considerably biased depending on whether the obstruction involves one or both kidneys [60, 61]. Yet, a significant number of publications report on a mixed population with uni- and bilateral UPJO (Supplementary Material 3). Renal imaging in bilateral cases of UPJO cannot reliably assess split renal function, and therefore the threshold for clinical decision-making towards surgical intervention is often lower than in patients with unilateral UPJO. Studies limited by these criteria may lead to an over- or underestimation of the effects of non-surgical management in children with unilateral UPJO and, therefore, needed to be excluded [23, 62–70].

We were able to include 1,083 patients. Outcomes were seldom reported as quantitative measures, but rather as qualitative statements. This limits the informative value of the changes seen in split renal function and reveals the need for precise and consistent definitions of improvement and deterioration.

Even though deterioration of split renal function plays a crucial role in clinical decision-making, this outcome was reported for only one third of all study patients and mostly for children with good renal function at enrolment (≥40%). We showed that 21% of patients seem to have a risk for split renal function deterioration during the study period, contrasting the largest study included with only 3.6% [41]. On average, the secondary surgical group comprised a considerable number of children, who had impaired function at the last follow-up and did not regain split renal function values comparable with the measurement at enrolment. This finding challenges the hypothesis that surgical treatment might alleviate the progression of split renal function deterioration [15, 53].

For patient counselling it is necessary to assess the risk for secondary surgical intervention. The pooled prevalence for patients with ≥ 40% split renal function (28.3%) was comparable with previous findings [24, 51]. Interestingly, the pooled prevalence did not change if all children, independently of initial renal function, were included (27.9%). In addition, if only small children (<1 year of age) were taken into account, the rate remained similar at approximately 25%, revealing that older children with clinical symptoms or already progressed renal function deterioration did not lead to bias.

Meta-regression identified a significant influence of two parameters (study country and age at presentation) on the secondary surgery rate, yet did not greatly reduce overall variance among studies. The effect of the study country could be explained by differing practices in the diagnostic and therapeutic management of UPJO. Studies including very young children (<1 year of age) with a split renal function < 40% also showed an increased secondary surgery prevalence compared with those in which all children presented with a split renal function ≥ 40%, supporting the hypothesis that initially decreased renal function is associated with a more severe renal drainage impairment [39, 52, 53].

Although ultrasound is the preferred screening tool for detecting progressive hydronephrosis, this outcome was rarely reported; yet, it was the outcome least affected by heterogeneity, even though different systems of classification were used for its assessment. Only 3.2% of all patients showed an increase in hydronephrosis. This low prevalence seems surprising, as it did not correlate with the much higher prevalence seen for split renal function deterioration and the rate of secondary surgery. This finding raises the question of whether ultrasound screening with a customary grading system is an adequate tool for detecting UPJO progression, or whether it risks underestimating disease progression under non-surgical management.

The drainage pattern was the least often reported outcome (112 patients) and its profound variability in prevalences reflects the challenges associated with the interpretation of this imaging technique. We saw a pooled prevalence of 82.2% for improved drainage (irrespective of split renal function at enrolment), which implies that on average only 17.8% of all patients had an unchanged or increased obstruction during follow-up. This represents a much smaller rate than would have been expected by the deterioration rates seen in split renal function. Surprisingly, the subgroup of non-operated children had an almost 50% chance of drainage improvement without intervention during follow-up.

The studies included in this systematic review show a lack of proper reporting, which may reflect pre-2001 Consolidated Standards of Reporting Trials (CONSORT) conditions (http://www.consort-statement.org/), but can have a large impact on study results. Heterogeneity of meta-analyses was found to be substantial for almost all outcomes, even though the selection of patients in the individual studies was predominantly adequate (Figs. 2, 3). Reasons for heterogeneity are likely found in the observed high risk of bias in the measurement of interventions and outcomes of the studies included. Meta-regression and sensitivity analysis did not alleviate the underlying problem.

The variable definitions of unilateral UPJO and the lack of standardised criteria in the use and assessment of renal imaging hamper not only the interpretation, but also the comparability of results. Incongruity of management protocols may challenge the risk assessment of the non-surgical management of the pre-specified outcomes. Furthermore, the criteria for surgical intervention remained unclear in most of the studies.

In the short-term follow-up, approximately 4 out of 5 children with unilateral UPJO may be treated safely with non-surgical management; yet, most studies had a follow-up period that was too short to evaluate the long-term outcome. For patients with secondary surgical intervention, neither progressive split renal function deterioration nor improvement could be observed.

Conclusion

In summary, the major concerns demonstrated in the review are the variable definitions of unilateral UPJO, the indistinct interpretation of the diagnostic methods applied and the inconsistencies and inaccuracies in the reporting of methods and outcomes [6, 25, 71, 72]. Against this background, the evidence for effective non-surgical management in children with unilateral UPJO needs to be critically assessed, especially given that non-surgical UPJO management entails long-term monitoring throughout childhood, accompanied by serial functional imaging with radiation exposure, anaesthesia and/or analgo-sedation, which could potentially lead to serious adverse effects [73–76]. Currently, recommendations cannot be made in favour of or against the non-surgical treatment of UPJO in children based on the studies available to date. Nevertheless, even though we could not resolve the ongoing controversy, we were able to gain further insight by providing realistic estimates of the outcomes of non-surgical management of unilateral UPJO. The currently favoured non-surgical management is based on data of poor quality, with incongruent decision trees for clinical practice within the studies included. This emphasises the need for a high-quality clinical trial that overcomes the reported limitations of previous studies and provides sufficient data not only for patient risk stratification, but also for long-term outcome. A collaboration between paediatric nephrologists and urologists with common pre-defined therapy criteria is needed for the implementation of such a study.