Introduction

Congenital anomalies of the kidney and urinary tract (CAKUT) collectively are the most common cause of chronic kidney disease (CKD) and kidney failure in children and young adults [1]. The types of malformations are diverse, yet all share the common etiology of a disruption of normal in utero kidney development [2]. These in utero determinants include intrauterine growth restriction, placental insufficiency, prematurity, fetal sex, urinary tract obstruction, genetic mutations, and an altered maternal environment including maternal diabetes, alcohol exposure, and protein restriction [3]. All can influence normal kidney development, resulting in a decrease in final nephron number, which then determines postnatal kidney health. As proposed by others, a decrease in nephron endowment results in single nephron hyperfiltration, progressive scarring and eventual loss of kidney function over time [4, 5].

Unfortunately there is no direct measure of nephron number available in the clinic and therefore it is challenging to predict which children with CAKUT will go on to develop complications of their kidney anomaly [6]. Various surrogate measures of nephron number have been applied to different forms of kidney malformations and associated with long term kidney injury, including radiologic assessments of kidney length, kidney volume and kidney parenchymal endowment [7,8,9,10,11,12,13].

The purpose of this report is to define the prevalence of specific forms of CAKUT in a large single center cohort, to identify specific shared characteristics in these cases which predict clinically significant CKD (eGFR <60 ml/min/1.73 m2) and the composite outcome of proteinuria, hypertension, or CKD, and to demonstrate how these characteristics can be used to develop a prediction model which informs a risk stratified clinical pathway of care.

Methods

Case selection and identification

This was a quality improvement (QI) study conducted to develop a standardized clinical pathway for the management of infants with CAKUT at British Columbia Children’s Hospital. Ethics approval was not sought, as QI initiatives are exempt from Research Ethics Board review. We identified patients from 2000–2016 with CAKUT using our program clinical datasets. Relevant clinical data were extracted from patient files and entered into a QI REDCap data collection tool. The criteria for diagnoses of multicystic dysplastic kidneys (MCDK), unilateral kidney agenesis (UKA), kidney hypoplasia (KH), and posterior urethral valves (PUV) were as previously described [14,15,16]. KH was defined as a unilateral or bilateral kidney length or parenchymal volume <5th percentile for body length or weight, respectively, with or without additional kidney anomalies [17]. Absolute kidney measurements were obtained from the ultrasound reports or, if not reported, from the actual scan. Some cases had more than one of the primary diagnoses. If a PUV case also had KH, MCDK, or UKA, PUV was designated the primary diagnosis; if an MCDK or UKA case also had KH in the solitary functioning kidney, MCDK or UKA was designated the primary diagnosis.

Inclusion criteria were: 1) a diagnosis of MCDK, UKA, KH, or PUV, confirmed using the criteria previously described [14,15,16]; and 2) age 0–18 years. Cases were excluded if their diagnosis could not be confirmed or if they lacked sufficient clinical data.

Clinical characteristics

Based on previous work and published literature [11, 18] we selected clinical characteristics which were potential risk factors for developing the kidney outcomes. These independent variables included the type of malformation (MCDK, UKA, KH, and PUV), pregnancy-associated characteristics (birth weight, gestational age, antenatal diagnosis), genetic influences (identified genetic syndrome, associated non-kidney anomalies), measures of kidney nephron endowment (first eGFR, kidney size at diagnosis), and whether there was additional CAKUT (aCAKUT).

We calculated eGFR using the appropriate Schwartz equation [19,20,21]. First eGFR represents the earliest assessment available from the case records. We excluded creatinine and eGFR data collected within the first 7 days of life to mitigate the confounding influence of maternal creatinine. For PUV cases the baseline serum creatinine and urine protein values used were obtained at 1 year of age or later to allow adequate time for the urinary tract to decompress following valve ablation surgery.
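
As an illustration only, the selection of a first eGFR can be sketched as follows. The sketch assumes the bedside form of the Schwartz equation (eGFR = 0.413 × height in cm / serum creatinine in mg/dL); the equation actually applied to a given case may differ by age group and assay, and the function names, record layout, and the reading of "within the first 7 days" as age ≤ 7 days are hypothetical choices rather than details taken from the study.

```python
from typing import Iterable, Optional, Tuple

BEDSIDE_K = 0.413        # assumed bedside Schwartz constant (creatinine in mg/dL)
UMOL_PER_MG_DL = 88.4    # creatinine unit conversion: 1 mg/dL = 88.4 umol/L


def egfr_bedside_schwartz(height_cm: float, creatinine_umol_l: float) -> float:
    """Estimate GFR (mL/min/1.73 m2) from height and serum creatinine."""
    if height_cm <= 0 or creatinine_umol_l <= 0:
        raise ValueError("height and creatinine must be positive")
    creatinine_mg_dl = creatinine_umol_l / UMOL_PER_MG_DL
    return BEDSIDE_K * height_cm / creatinine_mg_dl


def first_egfr(
    records: Iterable[Tuple[int, float, float]], min_age_days: int = 7
) -> Optional[float]:
    """Return the earliest eGFR measured after the first week of life.

    Each record is (age_days, height_cm, creatinine_umol_l). Records at or
    before `min_age_days` are dropped (assumed reading of the exclusion rule).
    """
    eligible = [r for r in records if r[0] > min_age_days]
    if not eligible:
        return None
    _, height_cm, creatinine = min(eligible, key=lambda r: r[0])
    return egfr_bedside_schwartz(height_cm, creatinine)
```

The PUV-specific rule (baseline values taken at 1 year of age or later) could be expressed in this sketch by calling `first_egfr` with `min_age_days=364`.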

Kidney size at diagnosis was calculated by averaging kidney lengths and dividing by body length then multiplying the ratio by 100 to allow regression analyses. This was determined at the time of the kidney ultrasound and designated KL:BL, a previously validated metric of kidney size shown to predict outcomes [7, 15]. A cut point of 7.9 was determined by optimal sensitivity and specificities in receiver operating characteristic (ROC) curve analysis using CKD as the outcome.
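
The KL:BL calculation described above can be written as a short sketch. The function names are hypothetical, and passing a single length for a solitary functioning kidney is an assumption about how MCDK and UKA cases would be handled.

```python
from typing import Sequence

KL_BL_CUT_POINT = 7.9  # cut point reported in the text (from ROC analysis)


def kl_bl_ratio(kidney_lengths_cm: Sequence[float], body_length_cm: float) -> float:
    """Mean kidney length divided by body length, multiplied by 100."""
    if not kidney_lengths_cm or body_length_cm <= 0:
        raise ValueError("need at least one kidney length and a positive body length")
    mean_length = sum(kidney_lengths_cm) / len(kidney_lengths_cm)
    return 100.0 * mean_length / body_length_cm


def is_small_kidney(ratio: float, cut_point: float = KL_BL_CUT_POINT) -> bool:
    """Flag a KL:BL value below the cut point."""
    return ratio < cut_point
```

For example, kidney lengths of 4.0 cm and 4.4 cm in an infant with a body length of 50 cm give a ratio of about 8.4, which is above the 7.9 cut point.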

Additional CAKUT (aCAKUT) refers to structural anomalies detected by kidney ultrasound or voiding cystourethrogram in the solitary functioning kidney for MCDK and UKA cases, and in one or both kidneys for KH and PUV cases. For cases with MCDK, UKA, and KH this included kidney cysts, increased kidney parenchymal echogenicity, dysplasia, hydroureter or hydronephrosis, vesicoureteral reflux, ureterocele, and duplex collecting system. The same aCAKUT definition was applied to cases of PUV, however, hydroureter, hydronephrosis and vesicoureteral reflux were excluded, as they were considered not independent from bladder outlet obstruction.

Clinical outcomes

The clinical outcomes studied were CKD, and the composite outcome of hypertension, proteinuria, CKD or kidney failure. A detailed description of the definitions and measurement techniques for each outcome has been previously summarized [14]. Briefly, CKD was defined as an eGFR <60 mL/min/1.73 m2 on two consecutive outpatient visits separated by at least 3 months. Kidney failure was defined as starting dialysis or receiving a pre-emptive kidney transplant. Hypertension was defined as a systolic and/or diastolic blood pressure (sBP/dBP) ≥95th percentile for age, sex, and height on two consecutive visits occurring at least 3 months apart or as being on a BP-lowering medication. Proteinuria was defined as a urinalysis with ≥0.3 g/L protein, urine dipstick >1+, urine protein to creatinine ratio (PCR) >25 mg/mmol, or urine albumin to creatinine ratio (ACR) ≥3.4 mg/mmol on two consecutive visits at least 3 months apart. The time to development of the clinical outcome was the time at which it was first recorded.
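
The "two consecutive visits at least 3 months apart" rule can be sketched for the CKD outcome as follows. This is one possible reading, not the study's extraction logic: it treats 3 months as 90 days, requires both consecutive visits to be below the threshold, and reports the date of the first of the two visits as the time the outcome was first recorded. The function name and data layout are hypothetical.

```python
from datetime import date
from typing import List, Optional, Tuple


def first_confirmed_ckd(
    visits: List[Tuple[date, float]],
    threshold: float = 60.0,
    min_gap_days: int = 90,  # assumed equivalent of "3 months"
) -> Optional[date]:
    """Return the date CKD was first recorded, or None if never confirmed.

    `visits` holds (visit_date, egfr) pairs. CKD is confirmed when two
    consecutive visits both show eGFR below `threshold` and are separated
    by at least `min_gap_days`.
    """
    ordered = sorted(visits)
    for (d1, e1), (d2, e2) in zip(ordered, ordered[1:]):
        if e1 < threshold and e2 < threshold and (d2 - d1).days >= min_gap_days:
            return d1
    return None
```

The same pattern would apply to the hypertension and proteinuria definitions by swapping the per-visit test.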

Statistical analyses

All analyses were conducted using SPSS version 28 software (IBM Corp., Armonk NY) with a threshold of p<0.05 for statistical significance. Parametric and non-parametric group data were expressed as mean ± standard deviation (SD) and median with the first (Q1) and third (Q3) quartiles in parentheses, respectively, based on visual inspection and normality testing. Categorical data were expressed as proportions (percent). Between group comparisons were made using Student’s t test, Mann–Whitney U test, and Pearson’s chi square or Fisher’s exact test, as indicated. Binary logistic regression was used to identify clinical variables that were associated with the development of chronic kidney injury outcomes, reported as odds ratios (OR) and 95% confidence intervals (CI). The choice of variables in the model was informed by the strength of their association in the univariate analysis. For binary variables, cut points were determined by optimal sensitivity and specificities in ROC curve analysis. Kaplan–Meier analysis was used to determine outcome-free survival, stratified for the type of kidney malformation. We evaluated the performance of the logistic regression models to predict CKD and the composite outcome by calculating prediction probability and concordance (c) statistic. C-statistic was calculated as the area under the ROC curve using probability scores to predict the development of CKD. The models were cross-validated and tested for overfitting by randomly selecting 80% of the cohort as the training set and 20% of the cohort as the test set and then comparing their prediction accuracy. For the regression analyses, the type of malformation was categorized according to diagnosis and compared against the MCDK cohort.
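
The analyses themselves were run in SPSS; the following is only a plain-Python sketch of two of the steps described above, written to make the definitions concrete. It computes the c-statistic as the proportion of concordant case pairs (equivalent to the area under the ROC curve) and performs a random 80/20 split. The function names and the fixed random seed are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def c_statistic(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Concordance: the chance that a case with the outcome scores higher
    than a case without it, counting ties as half."""
    with_outcome = [s for s, y in zip(scores, outcomes) if y == 1]
    without_outcome = [s for s, y in zip(scores, outcomes) if y == 0]
    if not with_outcome or not without_outcome:
        raise ValueError("both outcome groups must be non-empty")
    concordant = 0.0
    for p in with_outcome:
        for n in without_outcome:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(with_outcome) * len(without_outcome))


def split_train_test(
    cases: Sequence, test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List, List]:
    """Randomly assign cases to a training set and a test set."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

Comparing prediction accuracy between the two sets returned by `split_train_test` is one simple check for overfitting: a model that performs much better on the training set than on the test set is likely overfit.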

Results

Identification of cases

We identified a total of 1017 cases from our program registry with the diagnosis of a kidney malformation. For the purpose of the report, we included only those with MCDK, UKA, KH, and PUV, as these constituted the diagnoses for which we had established clinical datasets. They represented 53% of all cases initially designated as having CAKUT. After deleting duplicate data and data fields that were either incomplete or not shared among the different data files and transforming the data into common data fields, we combined the datasets for a total of 452 unique cases (Figure 1).

Fig. 1

CAKUT case identification. Diagram outlining the cases included in this analysis. Between 2000–2016 we identified 1017 cases with a CAKUT diagnosis. From these, 540 were included according to the inclusion criteria, with 477 cases not included because of their primary diagnosis. Of the cases included, 88 were subsequently excluded because of a lack of sufficient data, duplicate cases, and being unable to confirm the diagnosis, leaving a total of 452 cases for analysis

Clinical characteristics

The combined cohort consisted of 452 cases with a median (Q1, Q3) age at diagnosis of 0.03 (–0.14, 3.28) years, and 37% were female (Table 1). Types of malformation included MCDK in 160/452 (35%), UKA in 70/452 (16%), KH in 139/452 (31%), and PUV in 83/452 (18%). In those with available data, 62/354 (18%) were born preterm or less than 36 weeks’ gestation, and 124/270 (46%) had a birthweight below 3.2 kg. Genetic syndromes were identified in 69/452 (15%), and non-kidney anomalies in 105/452 (23%). The first eGFR was <90 ml/min/1.73 m2 in 159/337 (47%) of the cases, and 166/391 (43%) had a KL:BL <7.9. Associated kidney anomalies (aCAKUT) in addition to the primary kidney malformation diagnosis were present in 168/451 (37%) of the cohort (Table 1).

Table 1 CAKUT cohort characteristics

Clinical outcomes

The mean age at last follow-up for the combined cohort was 7.96 ± 5.41 years. CKD (eGFR <60 ml/min/1.73 m2) occurred in 22% of the cohort at a median age of 5.3 years. Fourteen percent developed proteinuria (median 12.5 years). Nineteen percent developed hypertension (median 5.3 years). Altogether 29% of the cohort reached the composite outcome (median 4.5 years) (Table 2). The outcome-free survival was significantly different among the different CAKUT categories for both CKD and the composite outcomes by Kaplan–Meier analysis (log rank p <0.001) (Figure 2). The mean years (standard error of the mean) to CKD outcome were estimated to be 16.0 (0.2), 15.5 (0.6), 13.2 (0.7), and 9.5 (1.1) for cases of MCDK, UKA, KH, and PUV respectively, and for the composite outcome 15.1 (0.4), 13.2 (0.9), 12.4 (0.6), and 7.6 (0.9) years for MCDK, UKA, KH, and PUV cases respectively.

Table 2 CAKUT cohort outcomes
Fig. 2

Kaplan–Meier survival analysis. A. CKD-free and B. composite outcome-free survival, stratified by CAKUT category including multicystic dysplastic kidneys (MCDK), unilateral kidney agenesis (UKA), kidney hypoplasia (KH), and posterior urethral valves (PUV). Significant differences in the outcomes determined by log-rank (p <0.01). The cases remaining over time for each diagnostic category are included in the tables below

Association of predictors and outcomes

The strength of the association of the clinical attributes with the outcomes was first assessed by univariate regression analysis. The results of all 13 predictors with the selected outcomes are detailed in Supplemental Table 1. Significant associations are included in Table 3. Although the strength of association varied by outcome, the type of malformation, prenatal characteristics such as preterm delivery and antenatal diagnosis, the presence of non-kidney anomalies, first eGFR <90, small kidney size (KL:BL <7.9), and anomalies in addition to the primary diagnosis (aCAKUT) had the strongest associations (Table 3). In particular, a PUV malformation, first eGFR <90, KL:BL <7.9, and aCAKUT were associated with increased odds of CKD or the composite outcome, with odds ratios ranging from approximately 2 to 35. Sex was excluded from this analysis because PUV occurs only in males, which introduces a strong bias when PUV cases are included. In a separate univariate regression analysis without PUV, female sex was not associated with either CKD (OR 0.60, 95% CI 0.32–1.10, p=0.10) or the composite outcome (OR 0.74, 95% CI 0.45–1.22, p=0.24).

Table 3 Univariate regression of independent variables against the target variables

Multivariate binary logistic regression model

To develop a risk-stratified prediction model which would identify clinical variables associated with a worse outcome, including CKD and the composite outcome, we employed binary logistic regression analysis. The type of malformation (PUV), first eGFR <90, and small kidney size (KL:BL <7.9) were identified as significant independent risk factors for CKD in a multivariate model, with adjusted odds ratios varying from 4.16 to 4.74 (Table 4). The model performed similarly for the composite outcome, with adjusted odds ratios varying from 2.04 to 13.31 (Table 5).
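
For readers less familiar with how a binary logistic regression model turns adjusted odds ratios into a prediction probability score, the relationship can be sketched generically. Each model coefficient is the natural logarithm of the adjusted odds ratio, and the score is the logistic function of the intercept plus the coefficients of the risk factors present. The function and argument names are hypothetical, and no intercept or coefficient values from the study's tables are assumed here; the caller must supply them.

```python
import math
from typing import Dict


def predicted_probability(
    intercept: float, odds_ratios: Dict[str, float], present: Dict[str, bool]
) -> float:
    """Prediction probability from a fitted binary logistic model.

    `odds_ratios` maps each binary risk factor to its adjusted odds ratio;
    `present` records which factors apply to the case being scored.
    """
    logit = intercept
    for name, odds_ratio in odds_ratios.items():
        if present.get(name, False):
            logit += math.log(odds_ratio)  # coefficient = ln(adjusted OR)
    return 1.0 / (1.0 + math.exp(-logit))
```

With an intercept of 0 (baseline odds of 1) and a single factor with an odds ratio of 3, the odds become 3 and the probability 0.75; these numbers are purely illustrative.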

Table 4 Multivariate binary logistic regression analysis for CKD outcome
Table 5 Multivariate binary logistic regression analysis for composite outcome

Evaluation of the prediction model

We evaluated the performance of the model to predict CKD and the composite outcome, by calculating prediction accuracy and c-statistic. When the probability cut off was set at 0.5, the model had a 73% and 77% prediction accuracy for CKD and the composite outcome, respectively (i.e. predicted the outcome correctly). By ROC curve analysis the concordance or AUC was 0.82 and 0.80 for the CKD and the composite outcome prediction probability scores, respectively (Figure 3A). When cross validated, both models performed well and were not overfit, with the CKD prediction model having a prediction accuracy of 76% and 79% for the training and test sets, respectively, while the composite outcome prediction model had a prediction accuracy of 76% and 77% for the training and test sets, respectively.
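
Prediction accuracy at a probability cut off, as used above, is simply the proportion of cases whose predicted class matches the observed outcome. A minimal sketch follows; the function name is hypothetical, and treating a score exactly equal to the cut off as a predicted outcome is an assumption.

```python
from typing import Sequence


def prediction_accuracy(
    scores: Sequence[float], outcomes: Sequence[int], cutoff: float = 0.5
) -> float:
    """Fraction of cases classified correctly at the given probability cut off."""
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("scores and outcomes must be non-empty and equal in length")
    predicted = [1 if s >= cutoff else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predicted, outcomes))
    return correct / len(outcomes)
```

Applying the same function to the training set and to the test set gives the pair of accuracies that is compared in the cross-validation step.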

Fig. 3

CAKUT prediction probability scores. A. Using ROC curve analysis we determined the performance of the prediction probability score for CKD derived from our regression modeling. The area under the curve, or c-statistic was 0.82. B. Table showing the coordinates of the ROC curve analysis, with the CKD prediction probability score in the first column and the sensitivity of that score in predicting the outcome in the second column. To achieve a 100% sensitivity, or no likelihood the cases will develop CKD, the score needs to be less than 0.032, indicated by the values above the hashed line. C. Frequency plot demonstrating the range of prediction probability scores for the cohort on the x-axis and the number of cases for each score on the y-axis. With the cut point of 0.032 identified, all the cases to the left of the hashed line are predicted not to develop the outcome. These cases represent approximately 20% of the cohort studied

Implementation into a clinical pathway

We developed the prediction models and risk scores as a first step towards a risk-stratified clinical pathway. The overarching principle of the care algorithm is to identify, at the initial assessment, those CAKUT cases at risk of developing CKD and the composite outcome. Those with no identifiable risk would be discharged from specialty care into the care of their primary care provider. Those at risk would continue to be followed in the specialty clinic with follow-up and testing guided by their prediction probability risk score. As an example, we calculated the CKD prediction probability risk score for each case in our cohort, and using ROC curve analysis, identified 0.032 as the cut point score below which there was no chance of a false prediction of a good outcome (i.e. the sensitivity of the test was 100%) (Figure 3B). When we applied this cut point to our cohort, approximately 20% of the cohort was identified as having no risk of developing CKD, and therefore would not require further follow up in the specialty clinic (Figure 3C). This represented 54 cases. Of these, 29 (54%) had MCDK and 25 (46%) had UKA. None of the KH or PUV cases had prediction scores that would guarantee a 100% sensitivity (i.e. no chance of a false negative, or of incorrectly discharging them from clinic because of no risk for developing CKD). Using this model, the specialist in the clinic could make a "simple" determination of whether the case had no risk without inputting the values into the model formula to generate a risk score. As an example, if the case was an MCDK case, the first eGFR was ≥90, the KL:BL was ≥7.9, and there were no additional kidney anomalies, then their prediction score would be zero. If however they did not fulfill all these criteria (and assuming that all required testing was done) then their risk prediction score would be higher, increasing the likelihood of developing the outcome.
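
The choice of a rule-out cut point with 100% sensitivity can be sketched as follows. Under the assumption that a case is flagged as low risk when its score is strictly below the cut point, the largest cut point with no false negatives is the lowest score observed among cases that developed the outcome. The function names are hypothetical and the numbers in the usage note are invented for illustration.

```python
from typing import Sequence


def rule_out_cut_point(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Largest cut point below which no case developed the outcome."""
    outcome_scores = [s for s, y in zip(scores, outcomes) if y == 1]
    if not outcome_scores:
        raise ValueError("at least one case with the outcome is required")
    return min(outcome_scores)


def sensitivity_at(scores: Sequence[float], outcomes: Sequence[int], cut: float) -> float:
    """Proportion of cases with the outcome whose score is at or above the cut."""
    outcome_scores = [s for s, y in zip(scores, outcomes) if y == 1]
    return sum(s >= cut for s in outcome_scores) / len(outcome_scores)


def low_risk_fraction(scores: Sequence[float], cut: float) -> float:
    """Proportion of all cases whose score falls below the cut point."""
    return sum(s < cut for s in scores) / len(scores)
```

For invented scores of 0.01, 0.02, 0.032, 0.2, and 0.5, with the outcome occurring in the third and fifth cases, the cut point is 0.032, sensitivity at that cut point is 100%, and two of the five cases fall below it. A cut point derived this way depends entirely on the cohort used to derive it, which is one reason external validation matters before it is used to discharge patients.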

Discussion

CAKUT is the most common cause of CKD and kidney failure in children and young adults; however, there is significant heterogeneity in clinical phenotypes and kidney outcomes among the different malformations. This variation is linked to the range of nephron endowment within and among the specific types of CAKUT. In this report we have defined postnatal kidney outcomes in a large contemporaneous cohort of CAKUT cases and identified shared clinical characteristics that are associated with long term kidney injury such as hypertension, proteinuria, and CKD.

Our data highlight disparate outcomes within and among the different forms of CAKUT. The most common forms of CAKUT seen in our clinic included MCDK, UKA, KH, and PUV, similar to reports from other centers [11, 18] and representing over 50% of all cases. Of note, in this report we separated solitary functioning kidneys into the specific diagnoses of MCDK and UKA, as these two entities recently have been shown to have distinct outcomes and clinical risk factors [14, 22]. We also excluded a large category of hydronephrosis without a primary diagnosis or apparent kidney anomalies, recognising that most would have been transient, due to vesicoureteral reflux, partial ureteric obstruction or bladder dysfunction [23].

We examined a number of kidney injury outcomes including hypertension, proteinuria, and CKD. These were common in cases with CAKUT, varying from proteinuria in 14% of cases to CKD in 22% of cases. Importantly, we also demonstrated that these outcomes differed across the types of CAKUT and were not uniformly favorable. For example, while 22% of the cohort developed CKD, the proportion was significantly higher among cases with PUV than among the other forms of CAKUT. This was not surprising, as the progression of PUV cases to CKD and kidney failure has been well described [24,25,26,27]. Notably, however, cases with all forms of CAKUT, including MCDK, developed the outcomes studied. The risk of chronic kidney injury among children with solitary functioning kidneys has been reported and, recently in a large multicenter Dutch cohort, shown to be associated with kidney agenesis, anomalies in the solitary functioning kidney, and high body mass index [11, 14, 22, 28].

We identified a number of important characteristics across the CAKUT cohort that were associated with the outcomes. These included the diagnoses of KH and PUV, which in practice represent the more severe phenotypes of developmental kidney injury and therefore expectedly are associated with the worst outcomes [29]. Preterm delivery has also been shown to be associated with a decrease in nephron endowment due to an interruption in normal nephrogenesis [30], while the finding of non-kidney anomalies serves as a surrogate of underlying and often undiagnosed genetic syndromes presumably also affecting normal kidney development. Although they are coarse estimates of nephron endowment, we found baseline kidney function and kidney size to be strong indicators of postnatal outcome, as has been shown in other studies focused on specific CAKUT categories [14, 15, 25, 26]. In the multivariate model, when adjusted for the effects of the other important characteristics, the diagnosis of PUV, first eGFR <90 and small kidney size were significant independent determinants of CKD and composite outcome in cases with CAKUT.

Our binary logistic regression model successfully identified the clinical characteristics which predict the CKD and composite outcomes and the strength of their roles in that prediction. In the future, the model will be used to develop a risk-stratified clinical pathway to identify those cases likely to develop the complications and require further follow up and potential intervention. Because there is no unifying, standardized and evidence-based clinical pathway for children with CAKUT, there is considerable practice variation within and among programs caring for these children. As a proof of principle, we applied our prediction model to our cohort and identified up to 20% of cases who could have been discharged to primary care after their initial assessment. If applied going forward, this could improve the standardization and efficiency of care, improve outcomes, and reduce costs. Likewise, for those cases at risk of developing the outcome, such as those being cared for in specialty clinics, the prediction probability scores generated from the model could be used to inform follow up. Future strategies to implement the prediction tool and to make it accessible include integrating it into regular clinical workflow by incorporating it into the electronic medical record (EMR). Alternatively, at the point of care, patient data from the clinic encounter can be recorded manually into a spreadsheet in which the risk prediction tool is incorporated.

There are a number of limitations of this report. Although derived from a large cohort of cases, this was a single-center experience and therefore it is unclear whether it represents the same experience in other programs. In addition, this report has the same shortcomings as other retrospective studies, including missing data and an inherent bias of longer follow up of the sicker cases. Another important drawback is the profile of the included cases. Approximately 45% of the original cohort was ineligible for the reasons discussed, including exclusion of some cases on the basis of their diagnosis, such as ectopic kidneys and ureteropelvic junction obstruction. Future work is warranted to determine if these types of CAKUT would share the same outcomes and risk factors. This report is also limited by the relatively short follow-up, leaving the possibility that some cases would have developed the outcomes if they had been followed longer.

Another potential bias is the misclassification of involuted MCDK cases as UKA. While this is an important consideration, its impact on the analyses is likely minimal, and the misclassification is in practice unavoidable. In fact, as described by Aslam and colleagues in a prospective cohort of cases of MCDK diagnosed antenatally [31], only 5% involuted by the time the baby was born. In our contemporaneous cohort of 160 cases of MCDK this would represent about 8 cases. Furthermore, when we looked at our "pure" MCDK group, it was clearly distinguishable from the UKA group in a number of characteristics and outcomes. If we were to exclude misdiagnosed cases from the UKA group, these differences would likely be more pronounced.

Finally, because our overarching aim was to develop a prediction model useful in the early diagnostic period (i.e. before 6 months of age) and across various categories of CAKUT, we could not study the impact of other potentially important longitudinal variables, such as the development of recurrent urinary tract infections and the number and type of urologic surgeries. The optimal management of these other potentially important issues requires the care of a combined multidisciplinary team, as recently discussed [32].

Nonetheless, this report has a number of important strengths. By using a large contemporaneous cohort of different types of CAKUT diagnoses, this report has the advantage of a relatively standardized approach to defining and assessing the clinical risk factors and outcomes. Also, by identifying the predictors of the composite outcome, it marks an important early milestone in our pathway building process, which is to identify which patients can be safely discharged without concern that they might develop CKD, proteinuria, or hypertension. This report also signals the need for a protocolized care pathway and for partnering with engineers of the EMR to ensure that data are gathered, catalogued, and extracted in a manner that enables future predictions and improved modeling. Future work will include externally validating the prediction model in other programs, revising as necessary, and integrating the model into regular day-to-day workflow.