Background

Acute kidney injury (AKI) is a clinical sequela that can occur in both adult and pediatric populations; the worldwide incidence of neonatal AKI varies from 18 to 70%, and it is an important contributor to neonatal morbidity and mortality [1,2,3,4,5,6,7,8,9]. Because data on subgroup analysis by geographic region (e.g. India) are minimal and poorly understood, it is difficult to draw conclusions about specific regional associations that cause neonates to develop AKI [9, 10]. Outcomes commonly associated with AKI include increased morbidity, a longer stay in the neonatal intensive care unit (NICU), and increased mortality across neonates overall [2]. Hence, there is a compelling need to stratify these “at risk neonates”.

Neonate-specific scoring systems quantifying illness severity with mortality risk exist in the forms of the Neonatal Therapeutic Intervention Scoring System (NTISS) [11], Score for Neonatal Acute Physiology (SNAP) [12], Transport Risk Index of Physiologic Stability (TRIPS) [13], Clinical Risk Index for Babies (CRIB) [14], and the Simplified age-weight-sex score (SAWS) [15]. However, use of these scoring systems in low- and/or middle-income countries (LMICs) is limited because they are complex to adopt widely across a hospital system. A recent, large multicentre study by Medvedev et al. developed and validated a score (NMR-2000) to predict in-hospital mortality among neonates with birth weights ≤ 2000 g, suitable for use in low- and middle-income countries, using datasets from the United Kingdom (UK) and The Gambia [16].

Despite the significant morbidity and mortality associated with neonatal AKI, a standardized score model for stratifying the risk of neonatal AKI does not exist [17]. Scoring index systems such as the Renal Angina Index (RAI) utilise the reduction in estimated creatinine clearance, fluid balance and high-risk disease states to predict AKI in children [18,19,20]. A multitude of scoring systems exist, all with a valid scientific basis, but they currently lack widespread acceptance.

In order to make an AKI risk stratification tool for neonates, we performed a multicentre prospective cohort study from India, and a ‘Risk Prediction Scoring’ was created [The STARZ (Sethi, Tibrewal, Agrawal, Raina, waZir) Score] [21]. This scoring model was used to predict the risk of AKI in neonates (n = 763) upon admission to the NICU with a sensitivity, specificity, positive predictive value, negative predictive value, and accuracy of 92.8%, 87.4%, 80.5%, 95.6%, and 89.4%, respectively. Creation of such a score for LMICs would allow over-burdened health care personnel to rapidly identify at-risk neonates. Here, we report validation of the STARZ score on 744 neonates with data collected prospectively from a multicentre database.

Methods

Study design

A multicentre, national, prospective cohort study was conducted in 11 centres across India. Neonates admitted to the level 2–3 NICU centres between September 2019 and August 2020 who fulfilled the inclusion criteria and met none of the exclusion criteria were registered.

Inclusion criteria

All neonates (≤ 28 days) admitted to the NICU with established intravenous (IV) access who received IV fluids for ≥ 48 h for nutrition and/or hydration were eligible.

Exclusion criteria

Participants were excluded if:

  • The neonate died within 48 h of admission

  • The neonate was receiving continuous care in the nursery without needing an IV

  • The neonate was receiving an IV for medicinal purposes or hydration for < 48 h

  • The presence of any lethal chromosomal anomaly was reported, which included trisomy 13, 18 and anencephaly

  • The neonate required congenital heart surgery within the first 7 days of being delivered due to potential concurrent congenital kidney dysfunction

Data collection

Pertinent information regarding demographic details (birth, age, sex, date of NICU admission) as well as maternal-specific factors (age, parity, underlying medical conditions, peri-partum infections or complications) defined by the American College of Obstetricians and Gynaecologists (ACOG) guidelines were initially recorded. Additional neonatal characteristics noted included their mode and site of delivery (whether inborn or outborn), gestational age, birth weight, length and head circumference, data on resuscitation, temperature at admission, cause of admission as well as an in-depth history for those with certain kidney diagnoses (congenital anomalies, previously occurring AKI episodes, and the need for kidney replacement therapy).

An increase in serum creatinine of ≥ 0.3 mg/dL (≥ 26.5 μmol/L) or of ≥ 50% from the previous lowest value, as well as a urinary output of less than 1 ml/kg per hour on postnatal days 2–7, defined neonatal AKI per the KDIGO criteria. Until AKI resolved, serum creatinine levels were recorded daily using the enzymatic method. Significant cardiac disease was defined as persistent pulmonary hypertension of the newborn [PPHN], a hemodynamically significant patent ductus arteriosus [PDA], cardiogenic shock, or other congenital cardiac diseases. The weight, blood pressure, heart rate, fluid intake, fluid output, and basic lab parameters (haemoglobin, blood urea nitrogen, electrolytes, and albumin) as well as the use of nephrotoxic medications, respiratory support, and blood, urine and cerebrospinal fluid cultures were continuously recorded.
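The creatinine and urine output thresholds quoted above can be written as a small check. This is only a minimal sketch, not the study's case-definition code: the function name `meets_aki_definition` and its parameters are hypothetical, it assumes that meeting any one threshold is sufficient (the text does not state how the criteria combine), and it ignores the timing rules (postnatal days 2–7).

```python
def meets_aki_definition(lowest_prior_scr, current_scr, urine_output):
    """Hypothetical helper applying the thresholds quoted in the text.

    lowest_prior_scr, current_scr: serum creatinine in mg/dL.
    urine_output: ml/kg/h.
    Assumption: any single criterion is enough to flag AKI.
    """
    # Round to avoid floating-point noise at the threshold boundaries.
    absolute_rise = round(current_scr - lowest_prior_scr, 3)
    rise_criterion = absolute_rise >= 0.3

    # A 50% increase means reaching 1.5 times the previous lowest value.
    relative_criterion = (
        lowest_prior_scr > 0
        and round(current_scr, 3) >= round(1.5 * lowest_prior_scr, 3)
    )

    urine_criterion = urine_output < 1.0
    return rise_criterion or relative_criterion or urine_criterion


# Hypothetical values: creatinine rose from 0.4 to 0.8 mg/dL, urine output normal.
print(meets_aki_definition(0.4, 0.8, 1.5))
```

The rounding step is a design choice for the sketch: differences such as 0.7 − 0.4 are not exactly 0.3 in binary floating point, so an unrounded comparison could misclassify values that sit on the threshold.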

Data entry points

The previously noted variables were collected daily during the first week of hospitalization. Thereafter, the first value of each following week was recorded (except serum creatinine, for which all values were recorded irrespective of the day of life) until an endpoint was reached.

End points

Data were continuously collected until the neonate was discharged home, transferred out of the NICU, died, reached 120 days of age, or was transferred to a facility that was not a part of the national collaboration.

Statistical analysis

The STARZ scoring model predicts the incidence of AKI among neonates any time within 7 days post-admission in the NICU. It was developed using best-fit multivariable logistic regression with step-wise backward elimination, which identified 10 independent variables [3 continuous and 7 categorical] significantly predicting AKI incidence [21]. Some of these variables are available at the time of admission while others are reported at 12 h post-admission. Therefore, the score can be calculated any time after 12 h of NICU admission. For scoring, each of the significant variables is assigned a score based on the previously described methodology [21], with zero indicating the reference group. The score ranges from 0 to 100 with a cut-off of ≥ 31.5; a higher score indicates a higher probability of AKI occurring within 7 days post-admission (Table 1) [21].

Table 1 STARZ scoring model

An online collaborative database was used to aggregate the data points before being exported to Microsoft Excel for statistical analysis via SPSS version 20. The Kolmogorov–Smirnov test was used to test all variables for normality. Continuous variables were summarised using medians and inter-quartile ranges (IQR, 25th to 75th percentiles) while categorical variables were summarised as percentages and frequencies. The unadjusted relationship of the variables within the two groups was analysed in a univariate manner, using Wilcoxon’s rank sum test for continuous variables and the chi-squared or Fisher exact test for categorical variables. The following analyses are reported in this study: (a) between the validation and derivation cohorts; (b) between the neonates with and without AKI incidence within 7 days post-admission in the NICU in the validation cohort; (c) between the neonates above and below the STARZ model cut-off score in the validation and derivation cohorts; and (d) between the neonates included and excluded in the validation model. The risk of AKI has been reported as relative risk (RR) along with its 95% confidence interval (CI). For the unadjusted analysis, the missing data for the continuous variables were not imputed, due to the relatively larger sample size, while none of the categorical variables had any missing values.
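As an illustration of how a relative risk and its 95% CI can be obtained from a 2 × 2 table, the sketch below uses the log-normal (Katz) approximation. The analysis in this study was run in SPSS, and the exact interval method is not stated, so the choice of method here is an assumption; the function name `relative_risk` and the counts are hypothetical.

```python
import math


def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Return (RR, lower, upper) using the log-normal (Katz) approximation."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed

    # Standard error of ln(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed
        + 1 / events_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 30 of 100 exposed vs. 10 of 100 unexposed neonates develop AKI.
rr, lower, upper = relative_risk(30, 100, 10, 100)
print(f"RR = {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 corresponds to a statistically significant association at the two-sided 5% level.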

For the validation of the scoring model, only neonates with data for all 10 required variables were considered. However, to check the effect of the excluded neonates on the predictive validity, missing values were imputed using median imputation (replacing all occurrences of missing values within a variable with the median of that variable). Each neonate was assigned a STARZ score based on the data for these variables. The STARZ scoring system was validated using statistical predictive measures including sensitivity, specificity, positive predictive value, negative predictive value, accuracy and area under the receiver operating characteristic (ROC) curve. A two-sided p value < 0.05 was considered statistically significant.
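The median imputation described above can be sketched in a few lines. This is an illustrative example rather than the code used in the study (which was performed in SPSS); the helper name `impute_median` and the creatinine values are hypothetical.

```python
from statistics import median


def impute_median(values):
    """Replace every None in `values` with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical serum creatinine values (mg/dl); None marks a missing measurement.
creatinine = [0.6, None, 0.8, 1.2, None, 0.7]
print(impute_median(creatinine))
```

Because every missing entry receives the same central value, this approach pulls imputed neonates toward the middle of the distribution, which is worth keeping in mind when interpreting results for a variable with many missing values.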

Results

A total of 744 neonates were included in this validation study (out of 1267 neonates screened) and 763 in the derivation cohort (out of 1386 neonates screened) [21]. The STARZ model variables such as age at entry in NICU [median (IQR) 20 (5–84) vs. 19 (6–76) hours, p = 0.4]; neonates with < 28 weeks of gestational age [26 (3.5%) vs. 35 (4.6%), p = 0.2]; with use of nephrotoxic drugs [117 (15.7%) vs. 122 (16%); p = 0.8]; furosemide usage [39 (5.2%) vs. 54 (7.1%), p = 0.1]; and serum creatinine [median (IQR) 0.8 (0.6–1) vs. 0.7 (0.5–1.1) mg/dl, p = 0.08] were not significantly different between the validation and derivation cohorts. However, variables such as neonates with < 1000 g birth weight [30 (4%) vs. 52 (6.8%), p = 0.01]; with PPV requirement in the delivery room [119 (16%) vs. 82 (10.7%), p = 0.003]; with significant cardiac disease [178 (23.9%) vs. 330 (43.3%), p < 0.001]; with inotrope(s) usage [222 (29.8%) vs. 330 (43.3%), p < 0.001]; and urine output [median (IQR): 1.2 (1–1.5) vs. 1.6 (1.2–2.2) ml/kg/h; p < 0.001] differed significantly between the validation and derivation cohorts. Also, the duration in NICU [8 (5–15) vs. 10 (6–20) days, p < 0.001]; AKI incidence within 7 days post-admission [249 (33.5%) vs. 187 (24.5%), p < 0.001] and neonates who died in NICU [24 (3.2%) vs. 51 (6.7%), p = 0.002] differed between the validation and derivation cohorts. The corresponding data for other variables are provided in Supplementary Table 1.

Univariate analysis

In the validation cohort, the STARZ model variables such as age at entry in the NICU [median (IQR) 12 (5–67) vs. 30 (5–89) hours; p = 0.01]; serum creatinine [median (IQR) 1.2 (1–1.5) vs. 0.7 (0.6–0.8) mg/dl; p < 0.001]; PPV requirement in the delivery room [RR (95% CI) 1.7 (1.38–2.11)]; gestational age < 28 weeks [RR (95% CI) 1.9 (1.38–2.62)]; sepsis (during the NICU stay) [RR (95% CI) 2.64 (2.06–3.38)]; significant cardiac disease [RR (95% CI) 1.74 (1.43–2.12)]; use of nephrotoxic drugs [RR (95% CI) 1.59 (1.28–1.98)]; furosemide use [RR (95% CI) 1.58 (1.14–2.18)]; and use of inotrope(s) [RR (95% CI) 3.02 (2.49–3.68)] differed significantly among neonates with AKI vs. without AKI. These data are shown in Table 2.

Table 2 Univariate association of different variables with AKI incidence within 7 days post-NICU admission among 744 neonates

Scoring model validation

Of the 10 variables required for the STARZ model, none of the neonates had missing data for the 7 categorical variables and 1 continuous variable (age at entry in NICU), but 2.7% (n = 20) and 19.9% (n = 148) of neonates had missing data for urine output and serum creatinine, respectively. Therefore, the 589 neonates with data for all 10 required variables were included in the model validation, and each was assigned a STARZ score based on these variables. Among these 589 neonates, the STARZ score had a mean of 22.7, a median of 18, and a range of 0–77; the probability of AKI was also estimated from the score. A STARZ score of < 32 indicated a probability of AKI of < 20%, a score of 33–36 of 20–40%, a score of 36–43 of 40–60%, a score of 44–49 of 60–80%, and a score of ≥ 50 of ≥ 80%. The sensitivity of this scoring model was found to be 82.1% [147/179], its specificity 91.7% [376/410], positive predictive value 81.2% [147/181], negative predictive value 92.2% [376/408] and accuracy 88.8% [523/589], as shown in Table 3 alongside the predictive ability of the STARZ model in the derivation cohort. Based on the STARZ cut-off score ≥ 31.5, the area under the ROC curve was observed to be 0.932 (95% CI, 0.910–0.954; p < 0.001), signifying that the discriminative power is high (Fig. 1). Table 4 shows the comparison of different variables based on the STARZ model cut-off score. In the validation cohort, the neonates with a score ≥ 31.5 vs. < 31.5 had a significantly higher incidence of AKI within 7 days [147 (81.2%) vs. 32 (7.8%), p < 0.001]; longer duration of NICU stay [10 (6–21) vs. 6 (4–11), p < 0.001]; and higher mortality [21 (11.6%) vs. 0 (0%), p < 0.001]. Similar findings were observed for the derivation cohort.
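The predictive measures reported above follow directly from the 2 × 2 counts implied by the bracketed fractions (147 true positives, 32 false negatives, 376 true negatives, 34 false positives). The sketch below recomputes them as a check on the arithmetic; the function name `diagnostic_metrics` is hypothetical and this is not the authors' analysis code.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard predictive measures from the cells of a 2 x 2 table."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive predictive value": tp / (tp + fp),
        "negative predictive value": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }


# Counts taken from the bracketed fractions reported for the 589 neonates.
metrics = diagnostic_metrics(tp=147, fp=34, fn=32, tn=376)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```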

Table 3 Predictive ability of STARZ model for AKI incidence within 7 days post-NICU admission in validation and derivation cohort
Fig. 1 Receiver operating characteristic (ROC) curve for the scoring system. Area under the ROC curve = 0.932 (95% CI, 0.910–0.954); p value < 0.001
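For readers less familiar with the area under the ROC curve, it can be read as the probability that a randomly chosen neonate with AKI has a higher score than a randomly chosen neonate without AKI. The sketch below computes it that way by comparing all pairs. It is an illustration of the concept with hypothetical scores, not the SPSS procedure used in the study, and the function name `roc_auc` is an assumption.

```python
def roc_auc(scores_with_aki, scores_without_aki):
    """AUC as the fraction of (AKI, non-AKI) pairs ranked correctly.

    Tied pairs count as half a correct ranking.
    """
    wins = 0.0
    for pos in scores_with_aki:
        for neg in scores_without_aki:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_with_aki) * len(scores_without_aki))


# Hypothetical STARZ scores for three neonates with AKI and three without.
print(roc_auc([40, 50, 35], [10, 20, 36]))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.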

Table 4 Stratification of validation and derivation cohort variables by STARZ model cut-off score

Snapshots of the user-friendly dashboard used for the STARZ scoring system, taking two neonates as examples, are depicted in Fig. 2 (low AKI risk) and Fig. 3 (high AKI risk).

Fig. 2 User-friendly dashboard depicting the output of the scoring system for a neonate at low risk of AKI. ‘^’ First 12 h post-admission in NICU. In a hypothetical example of a neonate without AKI, the values of the different variables were (1) age upon admission to the NICU = 3 h; (2) no PPV in the delivery room; (3) gestational age = 35 weeks; (4) serum creatinine = 0.4 mg/dl; (5) urine output = 1 ml/kg/h; (6) no use of nephrotoxic drugs or furosemide; (7) no use of inotropes; (8) no sepsis; (9) no notable cardiac disease. The STARZ model also predicted a low probability of AKI for this neonate

Fig. 3 User-friendly dashboard depicting the output of the scoring system for a neonate at high risk of AKI. ‘^’ First 12 h post-admission in NICU. In a hypothetical example of a neonate with AKI, the values of the different variables were: age upon admission to the NICU = 30 h; gestational age = 24 weeks; PPV required in the delivery room; sepsis present; noteworthy cardiac disease present (patent ductus arteriosus); urine output = 0.95 ml/kg/h; serum creatinine = 1.36 mg/dl; use of nephrotoxic drugs; use of furosemide; and no use of inotropes. The STARZ model also predicted a high probability of AKI for this neonate

Supplementary Table 2 shows the comparison of neonates who were included (n = 589) vs. excluded (n = 155) in the validation model. Some of the variables differed significantly between these two groups (not shown here). To check whether leaving out the excluded neonates affected the predictive validity, missing values were imputed using median imputation. For all 744 neonates, the sensitivity of the scoring model was found to be 69.9% [174/249], the specificity 87.9% [435/495], positive predictive value 74.4% [174/234], negative predictive value 85.3% [435/510] and accuracy 81.9% [609/744]. Based on the STARZ cut-off score ≥ 31.5, the area under the ROC curve was observed to be 0.857 (95% CI, 0.827–0.887; p < 0.001), signifying that the discriminative power is high. The relatively lower sensitivity is attributable to the higher proportion of AKI among the excluded neonates and to 95% (148/155) of them having missing data for creatinine [the variable with the highest score in the STARZ model]. Therefore, the STARZ model is robust and excluding neonates does not affect its predictive validity.

Length of stay and mortality

Neonates with AKI had a significantly greater risk of mortality than those without [9.6 vs. 0%; p < 0.001]. Similarly, those with AKI had a significantly longer median (IQR) length of stay in the NICU [10 (6–21) vs. 7 (4–14) days; p = 0.001].

Discussion

Within the past 15 years, advances in technology and research have significantly increased knowledge and earlier recognition of some inciting factors of AKI in paediatric populations, along with the long-term impact that it may have in this group of patients. Neonates are known to have highly variable and fluctuating creatinine levels on account of inherited or environmental factors early in life, the reflection of maternal creatinine in the perinatal period, and the non-uniform levels that result from still-maturing glomerular function and systemic metabolism. This variability makes it essential to recognize small disturbances in kidney function as potential early signs of kidney injury. Delayed recognition of kidney injury can have significant deleterious impacts on patient outcomes. The necessity of early recognition and rapid intervention to improve patient outcomes and survival is exemplified by research showing that 73% of neonatal deaths occur within the first 7 days of life [22]. Additionally, the AWAKEN study in 2017 assessed the incidence of AKI in neonates in the NICU and found that the rates varied with gestational age. The stratified incidence of AKI in neonates was 37% for those ≥ 36 weeks, 18% for those between 29 and 36 weeks, and 48% for those < 29 weeks [8]. While some studies have identified biomarkers (cystatin C, neutrophil gelatinase-associated lipocalin (NGAL), etc.) that are present during the early stages of AKI, these biomarkers are not easily obtained or tested in LMICs, where AKI is most prevalent, because the resources to evaluate them are not readily available [23].

The STARZ score was derived from our prior multicentre study analysing risk factors for AKI in neonates admitted to the NICU [21]. Each significant variable was assigned a score based on the model of best fit. Each STARZ score variable (age in hours upon admission into the NICU, gestational age, whether positive pressure ventilation in the delivery room was required, sepsis, cardiac disease, urine output within 12 h post-admission < 1.32 ml/kg/h, use of nephrotoxic drugs, use of furosemide or inotropes, and serum creatinine levels within 12 h post-admission ≥ 0.98 mg/dl) was designed to be simple to extract and use. This model utilises a score of 0 to 100, with a higher overall score indicating a greater risk of AKI within 7 days. In our derivation cohort, this scoring model had a sensitivity, specificity, positive predictive value, negative predictive value, and accuracy of 92.8%, 87.4%, 80.5%, 95.6%, and 89.4%, respectively. We undertook a multicentre, national, prospective cohort study including data from 11 locations across India to validate the STARZ score for predicting risk of AKI in neonates admitted to level 2–3 NICUs. In the validation cohort, the probability of AKI was < 20% up to a score of 32, 20–40% for scores 33–36, 40–60% for scores 36–43, 60–80% for scores 44–49, and ≥ 80% for scores ≥ 50. Neonates with a score ≥ 31.5 had a significantly higher incidence of AKI within 7 days, prolonged duration of NICU stay and increased risk of mortality. These findings were similar to those in the derivation cohort.

The utility of RAI in critically ill children across four cohorts was assessed by Basu et al. in 2014. RAI was found to be very effective in predicting severe AKI on day 3 with a negative predictive value of > 92% [19]. A subsequent single-centre prospective trial on a similar cohort (n = 184) further reinforced RAI’s ability to predict severe AKI on day 3, especially with the addition of urinary NGAL (AUC/ROC 0.80 to 0.97) [24]. Successive single-centre studies and the multicentre AWARE study have reached similar conclusions [25, 26]. We performed a similar validation study to evaluate the use of the STARZ score. In our study, the area under the ROC curve was observed to be 0.932 (95% CI 0.910–0.954) [p < 0.001] in the validation cohort, indicating high discriminative power of our AKI predictive score. The STARZ scoring model had a sensitivity of 82.1% [147/179], specificity of 91.7% [376/410], positive predictive value of 81.2% [147/181], negative predictive value of 92.2% [376/408] and accuracy of 88.8% [523/589].

To our knowledge, no predictive model of neonatal AKI risk comparable to our STARZ score exists for assessing futility. Moreover, the greatest asset of this study is that it encompasses the most expansive data pool that has ever been used to evaluate an AKI risk prediction score in neonates. All parameters of this scoring system can be assessed during admission to the NICU in a resource-limited setting, making it a practical tool to guide clinical decision-making. Using 10 simple and easily available parameters within the first 12 h of NICU admission, neonatologists can rapidly predict the risk of AKI. This may ultimately reduce the mortality of at-risk neonates through early recognition and rapid initiation of evidence-based interventions. Moreover, the risk of AKI in neonates admitted to the NICU can be quantitatively determined by the STARZ score. To directly compare clinical decisions made with or without the use of the STARZ scoring system, additional evaluation is required in low-resource settings using larger data pools with more diverse ethnicities. We believe the STARZ score is a clinical adjunct that will lead to the optimization of AKI biomarker performance in the neonatal population and subsequently improve the lives of neonates around the world.