Introduction

Lupus nephritis (LN) markedly influences disease outcomes in children and adolescents. Indeed, up to 30% of those diagnosed with LN will experience progression to end-stage renal disease within 15 years of diagnosis [1, 2]. Compared with LN in adults, pediatric LN carries a higher risk of requiring renal replacement therapy and has markedly higher mortality rates [3, 4]. A lack of early clinical improvement of LN has been repeatedly proposed as the best predictor for developing chronic kidney disease and end-stage renal disease [5,6,7]. At present, the glomerular filtration rate (GFR), generally estimated by the urine creatinine clearance, and chronic proteinuria are the principal measures of failing kidney function, together with electrolyte abnormalities. However, transient GFR reduction and proteinuria are also common features of active kidney inflammation with LN [8], making a distinction between chronic irreversible and active inflammatory processes difficult in clinical settings. Thus, kidney biopsies remain the gold standard for determining the degree of renal structural destruction present with LN. Interpretation of LN histology is guided by the International Society of Nephrology/Renal Pathology Society classification (ISN/RPS) [9, 10], with the degree of active inflammation and chronic kidney structural changes often quantified using the National Institutes of Health (NIH) Activity Index (NIH-AI) and NIH Chronicity Index (NIH-CI), respectively [11].

In contrast to combinations of noninvasive urine biomarkers that are highly accurate in capturing LN activity and anticipating response of LN to therapy [12,13,14,15,16], there is a paucity of validated biomarkers to estimate the degree of irreversible, chronic kidney damage with LN. Several urine biomarkers have been associated, albeit inconsistently, with LN damage; these include adiponectin [16, 17], ceruloplasmin [18], kidney injury molecule-1 (KIM-1) [19], liver fatty acid binding protein (LFABP) [20], monocyte chemotactic protein-1 (MCP-1) [12, 21,22,23], neutrophil gelatinase associated lipocalin (NGAL) [19, 24, 25], osteopontin [26], transforming growth factor β (TGFβ) [27], transferrin [28], and vitamin-D binding protein (VDBP) [29].

The objectives of this study were to test the utility of 10 proposed urine biomarkers (adiponectin, ceruloplasmin, KIM-1, LFABP, MCP-1, NGAL, osteopontin, TGFβ, transferrin, VDBP) for reflecting chronic kidney damage, as verified by renal biopsy. We further explored whether some of these biomarkers could identify the subgroup of LN patients that will experience renal functional decline by month 12 post biopsy. Considering the variability in histological features associated with LN chronicity, we hypothesized that biomarker combinations would be superior to any single biomarker in noninvasively estimating the degree of LN damage.

Materials and methods

Patients

Patients with childhood-onset systemic lupus erythematosus (cSLE) [30, 31] who required a kidney biopsy as part of standard of care participated in this longitudinal study. Clinical and laboratory information as well as random spot urine samples were collected at the time of kidney biopsy and at regular intervals every 3 months (range 2–5 months) thereafter. The study complied with the Declaration of Helsinki and was approved by the institutional review boards of all of the participating institutions. This work was supported by the National Institutes of Health (U01AR059509 to HIB, P50DK096418 to PD and HIB, P30AR070549, and 5UL1TR001425).

Kidney histology

Kidney biopsy specimens were all interpreted in a blinded fashion by an expert nephropathologist (DW) as per the ISN/RPS classification [9, 10]. Histological findings were rated for the amount of active inflammatory changes using the NIH-AI score (range 0–24; 0 = inactive LN) and permanent degenerative changes of the kidney tissues using the NIH-CI, respectively. The latter index considers glomerulosclerosis, fibrous crescents, tubular atrophy, and interstitial fibrosis, each scored on a 3-point Likert scale (range 0–12; 0 = no LN damage) [11]. Based on NIH-CI scores, we classified patients as having one of the following two levels of LN Damage: “no/minimal LN damage” for NIH-CI scores of ≤ 1 or “substantial LN damage” for NIH-CI scores of ≥ 2.
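The scoring and dichotomization described above can be illustrated with a minimal Python sketch. It assumes that each of the four NIH-CI components is an integer from 0 to 3 (inferred from the stated total range of 0–12); the function and argument names are illustrative and not taken from the study.

```python
# Sketch of the NIH-CI total and the two-level LN damage classification.
# Assumption: each of the four components is scored as an integer 0-3,
# which is inferred from the total range of 0-12 given in the text.

def nih_ci_total(glomerulosclerosis, fibrous_crescents,
                 tubular_atrophy, interstitial_fibrosis):
    """Sum the four chronicity components into the NIH-CI score (0-12)."""
    components = (glomerulosclerosis, fibrous_crescents,
                  tubular_atrophy, interstitial_fibrosis)
    for value in components:
        if value not in (0, 1, 2, 3):
            raise ValueError("each NIH-CI component must be an integer from 0 to 3")
    return sum(components)


def ln_damage_status(nih_ci):
    """Apply the cut-offs from the text: <= 1 vs. >= 2."""
    if not 0 <= nih_ci <= 12:
        raise ValueError("NIH-CI score must be between 0 and 12")
    return "no/minimal LN damage" if nih_ci <= 1 else "substantial LN damage"
```

For example, a biopsy scored 1 for tubular atrophy and 1 for interstitial fibrosis, with no glomerulosclerosis or fibrous crescents, has an NIH-CI of 2 and falls into the substantial LN damage group.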

Traditional noninvasive measures of SLE and LN

At each visit, the systemic lupus erythematosus (SLE) disease activity index (SLEDAI-2K) was completed, with additional calculation of its renal domain score (renal-SLEDAI; range, 0 to 16; 0 = no LN activity) as well as the renal domain score of the Systemic Lupus International Collaborating Clinics/American College of Rheumatology damage index (renal-SDI; range, 0 to 3; 0 = no LN damage) [14, 32]. Laboratory data recorded included serum creatinine, urine sediment, urine protein to creatinine ratio (UPCR) from a random spot urine sample, and the estimated glomerular filtration rate (GFR), using the modified Schwartz formula [33]. Renal functional decline at 12 months was defined as a reduction of the GFR of ≥ 20% from baseline at the time of kidney biopsy [20].
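The definition of renal functional decline can be written as a short Python sketch. The function names and the example GFR values are hypothetical; only the ≥ 20% reduction threshold comes from the text.

```python
# Sketch of the renal functional decline definition: a reduction of the
# estimated GFR of at least 20% from the baseline value at kidney biopsy.

def gfr_percent_change(baseline_gfr, followup_gfr):
    """Percent change in GFR relative to baseline (negative = decline)."""
    if baseline_gfr <= 0:
        raise ValueError("baseline GFR must be positive")
    return 100.0 * (followup_gfr - baseline_gfr) / baseline_gfr


def has_renal_functional_decline(baseline_gfr, month12_gfr, threshold_pct=20.0):
    """True if the GFR fell by at least threshold_pct percent from baseline."""
    return gfr_percent_change(baseline_gfr, month12_gfr) <= -threshold_pct
```

Under this definition, a hypothetical patient whose GFR falls from 120 to 90 (a 25% reduction) would be counted as having renal functional decline, whereas a fall from 100 to 85 (15%) would not.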

Sample collection, handling, and batch assaying

After collection, urine samples were spun immediately and stored in the refrigerator at 4 °C within the first hour of collection, then frozen at −80 °C within 24 h for laboratory testing in batches.

Unless stated otherwise, biomarkers (adiponectin, ceruloplasmin, KIM-1, LFABP, MCP-1, NGAL, osteopontin, TGFβ, transferrin, VDBP) were quantified using commercial ELISA kits as per the manufacturers’ instructions, and a four-parameter logistic curve fit was used to fit the standard curve. The intra-assay and inter-assay variability of these assays is expressed as the percent coefficient of variation [CV intra/inter]. Adiponectin [CV intra/inter, 4.0%/9.9%] was measured using the Quantikine ELISA Human Adiponectin/Acrp30 (R&D Systems, Minneapolis, MN). We quantified ceruloplasmin [CV intra/inter, 4.1%/7.1%] by ELISA (Assaypro, St. Charles, MO); MCP-1 [CV intra/inter, 5.0%/5.9%] by ELISA (R&D Systems, Minneapolis, MN); NGAL [CV intra/inter, 1.0%/9.1%] by ELISA (Human NGAL ELISA; Bioporto, Grusbakken, Denmark); and LFABP [CV intra/inter, 6.1%/10.9%] by ELISA (CMIC Co., Tokyo, Japan). The KIM-1 assay was constructed using commercially available reagents (Duoset DY1750, R&D Systems, Minneapolis, MN) as described previously [34]. We measured VDBP [CV intra/inter, 5.1%/6.2%] by ELISA (R&D Systems, Minneapolis, MN); osteopontin [CV intra/inter, 7.8%/9.0%] with the DuoSet Human EPCR kit (R&D Systems, Minneapolis, MN); and TGFβ [CV intra/inter, 2.6%/8.3%] using ELISA (R&D Systems, Minneapolis, MN) after acid activation. Briefly, 20 μL of 1 N HCl was added to 100 μL of urine sample, mixed by inversion, and incubated at room temperature for 10 min. Next, the acidified sample was neutralized by adding 20 μL of 1.2 N NaOH/0.5 M HEPES, and the assay was then run immediately per the manufacturer’s instructions. Using immunonephelometry (Siemens, BNII, Munich, Germany), we measured transferrin [CV intra/inter, 2.5%/3.4%]. We also determined levels of urine creatinine using an enzymatic creatinine assay [CV intra/inter, 0.65%/4.48%] on a Dimension RXL Clinical Analyzer (Siemens, Munich, Germany).
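To illustrate how a four-parameter logistic (4PL) standard curve is used, the following Python sketch shows one common parameterization of the 4PL function, its inverse for back-calculating a concentration from a measured signal, and the percent coefficient of variation. The parameter names (a, b, c, d) and all numeric values are assumptions for illustration; the fitting step itself, typically done by the plate reader software or by nonlinear least squares, is not shown.

```python
import statistics

# Sketch of a 4PL standard curve with already-fitted parameters:
#   a = signal at zero concentration, d = signal at infinite concentration,
#   c = concentration at the inflection point, b = slope factor.
# All parameter values used with these functions are hypothetical.

def four_pl(x, a, b, c, d):
    """Predicted signal (e.g., optical density) at concentration x."""
    return d + (a - d) / (1.0 + (x / c) ** b)


def inverse_four_pl(y, a, b, c, d):
    """Back-calculate the concentration that produces signal y."""
    if y == d:
        raise ValueError("signal equals the upper asymptote of the curve")
    ratio = (a - d) / (y - d) - 1.0
    if ratio <= 0:
        raise ValueError("signal lies outside the range of the standard curve")
    return c * ratio ** (1.0 / b)


def cv_percent(replicates):
    """Coefficient of variation in percent: 100 * SD / mean."""
    return 100.0 * statistics.stdev(replicates) / statistics.mean(replicates)
```

With hypothetical parameters a = 0.05, b = 1.2, c = 100, and d = 2.5, the signal at the inflection point (x = 100) is the midpoint between the two asymptotes, and `inverse_four_pl(four_pl(50, ...), ...)` recovers a concentration of 50.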

Individual biomarkers were natural log transformed prior to statistical analysis to address skewed value distributions and large variations in the ranges of values. Urine concentrations were reported in ng/mL for LFABP, VDBP, NGAL, ceruloplasmin, osteopontin, and adiponectin; in pg/mL for KIM-1, MCP-1, and TGFβ; and in mg/dL for transferrin. Urine creatinine and urine albumin were reported as mg/dL and mg/L, respectively. Normal ranges for these biomarkers have been established [35].

Statistical analysis

Descriptive statistics were performed, using arithmetic means with standard deviation (SD) or standard errors (SE) for numeric variables and frequencies for categorical variables, respectively. After natural log transformation, the distribution of the 10 urine biomarkers allowed the use of parametric statistics. Analyses pertaining to serum creatinine are not shown, as they were highly correlated with those performed for the GFR. Associations between traditional LN measures (renal-SLEDAI, renal-SDI, NIH-CI, NIH-AI, GFR, UPCR) and individual biomarkers (adiponectin, ceruloplasmin, KIM-1, MCP-1, NGAL, osteopontin, TGFβ, transferrin, LFABP, VDBP) were assessed using Pearson correlation coefficients (r). Correlation analyses were repeated after standardization of biomarker levels for urine albumin and for urine creatinine, respectively. Absolute values of r of < 0.2, 0.2–0.4, 0.41–0.6, and 0.61–0.8 were interpreted as unrelated, weak, moderate, and strong associations, respectively.
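The correlation analysis can be sketched in Python as follows. The helper names are illustrative. Two points are assumptions rather than statements from the text: standardization is shown as a simple ratio of biomarker to urine creatinine (or albumin) concentration, and correlations with an absolute value above 0.8 are given a placeholder label because the text does not define one.

```python
import math

# Sketch: natural log transformation, Pearson correlation, and the
# strength labels used in the text.

def log_transform(values):
    """Natural log of each (strictly positive) biomarker concentration."""
    return [math.log(v) for v in values]


def standardize(biomarker, reference):
    """Assumed form of standardization: per-sample ratio to a reference."""
    return [b / r for b, r in zip(biomarker, reference)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def association_strength(r):
    """Map |r| to the categories defined in the text."""
    size = abs(r)
    if size < 0.2:
        return "unrelated"
    if size <= 0.4:
        return "weak"
    if size <= 0.6:
        return "moderate"
    if size <= 0.8:
        return "strong"
    return "above 0.8 (not labeled in the text)"
```

Under these categories, a correlation of r = 0.33 is labeled weak and r = −0.5 is labeled moderate.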

Only biomarkers (a) whose levels differed with the presence vs. absence of histological features considered in the NIH-CI or (b) that were significantly correlated with NIH-CI scores with |r| ≥ 0.2 in any of the analyses (raw biomarker levels, standardization for urine albumin, standardization for urine creatinine) were included in multivariate logistic regression modeling to predict the LN damage status (“no/minimal LN damage” for NIH-CI scores of ≤ 1, “substantial LN damage” for NIH-CI scores of ≥ 2). Forward and backward selection strategies were used, with/without correction for concurrent LN activity (NIH-AI) or consideration of the GFR at the time of biopsy. The utility of the individual or combined biomarkers to categorize patients by LN damage status was assessed using the area under the receiver operating characteristic (ROC) curve (AUC, range 0–1.0) and the corresponding 95% confidence intervals.

A perfect biomarker will have an AUC of 1.0, while a completely useless test has an AUC of ≤ 0.5. One way of interpreting the AUC of a biomarker is “high accuracy” for values ≥ 0.9, “moderate accuracy” for values of 0.7–0.9, “low accuracy” for values of 0.51–0.7, and “useless” for values ≤ 0.5 [36].
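As an illustration of how an AUC is obtained and interpreted, the Python sketch below computes the AUC as the probability that a randomly chosen patient with substantial LN damage has a higher score than a randomly chosen patient with no/minimal LN damage (ties count one half), which is equivalent to the area under the empirical ROC curve. This is a generic method and not necessarily the procedure used in SAS for this study; the function names are illustrative.

```python
# Sketch: empirical AUC by pairwise comparison, plus the accuracy labels
# given in the text. Scores may be a single biomarker or a combined score.

def roc_auc(scores_positive, scores_negative):
    """AUC = P(score of a positive case > score of a negative case)."""
    if not scores_positive or not scores_negative:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute one half
    return wins / (len(scores_positive) * len(scores_negative))


def auc_accuracy_label(auc):
    """Map an AUC to the interpretation categories in the text."""
    if auc >= 0.9:
        return "high accuracy"
    if auc >= 0.7:
        return "moderate accuracy"
    if auc > 0.5:
        return "low accuracy"
    return "useless"
```

With these labels, an AUC of 0.74 corresponds to moderate accuracy and an AUC of 0.67 to low accuracy.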

Algorithms derived by logistic regression yield a biomarker score, with higher scores reflecting higher odds of having substantial LN damage on kidney biopsy. For the biomarker score (threshold score) on the ROC curve that yielded the best discrimination between the two LN damage statuses, sensitivity, specificity, positive (PPV) and negative predictive values (NPV), positive likelihood ratio (LR+), and negative likelihood ratio (LR–) were calculated. Unlike PPV and NPV, LR+ and LR– estimates are less influenced by the composition of the study population [37]. LR+ is defined as sensitivity/(1 − specificity), and LR– is defined as (1 − sensitivity)/specificity. For example, LR+ values can be interpreted as follows: > 10, large, often conclusive increase in the likelihood of “ruling in” the presence of substantial LN damage; 5–9.9, moderate increase; and 2–4.9, small increase, respectively [38]. LR– can be interpreted accordingly for “ruling out” LN damage. p values < 0.05 from 2-sided testing were considered statistically significant. To predict renal functional decline up to 12 months post baseline, urinary biomarker levels were tested in mixed model analyses as described previously [14]. Excel 2013 (Microsoft, Redmond, WA) and SAS 9.4 (SAS Institute, Cary, NC) programs were used for analysis.
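The diagnostic accuracy measures defined above can be computed from a 2 × 2 table, as in the following Python sketch. The function name and the example counts are hypothetical and are not data from this study.

```python
import math

# Sketch: diagnostic accuracy measures at a given threshold score.
# tp/fn: patients with substantial LN damage at/above vs. below threshold;
# fp/tn: patients with no/minimal LN damage at/above vs. below threshold.

def diagnostic_metrics(tp, fp, fn, tn):
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Guard the likelihood ratios against division by zero.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else math.inf
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else math.inf
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),   # positive predictive value
        "npv": tn / (tn + fn),   # negative predictive value
        "lr_pos": lr_pos,        # sensitivity / (1 - specificity)
        "lr_neg": lr_neg,        # (1 - sensitivity) / specificity
    }


# Hypothetical counts for illustration only.
example = diagnostic_metrics(tp=8, fp=3, fn=2, tn=7)
```

In this hypothetical example, sensitivity is 0.80 and specificity is 0.70, giving an LR+ of about 2.7 (a small increase in the likelihood of substantial LN damage) and an LR– of about 0.29.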

Results

Patient characteristics

Data and samples of 89 cSLE patients were included in this study. Most patients had notable proteinuria at baseline, and a majority had an active urinary sediment and ISN/RPS class IV LN (Table 1). There were no patients with class I or class VI LN. Twelve patients had an NIH-CI score of 0, and three of them also had an NIH-AI score of 0. Abnormally low GFR values at < 90 mL/min/m2 were present in 4 of 54 (7.4%) patients with no/minimal LN damage as compared to 12 of 35 (34.2%) with substantial LN damage (Fisher exact; p = 0.015).

Table 1 Lupus nephritis: baseline patient demographics [N = 89]

Relationship between traditional and histopathologic measures of renal damage

Table 2 summarizes the differences in traditional laboratory, clinical, and histological measures of LN in relation to specific histological changes considered in the NIH-CI and by LN damage status. Levels of KIM-1, MCP-1, NGAL, transferrin, TGFβ, or UPCR did not differ with the presence vs. absence of histological features that are scored in the NIH-CI, nor with LN damage status. None of the biomarkers considered differed with tubular interstitial fibrosis presence vs. absence. Adiponectin and LFABP differed significantly with the presence vs. absence of fibrous crescents (both p < 0.031). The only biomarker whose levels differed significantly with LN damage status was osteopontin (p = 0.01). The GFR was significantly lower in patients with substantial damage as compared to that in patients with no/minimal LN damage (p = 0.0003), although means were still in the normal range for both groups. NIH-AI scores were significantly higher in those with substantial LN damage (p = 0.0071).

Table 2 Features of kidney damage in the cohort (N = 89) as scored in the NIH chronicity index†

When biomarker levels were corrected for urine albumin, only osteopontin remained significantly lower (p = 0.01) in the substantial LN damage group as compared to the no/minimal LN damage group. After standardization for urine creatinine, only KIM-1 (p = 0.02) differed significantly with LN damage status (also see Supplemental Table 1). Further, after correction for kidney histology, biomarker levels did not differ with race or with antihypertensive use (both p > 0.5).

Association between biomarkers, histological, and other LN measures at baseline

The GFR was moderately associated (r = −0.5; p < 0.005) with both histological activity (NIH-AI) and chronicity (NIH-CI), whereas the UPCR was not meaningfully correlated with either (|r| ≤ 0.20; p > 0.05). NIH-AI and NIH-CI scores were weakly associated with each other (r = 0.33; p < 0.01).

As shown in Table 3, irrespective of standardization for urine creatinine or urine albumin, the biomarker most closely and consistently associated with NIH-CI scores was osteopontin (0.27 ≤ |r| ≤ 0.36; 0.05 > p > 0.005). After standardization for urine creatinine, NGAL and KIM-1 remained weakly associated with NIH-CI scores (0.31 ≤ |r| ≤ 0.40; 0.05 > p > 0.005). As expected, many of the biomarkers assayed for this study were moderately associated with each other, regardless of standardization for urine albumin or urine creatinine. Additional details are provided in Supplemental Tables 2a–c.

Table 3 Baseline laboratory measures by lupus nephritis damage status in the cohort (N = 89)

Precision of laboratory measures to predict LN damage status

Based on differential excretion in the urine and association with LN damage (NIH-CI), only adiponectin, ceruloplasmin, KIM-1, LFABP, NGAL, and osteopontin were considered in multivariate analyses. Individually, all of the biomarkers were poor predictors of LN damage status (all AUC ≤ 0.60). As summarized in Table 4, irrespective of consideration of concurrent LN activity (NIH-AI), only the combined consideration of adiponectin and osteopontin yielded moderate accuracy in predicting LN damage status (0.74 ≤ AUC ≤ 0.77). Considering unadjusted urine levels, a biomarker score of at least −0.31 was > 80% sensitive but only 67% specific in predicting LN damage status (LR+ = 2.42, LR– = 0.29) (Fig. 1, Panel A). Based on these results, a patient with a biomarker score of ≥ −0.31 has a risk of substantial LN damage that is about 20 percentage points higher (71.4% vs. 50.8%) than the risk estimated without knowledge of the biomarker score. Conversely, patients with biomarker scores < −0.31 are expected to have a 71% chance of having no/minimal LN damage. Uncorrected adiponectin and osteopontin levels were similar to levels standardized for albuminuria or urine creatinine in predicting LN damage status. Additionally considering the GFR at the time of kidney biopsy yielded only marginally improved accuracy for estimating the LN damage status (AUC = 0.79; see last column of Table 4).
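One way to read the comparison of 71.4% vs. 50.8% is through the odds form of Bayes' rule, in which a pre-test probability is converted to odds, multiplied by the likelihood ratio, and converted back to a probability. The Python sketch below applies this standard calculation to the reported LR+ of 2.42; treating 50.8% as the pre-test probability is an interpretation of the text, and the function name is illustrative.

```python
# Sketch: updating a pre-test probability with a likelihood ratio.
# Assumption: 50.8% is taken as the pre-test probability of substantial
# LN damage and 2.42 as the LR+ at the threshold score, both from the text.

def post_test_probability(pre_test_probability, likelihood_ratio):
    if not 0 < pre_test_probability < 1:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


risk_with_score = post_test_probability(0.508, 2.42)
print(f"{risk_with_score:.3f}")  # prints 0.714
```

The result of about 71.4% is consistent with the post-test risk quoted above for patients whose biomarker score is at or above the threshold.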

Table 4 Prediction of LN damage status considering urine biomarkers and traditional measures of renal function
Fig. 1

Accuracy of biomarkers. Receiver operating characteristic (ROC) curves are shown. a Combination of uncorrected urinary concentrations of osteopontin and adiponectin has good accuracy in predicting lupus nephritis (LN) damage status based on an area under the ROC curve (AUC) of 0.74. The algorithm to calculate the biomarker score is shown as well as the statistically optimal threshold score of −0.31. b Shows the ROC curve of the glomerular filtration rate (GFR) for predicting LN damage status together with the ROC curve shown in Panel A. Both curves are almost identical. AUC, area under the curve; ROC, receiver operating characteristic; GFR, glomerular filtration rate

The UPCR had low accuracy (AUC ± 95% confidence interval (CI), 0.67 ± 12; p > 0.05) and the GFR moderate accuracy (AUC ± 95% CI, 0.72 ± 12; p < 0.01) for predicting LN damage status (Supplemental Fig. 1).

Prediction of renal functional decline at 12 months post kidney biopsy

Among the 89 patients included in the study, 14 patients (16%) experienced a renal functional decline between the time of kidney biopsy and month 12 post kidney biopsy. Generally, patients with renal functional decline had continuously higher biomarker levels compared to those with preserved renal function (Fig. 2). None of the differences in biomarker levels between baseline and 9 months reached statistical significance (p > 0.05) for groups of patients with vs. without renal functional decline. Indeed, only levels of TGFβ (p < 0.05), LFABP, and transferrin (both p < 0.01) differed significantly at month 12 between patients with vs. without renal function decline.

Fig. 2

Individual urinary biomarker-based prediction of renal functional decline. Mean levels (with standard error bars) of unadjusted urine biomarkers (y axis) are shown from the time of biopsy onward for patients with renal functional decline (n = 14; red) and all other patients (blue), at 0 to 3, 3 to 6, 6 to 9, and 9 to 12 month time points (x axis), respectively. Please see Fig. 2 for further details. The symbols “*” and “**” reflect statistically significant differences between the groups at p < 0.05 and p < 0.01, respectively. Biomarker levels starting from visit 1 (baseline, time of biopsy) and then every 3 months up to month 12 (or visit 4) are shown for each of the 10 biomarkers considered in this study. Biomarker levels remain higher for patients with renal functional decline (red) as compared to those with preserved renal function (blue). Although similar at visit 1, levels of TGFβ, transferrin, and LFABP are significantly lower at month 12 (visit 4) in patients with preserved renal function. VDBP, vitamin-D binding protein; LFABP, liver fatty acid-binding protein

Discussion

This is, to the best of our knowledge, the first study that comprehensively investigated chronic kidney disease in children in the context of urine biomarkers. When prospectively testing the value of previously proposed urine biomarkers of LN damage, we found that only the combination of osteopontin and adiponectin yielded good accuracy for predicting LN damage status. Further, higher levels of urine biomarkers that have been previously shown to reflect active renal inflammation were present in patients who experienced renal functional decline within 1 year of kidney biopsy.

Adiponectin and osteopontin are both regulated by the renin-angiotensin-aldosterone system (RAAS), which plays an important role in the initiation and perpetuation of kidney inflammation [39]. Adiponectin is a small protein which is primarily expressed in adipocytes in noninflammatory states. With active LN, this protein becomes strongly expressed throughout the kidney tissue, with clear expression in glomeruli and renal tubules [13,14,15,16]. Murine studies suggest that adiponectin regulates proteinuria and podocyte function [40]. Osteopontin is mainly present in the loop of Henle and distal nephrons in normal kidneys. With renal damage, osteopontin expression is significantly upregulated in all tubule segments and the glomeruli [41]. We confirm that high urinary osteopontin levels are predictors of renal structural damage and proteinuria without [42] and with LN [43]. Notably, osteopontin levels were often lower in patients with substantial damage as compared to that in no/minimal LN damage, even after adjustment for concurrent LN activity (NIH-AI). This may be explained by the fact that LN chronicity is associated with the presence of fewer functioning glomeruli and renal tubules.

Given that both adiponectin and osteopontin are regulated by the RAAS, one might expect their urine levels to be strongly correlated with each other. However, in line with other investigators [39], we found urine levels of these proteins to be at most weakly associated with each other.

Urine concentrations of other biomarkers included in this study were higher in LN patients who experienced renal functional decline than those in patients with preserved kidney function. With the exception of osteopontin, these are all validated biomarkers of LN activity and its response to therapy [12, 13, 15, 16]. Accordingly, none of these biomarkers was closely associated with NIH-CI scores in this study. Together, our results support the concept that cumulative LN activity over time results in LN damage. Thus, rapid control of LN activity seems necessary to minimize kidney damage with LN. This hypothesis is in line with observations in extra-renal inflammation with cSLE [44] and clinical studies of adults with LN [5,6,7].

Traditional measures of LN damage include proteinuria, elevated serum creatinine, and diminished GFR [45]. Of these traditional measures and despite its shortcomings [44], the GFR is a better predictor of renal function among children [33]. However, in this study the GFR of only 1 in 5 patients with substantial LN damage was abnormal, stressing that a normal GFR does not indicate the absence of structural kidney damage in children. Further, we confirm that proteinuria is a rather poor marker of LN chronicity. Our results do not support that adiponectin and osteopontin are superior to the GFR alone in detecting LN damage, nor that these biomarkers are synergistic with the GFR in predicting LN damage. Together, this supports the notion that repeat kidney biopsies are still required to determine the degree of renal damage with LN. Unfortunately, our findings are in line with those of many other investigators that aimed at validating urine biomarkers of kidney damage [46]. Arguably, blood-based biomarkers [47], microRNAs reflecting renal fibrosis [48], and/or advanced kidney imaging strategies [49] might be better suited to capture abnormalities in renal mechanisms due to structural changes with LN chronicity.

Further, diurnal changes in proteinuria with LN have been described [50], and we may have missed associations of biomarkers with LN chronicity because our experiments were based on random urine samples. However, in our prior research on LN activity, use of random urine samples sufficed to estimate LN activity with over 90% accuracy [12].

Our study has a few limitations, including that none of the patients had NIH-CI scores > 6. Nonetheless, our cohort is representative of other pediatric LN cohorts [51, 52], and all biomarker relationships and changes observed in our cohort were very likely due to LN, rather than comorbid conditions affecting the kidney. This is different from studies in adults with LN, who often suffer from other chronic systemic diseases that promote renal scarring, such as hypertension, diabetes, and atherosclerosis. Another limitation of our study may be that follow-up kidney biopsies were unavailable, because these would be expected to provide especially relevant information regarding renal functional decline with LN [53]. Hence, we were unable to verify whether renal functional decline truly represents a change in NIH-CI scores. However, follow-up kidney biopsies for LN are currently not standard practice in pediatrics [54]. Further, follow-up was limited to 12 months, and a longer time horizon might have been more clinically relevant.

In summary, osteopontin and adiponectin in combination are good predictors of histological damage with pediatric LN. Continued renal inflammation as reflected by higher levels of urine biomarkers known to reflect LN activity is a risk factor for renal functional decline in cSLE within 1 year of kidney biopsy.