Introduction

Deterioration of long-term kidney function after transplantation remains a problem [1].

Several studies have shown that the graft may show histological abnormalities at the time of transplantation [2,3,4]. Cohort studies of baseline biopsies with very long-term follow-up are sparse, as is the application of stereology-based, design-unbiased methods for morphometric assessment of kidney grafts [5, 6].

The hypothesis of this study is that structural deviations in implantation kidney biopsies have prognostic impact on long-term graft function. As these intrinsic graft factors seem difficult to categorize as immunological or non-immunological factors, we use the term “structural parameters.”

Formerly, the terms “chronic rejection” and “chronic allograft nephropathy” (CAN) have been used as both a clinical and histological diagnosis [7, 8]. The term CAN was a part of the early Banff classification systems [9, 10], but was later discarded and replaced by the term IF/TA (interstitial fibrosis/tubular atrophy) [11]. Regardless of the terminology used, the histological changes have been reported in a large proportion of transplanted kidneys [12,13,14,15]. Structural factors have been shown to be associated with the kidney function in native kidney diseases [16, 17] and in transplanted kidneys [18,19,20,21,22,23,24,25,26], and are still a focus point [27,28,29].

In the present study, baseline biopsies were scored according to the Banff 97 classification [10]. Furthermore, by light microscopy and an unbiased technique, cortical interstitium was estimated by test points. The frequency of sclerosed glomerular profiles was counted, and glomerular volume was obtained by the Cavalieri estimator; stereological methods and electron microscopy were used for estimation of glomerular capillary basement membrane thickness and mesangial volume [30]. Finally, the associations between baseline structural parameters and graft failure were evaluated by Cox regression and illustrated by Kaplan–Meier plots. We focus on graft failure as an end point because of its significance and also evaluate the association between baseline factors and kidney function.

Materials and methods

The study is a prospective long-term continuation of a short-term study [31].

Clinical data

Baseline kidney biopsies were obtained prospectively from 54 consecutive patients receiving a kidney transplant from September 1997 to November 1998 in the Kidney Transplantation Center in Aarhus, Denmark. Baseline donor and recipient data were registered at the time of transplantation.

Recipient follow-up

The recipients were prospectively followed up with clinical control every 3 months during the first year, then once a year or on clinical indication. Standard clinical assessments were performed. During the first 3 years, clinical data were noted in a special chart in the patient file. Then, annual clinical data were obtained from the files and from the Scandinavian transplantation cooperative Scandiatransplant. The date of the cessation of renal function was registered together with the clinical cause. Latest follow-up was December 31, 2012 according to the Health Research Ethics permission. Four clinical endpoint groups were defined: recipient alive with kidney function, recipient alive without kidney function due to re-transplantation or return to dialysis, recipient died with kidney function, and recipient died without kidney function [32].

Kidney biopsies

Baseline biopsies

Before wound closure, two baseline biopsies were obtained with an 18-G needle. The biopsy core length was 10 mm. The biopsies were performed at an angle of approximately 30° to the kidney surface to obtain primarily cortical tissue. One of the biopsies was fixed in 4% buffered formaldehyde, paraffin-embedded, serially sectioned at 4 μm, and stained by standard procedures [10]. The other biopsy provided tissue for plastic embedding; it was fixed in 2% glutaraldehyde, divided into blocks of approximately 1 mm³ and Epon-embedded, and a specialized sectioning procedure was applied (see Online Resource page 2). This provided material for measurement of glomerular volume and for electron microscopical evaluation.

Banff scores

The paraffin-embedded biopsy was scored by an experienced renal pathologist (NM) according to the Banff 97 classification [10]. The scoring was anonymized. The scores for arteriolar hyaline thickening “ah,” interstitial fibrosis “ci,” tubular atrophy “ct,” allograft glomerulopathy “cg,” and arterial fibrous intimal thickening “cv” were the study parameters. Figure 1 presents an overview of the applied methods.

Morphometry

Quantification was performed at baseline blinded to clinical information, and by a stereology-based method when applicable. The techniques used have been presented [31] and are partly accessible in Online Resources. Biopsies were anonymized by numbering. The objective quantification was performed to lower bias based on subjectivity, and the counting was performed by one person to minimize observer variability. The study parameters were defined to fulfill method-dependent inclusion criteria.

Measurements by light microscopy

The measurements of the sections were performed with computer-assisted light microscopy (Grid®, Zeiss A/S, Denmark). Live video images of the field of vision in the microscope were transmitted to a computer screen. For details, see Online Resource page 2.

Interstitial volume fraction

A stereology-based point counting technique [33] was applied to estimate the volume fraction of the interstitium per glomerular cortex, VvC(interstitium/cortex). All biopsies with cortical tissue were included. The section with the largest biopsy area stained with periodic acid-Schiff (PAS) was used for the quantification of interstitial tissue. In five patients, the measurements were repeated on two consecutive days.
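The point counting estimate described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the function name and the counts are hypothetical, and the sketch assumes the usual stereological convention of dividing the summed test points hitting the phase of interest by the summed test points hitting the reference space, across all fields of vision.

```python
def volume_fraction(points_on_phase, points_on_reference):
    """Estimate Vv(phase/reference) from per-field test point counts.

    Uses the ratio of sums over all fields, not the mean of per-field ratios.
    """
    if len(points_on_phase) != len(points_on_reference):
        raise ValueError("one count per field is required in both lists")
    total_reference = sum(points_on_reference)
    if total_reference == 0:
        raise ValueError("no test points hit the reference space")
    return sum(points_on_phase) / total_reference


# Hypothetical counts from four fields of vision (not study data).
interstitium_hits = [5, 7, 6, 4]
cortex_hits = [24, 25, 23, 20]

vv_interstitium = volume_fraction(interstitium_hits, cortex_hits)
print(f"VvC(interstitium/cortex) = {vv_interstitium:.2f}")
```

The result is a dimensionless fraction between 0 and 1, the same scale on which the interstitial measurements are reported in this paper.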

Fraction of occluded glomeruli

The frequency of glomerular occlusion, also called glomerulosclerosis, was expressed as the proportion of corpuscle profiles that were totally occluded. The inclusion criterion was a minimum of 14 glomerular profiles present.

Glomerular volume

The volume of individual glomeruli was obtained by the Cavalieri estimator [30, 34, 35] on systematically sampled 1-μm plastic sections obtained by exhaustive sectioning (Online Resource page 3). Biopsies containing at least seven glomeruli were included.
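As an illustration of the Cavalieri principle, the volume of one glomerulus can be estimated as the distance between sampled sections multiplied by the summed profile area, where each profile area is itself estimated by point counting. The sketch below assumes this standard form of the estimator; the section spacing, the area associated with each test point, and the counts are hypothetical values and are not taken from the study.

```python
def cavalieri_volume(point_counts, section_spacing_um, area_per_point_um2):
    """Cavalieri volume estimate in cubic micrometres.

    point_counts: test points hitting the glomerular profile on each sampled section
    section_spacing_um: distance between consecutive sampled sections (um)
    area_per_point_um2: tissue area represented by one test point (um^2)
    """
    if section_spacing_um <= 0 or area_per_point_um2 <= 0:
        raise ValueError("spacing and area per point must be positive")
    return section_spacing_um * area_per_point_um2 * sum(point_counts)


# Hypothetical example: every 10th 1-um section is sampled, so the spacing is 10 um.
counts = [3, 9, 14, 16, 13, 8, 2]
volume_um3 = cavalieri_volume(counts, section_spacing_um=10.0, area_per_point_um2=4000.0)
print(f"Glomerular volume = {volume_um3 / 1e6:.1f} x 10^6 um^3")
```

Because the estimate depends only on the section spacing and the point counts, it makes no assumption about glomerular shape.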

Measurements by electron microscopy

Digital images were obtained with a Philips CM10 electron microscope equipped with the computer software SIS (AnalySIS 3.0, Soft Imaging System). For details, see Online Resource page 4.

Mesangium

The volume fraction of mesangium per glomerular tuft, VvG(mesangium/glomerular tuft), was estimated by a stereology-based point counting technique. Images were obtained at a low magnification of × 1450 with three glomeruli assessed at two levels.

Basement membrane thickness

The glomerular capillary basement membrane thickness (BMT) was estimated using the orthogonal intercept method [35] on images obtained at a magnification of × 5800, with three glomeruli assessed at two levels each. For calculation of the stereology-based estimates, a computer program was used (Dimac, Digital Image Company, CHAMP).
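The arithmetic behind the orthogonal intercept method can be sketched as below. The source does not state the formula, so this is an assumption: the sketch uses the form commonly given for the method, in which the harmonic mean of the measured intercepts is multiplied by 8/(3π) and divided by the final magnification. The intercept lengths and the magnification in the example are hypothetical.

```python
import math


def harmonic_mean_thickness_nm(intercepts_mm, final_magnification):
    """Harmonic mean membrane thickness (nm) from orthogonal intercepts.

    intercepts_mm: intercept lengths measured on the image, in millimetres
    final_magnification: magnification of the measured image (assumed known)
    """
    if not intercepts_mm or any(length <= 0 for length in intercepts_mm):
        raise ValueError("intercepts must be a non-empty list of positive lengths")
    harmonic_mean_mm = len(intercepts_mm) / sum(1.0 / length for length in intercepts_mm)
    # 8/(3*pi) corrects for oblique sectioning of the membrane;
    # 1e6 converts millimetres to nanometres.
    return (8.0 / (3.0 * math.pi)) * harmonic_mean_mm * 1e6 / final_magnification


# Hypothetical intercepts measured on an image at a final magnification of 5800.
bmt_nm = harmonic_mean_thickness_nm([2.0, 2.5, 3.0], final_magnification=5800)
print(f"BMT = {bmt_nm:.0f} nm")
```

The harmonic mean is used because it gives less weight to the long intercepts produced when the membrane is cut tangentially.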

Follow-up biopsies

Follow-up biopsies were performed on clinical indication. The histological diagnosis was established by an experienced renal pathologist (NM) according to the Banff 97 classification; chronic changes were required to show no evidence of a specific etiology, in accordance with the IF/TA definition [11].

Kidney function

The renal allograft function was evaluated by creatinine values, and glomerular filtration rate was estimated by the abbreviated MDRD equation (eGFR). When patients were alive and on dialysis or re-transplanted, eGFR was assigned the value zero. eGFR groups were established according to chronic kidney disease (CKD) stages (www.renal.org).
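For orientation, the abbreviated (four-variable) MDRD equation and the grouping into CKD stages can be sketched as follows. Several details are assumptions, since the source does not state them: the leading coefficient (186 in the original equation, 175 for IDMS-traceable creatinine assays), the conversion of creatinine from μmol/l to mg/dl by division by 88.4, and the conventional stage cut-offs of 90, 60, 30, and 15 ml/min/1.73 m². The function names and the example patient are hypothetical.

```python
def egfr_mdrd(creatinine_umol_l, age_years, female, black=False, coefficient=186.0):
    """Abbreviated MDRD estimate of GFR in ml/min/1.73 m^2.

    coefficient: 186.0 (original equation) or 175.0 (IDMS-traceable assays);
    which one applies to a given data set is an assumption here.
    """
    creatinine_mg_dl = creatinine_umol_l / 88.4
    egfr = coefficient * creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def ckd_stage(egfr):
    """Map eGFR to CKD stage 1-5 using the conventional cut-offs."""
    if egfr >= 90:
        return 1
    if egfr >= 60:
        return 2
    if egfr >= 30:
        return 3
    if egfr >= 15:
        return 4
    return 5


# Hypothetical 50-year-old male recipient with plasma creatinine 149 umol/l.
example_egfr = egfr_mdrd(149, age_years=50, female=False)
print(f"eGFR = {example_egfr:.0f} ml/min/1.73 m^2, CKD stage {ckd_stage(example_egfr)}")

# Recipients on dialysis or re-transplanted were assigned eGFR = 0 in this study.
print(ckd_stage(0))
```

Assigning eGFR = 0 to recipients without graft function places them in the lowest group under these cut-offs.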

Statistical methods

We used Kaplan–Meier plots and Cox proportional hazards regression to analyze renal allograft survival. Recipients who died with a functioning graft were censored at the time of death [32]. Recipients with immediate complications were excluded. The association of kidney function with baseline structural values was analyzed with linear regression. P values less than 0.05 were considered significant. Analyses were performed with Stata 13 (StataCorp LLC, Texas, USA).
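The way censoring enters a death-censored Kaplan–Meier estimate can be sketched in a few lines. This is an illustration only: the analyses in this study were run in Stata, and the function name and the follow-up times below are hypothetical. Graft failure is coded as an event (1), and death with a functioning graft or end of follow-up is coded as censoring (0).

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one failure.

    times: follow-up time for each recipient
    events: 1 = graft failure, 0 = censored (e.g. death with functioning graft)
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        failures = 0
        leaving = 0
        # Collect everyone whose follow-up ends at this time.
        while i < len(data) and data[i][0] == current_time:
            failures += data[i][1]
            leaving += 1
            i += 1
        if failures:
            survival *= 1 - failures / at_risk
            curve.append((current_time, survival))
        # Censored recipients leave the risk set without lowering survival.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in years (not study data).
years = [0.5, 2.0, 4.5, 8.0, 12.0, 15.0]
graft_failed = [1, 0, 1, 0, 1, 0]
for time, surv in kaplan_meier(years, graft_failed):
    print(f"{time:>5.1f} years: S = {surv:.3f}")
```

A censored recipient contributes to the number at risk up to the time of censoring and then leaves the risk set, which is why deaths with a functioning graft do not produce a step in the curve.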

Results

Clinical data

Baseline donor (n = 54) and recipient (n = 54) data are presented in Table 1. One year after transplantation, eight patients were alive without kidney function; among these, two patients had lost the kidney transplant very early: one due to arterial torsion with infarction and one due to an immediate renal vein thrombosis. One graft was histologically diagnosed with continuous acute rejection grade 3, and later vein thrombosis. Two patients were on dialysis with their grafts in situ; the grafts were later explanted and diagnosed with IF/TA. Four grafts were explanted due to clinically and histologically verified acute rejection in combination with IF/TA; one of these patients had died.

At the latest follow-up on December 31, 2012, 14 patients were alive without kidney function. Among these were the patient with the early complication caused by immediate renal vein thrombosis and the patient with early continued acute rejection. Two transplants were explanted due to acute rejection and IF/TA; six had been explanted solely due to IF/TA. One patient had been re-transplanted based on biopsy-proven IF/TA. One was on dialysis without a diagnosis. Two were on dialysis or explanted, with histological diagnoses of a mixture of chronic changes and diabetic complications.

Six patients had died without kidney function. Among these was the one with the early complication due to arterial torsion. One patient was on dialysis and had biopsy-verified IF/TA. One was on dialysis with the graft in situ, clinically diagnosed as chronic rejection but not confirmed by histology. Furthermore, one patient was on dialysis without biopsy. Two grafts had been explanted due to acute rejection and IF/TA.

Twelve patients died with graft function. Two patients died during the first year, one with cerebral infarction and one due to acute pancreatitis after a cholecystectomy. Up to 5 years of follow-up, another three patients died with transplant function: one due to an acute myocardial infarction and two of unspecified cause. From 5 years until the last date of follow-up, a further seven recipients died with transplant function: three with cancer diagnosed within 5 to 7 years after transplantation (pulmonary, thyroid, and esophageal), one due to infection (Pneumocystis carinii), and three of unspecified cause.

The follow-up period was up to 15.3 years. Clinical status is illustrated in Table 2. The mean overall kidney survival time was 8.6 years (median = 8.6, SD = 6.05, range 0.03–15.3, n = 54). Figure 2 illustrates survival as a combination of recipient and graft survival. The mean number of histologically verified acute rejections during the first year was 0.46 (range 0–3), and the mean number of borderline rejections was 0.66 (range 0–4).

Table 1 Data for recipient and donor
Fig. 1 Flowchart with an overview of the studied parameters and the applied techniques

Table 2 Clinical status of the recipients who received a kidney transplant from September 1997 to November 1998, after prospective short-term and long-term follow-up (14.2 to 15.3 years)
Fig. 2 Recipient and graft survival, illustrated by Kaplan–Meier curves. The text in the graph area refers to the outcome at the end of the study. At the end of the study, 66.7% of recipients were alive, 40.7% were alive with kidney transplant function, and 33.3% had died during follow-up; of these, two-thirds (22.2% of all recipients) had a functioning kidney transplant. Numbers at risk are shown in Table 2

Baseline biopsies

Results are presented in Table 3 and include Banff scores, measurements by light microscopy, and measurements by electron microscopy.

Table 3 Data from baseline biopsies of donor kidneys

Kidney function

Plasma creatinine for kidneys functioning after 1 year was on average 163 μmol/l (median = 149, SD = 55, range 86–306, n = 43). Mean eGFR for these patients was 42 ml/min/1.73 m² (median = 41, SD = 15, range 16–93, n = 43). Median CKD group value was 3 (range 1–5, n = 51). Mean plasma creatinine for kidneys functioning at the end of 2012 was 144 μmol/l (median = 129, SD = 50, range 62–258, n = 22). Mean eGFR for these functioning kidneys was 48 ml/min/1.73 m² (median = 47, SD = 23, range 23–127, n = 22). Median CKD group value was 4 (range 1–5, n = 36).

Baseline allograft factors as predictors of allograft failure

Table 4 shows the results of Cox regression analysis of baseline allograft factors as predictors of allograft failure during the follow-up period. Results are expressed as hazard ratios and include analyses at 5 years and at the end of the study.

Table 4 Cox regression of structural allograft factors at baseline as predictors of allograft failure during 14.2 to 15.3 years of follow-up

The Banff “ah” score at implantation was associated with loss of function of the transplanted kidney. For each unit increase in “ah” score at baseline, the incidence of allograft failure increased by a factor of 3.28, P < 0.001. An “ah” score above 1 was associated with a worse outcome (Fig. 3). The “cv” score was marginally significant after 5 years. For each unit increase in “ci” score at baseline, the incidence of allograft failure increased by a factor of 5.98, P = 0.001. The proportion of interstitium in the glomerular cortex at baseline was also associated with loss of function of the transplanted kidney. For each percentage point increase in interstitial tissue, the incidence of lost kidney transplant function increased by 14%, P = 0.01. Per 10 percentage point increase in VvC(interstitium/glomerular cortex), the incidence of lost kidney transplant function increased by a factor of 3.58 (95% CI 1.38, 9.26; P = 0.01). Per 10-year increase in donor age, the hazard ratio was 1.60 (95% CI 1.06, 2.41; P = 0.03).
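The two ways of reporting the interstitium result (per 1% and per 10% increase) are the same Cox coefficient expressed on different scales. In a proportional hazards model the log hazard ratio is linear in the covariate, so a hazard ratio reported for a change of one size can be converted to another size by raising it to the ratio of the two changes. The sketch below shows this conversion with the figures reported above; the function name is hypothetical.

```python
import math


def rescale_hazard_ratio(hazard_ratio, factor):
    """Convert a hazard ratio to a covariate change that is `factor` times as large."""
    return math.exp(factor * math.log(hazard_ratio))


# Reported per 10% increase in VvC(interstitium/glomerular cortex).
hr_per_10 = 3.58
ci_per_10 = (1.38, 9.26)

# Convert to a 1% increase (one tenth of the reported step).
hr_per_1 = rescale_hazard_ratio(hr_per_10, 0.1)
ci_per_1 = tuple(rescale_hazard_ratio(bound, 0.1) for bound in ci_per_10)
print(f"HR per 1% = {hr_per_1:.2f} (95% CI {ci_per_1[0]:.2f}, {ci_per_1[1]:.2f})")

# The same conversion for donor age: reported per 10 years, converted to per year.
print(f"HR per year of donor age = {rescale_hazard_ratio(1.60, 0.1):.3f}")
```

The per-1% value comes out at roughly 1.14, in line with the 14% increase quoted in the text.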

Fig. 3 Illustration of survival by Kaplan–Meier plots for two groups. Cutpoint value was guided by the mean value of Banff “ah” score in the baseline biopsies. Death-censored graft survival as a function of arteriolar hyaline thickening score in the baseline biopsy. Cox regression for Banff “ah” score 2–3 compared to Banff “ah” score 0–1. P < 0.001; hazard ratio = 8.18 (CI 2.83, 23.6) at the end of the study

Figures 4 and 5 and Online Resource Figures 14 illustrate grouped graft survival for the baseline parameters. The hazard ratio for the number of acute rejections in the first year was 2.09, P = 0.01 (95% CI 1.18, 3.66). The number of borderline rejections in the first year, AB and DR mismatches, and the cold ischemia time were not significant in the Cox regressions. When applying a multifactorial model, the effect of the “ah” score was robust. The Banff “ah” score was also the single most important factor when donor age was included in the assessment.

Fig. 4 Illustration of survival by Kaplan–Meier plots for two groups. Death-censored graft survival as a function of glomerular volume in the baseline biopsy. Cox regression for glomerular volume above 3 × 10⁶ μm³ compared to glomerular volume below 3 × 10⁶ μm³. P = 0.06; hazard ratio = 7.44 (CI 0.03, 58.95) at the end of study

Fig. 5 Illustration of survival by Kaplan–Meier plots for two groups. Death-censored graft survival as a function of Banff score of cortical interstitial tissue in the baseline biopsy. Cox regression for Banff “ci” score 1–3 compared to Banff “ci” score 0. P = 0.001; hazard ratio = 5.98 (CI 2.06, 17.34) at the end of study

Baseline allograft factors correlated to kidney function during follow-up

The eGFR after 1 year of follow-up was statistically significantly correlated with arteriolar hyalinosis at baseline, as well as with the number of sclerosed glomerular profiles and glomerular volume at baseline (Table 5). Further analysis of kidney function at year 1 is presented in Online Resource page 5. Banff arteriolar hyalinosis score and glomerular volume still correlated with eGFR at 5- and 10-year follow-up, and at the latest follow-up (results shown for the latter). At the end of the study, eGFR was decreased by 12 ml/min/1.73 m² per unit increase in the score for arteriolar hyalinosis at implantation (P = 0.02); per 10⁶ μm³ increase in glomerular volume at baseline, the eGFR at the latest follow-up was decreased by 19 ml/min/1.73 m² (P = 0.03).

Table 5 Kidney function (eGFR) after 1 year and at the end of study, by structural allograft parameters at baseline. Linear regression analysis

Discussion

We hypothesized that long-term kidney graft survival is related to structural parameters in baseline biopsies, and also studied the association between baseline structural parameters and long-term graft function. Baseline factors have been reviewed in detail [36, 37]; few studies have been very long-term prospective cohort studies [6]. We analyzed factors in three structural compartments (vessels, interstitial tissue, glomeruli) and evaluated them together with donor age. The main finding of our study is that the Banff score for arteriolar hyaline thickening (“ah” score) is the single factor with the greatest impact on long-term graft survival.

The “ah” score was also associated with long-term renal function in the surviving grafts, as has been shown with a shorter period of follow-up [26]. This was also the case in a long-term study on retrospectively reviewed early indication biopsies [38], but is in contrast to another study by the same group [39]; the discrepancy might be due to the type of biopsy and to different inclusion criteria, with a group of biopsies “on request” from the surgeon and non-consecutive enrolment in the latter. We did not perform quantitative estimates for arteries.

The quantitative estimate of baseline cortical interstitial tissue, VvC(interstitium/cortex), and the baseline Banff “ci” score for interstitial fibrosis both correlated significantly with graft survival. The stereology-based measurement was evaluated in an attempt to objectify the amount of interstitial tissue. The two principally different methods for estimation, the quantitative measurement versus the semi-quantitative score, led to comparable results regarding graft survival. Optimization of the assessment of fibrosis is ongoing [40], and has been proposed for implantation biopsies as well [29]; automation might also be implemented in future diagnostic practice.

The range of the cortical interstitial tissue measurements at baseline was 0.13 to 0.38, with a mean value of 0.24. A comparable variation has been reported [20, 41]. The results present a broad range for kidneys clinically regarded as normal. Different definitions of interstitial tissue and interstitial fibrosis might explain some of the variation [42]. We did not evaluate inflammation as it was very sparse at baseline. A former study found that total inflammation in 6-week transplant biopsies did not predict progression of fibrosis at 1 year [43].

Glomerular volume and the fraction of occluded glomerular profiles may reflect two stages or pathways of glomerular affection [44]. Both factors correlated with graft survival in our study; but only the unbiased estimate of glomerular volume also correlated with long-term renal function.

Baseline glomerular area evaluated by a maximal planar area method has been reported as a predictor of serum creatinine and creatinine clearance, with a follow-up of 7.5 years [22].

Nankivell et al. evaluated sequential graft biopsies up to 10 years after transplantation and reported that severe arteriolar hyalinosis resulted in greater glomerulosclerosis on sequential biopsies [45]. A prospective study by Wavamunno et al. used quantitative methods for ultrastructural parameters; the study was performed with surveillance biopsies in 15 patients in a 5-year ultrastructural follow-up [26]. They did not report on baseline findings, but found that ultrastructural changes were detectable early, and light microscopy changes regarding transplant glomerulopathy could be detected 2.3 years later.

Podocyte depletion has been shown to contribute to allograft failure [46]. The Ann Arbor group further reported increasing glomerular volume in biopsies with late transplant glomerulopathy; the glomerular volume was estimated in biopsies with at least 8 tuft profiles, and was based on the average radius of all tuft profiles in one section. Glomerular volume estimated from two-dimensional measurements has also been reported by the Mayo group [44]; based on one PAS section, biopsy section adequacy was defined as at least 2 mm² of cortex and 4 glomeruli per section. They report that larger cortical nephron size, subclinical nephrosclerosis, and arteriolar hyalinosis modestly predict death-censored graft failure; the mean follow-up was 6.3 ± 3.8 years. The two-dimensional Weibel-based techniques for glomerular volume estimates are less work-demanding than the unbiased “gold standard” method applied in our study; estimates from average profile area usually require correction factors (for shape, size distribution, and shrinkage), and a certain number of glomerular profiles is needed [35]. There may indeed be a future role for these techniques in automated morphometry as suggested by Issa et al. [44], and perhaps in diagnostic routine. A prospective long-term follow-up cohort study with regard to baseline glomerular volume based on the Cavalieri estimator has not been performed before.

We found the Banff 97 score for mesangium hard to apply to the baseline biopsies, and we estimated the volume fraction of mesangium by electron microscopy. The basement membrane thickness of the glomerular capillaries was also estimated by electron microscopy. These factors did not correlate with graft survival.

The strength of the study is the prospective and consecutive inclusion of all baseline biopsies from a cohort of kidney transplant patients from a single center. Donor kidneys were not refused due to pre-implantation biopsies, which might have prevented a selection bias; standards for pre-implantation biopsies are evolving [47]. We used needle biopsies [47, 48], which were paraffin- or Epon-embedded, and we applied predefined inclusion criteria for evaluation of the biopsies and the measurements of structural parameters. All histological diagnoses and Banff scores were established by one experienced renal pathologist and quantitative measurements were conducted by one person. The scoring and measurements and the statistical evaluations were performed anonymized. The long observation period is a strength but also a weakness of the study; the patients dying with functioning grafts contributed to a reduction in the number of patients for long-term follow-up; the fraction is comparable to Issa et al. [44]. The number of biopsies fulfilling the predefined strict inclusion criteria also affected the groups for final evaluation. Despite limitations, the results of our cohort study show that existing changes within the donor kidney have extraordinary long-term implications.

Prospective studies per se are historic. The clinical management might be different from current approaches. However, we present baseline structural factors in a cohort where no patient was lost to follow-up. In Denmark, we have access to all patient data due to a nationwide system with a specific “Personal identity number.” The clinical treatments are based on national guidelines, and the Health Service is free of charge for the individual patient. We consider this a strength, and it may make an extrapolation of the results realistic, also in a historic perspective.

This study should not cause a negative selection of donors [49]. We find that the early morphological signs, which point to later development of reduced graft function, should encourage the investigation of therapeutic targets [50] and introduction of further preventive therapies. Implementation of fibrosis-inhibiting drugs and renal protective treatments for risk groups of kidney graft recipients could be proposals to minimize the influence of the structural factors present at baseline, in an effort to delay the process of vascular damage and the glomerular and interstitial changes.