Introduction

Biliary atresia (BA) is a destructive, inflammatory, obliterative cholangiopathy of neonates and infants that affects varying lengths of both the intrahepatic and extrahepatic bile ducts [1]. The Kasai procedure, liver transplantation (LT), and a combination of the two have been established as standard life-saving treatments for BA [2, 3]. Kasahara et al. showed that among patients with BA who underwent living donor liver transplantation (LDLT) in Japan, the 5-, 10-, and 20-year graft survival rates were 90.4%, 84.6%, and 79.9%, respectively [4]. Because of the excellent long-term outcomes of LT, improving the health-related quality of life of recipients is now the main focus of clinical practice for patients with BA [5,6,7]. Quality of life, cognitive skills, and intellectual ability are known to be poorer in school-aged children after LT than in healthy peers [8,9,10]. The transplantation process may expose a child to many factors that affect developmental status. Infancy is the period in which the developing nervous system is most vulnerable to injury. Thus, children with infantile-onset liver disease are at increased risk for insult to the developing brain [11, 12]. Malnutrition resulting from chronic liver disease, metabolic derangements, and exposure to neurotoxic medications used in the transplantation process are common in children with chronic liver disease and negatively affect neurocognitive development and growth [13].

To date, the intellectual outcomes of patients with BA who underwent LT before the age of 1 year have not been investigated in a consistent manner. The caregivers of these patients are often anxious about whether their children will be able to enjoy school life and lead fulfilling lives when they grow up. It is our obligation to obtain baseline data and provide accurate information. In this study, intellectual development was evaluated using the Wechsler Intelligence Scale for Children, fourth edition (WISC-IV) in a cohort of school-aged children who had been monitored for more than 5 years after LDLT for BA in infancy. In addition, we investigated whether preoperative, perioperative, and postoperative variables affected intellectual outcomes.

Patients and methods

Study population and inclusion criteria

Between January 1998 and December 2017, 520 patients underwent LDLT at Kumamoto University Hospital, Kumamoto, Japan. Of the patients younger than 18 years who received pediatric LDLT, 102 had BA, 27 had metabolic liver disease, 23 had acute liver failure, 9 had neoplasms, and 24 had other disease entities. Of the 102 pediatric patients with BA, 49 underwent LDLT before the age of 1 year. For the purpose of this study, we excluded 14 patients who were younger than 6 years and 4 patients who were older than 16 years at the time of testing, because the WISC-IV (mean = 100, standard deviation [SD] = 15) covers the ages between 6 and 16 years. The remaining 31 patients had been followed up for more than 5 years after LDLT and were eligible to be tested with the WISC-IV. Of these 31 patients, signed informed consent to join this study was not obtained for 9, 1 had died, and 1 could not be located. Finally, data from the remaining 20 patients were subjected to analyses (Fig. 1). All living donors were family members (fathers, mothers, and one grandmother). All patients were of Japanese descent. The pediatric LDLT procedures at our institution have been described previously [14], and the immunosuppressive regimen consisted of tacrolimus combined with low-dose steroids [15].

Fig. 1

Flow-chart of patient selection. LDLT living donor liver transplantation, WISC-IV Wechsler Intelligence Scale for Children—fourth edition

This study was conducted at Kumamoto University Hospital and approved by the hospital’s institutional review board (Kumamoto University Hospital #2305).

Data collection

Preoperative, perioperative, and postoperative clinical data for each patient were obtained from the digital medical charts at our institution. To evaluate intellectual ability, we administered the Japanese version of the WISC-IV [16,17,18] (the fifth edition was not available in Japan) after the parents or caregivers gave consent and the patients themselves agreed to participate in our study. The results were assessed by three trained test psychiatrists.

Because socioeconomic conditions have been reported to influence intelligence [8], we also administered a questionnaire about social status to the parents to assess correlations with intellectual developmental outcomes. The questionnaire covered the number of family members; the main caregiver’s education, occupation, and income; whether the parents had divorced after LDLT; and whether the patient was the eldest among siblings.

Definition of intellectual borderline by WISC-IV

The WISC-IV yields scores for the following indexes: full scale intelligence quotient (FSIQ), verbal comprehension index (VCI), perceptual reasoning index (PRI), working memory index (WMI), and processing speed index (PSI). We performed a one-sample t test to determine whether the patients’ mean index scores differed significantly from the normative mean values. The borderline range of WISC-IV scores has traditionally been defined as 70–85 (−2 SD to −1 SD) [19]. We classified our patients into two groups: those with intellectual borderline scores (FSIQ ≤ 85) and those with normal scores (FSIQ > 85).

Statistical methods

The distribution of the data for the measured parameters was investigated with standard descriptive analyses. Clinical data were not normally distributed and are therefore presented as medians with ranges (minimum to maximum). WISC-IV index scores were compared with the norms of the general population by means of the one-sample t test for normally distributed data.

To investigate the relation between the FSIQ score and clinical variables, we compared the two groups by a two-tailed Mann–Whitney U test for continuous data and by a chi-squared test for categorical data. For patients who underwent retransplantation, we analyzed preoperative and perioperative variables of the first transplantation and postoperative variables of the second transplantation. The variables for which p < 0.20 in the univariable regression analyses were included in a multivariable logistic regression analysis to identify predictive factors of intellectual borderline performance (FSIQ ≤ 85). SPSS version 26.0 (IBM Corporation, Armonk, NY, USA) was used to calculate the results of the one-sample t test, two-tailed Mann–Whitney U test, chi-squared test, and univariable and multivariable logistic regression analyses. A p value of < 0.05 was considered significant.
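As a rough illustration of the two-group comparison described above, the following Python sketch computes the Mann–Whitney U statistic and an exact two-sided permutation p value using only the standard library. It is not the procedure used in this study, which was run in SPSS; the function names and the operation-time values are hypothetical and are not study data.

```python
from itertools import combinations


def midranks(values):
    """Rank pooled values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_exact(group_a, group_b):
    """Return (U for group_a, exact two-sided permutation p value)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    expected = n_a * (n_a + n_b + 1) / 2  # mean rank sum of group_a under H0
    observed_dev = abs(rank_sum_a - expected)
    hits = total = 0
    # Enumerate every way of assigning n_a of the pooled ranks to group_a.
    for combo in combinations(ranks, n_a):
        total += 1
        if abs(sum(combo) - expected) >= observed_dev - 1e-9:
            hits += 1
    return u_a, hits / total


# Hypothetical operation times (min) for two small groups; not study data.
borderline_group = [700, 720, 650]
normal_group = [600, 610, 620, 630]
u, p = mann_whitney_exact(borderline_group, normal_group)
print(f"U = {u}, exact two-sided p = {p:.4f}")
```

Full enumeration is feasible for groups of the size analyzed here (7 versus 13 patients gives 77,520 assignments), but larger samples would normally rely on a normal approximation instead.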

Results

Patient demographics

Of the 20 participants, 8 were boys, and 12 were girls. All had been diagnosed with BA and were born at a median of 39 (31–41) weeks of gestation. Median serum total bilirubin before LDLT was 13 (0.60–43) mg/dL. Median serum albumin before LDLT was 3.3 (2.3–4.1) g/dL. Before LDLT, 16 (80%) were hospitalized and 4 (20%) were followed as outpatients. The median donor age was 31 (23–62) years. Twelve (60%) donors were female and 8 (40%) were male. Data on social status were obtained from 16 (80%) of the subjects and are summarized in Supplementary Table 1. No patient had a history of intracranial hemorrhage, vision or hearing impairment, obvious neurodevelopmental disorders, or other congenital structural anomalies associated with BA. Eighteen patients had undergone the Kasai procedure, but sufficient reduction of the bilirubin levels was achieved in only 3 (17%).

At the time of LDLT, the participants’ median age was 6.2 months (3.5–12 months). The median interval from the Kasai procedure to LDLT was 4.5 months (2.2–9.9 months). Two patients had undergone primary LDLT without the Kasai procedure after adequate weight gain because advanced liver cirrhosis was identified at the time of diagnosis. The median Pediatric End-stage Liver Disease score at the time of LDLT was 15 (−4 to 27). At LDLT, the median weight Z score was −0.8 (−3.8 to 1.6) and the median height Z score was −1.5 (−5.8 to 1.2). Six (30%) LDLTs were ABO incompatible. The median operation time was 630 (543–900) min. The median graft-to-recipient weight ratio (GRWR) was 3.2% (2.4–3.9%). The median warm and cold ischemic times were 120 (59–190) and 39 (32–52) min, respectively.

After LDLT, 2 patients developed hepatic venous stenosis. Balloon dilation was successful in one patient; the other patient ultimately required stent placement 31 months after LDLT. Biliary stricture occurred in two patients, and both underwent percutaneous transhepatic biliary drainage. The stricture resolved in one patient; the other underwent surgical revision (hepaticojejunostomy) 2 months after LDLT. Two additional patients underwent re-LDLT. One of those patients suffered septic shock because of intestinal perforation during the first week after LDLT, which eventually led to graft failure, and re-LDLT was performed 4 months after the first transplantation. In the other patient, who had undergone primary LDLT without the Kasai procedure, portal vein stenosis and cholangitis resulted in graft failure, and the patient underwent re-LDLT 10 years after the first transplantation.

The median age at the time of WISC-IV testing was 8.6 years (6–16 years), with a median follow-up period of 7.6 years (5–15 years). The median weight and height Z scores were −0.42 and −0.35, respectively. All patients attended regular classes, and none of them required special needs support at the time of assessment. They visited our outpatient clinic with one or both parents at monthly intervals. At the time of WISC-IV testing, the median serum total bilirubin level was 0.5 mg/dL (0.3–2.0 mg/dL), and the median tacrolimus trough level was 1.0 ng/mL (0.0–12.3 ng/mL). Four patients were on triple immunosuppression that consisted of tacrolimus, mycophenolate mofetil, and steroid, and one patient was on tacrolimus and mycophenolate mofetil. After LDLT, the median number of courses of steroid pulse therapy was 0 (0–2). The median total steroid dosage was 474 (211–2846) mg. The median number of hospitalizations was 7 (2–21). The median maximum serum bilirubin level was 4.8 (1.9–50) mg/dL. The median minimum serum albumin level was 2.1 (1.6–2.6) g/dL. The median maximum ammonia level was 100 (73–279) µg/dL. The median maximum tacrolimus trough level was 17 (13–21) ng/mL.

WISC-IV scores

Figure 2 shows box and whisker plots of the WISC-IV index scores (n = 20) together with the results of the one-sample t tests. No patient had an FSIQ score of less than 70 (the clinical definition of intellectual disability in the general population), which corresponds to −2 SD. The mean (SD) FSIQ score of the study population was 91.8 (15.3) and was significantly lower than that of the general population [100 (15), p = 0.026]. Likewise, the mean (SD) PRI score [89.3 (16.8)] was significantly lower than that of the general population (p = 0.010). The scores for the VCI, WMI, and PSI of the study patients did not differ significantly from those of the general population.

Fig. 2

Box and whisker plots of the results of the WISC-IV scores of the study population (n = 20) and one-sample t test comparison with the general population. The mean (standard deviation) FSIQ and PRI scores of the study population (n = 20) were significantly lower (p = 0.026 and p = 0.010, respectively) than those of the general population (mean 100, standard deviation 15). WISC-IV Wechsler Intelligence Scale for Children—fourth edition, FSIQ full scale intelligence quotient, VCI verbal comprehension index, PRI perceptual reasoning index, WMI working memory index, PSI processing speed index
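The one-sample comparison reported above can be approximately reproduced from the published summary statistics alone. The following Python sketch is only an illustration: it assumes the reported mean, SD, and n are sufficient inputs, integrates the Student t density numerically with the standard library instead of using SPSS, and its function names are hypothetical. Because it starts from rounded summary values, its p values should be close to, but need not match exactly, those reported.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=20000):
    """Two-sided p value via Simpson integration of the density on [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central


def one_sample_t(mean, sd, n, mu0=100.0):
    """t statistic and two-sided p value from summary statistics."""
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, two_sided_p(t, n - 1)


# Summary statistics as reported in the text (n = 20, normative mean 100).
for label, mean, sd in [("FSIQ", 91.8, 15.3), ("PRI", 89.3, 16.8)]:
    t, p = one_sample_t(mean, sd, 20)
    print(f"{label}: t = {t:.2f}, two-sided p = {p:.3f}")
```

With 19 degrees of freedom, the t statistics come out at about −2.40 for the FSIQ and −2.85 for the PRI, both beyond the two-sided 5% critical value of roughly 2.09.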

Subgroup analysis revealed that the mean (SD) PSI of the 10 patients who underwent the Kasai procedure at other centers was significantly lower than that of the general population (p = 0.020). The mean FSIQ of these 10 patients was also lower than that of the general population, but the difference was not statistically significant (p = 0.052; Supplementary Fig. 1). Meanwhile, the WISC-IV index scores of the eight patients who underwent the Kasai procedure at our center were similar to those of the general population (Supplementary Fig. 2). Of note, the median age at LDLT was significantly higher for patients who underwent the Kasai procedure at other centers than for those who underwent it at our center (9.1 months vs. 6.4 months, respectively, p = 0.004). However, the difference in the median interval from the Kasai procedure to LDLT between the 2 groups was not statistically significant.

Risk factors for borderline intellectual outcome

For FSIQ, 7 patients (35%) and 13 patients (65%) scored ≤ 85 and > 85, respectively, in comparison with the general population, of whom 15.8% scored ≤ 85 and 84.2% scored > 85.
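The general-population proportions quoted above follow from the normative distribution of the WISC-IV (mean 100, SD 15). As a minimal sketch, assuming scores are treated as exactly normally distributed and continuous, the expected share at or below a cutoff can be computed with the standard library; the helper name `normal_cdf` is hypothetical.

```python
import math


def normal_cdf(x, mean=100.0, sd=15.0):
    """P(score <= x) for a normal distribution with the given mean and SD."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


expected_borderline = normal_cdf(85)  # share expected at or below -1 SD
expected_below_70 = normal_cdf(70)    # share expected at or below -2 SD
observed_borderline = 7 / 20          # borderline group in this cohort

print(f"expected FSIQ <= 85: {expected_borderline:.1%}")
print(f"expected FSIQ <= 70: {expected_below_70:.1%}")
print(f"observed FSIQ <= 85: {observed_borderline:.1%}")
```

Under this assumption about 16% of the general population is expected to score 85 or below, in line with the figure quoted above, so the observed 35% is roughly twice the expected share.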

To evaluate possible risk factors for borderline intellectual performance, the Mann–Whitney U test was used to compare the preoperative, perioperative, and postoperative data of children who had FSIQ scores of ≤ 85 (borderline group) with those of children who had FSIQ scores of > 85 (normal group; Tables 1 and 2). Univariable regression analyses revealed that patient status, donor gender, operation duration, and a treatment regimen that included tacrolimus with other drugs were potentially significant variables (p < 0.20; Table 3 and Supplementary Table 2). To identify independent factors for the borderline FSIQ score, multivariable logistic regression models were created by including those variables, but none of them was predictive of borderline FSIQ.

Table 1 Comparison of preoperative and perioperative characteristics between FSIQ ≤ 85 and FSIQ > 85
Table 2 Comparison of postoperative characteristics between FSIQ ≤ 85 and FSIQ > 85
Table 3 Variables associated with FSIQ ≤ 85 by univariate logistic regression analysis (p < 0.20)

Discussion

In this study, we assessed the intellectual performance of school-aged children who had undergone LDLT for BA before the age of 1 year and had been monitored for more than 5 years after LDLT. The one-sample t test revealed that the FSIQ scores of these patients were significantly lower than those of the general population. Rodijk et al. reported that more than two-thirds of previous studies showed deficits in all neurodevelopmental areas, including intellectual impairment, in children with liver diseases [20]. No study to date, however, has focused on the effect of LDLT on intellectual development in a homogeneous cohort of patients with a uniform diagnosis, similar ages at LDLT, and a sufficient follow-up period. For the first time, we demonstrated that LDLT for BA in infancy affects intellectual development in approximately one-third of school-aged transplant recipients in the long run.

Regardless of the level of intellectual development, it is important that the recipients and their families have a good quality of life. Rodijk et al. emphasized the importance of neurodevelopmental interventions for school-aged children with BA [21]. None of our patients required special educational and social supports, and all were leading normal daily lives. Nonetheless, physicians should be prepared for future risks. The children with intellectual borderline scores accounted for 35% of our patient population, and this subset of patients may benefit from early consultation with neurodevelopmental–behavioral pediatricians and pediatric psychologists or psychiatrists. At our center, only the surgeons oversee follow-up for pediatric recipients with no apparent intellectual or developmental problems. However, our findings shed light on the critical importance of close post-transplant monitoring of presumed intellectually competent pediatric recipients by both surgeons and pediatricians. We recommend that patients with BA who undergo liver transplantation in infancy take the WISC-IV test when they reach school age so that those with borderline intellectual performance (FSIQ ≤ 85) are identified without delay. The need for and timing of interventions, including academic, behavioral, mental health, and social supports, should be thoroughly discussed and determined according to the WISC-IV test results of each individual patient.

We also found that the mean PRI score of our patients was significantly lower than that of the general population. The PRI test is an examination of nonverbal fluid reasoning skills and enables a direct assessment of cognitive processes, including visual perception, visual–motor integration, visuospatial processing, and coordination. Our results were consistent with the observation by Haavisto et al. that some pediatric patients who underwent LT later showed visuospatial impairment [22].

In general, repeated exposure to anesthesia and surgery at an early age is an important independent risk factor for the subsequent development of learning disabilities and adversely affects neurodevelopment [23]. Whether liver transplantation and anesthesia in infancy produce an additive effect on intellectual development was beyond the scope of this study and warrants further investigation.

Previous studies have demonstrated that several preoperative factors such as hyperammonemia before LT, long-term illness, and percentile height at the time of LT were associated with cognitive decline in pediatric patients after LT [8, 10, 22, 24]. The brain is more susceptible to the deleterious effects of ammonium in childhood than in adulthood. Hyperammonemia causes irreversible damage to the developing central nervous system: cortical atrophy, ventricular enlargement and demyelination lead to cognitive impairment, seizures, and cerebral palsy [25]. The ammonia concentration and its duration appear to be key determinants of the long-term outcome [26]. However, we did not find a significant correlation between hyperammonemia and intellectual borderline functioning (FSIQ scores of ≤ 85) in this study.

Likewise, in contrast to previous publications [10, 22], we found that the duration of the disease was not predictive of intellectual impairment. The relatively short duration of disease or the small sample size may explain these results.

It is noteworthy that the patients who underwent both the Kasai procedure and LDLT at our center did not show intellectual borderline functioning, whereas those who underwent the Kasai procedure at other centers demonstrated lower mean FSIQ and processing speed index than the general population. The median age at LDLT was significantly older in the latter group, suggesting that earlier referral to a pediatric liver transplant center after a failed Kasai procedure and potentially earlier LT may mitigate concerns about intellectual disability and contribute to normal intellectual development; however, the retrospective nature of the study renders this possibility only speculative.

In addition, manganese deposits and their neurotoxicity in patients with BA have been reported [27,28,29,30], but those were not evaluated in this study, and future investigation is warranted. Preterm birth and low birth weight (less than 2500 g) have also been reported to be risk factors for intellectual disability [31, 32]. In this study, only one patient was born preterm at less than 37 weeks, and only one had low birth weight. Moreover, noncardiac major surgical procedures in neonates increase the risk of neurodevelopmental decline [33]. However, performing LDLT for BA during the neonatal period (first 4 weeks after birth) is extremely rare.

Regarding potential contributing factors to impaired intellectual development, as mentioned earlier, Flick et al. reported that repeated exposure to anesthesia and surgery before the age of 2 was an independent risk factor for the later development of learning disabilities [23]. On the other hand, Håkanson et al. demonstrated that exposure to anesthesia and abdominal surgery during infancy was not associated with cognitive dysfunction in adolescent and adult individuals [34]. Whether the detrimental effect of LDLT on intellectual development differs from that of other abdominal operations warrants further study.

In clinical practice, we encounter many pediatric recipients with overt intellectual or developmental impairment after LDLT for BA. Such patients were excluded from this study. Our findings should increase the awareness of clinicians and caregivers that a certain number of recipients who underwent LDLT for BA in infancy, have good liver graft function, and are presumably intellectually competent may actually have intellectual borderline functioning. We expect our study to be a cornerstone of further large-scale investigations with multicenter or registry data. It would also be important to assess the intellectual function of patients who underwent LDLT for BA after 1 year of age.

The limitations of this study include its retrospective design; the small sample from a single center; the lack of pretransplant intellectual information; the different ages at WISC-IV testing; the absence of comparison with patients with BA who survived with their native liver, who underwent LDLT at different ages, or who underwent deceased donor LT; and the lack of a healthy matched control population. A large-scale study is needed to determine whether our observations hold for the wider population of such recipients. Nevertheless, previous researchers have studied the intellectual outcomes of patients without a control group and used the mean score of 100 (SD 15) as reference, and we adopted the same approach [10, 21, 24]. The study spanned almost 20 years, which may have affected outcomes given improvements in techniques and medications and changes in school systems and the educational supports offered. Ideally, pretransplant assessment of baseline neurologic and intellectual function followed by serial measurements at specified time points would have provided a better understanding of the cause-effect relationship between LDLT and intellectual development. It would also be interesting to compare the WISC-IV results of LT recipients with those of their siblings, as this could potentially provide a better estimate of anticipated IQ levels than comparison with a large “normal” population dataset. In addition, the clinical implications of an approximately 10-point difference in FSIQ with regard to the social function and quality of life of an individual warrant further investigation.

In conclusion, the mean FSIQ of the school-aged children who underwent LDLT for BA in infancy was significantly lower than that of the general population, and approximately one-third of the patients were classified as having borderline intellectual functioning. Timely evaluation of the intellectual status of such patients is therefore recommended so that the need for educational and social support can be discussed with the caregivers.