Introduction

Immunoglobulin light-chain (AL) amyloidosis is a progressive disease characterized by the deposition in tissues of amyloid fibrils derived from monoclonal immunoglobulin light chains produced by clonal plasma cells [1]. An estimated 74,000 patients were diagnosed with AL amyloidosis globally in the 20 years prior to 2018; the estimated incidence was 10 cases per million population, and disease prevalence increased by 12% per year [2, 3]. According to previous research, outcomes have improved, with median overall survival (OS) rising from 1.4 to 4.6 years [4]. However, the largest real-world study of AL amyloidosis patients in Europe indicated that early mortality (within 3 months after therapy) was 13.4% and did not differ between the pre- and post-2010 periods despite therapeutic advances [5]. More than one-third of patients died within one year of diagnosis [6]. Accurately predicting patient outcomes and providing appropriate treatment are therefore crucial.

Current OS prediction for AL amyloidosis is based on indicators of cardiac involvement [7]. The Mayo 2004 staging system, which relies only on cardiac biomarkers, is used to assess the severity of AL amyloidosis in patients with advanced cardiac involvement and is still employed in clinical investigations and practice guidelines [8,9,10]. The Mayo 2012 staging system, which adds serum free light chain (FLC) measurements to the Mayo 2004 staging system, has been proposed to predict long-term prognosis more precisely [11]. However, AL amyloidosis is a multi-system disease, and prognosis does not depend on a single organ, as reflected by the growing number of proven risk factors [12]. Comprehensive, multi-disciplinary approaches to clinical diagnosis, treatment and prognosis are therefore needed. Random survival forest (RSF), a machine learning algorithm that combines random forest with survival analysis and can integrate high-dimensional data, has shown promising performance in other diseases [13, 14]. To predict OS accurately for AL amyloidosis patients, this study aimed to identify key indicators of multi-organ involvement, develop a prognostic model based on RSF, and validate its performance.

Methods

Data collection

Patients diagnosed at Xijing Hospital from January 1, 2012 to February 28, 2023 were included in this study. AL amyloidosis was confirmed according to the 2023 National Comprehensive Cancer Network guidelines [1]. Exclusion criteria were as follows: (1) missing pathology reports, such as renal or fat biopsy; (2) AL amyloidosis secondary to multiple myeloma according to the criteria of the International Myeloma Working Group [15]; (3) unavailable data at the confirmation of AL amyloidosis on the difference between involved and uninvolved free light chains (dFLC), N-terminal pro-brain natriuretic peptide (NT-proBNP), cardiac troponin T (cTnT) and cardiac troponin I (cTnI); (4) ejection fraction (EF) < 45% at the confirmation of AL amyloidosis; (5) follow-up of less than 3 months, unless the patient died of any cause. Organ involvement was assessed according to the criteria of the 10th International Symposium on Amyloid and Amyloidosis [16]. Clinical data of eligible patients at the time of biopsy diagnosis, including demographic characteristics, clinical indicators, pathological characteristics and treatment strategies, were collected. OS was defined as the time from the confirmation of AL amyloidosis to death or to the last follow-up, with follow-up ending on May 31, 2023. This study was approved by the ethics committees and review board of Xijing Hospital, and informed consent was waived because of the retrospective nature of the study. The Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis reporting guideline was followed (Supplemental Table 1) [17].
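The OS definition above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name, the day-to-month conversion and the handling of deaths recorded after the cutoff are assumptions made for illustration; only the May 31, 2023 cutoff comes from the text.

```python
from datetime import date

# Data cutoff taken from the text; everything else here is an illustrative assumption.
FOLLOW_UP_CUTOFF = date(2023, 5, 31)
DAYS_PER_MONTH = 365.25 / 12  # assumed conversion; the paper does not state one


def overall_survival(confirmed, last_contact, died, cutoff=FOLLOW_UP_CUTOFF):
    """Return (os_months, event) for one patient.

    confirmed:    date on which AL amyloidosis was confirmed
    last_contact: date of death if the patient died, else date of last follow-up
    died:         True if the patient died of any cause
    """
    end = min(last_contact, cutoff)
    # Assumption: a death recorded after the cutoff is treated as censored at the cutoff.
    event = 1 if died and last_contact <= cutoff else 0
    return (end - confirmed).days / DAYS_PER_MONTH, event


# Hypothetical patients: one death, one patient still alive at the cutoff.
print(overall_survival(date(2020, 1, 1), date(2021, 1, 1), died=True))
print(overall_survival(date(2022, 5, 31), date(2023, 7, 1), died=False))
```

The returned pair (time, event indicator) is the usual input format for survival models, including Cox regression and random survival forests.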

Table 1 Clinical characteristics of patients

Statistical analysis

Continuous variables were presented as mean (standard deviation) or median (interquartile range), and categorical variables as number (percentage). The t test or the Mann–Whitney U test was used to compare continuous variables, and the χ2 test or the Fisher exact test was used for categorical variables. All statistical analyses were performed with SPSS version 26.0 (IBM, Armonk, New York), and 2-sided P-values less than 0.05 were considered statistically significant. Model development and assessment were performed in R version 4.3.0 (R Project for Statistical Computing).

Model development

Missing values were imputed with the “missForest” package (version 1.5); any variable with more than 10% missing values was to be excluded. To reduce confounding, continuous variables with Spearman’s correlation coefficients greater than 0.4 were also removed. The data were then randomly divided into a training set and a test set at a ratio of 7:3. The training set was used to assess variable importance (VIMP), and the 95% confidence interval (CI) of VIMP was calculated by bootstrap resampling with 1000 repetitions. Variables with a mean VIMP greater than 0.03 were selected to develop the final RSF model with the “randomForestSRC” package (version 3.2.1). The optimal hyperparameters were identified by grid search.
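The correlation-based filtering step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' R code: the text does not say which member of a correlated pair was removed, so the greedy keep-first rule, the use of the absolute coefficient, the function names and the toy values are all hypothetical.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def drop_correlated(columns, threshold=0.4):
    """Greedy filter (assumed rule): keep a variable only if |rho| <= threshold
    against every variable already kept, scanning in dictionary order."""
    kept = []
    for name in columns:
        if all(abs(spearman(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy values, invented for illustration only.
toy = {
    "nt_probnp": [1, 2, 3, 4, 5, 6],
    "ctnt": [2, 4, 5, 8, 10, 13],  # monotone in nt_probnp, so rho = 1
    "alt": [3, 1, 6, 2, 5, 4],     # weakly related to nt_probnp
}
print(drop_correlated(toy))
```

With a keep-first rule the result depends on variable order, so in practice the order (or an alternative rule, such as keeping the clinically preferred marker) would need to be fixed in advance.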

Model validation

The concordance index (C-index) and the area under the receiver operating characteristic curve (AUC) were calculated to evaluate the model’s discrimination. The integrated Brier score (IBS) and calibration curves were used to evaluate the model’s calibration. The 95% CIs of the C-index and IBS of the prediction models were calculated by the bootstrap method with 1000 repetitions.
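To make the discrimination metric concrete, the following Python sketch computes Harrell's C-index for right-censored data together with a percentile bootstrap interval. It is a simplified illustration rather than the authors' implementation: pairs with tied survival times are ignored, the percentile method is an assumed choice of bootstrap interval, and the toy data are invented.

```python
import random


def harrell_c_index(times, events, risk_scores):
    """Harrell's C for right-censored data; a higher score should mean earlier death.

    A pair (i, j) is comparable when subject i has an observed death and a
    strictly shorter time than subject j. Ties in risk score count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # pairs are anchored on a subject with an observed death
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


def bootstrap_ci(times, events, risk_scores, n_boot=1000, seed=0):
    """Percentile 95% CI of the C-index from n_boot resamples with replacement."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            stats.append(harrell_c_index([times[k] for k in idx],
                                         [events[k] for k in idx],
                                         [risk_scores[k] for k in idx]))
        except ZeroDivisionError:
            continue  # this resample contained no comparable pair
    stats.sort()
    lower = stats[int(0.025 * len(stats))]
    upper = stats[max(0, int(0.975 * len(stats)) - 1)]
    return lower, upper


# Toy data, invented for illustration: times in months, 1 = death, 0 = censored.
times = [5, 10, 12, 20, 30]
events = [1, 0, 1, 1, 0]
risk = [0.9, 0.4, 0.1, 0.2, 0.3]
print(harrell_c_index(times, events, risk))
print(bootstrap_ci(times, events, risk))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.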

Cox models based on the variables with mean VIMP greater than 0.03, on the Mayo 2004 staging system and on the Mayo 2012 staging system were developed as comparators to determine whether the RSF model was superior or inferior. Decision curve analysis (DCA) was also performed to compare the clinical benefit of the different models. In addition, the net reclassification improvement (NRI) and integrated discrimination improvement (IDI) between models were calculated.
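The three comparison metrics named above can be sketched for the simplified case of a binary outcome at a fixed time horizon. This is a strong simplification and an assumption on our part: censoring is ignored, whereas survival versions of these metrics account for it, and the category-free form of the NRI is used. The function names and toy values are hypothetical.

```python
def _mean(values):
    return sum(values) / len(values)


def net_benefit(outcomes, predicted_risk, threshold):
    """Decision-curve net benefit of treating everyone with risk >= threshold."""
    n = len(outcomes)
    tp = sum(1 for y, p in zip(outcomes, predicted_risk) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(outcomes, predicted_risk) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def idi(outcomes, risk_ref, risk_new):
    """IDI: gain in discrimination slope, where the slope is the mean predicted
    risk in events minus the mean predicted risk in non-events."""
    def slope(risk):
        events = [p for y, p in zip(outcomes, risk) if y == 1]
        non_events = [p for y, p in zip(outcomes, risk) if y == 0]
        return _mean(events) - _mean(non_events)
    return slope(risk_new) - slope(risk_ref)


def continuous_nri(outcomes, risk_ref, risk_new):
    """Category-free NRI: net upward reclassification among events plus
    net downward reclassification among non-events."""
    def net_up(target):
        moves = [(new > ref) - (new < ref)
                 for y, ref, new in zip(outcomes, risk_ref, risk_new) if y == target]
        return _mean(moves)
    return net_up(1) - net_up(0)


# Toy data, invented for illustration: 1 = died by the horizon, 0 = alive.
outcomes = [1, 1, 0, 0, 0]
risk_ref = [0.6, 0.4, 0.5, 0.3, 0.2]  # reference model
risk_new = [0.8, 0.5, 0.4, 0.3, 0.1]  # new model
print(net_benefit(outcomes, risk_new, 0.5), net_benefit(outcomes, risk_ref, 0.5))
print(idi(outcomes, risk_ref, risk_new))
print(continuous_nri(outcomes, risk_ref, risk_new))
```

Positive NRI and IDI values favour the new model over the reference model; a decision curve is obtained by plotting net benefit across a range of thresholds.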

Results

Characteristics of patients

A total of 173 patients were included (Fig. 1). The mean age was 58 ± 9.4 years, and 106 (61.3%) patients were male. Lambda amyloidogenic light chains were found in 144 (83.2%) patients. According to the Mayo 2004 staging system, 43 (24.9%), 24 (13.9%) and 106 (61.2%) patients were in stages I, II and III, respectively. According to the Mayo 2012 staging system, 79 (45.7%), 32 (18.5%), 47 (27.2%) and 15 (8.6%) patients were in stages I, II, III and IV, respectively. Renal and cardiac involvement were present in 166 (95.9%) and 103 (59.5%) patients, respectively, and 104 (60.1%) patients had more than two organs involved. After a median follow-up of 25.9 (9.2, 50.3) months, 48 (27.7%) patients had died (Table 1).

Fig. 1

Flowchart of this study. AL amyloidosis, immunoglobulin light-chain amyloidosis, EF ejection fraction

Model development

Forty-nine continuous variables and 23 categorical variables were included in our study (Supplemental Table 2). Previously reported thresholds were used to dichotomize categorical variables. Thirteen variables had missing values, but none had more than 10% missing, and no statistically significant differences were found before and after imputation (Supplemental Table 3). After correlated continuous variables were removed to reduce confounding, 23 continuous variables remained (Supplemental Fig. 1) and were evaluated by RSF together with the 23 categorical variables. In total, 34 variables had a mean VIMP > 0 and could be used to develop the RSF model (Fig. 2). Creatine kinase-MB (CK-MB), estimated glomerular filtration rate (eGFR) ≤ 50 mL/min/1.73 m2, interventricular septum (IVS) ≥ 15 mm, EF, alanine aminotransferase (ALT) and liver involvement were the key indicators, each with a mean VIMP greater than 0.03. The number of trees (ntree) for the final RSF model was set to 800. According to the grid search, the optimal mtry and nodesize were 1 and 8, respectively (Supplemental Fig. 2).

Table 2 The concordance index and integrated Brier score of all prediction models
Fig. 2

Variable importance of indicators favourable for model development. VIMP variable importance, CK-MB creatine kinase-MB, eGFR estimated glomerular filtration rate, IVS interventricular septum, EF ejection fraction, ALT alanine aminotransferase, HDL high-density lipoprotein, UA uric acid, IBIL indirect bilirubin, Fib fibrinogen, BMI body mass index, ALP alkaline phosphatase, Hb hemoglobin, NOI number of organs involved, ERY erythrocyte, DDi D-dimer, WBC white blood cell, TBD time before diagnosis, DBP diastolic blood pressure, iFLC involved free light chain, TT thrombin time, TD thyroid dysfunction. To reduce confounding, 26 continuous variables were removed. The VIMP of the remaining 23 continuous variables and 23 categorical variables was then evaluated by the RSF algorithm. Twelve variables were removed because their VIMP was ≤ 0, meaning they were unfavourable for model development. Eventually, 34 variables were selected. CK-MB, eGFR ≤ 50 mL/min/1.73 m2, IVS ≥ 15 mm, EF, ALT and liver involvement were identified as the top six factors

Model performance

Discrimination

The C-indices of all prediction models are presented in Table 2. In the training set, the RSF model had the highest C-index at 0.843 (95% CI 0.761–0.898), and the Cox model based on the same variables had the lowest at 0.769 (95% CI 0.661–0.844). The C-indices of the Cox models based on the Mayo staging systems were satisfactory. In the test set, the RSF model’s C-index dropped slightly to 0.834 (95% CI 0.725–0.915), whereas the C-index of the Cox model based on the same variables rose to 0.856 (95% CI 0.727–0.913). Notably, the Cox model based on NT-proBNP and cTnI had the lowest C-index in the test set at 0.810 (95% CI 0.682–0.895).

The receiver operating characteristic (ROC) curves of all models were also plotted (Fig. 3). In the training set (Fig. 3A), the AUCs of the RSF model at 1, 3 and 5 years were 0.875, 0.882 and 0.852, and those of the Cox model based on the same variables were 0.800, 0.794 and 0.723. The RSF model’s AUCs were the highest among all prediction models at 3 and 5 years. In the test set (Fig. 3B), the AUCs of the RSF model at 1, 3 and 5 years were 0.790, 0.870 and 0.895, and those of the Cox model based on the same variables were 0.834, 0.890 and 0.905. The Cox model based on NT-proBNP and cTnI had the lowest AUCs at 3 and 5 years, at 0.868 and 0.852.

Fig. 3

Receiver operating characteristic (ROC) curves of all prediction models. (A) ROC curves in the training set. (B) ROC curves in the test set. Creatine kinase-MB, estimated glomerular filtration rate ≤ 50 mL/min/1.73 m2, interventricular septum ≥ 15 mm, ejection fraction, alanine aminotransferase and liver involvement were used to develop the random survival forest (RSF) model and Cox model A. Cox model B was based on N-terminal pro-brain natriuretic peptide (NT-proBNP) and cardiac troponin T (cTnT). Cox model C was based on NT-proBNP and cardiac troponin I. Cox model D was based on NT-proBNP, cTnT and the difference between involved and uninvolved free light chains

Calibration

All models had a low IBS, indicating good calibration (Table 2). The RSF model’s IBS was 0.106 (95% CI 0.104–0.109) in the training set and 0.151 (95% CI 0.140–0.181) in the test set. The IBS of the Cox model based on the same variables was 0.136 (95% CI 0.135–0.137) in the training set and 0.132 (95% CI 0.119–0.164) in the test set. Notably, the Cox model based on NT-proBNP and cTnI had the highest IBS in both the training set and the test set. The calibration curves of the RSF model (Supplemental Fig. 3) confirmed its good calibration.

Decision Curve Analysis

In the training set, the RSF model was more beneficial than the Cox models at 1, 3 and 5 years (Supplemental Fig. 4A). In the test set (Supplemental Fig. 4B), the Cox model based on NT-proBNP and cTnI consistently provided the least benefit among the prediction models.

NRI and IDI

All data were pooled to calculate the NRI and IDI between the RSF model and the Cox models (Supplemental Table 4 and Supplemental Table 5). The Cox model based on NT-proBNP, cTnT and dFLC was used as the reference model. At 1 year, the RSF model’s NRI and IDI were 0.229 (95% CI –0.013 to 0.474, P = 0.064) and 0.064 (95% CI –0.075 to 0.212, P = 0.406), which indicated that the RSF model performed comparably to the reference model. However, the NRI and IDI of the Cox model based on NT-proBNP and cTnI were –0.457 (95% CI –0.660 to –0.302, P < 0.01) and –0.210 (95% CI –0.320 to –0.108, P < 0.01), which indicated relatively poor performance. At 3 years, the NRI and IDI of the RSF model were 0.272 (95% CI 0.059–0.506, P = 0.014) and 0.126 (95% CI 0.008–0.251, P = 0.032). At 5 years, the NRI and IDI of the RSF model were 0.301 (95% CI 0.048–0.546, P = 0.024) and 0.157 (95% CI 0.041–0.269, P < 0.01). These results indicated that the RSF model outperformed the reference model at 3 and 5 years. Meanwhile, the poor performance of the Cox model based on NT-proBNP and cTnI remained evident at 3 and 5 years (P < 0.01).

The effect of treatment options on RSF model

We also examined the effect of treatment options on the RSF model. Patients’ treatment options were grouped into five categories (Supplemental Table 6). An RSF model (RSF-therapy) was developed by combining treatment options with the six variables identified in our study. In the test set, the C-index of RSF-therapy was 0.848 (95% CI 0.717–0.923) and its IBS was 0.138 (95% CI 0.126–0.172) (Supplemental Table 7). However, the AUCs of RSF-therapy at 5 years were lower than those of our original RSF model (Supplemental Fig. 5). To compare RSF-therapy with our original RSF model, the NRI and IDI of RSF-therapy were calculated (Supplemental Table 8). The NRI and IDI at 1, 3 and 5 years (P < 0.05) indicated that RSF-therapy performed better than our original RSF model.

Discussion

The main findings of our research are as follows. First, six routinely used indicators were identified as having a significant impact on all-cause death in AL amyloidosis patients: CK-MB, eGFR ≤ 50 mL/min/1.73 m2, IVS ≥ 15 mm, EF, ALT and liver involvement. Second, the RSF model incorporating these indicators of multi-organ involvement performed well. Third, the Cox model based on NT-proBNP and cTnI was inferior to the other models.

The outcome of AL amyloidosis patients depends mainly on whether the heart is involved. However, traditional indicators such as NT-proBNP, cTnT and cTnI were removed because of the high correlations among cardiac markers. Ultimately, CK-MB, EF and IVS ≥ 15 mm were identified as the crucial cardiac indicators. Previous studies have confirmed that elevation of CK-MB after percutaneous coronary intervention is related to increased mortality [18], and elevated CK-MB may predict adverse outcomes in patients with coronavirus disease 2019 [19]. However, the prognostic value of CK-MB in AL amyloidosis is not yet clear. According to previous studies, FLCs can induce increased production of reactive oxygen species in cardiomyocytes, followed by apoptosis and impaired heart function [20], which may cause the release of CK-MB from cardiomyocytes in AL amyloidosis patients.

AL amyloidosis patients often have heart failure with preserved EF and experience decreased EF in the advanced stage of the disease. Meanwhile, the IVS gradually thickens because of amyloid deposition, which leads to a poor outcome once IVS thickness significantly exceeds the normal range. In addition, IVS thickness was found to be significantly associated with high NT-proBNP levels [21], the main marker in the Mayo staging systems. For early and accurate recognition of the pathological changes that may lead to decreased EF and thickened IVS in AL amyloidosis, gallium-68-labeled fibroblast activation protein inhibitor 04 (68Ga-FAPI-04) has demonstrated significant correlations with EF and IVS [22]. Furthermore, cardiac 68Ga-FAPI-04 PET/CT has shown great potential in the diagnosis and prognosis of AL amyloidosis patients [22]. Identifying changes in EF and IVS early may help prevent long-term death in AL amyloidosis.

Hepatic involvement is found in approximately 20% of AL amyloidosis patients. Historically, the 5- and 10-year survival rates of these patients were less than 16.9% and 6.6%, respectively [23, 24]. Even with the advent of new therapies, the prognosis of patients with hepatic involvement remains poor. In these patients, hepatocytes may be compressed by excessive amyloid deposits, causing atrophy and subsequent hepatic dysfunction [25]. Chronic heart failure due to cardiac amyloidosis also contributes to hepatic dysfunction [26]. Alkaline phosphatase (ALP) and total bilirubin have accordingly been found to be independent prognostic factors for these patients. In our study, total bilirubin was removed, and indirect bilirubin and ALP were identified as prognostic factors but were not selected for model development because of their relatively low VIMP. Eventually, ALT, which was found to be a predictor of mortality in a previous study [27], was selected to develop the prediction model.

In patients with renal involvement, decreased eGFR appears at the stage when FLCs have caused irreversible damage to the kidney. Decreased eGFR also seems to be related to prerenal kidney injury brought on by cardiac involvement [28]. According to previous studies, an eGFR of 50 mL/min/1.73 m2 and a urinary total protein (UTP) of 5 g can be considered key thresholds for adverse outcomes [29]. In our study, only eGFR ≤ 50 mL/min/1.73 m2 at the time of biopsy was identified as a key indicator (VIMP greater than 0.03), which confirmed the significance of eGFR for all-cause death. This is consistent with other studies showing that impaired renal function negatively affects overall survival, whereas proteinuria does not [28]. In addition, amyloid deposition in the kidney can not only lead to renal dysfunction but also accelerate pre-existing renal insufficiency. However, the exact time at which renal dysfunction began to develop was unknown, because a decline in eGFR is hard for patients to perceive. In this study, we included the time interval between the onset of disease-related symptoms and a definitive diagnosis of AL amyloidosis (time before diagnosis, TBD), and TBD was also identified as a key variable that was beneficial for RSF model development (VIMP greater than 0) (Fig. 2). This suggests that early diagnosis is crucial for patients and implies the importance of early intervention, such as preserving renal function by reducing amyloid-related damage or through other treatments.

Comprehensive and interdisciplinary approaches to AL amyloidosis are being developed. Jimenez-Zepeda et al. presented a multifaceted, interdisciplinary algorithm for diagnosing amyloidosis, with the potential to streamline diagnostic processes, enable prompt and accurate interventions, and ultimately improve patient outcomes [30]. Yan et al. developed a novel clinical prognostic staging system for individuals with AL amyloidosis by integrating plasma cell-related characteristics with cardiac-renal-hepatic parameters, thereby enhancing risk stratification [31]. However, as noted in the literature above, the availability of resources for prognostic testing varies by region. The existing staging systems do not provide precise survival estimates, and variation in threshold values for the same indicator may also cause confusion in clinical practice. Furthermore, Cox proportional hazards analysis cannot make full use of voluminous, messy and complex clinical data. The assumptions of a linear relationship between the hazard and the variables and of proportional hazards may not hold for some time-to-event data, which may limit the performance of Cox models. The RSF model is not subject to these restrictions [32].

Most importantly, the RSF model showed good discrimination, calibration and clinical utility. In the test set, the C-index of the RSF model was 0.834 (95% CI 0.725–0.915). The AUCs of the RSF model were also satisfactory, and its IBS was relatively low. Compared with the Cox model based on the Mayo 2012 staging system, the RSF model showed statistically significant improvements in both NRI and IDI at 3 and 5 years, which implied better performance. The DCA of the RSF model also demonstrated its benefit. In contrast, the Cox model based on NT-proBNP and cTnI performed relatively poorly, with the lowest C-index, the highest IBS and the most limited benefit on DCA, which is consistent with previous findings that NT-proBNP combined with cTnI could not accurately stratify the survival of AL amyloidosis patients [33, 34].

By comparing RSF-therapy with the original RSF model, we speculate that incorporating treatment options into the RSF model can improve the prediction of prognosis for AL amyloidosis patients. However, the number of patients in our study who received daratumumab-based therapy was small. Considering the robust effect of daratumumab-based therapy in AL amyloidosis patients and its potential adoption by a large number of patients in the future [8, 9, 35, 36], the performance of RSF-therapy is likely to improve further.

Our research also has several limitations. First, because of the rarity of AL amyloidosis, the number of patients included in our study was relatively modest, and external validation was lacking. To confirm our findings, a prospective multi-center study with a large cohort of AL amyloidosis patients is warranted. Second, some potential key factors may have been missed in the medical records or in the process of developing the RSF model, so more detailed clinical data from patients with AL amyloidosis are needed. Third, advances in treatment generally improve patient prognosis. In our study, only the treatment regimen adopted at the time of confirmation of AL amyloidosis was collected, which may overlook the potential influence of alternative therapies on survival. Although treatment options were not included in the final RSF model, the performance of RSF-therapy suggested that the RSF model improves when therapy is included, which needs to be tested further.

Conclusions

In summary, our research provides a timely and effective RSF-based approach to predicting the survival of AL amyloidosis patients from multi-organ indicators. The developed RSF model performed well and could be a valuable addition to the Mayo staging system in supporting physicians’ decision making.