Introduction

Although the average length of hospital stay following total knee arthroplasty (TKA) has decreased over recent years due to enhanced perioperative and intraoperative management, length of stay (LOS) continues to be a substantial driver of costs [2]. A recent study investigating hospital costs for patients with different LOS highlighted that hospital costs increase by 5–8% for every additional night spent in the hospital [27]. The projected rise in the number of primary TKA procedures will be accompanied by a concomitant increase in revision TKA surgeries, with modeling studies forecasting around half a million revision TKAs to be performed over the next decade [20, 21]. Prolonged LOS, defined as a LOS greater than the 75th percentile of the LOS for all revision TKA patients [22], poses a particular challenge for cost containment because it increases the average total hospital costs by almost 40% [27]. Therefore, understanding modifiable risk factors for prolonged LOS will be essential to make bundled payment models cost-effective.

Prior retrospective studies have identified numerous modifiable and non-modifiable risk factors for prolonged length of stay following primary and revision TKA [3, 24]. However, these studies do not address the relative weight of each of these risk factors for prolonged LOS following knee arthroplasty surgery [3, 24]. Therefore, statistical models that can predict which patients will require a prolonged length of stay have the potential to help optimize patients preoperatively.

Artificial intelligence (AI) algorithms, such as artificial neural networks (ANN), represent valuable tools for analyzing and interpreting large and complex datasets, and they have been applied in many medical fields [4, 23]. Although AI algorithms have been used in prior literature to predict clinical and functional outcomes for patients following arthroplasty surgery [15, 16, 17], they have yet to be used for the prediction of prolonged length of stay. Therefore, the aim of this study was to develop and validate artificial intelligence algorithms for identifying patients at higher risk of prolonged length of stay following revision total knee arthroplasty. The authors hypothesize that artificial intelligence algorithms can accurately predict prolonged length of stay following revision total knee arthroplasty.

Materials and methods

This study received Institutional Review Board approval for the retrospective review of medical health records. A consecutive series of 2577 patients who underwent revision total knee arthroplasty at a single tertiary institution between 2010 and 2017 was identified. Exclusion criteria were (1) prior revision surgeries, (2) bilateral revision TKA procedures and (3) incomplete data. A total of 2512 revision TKA patients remained for evaluation and inclusion in the development of artificial intelligence algorithms to predict prolonged length of hospital stay following revision TKA.

Primary outcome and candidate variables

The primary outcome was the prediction of prolonged length of stay for patients following revision total knee arthroplasty. Length of stay was defined as the time between hospital admission and discharge [3]. Prolonged length of hospital stay was defined, in concordance with previous literature, as a length of hospital stay exceeding the 75th percentile of all lengths of stay following revision TKA [22, 25]. The secondary outcome of interest was the comparison of clinical outcomes between patients with prolonged LOS and those without prolonged LOS following revision TKA.
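The percentile-based outcome definition above can be illustrated with a minimal Python sketch. The function name, the sample lengths of stay and the choice of the "inclusive" quantile method are assumptions made for illustration only; the study does not report how the percentile was computed.

```python
from statistics import quantiles


def prolonged_los_flags(los_days):
    """Return the cohort's 75th percentile and a prolonged-LOS flag per stay."""
    # quantiles(..., n=4) returns the three quartile cut points;
    # index 2 is the 75th percentile. method="inclusive" treats the data
    # as the full population (an assumption for this sketch).
    p75 = quantiles(los_days, n=4, method="inclusive")[2]
    # A stay is prolonged only if it strictly exceeds the threshold.
    return p75, [days > p75 for days in los_days]


# Hypothetical lengths of stay in days, not study data.
los_days = [2, 3, 3, 4, 4, 5, 6, 9, 12]
p75, flags = prolonged_los_flags(los_days)
print(p75, sum(flags))  # threshold and number of prolonged stays
```

Because the threshold is derived from the cohort itself, roughly one quarter of stays will be labeled prolonged by construction, with the exact share depending on ties at the threshold.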

Candidate variables were collected and included patient, surgical and implant factors that were associated with prolonged LOS in prior studies [24, 25]. Patient variables included for analysis were: age, gender, body mass index (BMI), insurance status, marital status, ethnicity, medical comorbidities, American Society of Anesthesiologists Physical Status score (ASA score), Charlson comorbidity index (CCI) and preoperative opioid use. Surgical variables included for analysis were: laterality, indication for revision surgery, type of revision TKA (single component vs all components revision), anesthesia type, tranexamic acid usage, component fixation method (cemented vs non-cemented), tourniquet use, operation time and blood loss [3]. To compare clinical outcomes between patients with prolonged LOS and those without prolonged LOS following revision TKA, patient charts were also reviewed with regard to readmission rates and re-revision rates. All patients had a minimum follow-up time of 36 months.

Artificial intelligence algorithm development and data analysis

An 80:20 stratified split ratio was applied to the study cohort to create a training dataset (n = 2070 patients) and an independent testing dataset (n = 518 patients). Random forest recursive feature elimination was used to extract variables with the greatest predictive value [7, 11]. Three artificial intelligence algorithms were developed and applied to the training set: (1) artificial neural network (ANN), (2) support vector machine (SVM), and (3) elastic-net penalized logistic regression (EPLR). These three artificial intelligence algorithms were chosen based on prior literature demonstrating the high accuracy of these algorithms for the prediction of clinical outcomes [12, 13]. The training dataset underwent fivefold cross-validation repeated five times, and each model was subsequently assessed using standardized metrics of model performance to identify the artificial intelligence algorithm with the best predictive performance. We applied a coarse-grained grid-search algorithm with repeated random sub-sampling to tune each algorithm’s hyper-parameters during the training phase of each cross-validation round (ANN: number of hidden layer nodes; SVM: number of trees and boosting parameter; EPLR: mixing parameter α (Ridge regularization α = 0; Lasso regularization α = 1) and regularization penalty λ). The grid-search algorithm was constrained to pre-defined lower bounds, upper bounds, and step sizes for each hyper-parameter.
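The data partitioning described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the random seeds and the toy cohort (25% prolonged LOS) are hypothetical, and the folds in this sketch are shuffled without stratification for brevity.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) that preserve the class ratio of `labels`."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Hold out the same fraction of every class.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def repeated_kfold(indices, k=5, repeats=5, seed=0):
    """Yield (train, validation) index lists for `repeats` rounds of k-fold CV."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [shuffled[i::k] for i in range(k)]
        for i, fold in enumerate(folds):
            train = [j for m, f in enumerate(folds) if m != i for j in f]
            yield train, fold


# Hypothetical cohort: 100 patients, 25 with prolonged LOS (label 1).
labels = [1] * 25 + [0] * 75
train_idx, test_idx = stratified_split(labels)
splits = list(repeated_kfold(train_idx))
print(len(train_idx), len(test_idx), len(splits))
```

In such a scheme the held-out testing set is never touched during cross-validation, so hyper-parameter tuning can only use the training portion.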

Four methods for model assessment were applied: (1) discrimination (area under the receiver operating characteristic curve [AUC]), (2) calibration (calibration plot intercept and slope), (3) Brier score, and (4) decision curve analysis. Relative variable importance plots were utilized to determine the most important predictors for the algorithm with the best overall performance.

Discrimination of the artificial intelligence candidate algorithms was assessed using the AUC, with AUCs greater than 0.80 representing excellent algorithm performance. Artificial intelligence algorithm calibration was ascertained through a calibration plot, with perfect candidate algorithms having a calibration slope of 1 and a calibration intercept of 0 [12]. Overall algorithm performance was assessed through the Brier score [8], which is defined as the mean squared difference between predicted probabilities and observed outcomes. Perfect artificial intelligence candidate algorithms have a Brier score of 0.
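As a worked illustration of two of the metrics defined above, the following Python sketch computes the AUC in its rank-based form (the probability that a randomly chosen patient with prolonged LOS receives a higher predicted probability than a randomly chosen patient without) and the Brier score. The function names and the four toy predictions are assumptions for illustration and are not study data.

```python
def auc(y_true, y_prob):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical outcomes (1 = prolonged LOS) and predicted probabilities.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

model_auc = auc(y_true, y_prob)            # 3 of 4 pairs ranked correctly
model_brier = brier_score(y_true, y_prob)  # lower is better, 0 is perfect
print(model_auc, model_brier)
```

In this toy example three of the four positive/negative pairs are ordered correctly, giving an AUC of 0.75, which would fall below the 0.80 threshold used in this study.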

The interpretability of all artificial intelligence algorithms was assessed at both local and global levels [9]. Global explanations were provided through the use of variable importance plots, which show the relative importance of variables used for prediction indexed against the most important variable (normalized to 100 points). In contrast, local explanations were provided for individual patients to demonstrate which variables contributed to the prediction of the artificial intelligence algorithms for the specific patient in question [26]. All analyses were performed using Matlab (MathWorks Inc., Natick, MA, USA), Anaconda (Anaconda Inc., Austin, TX, USA) and Python (Python Software Foundation, Wilmington, DE, USA) (Fig. 1).
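The indexing of global variable importance against the most important variable can be sketched as follows. The raw importance values and variable names are hypothetical, and the sketch does not reproduce how the study derived the raw importances.

```python
def normalize_importance(raw):
    """Scale raw importances so that the most important variable scores 100."""
    top = max(raw.values())
    ranked = sorted(raw.items(), key=lambda item: item[1], reverse=True)
    return {name: 100.0 * value / top for name, value in ranked}


# Hypothetical raw importance scores, not study results.
raw = {"bmi": 0.20, "age": 0.40, "cci": 0.30}
scaled = normalize_importance(raw)
for name, score in scaled.items():
    print(f"{name}: {score:.0f}")
```

On this scale every other variable reads as a percentage of the top predictor's importance, which is what allows variables to be compared directly in a single plot.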

Fig. 1

Schematic of five-fold cross-validation for machine learning algorithm development

Ethical approval

The retrospective review of electronic health records for the present study was approved by our Institutional Review Board (IRB; P2020P003315). Additionally, the recommendations of the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) statement were followed for all data analysis [5].

Results

A total of 2512 consecutive patients (1347 males (53.6%), 1165 females (46.4%)) underwent revision total knee arthroplasty. Patient demographics and surgical variables for the revision TKA cohort are summarized in Table 1. Compared to patients without prolonged length of stay following revision TKA, patients with prolonged LOS demonstrated a significantly higher re-revision rate (141 TKA patients (11.4%) vs 72 TKA patients (7.5%), p < 0.01), 30-day readmission rate (137 TKA patients (9.0%) vs 57 TKA patients (7.2%), p = 0.01), 60-day readmission rate (149 TKA patients (11.3%) vs 71 TKA patients (9.5%), p = 0.04) and 90-day readmission rate (203 TKA patients (14.0%) vs 88 TKA patients (10.7%), p = 0.03) (Table 2).

Table 1 Baseline characteristics of study population
Table 2 Comparison of clinical outcomes between patients with prolonged length of stay and those without prolonged length of stay following revision TKA

Model performance

The optimal ANN had two hidden layers with 18 neurons each. The optimal SVM consisted of 100 trees, with the number of predictors for each node set to default. The optimal SVM learning rate was 0.35 with a subsampling coefficient of 0.75. The optimal EPLR used a mixing parameter α = 0.4 and a regularization penalty term of λ = 0.6.

The artificial intelligence algorithms identified numerous patient and surgical factors to be associated with prolonged length of stay following revision total knee arthroplasty (Fig. 2). These include age (> 75 years; p < 0.001), Charlson Comorbidity Index > 6 (p < 0.001), body mass index (> 35 kg/m²; p < 0.001), operative time (> 154 min; p < 0.001), type of revision TKA (revision of all components; p < 0.01), American Society of Anesthesiologists score 3/4 (p < 0.01), revision surgery for peri-prosthetic joint infection or peri-prosthetic fracture (p < 0.01), renal disease, preoperative anemia (< 12 g/dL; p < 0.01), diabetes (p < 0.01), female gender (p = 0.02) and smoking status (p = 0.03). The greatest impact on the risk of a prolonged length of stay following revision TKA was observed for age (> 75 years), Charlson Comorbidity Index and body mass index (> 35 kg/m²; Fig. 2).

Fig. 2

Global variable importance plot to assess overall importance of patient and surgical factors for the prediction of prolonged length of stay for patients following revision total knee arthroplasty

The performance of all three artificial intelligence candidate algorithms in both the training and testing sets is summarized in Tables 3 and 4. In the training phase, the AUCs for the three artificial intelligence candidate algorithms ranged from 0.86 for support vector machines to 0.88 for neural networks (Table 3; Fig. 3). In the testing phase, all three candidate algorithms achieved an excellent AUC. The greatest AUC was achieved by neural networks (AUC 0.87), as shown in Table 4. Decision curve analysis demonstrated that the three artificial intelligence candidate algorithms all achieved higher net benefits for the prediction of prolonged length of hospital stay for patients following revision TKA when compared to the default strategies of changing management for all patients or no patients (Fig. 4).
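For readers less familiar with decision curve analysis, the net benefit compared above is conventionally computed at a chosen threshold probability p_t as TP/n − (FP/n) × p_t/(1 − p_t), where TP and FP are the numbers of true and false positives among n patients. A minimal Python sketch under that standard definition is given below; the function names, the toy predictions and the threshold of 0.5 are assumptions for illustration and are not study data.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of changing management when predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the default strategy of changing management for everyone."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = prolonged LOS) and predicted probabilities.
y_true = [0, 0, 1, 1, 0, 1, 0, 0]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.05]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
nb_none = 0.0  # changing management for no patient has zero net benefit
print(nb_model, nb_all, nb_none)
```

A decision curve repeats this calculation across a range of thresholds; a model is considered useful at thresholds where its net benefit exceeds both default strategies.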

Table 3 Discrimination and calibration of artificial intelligence algorithms on training set for revision TKA patients
Table 4 Discrimination and calibration of artificial intelligence algorithms on testing set for revision TKA patients
Fig. 3

Calibration plot for the neural network algorithm for the prediction of prolonged length of stay for patients following revision total knee arthroplasty

Fig. 4

Decision curve analysis for prolonged length of stay following revision total knee arthroplasty using the neural network algorithm to assess the usefulness of the machine learning algorithms. The decision curve analysis shows the net benefit of the neural network model (green) in comparison to the default strategies of changing management for all patients (purple) or for no patients (red)

Clinical application

Utilizing the artificial neural network algorithm, a local patient-level explanation for the model predictions is shown in Fig. 5. For a 76-year-old female patient (Charlson comorbidity index 6, ASA score 2, history of diabetes, preoperative anemia < 12 g/dL, BMI 31 kg/m²) who underwent all-component revision TKA due to aseptic loosening (operation time 118 min), the predicted probability of prolonged length of hospital stay is 33.6% (Fig. 5). Age, Charlson comorbidity index, operation time, type of revision TKA, preoperative anemia, history of diabetes and female gender all increased the probability of prolonged length of stay following revision TKA, whereas body mass index, revision surgery indication, American Society of Anesthesiologists score and the absence of renal disease and smoking decreased the probability of prolonged length of stay following revision TKA surgery.

Fig. 5

Example of individual patient-specific explanation generated by the neural network algorithm for a patient with prolonged length of stay following revision total knee arthroplasty. Green bars demonstrate an increase in the probability of prolonged length of stay, whereas red bars represent a decrease in the probability of prolonged length of stay following revision total knee arthroplasty

Discussion

The main findings of the current study were that (1) the three artificial intelligence algorithms developed to predict prolonged LOS following revision TKA demonstrated excellent model performance on discrimination, calibration and decision curve analysis, and (2) through recursive feature elimination, age (> 75 years), Charlson Comorbidity Index and body mass index (> 35 kg/m²) were found to be the strongest clinical predictors of prolonged LOS following revision TKA. As the number of revision TKAs continues to increase due to an increase in primary TKAs [19], identifying patients at increased risk of prolonged LOS, along with their modifiable risk factors and the relative significance of those factors, is increasingly important. In a retrospective review using the National Inpatient Sample (NIS) database, Sloan et al. reported that from 2000 to 2014, length of stay among revision TKA patients decreased from 4.3 to 2.8 days [28]. In an attempt to identify patients at risk of prolonged length of hospital stay following revision TKA, prior retrospective studies aimed to identify numerous modifiable and non-modifiable patient and surgical risk factors [3, 24, 25]. However, these prior works did not address the weight of each of these risk factors on the probability of prolonged length of stay [3, 24, 25]. In contrast, artificial intelligence algorithms possess the ability to analyze large datasets with high accuracy through an efficient and automated analysis of complex and non-linear relationships between numerous patient and surgical variables [10]. AI algorithms therefore have the potential to assist in clinical practice through preoperative patient-specific quantification of the risk of prolonged length of stay following revision TKA.

The present study identified patient factors including age (> 75 years), Charlson Comorbidity Index and body mass index (> 35 kg/m²) as the strongest predictors of a prolonged length of stay following revision TKA. Similar observations were made in previous retrospective analyses [25]. In a retrospective study including 1,112 revision TKA patients aged over 75 years, Raut et al. reported numerous patient risk factors to be associated with a prolonged length of stay [25]. Similarly, Keswani et al. identified older age, female gender, high body mass index, high Charlson Comorbidity Index, high ASA score, preoperative anemia and the preoperative use of walking aids as risk factors for prolonged length of stay following both primary and revision TKA [6, 25]. The significance of patients’ comorbid status and body mass index was highlighted by Raut et al., who elaborated that patients with a high Charlson Comorbidity Index, high ASA score or high body mass index may struggle with rapid postoperative mobilization, which may hinder the recovery process and thereby increase their length of hospital stay following revision TKA.

Although there is strong agreement on risk factors for prolonged length of hospital stay between prior retrospective studies and the present artificial intelligence analysis [24, 25], the present study illustrates that the type of revision TKA plays a significant role in the risk of a prolonged length of stay. Previous retrospective analyses did not identify all-component revision TKA as a risk factor for prolonged LOS following revision TKA [24, 25]. This may be due to the use of conventional logistic regression analysis in those studies, whereas artificial intelligence has demonstrated higher accuracy in the analysis of large and complex datasets through the identification of non-linear relationships between numerous clinical variables, an aspect disregarded in conventional statistical analysis methods [10]. Additionally, artificial intelligence algorithms were shown to provide highly accurate analyses for datasets with incomplete or noisy data when compared to conventional statistical methods, making artificial intelligence an attractive option for data analysis [10]. As artificial intelligence algorithms also provide estimates in real time, these computational tools have strong potential to assist in clinical decision-making for patients with total knee arthroplasty.

The present study also found that patients with prolonged length of stay following revision TKA demonstrated higher postoperative complication rates, in terms of re-revision and readmission rates, when compared to patients without prolonged LOS following revision TKA. This further demonstrates the clinical utility of the artificial intelligence algorithms, as they provide useful information for patient counseling prior to revision surgery. The association between prolonged length of stay and increased postoperative complication rates has also been reported in prior literature. Collins et al. reported that cases with prolonged LOS from 11 elective operations in the National VA Surgical Quality Improvement Program demonstrated increased postoperative complication rates when compared to patients without prolonged LOS [6]. For patients following hip and knee arthroplasty surgery, Collins et al. showed increased odds of return to the hospital within 90 days, as well as return to the operating room, for patients with prolonged LOS [6]. Similarly, Krell et al. reported higher inpatient complication rates as well as postoperative complications for patients with prolonged LOS following colorectal resection, utilizing a study population of 22,664 patients from the American College of Surgeons National Surgical Quality Improvement Program registry [18].

The findings of the present study need to be interpreted in light of several limitations. First, the present study utilizes a retrospective study design, which is associated with inherent limitations [1]. Additionally, the study population includes patients from only a single large tertiary referral center, which may limit the generalizability of the artificial intelligence algorithms in clinical practice. Second, the inclusion of revision TKA procedures from multiple surgeons and uncertain adherence to clinical pathways of care introduce additional variability. However, this represents a common limitation of prior retrospective studies investigating risk factors for prolonged length of stay following revision TKA [14, 24]. Third, the present study investigated a large number of potential patient and surgical risk factors; however, functional measures such as patient-reported outcome measures were not included. Additionally, most of the potential risk factors were binary; thus, the effect of disease severity was not evaluated in this study. Furthermore, due to the retrospective nature of the study, specific comorbidities, such as chronic neuropathic pain and mental health conditions, were not analyzed.

Conclusion

This study developed and validated artificial intelligence algorithms for the prediction of patient-specific prolonged length of hospital stay following revision total knee arthroplasty, demonstrating excellent model performance on discrimination, calibration and decision curve analysis. This indicates the potential of these artificial intelligence algorithms to aid in strategic discharge planning and resource allocation.