1 Introduction

In hospitals within both public and private healthcare systems, there is growing concern about the quality and sustainability of the service. Readmission events, defined as recurrent visits of a patient within a time span shorter than a given threshold, have become one of the quality measures, regarding both patient care and economic factors. In some countries, insurance companies have set a time threshold below which they decline to cover the cost of patient care, so the hospital must assume it. Therefore, the prediction and prevention of these events is becoming economically critical for some institutions. In other countries, healthcare quality is the primary concern, so preventing readmissions is a measure of improved patient care. Readmission predictors are built by machine learning techniques as specific two-class classifiers. A specific issue when building these predictors from data is that readmission events are much less frequent than normal admissions, i.e. the datasets are class imbalanced.

In supervised classification, data imbalance occurs when the a priori probabilities of the classes are significantly different, i.e. there exists a minority (positive) class that is underrepresented in the dataset in contrast to the majority (negative) class. In healthcare, as in other fields (e.g. fraud detection or fault diagnosis), instances of the minority class are outnumbered by negative instances. Moreover, the minority class is usually the target class to be predicted, because it is related to the highest cost/reward events. Most classification algorithms assume equal a priori probabilities for all classes, so when this premise is violated the resulting classifier is biased towards the majority class: it has higher predictive accuracy over the majority class, but poorer predictive accuracy over the minority class.

The degree of class imbalance is given by the imbalance ratio (IR), defined as the ratio of the number of instances in the majority class to the number of instances in the minority class; for example, a dataset with 950 negative and 50 positive instances has IR = 19. Some studies have shown that classifier performance deteriorates even with modest class imbalance in the training data [11].

Although imbalanced data classes have been recognized as one of the key problems in the field of data mining [14], the issue is not usually taken into account in the literature on readmission risk prediction, even though some authors [2] have encountered class imbalance problems when building their predictive models. Some works, such as [1, 12, 15], point out the existence of the class imbalance problem and propose methods to circumvent it; nevertheless, only simple preprocessing approaches such as oversampling and undersampling are considered. Recent works [8, 10] in the field of disease risk prediction have attacked the class imbalance problem using different preprocessing and ensemble techniques, such as SMOTE or RUSBoost, among others.

The main contributions of this paper are:

  • A proposed methodology, based on RUSBagging, for overcoming the class imbalance problem

  • An experimental study on real-world data comparing the performance of different methods

The paper is organized as follows. In Sect. 2 we present our dataset, the methodological approach followed to build our models, and the evaluation metrics. Section 3 presents the experimental results. In Sect. 4 we discuss the conclusions and future work.

2 Materials and Methods

2.1 Experimental Dataset

We used a pseudonymised dataset composed of 99858 admission records collected between January 2013 and April 2016 at the Hospital José Joaquín Aguirre of the Universidad de Chile, which is part of the public health system of Chile. The variables recorded in the dataset are divided into three main groups: (i) sociodemographic and administrative data, (ii) health status, and (iii) reasons for consultation or diagnoses made at admission. Records with missing values were discarded for this study. Table 1 shows the characteristics of the dataset and the distribution of 72-hour readmissions among the different variables.

Table 1. Characteristics of the dataset

2.2 Data Pre-processing

Data was provided as a large ASCII text file containing 156120 admission records corresponding to 102534 different patient identities. After parsing the data, we built a dataset combining admission and patient-related data. Next, we cleaned the data by removing inconsistent samples; remaining missing values were imputed using the arithmetic mean for continuous variables and the mode for categorical variables.

For each admission of a patient to the ED we calculated the number of days elapsed since their last visit. In order to build our model following a binary classification approach, the target variable was set to readmitted/not readmitted. Patients returning to the ED within 72 h after being discharged were considered readmitted; otherwise they were considered not readmitted.

Notice that a patient returning the very first day after discharge and another one returning on the third day are both considered readmitted, whereas a patient returning at the 73rd hour after discharge is considered not readmitted.
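As an illustration, this labeling step can be sketched in pandas as follows; the column names and the toy records are assumptions for the example, not the actual schema of our dataset.

```python
import pandas as pd

# Toy records standing in for the real admissions table (assumed schema).
df = pd.DataFrame({
    "patient_id":     [1, 1, 2, 2],
    "admission_time": pd.to_datetime(["2013-01-01 10:00", "2013-01-03 09:00",
                                      "2013-01-05 08:00", "2013-01-20 08:00"]),
    "discharge_time": pd.to_datetime(["2013-01-01 18:00", "2013-01-03 20:00",
                                      "2013-01-05 12:00", "2013-01-20 12:00"]),
})
df = df.sort_values(["patient_id", "admission_time"])

# Time elapsed between each admission and the same patient's previous discharge.
prev_discharge = df.groupby("patient_id")["discharge_time"].shift()
gap = df["admission_time"] - prev_discharge

# A visit is a 72-hour readmission iff it starts within 72 h of the previous
# discharge; first visits (gap is NaT) compare as False and are labeled 0.
df["readmitted"] = (gap <= pd.Timedelta(hours=72)).astype(int)
```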

2.3 Evaluation Metrics

The evaluation metrics that we have used are sensitivity, specificity, accuracy and the Area Under the ROC Curve (AUC), defined as follows (a short computational sketch is given after the list):

  • Accuracy. In binary classification, accuracy is defined as the proportion of true results among the total population:

    $$\begin{aligned} Accuracy= \frac{\varSigma TN+\varSigma TP}{\varSigma TN+\varSigma TP+\varSigma FN+\varSigma FP}, \end{aligned}$$
    (1)

    where TN is a true negative, TP a true positive, FN a false negative and FP a false positive. In heavily imbalanced datasets accuracy is not very meaningful, because a simple strategy such as assigning each test sample to the majority class already provides high accuracy.

  • Sensitivity. Sensitivity is a classification performance measure defined as the proportion of correctly classified positives:

    $$\begin{aligned} Sensitivity= \frac{ TP}{ TP+ FN}, \end{aligned}$$
    (2)

    Sensitivity is more informative about the success on the target (minority) class.

  • Specificity. Specificity is defined as the proportion of negatives that are correctly identified as such:

    $$\begin{aligned} Specificity= \frac{ TN}{ TN+ FP}, \end{aligned}$$
    (3)
  • AUC. The Area Under ROC Curve (AUC) shows the trade-off between the sensitivity or \(TP_{rate}\) and \(FP_{rate}\) (1 - specificity):

    $$\begin{aligned} AUC= \frac{1 + TP_{rate} - FP_{rate}}{2} \end{aligned}$$
    (4)

    where the True Positive rate is equal to the Sensitivity and the False Positive rate is defined as \(FP_{rate}=\frac{\varSigma FP}{\varSigma FP+\varSigma TN}\).
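For concreteness, the four metrics can be computed with scikit-learn as sketched below (function and variable names are assumed); note that with hard 0/1 predictions roc_auc_score reduces exactly to Eq. (4).

```python
from sklearn.metrics import accuracy_score, recall_score, roc_auc_score

def report(y_true, y_pred):
    """Compute the four evaluation metrics from binary labels and predictions."""
    return {
        "accuracy":    accuracy_score(y_true, y_pred),
        # Sensitivity is recall on the positive (minority) class.
        "sensitivity": recall_score(y_true, y_pred, pos_label=1),
        # Specificity is recall on the negative (majority) class.
        "specificity": recall_score(y_true, y_pred, pos_label=0),
        # With hard 0/1 predictions this equals (1 + TP_rate - FP_rate) / 2.
        "auc":         roc_auc_score(y_true, y_pred),
    }
```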

Table 2. Confusion matrix for a binary classifier

2.4 Learning from Imbalanced Data

The main issue when learning from imbalanced datasets is that classification learning algorithms are often biased towards the majority class and, hence, there is a higher misclassification rate for minority class instances (which are usually the most interesting ones from a practical point of view). Figure 1 depicts a taxonomy of methods developed to deal with class imbalance [9], identifying three main families of techniques: preprocessing, cost-sensitive learning and ensemble techniques. We give a quick overview of each strategy below.

Fig. 1. Taxonomy of class imbalance problem addressing techniques, as proposed in [9]

Preprocessing. Methods following this strategy resample the original dataset in order to change the class distribution. Resampling techniques can be divided into three groups: (i) undersampling techniques, which delete instances of the majority class; (ii) oversampling techniques, which replicate or create new instances of the minority class, such as the Synthetic Minority Over-sampling Technique (SMOTE) [4]; and (iii) hybrid techniques, which combine both. The two basic approaches are illustrated in the sketch below.
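The following minimal sketch applies both resampling strategies using the imbalanced-learn library on synthetic data; the library and all parameter values are assumptions for the example, not part of our experimental setup.

```python
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.under_sampling import RandomUnderSampler
from imblearn.over_sampling import SMOTE

# Synthetic imbalanced data standing in for the admissions dataset.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
print(Counter(y))  # roughly 19:1 imbalance

# Undersampling: delete majority-class instances until classes are balanced.
X_rus, y_rus = RandomUnderSampler(random_state=0).fit_resample(X, y)

# Oversampling: synthesize new minority-class instances with SMOTE.
X_sm, y_sm = SMOTE(random_state=0).fit_resample(X, y)
print(Counter(y_rus), Counter(y_sm))
```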

Cost-Sensitive Learning. The strategy followed by cost-sensitive learning methods is to assign a different cost value to each type of misclassification, so that the bias towards the majority class is counterbalanced. A cost matrix is built by assigning cost values to the entries of the confusion matrix (see Table 2). The usual approach is to heavily penalize misclassifications of the minority class. These methods are categorized into the following groups (a minimal sketch follows the list):

  • Direct methods, that introduce the misclassification cost within the classification algorithm.

  • Meta-learning, where the algorithm itself is not modified. Instead, a preprocessing (or postprocessing) mechanism is introduced to handle the costs. Meta-learning methodologies can be divided into two categories, namely thresholding and sampling.
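A minimal sketch of the direct approach via class weights is given below; the cost ratio of 20 is an assumed value, in practice often tied to the IR.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced data standing in for the admissions dataset.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)

# Errors on the minority class (readmissions) are penalized 20x more than
# errors on the majority class, offsetting the bias towards the majority.
clf = DecisionTreeClassifier(class_weight={0: 1, 1: 20}, random_state=0).fit(X, y)
```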

Ensemble Classifiers. Ensemble methods rely on the idea that the combination of many “weak” classifiers can improve on the performance of a single classifier [6]. They are divided into two groups, namely cost-sensitive ensembles and data and algorithmic approaches.

  • Cost-sensitive ensemble techniques are analogous to the cost-sensitive methods mentioned earlier, although in this case the cost minimization is undertaken by the boosting algorithm.

  • Data and algorithmic approaches, which embed a data preprocessing technique in an ensemble algorithm. Depending on the ensemble algorithm they use, three groups are identified: (i) Boosting, (ii) Bagging and (iii) Hybrid.

Bagging [3] consists in creating bootstrapped replicas of the original dataset by sampling with replacement (i.e. different copies of the same instance can appear in the same bag), so that a different classifier is trained on each replica. Originally, each new dataset or bag maintained the size of the original dataset. UnderBagging and OverBagging strategies, however, embed a resampling process, so that bags are balanced by means of undersampling or oversampling techniques. To classify an unseen instance, the output predictions of the weak classifiers are collected and a majority vote is performed to produce the joint ensemble prediction. In this group we find, among others, algorithms like SMOTEBoost [5], which embeds SMOTE oversampling within boosting, or UnderBagging [13], which embeds undersampling within bagging. We propose RUSBagging, which carries out a random undersampling for each bag generated during ensemble creation; an individual weak classifier is trained from the data in each bag. A sketch of the idea is shown below.
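The following sketch illustrates the RUSBagging idea as just described; it is a simplified illustration under assumed defaults (Decision Tree base learner, 10 estimators), not our exact implementation.

```python
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier

class RUSBagging:
    """Bagging ensemble in which every bag is balanced by random undersampling."""

    def __init__(self, base_estimator=None, n_estimators=10, random_state=None):
        self.base_estimator = base_estimator or DecisionTreeClassifier()
        self.n_estimators = n_estimators
        self.random_state = random_state

    def fit(self, X, y):
        rng = np.random.RandomState(self.random_state)
        X, y = np.asarray(X), np.asarray(y)
        pos = np.flatnonzero(y == 1)  # minority class (readmitted)
        neg = np.flatnonzero(y == 0)  # majority class
        self.estimators_ = []
        for _ in range(self.n_estimators):
            # Bootstrap the minority class and draw an equally sized random
            # undersample of the majority class, so each bag is balanced.
            pos_idx = rng.choice(pos, size=len(pos), replace=True)
            neg_idx = rng.choice(neg, size=len(pos), replace=False)
            idx = np.concatenate([pos_idx, neg_idx])
            self.estimators_.append(clone(self.base_estimator).fit(X[idx], y[idx]))
        return self

    def predict(self, X):
        # Majority vote over the weak classifiers' predictions.
        votes = np.stack([est.predict(X) for est in self.estimators_])
        return (votes.mean(axis=0) >= 0.5).astype(int)
```

In our experiments the base learners are the Decision Tree and Random Forest classifiers described in Sect. 3.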

3 Experimental Results

In this section we present the results obtained when predicting the 72-hour readmission risk on the dataset presented in the previous section. We have tested two data balancing methods: random undersampling (RUS) and random undersampling embedded in a bagging approach (RUSBagging). We used the following well-known classification algorithms, implemented in the open source machine learning library scikit-learn:

  1. Decision Tree (DT), with Gini impurity as the splitting criterion

  2. Random Forest (RF), with Gini impurity as the splitting criterion and 10 estimators

The models were evaluated using 10-fold cross-validation, performing 10 independent executions. Accuracy, specificity, sensitivity and AUC were calculated for each execution, and their averages and standard deviations were computed. In order to statistically compare the results we employed an Analysis of Variance (ANOVA) approach.
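A sketch of this evaluation protocol is shown below; the synthetic data and the classifier are stand-ins for our actual dataset and models. Since scikit-learn provides no built-in specificity scorer, it is derived from recall_score.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-ins for the preprocessed dataset and a classifier.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
clf = DecisionTreeClassifier(criterion="gini", random_state=0)

scoring = {
    "accuracy":    "accuracy",
    "sensitivity": "recall",                                # recall on class 1
    "specificity": make_scorer(recall_score, pos_label=0),  # recall on class 0
    "auc":         "roc_auc",
}

# 10 independent executions of 10-fold cross-validation; averages and
# standard deviations are then taken over all folds and executions.
runs = []
for seed in range(10):
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    runs.append(cross_validate(clf, X, y, cv=cv, scoring=scoring))
```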

The following data balancing approaches were compared: (i) the original dataset with its imbalanced class distribution, (ii) random undersampling and (iii) RUSBagging. Table 3 shows the average accuracy, sensitivity, specificity and AUC, along with their respective standard deviations, for each method and classifier.

3.1 Comparison of Classifiers

According to the results shown in Table 3, of the two classification algorithms, Random Forest achieves significantly better results (p < 0.001) than Decision Tree in terms of AUC. Although DT performs better on the original dataset (where both classifiers perform poorly anyway), RF performs much better when the preprocessing and ensemble approaches are used. As shown in Fig. 3, the AUC is significantly greater for RF when RUSBagging is used; however, some sensitivity is sacrificed compared with DT. Overall, the results are poor, yet they compare well with the state of the art in readmission prediction: in a recent review [7], most studies reported performances measured by AUC near 0.5, with some outliers achieving a maximum of 0.7 (Fig. 2).

Table 3. Mean (±standard deviation) of performance metrics for each data balance method and classifier model configuration
Fig. 2. ROC curves for DT using undersampling, RUSBagging and the original dataset

Fig. 3. ROC curves for the DT and RF algorithms using the RUSBagging method

3.2 The Effect of Preprocessing and Ensemble Methods

Several conclusions can be extracted from the results shown in Table 3.

  • The models trained without modifying the original class distribution were clearly biased towards the majority class. Although accuracy scores were high (>90%), specificity was close to 100% while sensitivity tended to zero. Thus, according to the AUC scores, these models performed similarly to, or only slightly better than, a random classifier.

  • Using random undersampling for class balancing had a direct effect on the performance of the resulting model. Results show that both DT and RF obtain better AUC scores, 0.56 and 0.58 respectively, and sensitivity increases considerably. However, as could be expected, both accuracy and specificity decrease.

  • RUSBagging, which embeds random undersampling within a bootstrap aggregating algorithm, outperforms both previous approaches. According to the AUC scores, the combination of RUSBagging and Random Forest shows the best performance, with a mean of 0.60.

  • The AUC values of all models suggest poor discrimination ability. Nevertheless, a systematic review of risk prediction models for hospital readmission documented similar AUC scores (ranging from 0.50 to 0.70) in most of the studies [7].

4 Conclusions and Future Work

In this paper we have presented the results of readmission prediction based on a real dataset from a hospital in Santiago, Chile. To overcome the class imbalance problem we propose an approach called RUSBagging, that carries out random undersampling for each bag in a bagging ensemble training.

Results show that RUSBagging in combination with Random Forest significantly improves predictive performance in the context of a highly imbalanced dataset. Nevertheless, our model has shown limited predictive ability for clinical purposes, which seems to be related to the inherent difficulties and limitations of the readmission risk prediction problem. We have attacked one major issue (data imbalance), but others, such as the appropriate selection and measurement of variables, remain untouched in this paper. In order to validate the usefulness of the presented approach, we plan to gather and include additional baseline status and administrative data and to perform a prospective study. Future work will also include an extension of our comparative study with new methodologies and classifiers.