Abstract
The Pediatric Index of Mortality (PIM) 2 score is used in pediatric intensive care units (PICUs) to predict patients’ risk of death. The performance of this model has never been assessed in Switzerland. The aim of this study was to evaluate the performance of the PIM2 score in the whole cohort and in pre-specified diagnostic subgroups of patients admitted to PICUs in Switzerland. All children younger than 16 years admitted to any PICU in Switzerland between January 1, 2012 and December 31, 2017 were included in the study. A total of 22,382 patients were analyzed. Observed mortality was 2%, whereas mortality predicted by PIM2 was 4.2% (SMR = 0.47, 95% CI, 0.42–0.52). Calibration was also poor across the deciles of mortality risks (p < 0.001). The AUC-ROC for the entire cohort was 0.88 (95% CI, 0.87–0.90). Calibration varied significantly according to primary diagnosis.
Conclusion: The performance of the PIM 2 score in a cohort of Swiss patients is poor, with adequate discrimination but poor calibration. The PIM 2 score tends to underpredict the number of deaths among septic patients and in patients admitted after a cardiorespiratory arrest.
What is Known: •PIM2 score is a widely used mortality prediction model in PICU. •PIM2 performance among uncommon but clinically relevant diagnostic subgroups of patients is unknown. •The performance of PIM2 score has never been assessed in Switzerland.
What is New: •The performance of the PIM 2 score in a cohort of Swiss patients is poor, with adequate discrimination but poor calibration. •Calibration varies significantly according to primary diagnosis. The PIM 2 score underpredicts the number of deaths among septic patients and in patients admitted after a cardiorespiratory arrest.
Introduction
Mortality prediction models are an important component of the evaluation of the overall quality of care provided by pediatric intensive care units (PICUs) [1]. These models use physiologic data collected at admission to estimate the risk of death for each individual patient; the individual predictions are then aggregated to determine expected mortality rates for PICUs or groups of patients. Discordance between the observed mortality rate and the expected rate predicted by the model is taken to indicate better or worse than average quality of care [2].
However, a disagreement between observed mortality rates and those predicted by a model can be caused by other reasons than the quality of care. Indeed, when a predictive model is developed, the prediction rule is optimized for the prediction of the outcome in the study population and at the time of the study. The model may under or overestimate mortality when applied to other populations or many years after its development [3].
Since January 2012, the Swiss Society of Intensive Care Medicine has required the use of the Pediatric Index of Mortality (PIM) 2 score as a model for predicting the risk of death for all pediatric admissions to all PICUs in Switzerland. This score was developed more than 20 years ago in Australia, New Zealand and the UK [4]. Therefore, by using the PIM 2 score, the present quality of care is assessed relative to the quality of care 20 years ago. In addition, the performance of this model has never been assessed in Switzerland.
In the last few years, a drift in calibration of the PIM 2 score, which is the degree of agreement between predicted and observed deaths, has been detected in two large national registries in Australia, New Zealand and the UK [5, 6]. The use of PIM2 might therefore result in an overestimation of the quality of care in a country.
The aim of this study was to evaluate the predictive performance of the PIM2 score by analyzing its discrimination and calibration in the whole cohort as well as in the pre-specified subgroups of patients admitted to PICUs in Switzerland.
Materials and methods
This retrospective study analyzed data from the Minimal Intensive Care Unit Dataset (MDSi) of the Swiss Society of Intensive Care Medicine, which systematically collects information on all pediatric admissions to all PICUs in Switzerland.
Patients
The study population comprised all children younger than 16 years admitted to any PICU in Switzerland between January 1, 2012 and December 31, 2017. Premature infants less than 36 weeks of gestational age and patients with an admission diagnosis of “organ donor” were excluded from the analysis. A new health record number is systematically assigned to patients who are admitted to another unit. As a consequence, mortality could not be established for patients transferred to or coming from other PICUs (2078 cases, 8%), and the calibration and discriminative power of the score could not be assessed for them. Those patients were therefore excluded from the study.
Children who could not be identified with one primary admission diagnosis were also excluded.
Data
All 8 tertiary PICUs in Switzerland contribute data to the MDSi using standard coding for admissions. Since 2012, the pediatric MDSi has used the ANZPIC Registry diagnostic codes to classify children into the following primary admission categories: cardiac (including postoperative), cardiac or respiratory arrest, trauma, neurology, oncology, respiratory, sepsis with or without septic shock, miscellaneous, general medical, and general surgical [7].
Statistical analysis
Data are summarized as frequencies and percentages for categorical variables and as mean (standard deviation) and range for continuous variables. Characteristics were compared between survivors and non-survivors using t tests and chi-squared tests. Patients were divided into age groups (newborn, infant, pre-school, school, and adolescent). The performance of the PIM2 score was assessed by analyzing its discrimination and calibration in the general population and in different subgroups (age, year of admission, diagnosis at admission, and operative status). Calibration was assessed by the standardized mortality ratio (SMR, the ratio between observed and predicted deaths) with 95% CIs and by calibration plots. In addition, the Hosmer-Lemeshow (HL) goodness-of-fit test was applied to compare the mortality observed with that predicted by the PIM2 score when patients were categorized according to the deciles of risk predicted by the PIM2 score [8]. Discrimination, which is the extent to which a model predicts a higher probability of an event among patients who will have the event than among those who will not, was assessed through the area under the receiver operating characteristics curve (AUC-ROC) [9]. All statistical tests were two sided, and the significance level was 0.05. All analyses were performed using R statistical software version 3.4.3 (The R Foundation for Statistical Computing).
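As an illustration of the two performance measures described above, the following is a minimal Python sketch, not the authors' R code. The function names `auc_roc` and `hosmer_lemeshow` are hypothetical, the grouping into equal-sized risk groups by rank is one common convention and is an assumption here, and the toy data at the end are invented purely to show the calls.

```python
from typing import List, Sequence, Tuple


def auc_roc(risk: Sequence[float], died: Sequence[int]) -> float:
    """Discrimination: probability that a randomly chosen non-survivor has a
    higher predicted risk than a randomly chosen survivor (ties count 0.5)."""
    deaths = [r for r, d in zip(risk, died) if d == 1]
    survivors = [r for r, d in zip(risk, died) if d == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for rd in deaths:
        for rs in survivors:
            if rd > rs:
                wins += 1.0
            elif rd == rs:
                wins += 0.5
    return wins / (len(deaths) * len(survivors))


def hosmer_lemeshow(
    risk: Sequence[float], died: Sequence[int], groups: int = 10
) -> Tuple[float, List[Tuple[int, float, int]]]:
    """Calibration: HL statistic over equal-sized groups of predicted risk
    (deciles when groups=10). Returns the statistic and, for each group,
    (number of patients, expected deaths, observed deaths)."""
    order = sorted(range(len(risk)), key=lambda i: risk[i])
    n = len(order)
    stat = 0.0
    table: List[Tuple[int, float, int]] = []
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue
        size = len(idx)
        expected = sum(risk[i] for i in idx)
        observed = sum(died[i] for i in idx)
        mean_risk = expected / size
        # Variance of the death count under the model: n * p * (1 - p)
        variance = size * mean_risk * (1.0 - mean_risk)
        if variance > 0:
            stat += (observed - expected) ** 2 / variance
        table.append((size, expected, observed))
    return stat, table


# Invented toy data, for illustration only (not study data).
toy_risk = [0.01, 0.02, 0.05, 0.10, 0.20, 0.40, 0.60, 0.80]
toy_died = [0, 0, 0, 0, 1, 0, 1, 1]
print("AUC-ROC:", auc_roc(toy_risk, toy_died))
print("HL statistic:", hosmer_lemeshow(toy_risk, toy_died, groups=4)[0])
```

The HL statistic would then be referred to a chi-squared distribution to obtain a p value; the degrees of freedom depend on whether the model is being fitted or externally validated, so that step is left to a statistics library.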
Results
During the study period, 25,041 admissions were registered; 2659 (10.6%) were excluded from the analysis because of exclusion criteria (Fig. 1). A total of 22,382 patients were analyzed. Demographic characteristics for these patients are summarized in Table 1. The largest diagnostic category was “miscellaneous” (28.5%), followed by respiratory indications (27.4%). The most frequent medical diagnoses among “miscellaneous” patients were gastrointestinal/bowel obstruction (8%), infection (5%), PICU surveillance after invasive procedures (4%), and decompensated diabetes (3%).
PIM2 performance in entire cohort
The overall observed mortality was 2% (445/22,382), whereas mortality predicted by PIM2 (4.2%, 949 deaths) was twice as high (SMR = 0.47, 95% CI, 0.42–0.52). To better describe this issue in calibration, patients were categorized according to the risk predicted by PIM2 score. Except in patients with a predicted risk lower than 0.2%, the observed mortality was systematically lower than the mortality predicted by PIM2 score (Table 2). The difference between the number of deaths observed and predicted by PIM2 score was statistically significant (p < 0.001). The calibration plot (Fig. 2) represents graphically the overestimation of the mortality by PIM2 score. The AUC-ROC for the entire cohort was 0.88 (95% CI, 0.87–0.90).
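The overall SMR can be reproduced from the reported counts (445 observed and 949 predicted deaths). The sketch below is a hedged illustration: the text does not state which method was used for the 95% CI, so the log-scale Poisson approximation used here is an assumption, and its bounds should be close to, but not necessarily identical to, the reported 0.42–0.52. The function name `smr_with_ci` is hypothetical.

```python
import math
from typing import Tuple


def smr_with_ci(observed: int, expected: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Standardized mortality ratio (observed / expected deaths) with an
    approximate confidence interval.

    Assumption: observed deaths are treated as Poisson and the expected count
    as fixed, so SE(log SMR) is roughly 1 / sqrt(observed).
    """
    smr = observed / expected
    se_log = 1.0 / math.sqrt(observed)
    return smr, smr * math.exp(-z * se_log), smr * math.exp(z * se_log)


# Counts reported for the entire cohort.
smr, low, high = smr_with_ci(observed=445, expected=949)
print(f"SMR = {smr:.2f} (approximate 95% CI {low:.2f} to {high:.2f})")
```

An SMR below 1 with an interval that excludes 1 means that significantly fewer deaths were observed than the model predicted.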
Analysis by age group
The performance of the PIM 2 score across age groups was similar to that of the entire cohort with poor calibration and acceptable discrimination (Table 3). Neither the SMRs (p = 0.208) nor the AUC-ROCs (p = 0.235) were significantly different across age groups.
Analysis by diagnostic group
The SMR varied substantially across diagnostic groups and the difference was statistically significant (p = 0.004, Table 4). The overestimation of mortality by the PIM 2 score was especially high for the respiratory and miscellaneous groups: the predicted mortality was approximately 4 times higher than the observed mortality. In contrast, for sepsis and cardiorespiratory arrest, the predicted mortality was slightly lower than the observed mortality. The discrimination ability of the PIM 2 score also varied across diagnostic groups (p < 0.001). The poorest discrimination ability was observed in patients admitted for oncologic diseases (AUC-ROC = 0.74, 95% CI 0.26 to 1.00) and the highest in patients admitted for injury (AUC-ROC = 0.98, 95% CI 0.97 to 0.99).
Discussion
Detailed prognostic information is extremely important for patients and physicians, for quality of care assessment, and for benchmarking. Accurate estimation of the severity of illness might also contribute to the appropriate allocation of PICU resources and admission triage. Predicted risks can be used to evaluate the outcome of one institution compared with others either directly, by comparing risk-adjusted outcomes between institutions, or indirectly, by comparing outcomes for the single institution against those predicted by the model [10]. A well-performing model also helps compare a PICU’s current performance with its past performance.
PIM2 was last recalibrated more than 10 years ago [1]. Therefore, a reevaluation of the PIM2 score was needed in Switzerland. We assessed the performance of the PIM2 score in the whole population of patients admitted to PICUs in Switzerland. Our results indicate that the score has an adequate capacity for discriminating between survivors and non-survivors in the general population as well as in age subgroups. Discrimination was excellent in trauma and neurological subgroups, while the PIM2 score showed a poorer but still adequate discriminative power among septic and oncologic patients. Nonetheless, predicted mortality exceeds observed mortality by more than 50% in the whole study population. An SMR less than 1 provided by an outdated model such as PIM 2 might be misleadingly interpreted as evidence that the quality of care is better than it actually is [11].
Calibration across diagnostic groups is also important. If a model overpredicts or underpredicts death in a specific group of patients, the performance of a PICU assessed by the model will be biased by the number of patients admitted in that diagnostic group [12]. In our study, poor calibration was observed across mortality risks and age groups. It is important that PICU prediction models calibrate across diagnostic groups, as diagnostic mix varies among units. Mortality was also overestimated in almost all diagnostic subgroups. This observed drift in calibration might be explained by improved patient care, changes in the mix of patients, and the timing of intensive care admission. However, in septic patients and patients admitted after a cardiorespiratory arrest, observed mortality exceeded expected mortality by 13% and 18%, respectively. Possible explanations include variations in thresholds for commencing and discontinuing life support in patients with particularly severe clinical conditions and the possibly inappropriate application of the model in the setting of a rapidly evolving critical illness. The fact that the PIM2 score overpredicts or underpredicts mortality according to primary diagnosis might help interpret the overall performance of PICUs still using the PIM2 score on the basis of the proportion of patients admitted in different diagnostic categories [4, 13].
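The dependence of a unit's overall SMR on its diagnostic mix can be sketched numerically. In the example below, the group-specific SMRs are rough values read off the text (predicted mortality about 4 times the observed in respiratory patients, observed mortality 13% above expected in septic patients), while the two units and their expected death counts are entirely hypothetical. The function name `overall_smr` is also an assumption made for illustration.

```python
from typing import Dict


def overall_smr(expected_by_group: Dict[str, float], smr_by_group: Dict[str, float]) -> float:
    """Overall SMR of a unit: total observed deaths / total expected deaths,
    where observed deaths in each group = expected deaths * group SMR."""
    expected_total = sum(expected_by_group.values())
    observed_total = sum(
        expected * smr_by_group[group] for group, expected in expected_by_group.items()
    )
    return observed_total / expected_total


# Approximate group-level miscalibration taken from the text.
GROUP_SMR = {"respiratory": 0.25, "sepsis": 1.13}

# Hypothetical units with identical care but a different case mix
# (values are model-expected deaths per group, invented for illustration).
unit_a = {"respiratory": 80.0, "sepsis": 20.0}
unit_b = {"respiratory": 20.0, "sepsis": 80.0}

print("Unit A overall SMR:", round(overall_smr(unit_a, GROUP_SMR), 3))
print("Unit B overall SMR:", round(overall_smr(unit_b, GROUP_SMR), 3))
```

Because the overall SMR is an average of the group SMRs weighted by expected deaths, the two units would appear to perform very differently even though the miscalibration within each diagnostic group is the same.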
The PIM2 score underpredicts mortality among low-risk patients. Invasive procedures (e.g., arterial blood sampling) are not systematically performed among less severe patients. Therefore, variables such as base excess or PaO2 might have a high percentage of missing data which need to be imputed. Missing observations are usually given values considered normal, thus potentially undermining prediction in this category of patients.
Prognostic models are periodically recalibrated by adding new variables to compensate for the deterioration of calibration that occurs as a result of changes in medical practice and case mix over time [14]. A well-calibrated model allows PICUs to compare their performance with local as well as international standards of care. Given the less than satisfactory performance of the PIM 2 score in Switzerland, the use of an updated prognostic score should probably be recommended.
A new version of the PIM score, the third edition (PIM 3), is available. PIM 3 was developed in 2013 in an effort to counterbalance a drift in calibration observed in two large national datasets [15]. This model, based on more recent data, provides better estimates of mortality risks among children admitted to PICUs in Australia, New Zealand, and the UK. Apart from the introduction of a quadratic transformation of systolic blood pressure (SBP), the reorganization of diagnostic categories with the introduction of the “very-high-risk diagnosis”, and a different classification of the term “recovery post procedure”, the variables included in the PIM 3 model are substantially similar to those used in PIM2. Nonetheless, PIM 3 seems to perform better than PIM 2 and might represent a valid alternative to it [16]. The performance of the PIM 3 score in Switzerland should be assessed in multicenter prospective studies before its systematic implementation.
Our study has strengths. The analysis included data from a large, national database containing extensive clinical information on all the patients admitted to Swiss PICUs, with ongoing audits of data quality and validation procedures. The fact that the entire population of children admitted to Swiss PICUs during the study period was analyzed increases the robustness of our results. Moreover, this is one of the first studies to evaluate the performance of PIM2 in uncommon but clinically relevant diagnostic subgroups such as septic patients and patients admitted after a cardiorespiratory arrest. Although a more recent version of the PIM score is available, we believe our results might be of interest for the several PICUs that still use PIM 2 for mortality prediction [17, 18].
Our study also has limitations. One important limitation is its retrospective nature. Moreover, data were collected over 6 years, and mortality rates might have changed during the study period. Additionally, the “miscellaneous” group was the largest group of patients. Given its heterogeneity, the generalizability of our results for this category of patients may be problematic.
Conclusion
Our results indicate that the performance of the PIM 2 score in a cohort of Swiss patients is poor, with adequate discrimination but poor calibration. Calibration varies significantly according to primary diagnosis. The PIM 2 score tends to underpredict the number of deaths among septic patients and in patients admitted after a cardiorespiratory arrest. The PIM 3 score might represent a potential alternative to PIM2.
Abbreviations
- AUC-ROC: area under the receiver operating characteristics curve
- HL: Hosmer-Lemeshow
- MDSi: minimal intensive care unit dataset
- PICU: pediatric intensive care unit
- PIM: pediatric index of mortality
- SMR: standardized mortality ratio
References
Marcin JP, Wetzel RC (2013) The parsimonious pediatric index of mortality*. Pediatr Crit Care Med 14:718–719. https://doi.org/10.1097/PCC.0b013e3182a1251b
Keegan MT, Gajic O, Afessa B (2011) Severity of illness scoring systems in the intensive care unit. Crit Care Med 39:163–169. https://doi.org/10.1097/CCM.0b013e3181f96f81
Breslow MJ, Badawi O (2012) Severity scoring in the critically ill: part 1--interpretation and accuracy of outcome prediction scoring systems. Chest 141:245–252. https://doi.org/10.1378/chest.11-0330
Slater A, Shann F, Pearson G, Paediatric Index of Mortality (PIM) Study Group (2003) PIM2: a revised version of the Paediatric index of mortality. Intensive Care Med 29:278–285. https://doi.org/10.1007/s00134-002-1601-2
Report of the Australian and New Zealand Paediatric Intensive Care Registry (2011) • Health Improvement and Innovation Resource Centre. https://www.hiirc.org.nz/page/45050/report-of-the-australian-and-new-zealand/;jsessionid=695645D8A9A3265C6074D7948D838523?p=936&tag=intensivecare&tab=851&contentType=419&section=8965. Accessed 19 Oct 2019
Batchelor J, Draper E, Hobson R, et al (2011) Paediatric Intensive Care Audit Network National Report for Scotland 74
Slater A, Shann F, McEniery J, ANICS Study Group (2003) The ANZPIC registry diagnostic codes: a system for coding reasons for admitting children to intensive care. Intensive Care Med 29:271–277. https://doi.org/10.1007/s00134-002-1600-3
Hosmer DW, Lemeshow S (2000) Applied logistic regression, 2nd edn. Wiley-Interscience Publication, New York
Morris DE, Pepe MS, Barlow WE (2010) Contrasting two frameworks for ROC analysis of ordinal ratings. Med Decis Mak 30:484–498. https://doi.org/10.1177/0272989X09357477
Power GS, Harrison DA (2014) Why try to predict ICU outcomes? Curr Opin Crit Care 20:544–549. https://doi.org/10.1097/MCC.0000000000000136
Arias López MDP, Boada N, Fernández A, Fernández AL, Ratto ME, Siaba Serrate A, Schnitzler E, Members of VALIDARPIM3 Argentine Group (2018) Performance of the pediatric index of mortality 3 score in PICUs in Argentina: a prospective, national multicenter study. Pediatr Crit Care Med 19:e653–e661. https://doi.org/10.1097/PCC.0000000000001741
Slater A, Shann F, ANZICS Paediatric Study Group (2004) The suitability of the pediatric index of mortality (PIM), PIM2, the pediatric risk of mortality (PRISM), and PRISM III for monitoring the quality of pediatric intensive care in Australia and New Zealand. Pediatr Crit Care Med 5:447–454. https://doi.org/10.1097/01.PCC.0000138557.31831.65
Murphy-Filkins R, Teres D, Lemeshow S, Hosmer DW (1996) Effect of changing patient mix on the performance of an intensive care unit severity-of-illness model: how to distinguish a general from a specialty intensive care unit. Crit Care Med 24:1968–1973. https://doi.org/10.1097/00003246-199612000-00007
Le Gall JR, Neumann A, Hemery F et al (2005) Mortality prediction using SAPS II: an update for French intensive care units. Crit Care 9:R645–R652. https://doi.org/10.1186/cc3821
Straney L, Clements A, Parslow RC, Pearson G, Shann F, Alexander J, Slater A, ANZICS Paediatric Study Group and the Paediatric Intensive Care Audit Network (2013) Paediatric index of mortality 3: an updated model for predicting mortality in pediatric intensive care*. Pediatr Crit Care Med 14:673–681. https://doi.org/10.1097/PCC.0b013e31829760cf
Wolfler A, Osello R, Gualino J, Calderini E, Vigna G, Santuz P, Amigoni A, Savron F, Caramelli F, Rossetti E, Cecchetti C, Corbari M, Piastra M, Testa R, Coffaro G, Stancanelli G, Gitto E, Amato R, Prinelli F, Salvo I, Pediatric Intensive Therapy Network (TIPNet) Study Group (2016) The importance of mortality risk assessment: validation of the pediatric index of mortality 3 score. Pediatr Crit Care Med 17:251–256. https://doi.org/10.1097/PCC.0000000000000657
Teheux L, Verlaat CW, Lemson J, Draaisma JMT, Fuijkschot J (2019) Risk stratification to improve pediatric early warning systems: it is all about the context. Eur J Pediatr 178:1589–1596. https://doi.org/10.1007/s00431-019-03446-0
Ishihara T, Tanaka H (2019) Causes of death in critically ill paediatric patients in Japan: a retrospective multicentre cohort study. BMJ Paediatr Open 3:e000499. https://doi.org/10.1136/bmjpo-2019-000499
Contributions
AP: Conceptualisation, methodology, validation, writing, reviewing and editing. CG: Methodology, validation, writing, reviewing and editing. CC: Methodology, validation, writing, reviewing and editing. YLJ: Data curation, project and software administration, validation. PR: Conceptualisation, methodology, resources and supervision, validation, writing, reviewing and editing.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interests.
Ethical approval
The study was approved by the Ethical Commission of Northwestern Switzerland (EKNZ UBE-15/47) and the Scientific Committee of the Swiss Society for Intensive Care Medicine with a waiver for informed consent.
Additional information
Communicated by Piet Leroy
Cite this article
Polito, A., Giacobino, C., Combescure, C. et al. Overall and subgroup specific performance of the pediatric index of mortality 2 score in Switzerland: a national multicenter study. Eur J Pediatr 179, 1515–1521 (2020). https://doi.org/10.1007/s00431-020-03639-y
DOI: https://doi.org/10.1007/s00431-020-03639-y