Overall and subgroup specific performance of the pediatric index of mortality 2 score in Switzerland: a national multicenter study

Polito, Angelo; Giacobino, Caroline; Combescure, Christophe; Levy-Jamet, Yann; Rimensberger, Peter

doi:10.1007/s00431-020-03639-y

Original Article
Published: 01 April 2020

Volume 179, pages 1515–1521, (2020)
European Journal of Pediatrics

Angelo Polito¹,
Caroline Giacobino²,
Christophe Combescure²,
Yann Levy-Jamet¹,
Peter Rimensberger¹ &
Swiss Society of Intensive Care Medicine

Abstract

The Pediatric Index of Mortality (PIM) 2 score is used in pediatric intensive care units (PICUs) to predict patients’ risk of death. The performance of this model has never been assessed in Switzerland. The aim of this study was to evaluate the performance of the PIM2 score in the whole cohort and in pre-specified diagnostic subgroups of patients admitted to PICUs in Switzerland. All children younger than 16 years admitted to any PICU in Switzerland between January 1, 2012 and December 31, 2017 were included in the study. A total of 22,382 patients were analyzed. Observed mortality was 2%, whereas mortality predicted by PIM2 was 4.2% (standardized mortality ratio [SMR] = 0.47, 95% CI, 0.42–0.52). Calibration was also poor across the deciles of mortality risk (p < 0.001). The area under the receiver operating characteristics curve (AUC-ROC) for the entire cohort was 0.88 (95% CI, 0.87–0.90). Calibration varied significantly according to primary diagnosis.

Conclusion: The overall performance of the PIM 2 score in a cohort of Swiss patients is poor: discrimination is adequate, but calibration is poor. The PIM 2 score tends to underpredict the number of deaths among septic patients and in patients admitted after a cardiorespiratory arrest.

What is Known:
• The PIM2 score is a widely used mortality prediction model in PICUs.
• PIM2 performance among uncommon but clinically relevant diagnostic subgroups of patients is unknown.
• The performance of the PIM2 score has never been assessed in Switzerland.

What is New:
• The performance of the PIM 2 score in a cohort of Swiss patients is poor, with adequate discrimination and poor calibration.
• Calibration varies significantly according to primary diagnosis. The PIM 2 score underpredicts the number of deaths among septic patients and in patients admitted after a cardiorespiratory arrest.

Introduction

Mortality prediction models are an important component of the evaluation of the overall quality of care provided by pediatric intensive care units (PICUs) [1]. These models estimate the risk of death for each individual patient from physiologic data collected at admission; the individual predictions are then aggregated into expected mortality rates for a PICU or a group of patients. A discordance between the observed mortality rate and the expected rate predicted by the model is taken to indicate better or worse than average quality of care [2].

However, a disagreement between observed mortality rates and those predicted by a model can have causes other than the quality of care. Indeed, when a predictive model is developed, the prediction rule is optimized for the outcome in the study population at the time of the study. The model may under- or overestimate mortality when applied to other populations or many years after its development [3].

Since January 2012, the Swiss Society of Intensive Care Medicine has required the use of the Pediatric Index of Mortality (PIM) 2 score as the model for predicting the risk of death for all pediatric admissions to all PICUs in Switzerland. This score was developed more than 20 years ago in Australia, New Zealand and the UK [4]. Therefore, by using the PIM 2 score, the present quality of care is assessed relative to the quality of care 20 years ago. In addition, the performance of this model has never been assessed in Switzerland.

In the last few years, a drift in the calibration of the PIM 2 score (calibration being the degree of agreement between predicted and observed deaths) has been detected in two large national registries covering Australia, New Zealand and the UK [5, 6]. The use of PIM2 might therefore result in an overestimation of the quality of care in a country.

The aim of this study was to evaluate the predictive performance of the PIM2 score by analyzing its discrimination and calibration in the whole cohort as well as in the pre-specified subgroups of patients admitted to PICUs in Switzerland.

Materials and methods

This retrospective study analyzed data from the Minimal Intensive Care Unit Dataset (MDSi) of the Swiss Society of Intensive Care Medicine, which systematically collects information on all pediatric admissions to all PICUs in Switzerland.

Patients

The study population comprises all children younger than 16 years admitted to any PICU in Switzerland between January 1, 2012 and December 31, 2017. Premature infants of less than 36 weeks of gestational age and patients with an admission diagnosis of “organ donor” were excluded from the analysis. Children who could not be assigned a single primary admission diagnosis were also excluded.

A new health record number is systematically assigned to patients who are admitted to another unit. As a consequence, mortality could not be established for patients transferred to or coming from other PICUs (2078 cases, 8%), and the calibration and discriminative power of the score could not be assessed for them. Those patients were therefore excluded from the study.

Data

All 8 tertiary PICUs in Switzerland contribute data to the MDSi using standard coding for admissions. Since 2012, the pediatric MDSi has used the ANZPIC Registry diagnostic codes to classify children into the following primary admission categories: cardiac (including postoperative), cardiac or respiratory arrest, trauma, neurology, oncology, respiratory, sepsis with or without septic shock, miscellaneous, general medical, and general surgical [7].

Statistical analysis

Data are summarized as frequencies and percentages for categorical variables and as mean (standard deviation) and range for continuous variables. Characteristics were compared between survivors and non-survivors using t tests and chi-squared tests. Patients were divided into age groups (newborn, infant, pre-school, school, and adolescent). The performance of the PIM2 score was assessed by analyzing its discrimination and calibration in the general population and in different subgroups (age, year of admission, diagnosis at admission, and operative status). Calibration was assessed by the standardized mortality ratio (SMR, the ratio between observed and predicted deaths) with 95% CIs and by calibration plots. In addition, the Hosmer-Lemeshow (HL) goodness-of-fit test was applied to compare the mortality observed and predicted by the PIM2 score when patients were categorized according to deciles of predicted risk [8]. Discrimination, which is the extent to which a model assigns a higher predicted probability of an event to patients who will have the event than to those who will not, was assessed through the area under the receiver operating characteristics curve (AUC-ROC) [9]. All statistical tests were two sided, and the significance level was 0.05. All analyses were performed using R statistical software version 3.4.3 (The R Foundation for Statistical Computing).
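The three measures named above can be illustrated with a minimal Python sketch. This is not the authors' code (the analyses were performed in R); the function names and the toy data are hypothetical, and the sketch assumes that every predicted risk lies strictly between 0 and 1. It returns only the HL statistic; the p value would be obtained from a chi-squared distribution, and conventions for the degrees of freedom differ between development and validation samples.

```python
def standardized_mortality_ratio(outcomes, risks):
    """Observed deaths divided by expected deaths (sum of predicted risks)."""
    return sum(outcomes) / sum(risks)


def hosmer_lemeshow(outcomes, risks, groups=10):
    """Hosmer-Lemeshow chi-squared statistic over equal-sized risk groups."""
    pairs = sorted(zip(risks, outcomes))  # order patients by predicted risk
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(r for r, _ in chunk)   # expected deaths in the group
        observed = sum(o for _, o in chunk)   # observed deaths in the group
        # One term for deaths and one term for survivors.
        stat += (observed - expected) ** 2 / expected
        stat += ((size - observed) - (size - expected)) ** 2 / (size - expected)
    return stat


def auc_roc(outcomes, risks):
    """Probability that a random non-survivor has a higher predicted risk
    than a random survivor; ties count as one half."""
    dead = [r for r, o in zip(risks, outcomes) if o == 1]
    alive = [r for r, o in zip(risks, outcomes) if o == 0]
    wins = sum((d > a) + 0.5 * (d == a) for d in dead for a in alive)
    return wins / (len(dead) * len(alive))


# Hypothetical toy data: 1 = died, 0 = survived.
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
risks = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

print(standardized_mortality_ratio(outcomes, risks))
print(hosmer_lemeshow(outcomes, risks, groups=2))
print(auc_roc(outcomes, risks))
```

The pairwise AUC computation is quadratic in the number of patients and is chosen here for readability; a rank-based formulation would be preferable for a cohort of this size.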

Results

During the study period, 25,041 admissions were registered; 2659 (10.6%) met exclusion criteria and were excluded from the analysis (Fig. 1). A total of 22,382 patients were analyzed. Demographic characteristics for these patients are summarized in Table 1. The largest diagnostic categories were “miscellaneous” (28.5%) and respiratory indications (27.4%). The most frequent medical diagnoses among “miscellaneous” patients were gastrointestinal/bowel obstruction (8%), infection (5%), PICU surveillance after invasive procedures (4%), and decompensated diabetes (3%).

Table 1 Characteristics of the study population according to the vital status

PIM2 performance in entire cohort

The overall observed mortality was 2% (445/22,382), whereas mortality predicted by PIM2 (4.2%, 949 deaths) was approximately twice as high (SMR = 0.47, 95% CI, 0.42–0.52). To better describe this calibration problem, patients were categorized according to the risk predicted by the PIM2 score. Except in patients with a predicted risk lower than 0.2%, the observed mortality was systematically lower than the mortality predicted by the PIM2 score (Table 2). The difference between the number of deaths observed and the number predicted by the PIM2 score was statistically significant (p < 0.001). The calibration plot (Fig. 2) illustrates the overestimation of mortality by the PIM2 score. The AUC-ROC for the entire cohort was 0.88 (95% CI, 0.87–0.90).
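The SMR reported above follows directly from the published counts, as the short Python sketch below shows. The confidence interval in the sketch is an assumption: the text does not state how the interval was computed, so a common log-normal approximation under Poisson variation of the observed deaths is used here, and it gives an interval close to, but not identical with, the published 0.42–0.52.

```python
import math

observed_deaths = 445   # reported in the text
expected_deaths = 949   # deaths predicted by PIM2, reported in the text
admissions = 22382      # patients analyzed, reported in the text

observed_rate = 100 * observed_deaths / admissions   # percent
predicted_rate = 100 * expected_deaths / admissions  # percent
smr = observed_deaths / expected_deaths

# Assumed method (not stated in the article): Poisson variation in the
# observed deaths with a normal approximation on the log scale.
se_log_smr = 1 / math.sqrt(observed_deaths)
lower = smr * math.exp(-1.96 * se_log_smr)
upper = smr * math.exp(1.96 * se_log_smr)

print(f"observed {observed_rate:.1f}%, predicted {predicted_rate:.1f}%")
print(f"SMR {smr:.2f} (approximate 95% CI {lower:.2f} to {upper:.2f})")
```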

Table 2 Mortality observed and predicted by PIM2 score per category of predicted risk. For a perfectly calibrated predictive score, the predicted numbers of deaths are equal to the observed numbers of deaths

Analysis by age group

The performance of the PIM 2 score across age groups was similar to that of the entire cohort with poor calibration and acceptable discrimination (Table 3). Neither the SMRs (p = 0.208) nor the AUC-ROCs (p = 0.235) were significantly different across age groups.

Table 3 Calibration and Discrimination of the PIM 2 score by age group

Analysis by diagnostic group

The SMR varied substantially across diagnostic groups, and the difference was statistically significant (p = 0.004, Table 4). The overestimation of mortality by the PIM 2 score was especially high for the respiratory disease and miscellaneous groups: the predicted mortality was approximately 4 times the observed mortality. In contrast, for sepsis and cardiorespiratory arrest, the predicted mortality was slightly lower than the observed mortality. The discrimination ability of the PIM 2 score also varied across diagnostic groups (p < 0.001). The poorest discrimination was observed in patients admitted for oncologic diseases (AUC-ROC = 0.74, 95% CI 0.26 to 1.00) and the best in patients admitted for injury (AUC-ROC = 0.98, 95% CI 0.97 to 0.99).

Table 4 Calibration and discrimination of the PIM 2 score by admission diagnostic groups and operative status

Discussion

Detailed prognostic information is extremely important for patients and physicians, and for quality-of-care assessment and benchmarking. Accurate estimation of the severity of illness might also contribute to the appropriate allocation of PICU resources and to admission triage. Predicted risks can be used to evaluate the outcome of one institution compared with others either directly, by comparing risk-adjusted outcomes between institutions, or indirectly, by comparing outcomes for the single institution against those predicted by the model [10]. A well-performing model also helps to compare a PICU’s current performance with its past performance.

PIM2 was last recalibrated more than 10 years ago [1]. Therefore, a reevaluation of the PIM2 score was needed in Switzerland. We assessed the performance of the PIM2 score in the whole population of patients admitted to PICUs in Switzerland. Our results indicate that the score has an adequate capacity for discriminating between survivors and non-survivors in the general population as well as in age subgroups. Discrimination was excellent in the trauma and neurological subgroups, while the PIM2 score showed a poorer but still adequate discriminative power among septic and oncologic patients. Nonetheless, predicted mortality was roughly twice the observed mortality in the whole study population (SMR = 0.47). An SMR less than 1 provided by an outdated model such as PIM 2 might be misleadingly interpreted as evidence that the quality of care is better than it actually is [11].

Calibration across diagnostic groups is also important. If a model overpredicts or underpredicts death in a specific group of patients, the performance of a PICU as assessed by the model will be biased by the number of patients admitted in that diagnostic group [12]. It is important that PICU prediction models calibrate well across diagnostic groups, as the diagnostic mix varies among units. In our study, poor calibration was observed across mortality risks and age groups, and mortality was overestimated in almost all diagnostic subgroups. This observed drift in calibration might be explained by improved patient care, changes in the mix of patients, and the timing of intensive care admission. However, in septic patients and in patients admitted after a cardiorespiratory arrest, observed mortality exceeded expected mortality by 13% and 18%, respectively. Possible explanations include variations in the thresholds for commencing and discontinuing life support in patients with particularly severe clinical conditions and the possibly inappropriate application of the model in the setting of a rapidly evolving critical illness. The fact that the PIM2 score overpredicts or underpredicts mortality according to primary diagnosis might help interpret the overall performance of PICUs still using the PIM2 score on the basis of the proportion of patients admitted in different diagnostic categories [4, 13].
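The dependence of the overall SMR on diagnostic mix can be made concrete with a small Python sketch. The group-level SMRs are rounded from figures given in the text (predicted mortality about 4 times the observed mortality for respiratory disease, observed mortality 13% above expected for sepsis). The two units and their expected death counts are purely hypothetical, and both units are assumed to perform identically within each diagnostic group.

```python
# Group-level SMRs rounded from the text; treated here as fixed properties
# of the score, identical in every unit (an assumption for illustration).
group_smr = {"respiratory": 0.25, "sepsis": 1.13}


def overall_smr(expected_by_group, group_smr):
    """Overall SMR of a unit from its expected deaths per diagnostic group."""
    observed = sum(expected_by_group[g] * group_smr[g] for g in expected_by_group)
    return observed / sum(expected_by_group.values())


# Hypothetical expected deaths per group in two units with different case mix.
unit_a = {"respiratory": 8.0, "sepsis": 2.0}
unit_b = {"respiratory": 2.0, "sepsis": 8.0}

print(f"unit A: {overall_smr(unit_a, group_smr):.3f}")
print(f"unit B: {overall_smr(unit_b, group_smr):.3f}")
```

Under these assumptions the respiratory-heavy unit appears to perform far better than the sepsis-heavy unit, although the two differ only in case mix.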

The PIM2 score underpredicts mortality among low-risk patients. Invasive procedures (e.g., arterial blood sampling) are not systematically performed in less severely ill patients. Therefore, variables such as base excess or PaO₂ might have a high percentage of missing data that need to be imputed. Missing observations are usually assigned values considered normal, potentially undermining prediction in this category of patients.

Prognostic models are periodically recalibrated by adding new variables to compensate for the deterioration of calibration that occurs as a result of changes in medical practice and case mix over time [14]. A well-calibrated model allows PICUs to compare their performance with local as well as international standards of care. Given the less than satisfactory performance of the PIM 2 score in Switzerland, the use of an updated prognostic score should probably be recommended.

A new version of the score, PIM 3, is available. PIM3 was released in 2013 in an effort to counterbalance a drift in calibration observed in two large national datasets [15]. This model, based on more recent data, provides better estimates of mortality risks among children admitted to PICUs in Australia, New Zealand, and the UK. Apart from the introduction of a quadratic transformation of systolic blood pressure (SBP), the reorganization of diagnostic categories with the introduction of the “very-high-risk diagnosis” category, and a different classification of the term “recovery post procedure”, the variables included in the PIM 3 model are substantially similar to those used in PIM2. Nonetheless, PIM 3 seems to perform better than PIM 2 and might represent a valid alternative [16]. The performance of the PIM 3 score in Switzerland should be assessed in multicenter prospective studies before its systematic implementation.

Our study has strengths. The analysis included data from a large national database that contains extensive clinical information on all patients admitted to Swiss PICUs and is subject to ongoing audits of data quality and validation procedures. The fact that the entire population of children admitted to Swiss PICUs during the study period was analyzed increases the robustness of our results. Moreover, this is one of the first studies to evaluate the performance of PIM2 in uncommon but clinically relevant diagnostic subgroups such as septic patients and patients admitted after a cardiorespiratory arrest. Although a more recent version of the PIM score is available, we believe our results might be of interest for the many PICUs that still use PIM 2 for mortality prediction [17, 18].

Our study also has limitations. One important limitation is its retrospective nature. Moreover, data were collected over 6 years (2012 to 2017), and mortality rates might have changed during the study period. Additionally, “miscellaneous” was the largest group of patients. Given its heterogeneity, the generalizability of our results for this category of patients may be limited.

Conclusion

Our results indicate that the overall performance of the PIM 2 score in a cohort of Swiss patients is poor, with adequate discrimination but poor calibration. Calibration varies significantly according to primary diagnosis. The PIM 2 score tends to underpredict the number of deaths among septic patients and in patients admitted after a cardiorespiratory arrest. The PIM 3 score might represent a potential alternative to PIM2.

Abbreviations

AUC-ROC :: area under the receiver operating characteristics curve
HL :: Hosmer-Lemeshow
MDSi :: minimal intensive care unit dataset
PICU :: pediatric intensive care unit
PIM :: pediatric index of mortality
SMR :: standardized mortality ratio

References

1. Marcin JP, Wetzel RC (2013) The parsimonious pediatric index of mortality*. Pediatr Crit Care Med 14:718–719. https://doi.org/10.1097/PCC.0b013e3182a1251b
2. Keegan MT, Gajic O, Afessa B (2011) Severity of illness scoring systems in the intensive care unit. Crit Care Med 39:163–169. https://doi.org/10.1097/CCM.0b013e3181f96f81
3. Breslow MJ, Badawi O (2012) Severity scoring in the critically ill: part 1--interpretation and accuracy of outcome prediction scoring systems. Chest 141:245–252. https://doi.org/10.1378/chest.11-0330
4. Slater A, Shann F, Pearson G, Paediatric Index of Mortality (PIM) Study Group (2003) PIM2: a revised version of the Paediatric index of mortality. Intensive Care Med 29:278–285. https://doi.org/10.1007/s00134-002-1601-2
5. Report of the Australian and New Zealand Paediatric Intensive Care Registry (2011) • Health Improvement and Innovation Resource Centre. https://www.hiirc.org.nz/page/45050/report-of-the-australian-and-new-zealand/;jsessionid=695645D8A9A3265C6074D7948D838523?p=936&tag=intensivecare&tab=851&contentType=419&section=8965. Accessed 19 Oct 2019
6. Batchelor J, Draper E, Hobson R, et al (2011) Paediatric Intensive Care Audit Network National Report for Scotland 74
7. Slater A, Shann F, McEniery J, ANICS Study Group (2003) The ANZPIC registry diagnostic codes: a system for coding reasons for admitting children to intensive care. Intensive Care Med 29:271–277. https://doi.org/10.1007/s00134-002-1600-3
8. Hosmer DW, Lemeshow S (2000) Applied logistic regression, 2nd edn. Wiley-Interscience Publication, New York
9. Morris DE, Pepe MS, Barlow WE (2010) Contrasting two frameworks for ROC analysis of ordinal ratings. Med Decis Mak 30:484–498. https://doi.org/10.1177/0272989X09357477
10. Power GS, Harrison DA (2014) Why try to predict ICU outcomes? Curr Opin Crit Care 20:544–549. https://doi.org/10.1097/MCC.0000000000000136
11. Arias López MDP, Boada N, Fernández A, Fernández AL, Ratto ME, Siaba Serrate A, Schnitzler E, Members of VALIDARPIM3 Argentine Group (2018) Performance of the pediatric index of mortality 3 score in PICUs in Argentina: a prospective, national multicenter study. Pediatr Crit Care Med 19:e653–e661. https://doi.org/10.1097/PCC.0000000000001741
12. Slater A, Shann F, ANZICS Paediatric Study Group (2004) The suitability of the pediatric index of mortality (PIM), PIM2, the pediatric risk of mortality (PRISM), and PRISM III for monitoring the quality of pediatric intensive care in Australia and New Zealand. Pediatr Crit Care Med 5:447–454. https://doi.org/10.1097/01.PCC.0000138557.31831.65
13. Murphy-Filkins R, Teres D, Lemeshow S, Hosmer DW (1996) Effect of changing patient mix on the performance of an intensive care unit severity-of-illness model: how to distinguish a general from a specialty intensive care unit. Crit Care Med 24:1968–1973. https://doi.org/10.1097/00003246-199612000-00007
14. Le Gall JR, Neumann A, Hemery F et al (2005) Mortality prediction using SAPS II: an update for French intensive care units. Crit Care 9:R645–R652. https://doi.org/10.1186/cc3821
15. Straney L, Clements A, Parslow RC, Pearson G, Shann F, Alexander J, Slater A, ANZICS Paediatric Study Group and the Paediatric Intensive Care Audit Network (2013) Paediatric index of mortality 3: an updated model for predicting mortality in pediatric intensive care*. Pediatr Crit Care Med 14:673–681. https://doi.org/10.1097/PCC.0b013e31829760cf
16. Wolfler A, Osello R, Gualino J, Calderini E, Vigna G, Santuz P, Amigoni A, Savron F, Caramelli F, Rossetti E, Cecchetti C, Corbari M, Piastra M, Testa R, Coffaro G, Stancanelli G, Gitto E, Amato R, Prinelli F, Salvo I, Pediatric Intensive Therapy Network (TIPNet) Study Group (2016) The importance of mortality risk assessment: validation of the pediatric index of mortality 3 score. Pediatr Crit Care Med 17:251–256. https://doi.org/10.1097/PCC.0000000000000657
17. Teheux L, Verlaat CW, Lemson J, Draaisma JMT, Fuijkschot J (2019) Risk stratification to improve pediatric early warning systems: it is all about the context. Eur J Pediatr 178:1589–1596. https://doi.org/10.1007/s00431-019-03446-0
18. Ishihara T, Tanaka H (2019) Causes of death in critically ill paediatric patients in Japan: a retrospective multicentre cohort study. BMJ Paediatr Open 3:e000499. https://doi.org/10.1136/bmjpo-2019-000499

Author information

Authors and Affiliations

Pediatric and Neonatal Intensive Care Unit, Department of Pediatrics, University Hospital of Geneva, 6 rue Willy Donzé, CH-1211, Geneva, Switzerland
Angelo Polito, Yann Levy-Jamet & Peter Rimensberger
Division of Clinical Epidemiology, Faculty of Medicine, University of Geneva, and Geneva University Hospitals, 6 rue Gabrielle-Perret-Gentil, CH-1211, Geneva, Switzerland
Caroline Giacobino & Christophe Combescure

Consortia

Swiss Society of Intensive Care Medicine

Contributions

AP: Conceptualisation, methodology, validation, writing, reviewing and editing. CG: Methodology, validation, writing, reviewing and editing. CC: Methodology, validation, writing, reviewing and editing. YLJ: Data curation, project and software administration, validation. PR: Conceptualisation, methodology, resources and supervision, validation, writing, reviewing and editing.

Corresponding author

Correspondence to Angelo Polito.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

The study was approved by the Ethical Commission of Northwestern Switzerland (EKNZ UBE-15/47) and the Scientific Committee of the Swiss Society for Intensive Care Medicine with a waiver for informed consent.

Additional information

Communicated by Piet Leroy

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Polito, A., Giacobino, C., Combescure, C. et al. Overall and subgroup specific performance of the pediatric index of mortality 2 score in Switzerland: a national multicenter study. Eur J Pediatr 179, 1515–1521 (2020). https://doi.org/10.1007/s00431-020-03639-y

Received: 21 October 2019
Revised: 19 February 2020
Accepted: 18 March 2020
Published: 01 April 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s00431-020-03639-y
