COVID-19, caused by SARS-CoV-2, continues to cause significant worldwide morbidity and mortality. Since the beginning of the pandemic, over 133 million cases have been documented worldwide, with more than 1 million cases in Canada.1 In Canada alone, over 23,000 individuals have died of COVID-19.2 Data from the province of Ontario indicate that 13% of identified cases have been hospitalized and 3% have been admitted to the intensive care unit (ICU),3 although identified cases likely represent a small fraction of total infections. At the time of writing, 1,397 patients with COVID-19 have been hospitalized in Ontario, of whom 504 (35%) were admitted to an ICU and 320 (23%) received ventilation.4 Historically, Ontario’s ICU capacity has been limited to 1,122 beds capable of accommodating patients requiring mechanical ventilation.5 Currently, emerging variants of concern are placing increasing strains on health resources,1,3 underscoring the need for evidence-informed tools to guide decision-making. Such tools can be used to understand differences in population risk, help with local resource planning, guide data-driven discussions with patients and families regarding prognosis, allow comparisons of patient illness severity and outcomes across different critical care units, and compare baseline risk between treatment groups in clinical studies.

Existing COVID-19 risk prediction algorithms, including novel deep learning models, have proven useful for forecasting progression to critical illness at the time of hospital admission or during earlier phases of illness.6,7,8,9,10 Nevertheless, models predicting mortality have largely focused on all hospitalized patients rather than the subset of patients who are critically ill.10,11,12,13,14,15,16,17 While prognostic scoring systems such as the Sequential Organ Failure Assessment (SOFA) score have been widely applied in ICU patients, recent work has suggested lower than anticipated accuracy for the prediction of mortality in COVID-19 patients requiring ventilation.18 Additionally, the utility of this score in patients transferred from other ICUs is unclear. Importantly, to our knowledge there are no tools that allow for estimation of mortality in ventilated patients with COVID-19 dynamically (i.e., using clinical data from after the time of admission), which is essential to account for changes in a patient’s clinical condition. The objective of this study was to create a dynamic risk prediction model for mortality in ventilated COVID-19 patients, applied every three days over the initial 15 days of ICU admission. The advantage of this approach would be the ability to continually reassess risk as new patient information is obtained during the course of treatment.

Methods

Data sources and collection

This study was approved by the University Health Network and Sinai Health System research ethics boards (20-5378.1 and 20-0115-C) and follows the Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) guidelines.19 Written informed consent was waived given the retrospective design and the use of de-identified data. The records of patients admitted to intensive care units (ICUs) at Sinai Health System, Toronto Western Hospital, and Toronto General Hospital between 1 March 2020 and 15 December 2020 were reviewed. Adult patients (≥ 18 yr) with polymerase chain reaction-confirmed SARS-CoV-2 infection (Seegene Allplex™ 2019 n-CoV assay; Seegene Inc., Seoul, South Korea) receiving invasive mechanical ventilation at any point during their ICU stay were eligible for inclusion. Patients who received only high-flow nasal cannula or non-invasive ventilation (e.g., continuous positive airway pressure or bilevel positive airway pressure) were not eligible. A team of experienced clinical data abstractors assessed electronic and paper charts. Data were collected and stored on a centralized electronic database.

Selection of potential predictive variables

Candidate predictors (i.e., laboratory, demographic, and clinical parameters) were selected from existing models and published literature6,7,9,11,12,20,21 and included age, sex, Acute Physiology and Chronic Health Evaluation II (APACHE II) score at the time of ICU admission,22 respiratory rate, oxygen requirements and saturation, temperature at admission, white blood cell count and lymphopenia, end-organ dysfunction (including acute kidney injury and evidence of hepatic injury through elevations of hepatic enzymes), and preadmission patient comorbidities (e.g., hypertension, diabetes, coronary artery disease, current smoker, pulmonary disease). Data on biomarkers such as albumin, procalcitonin, ferritin, d-dimer, C-reactive protein (CRP), lactate dehydrogenase, and erythrocyte sedimentation rate were collected where available.23 All data were collected at the time of ICU admission (day 0) and at three-day intervals thereafter (days 3, 6, 9, 12, and 15) until the patient was discharged from ICU.

Outcomes

The primary outcome was the risk of mortality from day 0 until day 15 and beyond, assessed dynamically every three days. Secondary outcomes included the median duration of survival, ICU stay, and mechanical ventilation from day 0 and the median duration of hospital stay from the time of intubation. All outcomes were collected up to 30 days, except for overall mortality, which was collected up until study termination on 15 December 2020.

Statistical analysis

Sixty-one variables were used in the initial univariable screening process (Electronic Supplementary Material [ESM] eTable 1), specifying a preset alpha of 0.25.24,25 Two variables, smoking status and alcohol use, were not reliably documented and were excluded from further analysis. Biomarkers not routinely collected as part of patient care (lactate dehydrogenase, ferritin, CRP, and d-dimer) were excluded from the final model because of high rates of missing data. Otherwise, missing data were not common (≤ 10% for any variable) and were addressed using multiple imputation approaches and sensitivity analyses to test the robustness of the study results.

Generalized estimating equations (GEE), which adjust for clustering on the patient, were used to determine the final set of risk factors for inclusion into the model analyzed as repeated measures over time.26,27 The likelihood ratio test was used in a backwards elimination process (P < 0.05 to retain) to select risk factors for retention into the model. As this was an event-driven model, we utilized the recommended ratio of 10:1 for the number of events per included variable.28 The goodness of fit of the final model was assessed with the Hosmer-Lemeshow test. We evaluated model calibration by estimating a smooth calibration line between the observed and predicted outcomes.29 Nonparametric bootstrapping was applied to test the internal validity of the final prediction model.29,30 Resampled data (1,000 iterations) were used to generate bootstrap estimates of the regression coefficients. The confidence intervals of the regression coefficient estimates from the bootstrap sampling were compared with the values calculated by the GEE regression analysis for each time point.
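The bootstrap step described above can be illustrated with a minimal sketch. This is not the authors' Stata code: it uses an ordinary least-squares slope as a stand-in for a GEE coefficient, a percentile confidence interval (the source does not state which interval type was used), and toy data invented for illustration. With repeated measures per patient, resampling would normally be done at the patient level rather than the row level.

```python
import random

def ols_slope(xs, ys):
    """Slope of a simple least-squares line (stand-in for a model coefficient)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def bootstrap_ci(xs, ys, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the slope, resampling rows with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:  # degenerate resample: slope undefined, skip it
            continue
        estimates.append(ols_slope(bx, by))
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper

# Toy data, invented purely for illustration (not study data)
toy_x = [30, 42, 47, 55, 58, 61, 64, 70, 73, 80]
toy_y = [1.0, 1.4, 1.3, 2.1, 1.9, 2.4, 2.2, 2.9, 2.7, 3.3]

point = ols_slope(toy_x, toy_y)
low, high = bootstrap_ci(toy_x, toy_y)
print(f"slope={point:.3f}, 95% bootstrap CI=({low:.3f}, {high:.3f})")
```

A bootstrap interval that brackets the original point estimate, as compared in the text, is the kind of agreement taken to support internal validity.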

Our intent was to develop a mortality risk algorithm that could be applied to ventilated patients every three days. From the GEE regression outputs, the contribution of the individual risk factor to the risk of mortality was weighted with the final model coefficients. The coefficients were transformed by multiplying each by a constant and then rounding to the nearest unit value. A summary mortality risk score was assigned to each patient by adding the transformed coefficient values (points) for each risk factor they possessed.
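The coefficient-to-points transformation can be sketched as follows. The coefficient values, the binary risk-factor names, and the scaling constant of 10 are all hypothetical, since the source does not report the constant here and several of the published predictors are continuous. The sketch only shows the mechanics of multiplying, rounding, and summing.

```python
def to_points(coefficients, scale):
    """Multiply each regression coefficient by a constant and round to whole points."""
    return {name: round(beta * scale) for name, beta in coefficients.items()}

def risk_score(points, patient_factors):
    """Sum the points for every risk factor the patient has."""
    return sum(points[name] for name in patient_factors)

# Hypothetical coefficients (log-odds) for illustration only; not the published values
hypothetical_betas = {
    "vasopressor_use": 0.92,
    "lactate_elevated": 1.37,
    "age_over_65": 0.58,
}
SCALE = 10  # assumed constant; the one used in the study is not given in this section

points = to_points(hypothetical_betas, SCALE)
score = risk_score(points, ["vasopressor_use", "age_over_65"])
print(points, score)
```

Rounding to whole points trades a small loss of precision for a score that can be tallied at the bedside.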

The predictive accuracy of the final risk scoring algorithm was assessed by the specificity, sensitivity, and area under the receiver operating characteristic (ROC) curve.31,32 A cut point maximizing sensitivity and specificity was identified. Patients with risk scores above this threshold were classified as “high risk” for mortality over the next three-day period.
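One common way to pick a cut point that balances sensitivity and specificity is to maximize the Youden index (sensitivity + specificity − 1). The source does not name the criterion it used, so the sketch below is an assumption about the approach, run on toy scores and outcomes invented for illustration.

```python
def sensitivity_specificity(scores, died, threshold):
    """Classify score > threshold as high risk and compare with the observed outcome."""
    tp = sum(1 for s, d in zip(scores, died) if s > threshold and d)
    fn = sum(1 for s, d in zip(scores, died) if s <= threshold and d)
    tn = sum(1 for s, d in zip(scores, died) if s <= threshold and not d)
    fp = sum(1 for s, d in zip(scores, died) if s > threshold and not d)
    return tp / (tp + fn), tn / (tn + fp)

def best_cut_point(scores, died):
    """Return (threshold, youden, sensitivity, specificity) with the largest Youden index."""
    best = None
    for threshold in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, died, threshold)
        youden = sens + spec - 1
        if best is None or youden > best[1]:
            best = (threshold, youden, sens, spec)
    return best

# Toy scores and outcomes (1 = died), invented for illustration; not study data
toy_scores = [22, 35, 41, 48, 52, 57, 61, 66, 72, 85]
toy_died = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]

threshold, youden, sens, spec = best_cut_point(toy_scores, toy_died)
print(f"cut point > {threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example the selected rule is "score above 61", which misses one of four deaths but flags no survivors. A clinician with a lower tolerance for missed deaths could choose a lower threshold instead.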

Time-to-event outcomes (duration of overall survival, mechanical ventilation, ICU stay, and hospitalization) were evaluated using the Kaplan-Meier method and reported as medians with interquartile ranges [IQR]. All statistical analyses were performed using Stata, V16.0 (Stata Corp., College Station, TX, USA).
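The Kaplan-Meier product-limit estimate and the way a median (or a "not reached" quantile) arises from it can be sketched as below. The follow-up times and censoring indicators are toy values, and the function names are illustrative rather than taken from the authors' Stata analysis.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  : follow-up time per patient
    events : 1 if the event occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for (tt, e) in data if tt == t]
        deaths = sum(same_time)
        if deaths:
            # Product-limit step: multiply by the fraction surviving this time point
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= len(same_time)  # events and censored patients both leave the risk set
        i += len(same_time)
    return curve

def median_time(curve):
    """First time at which survival falls to 0.5 or below; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Toy follow-up data (days), invented for illustration; not study data
toy_times = [5, 8, 12, 12, 20, 25, 30, 30]
toy_events = [1, 0, 1, 1, 0, 1, 0, 0]

curve = kaplan_meier(toy_times, toy_events)
print(curve, median_time(curve))
```

When the survival curve never drops to the requested quantile within follow-up, the function returns `None`, which corresponds to a "not reached" bound in a reported interquartile range.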

Results

Demographic and clinical characteristics of the derivation cohort

Four hundred twenty-seven ICU patients were screened for inclusion between 1 March 2020 and 15 December 2020, with 127 meeting eligibility criteria and comprising the final analysis set. The mean (standard deviation [SD]) patient age was 58 (14) yr, and 90 (71%) were male. Most were transferred from other institutions (65%), while the remainder (35%) were internal transfers from wards or the emergency department. At day 0, 83 (65%) patients were intubated, with 28 (22%) intubated on day 1 and 16 (13%) intubated on day 2. The median [IQR] APACHE II score at day 0 was 22 [10–34]. Patients were generally ventilated in accordance with ARDSNet,33 with tidal volumes of 6–8 mL·kg⁻¹ of predicted body weight and permissive hypercapnia (ESM eTables 2 and 3). Until discharge from ICU or death, 39 (31%) patients received extracorporeal membrane oxygenation (ECMO). Demographic and clinical characteristics are shown in ESM eTable 1.

Clinical outcomes in ventilated COVID-19 patients

Overall mortality was 42% (53 patients), with 29% (37 patients) dying within the first 30 days of ICU admission (ESM eTable 2). In the 39 patients who received ECMO, overall mortality was 59% compared with 34% in the 88 patients who did not receive ECMO (P = 0.01). The median [IQR] duration of hospital stay from the day of intubation was 36.9 [19.1–58.5] days, with a median [IQR] duration of intubation of 26.6 [5.6–not reached] days and a median [IQR] duration of ICU stay of 26.9 [15.4–52] days. Median [IQR] overall survival from ICU admission was 43 [22–not reached] days (Table 1).
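As a rough consistency check on the ECMO comparison, the reported percentages can be converted back to counts and run through a 2×2 test. The counts below (23 of 39 and 30 of 88 deaths) are reconstructed from the reported 59% and 34% and sum to the reported 53 deaths. They are an inference, not figures given in the source. The source also does not state which test produced P = 0.01, so the uncorrected Pearson chi-square used here is an assumption.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square with 1 degree of freedom: P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Counts reconstructed from the reported percentages (an assumption, see the lead-in)
ecmo_died, ecmo_total = 23, 39
no_ecmo_died, no_ecmo_total = 30, 88

chi2, p_value = chi_square_2x2(
    ecmo_died, ecmo_total - ecmo_died,
    no_ecmo_died, no_ecmo_total - no_ecmo_died,
)
print(f"chi2={chi2:.2f}, p={p_value:.3f}")
```

Under these assumptions the result is of the same order as the reported P value, which suggests the percentages and group sizes in the text are mutually consistent.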

Table 1 Characteristics of intubated patients with COVID-19

Predictor selection and development of a repeated-measures model

Sixty-one predictors were identified for potential inclusion in the model (ESM eTable 1). After the initial univariable selection procedures, 12 predictors remained for inclusion in the multivariable model. Inclusion of these 12 variables in the multivariable GEE regression model resulted in five variables that remained significantly associated with mortality during the first 15 days of the ICU stay and beyond. The final five variables retained as significant predictors of total mortality on day 0 and at subsequent three-day intervals to day 15 were: age, 24-hr peak temperature, 24-hr peak lactate level, tidal volume at the time of lowest arterial partial pressure of oxygen/fraction of inspired oxygen (PaO2/FiO2) or highest peak pressure (whichever was lowest), and need for vasopressors (Table 2). Neither corticosteroid administration nor elevated CRP was associated with mortality (ESM eTable 1). The relationship between mortality risk and days spent in the ICU followed a U-shaped curve. The reference time interval was days 0–2 of ICU admission. The risk of death increased on days 3–6 after ICU admission, declined between days 9 and 12, and increased dramatically after day 15 (Table 2). Overall, the lowest risk of death was in the initial 0–2 days of admission. The confidence intervals of the regression coefficients from the bootstrap samples were similar to the values of the coefficients obtained from the GEE multivariable regression analysis, supporting the internal validity of the model.

Table 2 Clinical outcomes of intubated patients with COVID-19

Development of a scoring tool for the prediction of all-cause mortality in ventilated COVID-19 patients

A clinical risk scoring system was derived from the regression coefficients and intercepts generated from the multivariable GEE model. The model was developed to be scored at day 0 and at subsequent three-day intervals following admission (ICU day 3, 6, 9, 12, and 15). The final product was a scoring system between 0 and 100 where higher scores were associated with an increased risk of mortality over each three-day interval from day 0 (Table 3, Figure). The clinical risk scores were then used to create six risk stratification categories ranging from scores of ≤ 30 to > 70, with higher scores corresponding to a higher risk of mortality. Observed mortality rates are shown by category in Table 4. A risk score > 60 was identified as the threshold for classifying patients as at high risk of mortality over the next three-day period. Detailed results are presented in Tables 4 and 5 because the selection of a risk threshold is not a static process and can change based on a clinician’s risk tolerance. Discrimination was further characterized using ROC curve analysis. The area under the ROC curve (AUC) was high, at 0.9 (95% confidence interval [CI], 0.8 to 0.9). For each ten-point increase in the risk score, the odds of death increased approximately fourfold (odds ratio, 4.1; 95% CI, 2.9 to 5.9).
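The reported odds ratio per ten points can be rescaled to other score differences if one assumes, as a logistic-type model implies, that the log-odds of death are linear in the score. The sketch below uses only the two figures given in the text (odds ratio 4.1 per ten points; high-risk threshold > 60). The function names are illustrative, and the linearity assumption is ours.

```python
OR_PER_10_POINTS = 4.1   # reported in the text
HIGH_RISK_THRESHOLD = 60  # reported in the text: scores above 60 are "high risk"

def odds_ratio_for_difference(points_difference, or_per_10=OR_PER_10_POINTS):
    """Odds ratio implied by a score difference, assuming log-odds are linear in the score."""
    return or_per_10 ** (points_difference / 10)

def is_high_risk(score, threshold=HIGH_RISK_THRESHOLD):
    """High risk over the next three-day period: score strictly above the threshold."""
    return score > threshold

print(f"per 1 point:   {odds_ratio_for_difference(1):.3f}")
print(f"per 20 points: {odds_ratio_for_difference(20):.2f}")
print(is_high_risk(60), is_high_risk(61))
```

Under this assumption each single point multiplies the odds of death by roughly 1.15, and a 20-point difference corresponds to about a 17-fold difference in odds. These are odds ratios, not ratios of probabilities.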

Table 3 Predictive factors for mortality in intubated patients with COVID-19
Figure Relationship between patient risk score and probability of death in intubated COVID-19 patients. Area under the ROC curve = 0.9 (95% CI, 0.8 to 0.9). For every ten-point increase in the risk score, the relative increase in the risk of death is approximately four times (OR = 4.1; 95% CI, 2.9 to 5.9). CI = confidence interval; OR = odds ratio; ROC = receiver operating characteristic

Table 4 Risk score algorithm for mortality in intubated COVID-19 patients
Table 5 Detailed analysis of risk scoring system for 30-day mortality in mechanically ventilated COVID-19 patients

Discussion

The preservation of health system capacity has been a foremost priority during the COVID-19 pandemic, particularly the availability of scarce critical care beds with capacity for invasive mechanical ventilation.34 This paper describes the derivation and internal validation of an accurate, pragmatic, and dynamic clinical risk score for the prediction of all-cause mortality in ventilated COVID-19 patients over three-day intervals during the first 15 days of ICU stay, allowing for risk reassessment related to changes in disease course. The U-shaped curve we observed between mortality risk and days spent in the ICU confirms the importance of reassessing risk after admission.

Our risk score was derived from patients who either presented or were transferred to academic ICUs within one of the largest networks of teaching hospitals in Canada and who thus represent a subset of the most critically ill patients. The mortality rate in our cohort (42%) is consistent with the high case fatality rate for patients with COVID-19 receiving invasive mechanical ventilation.35 We identified older age, lower body temperature, higher lactate level, lower tidal volume, and vasopressor use as significant predictors of mortality in mechanically ventilated patients. In addition to high discriminative ability, these variables have high face validity for the prediction of mortality, our outcome of interest, increasing the utility of the model for clinicians. Older age has been well documented to be associated with adverse outcomes and higher mortality in COVID-19.7,10,11,15,16,35,36 Similarly, temperature, lactate level, and vasopressor use are known to be associated with poor outcomes in critical illness (including severe COVID-19 infection).10,16,17,37,38,39,40,41,42 Recent evidence suggests that impaired ability to regulate body temperature, particularly an inappropriately low body temperature, is a strong predictor of COVID-19 mortality.43

Approximately 90% of patients dying of COVID-19 have acute respiratory distress syndrome (ARDS), which is also the primary indication for intubation in the vast majority.2 Previously published predictive models for mortality in ARDS have found age, PaO2/FiO2, and plateau pressures to be predictive of mortality, with an AUC of 0.76 (95% CI, 0.70 to 0.81) in the derivation cohort compared with 0.63 (95% CI, 0.57 to 0.70) for the Acute Physiology and Chronic Health Evaluation (APACHE) II score.44 Our final selected variables generated a higher AUC in our derivation cohort, suggesting higher discriminative accuracy for the prediction of mortality in mechanically ventilated COVID-19 patients, the majority of whom had a diagnosis of ARDS. For the outcome of hospital mortality, the AUC for the APACHE II score for all patients in our cohort (those newly admitted to ICU as well as transfers) was 0.56 (95% CI, 0.46 to 0.66). Because the APACHE II score was designed for patients newly admitted to ICU rather than institutional transfers, we also evaluated it in the subset of our cohort who were new ICU admissions; its predictive utility remained low, with an AUC of 0.57 (95% CI, 0.40 to 0.74). Although the SOFA score has been advanced as a predictor of mortality in critically ill patients with COVID-19 and as a triage tool, its usefulness and accuracy in this population have recently been questioned, with age alone being a better predictor of outcome than the SOFA score.18 Other prognostic models developed specifically in COVID-19 patients have largely focused on static risk prediction in community or hospital in-patients, rather than dynamic risk prediction in the critically ill.45

The association of higher tidal volumes with decreased risk of mortality may be related to lower compliance with increasing severity of ARDS or differences in outcomes related to underlying heterogeneity in ARDS physiology.33,46,47 In our dataset, patients with ARDS receiving ECMO had a median [IQR] tidal volume of 305 [220–373] mL at admission to ICU compared with a larger median [IQR] tidal volume of 360 [308–400] mL in patients not receiving ECMO (P = 0.004). The median dynamic driving pressure of the group receiving ECMO was 18 cm H2O [14, 21] at the time of admission to ICU, which was nearly identical to that of the group not progressing to ECMO (17 cm H2O [12, 20]; P = 0.50). Given the known association between increases in driving pressure and adverse outcomes in ARDS,48,49 it appears that clinicians controlled driving pressure in our cohort, and thus decreases in tidal volume reflect deteriorating compliance and worsening severity of ARDS (ESM eTables 3 and 5). This is consistent with a recent study demonstrating that higher tidal volumes are associated with increased mortality when adjusted for respiratory compliance.36 Additionally, identification of high-compliance “L-type” and low-compliance “H-type” ARDS patterns in COVID-19 patients may have led to different ventilation strategies, including larger tidal volumes in patients presenting with “L-type” physiology.47

We are currently developing an app for the model, which will allow mortality risk and other outcomes to be calculated rapidly and repeatedly in real time via a smartphone or tablet computer. Future work will seek to validate this model in a larger sample size.

Limitations of our study include the absence of external validation in other datasets, which is an area of our future research. Our cohort of patients comprises the development cohort, and the accuracy of our model as assessed by the area under the ROC curve is high. Internal validation of accuracy using bootstrap resamples confirmed our findings. Nevertheless, to confirm widespread generalizability of our findings to other populations, data from patients and hospitals not included in the development cohort should be used as part of the external validation process, which will also confirm the accuracy of the model in other populations.7 While our cohort was inclusive of the majority of COVID-19 patients admitted to ICU at our institutions, on a global scale it remains a small sample with few mortality events. As such, further validation of this prognostic tool in larger datasets is required. While our multivariable model has a high area under the ROC curve, the odds ratios associated with individual risk factors should be interpreted with caution because of the possibility of residual confounding.50 Of note, since development of this risk scoring system, there has been adoption of various therapies, including tocilizumab and sarilumab, which may impact the outcomes of critically ill COVID-19 patients and were not accounted for in our model.51,52,53 For ease of calculation, we have included tidal volume not corrected for predicted body weight (PBW) as a model variable. The vast majority of patients in our cohort were ventilated using the ARDSNet strategy of 6–8 mL·kg⁻¹ of PBW, considered standard of care in patients with COVID-19.33,54 Correlation between the absolute tidal volume and the tidal volume expressed per kg of PBW was high in our sample for all days analyzed (Pearson rho ≥ 0.90; P < 0.001); nevertheless, our model may be less accurate for patients not ventilated according to ARDSNet strategies (ESM eTable 6). Lastly, while mortality is a highly important outcome for patients, other patient-centred outcomes such as long-term sequelae and quality of life should also be considered.55
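For readers who want to relate the uncorrected tidal volume used in the model to the per-kilogram value discussed in the limitations, the conversion can be sketched with the standard ARDSNet predicted body weight formula. The source does not restate that formula, so it is supplied here as background. The patient height in the example is hypothetical, and 360 mL is the median tidal volume reported for patients not receiving ECMO.

```python
def predicted_body_weight_kg(height_cm, sex):
    """ARDSNet predicted body weight (kg) from height (cm) and sex."""
    if sex == "male":
        base = 50.0
    elif sex == "female":
        base = 45.5
    else:
        raise ValueError("sex must be 'male' or 'female'")
    return base + 0.91 * (height_cm - 152.4)

def tidal_volume_per_kg(tidal_volume_ml, height_cm, sex):
    """Tidal volume expressed in mL per kg of predicted body weight."""
    return tidal_volume_ml / predicted_body_weight_kg(height_cm, sex)

# Hypothetical patient: 175 cm male ventilated at 360 mL
pbw = predicted_body_weight_kg(175, "male")
vt_per_kg = tidal_volume_per_kg(360, 175, "male")
print(f"PBW={pbw:.1f} kg, tidal volume={vt_per_kg:.1f} mL/kg PBW")
```

Because predicted body weight depends only on height and sex, absolute and per-kilogram tidal volumes move together within a cohort ventilated to a common target, which is in line with the high correlation reported above.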

Conclusion

In this analysis, we derived and internally validated a pragmatic, dynamic, clinical score that facilitates accurate prediction of mortality in ventilated patients with COVID-19. This tool may provide additional prognostic information allowing improved local resource planning and more accurate comparisons across patients treated with different therapies or in different ICUs. Future work is required to validate our results in other cohorts.