Introduction

Multiple organ dysfunction syndrome (MODS) is an evolving clinical syndrome triggered by various stimuli. It is the main cause of morbidity and mortality in patients admitted to intensive care units (ICU) [1]. Many severity indices have been used to assess illness severity, including the Acute Physiology and Chronic Health Evaluation (APACHE) [2], the Simplified Acute Physiology Score (SAPS) [3], and the Mortality Prediction Model (MPM) [4]. These indices enable calculation of the standardized hospital mortality, i.e., the ratio between observed mortality and expected mortality. However, these prognostic prediction models do not take the patient’s course during the ICU stay into account, which limits their usefulness for decision making later in the course of the disease.

Several instruments have been developed in recent years to quantify the severity of MODS, in particular the Multiple Organ Dysfunction Score [5], the Logistic Organ Dysfunction Score (LODS) [6], and the Sequential Organ Failure Assessment (SOFA) [7]. All treat multiple organ dysfunction as a continuous and dynamic phenomenon and quantify the dysfunction of individual organs. SOFA and the other indices are not calculated on admission but rather at the onset of MODS. SOFA scores above 15 are associated with a mortality rate of over 90% [7, 8]. When patients have MODS for several consecutive days and more than three organs are affected, mortality is very high [5, 9]. While prognosis prediction models contribute significantly to defining patient cohorts, they are not particularly useful for decision making in individual cases. The practice of withholding, withdrawing, or limiting life support in critically ill patients has increased in the past few years [10]. Recent studies have reported some type of life support limitation in 4–13% of patients admitted to ICUs and in 70–90% of those who died in the ICU [11, 12, 13, 14].

The Bioethics Working Group of the Spanish Society of Intensive Medicine and Coronary Units (SEMICYUC) set the following goals in this study (presented in abstract form [14]): (a) to ascertain the incidence and mortality rate of MODS in Spanish ICUs, (b) to evaluate the use of life support limitation measures in MODS patients, and (c) to determine whether daily SOFA measurement can assist decision making in such cases.

Material and methods

A prospective, observational, multicenter study was performed in 75 ICUs in Spain (approx. one-third of all Spanish ICUs) and 4 ICUs in Latin America. These ICUs have a total of 800 beds. This study was approved by the Ethics Committee of the Hospital de Barcelona, Spain, and informed consent was waived in each participating ICU due to the epidemiological and noninterventional nature of the study. A total of 7,615 patients were admitted to the ICUs during February and March 2001, and 1,340 of these (17.6%) had MODS; 16 patients were excluded because they did not meet the study’s inclusion criteria. The mean age of the study population was 62.2 years (60% men, 40% women). Reasons for admission of MODS patients are presented in Table 1. The mean length of stay was 5.4±2.2 days for all patients and 13.2±13.65 days for patients with MODS.

Table 1 Reason for the admission of MODS patients (COPD chronic obstructive pulmonary disease)

The clinical diagnosis of MODS was established on the basis of the SOFA score, with a minimum score of 3 and more than two organs affected. Patients admitted to the ICU for less than 24 h and those scheduled for surgery requiring less than 2 days of hospitalization were excluded. Daily total SOFA scores were recorded as described by Vincent and coworkers [7]. The SOFA score is composed of scores from six organ systems, each graded from 0 to 4 according to the degree of dysfunction/failure. The organ systems considered in the SOFA score are: respiratory (PO2/FIO2), cardiovascular (blood pressure, vasoactive drugs), renal (creatinine and diuresis), hematological (platelet count), neurological (Glasgow Coma Score), and liver (bilirubin) [7]. Physicians were not blinded to the SOFA score when decisions on limiting life support were made. These data were used to calculate the trend, minimum, maximum, and average values during the period in which patients presented MODS. The mean SOFA was calculated as the mean of the SOFA scores recorded each day during multiple organ dysfunction. The SOFA trend was recorded every day during multiple organ dysfunction as follows: a value of 0 was given when the SOFA score was unchanged from the previous day, +1 when the SOFA score was higher than on the previous day, and −1 when it was lower. The overall trend was obtained by adding these daily values and categorizing the sum as a positive trend (positive sum), unchanged trend (zero sum), or negative trend (negative sum). The trend in SOFA was therefore calculated independently of the magnitude of change.
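The trend rule described above can be sketched in Python as follows. This is an illustration only, not the software used in the study; the function name `summarize_sofa` and the sample scores are hypothetical.

```python
from statistics import mean


def summarize_sofa(daily_scores):
    """Summarize daily total SOFA scores recorded during MODS.

    Returns the minimum, maximum, mean, and the overall trend category,
    which depends only on the direction of each day-to-day change.
    """
    if not daily_scores:
        raise ValueError("at least one daily SOFA score is required")

    # +1 if the score rose versus the previous day, -1 if it fell, 0 if unchanged
    signs = [
        (today > yesterday) - (today < yesterday)
        for yesterday, today in zip(daily_scores, daily_scores[1:])
    ]
    total = sum(signs)

    if total > 0:
        trend = "positive"
    elif total < 0:
        trend = "negative"
    else:
        trend = "unchanged"

    return {
        "min": min(daily_scores),
        "max": max(daily_scores),
        "mean": mean(daily_scores),
        "trend_sum": total,
        "trend": trend,
    }


# Hypothetical patient: five consecutive daily SOFA scores
summary = summarize_sofa([8, 10, 10, 14, 11])
print(summary)
```

In this invented example the rise from 10 to 14 and the fall from 14 to 11 cancel each other, because only the direction of each change is counted.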

All incidents of life support limitation were also recorded. Decisions on limiting life support were implemented at the discretion of the attending physicians. These were orders to withdraw or withhold therapy (renal dialysis, vasoactive drugs, high FIO2, mechanical ventilation, artificial nutrition), and/or not to resuscitate. A do-not-resuscitate order means an explicit written instruction in the patient’s clinical record. No performance means that resuscitation was not attempted in a patient without a written statement in his/her clinical record. High FIO2 means breathing through a reservoir mask at high oxygen concentrations, and mechanical ventilation refers only to intubation plus mechanical ventilation.

A specific form was designed for collecting the data. These forms were processed automatically to recover all the items using Teleform 7.0 Elite (available at: http://www.cardiff.com). The first 100 forms were also double-checked manually by independent investigators for quality control. Logistic regression was used to analyze the predictive power of the variables with respect to mortality. We used the forward method to enter variables into the model. Variables were removed from the model based on the significance of the change in the likelihood ratio. In the preliminary analysis variables were considered in their original continuous form. In the subsequent analysis such variables were treated as categorical (e.g., maximum SOFA score as a continuous value, or as a maximum SOFA score <10 vs. ≥10). The following formula gives the likelihood of in-hospital death under the continuous-variable model:

$$ \mathrm{prob} = \frac{1}{1 + e^{-\left( -3.189 + \left( 0.034\,\mathrm{AGE} \right) + \left( -1.491\,\mathrm{TREND} \right) + \left( 0.299\,\mathrm{SOFA}_{\min} \right) \right)}} $$
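As a worked illustration, the formula above can be evaluated in Python. The coefficients are those given in the formula; the function name and the example inputs are hypothetical, and because the numeric coding of TREND is not specified in the text, the value is passed through exactly as supplied by the caller.

```python
import math


def death_probability(age, trend, sofa_min):
    """Evaluate the continuous-variable logistic model for in-hospital death.

    age      -- age in years
    trend    -- SOFA trend term; its numeric coding is an assumption of the caller
    sofa_min -- minimum SOFA score during MODS
    """
    linear = -3.189 + 0.034 * age - 1.491 * trend + 0.299 * sofa_min
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical inputs chosen only to exercise the formula
p = death_probability(age=70, trend=0, sofa_min=10)
print(f"{p:.3f}")
```

With these invented inputs the linear term is −3.189 + 2.38 + 0 + 2.99 = 2.181, which the logistic function maps to a probability between 0 and 1.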

Variables with a p value of 0.05 or less were included in the model. From the logistic regression model we constructed receiver operating characteristic curves, and their discriminant power, together with confidence intervals, was calculated from the area under the curve. We extracted the sensitivity and specificity from these curves, and different models were assessed using the categorical variables with the aim of identifying those that showed a specificity close to 100% while maintaining a sensitivity as high as possible. In the overall test the two-tailed level of significance was fixed at 5% (α=0.05). Analyses were performed with the SPSS statistical software program (version 10.0).
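A minimal sketch of how discrimination, sensitivity, and specificity can be computed from predicted probabilities and observed outcomes is given here. The study itself used SPSS; this pure-Python version is only an illustration, the data are invented, and the function names are hypothetical. The area under the curve is computed with the rank-based (Mann-Whitney) formulation rather than by plotting the curve.

```python
def roc_auc(probabilities, died):
    """Area under the ROC curve: the probability that a randomly chosen
    nonsurvivor has a higher predicted risk than a randomly chosen survivor
    (ties count as one half)."""
    pos = [p for p, d in zip(probabilities, died) if d]
    neg = [p for p, d in zip(probabilities, died) if not d]
    if not pos or not neg:
        raise ValueError("both outcomes must be present")
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(probabilities, died, threshold):
    """Sensitivity and specificity when death is predicted for risk >= threshold."""
    pairs = list(zip(probabilities, died))
    tp = sum(1 for p, d in pairs if d and p >= threshold)
    fn = sum(1 for p, d in pairs if d and p < threshold)
    tn = sum(1 for p, d in pairs if not d and p < threshold)
    fp = sum(1 for p, d in pairs if not d and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Invented predicted risks and outcomes (True = died in hospital)
risk = [0.95, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
died = [True, True, False, True, True, False, False, False]

auc = roc_auc(risk, died)
sens, spec = sensitivity_specificity(risk, died, threshold=0.90)
print(auc, sens, spec)
```

Raising the threshold trades sensitivity for specificity, which mirrors the stated aim of keeping specificity close to 100% while retaining as much sensitivity as possible.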

Results

ICU mortality among MODS patients was 37.3% and hospital mortality 44.6%. Some type of life support limitation was applied in 70.6% of the MODS patients who died. The percentage of life support limitation in survivors was 18%. As shown in Table 2, the most frequent life support limitation instructions were not to resuscitate (54.2%) and to withhold renal dialysis (36.1%). When life support limitation and mortality were used as a dependent and independent variable, respectively, the model yielded an area under the receiver operating characteristic curve of 0.804 (p<0.001, 95% CI 0.779–0.829). The model that gave the highest area under the receiver operating characteristic curve (0.807, p<0.001, 95% CI 0.784–0.830) considered the noncategorical variables minimum SOFA and age and the categorical variable trend during the first 5 days (Fig. 1). Mortality was 100% in patients with a maximum SOFA above 13, a minimum SOFA above 10, age over 60 years, and a positive trend during the first 5 days with MODS (n=34). The models that treat the four variables maximum SOFA, minimum SOFA, trend, and age as categorical variables simplify calculation and facilitate their clinical application. Table 3 presents the specificity, sensitivity, positive predictive value, and negative predictive value of these models, and Table 4 the area under the corresponding receiver operating characteristic curve.
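The categorical rule and its predictive values can be sketched as follows. The cutoffs follow the wording of the paragraph above (Table 3 lists the minimum SOFA cutoff as ≥10, so the strict comparison used here is an assumption); the function names and the patient records are hypothetical and invented for illustration.

```python
def meets_criteria(max_sofa, min_sofa, age, trend_sum):
    """True when a patient satisfies all four categorical criteria."""
    return max_sofa > 13 and min_sofa > 10 and age > 60 and trend_sum > 0


def predictive_values(flagged, died):
    """Positive and negative predictive values of a yes/no rule."""
    pairs = list(zip(flagged, died))
    tp = sum(1 for f, d in pairs if f and d)
    fp = sum(1 for f, d in pairs if f and not d)
    tn = sum(1 for f, d in pairs if not f and not d)
    fn = sum(1 for f, d in pairs if not f and d)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Invented patients: (max SOFA, min SOFA, age, trend sum, died in hospital)
patients = [
    (15, 11, 72, 2, True),    # meets all criteria
    (14, 11, 65, 1, True),    # meets all criteria
    (14, 9, 70, 1, True),     # minimum SOFA too low
    (12, 11, 80, 1, False),   # maximum SOFA too low
    (15, 12, 55, 1, False),   # too young
    (16, 11, 66, -1, False),  # negative trend
]

flagged = [meets_criteria(*p[:4]) for p in patients]
died = [p[4] for p in patients]
ppv, npv = predictive_values(flagged, died)
print(flagged, ppv, npv)
```

A rule of this kind is deliberately strict: it flags few patients, so its positive predictive value can be high while many deaths occur among patients it does not flag.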

Table 2 Percentage of limitation of life support measures in MODS patients who died
Fig. 1 SOFA maximum, minimum, and trend and age (0.807; p<0.001, 95% CI 0.784–0.830) in predicting hospital mortality

Table 3 Categorical variables and their clinical applications (percentages) (model 1 maximum SOFA >13, minimum SOFA ≥10, positive SOFA trend, age >60 years; model 2 maximum SOFA >10, minimum SOFA ≥10, positive SOFA trend, age >60 years; PV predictive value, parentheses 95% confidence interval)
Table 4 The area under the corresponding receiver operating characteristic curve (ROC AUC) for each of these models (model 1 continuous variables: minimum SOFA, positive SOFA trend over 5 days, age, maximum SOFA; model 2 categorical variables: maximum SOFA >13, minimum SOFA >10, positive SOFA trend over 5 days, age >60 years; model 3 categorical variables: maximum SOFA >10, minimum SOFA >10, positive SOFA trend over 5 days, age >60 years; parentheses 95% confidence interval)

Discussion

Caring for patients with MODS is a significant challenge in every ICU. MODS patients admitted to an ICU require closer monitoring and consume a greater number of human and material resources than other patients. Limitation of life support is frequent in MODS patients, especially in those who die. Predictive indices such as the SOFA score, together with other variables, may be useful for making bedside clinical decisions regarding life support limitation in MODS patients.

In our study the incidence of MODS was 17.6%, with an in-hospital mortality rate of approx. 45%. MODS is the main cause of morbidity and mortality in patients admitted to ICUs, and it has been calculated to account for 80% of ICU deaths [1, 15]. In the United States MODS costs exceed $100,000 per patient [16, 17]. In Spain patients who develop MODS consume a high proportion of resources and also account for a higher proportion of those who die and of those who survive with a worse quality of life [18].

Life support limitation in MODS patients is a complex issue. One survey of ICU health personnel in Spain reported that 72% of respondents found differences between withholding and withdrawing life support [19]. Nevertheless, from an ethical viewpoint the decision to withdraw life support is not essentially different from the decision to withhold it, and this has been reported in multiple studies [20, 21, 22, 23, 24]. In one recent investigation performed in six Spanish teaching hospitals life support limitation was applied in 7% of all patients admitted to ICUs and in 34% of those patients who died. In this study life support limitation was applied in 36% of patients with sepsis or MODS [12]. In our study some type of life support limitation was applied in 70.6% of MODS patients. The difference between the two studies is, firstly, that our investigation was considerably larger. Secondly, life support limitation in our study included orders not to resuscitate as well as the nonperformance of cardiopulmonary resuscitation maneuvers. The decision to limit or withdraw treatment is based on issues such as quality of life, age, and futility of treatment, and although the severity scoring systems cannot be used as the basis for individual decision making, they provide another element that can be taken into account [25]. The issue of futility in MODS is confounded by studies performed more than 10 years ago when it was considered futile to continue life support if four or more organs were affected for more than 3 days [5, 26, 27]. Older definitions, however, grade organ failure as a dichotomous variable (yes/no), while modern definitions, such as the SOFA, provide a scaling system for each organ system failure.

The knowledge generated from predictive indices has probably influenced clinical decision making related to life-support limitation. Since Knaus et al. [9] published their index for quantifying multiple organ failure in 1985, defined as a binary phenomenon (present/absent), various scores have been proposed that enable organ dysfunction to be rated as a continuous dynamic function over time. SOFA, originally referred to as the “sepsis-related organ failure assessment score,” was developed in 1994 by a panel of experts from the European Society of Intensive Care Medicine, based on a review of the literature. The SOFA index has proven a useful tool for describing and quantifying organ dysfunction/failure in nonselected critically ill patients [7] and in those suffering trauma [27], renal failure [28], or cardiovascular disorders [29] and has proven superior in prognostic value to SAPS II [30]. Moreno et al. [30] showed that initial SOFA, ΔSOFA (trend), and maximum total SOFA are related to outcome. A recent study by Ferreira et al. [8] in 352 patients showed that SOFA, particularly mean SOFA and the highest score during ICU stay, is a good indicator of outcome. They reported that an increase in SOFA score during the first 48 h after admission to the ICU was associated with a mortality rate above 50%, irrespective of the initial score, and above 91% when the baseline SOFA was greater than 11. Janssens et al. [29] reported the maximum total SOFA to be higher in nonsurvivors, with a value of 7.7 in patients who died vs. 2.5 in survivors (p<0.01). These studies show that SOFA, particularly certain measures obtained from it, provides a good estimate of outcome and response to treatment in critically ill patients with organ dysfunction.

Although the initial goal of most current scores used in MODS patients was to assess the severity of the organ dysfunction, such scales have also been shown to be useful as prognostic indices. Pettila et al. [31] reported that the Multiple Organ Dysfunction Score, LODS, and SOFA all have excellent discriminating power (ability to distinguish between patients who die and those who survive). However, organ dysfunction scores such as SOFA were not designed to be prognostic, and they are not calibrated on the basis of their prognostic capacity. That has been the domain of dedicated prognostic scores such as APACHE, SAPS, and MPM.

Our study sought to ascertain which factors are associated with a mortality rate of 100% in MODS patients. If objective criteria such as SOFA scores and their course were aggregated, they could determine when life support would be futile and thus provide a basis for deciding whether to withhold or withdraw life support. According to the model derived from our results, mortality is 100% in all patients with age over 60 years, total maximum SOFA greater than 13 on any of the first 5 days, minimum SOFA greater than 10 at all times, and a positive or unchanged SOFA trend. For daily clinical practice it is easy to remember that patients with a SOFA score above 10, age over 60 years, and a positive or unchanged 5-day trend have a mortality rate of 100%. It has been suggested that with an expected mortality rate of 98% or above, treatment can be considered futile [32]. However, our estimate of 100% mortality was developed from 34 patients, giving a relatively wide 95% confidence interval (89.7–100%). In terms of decision making in the ICU our findings are limited because the performance of the model was good only for a small group of very severe patients.
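The lower confidence bound quoted above can be checked with the exact (Clopper-Pearson) binomial interval, which has a closed form when every patient in the group dies. This is a sketch under the assumption that an exact method was used; the text does not name the method, and the function name is hypothetical.

```python
def exact_ci_all_events(n, confidence=0.95):
    """Two-sided exact binomial CI for an observed proportion of n/n.

    When all n patients have the event, the upper bound is 1 and the
    lower bound solves p**n = alpha / 2.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    alpha = 1.0 - confidence
    lower = (alpha / 2.0) ** (1.0 / n)
    return lower, 1.0


lower, upper = exact_ci_all_events(34)
print(f"{lower * 100:.1f}% to {upper * 100:.0f}%")  # 89.7% to 100%
```

With 34 of 34 deaths the lower bound is 0.025 raised to the power 1/34, which agrees with the 89.7% reported, and it shows why an observed 100% mortality in a small group does not by itself exclude a true mortality below the 98% futility threshold.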

Although applicable to only a very selected population, our findings may improve end-of-life care by helping caregivers, and perhaps patients’ surrogates, to make responsible decisions based on more objective parameters [33, 34]. Instead of using an isolated marker at admission, our study combined the initial SOFA with the course of SOFA measurements over time in patients who remain unresponsive despite receiving appropriate treatment for several days.

In our opinion, maintaining life support in cases where treatment is futile simply prolongs suffering and can be considered maleficent. However, defining futility in a reproducible and objective manner is very challenging. Lastly, when resources (the number of intensive care beds) are limited, having beds occupied by patients who unfortunately have no chance of recovery is ethically questionable.

The strengths of this study include: (a) the large number of centers enrolling patients and the 2-month cohort design; (b) the use of organ dysfunction, which is thought to contribute to decisions to withdraw life support; (c) use of the SOFA score, which is probably the most widely used organ dysfunction score and whose properties have been well established in several studies; and (d) the large database of 7,615 patients and the generalizability that this affords. On the other hand, the limitations of the study are: (a) The life support limitation applied to each patient was carried out in accordance with each center’s and physician’s criteria. Decision making around treatment limitations is very complex and is influenced by many social, cultural, religious, and other factors, not merely medical considerations; these additional issues were not considered in this study. (b) Another limitation is the validation of this model. In addition to using the daily SOFA score, our model also takes into account patient age and SOFA trend over a period which, in this case, was 5 days. (c) Since practice varies so widely in end-of-life decision making between countries, the generalizability of these findings needs to be addressed [35, 36]. (d) The area under the receiver operating characteristic curve of 0.8 is not particularly strong, and the variables incorporated in our model are not independent. (e) The impact of withdrawing or withholding support based on SOFA score is difficult to ascertain since some decisions can increase SOFA (withdraw dialysis) while others can decrease SOFA (withdraw norepinephrine). In this study we left common clinical practice unchanged.

In conclusion, our study suggests that an objective clinical and physiological score such as SOFA can be useful when deciding whether to limit life support. Further studies are required in this field to help the clinician in making decisions that are still difficult due to the lack of clear standards on the futility of certain treatments.