Introduction

Postoperative outcome is determined by patient-, pathology-, and procedure-related factors. There are well-established scoring systems for predicting outcome in the intensive care setting, derived from selected physiological variables with or without evaluation of pre-existing disease [1–4]. The Physiological and Operative Severity Score for the enUmeration of Mortality and Morbidity (POSSUM) system, which includes both preoperative and intraoperative variables, predicts mortality and morbidity and has been adapted for speciality-specific and even procedure-specific use [5, 6]. The adoption of such scoring systems outside clinical trials is hampered by the sizeable data set required for their proper use. In the absence of objective risk stratification, management decisions may be largely based on subjective assessment, which has been demonstrated to be suboptimal [7, 8]. The development of a simple, reproducible, and accurate perioperative scoring system may guide postoperative patient management while circumventing some of these shortcomings.

The Surgical Apgar Score (SAS) was derived from analysis of 77 perioperative variables in a cohort of 303 patients and was subsequently validated in cohorts of 869 and 4,119 patients undergoing general and vascular surgery in U.S. academic hospitals [9, 10]. A 10-point scoring system based on intraoperative estimated blood loss, lowest heart rate, and lowest mean arterial pressure was devised (Table 1). Validation studies demonstrated that the risk of death or major complication at 30 days rises monotonically as the SAS falls, from 5% at the highest scores to 56.3% at the lowest [10]. The SAS predicts outcome after adjustment for preoperative risk factors and may function as a marker of the quality of intraoperative care [11]. It has also been demonstrated to predict post-discharge complications after colorectal surgery and has been validated for use in a large cohort of patients undergoing major elective orthopedic joint replacement [12, 13].

Table 1 The Ten-point surgical Apgar score (SAS) is calculated from the weighted scores of three variables

The original aim of the SAS was to provide a simple means of giving immediate objective feedback that clinicians might use to improve the postoperative management of high-risk patients [9]. Studies describing the development and validation of the SAS have been conducted with large cohorts in tertiary centers, and the score has been shown to stratify postoperative risk of mortality and morbidity reproducibly in several settings on a large scale. The utility and application of the SAS will depend in part upon whether its performance can be replicated in the practice of individual surgeons in the general population. The primary aim of the present study was to assess the utility of the SAS in predicting 30 day mortality and major complication within a U.K. district general hospital population for both general/vascular and orthopedic patients. Secondary aims were to analyze the performance of the score in elective and emergency subgroups and to identify whether the SAS might be used to improve outcome.

Methods

A prospective, consecutive series of 236 patients was identified between April and July 2009. The inclusion criteria were major and intermediate general surgical and vascular procedures, lower limb joint replacement, and emergency surgery for fractured neck of the femur. Patients were over 16 years old and surgery was performed in a non-ambulatory setting. Thirteen cases were excluded due to insufficient data (n = 2) or inadequate follow-up (n = 11), resulting in a final cohort of 223 patients (94.5%). All data were collected prospectively using a standardized pro forma that included the mode of surgery. Emergency surgery was defined as an unscheduled procedure occurring during an unplanned surgical admission. The SAS was calculated as described in previous publications [9, 10]. The relevant variables (estimated blood loss, lowest heart rate, and lowest mean arterial pressure) were extracted from handwritten anesthesia charts. The primary endpoint was 30 day major complication (which included mortality). Major complication was defined as previously described [9]: acute renal failure, bleeding requiring ≥4 units of red cell transfusion within 72 h after surgery, cardiac arrest requiring cardiopulmonary resuscitation, coma for ≥24 h, deep venous thrombosis, myocardial infarction, unplanned intubation, ventilator use for ≥48 h, pneumonia, pulmonary embolism, stroke, wound disruption, deep or organ-space surgical site infection, sepsis, septic shock, systemic inflammatory response syndrome, vascular graft failure, and death. Urinary tract infection and superficial surgical wound infection were not included. Complications were recorded prospectively by the authors (N.S. and M.C.) and cross-referenced with the electronic patient record, in which surgical complication was an obligatory field and which represented the sole repository for discharge summary data. Outcome data were collected prospectively from the electronic patient record, and each case was reviewed at 30 days in order to identify readmission relating to delayed presentation of a complication. The operating room discharge destination was recorded in order to assess any potential for increasing the level of postoperative care in higher risk patients.
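The score calculation referred to above can be sketched as follows. Table 1 is not reproduced in this text, so the point bands below are taken from the published definition of the SAS [9, 10] and should be checked against that table; the function names are illustrative and are not part of the study protocol.

```python
"""Minimal sketch of the Surgical Apgar Score (SAS) calculation.

Assumption: the point bands follow the published SAS definition [9, 10].
The published score also assigns 0 heart-rate points when a pathological
bradyarrhythmia or asystole occurs; that rule is not modelled here.
"""


def blood_loss_points(ebl_ml: float) -> int:
    """Points (0-3) for intraoperative estimated blood loss in ml."""
    if ebl_ml > 1000:
        return 0
    if ebl_ml > 600:
        return 1
    if ebl_ml > 100:
        return 2
    return 3


def map_points(lowest_map_mmhg: float) -> int:
    """Points (0-3) for the lowest mean arterial pressure in mmHg."""
    if lowest_map_mmhg < 40:
        return 0
    if lowest_map_mmhg < 55:
        return 1
    if lowest_map_mmhg < 70:
        return 2
    return 3


def heart_rate_points(lowest_hr_bpm: float) -> int:
    """Points (0-4) for the lowest heart rate in beats per minute."""
    if lowest_hr_bpm > 85:
        return 0
    if lowest_hr_bpm > 75:
        return 1
    if lowest_hr_bpm > 65:
        return 2
    if lowest_hr_bpm > 55:
        return 3
    return 4


def surgical_apgar_score(ebl_ml: float, lowest_map_mmhg: float,
                         lowest_hr_bpm: float) -> int:
    """Sum of the three component scores, giving a value from 0 to 10."""
    return (blood_loss_points(ebl_ml)
            + map_points(lowest_map_mmhg)
            + heart_rate_points(lowest_hr_bpm))


def risk_group(sas: int, threshold: int = 7) -> str:
    """Dichotomize the score; the threshold of 7 is the one used in this study."""
    return "high" if sas < threshold else "low"


if __name__ == "__main__":
    # Hypothetical patient: 700 ml blood loss, lowest MAP 48 mmHg, lowest HR 60.
    score = surgical_apgar_score(700, 48, 60)
    print(score, risk_group(score))
```

The inputs in the usage line are hypothetical values chosen to exercise each band; they are not taken from the study data.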

The overall discriminatory power of the score was analyzed using receiver operating characteristic (ROC) curves and the area under the curve (AUC) with respect to major complication or death. The SAS was rationalized into two groups representing high and low risk for analysis of subgroups. The threshold was determined with reference to the optimal accuracy and likelihood ratio (LR). Fisher’s exact test was used to analyze the performance of the score in these subgroups. Results of statistical analysis were considered significant at a level of p < 0.05. Statistical analysis was performed with GraphPad Prism 5 software (GraphPad, La Jolla, CA).

Results

The final cohort of 223 patients comprised 132 general and vascular cases and 91 orthopedic cases. A summary of the range of operations undertaken is presented in Table 2. With reference to the general and vascular surgical cohort, 30/127 (24%) patients experienced at least one major complication, including 4 deaths (3%). Forty-four percent of procedures were classified as emergencies. Of patients with scores of 9–10, 5/41 (12%) developed major complications within 30 days with no deaths. For patients with scores of 7–8, 11/60 (18%) had major complications with no deaths. For those with scores of 5–6, 11/20 (55%) patients had complications (three deaths), and for those with scores of 4 or less, 3/5 (60%) had complications (one death). The rate of major complication or death increased monotonically as the SAS decreased (Fig. 1).

Table 2 Operations included in the final cohort of 223 consecutive cases
Fig. 1 Distribution of major complication (including death) between the surgical Apgar score (SAS) categories in the general/vascular and orthopedic cohorts

In the orthopedic cohort, 17/87 (20%) patients experienced at least one major complication within 30 days of surgery, of whom 7 (8%) died. Forty-six percent of procedures were classified as emergencies. Of patients with scores of 9–10, 4/25 (16%) developed major complications (one death). For patients with scores of 7–8, 6/40 (15%) had complications (one death). For those with scores of 5–6, 5/19 (26%) had complications (four deaths), and for those with scores of 4 or less, 2/3 (67%) had complications (one death). The rate of major complication generally increased as the SAS decreased (Fig. 1).

The relative proportion of general and vascular cases within each of the SAS categories was similar to that in a previous validation study; however, the relative risks attributable to each category were considerably lower than those reported in that study (Table 3). The discriminatory power of the SAS was assessed using ROC curve analysis for all cases and also with respect to the mode of surgery (elective or emergency). When all cases were considered, the AUC for death and major complication was 0.73 (p = 0.0002) in the general and vascular cohort, compared with 0.62 (p = 0.15) in the orthopedic cohort. Within the general and vascular cohort, the SAS achieved significant discrimination of outcome in the emergency subgroup (AUC 0.72, p = 0.011) but not the elective subgroup (AUC 0.66, p = 0.08) (Table 4).
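As a rough illustration of how the AUC summarizes discrimination, the sketch below computes it from the four score categories reported above, using the equivalence between the AUC and the probability that a randomly chosen patient with a complication has a lower SAS than a randomly chosen patient without one (ties counted as one half). This is not the analysis performed in the study, which used individual scores in GraphPad Prism; the function name and data layout are illustrative assumptions.

```python
"""Sketch: AUC from grouped outcome counts (concordance with half-credit ties)."""

# (category, patients with major complication or death, patients without),
# ordered from highest to lowest risk. Counts are those reported in Results.
GENERAL_VASCULAR = [
    ("SAS <=4", 3, 2),
    ("SAS 5-6", 11, 9),
    ("SAS 7-8", 11, 49),
    ("SAS 9-10", 5, 36),
]
ORTHOPEDIC = [
    ("SAS <=4", 2, 1),
    ("SAS 5-6", 5, 14),
    ("SAS 7-8", 6, 34),
    ("SAS 9-10", 4, 21),
]


def grouped_auc(categories):
    """Return the AUC for ordered categories of (label, events, non_events)."""
    total_events = sum(events for _, events, _ in categories)
    total_non_events = sum(non for _, _, non in categories)
    concordant = 0.0
    # Number of event-free patients in categories of lower risk than the
    # one currently being visited.
    lower_risk_non_events = total_non_events
    for _, events, non_events in categories:
        lower_risk_non_events -= non_events
        # Each event is concordant with every event-free patient in a
        # lower-risk category and half-concordant with those in its own.
        concordant += events * (lower_risk_non_events + 0.5 * non_events)
    return concordant / (total_events * total_non_events)


if __name__ == "__main__":
    print(f"General/vascular AUC: {grouped_auc(GENERAL_VASCULAR):.3f}")
    print(f"Orthopedic AUC:       {grouped_auc(ORTHOPEDIC):.3f}")
```

On these grouped counts the sketch gives approximately 0.70 for the general and vascular cohort and 0.60 for the orthopedic cohort. Both are slightly below the reported values of 0.73 and 0.62, a difference that likely reflects the collapse of individual scores into four categories.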

Table 3 The proportion of cases and relative risk of death or major complication in the general and vascular study cohort, compared with a previous validation study [10]
Table 4 The performance of the SAS on ROC curve analysis with respect to 30 day mortality or major complication in the general/vascular and orthopedic cohorts

The SAS was interrogated in order to identify the optimal threshold at which to dichotomize the score for further analysis (i.e., into two groups of good and poor prognosis). The accuracy (0.79), relative risk (3.5), and likelihood ratio (4.1) were maximal at a threshold of 7, which was therefore used to define the high-risk (SAS < 7) and low-risk (SAS ≥ 7) categories. Fisher’s exact test was used to determine the performance of the dichotomized score. In the general and vascular cohort, the dichotomized SAS demonstrated significant differences in 30 day death or major complication rates when all cases were included (RR 3.5 [95% CI 2.0–6.2]; Fisher’s p < 0.001), and this finding was replicated in the emergency subgroup (RR 2.9 [95% CI 1.5–5.8]; Fisher’s p = 0.004). There was no significant difference in outcome when considering the elective subgroup alone (Fisher’s p = 0.12). Analysis of the orthopedic cohort demonstrated that the SAS did not predict 30 day mortality or major complication (p = 0.12) in any of these settings (Table 5).
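The summary statistics for the dichotomized score can be reproduced from a 2 × 2 table. The sketch below uses the general and vascular category counts reported in Results (14 of 25 high-risk and 16 of 101 low-risk patients with death or major complication). The log-based confidence interval for the relative risk and the "sum of tables no more probable than the observed one" definition of the two-sided Fisher p value are assumptions, since the exact methods used by the statistics package are not stated; function names are illustrative.

```python
"""Sketch: accuracy, relative risk, likelihood ratio and Fisher's exact test."""
from math import comb, exp, log, sqrt


def two_by_two_summary(a, b, c, d):
    """a/b: high-risk with/without event; c/d: low-risk with/without event."""
    n = a + b + c + d
    sensitivity = a / (a + c)
    specificity = d / (b + d)
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of log(RR); assumed method for the 95% CI.
    se_log_rr = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return {
        "accuracy": (a + d) / n,
        "relative_risk": rr,
        "rr_ci95": (exp(log(rr) - 1.96 * se_log_rr),
                    exp(log(rr) + 1.96 * se_log_rr)),
        "positive_lr": sensitivity / (1 - specificity),
    }


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher p value from the hypergeometric distribution."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denominator = comb(n, row1)

    def prob(x):
        # Probability of x events among the row1 high-risk patients,
        # with all table margins held fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / denominator

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


if __name__ == "__main__":
    counts = (14, 11, 16, 85)  # general and vascular cohort, threshold SAS < 7
    summary = two_by_two_summary(*counts)
    print({k: v for k, v in summary.items()})
    print(f"Fisher p = {fisher_exact_two_sided(*counts):.2g}")
```

With these counts the sketch returns an accuracy of 0.79, a relative risk of 3.5 (95% CI 2.0–6.2), a positive likelihood ratio of 4.1, and a Fisher p value below 0.001, in agreement with the figures reported for the general and vascular cohort.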

Table 5 The performance of the SAS using a threshold of 7 to define high-risk and low-risk groups in the prediction of 30 day mortality or major complication after general/vascular and orthopedic surgery

Twelve patients (5.4%) were admitted to level 2 or 3 facilities directly from the operating room, all of whom had undergone general or vascular surgery. A greater proportion of high-risk patients (15%, 7/47) were admitted when compared to low-risk patients (3%, 5/176). Five of these 12 admissions were scheduled in the preoperative period of which four were classed as low risk at the time of surgery (i.e., SAS ≥ 7). Of the seven unscheduled admissions, all were emergency cases and six were classified as high-risk. Overall 30 day mortality within the cohort was 4.9% (11 of 223), and summaries of the individual cases are presented (Table 6). The SAS placed 9 of these 11 cases into the high-risk category (SAS < 7) of whom only two were managed in level 2 or 3 care settings.

Table 6 Summary of the relevant attributes of the eleven 30 day mortalities identified in the cohort

Discussion

A prognostic score is required to be practicable and objective, and it should provide information that supplements clinical judgment for the purpose of counseling patients as well as informing the judicious management of resources. Clinical judgment encompasses many more factors than can contribute to a scoring system, and its value relates to the breadth of qualitative factors that can be considered. Nevertheless, the operating surgeon has previously been demonstrated to be less effective at identifying high-risk gastrointestinal surgical patients than either an independent clinician carrying out a structured examination or, indeed, certain preoperative plasma protein levels [14]. In contrast, the operating surgeon’s clinical judgment has also been demonstrated to be comparable to POSSUM scoring in the prediction of postoperative outcome following major gastrointestinal surgery [15]. In reality, any standardized scoring system can only fulfill an adjunctive role in perioperative decision making.

Predictive scoring with APACHE II (Acute Physiology and Chronic Health Evaluation II), SAPS II (Simplified Acute Physiology Score II), or POSSUM is rarely used outside a research and audit environment, in part because of the complexity of these measures [1, 3, 16]. POSSUM has been validated as an accurate predictor of postoperative mortality across surgical specialties in a variety of adapted forms [6, 17–19]. The calculation of a POSSUM score demands 6–12 physiological and 3–6 intraoperative variables depending on the model employed. APACHE II and SAPS II were derived from intensive care unit (ICU) populations and require a minimum of 12 variables. In comparative studies, an important measure of their relative utility is ease and simplicity of use [20]. The SAS has the advantage of being calculated from three universally available intraoperative data points, and it has been demonstrated to predict 30 day surgical outcome independent of preoperative physiological status [11].

This is the first time that the utility of the SAS has been assessed in a heterogeneous U.K. surgical population including general surgical, vascular, and orthopedic cases. It has been validated in tertiary U.S. and international populations across a range of surgical procedures [10, 21, 22]. Consistent with previous publications, there was a monotonic relationship between SAS and the risk of death or major complication in the general surgical and vascular cohort. The relative risk of death or major complication in the high-risk (SAS ≤ 4) group was reported to be 16.1 compared with the reference group (SAS 9–10) in the index publication [9]. In the present study, for patients with SAS 3–4, the relative risk of death or major complication was 5.6 (95% CI 1.8–18) compared with 10.7 (95% CI 8.1–14.7) in a previous validation study [10]. Interrogation of the SAS using ROC curve analysis and Fisher’s exact test after dichotomization into high-risk (SAS < 7) and low-risk (SAS ≥ 7) groups demonstrated statistical significance with respect to the primary outcome, major complication or death. This was reproduced in the emergency subgroup but not in the elective subgroup. The same dichotomization threshold was selected in a previous validation study that included the original authors [13].

The SAS was not derived from a population that included orthopedic surgery but has since been validated as a predictor of major complication and mortality in elective lower limb arthroplasty, cytoreductive ovarian surgery, and radical cystectomy [13, 23, 24]. The present analysis, which included elective major joint replacement and emergency surgery for femoral fractures, did not demonstrate statistical significance for the prediction of major complication or death.

An advantage of this relatively small cohort study is that individual cases may be reviewed to assess the potential utility of the SAS in modifying postoperative management, albeit in a post hoc setting. Eleven patients in the cohort died within 30 days, and these cases were reviewed. Four deaths followed general or vascular procedures. Two of these patients were discharged to the ICU from the operating room after undergoing emergency laparotomy and small bowel resection for ischemic complications (SAS 5 and 2). The remaining two 30 day deaths were a 94-year-old patient (SAS 5) who underwent elective axillofemoral bypass and developed a postoperative chest infection after being discharged to the ward, and a 35-year-old patient (SAS 6) who had undergone a palliative bypass for colorectal cancer recurrence, in whom a ceiling of care had been defined and who was discharged to the ward before developing sepsis and renal failure. The remaining seven patients had all undergone surgery for femoral fracture and had a mean age of 89 years (range: 83–97 years). All were discharged directly to the ward from the operating room, and five had a predefined perioperative ceiling of care excluding ICU admission. Two patients (aged 85 and 97) were stratified as low risk (SAS 8 and 9) and therefore would not have been identified as candidates for intervention by means of the score alone, although it is highly unlikely that any subjective assessment would have considered them to be at low risk of complication.

The stated aim of the SAS was to provide a simple means of providing immediate objective feedback, which clinicians might use to improve the postoperative management of high-risk patients [9]. The subsequent finding that the score can predict outcome after adjustment for preoperative risk led to the suggestion that it might also be used as an indicator of the quality of intraoperative care [11]. At the level of individual surgical practice, the primary benefit of the SAS would be in the former capacity. The SAS predicted 30 day death or major complication after general and vascular surgery, and this appears to reflect its good performance in the emergency subgroup rather than the elective subgroup. It might be suggested that clinicians would be alerted to a significant problem were the SAS parameters to be deranged during the course of an elective surgery (substantial blood loss, hypotension, or persistent tachycardia) and that intuitively the score might be more useful in the emergency setting. Four patients died following general or vascular surgery, all of whom had SAS < 7. Two of these patients were discharged from the operating room to ICU, and one patient was discharged to the ward (end-stage palliation of cancer recurrence). Therefore, only one patient might have benefitted from an increased level of care on the basis of the score—the 94-year-old woman who had lost 700 ml of blood during elective axillobifemoral bypass with associated intraoperative hypotension (85/30 mmHg), who was discharged to a level 1 (ward) bed.

An intraoperative scoring system can only influence outcome if relevant clinical factors can be modified in the postoperative period. The three variables that constitute the SAS influence tissue perfusion, and the score is likely to be a surrogate assessment of tissue oxygenation. A series of landmark studies demonstrated that maintaining oxygen delivery at a predetermined level (goal-directed therapy) in the preoperative period can significantly reduce mortality in high-risk surgical patients by up to a factor of 5 [25, 26]. Subsequent trials of goal-directed therapy (GDT) in the postoperative setting did not support these results and may have reflected a lack of intervention early in the natural history of organ failure—i.e., before and during surgery [27, 28]. In a nonrandomized but widely cited article, the greatest improvement in outcome for critically ill surgical patients (as measured against POSSUM prediction) was seen in those patients admitted to ICU for GDT before surgery, rather than after surgery or not at all [29]. The effectiveness of the enhanced recovery program relies in part on the principle of preoperative preparation and intraoperative GDT to maintain cardiac output at optimal levels. A small nonrandomized study has demonstrated that postoperative high dependency unit (HDU) care resulted in significantly fewer cardiorespiratory complications after major abdominal surgery, although there was no difference in mortality [30]. Planned 48 h ICU admission of high-risk elective surgical patients from the operating room resulted in a significantly lower 30 day mortality than predicted by p-POSSUM in a series of 1,045 patients [31]. Although there is a wealth of evidence that early high-level care with GDT can improve outcome, it is by no means clear that outcomes can be improved if this treatment is instituted postoperatively in a patient who has surrogate evidence of poor tissue oxygenation.

Had all general and vascular patients stratified as high-risk (SAS < 7) received routine postoperative HDU or ICU management over the study period, a total of 18 additional admissions would have been required, equating to a 2.6-fold increase in demand. There remains constant pressure on ICU and HDU resources in the U.K., with little capacity to provide additional care at present. It may be possible to improve outcome outside a critical care setting by optimizing the management of high-risk patients on the ward. However, this has yet to be objectively demonstrated.

Limitations of this study include the retrospective extraction of data from manually completed anesthesia charts, in contrast to the digital, automated perioperative data capture employed in the original study. It has been suggested that readings taken less frequently than every 5 min may invalidate the score [22]. Blood loss may be difficult to quantify accurately, and discrimination between volumes above and below 100 ml may be inconsistent. Information regarding complications was collected from the electronic patient record system and was therefore dependent on the quality of data entry. Patients who presented with late surgical complications to their general practitioner or who were readmitted to another hospital will not have been captured in the analysis. The comparatively small study cohort may have resulted in a type II error and consequent underestimation of the ability of the SAS to predict primary outcomes, particularly on subgroup analysis of elective cases, where very few events were recorded over the study period. Nevertheless, this does not detract from the analysis of the score’s practical utility on case review.

The proportion of cases in high- and low-risk groups was similar to previous studies (20% of cases in the high-risk group [SAS < 7]), but there was a much higher proportion of emergency cases in the current cohort (44 vs. 6%). In comparison with centralized U.K. Hospital Episode Statistics (HES) data, the proportion of emergency cases in the study cohort is only slightly higher in the general surgery subgroup (37.3 vs. 40.2%) but considerably higher in the orthopedic subgroup (27.1 vs. 46.1%) [32]. The current study excluded minor surgery and ambulatory cases, which are included in the HES data and contribute to these discrepancies. The proportion of emergency operations is therefore likely to be representative of U.K. general surgical practice. In a practical sense, surgeons might find the support of a prognostic score more valuable in the setting of intermediate and major emergency surgery, which accounts for a high proportion of surgical morbidity and mortality. Validation of the SAS has been performed in cohorts of several thousand, but for the score to fulfill its stated role in guiding surgical decision making at the level of the individual doctor and patient, it needs to function effectively on a smaller scale. Because of its small size and selected population, this study focused on the utility of the SAS at this scale.

The SAS has been demonstrated to be a simple and objective tool for providing accurate and reproducible postoperative risk stratification in the setting of emergency general and vascular cases. Its efficacy appears to be limited in the setting of elective surgery, where deviations from the anticipated intraoperative course are relatively straightforward to identify in the absence of a scoring system. Within this specific study, the SAS does not appear to be a useful adjunct to decision making in orthopedic surgery.

There is robust evidence that the SAS is effective in large population studies; however, its role in improving postoperative outcome may be limited, particularly given the limited evidence that postoperative intervention benefits outcome after an adverse preoperative or intraoperative course. Moreover, despite the SAS identifying patients at high risk of complication, including death, case review suggested limited potential to improve individual postoperative management. Identification of a high-risk cohort may be valuable in selecting candidates for future trials. Continued research into the capacity of the SAS to modify rather than merely predict postoperative outcome is required to endorse its benefit.