Accurate assessment of the risk of surgery has significant implications for the patient, surgeon, and the health care provider. This may influence decisions about surgery, including deciding between surgical and nonsurgical treatment and the type of operation that will be undertaken. Identifying a group of high-risk patients may also influence perioperative management, with particular care being taken to prevent postoperative complications in high-risk patients. On an institutional level, accurate clinical identification of perioperative risk may help to quantify the complexity of work being undertaken. This may therefore provide a method for documenting a risk-adjusted outcome for different health care providers [1].

As many factors contribute to perioperative complications, the prediction of complications in an individual patient is problematic. Important contributing factors include medical comorbidity, sepsis, malnutrition, the acuteness of surgery, the extent and complexity of surgery, the type of anesthesia, postoperative care, the magnitude of the underlying pathology, institutional and surgeon volumes, system problems that may vary from institution to institution, and random events. It is therefore not surprising that individual preoperative predictors of risk identify only a subgroup of high-risk patients [2] and that a careful clinical assessment, noting cardiorespiratory disease, preexisting sepsis, and nutritional status, has been demonstrated to be as effective as any other single preoperative indicator [2].

A number of scoring systems, which combine multiple predictors of risk, have been developed. These represent a significant advance when compared with individual preoperative predictors. Examples include POSSUM (Physiological and Operative Severity Score for the Enumeration of Mortality and Morbidity) [3, 4], APACHE (Acute Physiology and Chronic Health Evaluation) [5, 6], the Surgical Risk Scale [7], the Complexity and Risk-adjusted model [8], the Veterans Affairs Surgical Risk Model [9, 10], the Association of Coloproctologists of Great Britain and Ireland (ACPGBI) Colorectal Cancer Model [11], the Risk Scoring System [12], and the Multifactorial Preoperative Risk Index [13]. Scoring systems have also been developed specifically for cardiac surgery [14]. These scoring systems have limitations when it comes to influencing preoperative decision-making and patient management. Reasons for this include the scoring system being designed for comparative surgical audit rather than for helping with the prospective management of an individual patient [4, 7, 8, 10], the use of intraoperative factors and postoperative pathology results in generating the score [4, 11], the model being specifically designed for a single disease entity [11] or only for intensive care unit use [6], the model assessing mortality but not morbidity [6, 7, 11], and some scoring systems having a reduced accuracy beyond the environment in which they were generated [15, 16].

None of these scoring systems included a clinical prediction of risk by the surgeon. The surgeon’s prediction of risk is an “educated” assessment based on clinical experience and a precise knowledge of the clinical scenario. Although this is “subjective,” it can be quantified by the use of a visual analog scale (VAS), which transforms the global assessment of risk into a quantitatively useful number [1]. Potential advantages of the surgeons’ risk assessment include it being tailored to the individual surgeon, being able to synthesize multiple risk factors, and having the potential to incorporate factors that would otherwise be difficult to quantify.

It has been suggested that a comparison of the VAS score immediately before and after surgery can identify important technical intraoperative events that contribute to perioperative complications [17]. The importance of technical expertise in preventing adverse events is well recognized [18–20] and is a fundamental principle underpinning surgical training programs, surgical subspecialization [21], and studies examining the relationship between surgical volume and outcomes [22, 23]. Changes in the VAS score may be one way for important intraoperative events related to surgical technique to be identified.

This prospective study attempts to assess how good the surgeon is at predicting complications. It looks at the impact of a change in prediction after the surgery has been completed and then goes on to evaluate the usefulness of adding the surgeon’s VAS scores to a multivariate model for predicting perioperative complications.

Patients and methods

This prospective observational study involved 1013 consecutive patients undergoing abdominal surgery, who were also eligible for entry into a study on prophylactic antibiotic use [24]. Patient consent was obtained for both studies. The surgeons prospectively estimated the risk of a major complication on a 100-mm VAS in the operating room immediately before and after surgery (Fig. 1). If the score was changed postoperatively, the surgeon was asked to document the reason for the change. For purposes of analysis a change of greater than 5 mm was categorized as a change in the prediction of a major complication. The VAS prediction of major complications was available only in the operating room and was not used to directly influence patient management in any way.

Fig. 1 Visual analog scale

Data on other patient risk factors, including the 12 variables used in the Complexity and Risk-adjusted model [8], were prospectively collected in all patients. Clinical evaluation for complications was performed daily by the surgical team and clinical research nurse until hospital discharge. All patients were reviewed at a minimum of 30 days postoperatively, either as an inpatient, as an outpatient at a surgical clinic, or by telephone by the research nurse using a standardized questionnaire. The surgical audit was checked to ensure that all complications were included in the final study results.

Major complications were defined as complications resulting in death or major morbidity. Major morbidity [2, 17] included myocardial infarction, major arrhythmias, cerebrovascular accidents with permanent neurologic damage, respiratory failure, renal failure, gastrointestinal bleeding, deep venous thrombosis, pulmonary embolus, bowel obstruction, anastomotic leak, intra-abdominal abscess formation, septicemia, wound dehiscence, and any cause for reoperation. All complications also included minor morbidity. Minor morbidity [2, 17] included atrial fibrillation, transient ischemic attack, minor bleeding, superficial thrombophlebitis, confusion, ileus, wound infection, wound hematoma, urinary retention, urinary infection, chest infection, atelectasis, diarrhea, and febrile morbidity. A case with more than one complication was assigned to its most major complication. The study was approved by the institution’s ethics committee.

The Mann-Whitney U test was used to compare the VAS score of patients with a complication against those without a complication. The accuracies of the different methods for predicting complications were compared using receiver operating characteristic (ROC) curves. Point sensitivity and specificity were calculated at each 10% interval for the study population. The area under the curve (AUC) was then calculated.
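
The ROC calculation described above can be illustrated with a minimal sketch. It is not the analysis code used in the study: the VAS scores and outcomes below are hypothetical, the function names are our own, and we assume that the 10% intervals correspond to 10-mm cut-offs on the 100-mm scale. A patient is treated as "predicted to have a complication" when the VAS score is at or above the cut-off, and the AUC is approximated with the trapezoidal rule.

```python
# Hypothetical data (not study data): VAS scores in mm and outcomes,
# where 1 = major complication and 0 = no major complication.
vas = [5, 10, 15, 20, 25, 30, 45, 60, 70, 85]
complication = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1]


def roc_points(scores, outcomes, cutoffs):
    """Return (1 - specificity, sensitivity) for each cut-off.

    A score >= cutoff counts as a positive prediction.
    """
    pos = [s for s, y in zip(scores, outcomes) if y]
    neg = [s for s, y in zip(scores, outcomes) if not y]
    points = []
    for c in cutoffs:
        sensitivity = sum(s >= c for s in pos) / len(pos)
        specificity = sum(s < c for s in neg) / len(neg)
        points.append((1 - specificity, sensitivity))
    return points


def trapezoid_auc(points):
    """Approximate the area under the ROC curve by the trapezoidal rule."""
    # Anchor the curve at (0, 0) and (1, 1) and sort by false-positive rate.
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))


points = roc_points(vas, complication, range(0, 101, 10))
auc = trapezoid_auc(points)
print(f"AUC from 10-mm cut-offs: {auc:.3f}")
```

Because only 10-mm cut-offs are used, the resulting area is a coarse approximation; evaluating every observed score as a cut-off would give the exact empirical AUC.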

The surgeons’ VAS score was combined with the Complexity and Risk-adjusted model [8] for predicting the outcome of surgery. As this model is based on variables prospectively collected on the Otago Surgical Audit for General Surgery [25], we subsequently refer to this as the Otago Audit Model. To enable the VAS score to be combined with the Otago Audit Model, the grades of complications were redefined according to the defined severity of complications used in the model [8, 25]. Logistic regression analysis was performed using SAS version 6.12 (SAS Institute, Cary, NC, USA) with the Otago Audit Model variables. The surgeons’ VAS score was then added as a prognostic variable with four categories. The model was then optimized for the patient population included in this study, and the VAS score was again added as a prognostic variable with four categories. A correlation matrix was used to check for possible interactions between the variables included in the logistic regression analysis, and the presence of multicollinearity was checked using variance inflation factors.
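
As an illustration of the collinearity check, the following minimal sketch computes a variance inflation factor (VIF) for the simplest case of two predictors, where the VIF of either variable reduces to 1 / (1 - r²), with r the Pearson correlation between them. The study analysis was performed in SAS with more predictors; the function names and the data here are hypothetical and serve only to show the calculation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x, y):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)


# Hypothetical values (not study data): VAS score in mm and patient age.
vas_mm = [10, 20, 15, 40, 60, 25, 35, 70]
age_years = [45, 70, 52, 61, 58, 80, 39, 66]
print(f"VIF = {vif_two_predictors(vas_mm, age_years):.2f}")
```

A VIF of 1 indicates no linear relationship between the predictors; larger values indicate increasing collinearity. With more than two predictors, each VIF is obtained by regressing one predictor on all of the others.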

Results

The preoperative VAS score for a major complication was completed in 1013 patients and the postoperative VAS score in 1011 patients. One of the two patients without a postoperative VAS score developed a wound infection. Patient demographics, the types of procedures performed, and the causes of death and major morbidity are presented in Table 1.

Table 1 Patient demographics, type of surgery, causes of death, and major morbidity

There was a significant difference in the surgeons’ preoperative VAS score between the patients who did and did not develop a complication (Table 2). This was also observed for the surgeons’ postoperative VAS scores. The more severe the complication, the greater the difference in the risk prediction. For example, the difference in the median postoperative VAS score between those with and without a complication was 20 mm when the patient died, 15 mm for a major complication, and 7 mm for any complication. There was no significant difference in the frequency of major complications for either of the antibiotics that patients were randomized to receive.

Table 2 Surgeons’ VAS scores for patients with and without complications

The surgeons’ VAS score for complications was increased postoperatively in 174 patients, not changed in 709, and decreased in 128. The predictive value of changing the VAS score is shown in Table 3. Patients with an increase in VAS score had a higher mortality rate (6.3% vs. 2.4%, p = 0.006), “major complication” rate (20.1% vs. 11.0%, p = 0.001), and “all complication” rate (48.3% vs. 34.3%, p = 0.001) than those with no change or a decreased VAS score. The increase in complications predicted after surgery was significant for the group of patients with a low preoperative VAS score (below the median of 20 mm), but it did not reach significance in the patients for whom the preoperative score was already “high” (above the median of 20 mm; Fig. 2). The increase in the VAS score identified an additional 6 (19%) deaths, 13 (10%) “major complications,” and 20 (5%) “all complications.” The major complication rate when the postoperative VAS score was increased between 5 and 9 mm was 15.6% (7 of 45); between 10 and 29 mm it was 20.9% (18 of 86); and by 30 mm or more it was 25% (11 of 44). Although both the extent of change and the final predictive VAS score helped to predict complications, the major complication rate was more closely predicted by the surgeons’ postoperative VAS score (a score of < 20 or > 65 having a major complication rate of 9% and 40%, respectively). The morbidity after decreasing the surgical VAS score was the same as when this was not changed.
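
The comparison of major complication rates between patients with and without an increased VAS score can be sketched as a χ² test on a 2 × 2 table. The counts below are not taken from Table 3; they are back-calculated from the percentages reported above (20.1% of 174 and 11.0% of 837) and are therefore an assumption, as is the choice of an uncorrected Pearson χ² statistic with one degree of freedom.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the p value for 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail probability is erfc(sqrt(stat / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Assumed counts, back-calculated from the reported percentages:
# VAS increased:        35 of 174 with a major complication (20.1%)
# VAS same or decreased: 92 of 837 with a major complication (11.0%)
stat, p_value = chi_square_2x2(35, 174 - 35, 92, 837 - 92)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```

Under these assumptions the p value should be of the same order as the reported value for major complications; small differences would be expected from rounding of the percentages.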

Table 3 Correlation between change in the VAS score at the completion of surgery and the incidence of perioperative complications

Fig. 2 Flow diagram demonstrating the influence of changes in the VAS score for patients considered to have a higher and a lower preoperative risk. Low risk = up to and including the median score of 20 mm. High risk = above the median score of 20 mm. p value for an increased postoperative assessment is for major complications using the χ² test

The discrimination of the surgeons’ prediction of complications was measured using the area under the ROC curve. For preoperative prediction this was 0.74 for mortality, 0.67 for major complications, and 0.64 for all complications. For postoperative prediction this was 0.75 for mortality, 0.69 for major complications, and 0.65 for all complications. The ROC curves for the preoperative prediction of major complications are presented in Figure 3. The accuracy of the prediction of major complications was not influenced by the experience of the operating surgeon (being the same for consultants and registrars). The likelihood ratios for both the preoperative and the postoperative prediction were similar up to a VAS score of 60 mm, being 1.6, 2.3, and 3.1 for a score of ≥20 mm, 50 mm, and 60 mm, respectively. For a score of ≥70 mm, the likelihood ratio was 3.7 preoperatively and 4.5 postoperatively. With a pretest probability for major complications of 12.5%, the probabilities of a major complication were 19% (20 mm), 29% (50 mm), 40% (60 mm), 46% (preoperative prediction ≥70 mm), and 56% (postoperative prediction ≥70 mm). The discrimination of the surgeons’ VAS score in predicting major complications was similar to that of other single significant predictors of complications such as the patient’s age and the duration of surgery (Fig. 3).

Fig. 3 Receiver operating characteristic curves for the surgeon’s preoperative prediction of complications using a visual analog scale. A Preoperative prediction and the ROC curves for different severity of postoperative complications. () Mortality, AUC = 0.74. (▲) Major complications, AUC = 0.67. (●) All complications, AUC = 0.64. B ROC curve for preoperative prediction of major complications compared with the ROC curves for other predictors of complications. () Age, AUC = 0.69. (●) VAS score, AUC = 0.67. (▵) Duration of surgery, AUC = 0.64

All patients had a complete collection of the Otago clinical audit data set (Table 4). When applied to the Otago Audit Model [8] for all general and vascular surgery, there was a marginal fit (Table 4, Hosmer-Lemeshow statistic, p = 0.056). The patient’s age, the duration of surgery, and the acuteness of the surgery were the strongest predictors of complications. The addition of the surgeons’ preoperative VAS score resulted in both improved discrimination and goodness of fit (Hosmer-Lemeshow statistic, p = 0.374). The odds ratios for a VAS score of 0–19, 20–39, 40–59, and ≥60 mm were 1, 1.6, 2.0, and 3.3, respectively. The Otago Audit Model was then refined for inpatient abdominal surgery. Steps in developing the model included redefining the organ systems and removing variables that did not significantly contribute on logistic regression analysis. The prognostic variables retained include age, acuteness of surgery, duration of surgery, the number of operations, and the organ system being operated on (hepatobiliary, gastrointestinal, vascular, and a low-risk group including appendix, hernia, and gynecology). The predictive ability, goodness of fit, and discrimination of the model optimized for abdominal surgery were again further improved with the inclusion of the surgeons’ preoperative VAS score. When the postoperative VAS score was used, there was an additional small improvement in the discrimination of the model (AUC = 0.78). No collinearity was present between the surgeons’ VAS score and any of the other prognostic variables used in the Otago Audit Model (variance inflation factors <2).
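
To illustrate how the odds ratios for the four VAS categories act within a logistic model, the sketch below rescales a baseline risk by the odds ratio of the patient's VAS band. The odds ratios are those reported above; the baseline risk of 10% is hypothetical, the function names are our own, and in the fitted model the baseline would depend on the other prognostic variables.

```python
# Odds ratios reported for the four preoperative VAS categories (mm).
ODDS_RATIOS = {"0-19": 1.0, "20-39": 1.6, "40-59": 2.0, ">=60": 3.3}


def vas_band(vas_mm):
    """Assign a VAS score in mm to one of the four categories."""
    if vas_mm < 20:
        return "0-19"
    if vas_mm < 40:
        return "20-39"
    if vas_mm < 60:
        return "40-59"
    return ">=60"


def adjusted_risk(baseline_risk, vas_mm):
    """Rescale a baseline probability by the odds ratio of the VAS band.

    baseline_risk is the (hypothetical) risk for a patient in the 0-19 mm band
    with otherwise identical covariates.
    """
    odds = baseline_risk / (1 - baseline_risk)
    odds *= ODDS_RATIOS[vas_band(vas_mm)]
    return odds / (1 + odds)


# Hypothetical baseline risk of 10% in the reference band.
for score in (10, 30, 50, 65):
    print(score, vas_band(score), f"{adjusted_risk(0.10, score):.3f}")
```

The odds ratio multiplies the odds rather than the probability, so the change in absolute risk depends on the baseline from which the patient starts.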

Table 4 Otago Audit Model for general surgery and adjusted for abdominal surgery, with and without incorporation of the surgeons’ VAS score

Of the reasons given for changing the VAS score postoperatively, 74% were categorized as technical, 24% as disease related, and 2% as anesthetic or medical (Table 5). As more than one reason was sometimes stated, technical and disease-related reasons were present in 77% and 32% of patients, respectively, when a reason was given. The six main technical reasons included the procedure being more technically demanding than anticipated, the procedure being more extensive than expected, gastrointestinal or biliary contamination, anastomotic issues, duration of surgery, and injury to an intra-abdominal organ (intestine 8, spleen 4, bile duct 1). It was not documented whether the reason given for increasing the postoperative score was inevitable, was contributed to by “a lack of technical expertise,” or was due to a technical error. In a number of cases the stated technical reason was unavoidable. The main disease-related reasons for changing the VAS score were the extent of sepsis, the extent to which the underlying malignancy had spread, and a new diagnosis of intestinal ischemia. No reason was given for a change in the VAS score in 38.7% of cases. In these cases the magnitude of the change was less than when a reason was stated (p < 0.001) and was more likely to be in a downward direction (p < 0.001).

Table 5 Reasons given by the surgeon for changing the VAS score for major complications

Discussion

This study demonstrates that when asked to predict the likelihood of a major complication using a visual analog scale, surgeons are able to make a meaningful prediction of perioperative risk. The more severe the complication, the more likely it will be predicted.

So how useful is the surgeons’ global VAS prediction of complications? This was demonstrated to be as useful as other individual indicators of risk, such as age and the duration of surgery (Fig. 3). It was not, however, sensitive or specific enough when used on its own to reliably describe surgical risk or to be used to influence perioperative decision-making. Not surprisingly, the VAS score, as for other single predictors of risk, did not perform as well as models that use multiple risk factors. The VAS score’s AUC for mortality and major complications of 0.74 and 0.67, respectively, compares to values in multifactorial models of 0.75–0.93 [4, 5, 7, 11] for 30-day mortality and of 0.82–0.86 [4, 8] for major complications.

When the surgeons’ risk assessment was incorporated into a multifactorial model for predicting the outcome of surgery, it made an important contribution to the model. The objective of studying the contribution of the surgeons’ VAS score to a multifactorial model was to assess if this was a statistically significant independent predictor of risk, or if it was dependent on other recognized risk factors. The Otago Audit Model was chosen because it had been developed by our group, peer reviewed, and had been shown to be as good a predictor as other published and more complex models. The model is based on 12 easily collected clinical and operative variables (rather than physiologic or laboratory variables) that are low cost and almost universally recorded. The model was designed on a general surgical data set, which included inpatient and day-surgery cases and a wide case mix with breast, vascular, trauma, and hernia surgery as well as abdominal surgery. The model encourages adjusting the weights of the prognostic variables for each data set to which it is applied [8]; it was adjusted in this study for inpatient abdominal surgery. Although the Otago Audit Model’s discrimination was modest, the addition of the VAS score resulted in an improvement in prediction, an improved goodness of fit, and a further modest improvement in the discrimination of the model. One reason the VAS score improves the model is the absence of collinearity between the surgeons’ VAS score and the other prognostic variables. This was an important and unexpected finding. It was unexpected because we had anticipated that the surgeons’ VAS score would draw on other known prognostic factors.
However, a surgeon’s risk assessment also allows for the incorporation of additional information such as the surgeon’s experience, a global assessment of the patient, the quality of hospital care and support services, and specific unique or infrequent patient problems that would not impact on a generalized model. Multifactorial models are based on variables that occur at a sufficient frequency to have an impact on the probability prediction of the model, e.g., age, ASA, and the urgency of surgery. Although such factors may not be critical to an individual patient, over a population their importance is reinforced. However, important but low-frequency clinical events such as liver cirrhosis, severe morbid obesity, or a recent myocardial infarction, which are important risk factors for the individual patient, are usually not included in established models, as they would make the model impossibly complex. This results in important risk factors that occur at a low frequency not being captured. We would suggest that the surgeon’s risk assessment in the VAS score identifies and appropriately weights these clinical events. The surgeon’s VAS score, by capturing a variety of infrequent but critical factors, has the potential to add greater flexibility and accuracy to multifactorial models designed to predict perioperative complications.

How important is it when the surgeon changes the VAS score at the completion of surgery? This study demonstrated that when the surgeon increased the VAS score there was a statistically significant doubling of the rate of major complications. This confirms that what the surgeon sees and does intraoperatively is important in determining the outcome of surgery. In a similar study [24] this observation was interpreted as indicating shortcomings in operative skills and the importance of dealing with this to reduce surgical complications. This may be an oversimplification. In our study 75% of the reasons for changing the score were technical in nature, but whether these were unavoidable or related to lack of technical proficiency or due to technical error was not evaluated. More work is therefore required to understand the component of surgical competency within the VAS score. Whatever the cause for an increase in the VAS score, the impact was important in the individual cases where this occurred. The increase in complications was most evident in the preoperative low-risk group, where an unexpected event is of greater relative consequence than it is for those in the high-risk group. These preoperative low-risk patients ended up having a similar major complication rate and mortality rate to those in the preoperative higher-risk group (Fig. 2). Despite this, changing the score predicted only 10% of all the major complications, resulting in a small improvement in the overall predictive ability of the postoperative VAS score.

We identified three studies that looked at preoperative risk assessment with a clinical VAS score [2, 17, 26]. In the two small studies [2, 17] the global assessment or “gut feeling” of the surgeon immediately before surgery was not significantly associated with complications; however, a more structured clinical assessment that looked at cardiorespiratory disease, preexisting sepsis, and nutritional status was. In the third study [26] of over 1000 patients, the VAS score, ASA grade, the patient’s age, and a procedural magnitude score were all significantly correlated with adverse events, with the VAS score being the most successful discriminator for severe adverse events. In our study the VAS score, age, and duration of surgery were similar in their predictive ability. These studies all suggest that a careful structured assessment is likely to result in a more accurate preoperative prediction of risk than an intuitive assessment alone. Three studies that looked at a postoperative assessment by the surgeon were also identified [17, 27, 28]. In all three studies, as well as in our own study, the postoperative assessment of risk by the surgeon was highly predictive of complications. In two studies the clinical prediction was noted to be comparable to POSSUM [27, 28]. Although no comparison was made with POSSUM in our study, we have shown that the surgeons’ score can enhance the predictive power of these structured assessment tools.

What is surprising is that the surgeon’s viewpoint has not been given more consideration in the many risk-scoring systems for surgery [4, 5, 7, 11]. Our study, by demonstrating the lack of collinearity between the surgeon’s assessment and the more commonly used predictors, should encourage further attempts to integrate the surgeon’s VAS score into other validated risk models in an attempt to improve them. Similarly, the impact of unexpected events at the time of surgery appears to be rather neglected in risk assessment and further work to evaluate these events from the perspectives of risk management and quality improvement is warranted. The VAS score may also be a useful way to identify a high-risk group of patients for an additional medical or therapeutic intervention. Finally, identifying the comorbidities and surgical risk of the case mix of different health care providers is difficult and may be helped by using a quantifiable clinical measurement such as the VAS score.

Although risk assessment by the surgeon has been shown to be a useful predictor of outcome, these results do need to be further validated because there is no guarantee that another group of surgeons will obtain the same result. More observations are required in other contexts and in different surgical disciplines to determine the repeatability and general applicability of the surgeons’ VAS score. The prospective performance of multifactorial models that incorporate the surgeons’ score should also be further tested.