Introduction

Leaving an abdominal wound partially open after an operation on patients with septic peritonitis was described as far back as 1897 [1]. In 1940, Ogilvie [2] suggested that, rather than try to close an abdominal wound under tension, it should be left completely open and the bowels covered with Vaseline-impregnated gauzes. It was not until much later that open abdomen (OA) therapy became an established form of surgical therapy. In the late 1970s, surgeons started to use OA for treating severe intra-abdominal infections [3–5] and in the 1980s for relieving intra-abdominal hypertension or abdominal compartment syndrome [6, 7]. When damage control surgery was introduced in the 1990s, with abbreviated laparotomy and OA as a central piece, OA therapy became routine surgical practice in trauma situations [8]. Improved methods of temporary abdominal closure have increased the chances of a delayed primary fascial closure at the end of OA therapy and diminished the need to resort to planned ventral hernias [9–11]. In recent years, OA therapy has become the treatment of choice in various difficult surgical situations in many hospitals [10, 12].

Managing patients with an OA is a challenge, in both the operating room and the intensive care unit, and the treatment is associated with high mortality and morbidity. In order to improve OA management, it is essential to be able to describe clinical scenarios in a standardized fashion and compare different treatments and outcomes. A classification system for the OA is an important step in this direction. A classification system for abdominal wounds was proposed by Banwell and Téot in 2003 [13] and revised by Swan and Banwell in 2005 [14]. In 2009, an international consensus group, represented by Björck et al. [15], proposed a classification system aimed at OA therapy, describing the status of the abdomen in relation to the complexity of management and the chances of achieving delayed abdominal closure when discontinuing OA therapy. The Björck classification has been applied in several clinical studies on OA therapy [10–12, 16, 17]. An amended version was published in 2013 by the World Society of the Abdominal Compartment Syndrome (WSACS), represented by Kirkpatrick et al. [18] (Table 1). No study has been published evaluating any of the aforementioned classification systems.

Table 1 Classification of the open abdomen by WSACS

The aim of this study was to evaluate the amended OA classification by WSACS with regard to validity and reliability, using a large cohort of patients treated with an OA [10]. In addition, detailed instructions for use of the classification system are proposed.

Materials and methods

Early and late results of OA therapy using vacuum-assisted wound closure and mesh-mediated fascial traction (VAWCM) in 111 consecutive patients have been described in a prospective study [10, 19]. The study was registered in Clinical Trials (http://www.clinicaltrials.gov; registration number: NCT00494793) and approved by the Ethical Committee of Lund University. Primary endpoints were to evaluate fascial closure at the end of OA treatment, complications, and development of incisional and parastomal hernias 1 year after abdominal closure. Median age was 68 years (range 20–91). Indication for OA therapy was mainly visceral (51 %) or vascular (41 %) disease, while 8 % were trauma patients. Most were non-referral patients and were treated with VAWC or VAWCM from the start of OA therapy. Median duration of OA therapy was 14 days (range 4–87), the number of dressing changes was five (range 2–22), and the number of mesh-tightening procedures was four (range 0–10). Outcomes with regard to fascial closure and mortality are shown in Table 2. Delayed primary fascial closure (n = 85) was performed by removing the temporary mesh used for fascial traction, and suturing the fascia with a running PDS suture. A mesh reconstruction was used in eight patients; four due to frozen abdomen with a remaining fascial diastasis of 3–10 cm and four due to previous wound dehiscence. The mesh was placed in a sublay (retromuscular), onlay or intraperitoneal (IPOM) position. Permanent abdominal wall closure was defined as either primary delayed fascial closure or same-hospital-stay abdominal closure with a permanent mesh. Two patients had partial fascial closure with complete skin closure, both due to ossification in the wound, leaving a small remaining fascial diastasis. A total of 16 patients died before it was appropriate to terminate OA therapy and attempt abdominal closure. A further 17 patients died in hospital after fascial closure. Death occurred a median of 21 days (range 4–37) after the start of OA therapy in the former group and 30 days (range 7–189) in the latter.

Table 2 Clinical outcome of the VAWCM study [10]

The same patient cohort and prospective data, including all operative reports, were used in the current study. In the original publications of the OA classification, clinical scenarios are described in general terms [15, 18]. During the evaluation process, it became apparent to the authors that more detailed definitions of the terms used in the classification were needed in order to facilitate a standardized grading procedure. After discussions between the authors, definitions of terms and instructions on the application of the classification system in diverse clinical scenarios were constructed (Appendix 1).

In the 2013 publication by WSACS [18], the main document defines grade 4 as “established enteroatmospheric fistula, frozen abdomen” whereas supplement 6, outlining the rationale for the amendments, defines it as “established enteroatmospheric fistula.” We adhered to the latter definition and registered all enteroatmospheric fistulas (EAFs), with or without frozen abdomen, as grade 4.

Validity

The validity of the 2013 OA classification system by WSACS [18] was evaluated by assessing the degree to which the results of the classification (OA grades) corresponded with clinical outcomes (abdominal closure and mortality).

All operative reports from the OA period for every patient (n = 753) were graded by one of the authors (rater 1). The OA classification had not been published when the VAWCM study [10] was initiated in 2006 and therefore was not included in the study protocol. Consequently, OA grades had to be analyzed retrospectively from operative reports. When information necessary for the OA classification, e.g. on the extent of contamination or fixation, was missing from an operative report, it was registered as not present.

OA grades were compared with the following clinical outcome variables: primary fascial closure rate, permanent abdominal wall closure rate (i.e. with or without a mesh), mortality during OA therapy and in-hospital mortality. In order to separate factors associated with failure of fascial closure from factors causing death, fascial closure rate was calculated per protocol, i.e. patients who died with OA were excluded from the calculations.
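The per-protocol calculation described above can be illustrated with the patient counts given in Materials and methods. This is only a sketch: the counts come from the text, but the resulting percentages are computed here rather than quoted from Table 2, and the variable and function names are illustrative.

```python
# Counts taken from the Materials and methods text of this study.
TOTAL_PATIENTS = 111
DIED_WITH_OA = 16             # died before abdominal closure could be attempted
DELAYED_PRIMARY_CLOSURE = 85  # fascia sutured after removal of the traction mesh
MESH_CLOSURE = 8              # permanent mesh during the same hospital stay


def rate(events: int, denominator: int) -> float:
    """Return a percentage rounded to one decimal place."""
    return round(100 * events / denominator, 1)


# Per protocol: patients who died with an open abdomen leave the denominator.
per_protocol_n = TOTAL_PATIENTS - DIED_WITH_OA

fascial_closure_pp = rate(DELAYED_PRIMARY_CLOSURE, per_protocol_n)
permanent_closure_pp = rate(DELAYED_PRIMARY_CLOSURE + MESH_CLOSURE, per_protocol_n)

# Intention to treat, shown only for contrast: every patient stays in.
fascial_closure_itt = rate(DELAYED_PRIMARY_CLOSURE, TOTAL_PATIENTS)

print(per_protocol_n, fascial_closure_pp, permanent_closure_pp, fascial_closure_itt)
```

Excluding deaths with OA from the denominator keeps a death from being counted as a failed closure, which is the separation of factors the per-protocol approach aims for.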

Any floor or ceiling effect was assessed by calculating the percentage of patients receiving the lowest or highest possible score (least or most complicated OA grade).

The 2013 updates of the OA classification were evaluated by comparison with the 2009 version: all operative reports (n = 753) were graded by rater 1 according to both the current (2013) and the former (2009) version of the classification system and the results compared.

Inter-rater reliability

A sample of operative reports was selected for inter-rater analysis in the following manner. All 753 operative reports were divided into five groups based on their OA grade according to rater 1: 1A (n = 460), 1B (n = 106), 2A (n = 133), 2B (n = 21) and the rarest occurring grades (1C, 2C, 3A, 3B and 4) combined in one group (n = 33). The number of reports selected from each group corresponded to this group’s proportion of the total number of reports. To ensure that the smallest group was represented at least three times, a sample size of 108 operative reports was calculated as appropriate. The reports were selected and arranged in a random order using a random number generator function in SPSS. Each of these 108 anonymized operative reports was assessed separately by three independent surgeons (Raters 2–4), registering the OA grade for each operation. The instructions for use were presented to the raters beforehand and used in the grading process. The results of Rater 1 were not used in this comparison, since potential knowledge of the complete clinical course for a patient could interfere with the evaluation of an individual operative report. All raters were surgeons with experience in OA management.
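The sample-size reasoning can be sketched as follows. The group sizes are those stated above; the rule used to reach 108 (the smallest sample in which the exact proportional quota of the smallest group is at least three) is an assumption that happens to reproduce the reported figure, and the largest-remainder rounding and the per-group numbers it yields are likewise assumptions, since the text does not report them. Python's `random` module stands in for the SPSS random number generator.

```python
import random

# Operative reports per group, as graded by rater 1 (from the text).
STRATA = {"1A": 460, "1B": 106, "2A": 133, "2B": 21, "rare": 33}
MIN_PER_STRATUM = 3


def minimum_sample_size(strata: dict, min_per_stratum: int) -> int:
    """Smallest n whose proportional quota for the smallest group reaches the minimum."""
    total = sum(strata.values())
    smallest = min(strata.values())
    return -(-min_per_stratum * total // smallest)  # integer ceiling division


def proportional_allocation(strata: dict, sample_size: int) -> dict:
    """Proportional quotas rounded by the largest-remainder rule (assumed)."""
    total = sum(strata.values())
    # divmod keeps the arithmetic in integers: (whole reports, remainder).
    quotas = {name: divmod(count * sample_size, total) for name, count in strata.items()}
    allocation = {name: whole for name, (whole, _) in quotas.items()}
    leftover = sample_size - sum(allocation.values())
    by_remainder = sorted(quotas, key=lambda name: quotas[name][1], reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


def draw_sample(strata: dict, allocation: dict, seed: int = 0) -> list:
    """Pick report indices within each group, then shuffle the presentation order."""
    rng = random.Random(seed)
    picked = [(name, index)
              for name in strata
              for index in rng.sample(range(strata[name]), allocation[name])]
    rng.shuffle(picked)
    return picked


sample_size = minimum_sample_size(STRATA, MIN_PER_STRATUM)
allocation = proportional_allocation(STRATA, sample_size)
sample = draw_sample(STRATA, allocation)
print(sample_size, allocation)
```

Under these assumptions the smallest group (2B) contributes three reports and the combined rare-grade group five, which illustrates how few complex reports a strictly proportional sample contains.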

Test–retest reliability (repeatability)

The same 108 operative reports were reassessed by the same raters (Raters 2–4) after a delay of 4–6 weeks and the results compared to the first assessment.

Statistics

Data management, statistical analysis and randomization were performed using SPSS software, version 21 (IBM Corp., Armonk, NY). Continuous variables were expressed as median and range.

For the validity analysis, OA grades were converted into ordinal numbers (1–9) according to internal order (1A–1B–1C–2A–2B–2C–3A–3B–4). Correlation was assessed using Spearman’s correlation coefficient. Differences in proportions between groups were evaluated by χ² (chi-square) test or Fisher’s exact test as appropriate. A p value < 0.05 was considered statistically significant. A floor or ceiling effect was considered to be present when >15 % of patients received the lowest or highest score, respectively [20].
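The ordinal conversion can be written down directly from the stated grade order. The sketch below also derives the "most complex grade" of a patient from a sequence of grades; the example course is hypothetical and the function names are illustrative, not taken from the study.

```python
# Internal order of the 2013 WSACS grades, least to most complex (from the text).
GRADE_ORDER = ["1A", "1B", "1C", "2A", "2B", "2C", "3A", "3B", "4"]
GRADE_TO_ORDINAL = {grade: rank for rank, grade in enumerate(GRADE_ORDER, start=1)}


def to_ordinal(grade: str) -> int:
    """Map an OA grade to its ordinal number (1-9)."""
    return GRADE_TO_ORDINAL[grade]


def most_complex_grade(course: list) -> str:
    """Return the highest-ranked grade registered during the OA period."""
    return max(course, key=to_ordinal)


# Hypothetical patient: enteric leak at the index operation, later fixation.
course = ["1C", "1B", "2A", "2A", "1A"]
ordinals = [to_ordinal(grade) for grade in course]
print(ordinals, most_complex_grade(course))
```

Note that the order ranks 2A above 1C, so a clean abdomen with developing fixation counts as more complex than a non-fixed abdomen with an enteric leak.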

Inter-rater reliability was assessed by calculating the extent to which raters made exactly the same judgment about an operative report. Test–retest reliability was assessed by the consistency in the rating of operative reports by the same rater. Reliability was expressed in proportional agreement and intra-class correlation coefficient (ICC), both with 95 % confidence intervals (CI). Strength of agreement, based on ICC, was interpreted as “poor” (< 0.20), “fair” (0.21–0.40), “moderate” (0.41–0.60), “good” (0.61–0.80) or “very good” (0.81–1.0) [21]. Comparison between the 2013 and 2009 versions of the OA classification system was evaluated using the results of Rater 1, expressed as ICC with 95 % CI [22].
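Proportional agreement and the interpretation bands can be sketched as below. The ratings are hypothetical, and treating each upper bound as inclusive is an assumption, since the bands as quoted leave values such as exactly 0.20 unassigned. The ICC itself is not computed here.

```python
from itertools import combinations


def proportional_agreement(first: list, second: list) -> float:
    """Share of reports on which two raters gave exactly the same grade."""
    if len(first) != len(second):
        raise ValueError("both raters must grade the same reports")
    return sum(a == b for a, b in zip(first, second)) / len(first)


def full_agreement(*raters: list) -> float:
    """Share of reports on which every rater gave the same grade."""
    columns = list(zip(*raters))
    return sum(len(set(column)) == 1 for column in columns) / len(columns)


def strength_of_agreement(icc: float) -> str:
    """Label an ICC using the bands in the text (upper bounds assumed inclusive)."""
    bands = [(0.20, "poor"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "good"), (1.00, "very good")]
    for upper, label in bands:
        if icc <= upper:
            return label
    raise ValueError("ICC cannot exceed 1")


# Hypothetical grades from three raters for five operative reports.
ratings = {
    "rater 2": ["1A", "1A", "2A", "1B", "4"],
    "rater 3": ["1A", "1B", "2A", "1B", "4"],
    "rater 4": ["1A", "1A", "2A", "2B", "4"],
}
pairwise = {(a, b): proportional_agreement(ratings[a], ratings[b])
            for a, b in combinations(ratings, 2)}
print(pairwise, full_agreement(*ratings.values()))
```

Exact-match agreement treats a 1A versus 1B disagreement the same as 1A versus 4, which is one reason to report it alongside a coefficient such as the ICC that takes the distance between grades into account.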

Results

Validity

Initial OA grade

Association between “grade at initial OA laparotomy” and clinical outcome is shown in Table 3. No significant association was found. Eight patients were registered as grade 2 from the start because adhesions from previous operations were not released completely during the initial laparotomy.

Table 3 Association between initial open abdomen grade and clinical outcome

Most complex OA grade

The correlation between the most complex OA grade registered during the OA period for each patient and clinical outcome is shown in Table 4. A correlation was found between “most complex grade” and “failure of delayed primary fascial closure” as well as “mortality during OA therapy”. Ninety-one percent of patients had OA grades 1 or 2 throughout the entire OA period and 20 % of patients did not receive a grade above 1A.

Table 4 Association between most complex OA grade and clinical outcome

Deteriorating OA grade

Comparison of patients who developed a more complex OA grade without a later improvement (n = 38) and patients who did not change at all or improved after temporarily deteriorating to a higher grade (n = 73) is shown in Table 4. An association was seen between “deteriorating OA grade” and “mortality during OA therapy”.

Grade 1A only

Comparison of patients who had grade 1A during the entire OA period (n = 22) and patients who at some point received a more complex OA grade (n = 89) is shown in Table 4. In-hospital mortality was 41 % for the patients with grade 1A only, compared to 28 % for the other patients; the difference was not significant (p = 0.24).

OA Grade at abdominal closure or death

Abdominal closure and mortality for all 111 patients are shown in Table 2. The clinical course of the OA in all 111 patients is shown in Appendix 2, with OA grade at each dressing change operation, fascial closure or death presented chronologically. Among patients who received delayed primary fascial closure (n = 85), 67 % had grade 1A and 32 % grade 2A at fascial closure, while one patient had grade 1B, due to contamination from a urinary bladder perforation, repaired simultaneously. In the mesh group (n = 8), four patients had grades 1A or 2A at closure and received a mesh not for technical reasons but due to poor fascial quality from previous wound dehiscence, while the other four had a clean, frozen abdomen (grade 3A). The two patients who received partial fascial closure (due to ossification) had grades 1A and 2A at closure. Patients who died during OA therapy (n = 16) did not have a more complex final OA grade than those who survived until abdominal closure (p = 0.10). Patients who died in hospital after fascial closure (n = 17) had previously been closed at grade 1A (n = 14), 2A (n = 2) or 4 (n = 1), with delayed primary fascial closure in all but two (both closed with mesh at grade 1A).

Contamination

Patients who had contamination (grades 1B or 2B; n = 47) at the index operation had similar delayed primary fascial closure rate (p = 1.0), permanent abdominal wall closure rate (p = 1.0), mortality during OA therapy (p = 0.75) and in-hospital mortality (p = 0.74) to patients who had a clean abdomen from the start (grades 1A or 2A; n = 40). Patients with contaminated abdomen as the most complex grade (grades 1B or 2B; n = 41) did not differ from patients with corresponding clean grades (1A and 2A; n = 46) with regard to closure rates or mortality (p = 0.12, 1.0, 0.73 and 0.74 for delayed primary fascial closure, permanent abdominal wall closure, mortality during OA therapy and in-hospital mortality, respectively).

Fixation

Patients with developing fixation as the most complex registered grade (grades 2A and 2B; n = 33) had similar delayed primary fascial closure rate (p = 0.63) and permanent abdominal closure rate (p = 0.37) to patients with the corresponding non-fixated grades (1A and 1B; n = 54). Mortality during OA therapy was similar (p = 0.73) but in-hospital mortality was lower in the group with fixation (15 vs. 35 %, respectively; p = 0.042).

Frozen abdomen

Delayed primary fascial closure rate was, by definition, zero in patients with frozen abdomen (n = 5). However, permanent abdominal wall closure using a mesh was achieved in all four surviving patients. Permanent abdominal wall closure rate was similar to that of the other patients (p = 1.0), as were mortality during OA therapy (p = 0.55) and in-hospital mortality (p = 1.0).

Enteric leak

Twenty-four patients had an enteric leak at the initial OA laparotomy. Source control was achieved in all patients and grade C was changed to B or A at the next operation. Ten patients (two of the 24 plus a further eight patients) developed an enteric leak during OA therapy (Appendix 3). Four were successfully treated and two of them survived to be discharged from hospital, while the other six developed an EAF. Patients with enteric leaks (n = 32) had similar delayed primary fascial closure rate (p = 0.23) and permanent abdominal wall closure rate (p = 0.052) compared to the other patients. Mortality during OA therapy was higher (p = 0.001) but in-hospital mortality was similar (p = 0.15). For the patients who developed a new enteric leak (n = 10), delayed primary fascial closure rate (p = 1.0) and permanent abdominal wall closure rate (p = 1.0) were similar but both mortality during OA therapy (p = 0.001) and in-hospital mortality (p = 0.001) were higher.

Enteroatmospheric fistula

Of the ten patients who developed an enteric leak during OA therapy, six went on to have an established EAF (Appendix 3): three after unsuccessful surgical treatment and three after conservative treatment from the start. Four established EAFs were treated with passive drainage and eventually became frozen abdomens. Two were actively treated and remained non-frozen, one of which was ultimately turned into a controlled enterocutaneous fistula (ECF). None of the six EAF patients survived to be discharged from hospital. Both mortality with OA (p = 0.004) and in-hospital mortality (p = 0.001) were higher for patients with EAFs but there was no difference in abdominal closure rate per protocol (p = 1.0 for both delayed primary fascial closure and permanent abdominal wall closure).

Floor and ceiling effect

A floor effect was observed, with 22 of 111 patients (20 %) having grade 1A as the most complex grade received throughout the OA period. Out of all 753 operative reports, 459 (61 %) were grade 1A. Six patients (5 %) received the most complex grade, indicating absence of a ceiling effect.
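The floor and ceiling check reduces to two proportions compared against the 15 % threshold stated in the Statistics section. A minimal sketch using the patient counts reported above (function and key names are illustrative):

```python
THRESHOLD = 0.15  # >15 % of patients at an extreme of the scale [20]


def floor_ceiling(n_lowest: int, n_highest: int, n_total: int,
                  threshold: float = THRESHOLD) -> dict:
    """Flag a floor effect (lowest grade) and a ceiling effect (highest grade)."""
    return {
        "floor": n_lowest / n_total > threshold,
        "ceiling": n_highest / n_total > threshold,
    }


# 22 of 111 patients never exceeded grade 1A; 6 of 111 reached the most complex grade.
result = floor_ceiling(n_lowest=22, n_highest=6, n_total=111)
print(result)
```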

The 2013 modification of the OA classification system

The comparison of the results according to the current 2013 version and the former 2009 version of the OA classification showed that 14 out of 111 patients (13 %) received one or more different grades (Appendix 4), while 97 had identical grades in both systems.

In the 2009 version, enteric leaks, EAFs and ECFs were mixed together. Former grade 3 was defined as “OA complicated by fistula formation”, i.e. enterocutaneous or EAF, while a fistula in combination with a frozen abdomen was defined as grade 4. Using the 2013 classification, these categories are separated (Appendices 3 and 4). The difference in clinical outcome for patients with enteric leaks and EAFs is presented above.

In the 2009 version, frozen abdomen, EAFs and ECFs were mixed together. Former grade 4 was defined as “frozen OA with adherent/fixed bowel, unable to close surgically, with or without fistula”. Using the 2013 classification, these categories are separated (Appendices 3 and 4). Clinical outcome was compared between patients with a clean, frozen abdomen (n = 5) and patients with EAFs (n = 6) in our patient cohort. Delayed primary fascial closure rate was lower for patients with frozen abdomen (zero, by definition), whereas definitive abdominal wall closure rate was similar (p = 1.0). Mortality during OA therapy was 66 % in the EAF group and 20 % in the frozen abdomen group (p = 0.24), but in-hospital mortality was significantly higher for EAF patients (100 vs. 20 %; p = 0.015).
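With groups this small, Fisher's exact test is the relevant comparison, and it can be computed by hand from the hypergeometric distribution. The sketch below assumes the death counts implied by the reported percentages (in hospital: 6 of 6 with EAF and 1 of 5 with frozen abdomen; during OA therapy: 4 of 6 and 1 of 5) and the common two-sided convention of summing all tables that are no more probable than the observed one. Which convention SPSS applied is not stated in the text.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def weight(x: int) -> int:
        # Unnormalised hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    # Integer weights allow an exact "as or less probable" comparison.
    extreme = sum(weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed)
    return extreme / comb(total, col1)


# Rows: EAF, frozen abdomen. Columns: died, survived (counts inferred from percentages).
p_in_hospital = fisher_exact_two_sided(6, 0, 1, 4)
p_during_oa = fisher_exact_two_sided(4, 2, 1, 4)
print(round(p_in_hospital, 3), round(p_during_oa, 2))
```

Under these assumptions the results round to the p values reported above (0.015 and 0.24), which suggests the inferred counts are consistent with the published figures.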

When all 753 operative reports were compared, a difference was seen in 33 (4 %). The intra-class correlation coefficient between the current 2013 and former 2009 version was 0.93 (95 % CI 0.92–0.94).

Reliability

Inter-rater reliability

Inter-rater reliability is shown in Table 5. Agreement was calculated between each pair of raters (rater 2 vs. 3, 2 vs. 4 and 3 vs. 4) and was found to be ‘good’ to ‘very good’. There was no difference in agreement when the least complex operative reports (grade 1A) were excluded. Agreement between all three raters was seen in 61 (56 %) of the operative reports and total disagreement in 1 of 108 (1 %).

Table 5 Inter-rater and test–retest analysis of the open abdomen classification by WSACS

Test–retest reliability

The test–retest reliability (repeatability) is shown in Table 5. Agreement was found to be ‘very good’ for all raters. Agreement between all three raters simultaneously was the same in the retest as it was in the first test (56 %).

Discussion

This is the first methodological evaluation of the OA classification system since the original publication by Björck et al. [15] in 2009. In 2013, an updated version was proposed by WSACS [18], adjusting the definition and hierarchy of enteric fistulas. We now present a validity and reliability analysis based on a large group of patients treated with an OA, consisting mostly of elderly, non-trauma patients with visceral or vascular surgical disease. With an early evaluation of the updated version of the OA classification system and the development of detailed instructions for use, we hope that the results may be of benefit for future application of the system—in upcoming studies or in clinical practice.

The OA classification had not been published when the VAWCM study was initiated in 2006 [10] and was not included in the study protocol. Consequently, these data had to be extracted retrospectively, probably causing over-representation of lower OA grades, since operative reports that did not mention, for example, fixation or contamination were registered as if these were absent. Prospective registration is recommended in future studies, with the OA grade registered at the end of each surgical procedure.

When initially grading the operative reports, we noted that many clinical scenarios were not straightforward with regard to the classification, requiring the rater’s own interpretations of the terms used in the classification. For example, should adhesions around stomas, from earlier operations or those released during the same operation be registered as developing fixation? Should bowel necrosis, wound infections, or urinary tract perforations be registered as contamination? Should a leakage from a gastrostomy entry or from an excluded rectal stump, a perforated hepaticojejunostomy or a perforated Bricker urostomy be registered as contamination, enteric leak, or fistula? After discussions between the authors, detailed definitions of terms and instructions for use of the classification system were constructed. These were carefully read by the three independent raters before the reliability analysis.

The OA classification system was designed to describe the clinical course of the OA itself and not the prognosis in general. Consequently, it is perhaps not entirely fair to use mortality as a parameter in the validity evaluation. However, due to the high morbidity and mortality in this group of patients, we felt it reasonable to evaluate the OA classification in the broader clinical perspective, i.e. to include mortality as well as fascial closure.

The results of the validity analysis were somewhat conflicting. On one hand, more complex OA grades did indeed correlate with worse clinical outcome, indicating high validity of the OA classification. On the other hand, no correlation was found between clinical outcome and initial grade, 1A only, contamination, fixation or frozen abdomen, indicating low validity. Moreover, the floor effect, with a large number of patients belonging to the lowest grade and being indistinguishable from each other, reduced the validity of the OA classification system further. However, this lack of strong association between OA grades and clinical outcome, as well as the floor effect, does not necessarily indicate poor validity of the OA classification in general. When the OA classification was first published in 2009 [15], OA therapy resulted in frozen abdomens and/or fistulas in the majority of patients. Under such circumstances, the scores would have been more evenly distributed on the scale. With modern methods of OA management such as VAWCM and the possibility of early mesh reconstruction, fascial closure can be achieved in most patients, regardless of initial contamination, developing fixation or frozen abdomen, which are among the main parameters of the OA classification.

The inter-rater analysis might also be affected by this homogeneity of patients (with many patients in the lowest category and very few in the highest grades), i.e. causing higher agreement. Due to the time-consuming nature of the rating process, only a part of the operative reports could be analyzed by the external raters. After consulting a statistician, it was decided that the sample should contain OA grades in the same proportion as the whole group. While statistically correct, the sample contained very few operative reports belonging to the most severe grades. Rating of such reports would probably have resulted in more diversity between raters. On the other hand, high inter-rater and test–retest agreement could also demonstrate the advantage of precise definitions of terms and detailed instructions for use, presented to the external raters before the analysis.

The definition of an enteric fistula, according to the original 2009 version of the OA classification, comprised both ECFs and EAFs, whereas these are separated in the updated 2013 version. This is supported by our clinical observations, in that these conditions represent very different clinical scenarios: an EAF in the middle of exposed bowel loops is extremely difficult to manage, while an ECF can be controlled with a simple stoma bag. In fact, a few EAFs in this study were treated by externalization into a controlled ECF outside of the open abdomen. Our interpretation that the definition of grade 4 should include EAFs not only in a frozen abdomen but regardless of fixation status is supported by these findings. Separating frozen abdomens from EAFs in the updated 2013 version, with EAF now being the most serious OA grade, is also supported by our own findings. Patients who develop EAFs have the poorest outcome, in contrast to those with a frozen abdomen, who have quite a favorable outcome. The concept of enteric leak was introduced in the amended 2013 version of the OA classification. In contrast to an EAF, an enteric leak has not become permanent and has the possibility of immediate surgical repair. Enteric leaks in our cohort were previously (according to the 2009 version) registered as either contamination or enteric fistula. Moreover, several non-permanent enteric perforations were registered as fistulas, as well as a few non-enteric fistulas. This potential variation in the interpretation of fistulas might result in large differences in the reporting of fistula incidences in similar patient cohorts. This issue of uncertainty has now been resolved with the possibility of classifying intermediate states as an enteric leak.

Conclusion

In this validity and reliability analysis of the 2013 OA classification system by WSACS, some variables of the classification (most complex grade, deteriorating grade, grade C, grade 4) were found to be associated with worse clinical outcome, while others (initial grade, 1A only, contamination, fixation, frozen abdomen) were not. The VAWCM technique generated favorable clinical results and pooling of patients in the lower OA grades, resulting in a floor effect in the validity analysis. Written instructions for use, together with prospective registration, are essential to achieve high reproducibility between users of the OA classification. Every effort should be made to prevent patients from ascending to a more complex OA grade, to try to repair enteric leaks and to avoid EAFs.