Introduction

The abdominal wall is a morphological and structural entity, which stabilizes the intra-abdominal organs and increases intra-abdominal pressure by muscle contraction.

An abdominal press is essential for crucial body functions such as breathing, coughing, and defecation. An incisional hernia interferes with the function of the abdominal wall and has an incidence of 7–15% after laparotomies [1, 2]. Post-operative complications and long-term outcomes of incisional hernias vary depending on risk factors, morphology, and size of the hernia as well as the surgical technique used [3]. Retromuscular mesh implantation (sublay technique according to Rives–Stoppa) is one of the standard procedures to treat incisional hernias [4,5,6]. In open repair of median subxiphoidal incisional hernias, insufficient preparation of the fatty triangle followed by incorrect mesh placement without cranial retroxiphoidal extension causes recurrent herniation and consequently affects patients’ quality of life [7,8,9].

Experience, knowledge of anatomy, and dexterity are considered the most important predictors of good outcomes. In this context, the axiom primum non nocere requires young surgeons to “train before doing” rather than practice on anesthetized patients [10]. Despite the economic and ethical demands that surgical education is confronted with, simulation models have begun to augment surgical training possibilities and correlate positively with surgical outcome [11,12,13]. In a recent publication, the authors developed and validated a silicone-based model for umbilical hernia repair with mesh in the preperitoneal position [14]. Nevertheless, high-fidelity models for incisional hernia repair have not yet been described.

In this study, we developed a new silicone-based single-use full procedural model for incisional hernia repair with retromuscular mesh reinforcement. Data for construct-validity (the ability of a model to adequately reflect performance of beginners when compared to experts) and criterion validity (autopsy data) [15] are presented. To address the question whether protocols used for performance assessment adequately measure and discriminate different grades of performance, reliability was calculated.

Material and methods

Description of the high-fidelity model and the surgical procedure

The model was developed and produced by the Division of Hernia Repair and Abdominal Wall Reconstruction of the Department of General, Visceral, Vascular and Pediatric Surgery of the University Hospital of Wuerzburg and the Institute of Medical Teaching and Medical Education Research of the University of Wuerzburg. The elements used to construct the model were different types of fabrics imitating connective tissues, resins for bone casting, 2-component silicones of different grades of strength and stickiness, artificial blood, and different pigments. Artificial blood chambers were implemented so that the model bleeds when incised. A well-proportioned male volunteer served as the template to ensure that the anatomical proportions of the model were adequately reproduced.

Simulation of the surgical procedure included the following steps: (1) incision of the skin (Fig. 1a); (2) preparation of the subcutaneous fatty tissue up to exposure of the hernial sac and the linea alba at the midline; (3) opening of the hernial sac (without damaging the imaginary adherent bowels) and incision of the linea alba, from the xiphoid bone (cranially) to the pubic symphysis (caudally); (4) identification of the medial borders of the rectus muscles and longitudinal opening of the anterior rectus sheath without damaging the posterior rectus sheath (Fig. 1b); (5) lateral mobilization of the rectus muscle without damaging the epigastric vessels, which were loaded with artificial blood; (6) preparation of the fatty triangle according to Conze et al. [7] (Fig. 1c, d); (7) closure of the posterior rectus sheath with a running suture in small-bites technique with invagination of the hernial sac remnants [16] (monofilament 2-0 USP suture) (Fig. 1e); (8) positioning of the 30 × 14 cm mesh (e.g. commercially available, large-pore, non-absorbable) on the posterior rectus sheath so that it undergirds the xiphoid bone in the area of the fatty triangle as well as the pubic symphysis; this mesh position is known as the retromuscular position (syn.: sublay position or Rives–Stoppa procedure) [4, 6,7,8]; (9) fixation of the mesh (braided 2-0 USP suture) (Fig. 1f); (10) closure of the anterior rectus sheath with a small-bites running suture (Fig. 1g); (11) closure of the skin with a vertical mattress suture according to Donati [17].

Fig. 1

Repair of a median xipho-pubic incisional hernia on the high-fidelity model. a Incision of the skin; b opening of the rectus sheath on its medial border; c preparation of the fatty triangle by insertion of the posterior rectus sheath at its attachment to the xiphoid bone; d depiction of the completed preparation at the fatty triangle level (yellow lines: detached insertion of the posterior rectus sheath from the xiphoid bone; red stars: vertices of the triangle at the beginning of the linea alba); e suture-closure of the posterior rectus sheath; f mesh fixation after positioning it on the posterior rectus sheath in retromuscular position, cranially behind the xiphoid bone and caudally behind the pubic symphysis; and g suture-closure of the anterior rectus sheath (color figure online)

The individual components of the model (10 specific anatomical structures out of 9 different basic materials in 5 variations) were prepared in advance, allowing for the construction of four models per day, for an estimated material cost of €90.00.

Trainees, raters and assessment-tools

Medical students in their internship year with an interest in surgery were recruited as beginner trainees, whereas general surgeons were recruited as experienced trainees. Each participant performed one operation, assisted by a third-party student enrolled in a surgical rotation. The assistant was allowed to help the participant according to the participant's instructions. Every surgery was video recorded for further evaluation and supervised by the coordinator of the study.

The video recordings of each procedure were pseudonymized before being evaluated by three blinded, independent raters. The raters followed standardized criteria using a modified version of the Competency Assessment Tool (CAT), adapted to the peculiarities of the model [18]. The CAT is based on the “Operative Performance Rating System” (OPRS), used by the American Board of Surgery for certification of residents. It is an assessment tool for surgical performance and consists of four categories of procedural skills: instrument use, tissue handling, near misses and errors, and end-product quality [19,20,21].

The three raters had the following qualifications: one rater was familiar with the model but had little experience with the surgery in vivo (PhD Student in charge of the development of the model), the second rater was familiar with the surgery but not the model (general surgeon with known expertise in incisional hernia repair), and the third rater had comprehensive knowledge of the model as well as the incisional hernia repair (creator of the model and acknowledged hernia specialist). Video-recordings were rated using a web-based software, engineered for this purpose by the Institute for Artificial Intelligence and Applied Informatics (VI) of the University of Wuerzburg (CATLive). The algorithm for rating the videos is shown in Fig. 2.

Fig. 2

Web-based video assessment-software (CATLive)

Autopsy data of the model were rated in terms of esthetics, preparation of the fatty triangle, suture of the linea alba, the suprapubic area, integrity of the rectus sheath, lateral mesh overlap, mesh position at the symphysis, and mesh fixation. Rating was conducted using a standardized questionnaire with a scale ranging from 4 = excellent to 1 = insufficient. We expected significant differences in the model's autopsy data between beginners and experts regarding the fatty triangle.

Statistics

The required sample size of trainees for each group was not determined a priori. Instead, a sequential triangular test was used [22, 23]. The triangular test is statistically extremely robust and offers the opportunity to terminate testing at the optimal rather than a fixed number of participants [24, 25]. This is beneficial when trials are cost-intensive, as with the model described. Three results are possible in triangular testing: (1) H1 is true, there is a difference between beginners and experienced participants, and the line leaves the grey area in favor of the area marked as H1; (2) H0 is true, there is no difference between beginners and experts, and the line leaves the grey area in favor of the area marked as H0; or (3) the data at hand do not allow a decision on whether H1 or H0 is true, the line remains in the grey area, and testing has to be continued with more participants (Fig. 3).

Fig. 3

Example of three possible outcomes in triangular testing: a testing reached significance (green point leaves the gray triangle in the direction of the H1 hypothesis area) after inclusion of the first three cumulative results. In this case, the trial can be terminated with no additional participants required. b after three trials, the H1 hypothesis can be rejected since significance cannot be reached (red point leaves the gray triangle in the direction of the H0 hypothesis area); in this second example, testing can be terminated as well, but with the conclusion that there is no significant difference between participants. As long as trials remain in the grey area, testing needs to be continued to determine whether H0 or H1 is true (color figure online)
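The three possible outcomes can be illustrated with a simple boundary check. The sketch below is not the implementation used in this study; it assumes one common textbook parameterization of the triangular test with equal type I and type II error rates, in which a cumulative score statistic Z is plotted against the accumulated information V and compared with two straight boundaries. The names `theta_r` (reference effect size), `z`, and `v`, as well as all numeric values, are illustrative assumptions, and the correction usually applied for discrete monitoring is omitted.

```python
import math


def triangular_boundaries(theta_r: float, alpha: float = 0.05):
    """Return intercept a and slope c of the boundaries (assumed parameterization).

    Upper boundary: Z = a + c * V      (crossing it favors H1)
    Lower boundary: Z = -a + 3 * c * V (crossing it favors H0)
    """
    a = 2.0 * math.log(1.0 / (2.0 * alpha)) / theta_r
    c = theta_r / 4.0
    return a, c


def decide(z: float, v: float, a: float, c: float) -> str:
    """Classify one interim look as 'H1', 'H0', or 'continue'."""
    upper = a + c * v
    lower = -a + 3.0 * c * v
    if z >= upper:
        return "H1"        # line leaves the triangle toward the H1 area
    if z <= lower:
        return "H0"        # line leaves the triangle toward the H0 area
    return "continue"      # still inside the triangle: add participants


# Hypothetical interim looks, not data from this study.
a, c = triangular_boundaries(theta_r=1.0, alpha=0.05)
for z, v in [(0.0, 4.0), (6.0, 4.0), (-2.0, 4.0)]:
    print(z, v, decide(z, v, a, c))
```

The two boundaries meet at V = a / c, which is why the continuation region is a triangle and the test is guaranteed to stop after a bounded amount of information.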

Cronbach’s α was used to assess scale reliability; values > 0.70 were considered good [26]. In cases of low variance between raters, α could not be computed and the Finn coefficient was used instead. To investigate construct validity, the Welch test was used to detect differences in mean ratings.

The review and ethics board of the Medical Faculty of the University of Wuerzburg was consulted and it did not consider an approval necessary, since the study protocol was not deemed to represent biomedical or epidemiological research (Protocol No. 20161013 02). Consent to video-recording of the procedure without the possibility of personal identification was obtained at the beginning of the study.

Results

The evaluation criterion “instrument use” (Fig. 4a) showed excellent internal reliability (α = 0.969); testing could be terminated after four trials (eight participants) in favor of H1. The evaluation of “tissue handling” (Fig. 4b) showed excellent internal consistency (α = 0.974) as well; testing could be terminated after three trials (six participants) in favor of H1. A similar result was found for the evaluation criterion “near misses and errors” (α = 0.811), where H0 could be rejected after three trials (Fig. 4c). For the evaluation of “quality of the end product”, the α-value was 0.883, with H0 being rejected after three trials (Fig. 4d).

Fig. 4

Triangular-test for expected discriminatory significance between beginners and experts, according to the four evaluation criteria of the CAT (construct-validity). For all the four criteria, H0 could be rejected after 3–4 cumulative tests: a instrument use; b tissue handling; c near misses and errors; and d end-product quality

Beginners were younger than experts (mean age 27 vs. 36 years). Five participants were female and seven were male. Mean time of surgery was 113 min (± 12) for beginners and 56 min (± 18) for experts. Mean time for rating each video was 17 min (± 5). All operations were conducted as planned, and no problems occurred with the material employed.

Reliability and construct validity: procedural skills (covering all four criteria) were measured reliably, and the overall α-value was extremely high (α = 0.990). The α-value was 0.969 for “instrument use”, 0.974 for “tissue handling”, 0.811 for “near misses and errors”, and 0.883 for “end-product quality”. Table 1 summarizes the reliability values.

Table 1 Validation of the questionnaire for the evaluation of construct-validity (CAT)

For all evaluated criteria, significant differences were found between beginners and experts (Fig. 5).

Fig. 5

Construct-validity: average scores achieved in the scales to measure procedural skills “instrument use”, “tissue handling”, “near misses and errors”, and “end-product quality”. The scores are represented as mean ± standard deviation, as evaluated online by the three independent raters. ***p < 0.001 (Welch-test)

Criterion validity: for autopsy data, a significant difference was found for the fatty triangle. On the scale from 4 (excellent) to 1 (insufficient), beginners achieved a mean rating of 2.83 (SD = 0.41), whereas experts received a mean rating of 2.17 (SD = 0.41); p < 0.05. Experts and beginners reached comparable scores on the remaining evaluation criteria (Fig. 6).
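As a plausibility check, the Welch statistic for the fatty triangle can be recomputed from the reported means and standard deviations. The sketch below assumes six participants per group, which is inferred from the twelve participants reported and is not stated explicitly; the function name `welch_from_summary` is illustrative.

```python
import math


def welch_from_summary(m1, s1, n1, m2, s2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom
    computed from group means, standard deviations, and sizes."""
    v1 = s1 ** 2 / n1          # squared standard error, group 1
    v2 = s2 ** 2 / n2          # squared standard error, group 2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Reported values: beginners 2.83 (SD 0.41), experts 2.17 (SD 0.41).
# n = 6 per group is an assumption, see the text above.
t, df = welch_from_summary(2.83, 0.41, 6, 2.17, 0.41, 6)
print(round(t, 2), round(df, 1))   # prints: 2.79 10.0
```

Under these assumptions, t ≈ 2.79 with 10 degrees of freedom exceeds the two-sided 5% critical value of 2.228, which agrees with the reported p < 0.05.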

Fig. 6

Criterion validity (results of the autopsy for the operated models): beginners reached a significantly higher score at positioning the mesh behind the fatty triangle than experts (*p < 0.05). See explanation in the discussion below

Discussion

Realistic anatomic models to learn complex multi-step procedures and to practice technical skills have become increasingly important. Whereas an entire framework for the development of knowledge, skills, and attitudes exists for other surgical procedures such as laparoscopic cholecystectomy, research on the design of a surgical simulation curriculum in the field of hernia repair is just beginning to emerge [27]. Recently, a prototype of an umbilical hernia repair model for open mesh repair in the preperitoneal plane was described and validated. In contrast to the current protocol, which focuses on outcome criteria, the umbilical hernia repair model was designed to measure the progression of skills in sequential procedures [14]. The high-fidelity model proposed here is considered an improvement on the latter, since it introduces several levels of morphological complexity and challenges experienced surgeons as well. It was designed for an open retromuscular mesh repair following the advice of the technique's creator, the French surgeon Jean Rives [4]. A recent meta-analysis comparing open retromuscular repair with laparoscopic procedures has not only confirmed its safety regarding recurrence, but has also revealed that the open repair is associated with more perioperative complications [6]. These findings emphasize the importance of selecting the right patients as well as continuously refining skills and strategies to reduce the complication rate of the surgery. According to Conze et al., adequate preparation of tissues at the subxiphoidal fatty triangle is one of the most demanding parts of median retromuscular mesh reinforcement, which is why the evaluation of this specific outcome in the present model is so important [7, 8].

The aim of the study was not to investigate the learning process itself. By comparing beginners to experts, the focus lies on surgical outcome criteria, addressing the question of what was learned, not how it was learned. To investigate how the motor skills required for incisional hernia surgery with mesh implantation are learned, a repeated-measures design would be required [28]. Other teaching methods, for example the four-step approach according to Peyton [29] or the implementation of self-explanation prompts [30, 31], may shed further light on the cognitive processes that underlie the acquisition of complex forms of knowledge and skills [32].

Cronbach's α was used to assess scale reliability; values > 0.70 were considered good [26]. Cronbach's α can be compared to the more commonly used correlation coefficient. Like the correlation coefficient, Cronbach's α is an effect size, created to measure the associative strength of items. The rationale of the coefficient is that items which measure an identical construct should have a high “correlation”, referred to as “internal consistency” or “reliability”. The CAT scale “tissue handling”, for example, consists of three rating-scale items: “carefulness of dissection of the sac”, “dissection technique for entry into the rectus sheath”, and “use of non-dominant hand”. To make valid statistical inferences when referring to these three scale items as “one scale”, they should “correlate” highly (good internal consistency). In cases with low variance between the scores attributed by the three raters, the α-value could not be computed and the Finn coefficient was used instead. It can be interpreted in the same way as Cronbach's α.
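The rationale can be made concrete with the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the total score), where k is the number of items. The following is a minimal sketch with invented scores, not the study data; the function name `cronbach_alpha` and the item lists are hypothetical. It also shows why α cannot be computed when the scores do not vary: the total-score variance in the denominator becomes zero.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items (or raters).

    `items` is a list of k lists; each inner list holds one score per
    rated procedure, in the same order for every item.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(items[0])
    if n < 2 or any(len(col) != n for col in items):
        raise ValueError("each item needs the same number (>= 2) of scores")
    totals = [sum(scores) for scores in zip(*items)]   # total score per procedure
    total_var = variance(totals)
    if total_var == 0:
        # No variance in the total score: alpha is undefined.
        raise ValueError("alpha is undefined when total scores do not vary")
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Invented ratings for the three "tissue handling" items over five procedures.
dissection_of_sac = [1, 2, 3, 4, 4]
entry_into_sheath = [1, 2, 3, 3, 4]
non_dominant_hand = [2, 2, 3, 4, 4]
print(cronbach_alpha([dissection_of_sac, entry_into_sheath, non_dominant_hand]))
```

If all items rank the procedures identically, α equals 1; the more the items disagree, the lower the value.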

The design of the current model offers many advantages. It meets the criteria defined for a full-procedural simulation model, enabling trainees to perform an entire surgery rather than training isolated steps [33]. The realistic surface feel of the silicone model, combined with the reliable web-based assessment tool (CAT), lends itself to implementation in an augmented reality setting [34]. Presently, we are investigating computer environments in combination with mathematical algorithms to model various scenarios. In combination with the CAT, the model enabled us to reliably measure learning gain and technical skills, two crucial aspects for individual evaluation and feedback. In the future, it may be used not only for training but also for assessment of performance. Young trainees (beginners) can be assessed to determine whether they have the necessary technical skills before being allowed to operate on a patient under supervision. Experienced surgeons (experts) can be updated on new evidence regarding a surgical technique (for example, the preparation at the site of the fatty triangle) or can assess whether their longtime idiosyncrasies regarding technical details conform to modern standards. The latter is of utmost importance for experienced surgeons. New evidence is very difficult to implement into daily routine [35]. Implementing new evidence against the “positive longtime experience of how-I-do-it” of experienced surgeons may be even more challenging. The paradoxical finding that beginners outperformed experts in the most difficult aspect of the surgery, namely preparation of the fatty triangle, may illustrate the problem. We assume that the experts were too comfortable with their own way of doing it and did not pay enough attention to the standardized instructions but relied on their experience instead (“I already know how to do it.”).
As a consequence, they were blind to the new task, a cognitive bias that can probably be explained by the Dunning–Kruger effect, which describes the gap between self-confidence (perceived ability) and performance (actual score) in the average population [36]. Other researchers have attributed these types of errors to subconscious mechanisms (in addition to intuition and thinking) accompanying an action (decision making). This may help to explain how surgeons can commit technical errors (for example bile duct injury or, in our case, inadequate preparation of the fatty triangle) despite seeing the error occur [37]. It is possible that teaching inexperienced minds is easier than changing established behaviors. In this regard, the validation study presented may contribute to the development of new strategies in continuing education. Considering the results and these remarks, the proposed high-fidelity model proved suitable for experts as well as beginners to perform a complete incisional hernia repair with mesh implantation in the retromuscular position. In the future, updating experts will probably have to include self-assessment to reflect on the “unconsciously established how-I-do-it” and contrast it with the new “how it should be done” [38].

Since incisional and ventral hernia repair is becoming more and more tailored to the individual patient [6, 38], further models for laparoscopic repairs including robotic procedures, and also for transversus abdominis release are required [6, 39,40,41]. The high-fidelity model described here is considered a contribution to this development.

Conclusion

The model is a full-procedural model that thoroughly mimics open incisional hernia repair with retromuscular mesh implantation and preparation of the fatty triangle. In combination with the CAT, it can be used for training as well as assessment. Future studies will investigate the simulation of high-risk scenarios by altering the model and implementing it in an augmented reality setting.