The challenging new skills needed during laparoscopic surgical procedures require the surgeon to overcome a steep learning curve. Medico-legal concerns and time constraints nowadays control operating room schedules, so the best way to overcome this learning curve seems to be during formal training in a skills laboratory [1]. Because of the complex nature of laparoscopic suturing, numerous curricula have been developed to optimize skill acquisition by implementing known strategies from educational theories. Distributed [2], structured [3] practice using performance-based endpoints [4] has now become the method of choice for teaching laparoscopic suturing using simulation, with excellent results in terms of retention of skill and in vivo transferability [5, 6].

Furthermore, many educational theories address the need for expert feedback [1]. Any kind of motor skill learning can be divided into three phases [1, 7, 8]: the cognitive phase, where the trainee tries to understand the different steps of the task; the associative phase, where he/she practices the skill, integrating the knowledge of the task into the appropriate motor behavior; and finally the autonomous phase, where the skill is performed without cognitive awareness. It is especially during the associative phase that feedback is felt to play a major role. Feedback is performance-related information that is either obtained by the trainee through his/her own sensory system (intrinsic feedback) or provided by an external source (extrinsic feedback) [8, 9]. This extrinsic feedback, traditionally given by an expert, is important to motivate trainees to keep improving their performance and also to provide them with solutions on how to do this [8–10].

Of course, distributed training under expert supervision requires considerable faculty time commitment and is subject to scheduling conflicts. Therefore computer-based video training (CBVT) is emerging as a new means to circumvent some of these logistic problems by reducing geographic and temporal constraints on training and decreasing demands on faculty members [8, 10, 11]. Likewise, it has been suggested that collaborative learning using peers to provide external feedback might be a valid alternative [12]. On the other hand, others claim that peers may not sufficiently master the cognitive aspects of the task to assess their partners and might not be able to provide instructions for correction [7]. A thorough cognitive instruction session, structured CBVT, and the use of benchmark criteria might overcome these drawbacks.

In this study we wanted to investigate whether a combination of CBVT and peer feedback could replace external feedback by an expert during proficiency-based laparoscopic suturing training, without increasing the length of the learning curve and resulting in an equally long-lasting high-quality performance.

Materials and methods

Participants

The study population consisted of senior medical students. Only pre-residency interns from the surgical disciplines (general surgery, urology, plastic surgery, and orthopedics) were accrued. A questionnaire concerning demographic information (age, gender, and dexterity), prior surgical or camera navigating experience (number of procedures), and simulator or video game experience (on a ten-point Likert scale) was administered. Psychomotor innate ability was measured as the average time score (in seconds) on three trials of the bean drop and running string exercises on the box trainer [13]. After this baseline testing all subjects completed our basic laparoscopic skills curriculum: daily training on four Southwestern box trainer tasks [13] and one laboratory-specific task [14] until predetermined proficiency criteria were reached.

Study design (Fig. 1)

All research was conducted at the Centre for Surgical Technologies, KU Leuven, Belgium. After baseline testing and accomplishment of the basic laparoscopic skills curriculum, all subjects attended a 1-h hands-on instruction session about intracorporeal suturing and knot tying. Video instructions were combined with live expert demonstrations and feedback. This introduction session was organized in order to allow trainees to capture the cognitive part of the procedure and familiarize themselves with the equipment. Following this initial instruction session, all subjects were pretested on their suturing and knot-tying skills (one trial of the suturing and knot tying exercise). Based on this pre-test score and baseline testing results (demographic features and innate psychomotor ability) students were divided into two balanced groups: an experimental (CBVT + peer feedback) and a control (expert feedback) group.

Fig. 1

Study design. Dotted squares indicate training sessions that included trial scoring

All students attended four additional daily practice sessions of 60 min in intracorporeal suturing and knot tying. They trained individually (one student per box trainer with fixed laparoscope) and according to the training condition assigned. Starting from the second training session, students evaluated every suturing and knot-tying trial according to a scoring system including time and errors (described below). During the instruction session and the first training session, trials were not evaluated, allowing students to train without performance pressure. Training was completed when the proficiency criterion was reached (previously determined expert level on two consecutive attempts) and five additional trials were performed for reinforcement. Only when students were not able to reach proficiency during these 4 h was additional deliberate practice (without CBVT or feedback) allowed. One week after training completion students performed a post-test (mean of three trials of the suturing and knot-tying exercise). A delayed retention test (mean of three trials of the suturing and knot-tying exercise) was held after a 4-month rest period. Students were allowed to watch the instructional video once more before these test moments. After the retention test students performed ten additional suturing and knot-tying trials to evaluate their ability to re-achieve the proficiency criterion. During pre-, post- and retention testing, the evaluation (time and error) of the knots was performed by a research fellow.
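The stopping rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' scoring software: the function names are hypothetical, the example scores are invented, the 145 s expert level is taken from the Results, and it is assumed that a trial "reaches" expert level when its overall score is at or below that value (lower scores being better).

```python
from typing import Optional, Sequence

EXPERT_LEVEL_S = 145.0     # expert level reported in the Results (overall score, in s)
CONSECUTIVE_REQUIRED = 2   # proficiency = expert level on two consecutive attempts
REINFORCEMENT_TRIALS = 5   # additional trials performed after reaching proficiency


def trial_reaching_proficiency(
    scores: Sequence[float],
    expert_level_s: float = EXPERT_LEVEL_S,
    consecutive: int = CONSECUTIVE_REQUIRED,
) -> Optional[int]:
    """Return the 1-based trial number at which proficiency is reached.

    Assumption: a trial counts as expert level when score <= expert_level_s.
    Returns None if the criterion is never met.
    """
    run = 0
    for trial_number, score in enumerate(scores, start=1):
        run = run + 1 if score <= expert_level_s else 0
        if run == consecutive:
            return trial_number
    return None


def total_trials_in_training(scores: Sequence[float]) -> Optional[int]:
    """Trials to proficiency plus the five reinforcement trials."""
    reached = trial_reaching_proficiency(scores)
    return None if reached is None else reached + REINFORCEMENT_TRIALS


# Hypothetical scores for one student (not study data).
example_scores = [300.0, 200.0, 140.0, 150.0, 144.0, 139.0]
```

With these invented scores, the single expert-level trial at position 3 does not count because the next trial is slower; the criterion is met only once two expert-level trials follow each other.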

Suturing task model

A Penrose drain model was created by fixing two 15.0-cm-long Penrose drain pieces to a cork plate with thumb nails. The two pieces of Penrose drain were fixed 1.5 cm from each other in order to simulate tissue under traction. Targets were marked in blue ink every 2.0 cm on each piece of Penrose drain. The trainees had to penetrate the first piece of Penrose drain, reposition the needle, and penetrate the second piece of Penrose drain. Afterwards, a sliding knot using the C-loop technique as described by Szabo et al. [15] was created.

Performance measure

Each trial of the exercise was assessed quantitatively (time) and qualitatively (error score). Completion time was obtained using a stopwatch. The exercise started when both instruments were inside the box trainer and the assisting grasper was holding the suture material. Time was stopped when both suture tails were cut to 1 cm by the laparoscopic scissors. The error score was adapted from a previously described scoring system [5]. First, one penalty point was given per millimeter between the suture and the premarked targets, and per millimeter gap between the two pieces of Penrose drain after the knot was tied. Furthermore, after tying the knot, the two pieces of Penrose drain were pulled in opposite directions. A knot that slipped open before the Penrose drain ruptured was assigned one additional penalty point. The error score was determined by summing these penalty points and multiplying them by a factor of 10. Overall score was obtained by adding the error score to completion time. The maximum score was set at 600 s. In this way, the penalty for an insecure knot in our study might appear less severe compared with previous data (one instead of ten penalty points [5]). However, under the traction of our Penrose drain model, an insecure knot automatically loosened, thereby increasing the error score. Expert level was defined as the mean performance score (outliers excluded) of ten trials of two expert laparoscopists [5].
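The scoring rule above can be written out as a short Python sketch. It is illustrative only: the function and parameter names are hypothetical, the example values are invented, and the 600 s maximum is interpreted here as a cap on the overall score, which is an assumption about how the maximum was applied.

```python
MAX_SCORE_S = 600       # maximum overall score (assumed to act as a cap)
PENALTY_WEIGHT = 10     # penalty points are multiplied by a factor of 10


def error_score(target_miss_mm: float, gap_mm: float, knot_slipped: bool) -> float:
    """Error score: summed penalty points multiplied by 10.

    One point per millimeter between suture and premarked targets,
    one point per millimeter of gap between the two Penrose drain pieces,
    and one extra point if the knot slips open under traction.
    """
    penalty_points = target_miss_mm + gap_mm + (1 if knot_slipped else 0)
    return penalty_points * PENALTY_WEIGHT


def overall_score(
    completion_time_s: float,
    target_miss_mm: float,
    gap_mm: float,
    knot_slipped: bool,
) -> float:
    """Overall score: completion time plus error score, capped at 600 s."""
    total = completion_time_s + error_score(target_miss_mm, gap_mm, knot_slipped)
    return min(total, MAX_SCORE_S)


# Hypothetical trial: 130 s, 1 mm off target, no gap, secure knot.
example = overall_score(130, target_miss_mm=1, gap_mm=0, knot_slipped=False)
```

In this weighting a single penalty point costs the same as 10 s of completion time, so a trial completed in 130 s with one penalty point still falls below the 145 s expert level.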

Training conditions

Control group: Training sessions started with a repetition of the instructional, stepwise video demonstration (right-handed instructions). Afterwards, feedback was provided during the entire training session by a research fellow with extensive cognitive and hands-on practice in laparoscopic suturing and knot tying. Demonstrations by the research fellow were permitted. Feedback frequency was determined by the student (if they had questions) and as judged necessary by the research fellow. Feedback consisted of constructive ways to improve performance and followed closely the script used in the instructional video.

Experimental group: Training sessions started with a thorough comprehensive reading of a detailed written instruction and watching the same instructional, stepwise video demonstration two times. Afterwards, students performed suturing and knot-tying trials individually with continuous possibility of watching the video tutorial. Peer feedback was encouraged but only when desired by the trainee.

Statistical analysis

The difference in the number of trials needed to achieve expert performance (length of learning curve) and differences in performance scores were investigated with Mann–Whitney tests. Proportions were compared using the chi-squared test, and Spearman’s rho was calculated for correlations. Levene’s test was used to detect differences in variance between groups. All data are shown as median (range). Since several comparisons were made, only p-values <0.01 were deemed significant (correction for multiple testing).
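For readers unfamiliar with the rank-based comparison, the Mann–Whitney U statistic and the 0.01 significance threshold can be sketched as follows. This is a minimal pure-Python illustration, not the statistical software used in the study: the function names are hypothetical, the sample values are invented, and the sketch computes only the U statistic (a statistics package would be needed for the p-value).

```python
from typing import List, Sequence, Tuple

ALPHA = 0.01  # significance threshold used in this study


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(
    sample_a: Sequence[float], sample_b: Sequence[float]
) -> Tuple[float, float]:
    """Return (U_a, U_b) computed from the pooled ranks of both samples."""
    pooled = list(sample_a) + list(sample_b)
    ranks = average_ranks(pooled)
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


def is_significant(p_value: float, alpha: float = ALPHA) -> bool:
    """Apply the study's threshold: only p < 0.01 counts as significant."""
    return p_value < alpha


# Hypothetical trial counts for two small groups (not study data).
u_a, u_b = mann_whitney_u([6, 12, 15], [19, 25, 43])
```

Under this threshold, a p-value such as 0.017 is treated as borderline rather than significant, whereas 0.008 is significant.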

Results

Pre-training and post-training evaluation data were available for ten students in each group. One female student in the control group did not attend the delayed retention test because she changed university, and was excluded for this particular analysis. No significant differences were found in the baseline characteristics of the groups (Table 1).

Table 1 Demographic features

All students were able to reach expert level (145 s) on two consecutive attempts. Counting from the second training session, students reached this proficiency criterion after a median of 15.5 (6–25) trials in the experimental group versus 19 (6–43) trials in the control group (p = 0.28; Fig. 2). Compared with proficiency, students improved their performance by 241% and 210%, respectively (p < 0.001; Fig. 3). Independent of the training condition, the number of trials needed to achieve proficiency was borderline significantly correlated with psychomotor innate ability (Spearman’s rho = 0.53, p = 0.017).

Fig. 2

The number of trials needed to achieve proficiency (counting from the second training session) did not differ between groups (p = 0.28)

Fig. 3

On pre-, post-, and retention testing, performance scores for both groups were significantly worse (p < 0.01) compared with the proficiency criterion reached at training completion (expert level, 100%). * Control group, n = 9

Performance scores for the experimental group were 190 (145–280) s on post-test and 220 (156–233) s on retention testing (31% and 51% deterioration compared with proficiency; p < 0.001). For the control group, performance scores were 192 (153–262) s and 223 (130–275) s, respectively (32% and 54% deterioration compared with proficiency; p < 0.001 and p = 0.004) (Fig. 3). Median performance scores were not significantly different at post- (p = 0.63) or retention testing (p = 0.60). At retention testing, the control group displayed a significantly higher variance in performance scores (Levene’s test: p = 0.008) (Fig. 4). There was no correlation of psychomotor innate ability with performance scores at post- (Spearman’s rho = 0.11, p = 0.66) or retention testing (Spearman’s rho = 0.005, p = 0.98). During the ten additional trials after retention testing, four students in the experimental group and six students in the control group were able to reach the proficiency criterion (p = 0.396).
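The deterioration percentages can be approximated from the reported medians. The sketch below assumes, as the text does not state the formula explicitly, that deterioration is the relative increase in score over the 145 s proficiency level; the function name is hypothetical. Under that assumption the reported figures are reproduced to within about one percentage point, with small differences possibly due to rounding or to percentages having been computed per student.

```python
PROFICIENCY_S = 145.0  # expert level reached at training completion (s)


def percent_deterioration(score_s: float, proficiency_s: float = PROFICIENCY_S) -> float:
    """Relative increase in score over the proficiency level, in percent.

    Assumed definition: (score - proficiency) / proficiency * 100.
    """
    return (score_s - proficiency_s) / proficiency_s * 100


# Median scores reported in the text (s).
reported_medians = {
    "experimental post-test": 190,
    "experimental retention": 220,
    "control post-test": 192,
    "control retention": 223,
}

deterioration = {
    label: percent_deterioration(score) for label, score in reported_medians.items()
}
```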

Fig. 4

Box plot of performance scores at post- and retention testing (Mann–Whitney U-test: p > 0.01). * Control group, n = 9. Levene’s test: p = 0.008

Figure 5 shows the number of penalty points given during training and at post- and retention testing. During training, the experimental group more frequently assigned zero penalty points. In the control group the median number of penalty points per suture was very low (<2) and no learning curve (decrease with trial number) was observed. The control group tended (not statistically significant) to make more errors on post-testing [1 (0.3–4.7) versus 0.3 (0–1), p = 0.04] and on retention testing [1.7 (0–3.3) versus 0.8 (0–2.3), p = 0.45].

Fig. 5

(A) Median number of penalty points for both groups during training. (B) Scatter dot plot of penalty points for both groups at post- and retention testing (Mann–Whitney U-test: p > 0.01). * Control group, n = 9

Discussion

For a long time, starting with the Halstedian apprenticeship model, expert feedback has represented the mainstay of surgical training programs [1]. Feedback is felt to play a major role in motivating trainees to keep improving their performance and also in providing them with solutions on how to do this [8–10]. On the other hand, distributed training under expert supervision requires considerable faculty time commitment and is subject to scheduling conflicts. Therefore, this study investigated whether a combination of CBVT and peer feedback could replace external feedback by an expert during proficiency-based laparoscopic suturing training in box trainers, without increasing the length of the learning curve and resulting in an equally long-lasting high-quality performance.

First of all, no differences in learning curves (starting from second training session) were detected between the experimental and control group. Two other studies addressing the length of learning curve during laparoscopic suturing training could not detect advantages of continuous expert feedback either [9, 16]. Probably this is due to the compulsory self-assessment of the knots, including time and knot quality, in both groups. In combination with the presence of predetermined benchmark criteria, trainees were able to situate themselves on the learning curve and to realize the distance to the final training goal. As previously described [17], working towards this specific goal represented an important motivational factor for both groups.

Ultimately, however, regardless of the type or length of learning, the goal is a high-quality and long-lasting skill. To our knowledge this is the first study concerning laparoscopic suturing and knot tying that investigated the influence of feedback on skill retention. This retention testing after a period of rest is needed to detect permanent changes in performance. Our results showed no differences between groups on either post- or retention testing. The expert feedback group even exhibited a less consistent performance at retention testing (Levene’s test: p = 0.008). This might seem somewhat surprising at first sight. However, intense expert feedback is known to inhibit certain intrinsic learning strategies and problem-solving activities, resulting in dependency on the provided feedback and inferior performance when that feedback is withdrawn [10]. Probably the higher variability in our control group reflects individual susceptibility to this phenomenon. Overall we conclude that both types of training resulted in similar learning curves and similar high-quality, long-lasting performance, and we therefore consider it possible to replace external feedback by an expert with CBVT and peer feedback. This will likely facilitate the practical organization of skills training.

In this study, for both groups, a deterioration compared with proficiency was noted of approximately 30% at post- and 50% at retention testing. A recent study concerning retention of suturing skill showed similar results [6]. This indicates that the real performance level of a trainee is probably more accurately reflected in a post-training evaluation (after 1 week) than at the end of training. Proficiency-based curricula should therefore incorporate this 1-week post-testing to ensure a real proficiency level has been reached. Furthermore, work is to be done on the maintenance of the acquired skills. As 50% of students were able to reach the proficiency criterion again with only ten additional trials after retention testing, maintenance training apparently does not need to be very elaborate. Stefanidis showed [6] that short maintenance sessions on a regular basis could provide better skill retention compared with a control group without further training. However, more research is needed to elucidate the optimal timing and frequency of this maintenance training [6].

Interestingly, independent of the training condition, a borderline significant correlation (rho = 0.53, p = 0.017) was seen between innate psychomotor ability and the number of trials needed to reach proficiency in the suturing and knot-tying exercise. Thus it seems possible to detect students who need supplementary training and tailor the training sessions to their needs. Finally, no correlation was found between innate psychomotor ability and post- or retention testing scores. This indicates that it is possible, through sufficient practice, to overcome a lower level of innate ability.

Training in laparoscopic skills usually starts with basic dexterity exercises such as the McGill Inanimate System for Training and Evaluation of Laparoscopic Skills (MISTELS), Southwestern or Rosser tasks, for adaptation to two-dimensional (2D) vision, fulcrum effect, long instruments, limited tactile feedback, etc. [1]. Intracorporeal suturing and knot tying is a much more challenging skill that is usually only taught during advanced laparoscopy courses to surgeons who have already mastered basic laparoscopic skills. We, and others [18], believe it is necessary to learn intracorporeal suturing at early stages of the curriculum. The reason is not only that training this skill simultaneously increases the dexterity of the trainee but also that, even during basic operative procedures, one might unexpectedly need the skill of intracorporeal suturing and knot tying (e.g., in case of bowel injury). In this study all participating senior medical students, with very limited laparoscopic experience, were able to achieve proficiency. Furthermore, reasonable skill retention was seen for these novice trainees, even without any clinical or laboratory-based maintenance training. Teaching this skill in the early stages of the curriculum is therefore feasible and will hopefully lead to increased training opportunities in the operating room.

Some drawbacks of this study should be taken into consideration. One important aspect of feedback, namely the timing or intensity, was not addressed in this study. When feedback is provided during performance of the skills it is referred to as concurrent feedback, and when provided on completion of the skill it is referred to as summary feedback [10]. Summary feedback has previously been shown to be superior concerning the learning curve [9] and retention testing [10]. In our study, the conventional concurrent way of delivering feedback was used, and results are therefore limited to this type of feedback. Furthermore, the amount of feedback was not recorded, making it impossible to evaluate the specific contribution of peer feedback versus the computer-based video training in the experimental group.

Another drawback concerns the evaluation (time and errors) of the knots. At pre-, post-, and retention testing this was performed by a research fellow who was not blinded to training status. However, in our opinion, the objective scoring system used could overcome this drawback to a large extent. During training this evaluation was performed by the students themselves. The experimental group more frequently assigned zero penalty points to a knot, with three students doing this systematically for all their knots. More likely than an extremely good skill of these students, this reflects laziness in scoring when no actual supervision was present. This aspect did not seem to influence final knot quality since the experimental group was able to achieve equal or even slightly better results concerning penalty points at the test moments. In the control group, where training and thus knot evaluation was supervised, a similarly low number of penalty points (<2) and a lack of learning curve concerning penalty points were noticed. We had the impression that, by the time students started scoring their errors (second training session), they had reached an appropriate qualitative level of performance, needing the extra trials mainly to improve the quantitative aspect (time) of performance. Ultimately, we believe that including any system of quality control is important to make students continuously aware of the type of errors to be avoided. A last remark concerns the single-trial pre-test (instead of two trials at proficiency or three trials on the test moments) due to time constraints. This might increase the risk of underestimating students’ initial skills and overestimating their improvement during training. However, this effect was present in both groups and thus did not influence the main outcome of our study.

Conclusions

Both training methods are very efficient at improving laparoscopic suturing skills and provide excellent skill retention. We therefore conclude that structured training with video demonstrations and peer feedback can replace expert supervision to teach laparoscopic suturing skills to novices. This will facilitate practical organization of skills training.