Surgical residency requires mastery of both theoretical knowledge and technical skill [1, 2]. This is a progressively challenging task, especially with increasingly detailed knowledge of underlying disease processes, advancing technologies, and the advent of sub-specialized fields [3, 4]. Recent restrictions on working hours may provide an additional barrier to attaining necessary exposure to develop adequate intraoperative skills [5, 6]. Moreover, the COVID-19 pandemic has further reduced operative exposure for surgical residents [7]. These factors have combined to create a unique situation in which surgical residency programs are being forced to adapt to ensure that they are graduating technically competent surgeons prepared for independent practice.

Currently, the most common approach to surgical technical training is the master-apprentice model (MAM) [8]. This model is heavily didactic and rarely extends beyond the walls of the operating room (OR) [8]. In an attempt to increase the efficiency and effectiveness of surgical technical training, video-based coaching (VBC) has recently been applied [9,10,11]. A concept first applied in sport, VBC refers to the use of modeling and provision of feedback by a coach on the basis of audiovisual recordings of the player practicing or playing in a game situation [12]. In surgery, VBC is applied similarly, with the coach often being a staff surgeon, the player being a resident, and the audiovisual footage pertaining to intraoperative technical or interpersonal skills. Numerous publications, including randomized controlled trials (RCTs), have demonstrated its effectiveness in improving objective technical skill [13,14,15,16]. Coaching frameworks, such as the Wisconsin Surgical Coaching Framework, have been designed to further enhance its effectiveness [17]. In addition to improving the efficiency and effectiveness of surgical skill training, VBC is a teaching method that can be completed in the perioperative period, remote from the physical space of the OR, and thus may address the aforementioned lack of OR exposure [18].

Yet, the implementation of VBC in surgical residency remains sporadic. A recently published meta-analysis pooled peer-reviewed data pertaining to VBC, but it included data derived from medical student surgical training programs as well as staff surgeon peer-to-peer feedback programs [18]. While these are important areas of surgical education research, we believe that these data introduced significant heterogeneity and limited the ability to apply these findings in a practical setting. Moreover, surgical residency is under significant strain from both work-hour restrictions and decreasing OR exposure and thus may benefit most from VBC research [5,6,7]. As such, the aim of the present systematic review and meta-analysis was to pool previously published data evaluating the impact of VBC and compare surgical residents receiving and not receiving VBC in terms of technical surgical skill.

Materials and methods

Search strategy

The following databases were searched from database inception to October 2021: Medline, EMBASE, Cochrane Central Register of Controlled Trials (CENTRAL), and PubMed. The search was designed and conducted by a medical research librarian with input from study investigators. Search terms included “video-based,” “coaching,” “internship and residency,” and “surgical education,” among others (complete search strategy available in Appendix). The references of published studies and gray literature were searched manually to ensure that all relevant articles were included. This systematic review and meta-analysis is reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [19]. The study protocol was registered on the PROSPERO International Prospective Register of Systematic Reviews a priori.

Study selection

Articles were eligible for inclusion if they were RCTs that compared technical performance in live or simulated surgical tasks with and without a preoperative or postoperative VBC intervention for surgical residents. For this study, VBC was defined as review of and feedback pertaining to audiovisual footage of a specific surgical operation, task, or simulation prior to and/or following the completion of that same task [9, 18]. Observational studies were not eligible for inclusion. Single-arm studies, studies that did not include surgical residents, studies including medical students, studies evaluating peer-to-peer coaching, and studies that did not employ preoperative or postoperative VBC were excluded. Studies were not discriminated on the basis of language. Case reports, systematic reviews, meta-analyses, letters to editors, and editorials were excluded.

Outcomes assessed

The primary outcome was change in objective measures of technical surgical skill following implementation of either VBC or control. Many objective measures of technical surgical skill exist. Contemporary VBC surgical literature most commonly employs the following objective measures/scales: (1) Objective Structured Assessment of Technical Skills (OSATS) [20]; (2) Mini/Modified Objective Structured Assessment of Technical Skills (MOSATS) [20]; (3) Bariatric Objective Structured Assessment of Technical Skills (BOSATS) [21]; (4) Global Operative Assessment of Laparoscopic Skills (GOALS) [22]; (5) Generic Error Rating Tool (GERT) [23]; (6) time to completion; and (7) other institution-specific global technical skills assessment scales.

Secondary outcomes included post-coaching scores according to the aforementioned objective measures of technical surgical skill. Additional secondary outcomes included resident satisfaction with VBC and procedure/simulation-specific outcomes.

Data extraction

Three reviewers independently evaluated the systematically searched titles and abstracts using a standardized, pilot-tested form. Discrepancies that occurred at the title and abstract screening phases were resolved by inclusion of the study. At the full-text screening stage, discrepancies were resolved by consensus between the three reviewers. If disagreement persisted, the study was excluded. The same reviewers independently conducted data extraction into a data collection form designed a priori. The extracted data included study characteristics (e.g., author, year of publication, study design), resident demographics (e.g., age, year of study, operative experience), intervention characteristics (e.g., simulated and/or clinical environment, timing of intervention), and resident operative performance measures (e.g., OSATS, GOALS, time, number of errors).

Risk of bias assessment

Risk of bias for each included study was assessed using the Cochrane Risk of Bias Tool for Randomized Controlled Trials 2.0 [24]. The Cochrane Risk of Bias Tool analyzes RCTs according to randomization process, assignment to intervention, adherence to intervention, missing outcome data, outcome measurement, and outcome reporting. Studies were assigned low risk of bias, some concerns for bias, or high risk of bias in each domain, as well as overall. Three reviewers assessed the quality of the studies independently. Discrepancies were discussed among the reviewers until consensus was reached.
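The mapping from domain-level to overall judgements can be illustrated with a minimal sketch. It assumes the commonly used "worst domain" rule; the source does not state how overall judgements were derived, and the tool's guidance also allows an overall rating of high risk when several domains raise some concerns, which this sketch does not model. The function name and dictionary keys are hypothetical.

```python
# Illustrative sketch only: "worst domain" rule for an overall risk of bias
# judgement. The rule and all names here are assumptions, not the authors' code.

# Domains as listed in the text above.
DOMAINS = (
    "randomization process",
    "assignment to intervention",
    "adherence to intervention",
    "missing outcome data",
    "outcome measurement",
    "outcome reporting",
)

LEVELS = ("low", "some concerns", "high")


def overall_risk_of_bias(judgements):
    """Return the overall judgement for one study.

    `judgements` maps each domain name to "low", "some concerns", or "high".
    """
    missing = [d for d in DOMAINS if d not in judgements]
    if missing:
        raise ValueError(f"missing domain judgements: {missing}")
    values = [judgements[d] for d in DOMAINS]
    for value in values:
        if value not in LEVELS:
            raise ValueError(f"unknown judgement: {value!r}")
    if "high" in values:
        return "high"
    if "some concerns" in values:
        return "some concerns"
    return "low"


# Hypothetical study: low risk everywhere except assignment to intervention.
example = {domain: "low" for domain in DOMAINS}
example["assignment to intervention"] = "high"
print(overall_risk_of_bias(example))
```

A stricter reviewer policy (for example, upgrading multiple "some concerns" domains to high risk) would need an additional rule agreed a priori.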

Statistical analysis

All statistical analyses and meta-analyses were performed using STATA version 14 (StataCorp, College Station, TX) and Cochrane Review Manager 5.3 (London, United Kingdom). The threshold for statistical significance was set a priori at p < 0.05. A pairwise meta-analysis was performed using an inverse variance random effects model for all meta-analyzed outcomes. Pooled effect estimates were obtained by calculating the standardized mean difference (SMD) for continuous variables along with the respective 95% confidence intervals (CIs) to confirm the effect size estimation. The SMDs were utilized to account for variability in objective technical performance scaling (e.g., OSATS, GOALS). Mean and standard deviation (SD) were estimated for studies that only reported median and interquartile range or range using the method described by Wan et al. [25]. For studies that did not report a measure of central tendency, authors were contacted for missing data. Data were presumed to be unreported if no response was received from study authors within two weeks from the index point of contact. Missing SD data were then calculated according to the prognostic method [26]. Assessment of heterogeneity was completed using the inconsistency (I2) statistic. An I2 greater than 50% was considered to represent considerable heterogeneity [27]. Bias in meta-analyzed outcomes was assessed with funnel plots when data from more than 10 studies were included in the analysis [28]. A leave-one-out sensitivity analysis was performed by iteratively removing one study at a time from the inverse variance random effects models to ensure that pooled effect estimates were not driven by a single study. Risk of bias sensitivity analyses were performed for all meta-analyzed outcomes. Subgroup analyses were completed by setting (i.e., operating room, simulation) and level of training (i.e., post-graduate year) where applicable.
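The pooling, heterogeneity, and leave-one-out steps described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in this review: it assumes Hedges' small-sample correction for the SMD, one common large-sample variance approximation, and the DerSimonian-Laird estimator of between-study variance, none of which are specified in the text. All function names and input values are hypothetical.

```python
from math import sqrt


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, variance): bias-corrected SMD, intervention minus control."""
    pooled_sd = sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))
    d = (mean_t - mean_c) / pooled_sd
    j = 1 - 3 / (4 * (n_t + n_c) - 9)  # small-sample correction factor
    # Assumed variance approximation; software packages differ slightly here.
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return j * d, j**2 * var_d


def pool_random_effects(effects, variances):
    """Inverse variance random effects pooling (DerSimonian-Laird, assumed)."""
    k = len(effects)
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if k > 1 and c > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # inconsistency, percent
    return {
        "smd": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


def leave_one_out(effects, variances):
    """Re-pool k times, omitting one study each time."""
    results = []
    for i in range(len(effects)):
        rest_y = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        results.append(pool_random_effects(rest_y, rest_v))
    return results


# Hypothetical per-study SMDs and variances, for illustration only.
smds = [0.2, 0.8, 1.5]
variances = [0.1, 0.1, 0.1]
print(pool_random_effects(smds, variances))
for omitted, result in enumerate(leave_one_out(smds, variances)):
    print(f"omitting study {omitted}: SMD {result['smd']:.2f}")
```

Because the SMD is unitless, scores from different scales (e.g., OSATS, GOALS) can enter the same `effects` list, which is the reason the text gives for choosing it.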
For outcomes reported in fewer than three studies or outcomes in which heterogeneous reporting precluded meta-analysis, a systematic narrative summary was provided [29].
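The conversion of medians to means and SDs cited earlier in this section [25] can be sketched as below. The formulas are the estimators published by Wan et al. for the two reporting scenarios mentioned (median with interquartile range, and median with range); the function names and example values are hypothetical, and the sketch assumes approximately normally distributed scores.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for quantiles


def mean_sd_from_median_iqr(q1, median, q3, n):
    """Estimate mean and SD from first quartile, median, third quartile, and n."""
    mean = (q1 + median + q3) / 3
    z = _STD_NORMAL.inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    sd = (q3 - q1) / (2 * z)
    return mean, sd


def mean_sd_from_median_range(minimum, median, maximum, n):
    """Estimate mean and SD from minimum, median, maximum, and n."""
    mean = (minimum + 2 * median + maximum) / 4
    z = _STD_NORMAL.inv_cdf((n - 0.375) / (n + 0.25))
    sd = (maximum - minimum) / (2 * z)
    return mean, sd


# Hypothetical study arm: 15 residents, median score 6 (IQR 4 to 8).
print(mean_sd_from_median_iqr(4, 6, 8, 15))
# Hypothetical study arm: 15 residents, median score 6 (range 2 to 10).
print(mean_sd_from_median_range(2, 6, 10, 15))
```

For large samples the interquartile-range estimator approaches the familiar IQR/1.35 rule, while for the small arms typical of resident trials the sample-size term matters.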

Results

Study characteristics

From 2734 relevant citations, 11 RCTs with 157 residents receiving VBC and 141 residents receiving standard surgical teaching without VBC were included [13,14,15,16, 30,31,32,33,34,35,36]. A PRISMA flow diagram of the study selection is illustrated in Fig. 1. Eight studies reported post-graduate year (PGY) of training for included residents; 49.3% and 45.4% were PGY-1, 27.5% and 29.2% were PGY-2, 5.8% and 6.2% were PGY-3, 10.1% and 13.8% were PGY-4, and 7.2% and 5.4% were PGY-5 in the VBC group and standard surgical teaching without VBC group, respectively. Nine studies reported trainee surgical subspecialties; 58.3% and 59.3% were general surgery residents, 23.3% and 24.6% were orthopedic surgery residents, and 18.3% and 16.1% were obstetrics and gynecology residents in the VBC group and standard surgical teaching without VBC group, respectively. Detailed study characteristics of the included studies are reported in Table 1.

Fig. 1 PRISMA diagram: transparent reporting of systematic reviews and meta-analyses flow diagram outlining the search strategy results from initial search to included studies

Table 1 Study characteristics of included studies

Coaching environment

Two of the included studies evaluated the use of VBC in the preoperative setting [31, 32]. The remaining studies evaluated the use of postoperative VBC. Three of the included studies evaluated the use of VBC with intraoperative tasks/surgeries (e.g., laparoscopic right hemicolectomy, laparoscopic salpingo-oophorectomy) [14, 15, 31]. The most common objective skills assessment scoring system was the OSATS (8 studies). Detailed surgical and coaching parameters of the included studies are reported in Table 2.

Table 2 Video-based coaching environments

Objective skills assessments

There was no significant difference in post-coaching scores on objective surgical skill evaluation tools between groups (SMD 0.53, 95% CI 0.00 to 1.01, p = 0.05, I2 = 74%) (Fig. 2A). The corresponding funnel plot is presented as Fig. 3. The association between VBC and post-coaching scores was statistically significant in leave-one-out and risk of bias sensitivity analyses when the studies by Norris et al., Jensen et al., and Vaughn et al. were removed [15, 32, 35]. Post-coaching scores were also significantly different between the two groups in subgroup analyses including only low risk of bias studies (SMD 1.46, 95% CI 0.56–2.36, p = 0.002, I2 = 77%) and including only simulation-based coaching (SMD 1.34, 95% CI 0.34–2.34, p = 0.009, I2 = 84%).

Fig. 2 A Post-video-based coaching objective assessment tool scores: random effects meta-analysis comparing presence and absence of video-based coaching. B Change in objective assessment tool scores: random effects meta-analysis comparing presence and absence of video-based coaching

Fig. 3 Funnel plot for post-video-based coaching objective assessment tool scores random effects meta-analysis

The improvement in objective surgical skill scores from pre- to post-intervention was significantly greater in residents receiving VBC compared to those receiving standard surgical teaching without VBC (SMD 1.62, 95% CI 0.62 to 2.63, p = 0.002, I2 = 85%) (Fig. 2B). These results were unchanged with leave-one-out sensitivity analysis. Subgroup analyses including only low risk of bias studies (SMD 0.54, 95% CI 0.03 to 1.04, p = 0.04, I2 = 66%) and simulation-based coaching (SMD 0.51, 95% CI 0.07 to 0.95, p = 0.02, I2 = 62%) also demonstrated statistically significant improvements in the VBC group. Table 3 reports all objective skills assessment data from the included studies.
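For readers who wish to relate a reported SMD and 95% CI to its p value, the following sketch back-calculates the standard error and a two-sided p value. It assumes a normal (Wald-type) approximation with a symmetric interval, which may differ slightly from the method used by the meta-analysis software; the function name is hypothetical.

```python
from math import erfc, sqrt
from statistics import NormalDist


def p_value_from_ci(estimate, lower, upper, level=0.95):
    """Return (standard error, z, two-sided p), assuming a normal approximation."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = erfc(abs(z) / sqrt(2))  # two-sided tail probability
    return se, z, p


# Pooled change in objective skill scores reported above: SMD 1.62 (0.62 to 2.63).
se, z, p = p_value_from_ci(1.62, 0.62, 2.63)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under this approximation the interval for the change in scores corresponds to a p value that rounds to the reported 0.002.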

Table 3 Comparison between baseline and post-intervention technical performance in the included studies

Risk of bias

Figures 4 and 5 present the risk of bias assessment for each included study and the overall cohort of included studies, respectively, according to the Cochrane Risk of Bias Tool for Randomized Controlled Trials 2.0 [24]. Five studies were found to be at a low risk of bias, three had some concerns for bias, and three studies were at high risk of bias. Included studies were uniformly at low risk of bias from outcome reporting, randomization, and missing data. All studies found to be at high risk of bias introduced bias through failure to estimate the effect of the assignment with appropriate analyses. For example, Karam et al. evaluated orthopedic surgery resident performance in repairing a simulated tibial plafond fracture and neither reported nor statistically controlled for previous experience with the procedure [33]. There were no missing outcome data across all studies.

Fig. 4 Cochrane risk of bias tool for randomized controlled trials 2.0: individual study analyses

Fig. 5 Cochrane risk of bias tool for randomized controlled trials 2.0: grouped outcomes for included trials

Discussion

VBC remains a relatively novel concept in surgical education [9]. The majority of published data are RCTs from the last 5–6 years [13,14,15, 30, 33,34,35]. Upon pooling of these data, this review was able to demonstrate that VBC can be an effective intervention for improving surgical resident technical skill. Specifically, residents experiencing VBC were more likely to demonstrate improvement in scores on objective surgical skill assessments compared to surgical residents undergoing standard surgical teaching without VBC. This same improvement was not observed when post-intervention scores were analyzed in isolation, suggesting that the benefit may be most substantial for trainees with lower baseline levels of objective skill and for trainees earlier in their surgical residency. Risk of bias was relatively low across included studies.

The findings of this review are in keeping with previously published literature examining VBC in surgery. The previous meta-analysis by Augestad et al. demonstrated significant improvements in objective surgical scoring scales in trainees receiving VBC [37]. RCTs evaluating the use of VBC for surgical residents have noted either improvements or no change in technical skill, along with increased confidence and knowledge of intraoperative tasks and anatomy [13, 15, 18, 30]. For example, Soucisse et al. demonstrated a significant improvement in OSATS scores in residents undergoing VBC for a simulated bowel anastomosis task compared to residents without VBC [13]. Rindos et al. found VBC to be the most influential variable in improving junior resident surgical skill in a simulated laparoscopic vaginal cuff closure model [30]. The most recent RCT evaluating the use of VBC in surgical residents undergoing laparoscopic salpingo-oophorectomy, by Norris et al., did not demonstrate a measurable improvement in objective surgical skill; however, VBC did improve knowledge and confidence in operative anatomy [15]. In addition to VBC for surgical residents, VBC has also been used in the surgical setting for peer-to-peer coaching [38, 39]. Wisconsin has developed a state-wide VBC program that matches practicing surgeons with surgeon coaches. Participants found value in sharing ideas and learning from other surgeons and overall had a positive perception of the program [39]. These variations on surgical VBC, as well as many others, can all be valuable when applied in the correct setting and warrant further implementation into surgical curricula across different levels of training.

In surgical residency, VBC can address decreasing OR exposure as well as an increasingly demanding curriculum. It offers time and teaching in a perioperative setting pertaining to the technical steps of operations [40]. As we progress into the era of competency-by-design (CBD) residency programs, VBC may become even more important [41]. In addition to heightening the efficiency and effectiveness of surgical teaching, VBC can offer regular checkpoints whereby surgeons are able to assess both the technical competency and intraoperative knowledge of surgical trainees. We believe implementing VBC sessions for residents on a regular basis (e.g., monthly) could be a positive addition to CBD curricula and warrants investigation.

A barrier to the implementation of VBC into CBD curricula may be scheduling, given the clinical duties of both staff surgeons and surgical residents. Moreover, while VBC sessions are an efficient means of teaching surgical skills, they still require an additional 20–30 min of time [13, 15, 18]. Artificial Intelligence (AI) may have a role in lessening that burden and fully realizing the potential of VBC. Recent papers indicate the ability of AI to segment out important parts of operations, which may help identify high-yield intraoperative techniques for VBC [42]. Mirchi et al. even employed a virtual assistant that was able to provide feedback to novice and expert learners based on a simulated surgical task [43]. Importantly, however, AI systems do not necessarily convey the nuanced technical detail that an experienced surgeon may provide [44]. A hybrid version of VBC that combines both staff surgeons and AI-driven tools may be the most efficient and effective solution for the future of surgical post-graduate education [2].

The strengths of the present systematic review and meta-analysis include the rigorous methodology, exclusion of non-RCT data, comprehensive risk of bias assessment, thorough sensitivity analyses, and inclusion of a homogeneous group of studies pertaining only to surgical resident education. Limitations of the present study include a small number of residents in the included studies (n = 298) and heterogeneity in objective measures of surgical skill. Moreover, there was no uniform definition of VBC among the included studies, with some studies employing VBC preoperatively and others examining its use in the postoperative setting. Most studies utilized VBC with simulated surgical tasks, and thus the impact of VBC on performance in the OR remains largely unexplored [14, 15, 31]. Further studies examining the use of VBC with live intraoperative audiovisual footage are warranted. Lastly, none of the included studies evaluated open operations in a live OR. This is understandable given the inherent difficulty in obtaining high-quality audiovisual footage in a sterile field without a laparoscope. Future studies may evaluate the use of wearable technology, such as GoPros©, for capturing intraoperative data that can be subsequently used for VBC.

The findings of this systematic review and meta-analysis demonstrate a significantly greater improvement from baseline in objective surgical skill in surgical residents undergoing VBC compared to surgical residents receiving standard surgical teaching without VBC. The lack of difference in post-intervention scores suggests that the benefit may be most substantial for trainees with lower baseline levels of objective skill and those earlier in their surgical residencies (i.e., PGY-1 and -2). Further work evaluating its effectiveness as part of CBD curricula is warranted and could yield important benefits for contemporary surgical education.