Robotic technology has become pervasive throughout modern surgical practice. Since the introduction of the da Vinci Surgical System (dVSS) (Intuitive Surgical, Incorporated; Sunnyvale, CA) to general surgery in 2000, robotic surgery has been rapidly incorporated into many specialties such as gynecology, urology, colorectal surgery, and bariatrics [1]. As a result, surgical training programs across multiple disciplines must adapt to include robotic platform training in their curricula. This adaptation is necessary because of the added complexity of robotic surgery: the surgeon’s console is separated from the patient, communication is more difficult, imaging is 3D instead of 2D, demonstration capabilities are limited, and haptic feedback is lacking [2]. These complexities, together with findings from our group’s own study, indicate that the transferability of skills between laparoscopic and robotic platforms is limited [3]. Thus, a novel, dedicated, robotic-specific skills curriculum is required for surgical trainees. Surgical trainees often have little access to complete robotic surgery systems for training purposes. Simulator-based robotic training offers a safe alternative that overcomes most logistical and patient-safety hurdles. There are multiple options for simulator-based robotic training, such as the da Vinci Skills Simulator by Intuitive Surgical Inc., dV-Trainer from Mimic Technologies Inc., RoSS by Simulated Surgical Sciences LLC, and Robotix Mentor from Simbionix [4]. However, there is still a need for structured curricula, developed alongside these simulators, that focus on the transferability of simulated robotic skills to the live operative environment [5].

The Kirkpatrick hierarchy of educational outcomes is routinely used in the academic setting to describe the impact of educational curricula. Kirkpatrick’s outcome levels applied to medical education research include (1) learner satisfaction; (2) changes in attitudes, knowledge, and skills; (3) changes in behaviors; and (4) changes to the care system or patient outcomes. Medical education research should strive to capture outcomes at Kirkpatrick Level 4, as this level has the highest likelihood of sustained changes in behaviors and improved patient outcomes [6]. To date, most reports of robotic curricula have reached Kirkpatrick Level 1 or Level 2 of impact by demonstrating that a robotic simulation curriculum can improve robotic skills in a dry lab setting [7]. Our study aims to reach a higher Kirkpatrick level of impact by demonstrating improved robotic performance outcomes in a live operative setting associated with a novel simulator-based robotic skills curriculum.

Prior surgical research has developed metrics to accurately assess surgical trainee operative competency. The Ottawa Surgical Competency Operating Room Evaluation (O-SCORE) is used to assess operative performance and has been successfully applied to surgical simulation tasks. The National Aeronautics and Space Administration-Task Load Index (NTLX) is a validated tool to assess mental workload that has been utilized in many surgical studies [8,9,10]. Together, the O-SCORE and NTLX are valid forms of training evaluation that can accurately describe changes in a surgeon’s performance and workload. In this study, our group introduced a simulator-based robotic skills curriculum and quantified improvements in performance metrics and subjective workload. We developed our simulator-based robotic skills curriculum by consensus of a cross-specialty, expert panel that evaluated a matrix of all available modules. The panel decided to create a curriculum that had commonalities across specialties, was easily replicable and was feasible for trainees to complete with limited time availability. We hypothesize that the introduction of a simulator-based robotic skills curriculum is associated with significant improvements in workload and performance metrics in the live robotic operative setting.

Materials and methods

Subject recruitment

Under an IRB-approved protocol, 31 subjects were recruited for this prospective cohort pilot study, and informed consent was obtained. The subjects were residents in general surgery, urology, and obstetrics and gynecology who were novices to robotic platforms.

Performance metrics

After each case, the attending surgeon assessed subject performance using a robot-specific modification of the O-SCORE (RO-SCORE) [11, 12]. Our institution modified the original O-SCORE evaluation to better reflect robot-specific surgical characteristics (Fig. 1). Activities related to each domain were scored on a 5-point Likert scale, anchored from 1 (“Complete hands on guidance”) to 5 (“I did not need to be there”). The subject’s workload was assessed following each case using the NTLX survey, a tool that allows subjects to self-assess their performance on six subscales: Mental Demand, Physical Demand, Temporal Demand, Performance, Effort, and Frustration [13,14,15].

Fig. 1 Robotic-specific modification of the Ottawa Surgical Competency Operating Room Evaluation

Pre-training

All subjects participated in a live robot-assisted laparoscopic surgical (RALS) case prior to reaching proficiency on the novel da Vinci Skills Simulator curriculum. Immediately after the case, the attending surgeon completed the RO-SCORE and the subject completed the NTLX.

Robotic simulator training

After completion of the pre-training RALS case, all subjects trained to pre-set proficiency goals on a da Vinci Skills Simulator using a novel skills curriculum. The curriculum included the following tasks: Camera Targeting (Level 2), Energy Dissection (Level 1), Energy Switching (Level 2), Ring and Rail (Level 2), Ring Walk (Level 3), Suture Sponge (Level 3), Thread the Rings, and Tubes. The tasks were selected from the on-board tasks included with the dVSS software. An expert panel chose the tasks and the required difficulty level of each. The scoring goals that determined proficiency for each required task were based on built-in dVSS software metrics. The dVSS was available to subjects for practice during normal business hours in the Surgical Skills Laboratory at the Washington University Institute for Surgical Education (WISE). Each task takes approximately 5 min to complete, with approximately four attempts necessary to reach proficiency. This training took, on average, 3 h to complete. The subjects were given 1 month to complete all tasks to proficiency before moving on to the post-training case.
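As a rough consistency check on the time figures above, the nominal training time can be estimated from the number of listed tasks, the approximate time per attempt, and the approximate number of attempts needed for proficiency. The short Python sketch below is illustrative only: the function name is hypothetical, and it assumes every task takes the same nominal time and number of attempts, which individual trainees will not match exactly.

```python
# Nominal figures taken from the curriculum description; treating every task
# as identical in duration and attempts is a simplifying assumption.
TASKS = [
    "Camera Targeting (Level 2)",
    "Energy Dissection (Level 1)",
    "Energy Switching (Level 2)",
    "Ring and Rail (Level 2)",
    "Ring Walk (Level 3)",
    "Suture Sponge (Level 3)",
    "Thread the Rings",
    "Tubes",
]
MINUTES_PER_ATTEMPT = 5        # "approximately 5 min to complete"
ATTEMPTS_TO_PROFICIENCY = 4    # "approximately four attempts"


def estimated_training_minutes(n_tasks, minutes_per_attempt, attempts):
    """Return the nominal total training time in minutes (hypothetical helper)."""
    return n_tasks * minutes_per_attempt * attempts


total_min = estimated_training_minutes(
    len(TASKS), MINUTES_PER_ATTEMPT, ATTEMPTS_TO_PROFICIENCY
)
total_hours = total_min / 60
print(f"{len(TASKS)} tasks -> {total_min} min ({total_hours:.1f} h)")
```

Under these assumptions the estimate is 160 min, or about 2.7 h, which is of the same order as the average completion times reported for the curriculum.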

Post-training

Upon reaching proficiency on all tasks of the novel dVSS curriculum, subjects participated in a live RALS case with the same attending surgeon who was present for the pre-training RALS case. Immediately after the case, the attending surgeon completed the RO-SCORE and the subject completed the NTLX.

Data analysis

RO-SCORE and NTLX scores from the pre-training and post-training live RALS cases were compared using a paired Student’s t test.
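For readers less familiar with the paired comparison, the following is a minimal Python sketch of how a paired t statistic is computed from per-subject differences. It is not the authors' analysis code: the function name is hypothetical, and the example scores are invented for illustration and are not study data.

```python
import math
import statistics


def paired_t_statistic(pre, post):
    """Return (t, degrees_of_freedom) for paired samples (hypothetical helper).

    t = mean(d) / (sd(d) / sqrt(n)), where d = post - pre for each subject.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must have one score per subject")
    if len(pre) < 2:
        raise ValueError("at least two paired observations are required")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Invented scores for five subjects, NOT data from this study.
pre_scores = [2, 1, 2, 3, 2]
post_scores = [4, 4, 5, 5, 4]

t_stat, dof = paired_t_statistic(pre_scores, post_scores)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

The p value is then obtained from the t distribution with n - 1 degrees of freedom; a statistics package such as SciPy (`scipy.stats.ttest_rel`) performs the whole test in one call.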

Results

Thirty-one subjects completed the study, which consisted of a pre-training case, robotic curriculum and post-training case. Completion of the curriculum required an average of 2.3 h. There was a statistically significant improvement seen in all RO-SCORE domains (Fig. 2): Camera Control (Pre-curriculum mean: 1.93, STD: ± 0.78; Post-curriculum mean: 4.80, STD: ± 0.42; p < 0.001), Energy Control (Pre: 1.41 ± 0.50; Post: 4.48 ± 0.57; p < 0.001), Needle Control (Pre: 1.73 ± 0.74; Post: 4.38 ± 0.66; p < 0.001), Tissue Handling (Pre: 1.97 ± 0.76; Post: 4.44 ± 0.56; p < 0.001), Instrument Control (Pre: 1.94 ± 0.65; Post: 4.56 ± 0.62; p < 0.001), Visuospatial (Pre: 1.87 ± 0.68; Post: 4.03 ± 0.73; p < 0.001), Efficiency (Pre: 2.07 ± 0.78; Post: 4.52 ± 0.59; p < 0.001), Communication (Pre: 1.91 ± 0.57; Post: 4.29 ± 0.68; p < 0.001) and Overall (Pre: 2.06 ± 0.85; Post: 4.35 ± 0.69; p < 0.001).
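To make the size of each change easier to compare across domains, the short Python sketch below tabulates the difference between the reported post-curriculum and pre-curriculum mean RO-SCORE values. It uses only the means listed above; the variable names are illustrative, and the variability of the per-subject differences cannot be recovered from these summary statistics.

```python
# (pre-curriculum mean, post-curriculum mean) as reported in the Results.
ro_score_means = {
    "Camera Control": (1.93, 4.80),
    "Energy Control": (1.41, 4.48),
    "Needle Control": (1.73, 4.38),
    "Tissue Handling": (1.97, 4.44),
    "Instrument Control": (1.94, 4.56),
    "Visuospatial": (1.87, 4.03),
    "Efficiency": (2.07, 4.52),
    "Communication": (1.91, 4.29),
    "Overall": (2.06, 4.35),
}

# Difference in means per domain, rounded to the precision of the source data.
changes = {
    domain: round(post - pre, 2)
    for domain, (pre, post) in ro_score_means.items()
}

# Print domains from largest to smallest change.
for domain, delta in sorted(changes.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{domain:<20} +{delta:.2f}")
```

On these reported means, every domain improved by more than two points on the 5-point scale, with the largest change in Energy Control and the smallest in Visuospatial.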

Fig. 2 Mean RO-SCORE domain scores of operations performed by trainees before and after completing the curriculum

There was a statistically significant reduction in all NTLX domains (Fig. 3), corresponding to a decreased workload in every domain. A higher score indicates higher demand in that domain and a lower score indicates lower demand; for the Performance domain, a higher score indicates performance closer to failure and a lower score indicates performance closer to perfect. The NTLX workload ratings were: Mental Demand (Pre-curriculum mean: 6.70, STD: ± 1.5; Post-curriculum mean: 3.08, STD: ± 1.52; p < 0.001), Physical Demand (Pre: 5.23 ± 2.04; Post: 2.09 ± 0.99; p < 0.001), Temporal Demand (Pre: 4.06 ± 1.49; Post: 2.05 ± 0.9; p < 0.001), Performance (Pre: 6.00 ± 1.65; Post: 2.94 ± 1.59; p < 0.001), Effort (Pre: 6.83 ± 1.45; Post: 3.34 ± 1.75; p < 0.001), and Frustration (Pre: 6.44 ± 1.39; Post: 1.42 ± 1.01; p < 0.001).

Fig. 3 Mean NTLX domain scores of operations performed by trainees before and after completing the curriculum

Discussion

In this study, we have shown that a simulator-based robotic skills curriculum is associated with significant improvements in performance and workload in a live operative setting. While previous literature has shown the utility of robotics curricula in a simulated setting, we have demonstrated a feasible curriculum that translates skills acquisition to actual clinical performance improvement.

These improvements allow a trainee to focus on higher-order learning objectives in the operating room, such as operative steps, complication anticipation and management, and challenging anatomy, because the basic skills of operating the robotic technology are acquired before the trainee steps into the room. They should also allow for safer and more efficient progression through a procedure. Finally, we have found anecdotally that supervising faculty will engage a trainee who has completed the curriculum in a greater portion of a robotic procedure than they might otherwise have, thus enhancing learner satisfaction and engagement.

Having a curriculum that is feasible and replicable is essential for its adoption. Because the curriculum was developed by a multi-disciplinary expert panel, it has the benefit of being generalizable across a wide variety of specialty training programs. The curriculum is self-guided, allowing learners to complete it in one sitting or multiple sittings, which can alleviate the administrative burden on training program directors and their staff. Because it uses built-in modules provided by the manufacturer, the curriculum is replicable, allows individuals to track their progress, and provides metrics that serve as objective target goals. In addition, the curriculum can be completed in as little as 2–3 h, which our trainees felt was feasible within the confines of their other clinical obligations.

The findings presented here represent one of the first assessments of live operative skills after a training intervention using a robotic simulator. These metrics are important in demonstrating the utility of simulation-based robotic surgical training. Our outcomes fit the description of Kirkpatrick Level 3 for educational outcomes: changing the clinical practice of learners through robotic curricula. Capturing patient-centered metrics and live operative performance serves as the highest demonstrable level of impact. Our study further increases the validity of simulator-based robotic training by reporting on subjective operative skill improvements.

A potential limitation of this study is the variability presented by assessment in a live clinical setting. The nature and types of the procedures varied, as did their complexity and the level of involvement by the trainee. Nonetheless, the positive effect of the curriculum was seen across all domains despite this variability. In future studies, performances could be video recorded so that raters are blinded to whether the learner is pre-training or post-training, and multiple independent raters could be included.

In summary, we have provided evidence that training robotic surgeons on a simulator-based robotic skills curriculum is associated with significant improvements in workload and performance metrics in the live robotic operative setting. As surgical training programs continue to adopt robotic technologies, our novel robotic surgery skills curriculum can be incorporated into training programs to improve live robotic surgical skills. These findings suggest that the growing field of robotic surgical training should focus on refining simulator-based surgical curricula while using metrics that capture live robotic performance. This study may also help guide the design of future evaluations of simulator-based curricula that incorporate live operative performance.