Since FDA approval of the da Vinci® Surgical System in 2000, robotic-assisted operations have become commonplace in urology and gynecology. Only within the past 5 years has the robotic platform gained popularity in general surgery as well. The surgical education literature shows a trend toward residents logging fewer open procedures, with an accompanying increase in laparoscopic, but not necessarily robotic, procedures [1, 2]. Since the Fundamentals of Laparoscopic Surgery (FLS) curriculum has successfully standardized residents’ laparoscopic education, residency programs may need to consider integrating a similar robotic curriculum to ensure proficiency with this technology [3]. Robotic curricula have been described as effective; however, they are currently limited mostly to urology and gynecology training programs [4–7].

Virtual reality surgical simulation has gained attention as a validated method for robotic training. It has been shown to improve basic skillsets, demonstrate good content and construct validity for laparoscopic and robotic training, and translate into improved skills in the operating room [8–11]. As this technology gains attention in the surgical education literature, it is increasingly used as a marker for robotic performance and proficiency training [12–14]. Development of robotic curricula is in its initial stages, and widespread implementation has yet to occur in general surgery training programs. More importantly, it remains unclear whether general surgery residents are acquiring robotic skills as they progress through residency.

Surgical educators have proposed that acquiring advanced laparoscopic skills throughout residency should translate into developing robotic skillsets on the da Vinci® platform. This concept has been suggested in the recent literature, where laparoscopically trained senior surgical residents demonstrated equivalent performance on advanced FLS tasks using the robotic platform compared to laparoscopy; however, they actually performed worse on simple tasks [15]. This study further demonstrated that laparoscopically naïve medical students perform better on advanced FLS tasks using the robotic platform compared to laparoscopy, suggesting that the robot’s technical design may improve a novice’s performance. Nevertheless, it remains unclear whether there is a direct correlation between degree of laparoscopic experience and task-specific performance on robotic simulator tasks.

Therefore, we sought to evaluate the efficacy of general surgery training on acquiring robotic skills by comparing robotic simulation performance at various training levels. We hypothesized that laparoscopic skills acquired through general surgery residency training will translate into task-specific performance on robotic simulator exercises.

Methods

Study design

Thirty-six participants with varying degrees of general surgical training were prospectively enrolled in this study: eight medical students (MS), ten junior residents (JR), ten mid-level residents (MLR), and eight senior residents (SR). JR included postgraduate year (PGY)-1 and PGY-2 residents; MLR included PGY-3 residents; SR included PGY-4 and PGY-5 residents. Prior to task completion, each participant was given an instructive overview of the da Vinci Si® surgeon console and Mimic Simulation® platform and was allotted 5 min of practice time on the playground simulation module to become familiar with the instruments. After watching each task-specific tutorial, each participant then performed three simulation tasks of increasing complexity (in order from easy to difficult): MatchBoard #1, EnergyDissection #2, and SutureSponge #3. Each participant’s specific task score (0–100) and overall (i.e., cumulative) score (0–300) were tabulated and compared between groups.

Participant recruitment

This study was approved by the Weill Cornell Medical College Institutional Review Board. Participants were voluntarily recruited from the New York Presbyterian Hospital—Weill Cornell general surgery residency program, which currently enrolls 51 categorical residents. Medical students were voluntarily recruited from the principal investigator’s laboratory and surgical rotations at the Weill Cornell Medical College. After informed consent was obtained, participants answered a brief demographics questionnaire: collected information included age, sex, level of residency training, hand dominance, prior robotic experience (number of hours on the console), history and degree of prior video gaming, and number of laparoscopic cases logged during residency training per the Accreditation Council for Graduate Medical Education guidelines. Of note, the general surgery residency at our institution has a formal laparoscopic skills curriculum to prepare residents for the Fundamentals of Laparoscopic Surgery (FLS) certification.

Task description and metrics for evaluation

The three selected simulator tasks were chosen because they had varying degrees of difficulty per the da Vinci® Skills Simulator module brochure: MatchBoard (easy), EnergyDissection (intermediate), and SutureSponge (difficult). The MatchBoard task requires the participant to use two arms to place letters and numbers in their designated positions on a checkerboard (Fig. 1A). The EnergyDissection task requires the participant to use two arms to electrocauterize and sharply divide six blood vessels with the goal of minimizing blood loss (Fig. 1B). The SutureSponge task requires the participant to drive a curved needle through multiple designated areas on a sponge from different angles, using either hand in a forehand or backhand approach (Fig. 1C). These tasks were also chosen to reflect some of the basic surgical skills required for robotic proficiency; they incorporate hand–eye coordination while operating multiple instruments simultaneously (MatchBoard), electrocautery and cutting (EnergyDissection), and curved needle driving from a variety of angles (SutureSponge). All tasks also incorporated other basic skillsets including camera clutching, efficient instrument utilization, and speed.

Fig. 1 da Vinci® surgical simulation virtual reality tasks. Virtual reality simulation tasks including MatchBoard (A), EnergyDissection (B), and SutureSponge (C). ©2015 Intuitive Surgical, Inc

These skillsets were measured quantitatively using the metrics programmed in the Mimic Simulation® software. Overall score was the primary metric for evaluation. This metric has been shown to be a unique identifier of performance, distinguishing novices (<10 robotic cases) from intermediates (10–50 cases) and experts (>50 cases) in selected tasks. The subset metrics that contribute to overall score have also been shown to be unique identifiers, including economy of motion (total distance travelled by all instruments in centimeters), time to complete, instrument collisions, master workspace range (diameter of user’s working volume on master grips in centimeters), critical errors (number of metrics whose score is zero), instruments out of view, excessive instrument force, missed targets, object drops, and misapplied energy time [16, 17]. Furthermore, the SutureSponge task has been previously shown to be one of the strongest skill tasks to differentiate between novices and intermediates or experts, likely because it requires a more advanced skillset than other simulator tasks (such as MatchBoard and EnergyDissection) [17]. Lastly, in our study, an overall score greater than 207 was considered “superior,” since prior studies have shown that median scores of 64, 82, and 61 for MatchBoard, EnergyDissection, and SutureSponge, respectively, are estimated standards of an intermediate surgeon (64 + 82 + 61 = 207) [16].
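The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the Mimic Simulation® software's own scoring: the names `INTERMEDIATE_MEDIANS`, `overall_score`, and `is_superior` are hypothetical, and the strict "greater than" comparison is an assumption taken from the ">207" wording used in the Results.

```python
# Intermediate-surgeon median task scores cited in the text [16].
INTERMEDIATE_MEDIANS = {
    "MatchBoard": 64,
    "EnergyDissection": 82,
    "SutureSponge": 61,
}

# The "superior" cutoff is the sum of the three medians (64 + 82 + 61).
SUPERIOR_THRESHOLD = sum(INTERMEDIATE_MEDIANS.values())


def overall_score(task_scores):
    """Return the cumulative score (0-300) from three task scores (0-100 each)."""
    missing = set(INTERMEDIATE_MEDIANS) - set(task_scores)
    if missing:
        raise ValueError(f"missing task scores: {sorted(missing)}")
    for task in INTERMEDIATE_MEDIANS:
        if not 0 <= task_scores[task] <= 100:
            raise ValueError(f"{task} score must be between 0 and 100")
    return sum(task_scores[task] for task in INTERMEDIATE_MEDIANS)


def is_superior(task_scores):
    """Assumption: 'superior' means strictly above the threshold (>207)."""
    return overall_score(task_scores) > SUPERIOR_THRESHOLD


# Hypothetical participant (made-up scores, not study data).
example = {"MatchBoard": 70, "EnergyDissection": 80, "SutureSponge": 60}
print(overall_score(example), is_superior(example))
```

Under this assumption, a participant who exactly matches the three intermediate medians scores 207 and is not classified as "superior."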

Statistical analysis

Statistical analyses were performed using Stata, release 13 (StataCorp, College Station, TX). For comparison of categorical variables, Fisher’s exact and chi-square tests were used for ≤5 and >5 observations, respectively. Student’s t test or the Mann–Whitney U test was used to analyze continuous parametric and nonparametric variables, respectively. For all analyses, a two-tailed p value of <0.05 was considered significant; predictors with a p value of <0.1 on univariate analysis were included in multivariate analysis. Lastly, logistic and linear regressions were performed to assess whether any correlations existed between simulator task performance and participant demographics, including number of laparoscopic cases logged during residency training.
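For readers less familiar with the nonparametric comparison named above, the following is a minimal sketch of how the Mann–Whitney U statistic is computed from ranks. It is not the Stata routine used in this study; the function names `midranks` and `mann_whitney_u` are hypothetical, and the sketch stops at the U statistic without deriving a p value.

```python
def midranks(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples."""
    if not group_a or not group_b:
        raise ValueError("both groups must be non-empty")
    n_a, n_b = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Made-up scores for illustration only (not study data).
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
```

A U of 0 for one group, as in the example, means every value in that group ranks below every value in the other; the two statistics always sum to the product of the group sizes.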

Results

Participant demographics were generally comparable between groups (Table 1). Seventy-eight percent of the cohort were male, and 92% were right-hand dominant, both without difference between groups (p = 0.54 and p = 0.27, respectively). As expected, SR was the oldest group, and MS was the youngest (p < 0.001). Seventy-two percent of participants had a prior video gaming history, and 38% of those were current gamers, both without differences between groups (p = 0.27 and p = 0.21, respectively). Of those with any video gaming history, the median number of estimated total lifetime hours played was 2600 (range 625–18200) and did not differ between groups (p = 0.60). The median number of laparoscopic cases increased with higher level of surgical training as expected, and there were significant differences between all groups individually (see Table 1); notably, none of the residents had completed FLS certification at the time of participation. Lastly, only two participants had operative experience on the robotic console, each with less than 5 h; there was no significant difference in robotic experience between groups (p = 0.09). As our residency program does not have an official robotic simulation curriculum, none of the subjects had experience with the simulator prior to participation.

Table 1 Participant demographics

On group analysis, the median overall scores did not differ (Fig. 2A): 188 (84–201) for MS, 183 (91–234) for JR, 197 (153–218) for MLR, and 205 (169–229) for SR (p = 0.14). However, by individual group comparison, SR outperformed MS (p = 0.036). On specific task comparison, there were no significant differences in MatchBoard (p = 0.27) or EnergyDissection (p = 0.99) median scores between groups (Fig. 2B, C). However, the median SutureSponge score was highest for SR (61, range 39–81) compared to MS (43, range 26–61), JR (43, range 11–72), and MLR (55, range 36–68) (p = 0.039) (Fig. 2D). The only metrics with significant differences between groups were instrument collisions and missed targets, which were both lower in the SR group (p = 0.04 and p = 0.004, respectively).

Fig. 2 Score distribution between groups. Task performance score distribution for overall (A), MatchBoard (B), EnergyDissection (C), and SutureSponge (D). Values reported as median, 0th, 25th, 75th, and 100th percentiles. Asterisk p < 0.05 (individual comparison), ¥ Kruskal–Wallis multiple comparison, MS medical student, JR junior resident, MLR mid-level resident, SR senior resident

Eight of the 36 participants (22%) achieved a “superior” overall score >207. On univariate analysis, only the number of laparoscopic cases logged was associated with a “superior” overall score (Table 2). Age, sex, hand dominance, history of video gaming, and total lifetime hours of video gaming were not associated with “superior” performance (p > 0.05). Multivariate linear regression analysis demonstrated a positive correlation between overall score and number of laparoscopic cases logged during residency (p = 0.02, R² = 0.14) (Fig. 3A). Task-specific analysis failed to demonstrate a correlation for either the MatchBoard or the EnergyDissection task (p = 0.10 and p = 0.35, respectively) (Fig. 3B, C). Instead, the overall correlation was mainly influenced by SutureSponge performance; its score was significantly correlated with number of laparoscopic cases (p = 0.005, R² = 0.21) (Fig. 3D).

Table 2 Predictors of “superior” score
Fig. 3 Linear regression: score versus number of laparoscopic cases logged. Linear regression analyses between number of laparoscopic cases logged and overall score (A), MatchBoard score (B), EnergyDissection score (C), and SutureSponge score (D)
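As a rough illustration of the regression of score on laparoscopic case volume reported above, the sketch below fits an ordinary least-squares line with a single predictor and reports R². It is a simplified stand-in for the Stata analysis, not a reproduction of it: the function name `fit_line` is hypothetical, and the data points are invented for the example and are not study data.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept, r_squared)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With one predictor, R^2 equals the squared Pearson correlation.
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


# Invented example: laparoscopic cases logged vs. overall simulator score.
cases = [0, 20, 60, 120, 200, 300]
scores = [170, 185, 180, 200, 195, 215]
slope, intercept, r_squared = fit_line(cases, scores)
print(f"slope={slope:.3f}, intercept={intercept:.1f}, R^2={r_squared:.2f}")
```

An R² of 0.14, as reported for overall score, would mean that case volume accounts for about 14% of the variance in score, so a statistically significant slope can coexist with wide scatter around the fitted line.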

There was no significant association between current video gaming and overall score on univariate analysis (p = 0.42, Table 2), or correlation of total lifetime hours of video gaming and overall score on linear regression (p = 0.89, R² = 0.0006). Furthermore, there were no correlations between total lifetime hours of video gaming and MatchBoard score (p = 0.47, R² = 0.02), EnergyDissection score (p = 0.98, R² < 0.001), or SutureSponge score (p = 0.33, R² = 0.03).

Discussion

Robotic operations are becoming more prevalent in foregut, bariatric, colorectal, and other general surgical fields with comparable short-term outcomes to laparoscopic surgery [18–20]. While the long-term outcomes of robotic-assisted operations in general surgery have yet to be fully evaluated, it is clear that use of the robot in general surgery specialties continues to grow and formal training is likely necessary. As robotic residency curricula are being developed to incorporate virtual reality simulation, it is important to recognize factors that affect resident performance on robotic simulator tasks. This will allow educators to integrate residents’ known inherent strengths and weaknesses into a trainee-specific education.

In this study, we found that general surgery residents currently show limited improvement in overall robotic skills during the course of their training. This is likely explained by limited robotic console exposure time for residents during training, as well as the lack of a formal robotic curriculum. In the 1990s, surgical educators established the FLS training program, which subsequently was validated as an effective instructive tool that improves performance in the operating room [21]. Similar to the goal of FLS, implementation of a robotic fundamentals curriculum would likely improve resident robotic operative skills; in fact, recently proposed robotic simulation curricula have proven to be effective for urology residents [4]. Importantly, any robotic curriculum would require routine practice, as recent data have shown that robotic skills deteriorate after as little as 4 weeks of inactivity—practicing in at least 2-week intervals is required to maintain proficiency [22, 23].

On task-specific analysis, we found that senior residents performed better than junior residents and medical students only on the advanced SutureSponge task, but not on the less complex MatchBoard and EnergyDissection tasks. This difference in SutureSponge performance is best explained by the fact that increased laparoscopic experience is correlated with advanced robotic task performance (i.e., SutureSponge). Senior residents have advanced experience in laparoscopic techniques, such as intracorporeal suturing, and thus, the learned skillset of curved needle driving is transferred from the laparoscopic setting to the robotic platform. On the other hand, since prior studies have demonstrated that laparoscopically naïve subjects perform FLS tasks better using the robotic platform while senior residents perform worse on simple tasks [15], it is not surprising that the medical students and junior residents in our study performed similarly to senior residents on the easier tasks (i.e., MatchBoard and EnergyDissection). These findings would be reinforced if future studies evaluated the performance of additional advanced surgery-specific tasks, such as intracorporeal knot tying and suturing, between senior and junior residents.

Contrary to prior studies that have demonstrated a correlation between video gaming experience and better initial performance in laparoscopic skillsets [24, 25], we did not identify a correlation between any video gaming history and robotic skills. This finding has been reported in the robotic surgical education literature and may be explained by the robotic platform’s interface [26]. Specifically, the robotic platform translates 3D arm and hand movements onto a 3D screen and also allows the user to precisely control the instrument tip with natural wrist and finger movements. On the other hand, laparoscopy translates limited, and sometimes countering, hand movements onto a 2D screen, which is more similar to traditional video gaming. Furthermore, this same study instead observed superior robotic suturing performance in athletes and musicians, but not in video gamers [26]. Perhaps this illustrates that those who are already proficient in activities requiring manual dexterity and hand–eye coordination in a 3D environment are more adept at acquiring robotic skills. Nevertheless, we did not find other demographics, such as age, sex, and hand dominance, to be predictive of robotic skill performance.

Now that surgical educators are learning more about resident performance on robotic simulator tasks in the setting of laparoscopic proficiency, several groups have described effective robotic curricula using virtual reality simulation [12–14]. Currently, our residency program does not have a formal robotic skills curriculum; future implementation will likely depend on the outcomes of further studies evaluating the correlation of robotic simulator performance and operative skills, as well as the long-term durability of skillsets. While these efforts are still in development, ultimately, the optimal method of surgical education is integrating individualized training tactics and “deliberate practice” into the surgical curriculum. This concept engages trainees to practice skills that are tailored to improve an observed technical weakness. Prior studies in laparoscopic cholecystectomy have shown that when supervising surgeons give formal feedback identifying a trainee’s weakness during an operation, subsequent tailored simulation instruction significantly improves future operative performance [27]. Similar studies in robotic surgery are warranted to determine whether combining real-time constructive feedback with virtual reality simulation will translate into improved robotic skills.

While our results provide insight into future robotic education, there are three potential limitations to our study. First, our cohort was relatively small; a larger study population would reduce the chance of type II error and could potentially identify differences in task performance where we found none. Second, we limited our evaluation to three simulator tasks, only one of which was advanced (SutureSponge). Future performance analyses including more complex surgery-specific tasks such as intracorporeal suturing and knot tying may give further insight into optimizing robotic technique education. Lastly, while virtual reality simulation has become a validated method for performance evaluation, we did not correlate our results to performance in the operating room. Further studies validating a large-scale, trainee-specific, virtual reality-based, robotic curriculum with improved operating-room performance would provide an effective model for robotic education.

In conclusion, skillsets obtained during current general surgery residency training with no formal robotic curriculum show minimal improvement in robotic simulator task performance during the course of training. The differences in robotic skills between senior and junior residents appear limited to a more advanced surgery-specific task, performance on which correlates with greater laparoscopic experience. Since there is limited improvement in overall robotic skillsets, either fellowship training or an integrated robotic curriculum may be necessary for improved proficiency on the robotic platform.