Introduction

Surgical training has traditionally been based on an apprenticeship-style model, whereby clinical acumen and surgical skills are acquired under the supervision of an experienced mentor. With the introduction of mandated regulations governing clinical duty hours and responsibilities, as well as the integration of innovative surgical technologies such as robotics, the Halstedian model of training has been found insufficient to meet the needs of many contemporary trainees [1, 2].

Surgical simulation, when properly integrated into a comprehensive curriculum, has proven to be a valid educational tool to address this training gap [3–5]. Simulation-based training not only increases trainee exposure to content but also allows for deliberate practice in a low-stakes environment that does not compromise patient safety [6]. In addition, formative and summative assessments of trainees can be made using simulation-based devices, supporting competency-based training of surgical trainees.

Outside of the United States, robotic surgery remains a relatively novel surgical technology, including in countries such as Canada. Until a critical mass of expertise and clinical volume develops, clinical opportunities for trainees will remain limited, as there is a “trickle-down” effect from novice faculty surgeons who are still working through their respective learning curves. Robotic simulators may provide trainees with an opportunity to develop basic skills during this adoptive phase of the technology, addressing the unavoidable training gap described above. Given the somewhat prohibitive costs of integrating such simulators into a robotic surgery training curriculum, multiple forms of validity evidence must first be demonstrated. Face validity concerns the realism of a simulator and is judged by novice, non-expert users. Content validity involves a judgment made by experts on whether a simulator actually teaches or assesses the content material of importance. Construct validity refers to a simulator’s ability to accurately distinguish “content novices” from “content experts”, and is critical to any valid simulator or simulation.

In addition to construct validity evidence, simulators that are to be used as assessment tools should also demonstrate criterion validity. For example, concurrent validity, a form of criterion validity, concerns whether an assessment made using the simulator correlates with assessments made using accepted “gold standard” evaluative tools.

The aim of this study is to determine whether a commercially available robotic simulator, the da Vinci® Skills Simulator (dVSS), demonstrates validity evidence for both training and assessment purposes in the context of multi-disciplinary surgical trainees.

Methods

As part of a larger, more comprehensive 4-week robotic surgery basic skills training curriculum, residents and faculty members from the University of Toronto Divisions of Urology and Thoracic Surgery and Department of Obstetrics and Gynecology (ObGyn) were included in this validation study. Prior to testing on the dVSS, all subjects were provided with an introduction to the da Vinci robot (dVR) that included both a discussion and a demonstration of robot set-up, docking, instrument exchange, camera navigation, instrument clutching, suturing and knot tying, and object manipulation. This introduction also included approximately 10 min of hands-on basic skills training using the dVR and various inanimate part-task training models. Each subject was then assessed on their performance of two standardized skill tasks: Ring Transfer (RT) and Needle Passing (NP). Time to completion and number of errors were recorded for both tasks by two trained faculty educators, with errors being defined as dropped objects, unintentional instrument collisions, and excessive force on the model.

One week after the introductory session with the dVR, each subject was given a brief standardized introduction to the dVSS. Subjects were first permitted to complete a practice exercise (“pick and place”) to gain familiarity with the dVSS functionality, after which they performed at least seven different exercises on the dVSS: Camera Targeting 1, Peg Board 1, Peg Board 2, Ring Walk 2, Match Board 1, Thread the Rings, and Suture Sponge 1. Using the built-in Mimic® scoring algorithm, each subject was assessed on overall score, time to completion, economy of motion, and number of errors for each exercise.

Participants who had performed more than 20 robotic surgical cases were categorized as experienced robotic surgeons (ERS), while all others were classified as novice robotic surgeons (NRS). Statistical analysis was conducted using SPSS® v21 software, with independent t-tests used to compare mean scores for construct validity evidence and non-parametric Spearman’s correlations used to determine concurrent validity evidence.
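The analysis itself is straightforward to reproduce. The sketch below is illustrative only and does not reflect the authors’ SPSS workflow; it uses toy data and hypothetical column names to show how the experience grouping, the independent t-test for construct validity, and the Spearman correlation for concurrent validity could be computed in Python with pandas and SciPy.

```python
# Illustrative sketch only (the study used SPSS v21); toy data, hypothetical column names.
import pandas as pd
from scipy import stats

# One row per participant: prior robotic case volume, dVSS overall score (%) for
# one exercise, and time to completion (s) of the dVR Needle Passing task.
df = pd.DataFrame({
    "robotic_cases":   [0, 3, 55, 8, 25, 0, 60, 12],
    "overall_score":   [67, 70, 93, 72, 90, 65, 94, 74],
    "np_time_seconds": [230, 210, 85, 190, 95, 240, 80, 200],
})

# Group by experience: >20 prior robotic cases = experienced (ERS), otherwise novice (NRS).
ers = df.loc[df["robotic_cases"] > 20, "overall_score"]
nrs = df.loc[df["robotic_cases"] <= 20, "overall_score"]

# Construct validity: independent-samples t-test comparing mean overall scores.
t_stat, p_construct = stats.ttest_ind(ers, nrs)

# Concurrent validity: non-parametric Spearman correlation between dVSS overall
# score and dVR task completion time.
rho, p_concurrent = stats.spearmanr(df["overall_score"], df["np_time_seconds"])

print(f"Construct validity: t = {t_stat:.2f}, p = {p_construct:.4f}")
print(f"Concurrent validity: rho = {rho:.2f}, p = {p_concurrent:.4f}")
```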

Results

Demographics

A total of 53 participants were enrolled in this multi-disciplinary dVSS validation study: 27 from urology, 13 from ObGyn, and 13 from thoracic surgery (Table 1). The majority of subjects (89 %) had either no prior robotic console experience or had performed <10 robotic cases. Only five subjects (9 %) had performed ≥20 robotic surgical procedures, though four of them had performed >50 cases each.

Face and content validity

Overall, most subjects (97 %) agreed that the dVSS demonstrated acceptable realism in comparison to the dVR. More specifically, no less than 92 % of subjects agreed that the dVSS was a good simulation of the dVR with respect to each of camera navigation, clutch functionality, EndoWrist manipulation, and needle driving. Only 64 and 42 % of participants felt that the dVSS accurately simulated the dVR in regard to knot tying and dissection/cautery, respectively. Overall, 89 % of all participants felt that the dVSS was as effective at basic robotic skills training as using the dVR with inanimate models. All five surgeons (100 %) with significant robotic experience felt the dVSS was a valid educational tool for novice robotic surgery trainees.

Construct validity

ERS were found to perform significantly better than NRS on five of the seven dVSS exercises with respect to overall score: Camera Targeting 1 (92 vs. 67 %, p = 0.008), Peg Board 1 (92 vs. 77 %, p = 0.004), Match Board 1 (85 vs. 68 %, p = 0.028), Thread the Rings (90 vs. 72 %, p = 0.011), and Suture Sponge 1 (86 vs. 73 %, p = 0.042). Only the Ring Walk 2 (88 vs. 73 %, p = 0.086) and Peg Board 2 (92 vs. 83 %, p = 0.082) exercises did not demonstrate evidence of construct validity (Table 2).

ERS also outperformed NRS on the dVR standardized tasks: RT time (65 vs. 172 s, p = 0.001), RT errors (0.7 vs. 3.3, p = 0.004), and NP time (90 vs. 226 s, p < 0.001). There was, however, no difference between ERS and NRS for NP errors (1.8 vs. 4.0, p = 0.08).

Concurrent validity

Participants’ overall scores on all but one (Peg Board 2) of the seven exercises selected for validation correlated with time to completion on both the RT and NP tasks (p < 0.05). For five of the seven exercises, the remaining dVSS performance metrics (time to completion, economy of motion, and number of errors) correlated only with time to completion of the NP task (p < 0.05). None of the seven overall scores, however, correlated with the number of errors on either the RT or NP task (Table 3).

Discussion

This study is the first to examine the validity evidence of the dVSS as both an instructional tool and assessment device in a multi-disciplinary cohort of trainees. The seven different dVSS exercises selected all seem to demonstrate acceptable face, content, construct, and concurrent validity evidence, supporting the integration of the dVSS when developing comprehensive, competency-based basic robotic skills training curricula. While advanced robotic surgical training will require subspecialty-specific training content and procedure-specific instructional methods, this multi-disciplinary study demonstrates that basic robotic skills can be taught and assessed through a common curriculum, using a common surgical simulator.

The potential benefits of robotic surgery have been well documented [7–12], and while the integration of robotics into clinical practice has been widespread, the development of validated training curricula and certification policies has not. The use of surgical robotics has also become a multi-disciplinary endeavour, with subspecialties such as urology, gynecology, cardiothoracic surgery, general surgery, and otolaryngology all adopting the technology [7]. The availability and use of robotic simulators has the potential to significantly shorten the initial learning curve associated with the adoption of any new technology, such as robotics, by providing both educational and assessment opportunities. Surgical simulators are, however, associated with significant capital costs, so it is imperative that proper validity evidence be provided before such instructional methods are integrated. In addition, given the various surgical disciplines now using the surgical robot, educators have a duty to develop a multi-disciplinary curriculum that is applicable, at least for basic skills training, to trainees from many different surgical backgrounds, rather than “reinventing the wheel” for each subspecialty.

Several studies have found similar validity evidence for the dVSS as a training tool [13–15]. Liss et al. [13] demonstrated acceptable content and construct validity evidence for the dVSS in a cohort of urology trainees and faculty members. Similarly, Kelly and colleagues demonstrated excellent face, content, and construct validity among a multi-disciplinary group of surgical trainees and faculty [14].

To date, only one other study has found validity evidence of the dVSS as a potential assessment device [16]. Hung and colleagues demonstrated that among a cohort of urologists, the dVSS demonstrated excellent concurrent and predictive validity evidence. In addition, the authors found that simulation-based training on the dVSS was particularly beneficial for “weaker” robotic surgeons.

There have been several other studies evaluating the validity evidence for other robotic surgical simulators such as the dV-Trainer™, RoSS®, and ProMIS® [17–20], many with similar findings. While there have been limited head-to-head comparisons between robotic simulator platforms, differences are likely to be of minimal educational significance as all have the common benefits of providing learners with opportunities for deliberate practice, content exposure, and even feedback.

There are several limitations to this study. While study participants were drawn from multiple surgical specialties, this was a single-institution study, potentially limiting the generalizability of the results. Further multi-centre studies are required to confirm the robustness of the concurrent validity evidence. The two faculty raters were trained specifically regarding the definition of errors during the performance of the RT and NP tasks; however, each participant was rated by only one faculty educator. As such, inter-rater reliability scores for the number of errors made during the RT and NP tasks are not available, potentially compromising validity. Finally, the cohort of participants was a relatively inexperienced group of robotic surgeons, with <10 % of participants having performed more than 20 robotic cases. The evidence in support of utilizing the dVSS as both a training tool and an assessment device may therefore be limited to a relatively novice audience.

Conclusions

This multi-disciplinary validation study of the dVSS provides excellent face, content, construct, and concurrent validity evidence. This supports its integrated use in a comprehensive basic robotic surgery training curriculum, both as an educational tool and potentially as an assessment device.