Simulation-based training is a popular educational model for teaching the key cognitive and technical skills required to perform surgery [1]. There is growing evidence in the literature that simulation improves clinical performance in the operating room, enhances surgeons' confidence, leads to fewer complications, and increases adherence to best practices [2]. Numerous studies have stressed the importance of these benefits for laparoscopic surgery (LS), which presents significant technical challenges compared to open surgery [3, 4]. For example, in LS the working space is limited, the surgeon operates with elongated tools, there is no tactile feedback, and vision is restricted because the endoscope acquires only a 2D projection of the operating field. To overcome these obstacles, it is crucial for the surgeon to develop advanced psychomotor skills before performing laparoscopic procedures.

Over the last decade, virtual reality (VR) simulators have served as a major vehicle for surgical training, providing a reproducible and controlled environment for practice on a variety of minimally invasive surgery (MIS) tasks [5]. These tasks range from fundamental skills such as camera navigation and instrument coordination to more complex ones (e.g., bowel anastomosis) or even entire operations (e.g., laparoscopic cholecystectomy). For basic training, the virtual models usually resemble standard geometric shapes (e.g., sphere, cube), with which the trainee interacts toward a predefined objective (e.g., peg transfer). The main goal of the trainee is to achieve a basic level of proficiency in order to advance to the next level, which involves manipulation of virtual anatomies. The tasks then become more meaningful as they involve activities encountered in real surgical practice, such as clip application and suturing. For greater realism, the virtual models are augmented with physics-based properties such as deformation, gravity, and body motion.

The VR tasks are typically integrated into software that runs on a host computer connected to mock-ups of the tool handles and the endoscopic camera. A critical component of the setup is the multisensory module that continuously monitors the degrees of freedom (DOFs) of the tools and the camera (hereafter referred to as instruments). This information is transmitted to the graphics software so that the virtual objects respond realistically. Among the various signals transmitted, tracking and 3D pose estimation of the instruments constitute fundamental prerequisites. For tasks involving actions such as grasping, additional DOFs are obviously needed (such as the opening angle of the graspers at the tool tip), but for basic tasks the aforementioned information is sufficient. For example, measuring the position and orientation of the camera is essential for constructing a projection view of the virtual world. The pose of the tool shaft is needed for rendering its view and for achieving visual consistency when it is occluded by other objects in the scene. Furthermore, knowing the tip position is essential for triggering interactions with the virtual world, which is important for simulation tasks. With this information in hand, developing a VR simulator for fundamental skills training, such as camera navigation and tool coordination, is a straightforward process.

In current VR-based surgical trainers, tool and camera tracking is usually achieved with electromagnetic (EM) sensors that provide 6DOF information (e.g., trakSTAR™ by Ascension). This technology provides sub-millimeter accuracy and fast acquisition rates (up to 250 Hz), though at an increased cost (about $5000 for a system with two cubic-centimeter-sized sensors and a moderate coverage range). A basic limitation is that the sensor is prone to background EM noise, which is notable in surgical simulation due to the metallic tools. Nevertheless, several commercial systems and research prototypes have incorporated this technology for simulation training [6,7,8].

Another option is optical tracking based on an infrared (IR) emitter and retroreflective markers, such as the Polaris™ system from NDI. Such a device can track several passive and active IR tools simultaneously in real time [9,10,11]. Although EM interference is not an issue, a free line-of-sight is always required, and multiple markers must be attached to the tool to obtain its orientation. Moreover, the cost can be higher than that of an EM device.

Another alternative is to use standard calibrated cameras together with software that detects suitable pattern markers, such as the MicronTracker™ from Claron Technologies or the freely available ARToolkit [12]. The algorithm essentially detects the marker's border and estimates its pose (position and orientation) with respect to the camera. This method also provides sub-millimeter accuracy and, like optical tracking, requires a clear line-of-sight. In addition, the acquisition rate is limited by the frame rate of the camera. Its significant advantages are simplicity and low cost, the latter determined essentially by the cameras employed. Examples of pattern-based visual tracking can be found in [13] and [14], where the camera used by the surgeon was also utilized for tracking a marker attached to the tool tip.

A few image-based algorithms have also been proposed for estimating the pose of the endoscopic shaft and for tracking the tool tip using a monocular camera [15,16,17,18]. This problem is remarkably challenging given that neither special markers nor stereoscopic cameras are employed. The reported methods essentially exploit the cylindrical symmetry of the instrument's shaft [16] or the instrument's insertion point with respect to the camera [18]. Although these algorithms have clear benefits, they do not estimate the camera orientation, and the reported results show only moderate position accuracy and processing speed.

In the current literature, sensor-based systems are the preferred solution for instrument tracking [19]. On the other hand, sensors can be expensive and need to be attached to or embedded into the tools, which clearly alters ergonomics. A preferable solution, which eliminates these obstacles, would be a robust vision-based algorithm. The newly developed Leap Motion controller (Leap Motion Inc., CA, US) is a compact, low-cost (about $50), vision-based position tracking system. A significant advantage of the device is that it can capture objects such as fingers and linear tools, which is useful for laparoscopic instrument tracking. Here, the term 'linear tool' refers to a slender object with a long cylindrical shape, referred to as a 'pointable' object by the manufacturer. In contrast to the aforementioned solutions, the device includes a stereo camera pair with sophisticated software that measures the tool tip position and orientation with sub-millimeter accuracy. Hence, there is no need to modify the ergonomics of the endoscopic tool. To the best of our knowledge, the potential of the Leap Motion controller for surgical skills assessment has not been studied yet. In this paper, we present a collaborative study between two institutions that investigates the use of the controller in a low-cost simulator for the assessment of basic laparoscopic skills. The face validity of the system was also evaluated by analyzing the opinions of the individuals involved in the study.

Methods

The training platform included a mechanical interface for instrument control, the Leap Motion controller for instrument tracking, and a host computer with a monitor for running and displaying the VR-based surgical tasks (see Fig. 1). The functionality of these components is described below.

Fig. 1 A system overview of the experimental setup. The dummy laparoscope is inserted into the port of the middle holder. The other two holders are used for the custom laparoscopic instruments

Mechanical interface

The first requirement in laparoscopic simulation training is to reproduce the laparoscopic instrument movement characterized by the fulcrum effect. This is usually accomplished by means of trocars inserted through the surface of a pelvic trainer or a box. However, such a solution is impractical in our case, since the Leap Motion controller requires an unobstructed space above it. Hence, the instrument control interface was designed to reproduce the pivoting action of the body wall on instrument handling while ensuring a clear line-of-sight between the controller and the instruments. As described later, the interface also has to allow control of a dummy laparoscope consisting of two shafts placed parallel to each other. To fulfill these requirements, custom-made instrument holders were constructed as shown in Fig. 1. This configuration allowed the instruments to undergo the constrained movements encountered in laparoscopy while keeping the instrument tips inside the field of view (FOV) of the controller.

Instrument tracking with the Leap Motion controller

The Leap Motion controller is a small (L × W × H = 8 × 3 × 1 cm) USB sensor designed to detect and track hands, fingers, and pointable objects. The device incorporates three equally spaced IR LEDs and two CCD cameras, each placed between a pair of LEDs. A right-handed Cartesian coordinate system is employed with its origin at the center of the device (Fig. 2A); in our setup this is the global coordinate system. The field of view is an inverted pyramid centered on the origin (Fig. 2B). The effective range extends from approximately 2.5 to 60 cm above the device, and up to approximately 20 cm along the X- and Z-axes. According to a published study of the controller's accuracy, the deviation between a desired 3D position and the average measured position was below 0.2 mm for static setups and about 1 mm for dynamic setups [20].
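For illustration, a minimal sketch (not part of the described system) of a range check using the approximate bounds quoted above is given below; the exact working volume depends on the individual device and lighting conditions, so these limits are an assumption.

```python
# Minimal sketch: gate Leap Motion measurements to the approximate working
# volume quoted above (units: millimetres; y points up from the device).
# The bounds are the approximate figures from the text, not exact device limits.

def in_effective_range(x, y, z):
    """Return True if (x, y, z) lies inside the approximate working volume."""
    return (25.0 <= y <= 600.0) and (abs(x) <= 200.0) and (abs(z) <= 200.0)

# Example: a tool tip 30 cm above the device and slightly off-center.
print(in_effective_range(50.0, 300.0, -40.0))  # True
```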

Fig. 2 A The Leap Motion controller uses a right-handed coordinate system. Green, red, and blue arrows denote the x-, y-, and z-axes, respectively. B The field of view of the controller is an inverted pyramid centered at the origin of the coordinate system. (Color figure online)

According to the controller's specifications, it can track linear tools with diameters of 3–10 mm [20]; however, our initial experiments with actual laparoscopic tools showed that tracking was not adequate, probably due to the low IR reflectance received by the controller's sensors. To overcome this issue, a set of custom tools (8 mm diameter) was constructed, with a weight equivalent to that of actual surgical instruments (see Fig. 3A). The tools were controlled via the custom holders described above, and the Leap Motion controller was placed in front of the holders as shown in Fig. 1. The controller's application programming interface (API) provided the position and direction of the tool tip with respect to the origin.
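As an illustration of this interface, the sketch below polls tool data with the legacy Leap Motion SDK v2 Python bindings; the module and attribute names (Leap, frame.tools, tip_position, direction) follow that SDK and are given as an assumption, not taken from the system described in the paper.

```python
# Minimal polling sketch using the legacy Leap Motion SDK v2 Python bindings
# (assumed API; not part of the system described in the paper).
import time
import Leap  # requires the Leap Motion SDK v2 Python bindings

def poll_tools(duration_s=5.0, rate_hz=60.0):
    """Print the tip position (mm) and direction (unit vector) of each tracked tool."""
    controller = Leap.Controller()
    t_end = time.time() + duration_s
    while time.time() < t_end:
        frame = controller.frame()
        for tool in frame.tools:  # 'pointable' objects classified as tools
            p, d = tool.tip_position, tool.direction
            print("tip (mm): (%.1f, %.1f, %.1f)  dir: (%.2f, %.2f, %.2f)"
                  % (p.x, p.y, p.z, d.x, d.y, d.z))
        time.sleep(1.0 / rate_hz)

if __name__ == "__main__":
    poll_tools()
```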

Fig. 3 A Two custom tools used in the IN and BO tasks. B The dummy laparoscope used in the CN task. The viewing direction of the camera coincides with the medial line (dashed blue line). The camera up vector (green) is perpendicular to the medial line and to the vector connecting the tips of the two shafts (red). The black arrows denote the directions of the shafts. (Color figure online)

The standard camera model employed in VR applications requires a transformation matrix that describes the pose of the camera with respect to a global coordinate system. To construct this matrix, three parameters are essential: the position of the camera, the camera viewing direction (otherwise referred to as the view plane normal), and the camera up vector. For a linear tool, the controller measures only the first two. Information about the rotation of the tool around its axis is not provided, and hence the camera up vector cannot be defined. To overcome this limitation, a custom instrument serving as a dummy laparoscope was constructed. It consists of two shafts (each 50 cm long and 8 mm in diameter) placed parallel to each other at a 5 cm distance (see Fig. 3B), with a simple handle fixed at the proximal end. In this configuration, the camera position was taken to be the distal end of the midline between the two shafts. The viewing direction of the camera coincided with the direction of the midline, obtained by averaging the direction vectors of the two shafts. The camera up vector was defined as perpendicular to the viewing direction and to the vector connecting the tips of the two shafts (Fig. 3B). After inserting one of the shafts into the custom holder, the operator could use the entire configuration as a dummy laparoscope, moving it around the pivoting point of the holder (see Fig. 1).
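A minimal sketch of this computation is given below, assuming the controller reports a tip position and a unit direction vector for each shaft; the function and variable names are illustrative, and the sign of the up vector depends on the chosen handedness.

```python
# Sketch of the dummy-laparoscope camera pose computation described above.
# Inputs: tip position and unit direction of each shaft, in the controller frame.
import numpy as np

def camera_pose(tip_a, dir_a, tip_b, dir_b):
    """Return (position, view_dir, up) of the simulated camera."""
    tip_a, dir_a = np.asarray(tip_a, float), np.asarray(dir_a, float)
    tip_b, dir_b = np.asarray(tip_b, float), np.asarray(dir_b, float)

    # Camera position: distal end of the midline between the two shafts.
    position = 0.5 * (tip_a + tip_b)

    # Viewing direction (view plane normal): average of the shaft directions.
    view_dir = dir_a + dir_b
    view_dir /= np.linalg.norm(view_dir)

    # Up vector: perpendicular to the viewing direction and to the vector
    # joining the two tips (sign depends on the chosen handedness).
    across = tip_b - tip_a
    up = np.cross(across, view_dir)
    up /= np.linalg.norm(up)
    return position, view_dir, up
```

From these three vectors, the usual 4 × 4 view matrix can be assembled in the same way as for any look-at camera.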

Simulation experiments

Based on the aforementioned setup, three VR simulation tasks were developed for assessment of key laparoscopic skills such as depth perception and hand–eye coordination. These tasks were performed by manipulation of the dummy laparoscope (task 1) and the custom laparoscopic instruments (tasks 2 and 3). The 3D models involved in the tasks were designed with the Blender software. For 3D rendering, the Ogre3D graphics engine was used. To simulate physics-based interaction of the virtual models, the BulletPhysics engine was integrated. In the following paragraphs, a description of each task is provided.

Camera navigation (CN): This task places targets (gallstones) at random locations on a tissue surface (Fig. 4A). The user must first detect and then focus on each gallstone with the laparoscope for a predefined time interval (5 s). Focus is deemed successful when the camera–target distance is ≤3 cm, in which case the gallstone is highlighted. In addition, the user must keep the orientation indicator of the laparoscope within a certain range (±5°) around a requested orientation angle. This process is performed for a total of five targets.
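As an illustration of this success criterion, a minimal sketch is given below using the thresholds stated in the text (≤3 cm, ±5°, 5 s hold); the function and variable names are illustrative assumptions, not taken from the simulator's implementation.

```python
# Sketch of the CN focus test: camera close enough to the target and the
# orientation indicator within tolerance of the requested angle.
import numpy as np

DIST_MAX_CM = 3.0     # maximum camera-target distance (from the text)
ANGLE_TOL_DEG = 5.0   # allowed deviation from the requested orientation
HOLD_TIME_S = 5.0     # time the focus must be maintained

def target_in_focus(cam_pos, target_pos, cam_roll_deg, requested_roll_deg):
    """True if the camera is within range of the target and correctly oriented."""
    close = np.linalg.norm(np.asarray(cam_pos) - np.asarray(target_pos)) <= DIST_MAX_CM
    diff = (cam_roll_deg - requested_roll_deg + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
    return close and abs(diff) <= ANGLE_TOL_DEG

# The task logic requires target_in_focus() to remain True for HOLD_TIME_S
# seconds before the next gallstone is presented.
```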

Fig. 4 Screenshots of the three simulation tasks: A camera navigation, B instrument navigation, and C bimanual operation

Instrument navigation (IN): A board with six buttons is introduced at the center of the virtual scene (Fig. 4B). Initially, one of the buttons is highlighted in green and the trainee is given 5 s to hit this button with the instrument tip without touching the board floor. Once the button is hit or the time has expired, another button, chosen at random, becomes highlighted. The camera is set at a fixed pose providing a close-up view of the entire button board.

Bimanual operation (BO): A board with two cylindrical pots is introduced into the virtual scene (Fig. 4C). The goal is to transfer a sequence of balls into the pots using both instruments. Each ball is colored either red or green, denoting the pot into which it should be transferred (the red and green pots are located on opposite sides of the board). The task requires the user to hold one instrument with its tip inside a certain spherical area in order to activate the other instrument to touch the ball and transfer it into the appropriate pot. It should be emphasized that while the ball is being transferred with the second instrument, the tip of the first must remain inside this area; otherwise the ball detaches and drops. This process is repeated four times, with the instruments used alternately. A transfer failure occurs when a ball is accidentally dropped outside the board or transferred into the wrong pot. Throughout the task, the camera is set at a fixed pose providing a close-up view of the entire board.
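The activation rule can be sketched as follows; the radius of the spherical area and all names are illustrative assumptions, since the paper does not give these values.

```python
# Sketch of the BO activation rule: the holding instrument's tip must stay
# inside a spherical area for the other instrument to keep carrying the ball.
import numpy as np

ACTIVATION_RADIUS_CM = 1.5  # illustrative value; not specified in the paper

def transfer_active(holding_tip, sphere_center, radius=ACTIVATION_RADIUS_CM):
    """True while the holding instrument keeps the transferring instrument enabled."""
    return np.linalg.norm(np.asarray(holding_tip) - np.asarray(sphere_center)) <= radius

def update_carry(carrying, holding_tip, sphere_center):
    """Drop the ball as soon as the holding tip leaves the activation area."""
    if carrying and not transfer_active(holding_tip, sphere_center):
        return False  # ball detaches and drops
    return carrying
```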

The accompanying video (video 1) demonstrates examples of the three VR tasks performed with the aid of the Leap Motion controller. For the CN task, the user first locates the target and then rotates the dummy camera to align its orientation indicator (in red) with the desired orientation (in green). The target is then approached with the camera. Note that when the requested camera–target distance is achieved, the target becomes green. Once the camera has been held steady for the predefined time interval, the target orientation changes and the process is repeated with a new target. For the IN task, the user hits the highlighted buttons with the left and right instruments alternately. Note that on a couple of occasions the board becomes red because it is accidentally hit with the instrument tip. For the third task, the user first holds the left instrument with its tip positioned within the green area to activate the right instrument to perform a ball transfer. This activation is visually indicated by a colored stripe on the tip of the right instrument (i.e., the one used for the transfer). The task is repeated with the instruments used alternately. Note that one ball (transferred with the left instrument) is dropped because the tip of the right instrument accidentally moves outside the highlighted area.

Study design

The experiments and data collection were carried out in two medical training simulation centers: MPLSC (Athens, Greece) and CRESENT (Riyadh, Kingdom of Saudi Arabia). Two groups of surgeons (28 experts and 21 novices) performed two trials of each training task. The experts were those with significant experience in laparoscopic surgery, defined as having performed more than 100 laparoscopic operations; the novices' experience was limited to fewer than ten laparoscopic operations. Written informed consent was obtained from all subjects prior to participation. Each participant first received a detailed description of the tasks and, after a short familiarization with the system, performed the tasks sequentially (CN, IN, and BO).

Performance metrics and analysis

The task-dependent metrics considered, all common in laparoscopic training, were time, pathlength, and errors. Time was the total time required to complete a task, while pathlength was the total distance traveled by either the camera (for CN) or the laparoscopic tools (for IN and BO). Two task-specific errors were predefined for each task and are presented in Table 1. Because each subject performed two trials per task, the average of each metric across the two trials was used for statistical analysis. The analysis was performed using the MATLAB® Statistics toolbox (MathWorks, Natick, MA, USA). Between-group comparison of the performance metrics was undertaken with the Mann–Whitney U test (5% level of significance).
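For illustration, the sketch below computes the pathlength metric and runs the between-group test in Python/SciPy (the study itself used the MATLAB Statistics toolbox); array shapes and names are assumptions.

```python
# Sketch of the metric computation and the Mann-Whitney U comparison.
import numpy as np
from scipy.stats import mannwhitneyu

def pathlength(positions):
    """Total pathlength of a tip trajectory given as an (N, 3) array of positions."""
    positions = np.asarray(positions, float)
    return float(np.sum(np.linalg.norm(np.diff(positions, axis=0), axis=1)))

def compare_groups(expert_scores, novice_scores, alpha=0.05):
    """Two-sided Mann-Whitney U test at the 5% significance level."""
    stat, p = mannwhitneyu(expert_scores, novice_scores, alternative="two-sided")
    return stat, p, p < alpha
```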

Table 1 Description of errors recorded in each simulation task

Questionnaire

The realism and training ability of the tasks were also assessed via a questionnaire completed by the participants after completion of the study. Responses were given on a 5-point Likert scale with the choices "none," "low," "medium," "high," and "very high." Each of the three training tasks was assessed according to the following criteria:

  1. How do you rate the realism of the graphical representation of the VR objects?

  2. How do you rate the realism of the interaction between the instruments and the VR objects?

  3. How do you rate the difficulty of the task?

  4. How important was the lack of force feedback during tool–object interaction?

Results

The effectiveness of the simulation training system was assessed by comparing the performance of the two experience groups. Figure 5 shows bar charts of time and pathlength for each simulation task, and Fig. 6 shows a similar graph for the two error types measured in each task. Table 2 lists the median values and interquartile ranges for all metrics across the three tasks, as well as the statistical comparison results. For most performance metrics, the p values indicate a highly significant difference between the two groups (p < 0.01). Moreover, the interquartile range of the experts was clearly smaller than that of the novices for all metrics across the three tasks, indicating more consistent performance by the former (Table 2).

Fig. 5 Bar charts of time (left) and pathlength (right) for each of the three simulation tasks. On each bar, the top edge is the median and the error bar extends to the 75th percentile

Fig. 6 Bar charts of the errors recorded during performance of each of the three tasks: camera navigation, instrument navigation, and bimanual operation. The top edge of each bar is the median and the error bar extends to the 75th percentile

Table 2 Performance results (medians) and statistical comparison (experts vs. novices) for the four observed metrics across the three simulation tasks

From Figs. 5 and 6, one can clearly see that experts demonstrated higher performance than novices for all metrics across the three tasks (i.e., less time, shorter pathlength, and fewer errors). For the CN task, experts significantly outperformed novices with respect to 'disorientation time' (p < 0.05), indicating that they maintained the camera focus within the allowable orientation range. For the other CN metrics, experts performed better, although not significantly so (p > 0.05): less time, shorter pathlength, and fewer focus losses. For the IN task, the expert group completed the task significantly faster (p < 0.05), with a shorter tool pathlength (p < 0.05) and fewer errors (p < 0.05). For the 'missed buttons' metric in particular, it is worth noting that the experts managed to press all buttons within the maximum time allowed. Bimanual operation was the most demanding task; the results showed a highly significant difference between the two groups across all metrics: time, pathlength, dropped balls, and transfer failures (p < 0.001). As with the 'disorientation time' (CN) and 'missed buttons' (IN) metrics, experts demonstrated almost zero 'transfer failures,' meaning that they managed to transfer nearly all balls successfully.

The greatest differences between the two groups in time and pathlength occurred in the BO task (see Fig. 5), which proved to be the hardest task since it required fine, simultaneous coordination of both instruments. This is also supported by Fig. 7, which illustrates sample instrument trajectories from subjects of different experience levels performing a BO session. The left- and right-hand trajectories of the experts are clearly well separated, implying an increased level of instrument coordination. In contrast, the novices' left- and right-hand trajectories are interlaced, indicating poor instrument manipulation. Note that in this task the pot to which the ball should be transferred and the instrument performing the transfer are always on the same side (i.e., either the left or the right). Experts appeared to perform fine instrument movements without crossing to the other side of the board and tended to move both hands in a similar fashion (Fig. 7A). In contrast, novices tended to move the instruments over the entire board, in a more chaotic pattern, in order to touch and then transfer the ball (Fig. 7B).

Fig. 7 Kinematic data from the left (blue) and right (red) endoscopic tools for experts (A) and novices (B). The data were measured with the Leap Motion controller while the subjects performed the BO task. (Color figure online)

Table 3 presents the subjects' ratings for the face validity of the simulator. With regard to the graphical representation and the physics-based behavior of the simulation tasks, both groups agreed that the realism was high to very high, sufficient to provide the expected training qualities. Only for CN was the interaction realism rated lower than for the other tasks. With regard to task difficulty, the subjects found CN rather easy, which was indirectly reflected in the performance scores: in three of the four metrics there was no significant difference between the groups. The BO task was found to be the hardest, and IN was rated as having medium to high difficulty. Regarding force feedback, both groups seemed to agree that its absence does not play a significant role for CN (a task that does not involve tool–object interaction). For IN, force feedback was rated medium to highly important, whereas for BO it was considered very important.

Table 3 Feedback questionnaire statements and mean ratings for each task and experience group

Discussion

In this paper, we present a low-cost simulator for the assessment of fundamental laparoscopic skills based on the Leap Motion controller. We developed three different VR tasks, each requiring demonstration of specific technical skills such as depth perception, hand–eye coordination, and bimanual manipulation of the instruments. Table 4 presents a basic overview of the characteristics and cost of the proposed system in comparison with the two broad classes of VR laparoscopic simulators. Our system offers the opportunity for basic skills training and assessment at a significantly lower cost than the advanced VR simulators. However, the latter also offer advanced training scenarios, whose implementation requires sophisticated hardware at a much higher cost.

Table 4 Overview of major classes of VR simulators employed for laparoscopic training and assessment

The results demonstrated construct validity (the ability to discriminate between different levels of expertise) of the system for all basic laparoscopic tasks and most of the assessment metrics considered. With regard to the participants' assessment of the tasks, the questionnaire results showed that only for CN was the interaction realism rated lower than for the other tasks, probably because the user does not interact with virtual objects but rather navigates inside the cavity to locate them. Another important point was that for IN the need for force feedback was rated medium to highly important, whereas for BO it was rated very important. The latter was expected, since BO was the hardest task; among other challenges, the subjects had no sense of force feedback when the instrument touched a ball or when the ball was transferred with the instrument into the virtual pot.

Using the Leap Motion controller for instrument tracking provides significant advantages over current simulation systems. First, there is no need for special electromechanical sensors. It is well known that sensors degrade the ergonomics of surgical instruments and thus change the way the surgeon holds them or experiences their form and weight [21]. Second, the system is portable, since the whole setup is compact and lightweight. The constructed tool used as a dummy laparoscope is also lightweight and easy to transport. For tasks in which camera manipulation is not needed, the scene view with respect to the camera can be fixed; the operator may then use the laparoscopic graspers to interact with the virtual models, which provides a natural way of using the instruments.

With regard to the underlying tracking method [22], the 3D reflections captured by the cameras are first sliced into multiple 2D cross-sectional images. The cross-sectional positions and sizes of the 3D objects in each 2D slice are determined from the positions of the three IR sources used to illuminate the object and from the captured reflections. The 3D structure of the object is then reconstructed by assembling the cross-sectional regions obtained from the 2D slices. It is critical to note, however, that IR reflectance from objects in the surrounding environment may interfere with the reflectance from the object of interest. This observation was the main reason for constructing the proposed interface for instrument insertion. Our initial experiments with the Leap Motion controller positioned inside a box trainer (often used as a mock abdominal cavity) showed that the instruments could hardly be tracked. A possible explanation is that the controller detects significant IR reflectance from the interior of the box (i.e., sidewalls and roof), rendering instrument tracking infeasible. An interface for instrument insertion and control similar to the one proposed here has already been employed in commercial simulators [23].

To the best of our knowledge, this is the first study that employs the Leap Motion controller as the core component for the assessment of surgical skills. Our results showed that this technology can indeed be used for instrument tracking, allowing the development of a variety of VR surgical tasks. However, the Leap Motion controller cannot provide additional information that is essential for the development of more complex tasks. For example, in its current form, the controller does not report the rotation of the instrument shaft around its axis. A possible remedy is the development of special tools, such as the proposed dummy laparoscope; in this case, the axis rotation may be estimated from the relative positions of the two parallel shafts (e.g., the vector connecting their tips).

With regard to the camera navigation task, a drawback is that the (hypothetical) simulated camera axis is not coaxial with its rotation (trocar) axis, as it is in surgical endoscopes. A potential solution is an alternative configuration in which the camera handle and the view axis are aligned with one of the two shafts of the dummy laparoscope. The two shafts should then have different diameters so that the one aligned with the handle can be identified. This approach would also alleviate another minor issue, the torque exerted by the camera shafts: the two shafts could be shifted closer to each other, further reducing the torque exerted on the handle. In the current configuration, however, the torque is almost negligible due to the lightweight material employed (wood).

An additional, yet significant, limitation of the system is that surgical tasks requiring use of the instrument jaws cannot be simulated. Such tasks involve actions such as grasping, clipping, and cutting, which are essential in laparoscopic surgery. In its current form, the Leap Motion controller can only be used for basic training tasks on depth perception and instrument navigation. To capture additional degrees of freedom, custom instruments may be developed. For example, since the controller provides the positions of linear objects, one may use a custom tool similar to the dummy laparoscope developed here, in which the distance between the two parallel shafts is regulated by a custom gripper attached to the handle. However, additional challenges may then arise, for example when the camera projections of the two shafts coincide: the projected angle between the shafts becomes zero even though the gripper may still be open. A similar problem was encountered in the CN task when the dummy laparoscope was held vertically with respect to the viewing plane of the controller. To overcome this limitation, the task was designed so that the requested orientations vary within a range of less than 180°. Despite this compromise, the results showed a significant performance difference between the two experience groups in one of the CN metrics.

As mentioned earlier, the VR tasks considered here did not involve actions such as grasping, which would also require force feedback. Although force feedback technologies are currently available, in this study we were mostly interested in developing a low-cost, simple, and portable system for the assessment of basic surgical skills. In fact, the benefit of force feedback in laparoscopic training simulators has been a major topic of debate in the literature. Research studies indicate that training on VR systems without force feedback can also enhance surgical performance and differentiate between subjects with different levels of experience [24, 25]. In addition, a study examining the role of force feedback in surgical training showed that, for basic tasks, haptic-enhanced simulation did not yield an appreciable performance improvement among trainees [26]. Hence, although force feedback can be of crucial importance for technically demanding tasks, it may not be necessary for basic skills training.

As future work, we plan to address the aforementioned limitations and to study the effect of force feedback by performing comparison studies with simulators that incorporate this feature. Moreover, we aim to develop more advanced tasks addressing other important surgical skills. Finally, we aim to investigate the suitability of the proposed system for surgical training and the transfer of virtually acquired skills to the operating room.