Simulation constitutes a novel paradigm for training and assessment in a wide range of medical procedures such as diagnostic examinations (e.g., colonoscopy), therapeutic interventions (e.g., surgery), and effective communication between team members or with the patient (e.g., taking medical history). Especially in surgery, the importance of simulation-based training is highlighted by the fact that trainees can learn in a controlled environment and are free to make mistakes without an expert having to intervene to prevent patient harm. Moreover, trainees are able to review their performance via an assessment report, receive constructive feedback from an expert, and repeat the exact same scenario to improve their skills.

In minimally invasive surgery (MIS), simulation-based learning has been introduced as a valuable alternative to traditional training, mainly due to the complexity of the tasks involved in the performance of a surgical procedure. In particular, MIS requires advanced manual dexterity since the surgeon operates in a restricted environment using long and thin instruments, while an assistant holds the endoscopic camera. Moreover, the lack of depth perception and reduced tactile feedback pose additional constraints, which make MIS more challenging to perform than traditional open surgery. On the other hand, MIS offers significant benefits for the patient such as easier recovery and shorter hospital stays.

Over the last few years, numerous simulation devices have become available, allowing training and assessment of psychomotor skills that are essential in performing MIS procedures. Studies have also shown that these skills are directly transferable into the operating room, further emphasizing the usefulness of simulators in clinical practice [1]. The existing trainers can be divided into three main categories: physical reality (PR), virtual reality (VR), and augmented reality (AR) [24]. PR trainers essentially consist of a box having the form of an abdominal cavity, within which synthetic models of anatomical parts are placed for training purposes. Based on this principle, various customized systems equipped with a variety of electromechanical sensors have been developed in order to capture the tool’s movements [5]. VR simulators employ highly sophisticated multisensory equipment, advanced computer graphics, and physics-based modeling techniques to reproduce tasks that imitate real-life surgical scenarios. The trainee essentially manipulates a custom electromechanical interface to interact with virtual anatomical elements presented on the monitor. Recently, a novel AR simulator was proposed [6]. Using the actual laparoscopic instrumentation, the operator is able to interact with virtual elements, similar to VR, which are, however, introduced into an actual box trainer, similar to that used in PR systems.

Instrument tracking is a critical part of every training system belonging to one of these categories. It essentially refers to the methodologies that are applied in order to obtain information regarding the instrument pose (position and orientation), with respect to a known coordinate system [7]. In VR or AR simulators, knowledge of the laparoscopic instrument pose is essential for implementing training scenarios that provide realistic interaction between the instrument and virtual objects, such as pegs or rings [6]. In PR systems, information regarding the movements of the instrument shaft or tip is also essential for obtaining valuable kinematic information for assessment purposes. However, although the current literature abounds in computational techniques designed for assessment purposes [4], these are based on data obtained from sensors attached to an arbitrary position of the instrument, thus providing an abstract measure of the kinematics of the tool [8, 9]. Precise information about the orientation of the shaft or the position of the tip is not usually employed, mainly due to the difficulty in obtaining this information from the sensor’s data. Recently, some computer vision approaches have been employed for estimating the instrument pose [7, 10].

In most cases, instrument tracking is achieved with 3D tracking devices that consist of: a movable component firmly attached to the instrument and a static component placed at a known position with respect to the simulation environment (e.g., box trainer) [11]. Such devices may be based on electromagnetic (EM) [6], optical [8, 12, 13], or mechanical sensors [14, 15]. The state-of-the-art 3D tracking systems currently used in laparoscopic simulation training provide highly accurate information regarding the pose of the movable component with respect to the known reference frame (static component) [11]. However, to obtain the pose of the instrument with respect to this reference frame, an extra calibration step is essential: knowledge of the instrument pose with respect to the attached tracking device.

While commercially available laparoscopic simulators implement custom calibration methods and achieve accurate results, to the best of our knowledge no plug-and-play calibration technique exists that could be easily applied in experimental practice. In their proposal of a guidance system for laparoscopic surgery, Nicolau et al. [16] presented a calibration method for the position and orientation of the instrument tip with respect to a pattern marker attached to the handle. However, as described by the authors, the calibration method itself was not the main focus of that work. Pagador et al. [17] presented a calibration method for instrument tracking using EM tracking devices. Although the authors provided a detailed description, the proposed calibration protocol required a custom-made wooden apparatus, making their method difficult to replicate.

In this paper we propose a robust, accurate, and easy-to-implement calibration protocol for estimating the pose of a laparoscopic instrument with respect to a tracking device that is attached to an arbitrary location on the instrument handle. Our principal aim is to describe a calibration method that can be easily adapted to various types of rigid endoscopic tools and tracking devices. This method can provide accurate tool kinematics for use in both box-trainer platforms and prototypes of custom VR and AR simulators. The proposed method is designed and tested using an EM tracking system, but it can be adapted to other types of tracking systems such as optical infrared devices or visual markers.

Materials and methods

Experimental setup

The basic components of the experimental setup are a standard laparoscopic instrument and trocar, and the trakSTAR™ (Ascension Tech Corp., Burlington, VT) EM tracking system. The tracking system consists of a transmitter placed at a fixed position on a planar surface, and a receiver (sensor) mounted at an arbitrary position on the instrument’s handle. Additionally, a tripod holds the trocar at a fixed position with respect to the transmitter. The experimental configuration is illustrated in Fig. 1A.

Fig. 1

A Illustration of the experimental setup consisting of a laparoscopic instrument, an EM tracking device, a surgical trocar, and a tripod to hold the trocar in a fixed position. B The coordinate systems corresponding to the experimental setup, \(\varvec{C}_{\varvec{T}}\) for the EM transmitter and \(\varvec{C}_{\varvec{S}}\) for the EM sensor, and the laparoscopic instrument expressed as a vector \(\varvec{V}_{\varvec{I}}\), from points \(\varvec{P}_{{{\mathbf{start}}}}\) to \(\varvec{P}_{{{\mathbf{tip}}}}\)

Theoretical background

The aim of the proposed calibration method is to calculate the pose (position and orientation) of the instrument shaft, as well as the 3D position of the tooltip, with respect to the transmitter. Figure 1B illustrates the two coordinate systems corresponding to the experimental setup: the reference frame of the transmitter (\(\varvec{C}_{\varvec{T}}\)), here referred to as the global coordinate system, and the reference frame of the sensor (\(\varvec{C}_{\varvec{S}}\)), here referred to as the local coordinate system. The EM tracking device provides the position and orientation of the sensor with respect to the transmitter, defined as a linear transformation \(\varvec{M}_{{\varvec{T} \to \varvec{S}}}\) from \(\varvec{C}_{\varvec{T}}\) to \(\varvec{C}_{\varvec{S}}\). A Cartesian transformation between two coordinate systems is expressed as a homogeneous transformation matrix that consists of two parts, rotation and translation:

$$\varvec{M} = \left[ {\begin{array}{*{20}c} {R_{3 \times 3} } & {\begin{array}{*{20}c} x \\ y \\ z \\ \end{array} } \\ {\begin{array}{*{20}c} { 0} & 0 & 0 \\ \end{array} } & {1 } \\ \end{array} } \right]$$
(1)
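As an illustration of Eq. 1, the following minimal sketch assembles a homogeneous matrix from a rotation and a translation using NumPy. The function names `make_pose` and `invert_pose` are hypothetical and are not part of the tracking system's software; the closed-form inverse is valid only under the assumption that the rotation block is orthonormal.

```python
import numpy as np

def make_pose(rotation, translation):
    """Assemble the 4x4 homogeneous matrix of Eq. 1 from R (3x3) and (x, y, z)."""
    M = np.eye(4)
    M[:3, :3] = np.asarray(rotation, dtype=float)
    M[:3, 3] = np.asarray(translation, dtype=float)
    return M

def invert_pose(M):
    """Invert a rigid transform in closed form (assumes R is orthonormal)."""
    R, t = M[:3, :3], M[:3, 3]
    M_inv = np.eye(4)
    M_inv[:3, :3] = R.T          # inverse of a rotation is its transpose
    M_inv[:3, 3] = -R.T @ t      # translation expressed in the rotated frame
    return M_inv
```

The closed-form inverse avoids a general matrix inversion and is reused conceptually wherever the text calls for the inverse of a sensor pose.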

As Fig. 1B depicts, the instrument’s shaft can be described as a vector \(\varvec{V}_{\varvec{I}}\) connecting the points \(\varvec{P}_{{{\mathbf{start}}}}\) and \(\varvec{P}_{{{\mathbf{tip}}}}\) of the local reference frame, where \(\varvec{P}_{{{\mathbf{tip}}}}\) refers to the tooltip and \(\varvec{P}_{{{\mathbf{start}}}}\) is an arbitrary point lying on the shaft, close to the instrument handle. Expressed in spherical coordinates, any point along the shaft is defined as:

$$\varvec{P} = \left[ { \begin{array}{*{20}c} {x_{1} } \\ {y_{1} } \\ {z_{1} } \\ \end{array} } \right] + \left[ {\begin{array}{*{20}c} {r\cdot\sin \left( \theta \right)\cdot\cos \left( \varphi \right)} \\ {r\cdot\sin \left( \theta \right)\cdot\sin \left( \varphi \right)} \\ {r\cdot\cos \left( \theta \right)} \\ \end{array} } \right]$$
(2)

where \(\left( {x_{1} ,y_{1} ,z_{1} } \right)\) are the local coordinates of \(\varvec{P}_{{{\mathbf{start}}}}\), θ and φ are the two angles defining the orientation of \(\varvec{V}_{\varvec{I}}\) with respect to \(\varvec{C}_{\varvec{S}}\), and r is the length of the vector connecting \(\varvec{P}_{{{\mathbf{start}}}}\) and \(\varvec{P}\).
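Equation 2 can be sketched in a few lines of NumPy. The helper names `shaft_direction` and `point_on_shaft` are illustrative assumptions; angles are taken in radians.

```python
import numpy as np

def shaft_direction(theta, phi):
    """Unit vector of the shaft in the sensor frame (angles in radians)."""
    return np.array([
        np.sin(theta) * np.cos(phi),
        np.sin(theta) * np.sin(phi),
        np.cos(theta),
    ])

def point_on_shaft(p_start, r, theta, phi):
    """Point at distance r from P_start along the shaft, as in Eq. 2."""
    return np.asarray(p_start, dtype=float) + r * shaft_direction(theta, phi)
```

Setting r to the full shaft length in `point_on_shaft` gives the tooltip in local coordinates.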

Using the Cartesian transformation matrix of Eq. 1, any point with \(\left( {x,y,z} \right)\) coordinates belonging to the local reference frame can be transformed to a point in the global reference frame using Eq. 3:

$$\varvec{P} = \varvec{M}_{{\varvec{T} \to \varvec{S}}} \cdot \left[ { \begin{array}{*{20}c} x \\ y \\ z \\ \end{array} } \right]$$
(3)

Applying Eq. 3 for \(\varvec{P}_{{{\mathbf{start}}}}\) and \(\varvec{P}_{{{\mathbf{tip}}}}\) along with the transformation \(\varvec{M}_{{\varvec{T} \to \varvec{S}}}\), which is provided by the EM tracking system, the vector \(\varvec{V}_{\varvec{I}}\), corresponding to the pose of the shaft, as well as the tooltip position \(\varvec{P}_{{{\mathbf{tip}}}}\), can be fully defined in the global reference frame. So, essentially the calibration method aims to compute the following parameters: \(\varvec{P}_{{{\mathbf{start}}}}\), r, θ and φ.
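Once the four calibration parameters are known, the shaft can be expressed in the global frame for every tracked pose. The sketch below is one possible reading of Eq. 3, under two assumptions that are ours: the tracker pose is available as a 4×4 matrix that maps sensor coordinates to transmitter coordinates, and points are extended with a homogeneous coordinate of 1 before multiplication. All function names are hypothetical.

```python
import numpy as np

def local_to_global(M_ts, p_local):
    """Map a 3D point from the sensor frame to the transmitter frame."""
    p_h = np.append(np.asarray(p_local, dtype=float), 1.0)  # homogeneous point
    return (M_ts @ p_h)[:3]

def shaft_endpoints_global(M_ts, p_start, r, theta, phi):
    """Return (P_start, P_tip) in the global frame for one tracked pose."""
    direction = np.array([
        np.sin(theta) * np.cos(phi),
        np.sin(theta) * np.sin(phi),
        np.cos(theta),
    ])
    p_start = np.asarray(p_start, dtype=float)
    p_tip = p_start + r * direction  # tooltip in local coordinates (Eq. 2)
    return local_to_global(M_ts, p_start), local_to_global(M_ts, p_tip)
```

Calling `shaft_endpoints_global` on each incoming pose would give the real-time tip position and shaft vector described above.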

Calibration protocol

The proposed calibration method consists of two steps. In the first step, the instrument is fully inserted into the trocar, which is mounted on a fixed tripod within the range of the EM transmitter (Fig. 1A). This setup prevents the instrument from moving in any direction other than that of the trocar. The trocar direction, and consequently the instrument shaft, defines an arbitrary axis in the global reference frame (\(\varvec{C}_{\varvec{T}}\)). At this stage a 360° rotation of the instrument around its shaft is performed, as illustrated in Fig. 2. During this rotation, the EM sensor performs a circular motion with respect to \(\varvec{C}_{\varvec{T}}\), providing a set of uniformly distributed poses \(\varvec{M}_{{\varvec{T} \to \varvec{S}}}^{\varvec{i}}\). The barycenter of this circular path, which lies on the shaft (Fig. 2), is calculated using Eq. 4.

$$\left[ x_{c} , y_{c} , z_{c} \right] = \frac{1}{N} \sum\limits_{i = 1}^{N} \left[ x, y, z \right]_{i}$$
(4)

where \((x_{c} , y_{c} , z_{c} )\) are the coordinates of the barycenter in the global reference frame, N is the total number of recorded sensor positions, and \(\left( {x, y, z} \right)_{i}\) are the coordinates of the \(\varvec{M}_{{\varvec{T} \to \varvec{S}}}^{\varvec{i}}\) origins in the global reference frame.
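A minimal sketch of Eq. 4 follows, assuming the recorded poses are available as a sequence of 4×4 matrices whose last column holds the sensor origin in global coordinates. The names `sensor_origins` and `barycenter` are hypothetical.

```python
import numpy as np

def sensor_origins(poses):
    """Stack the translation part (sensor origin) of each 4x4 pose into an N x 3 array."""
    return np.array([np.asarray(M, dtype=float)[:3, 3] for M in poses])

def barycenter(poses):
    """Mean of the sensor origins recorded during the 360-degree rotation (Eq. 4)."""
    return sensor_origins(poses).mean(axis=0)
```

The mean coincides with the center of the circular path only when the samples cover the full rotation roughly uniformly, which is why the protocol asks for a complete 360° turn.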

Fig. 2

Step 1 of the calibration method: A 360° rotation of the instrument around its shaft provides a set of uniformly distributed poses of the EM sensor with respect to the EM transmitter

Since the point \((x_{c} , y_{c} , z_{c} )\) lies on the instrument shaft, transforming its coordinates into the local reference frame provides \(\varvec{P}_{{{\mathbf{start}}}}\). This transformation is achieved using the inverse of any \(\varvec{M}_{{\varvec{T} \to \varvec{S}}}^{\varvec{i}}\):

$$\varvec{P}_{{{\mathbf{start}}}} = \left[ {\varvec{M}_{{\varvec{T} \to \varvec{S}}}^{\varvec{i}} } \right]^{ - 1} \cdot [x_{c} , y_{c} , z_{c} ]^{T}$$
(5)
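Equation 5 can be sketched as follows, again assuming 4×4 poses that map sensor coordinates to transmitter coordinates and a homogeneous coordinate of 1 appended to the barycenter. The function name `p_start_local` is illustrative.

```python
import numpy as np

def p_start_local(M_ts_i, center_global):
    """Express the global barycenter in the sensor frame of pose i (Eq. 5)."""
    c_h = np.append(np.asarray(center_global, dtype=float), 1.0)  # homogeneous point
    return (np.linalg.inv(np.asarray(M_ts_i, dtype=float)) @ c_h)[:3]
```

Because the sensor is rigidly attached to the handle, every pose of the rotation should yield nearly the same local point; the spread across poses could serve as a quick sanity check of the recording.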

Given that the sensor moves along a circular path, the collected sensor positions lie on a plane that is perpendicular to the axis of rotation. Hence, singular value decomposition (SVD) of the collected \(\varvec{M}_{{\varvec{T} \to \varvec{S}}}^{\varvec{i}}\) origins, after subtracting their barycenter, provides the normal of this plane as the singular vector associated with the smallest singular value. This normal is the orientation of the instrument shaft with respect to the global reference frame in the form of a 3D vector. The remaining step is to transform this 3D vector into the local coordinate system. This is achieved by using the inverse of any \(\varvec{M}_{{\varvec{T} \to \varvec{S}}}^{\varvec{i}}\) (Eq. 3). This transformation results in a direction vector \([n_{x} , n_{y} , n_{z} ]\) in the local coordinate system. The angles θ and φ, which describe the orientation of \(\varvec{V}_{\varvec{I}}\) with respect to the EM sensor, are calculated as:

$$\theta = \arccos \left( { n_{z} / \sqrt {n_{x}^{2} + n_{y}^{2} + n_{z}^{2} } } \right)$$
(6)
$$\varphi = \arctan \left( {n_{y} /n_{x} } \right)$$
(7)
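The axis estimation and Eqs. 6 and 7 can be sketched as below. Several choices here are assumptions rather than the protocol itself: the origins are mean-centered before the SVD, only the rotation block of the pose is applied to the direction vector, and `np.arctan2` is used in place of the plain arctangent of Eq. 7 so that the quadrant of φ is resolved. The sign of the SVD normal is arbitrary, so a caller would need to flip it if it points from the tip toward the handle.

```python
import numpy as np

def rotation_axis_global(origins):
    """Unit normal of the best-fit plane through the sensor origins (sign is arbitrary)."""
    pts = np.asarray(origins, dtype=float)
    centered = pts - pts.mean(axis=0)
    # Rows of vt are ordered by decreasing singular value; the last row is the plane normal.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[-1]

def axis_to_angles(M_ts_i, axis_global):
    """Rotate the global axis into the sensor frame and return (theta, phi) in radians."""
    R = np.asarray(M_ts_i, dtype=float)[:3, :3]
    n = R.T @ np.asarray(axis_global, dtype=float)  # directions ignore translation
    theta = np.arccos(n[2] / np.linalg.norm(n))     # Eq. 6
    phi = np.arctan2(n[1], n[0])                    # Eq. 7 with quadrant handling
    return theta, phi
```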

The second step of the calibration protocol aims to find the length of \(\varvec{V}_{\varvec{I}}\), denoted as r in Eq. 2. During this step, the instrument is positioned so that the tooltip comes in contact with the surface on which the EM transmitter is placed (see Fig. 3A). At this stage, \(\varvec{P}_{{{\mathbf{start}}}}\) and the angles θ and φ are known. Hence, considering an arbitrary length for the instrument shaft, a random point \(\varvec{P}_{{{\mathbf{rand}}}}\) that lies on the shaft is assumed (Fig. 3B). Solving a line–plane intersection system of equations, the point \(\varvec{P}_{{{\mathbf{tip}}}}\) at which the line \(\varvec{P}_{{{\mathbf{start}}}}\)–\(\varvec{P}_{{{\mathbf{rand}}}}\) intersects the bottom plane of \(\varvec{C}_{\varvec{T}}\) is derived. The distance between this point and \(\varvec{P}_{{{\mathbf{start}}}}\) is the length of the instrument’s shaft (r).
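The line–plane intersection of the second step can be sketched as follows. The plane is passed in as a point and a normal in global coordinates; the default of z = 0 with normal (0, 0, 1) is purely an assumption for illustration, since the actual offset of the transmitter's bottom surface depends on the device. The function name `shaft_length` is hypothetical.

```python
import numpy as np

def shaft_length(M_ts, p_start, theta, phi,
                 plane_point=(0.0, 0.0, 0.0), plane_normal=(0.0, 0.0, 1.0)):
    """Intersect the shaft line with a plane; return (r, P_tip in global coordinates)."""
    M_ts = np.asarray(M_ts, dtype=float)
    R, t = M_ts[:3, :3], M_ts[:3, 3]
    d_local = np.array([np.sin(theta) * np.cos(phi),
                        np.sin(theta) * np.sin(phi),
                        np.cos(theta)])
    g_start = R @ np.asarray(p_start, dtype=float) + t  # P_start in the global frame
    g_dir = R @ d_local                                 # unit shaft direction, global
    normal = np.asarray(plane_normal, dtype=float)
    denom = normal @ g_dir
    if abs(denom) < 1e-9:
        raise ValueError("shaft is parallel to the plane; no unique intersection")
    # Signed distance along the shaft from P_start to the plane.
    r = normal @ (np.asarray(plane_point, dtype=float) - g_start) / denom
    return r, g_start + r * g_dir
```

Because the direction vector has unit length, the line parameter is directly the shaft length; a negative value would suggest that the estimated axis points away from the tip and should be flipped.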

Fig. 3

A Step 2 of the calibration method: The instrument tip is placed in contact with the EM transmitter’s bottom surface. B The length (\(r\)) of the instrument shaft is calculated via the point \(\varvec{P}_{{{\mathbf{tip}}}}\), at which the line \(\varvec{P}_{{{\mathbf{start}}}}\)-\(\varvec{P}_{{{\mathbf{rand}}}}\) intersects the bottom plane of the EM transmitter coordinate system

Results

To evaluate the accuracy of the proposed method, three evaluation experiments were performed with regard to the instrument’s orientation and tip position. The first experiment aimed to measure the accuracy in the estimation of the angles θ and φ that describe the instrument’s shaft orientation. As a gold standard we used a second sensor connected to the same transmitter as the sensor attached to the instrument handle. In particular, a custom component was built allowing the second sensor to be positioned inside a trocar so that its axis was perfectly aligned with the direction of the trocar. Based on this configuration, we were able to obtain theoretical values for the angles θ and φ, which essentially provide the direction of the trocar (with respect to the EM transmitter). The trocar was always placed at a fixed position with respect to the transmitter. Then, a set of 50 calibration estimates of the trocar direction (angles θ and φ) was obtained by rotating the instrument around the trocar direction axis (Fig. 2). These estimates were provided by the proposed method based on the measurements obtained by the first sensor attached to the instrument handle, as described in the Methods. Comparing the outcome of each of these calibrations with the theoretical values of θ and φ, the mean error and standard deviation were measured. As Table 1 shows, the mean errors were 0.46° ± 0.2° for angle θ and 0.6° ± 0.51° for angle φ.

Table 1 Mean errors and standard deviations for the three experiments that were performed to evaluate the accuracy of the proposed method

Then, we evaluated the accuracy of the proposed method in estimating the 3D position of the tooltip. During the second experiment, a set of tooltip positions was recorded while the tooltip moved along a line parallel to the x-axis of the transmitter’s coordinate frame, as illustrated in Fig. 4A. The projection of these positions on the xy and xz planes of the transmitter coordinate frame can be seen in Fig. 4B, C, respectively. Table 1 reports the mean error regarding the deviation of the recorded positions from the theoretical line: 0.67 ± 0.4 mm in the y-axis and 0.37 ± 0.2 mm in the z-axis.

Fig. 4

A 3D positions collected while the instrument tip moves across a line (indicated in red) parallel to the X-axis of the transmitter’s coordinate system. B Deviation from the ideal line in the Y direction. C Deviation from the ideal line in the Z direction (Color figure online)

During the third experiment, a set of tooltip positions was collected while the tooltip performed random movements on the xy plane of the transmitter’s coordinate frame, as illustrated in Fig. 5A. The mean error regarding the deviation of the recorded positions from the theoretical plane, indicated as a red line in Fig. 5B, is 0.39 ± 0.2 mm (Table 1).
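The deviation statistics of the second and third experiments could be computed along the lines of the sketch below. The exact error definition used for Table 1 is not spelled out in the text, so the choice of mean absolute deviation and sample standard deviation is an assumption, as are the function name `axis_deviation` and its parameters.

```python
import numpy as np

def axis_deviation(positions, axis, reference):
    """Mean and sample standard deviation of |coordinate - reference| along one axis.

    positions: N x 3 array of recorded tooltip positions (global frame).
    axis:      0, 1 or 2 for x, y or z.
    reference: coordinate of the ideal line or plane along that axis.
    """
    pts = np.asarray(positions, dtype=float)
    err = np.abs(pts[:, axis] - reference)
    return err.mean(), err.std(ddof=1)
```

For the plane experiment one would call it once with `axis=2`; for the line experiment, once for y and once for z.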

Fig. 5

A 3D positions collected while the instrument tip performs continuous random movement on the xy plane of the EM transmitter’s coordinate system. B Deviation from the ideal plane in the Z direction

Figure 6 illustrates qualitative results regarding the potential use of the proposed calibration method in an AR environment. A pattern marker was employed to obtain the relationship between the camera and the EM transmitter. This setup provided the pose of the EM sensor, attached to the instrument handle, with respect to the camera coordinate system. Using this information along with the outcome of our calibration, a virtual cylinder (in red) was overlaid on the camera scene, in order to visually illustrate the accuracy of the estimated shaft position with respect to the camera. Although tracking of the pattern marker introduced additional errors to the final result, the visual outcome is indicative of the accuracy that the proposed calibration method provides.
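The chain of transformations behind this overlay can be sketched as a simple matrix product. The sketch assumes that the camera-to-transmitter relationship recovered from the pattern marker is available as a 4×4 matrix mapping transmitter coordinates to camera coordinates; the marker detection itself is outside its scope, and the name `shaft_in_camera` is hypothetical.

```python
import numpy as np

def shaft_in_camera(M_cam_tx, M_tx_sensor, p_start_local, p_tip_local):
    """Express the calibrated shaft endpoints in the camera coordinate system."""
    # Compose camera <- transmitter with transmitter <- sensor.
    M_cam_sensor = np.asarray(M_cam_tx, dtype=float) @ np.asarray(M_tx_sensor, dtype=float)

    def to_camera(p):
        return (M_cam_sensor @ np.append(np.asarray(p, dtype=float), 1.0))[:3]

    return to_camera(p_start_local), to_camera(p_tip_local)
```

The two returned points would then define the axis of the virtual cylinder that a renderer draws over the real shaft.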

Fig. 6

Screenshots of an AR application, where a virtual representation of the instrument shaft (in red) is rendered on top of the real shaft. Position and orientation of the virtual shaft are calculated using the output of the proposed calibration method (Color figure online)

Discussion

This paper presents a calibration method for calculating the pose of a rigid laparoscopic instrument with respect to a 3D tracking sensor that can be attached to an arbitrary position on the handle. The proposed method allows real-time tracking of the tip position, as well as tracking of the shaft orientation with respect to the training, or operating, environment. The goal of this study is to provide an accurate and easy-to-implement calibration protocol, which can be applied to different types of instruments and a wide variety of tracking devices. Existing custom training systems may also benefit from this technique, for example for extracting the tooltip position or for generating a kinematic model of the instrument shaft.

Our results indicate submillimeter accuracy in the estimation of the tooltip position with respect to a known coordinate frame. This level of accuracy allows the potential use of the proposed method for objective assessment of the trainee’s performance in standard box trainers, where information about the tip position cannot be extracted, although it is important [18]. Studies have shown that knowing how the operator performs a surgical maneuver is essential since a higher level of dexterity is associated with shorter operations and fewer complications [19].

Our results also indicate subdegree accuracy in the estimation of the orientation of the instrument shaft with respect to the sensor attached to the handle. This finding allows the proposed method to be employed, for example, in AR applications, where standard problems such as occlusion handling require increased levels of precision [10]. To support this claim, the presented method was recently employed by our group to obtain an accurate 3D model of the laparoscopic instrument for AR applications in simulation-based training [6].

A significant advantage of the proposed method is that it is based on two simple calibration steps that are performed only once, prior to task performance. Moreover, the experimental setup is based on a common EM sensor, a surgical trocar, and a static holder, all of which are readily available in a simulation training laboratory. These prerequisites allow the method to be easily applied by users without a technical background to derive a 3D representation of the instrument shaft. An additional advantage arises from the fact that the method allows placement of the sensor at any arbitrary position on the instrument handle, offering surgeons the flexibility to decide upon an optimum placement of the sensor that will not affect or restrict hand movements. Potentially, our method can also be utilized for in vivo applications, where freedom of motion is crucial. In such a case, one could design experiments to obtain the kinematics of the entire 3D instrument model, and also combine this information with pre-calculated (e.g., from CT or MRI) position data of critical anatomical structures. However, a number of additional challenges need to be addressed in an in vivo study, such as sterilization, the maximum range of the EM transmitter, and the presence of other ferromagnetic materials (see below).

A potential limitation originates from the inherent inaccuracy of EM sensors, which may be affected by various external factors; for example, EM sensors similar to the ones used in this study can demonstrate significant loss of accuracy due to EM interference caused by metallic objects present in the surrounding environment and in close proximity with the sensors employed. In a similar manner, optical-based tracking devices require a clear field of view between the camera and the sensor in order to provide accurate tracking results. These factors should also be considered and avoided in the calibration setup; otherwise, they could significantly affect the accuracy of the results.

In conclusion, we have presented a calibration technique to obtain the tip position and a 3D model of the shaft of rigid endoscopic instruments utilized in MIS. In contrast to other works where EM tracking sensors are placed on the tip of surgical tools [20], our method utilizes a single EM sensor placed on the handle. Moreover, the proposed method does not require the employment of special calibration frames [17], and it is simple, inexpensive, and has potential applications not only in training systems but also in the operating room.