Introduction

Before 1970, open surgery was the only means of surgical entry into the sino-nasal tract. This kind of surgery was reserved for organic disease because it was difficult, invasive, and painful. Wigand [1] in 1978 and Messerklinger and Stammberger [2] in 1985 were the first surgeons to use an endoscope to perform a minimally invasive approach into the sinuses and to treat functional disease. Nowadays, surgeons still use an endoscopic approach, but the technology has evolved and allows treatment of benign and malignant tumors. Surgeons are able to open all sinuses (maxillary, ethmoid, frontal, and sphenoid) and to access the brain through the sino-nasal tract and the anterior and middle-stage skull base. The rate of complications has been reduced thanks to three major advances: first, reconstruction procedures (endoscopic duroplasty); second, navigation systems (see, for instance, Klimek and Mösges for a review of the pioneering works [3] and Justice et al. for more recent works [4]); and third, a better knowledge of vital structures such as the ethmoid roof, the lamina papyracea, the cribriform plate, the area between the posterior ethmoid and the sphenoid, the optic nerve, and the internal carotid artery in the lateral roof of the sphenoid.

In these operations, the (right-handed) surgeon stands to the (right) side of the patient, who is in a supine position. He/she holds a rigid endoscope in his/her minor (left) hand, while the major (right) hand holds either a suction instrument, which is used for breaking the different structures and for aspirating blood, or another instrument, for instance a forceps to remove the fragments. The surgeon therefore has to switch instruments frequently in his/her major hand. To overcome this difficulty, surgeons would benefit from a third hand: one for holding the endoscope, the second for holding a suction instrument, and the third for holding a forceps.

An attractive alternative is to make use of robotic assistance. Transoral robotic surgery (TORS) has already been shown to provide safe access to the oral cavity, the oropharynx, the hypopharynx, the supraglottis, and the glottis [5, 6]. However, the literature shows that only a few investigations have so far been performed to access the skull base transorally for resection of tumors: In [5, 7, 8], different ports were tested on cadavers or animals with a da Vinci robotic system (Intuitive Surgical, footnote 1). As reported by Hanna et al. [9], the transnasal approach provides a more direct and less invasive access to the skull base: The robotic system, again a da Vinci, makes it possible to accomplish precise closure of dural defects. Other robotic solutions, which are restricted to holding and moving the endoscope, have also been explored. Since they are more dedicated to the task, these systems can be smaller, less bulky, and potentially less expensive. Several active laparoscope positioners have been developed since AESOP (Computer Motion, footnote 2) in the mid-1990s, a few of which are still marketed, for instance the Lapman (Medsys, footnote 3), SoloAssist (AKTORmed, footnote 4), or FreeHand (Prosurgics, footnote 5). However, for various reasons (size, kinematics, control interface, etc.), they do not meet the requirements of skull base surgery, which justifies other developments. Some of them are based on industrial robots, such as a Mitsubishi PA 10-6c for Strauss et al. [10], a Mitsubishi RV1a for Wurm et al. [11], a Stäubli TX40 for Eichhorn and Bootz [12], and a parallel robot from URS for Nimski et al. [13]. Other devices are based on medical robots, such as a Neuromate (Renishaw) for Xia et al. [14]. However, these prototypes are too bulky near the tip of the nose and give only limited access to the sino-nasal tract and the skull base.
Note that a robot is defined by the ISO 8373:2012 document as “an actuated mechanism programmable in two or more axes with a degree of autonomy, moving within its environment, to perform intended tasks.” Many systems that are commonly called robots do not comply with this definition. In this paper, the term “robot” is only used in accordance with this definition. Devices that use robotic or mechatronic technologies found in robots but are unable to carry out autonomous motion are referred to by other names (e.g. robotic system, robotic manipulator, robotic positioner).

The goal of our work is to design a compact, ergonomic, and safe robotic endoscope positioner dedicated to the sino-nasal tract and the anterior and middle-stage skull base. In this paper, we present the first two steps toward this goal. Section 2 describes a motion and force analysis of the surgeon’s movement that was performed on cadaver heads in order to obtain objective data for specification. Subsequently, Sect. 3 describes an experimental comparative study, again on cadaver heads, with three different kinematics in order to define the best architecture satisfying the motion and force requirements.

Motion and force analysis

The shape, topology, and dimensions of the endoscope positioner should comply with the kinematics of a human arm and should not restrict the surgeon’s movement during the procedure. The positioner must also be safe for both surgeon and patient, and as compact as possible in a cluttered environment such as an operating room (OR). To quantify these specifications, we carried out an experiment to characterize the surgeon’s movement in terms of motion of the endoscope during the procedure (orientation, penetration into the nostril, speed, workspace) and in terms of interaction forces with the tissues, so as to determine safety thresholds for preventing damage to vital organs.

Materials

The experimental setup consisted of the following components:

  • A custom head-holding frame (Fig. 1) to immobilize the head but also to define a common coordinate system for computed tomography (CT) and motion data. It is made of PVC material to prevent artefacts in the CT images. Four PVC screws were fixed to each cadaver skull. Then, the head was fastened in the frame by four adjustable screws that come into contact with the PVC screws. On the anterior part of the skull, five PVC fiducial passive landmarks were also inserted for registration purposes. They are filled with a contrast substance that can be clearly visualized on the bone window of the CT scan;

  • Conventional surgical instruments: A \(30^{\circ }\) endoscope (Ø: 3 mm, length: 170 mm, Wolf, USA), a cubit suction instrument (Ø: 4 mm, length: 170 mm, Medtronic, USA) and a Blakesley nasal forceps (Ø: 4 mm, length: 180 mm, Medtronic, USA);

  • A 3D optical tracking system based on infrared LED markers (Easytrack 500, Atracsys, Switzerland) to track motions of the endoscope and of the suction instrument on which markers are appropriately mounted (Fig. 2);

  • Two 6-axis Force-Torque (FT) sensors (Nano 43 and Mini 45, Schunk GmbH, Germany), one mounted on the endoscope, the other on the suction instrument;

  • As shown in Fig. 2, in addition to the FT sensor and the infrared LED markers, each instrument was equipped with an appropriate handle to facilitate grasping by the surgeon. Each handle was designed so that the FT sensor measures the efforts exerted by the surgeon either to navigate the endoscope or to open or break anatomical structures with the suction instrument;

  • A custom CT image-based navigation system developed in a MATLAB environment using the Image Processing Toolbox in the Windows XP SP2 OS. A screen shot of the graphical interface is shown in Fig. 3.

Fig. 1 Custom head-holder with adjustable screws and fiducial landmarks

Fig. 2 Endoscope (left) and suction instrument (middle) equipped with a handle, a FT sensor, and infrared LED markers (right)

Fig. 3 Screen shot of the graphical user interface: CT scan navigation system (left), FT data of the endoscope and the suction instrument (upper right), endoscopic and external camera images (bottom right)

Method

An ENT resident operated on 13 cadaver sino-nasal tracts. The approach was essentially the same as in actual surgery: The surgeon stood to the right of the cadaver head, holding the endoscope with his left hand and operating the suction instrument or the forceps with his right hand. The surgeon performed an ethmoidectomy with the suction instrument, applying forces normal to the structures whenever possible. The goal was to completely open all the sinuses. When this step was accomplished, the surgeon broke the vital structures (internal carotid in the sphenoid sinus, sella turcica, lamina papyracea, and anterior skull base) with the suction instrument to test their breaking resistance.

During each procedure, the tracking system recorded motion data of the endoscope and of the suction instrument thanks to the infrared LED markers. The FT sensor of the endoscope provided data about friction forces and contact forces normal to the endoscope axis, whereas the FT sensor of the suction instrument provided data about the minimal forces necessary to open the sinuses and the maximal forces that each vital structure could withstand before being damaged. The motion of the surgeon’s hands was also recorded via an external camera. Prior to the surgery, a high-resolution CT scan had been performed on each head for navigation and data processing purposes (slice thickness of 0.625 mm for native images and 1 mm for reconstruction in the sagittal, axial, and coronal views).

Experimental procedure

The experiments were performed at the Anatomy Laboratory of the School of Medicine, University of Montpellier. The first step was devoted to initialization, consisting of:

  • A point-to-point registration procedure, as proposed by Arun et al. [15], to estimate (i) the tool tip and endoscope tip positions in the tracking coordinate system, (ii) the rigid transformation between the coordinate system of the CT images and the tracking coordinate system. This was performed using the five fiducial landmarks, which are easily recognized on the images and can be found in the tracking system reference frame, by touching them with the instrument tip.

  • Since the result of a point-to-point registration was not robust enough for further calculations with so few landmarks, it was refined using a surface morphing procedure: The head was swept with the endoscope tip in order to acquire a cloud of points on its surface. Using a 3D model of the head generated from CT images, a conventional iterative closest point (ICP) algorithm was finally run, as proposed by Besl et al. [16]. This was performed with the aid of the Cloud Compare V2.1 software (footnote 6), using the results from the point-to-point registration as initial conditions. Table 1 shows the good result obtained with this 2-step procedure on one of the cadaver heads with a cloud of 2,463 points acquired at a frequency of 65 Hz.

  • Estimation of the offset of both FT sensors: A FT sensor has a natural offset that needs to be removed. This can be done by initializing the sensor, that is, by applying a known load in a known direction (in our case, the handle weight along the \(z\) axis), which can then be subtracted from the data during data processing. In this way, all efforts read by the sensors correspond either to external efforts or to dynamical loads.
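The offset estimation described in the last item can be sketched as follows. This is only an illustration, assuming that readings are averaged while the sensor is at rest and loaded solely by the handle weight; the function names, the handle mass, and the sample values are hypothetical and are not taken from the experiment.

```python
from statistics import fmean


def estimate_offset(rest_samples, known_load):
    """Estimate the sensor offset from readings taken at rest.

    rest_samples: list of (fx, fy, fz) readings in newtons.
    known_load:   (fx, fy, fz) load known to act on the sensor at rest
                  (assumed here to be the handle weight along z).
    """
    mean = [fmean(sample[i] for sample in rest_samples) for i in range(3)]
    return tuple(m - k for m, k in zip(mean, known_load))


def remove_offset(sample, offset):
    """Subtract the estimated offset from one force reading."""
    return tuple(v - o for v, o in zip(sample, offset))


# Hypothetical numbers: a 0.1 kg handle, i.e. about 0.981 N along -z.
handle_weight = (0.0, 0.0, -0.981)
rest = [(0.21, -0.10, -0.68), (0.19, -0.10, -0.68), (0.20, -0.10, -0.68)]

offset = estimate_offset(rest, handle_weight)
corrected = remove_offset((0.20, -0.10, -0.68), offset)
```

After correction, a reading taken at rest reduces to the known handle weight, which can in turn be subtracted during data processing.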

Table 1 Precision of the registration procedure and standard deviation (the RMS cannot be computed using the Cloud Compare procedure)

Once the initialization was performed, the clinical procedure was initiated during which position/orientation and FT data were recorded. Then, the voluntary destruction of the aforementioned vital structures was performed. Videos from the external camera and the endoscope camera were recorded during the whole procedure. This information was used offline to detect collisions between tools during the experiment for further processing of potential bias introduced on the data. The last step consisted of data exploitation, where videos were synchronized, and external forces applied to the instruments were calculated from FT sensor and tracker information. This was done using Newton–Euler laws from which, given linear and angular velocities and accelerations, it is possible to calculate the resultant efforts applied to each instrument [17]. The whole experiment lasted about 40 min for each nasal cavity.
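As an illustration of the force computation, the sketch below applies the translational part of the Newton–Euler equations only; the rotational part is omitted for brevity. It assumes that all vectors are expressed in the same fixed frame, that the sensor reading is the force transmitted from the handle to the instrument, and that the instrument mass is known. The function name, the sign convention, and the numbers are assumptions made for this example.

```python
def external_force(f_sensor, mass, accel, gravity=(0.0, 0.0, -9.81)):
    """Translational Newton-Euler balance for the instrument.

    m * a = f_sensor + f_ext + m * g   =>   f_ext = m * (a - g) - f_sensor

    f_sensor: force transmitted through the sensor to the instrument (N).
    mass:     mass of the instrument beyond the sensor (kg).
    accel:    linear acceleration of its center of mass (m/s^2).
    """
    return tuple(
        mass * (a - g) - f for a, g, f in zip(accel, gravity, f_sensor)
    )


# Hypothetical static case: the sensor only supports a 0.2 kg instrument,
# so no external force should be found.
mass = 0.2
at_rest = external_force((0.0, 0.0, mass * 9.81), mass, (0.0, 0.0, 0.0))

# Same instrument, with the handle additionally pushing 5 N along x
# against a structure: the structure reacts with about -5 N along x.
in_contact = external_force((5.0, 0.0, mass * 9.81), mass, (0.0, 0.0, 0.0))
```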

Results

Data processing consisted in merging all the data continuously recorded during the experiments, and in identifying the relevant moments where particularly important tasks were performed, ensuring data adequacy, and seeking reasons for any non-adequacy.

We assumed that the endoscope and the suction instrument were sufficiently rigid to work without excessive bending during nominal use, which seemed reasonable as they are products specifically designed for the sino-nasal tract. This turned out to be true for the endoscope, but not for the suction instrument when breaking the vital structures. However, the surgeon tried to approach the structures as much as possible in a normal direction, thus minimizing bending. Accurate estimation of deflection would require precise knowledge of the contact points, which is not possible to obtain. In any case, bending is only a consequence of the application of forces and/or torques on the instrument. The same force required to break a given structure would have been measured if applied with a stiffer or softer instrument. This latter point leads to another comment: The figures given below approximate the forces needed to drive the endoscope or break vital structures. They are obviously not sufficient to contribute to the biomechanics of the sino-nasal tract. However, they are necessary and sufficient to determine the ranges of values of parameters (1) to specify and design a robot and (2) to monitor several critical parameters for the patient’s safety when the robot is working.

The FT sensor mounted on the endoscope provides data relative to forces tangent to the surface (thus friction) and contact forces normal to the endoscope axis, applied to the endoscope throughout the duration of each procedure. We quantified the maximal forces applied to the endoscope in order to evaluate the forces that the robot should be able to overcome. The moduli of the forces are always higher than 10 N and never higher than 20 N (Fig. 4).

Fig. 4 Moduli of the maximal forces applied to the endoscope during the entire procedure for the 13 sino-nasal tracts explored

With the FT sensor mounted on the suction instrument, we recorded the minimal forces necessary to open the sinuses and to damage vital structures. The moduli of the forces lie between 10 and 24 N to open the maxillary sinus, 20 and 38 N to open the ethmoid sinus, 8 and 26 N to open the frontal sinus, and 18 and 34 N to open the sphenoidal sinus. The moduli of the forces to break the anterior skull base, the carotid, and the anterior wall of the sella turcica are always higher than 40 N, while they are a little lower for the lamina papyracea (between 13 and 34 N). Note that the absolute value of a force is computed disregarding its direction since the tip of the suction instrument forces the anatomical structure at a single contact point. As a consequence, the measured torque purely results from the contact force and does not contribute to any additional information.

After checking the video recordings, we concluded that the discrepancies in these figures were mostly due to inter-individual variability. Therefore, the robot should allow forces around 20 N (normal friction forces) while preventing forces higher than 40 N so as not to injure vital structures. It is nevertheless possible that the 20 N lower threshold is over-estimated, since the surgeon might also have applied forces in directions that were not efficient for breaking the desired structures. Therefore, these values need to be confirmed by experiments on living tissues and adjusted to guarantee appropriate safety margins. However, from a design perspective, this possible over-estimation is conservative and ensures that the robot is capable of applying a sufficient amount of force. To protect the lamina papyracea, which is more fragile, one possible solution would be to create kinematic constraints (also termed virtual fixtures) on the robot, thus preventing the instruments from approaching this area.
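A minimal sketch of how the two thresholds could be monitored is given below. The 20 N and 40 N values come from the measurements above; the three-level classification, the function names, and the labels are assumptions made for illustration and do not describe an implemented safety system.

```python
import math

NOMINAL_FORCE_N = 20.0  # forces the robot should allow (from the text)
MAX_FORCE_N = 40.0      # forces the robot should prevent (from the text)


def force_modulus(force):
    """Modulus of a force vector (fx, fy, fz), direction disregarded."""
    return math.sqrt(sum(c * c for c in force))


def check_force(force):
    """Classify a force reading; the 'warn' band is a hypothetical choice."""
    modulus = force_modulus(force)
    if modulus > MAX_FORCE_N:
        return "stop"
    if modulus > NOMINAL_FORCE_N:
        return "warn"
    return "ok"
```

For instance, `check_force((3.0, 4.0, 0.0))` corresponds to a 5 N modulus and is classified as "ok", whereas any reading above 40 N is classified as "stop".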

The tracking system gave the endoscope position and orientation throughout the surgery. Contrary to the effort figures given in a coordinate system related to the sensor, the rotation values are given in a coordinate system related to the patient (Fig. 5). We measured an endoscope rotation travel in the sagittal plane ranging between \(26^{\circ }\) and \(66^{\circ }\), in the axial plane between \(34^{\circ }\) and \(68^{\circ }\), and around the endoscope axis between \(42^{\circ }\) and \(71^{\circ }\). We computed an endoscope angular velocity between \(12^{\circ }\)/s and \(43^{\circ }\)/s in the sagittal plane, \(15^{\circ }\)/s and \(56^{\circ }\)/s in the axial plane, and \(27^{\circ }\)/s and \(57^{\circ }\)/s around the endoscope axis. The penetration depth of the endoscope into the nose varies between 70 and 100 mm.

Fig. 5 Definition of the CT image planes and expected fixed point position due to the nose constraint

An important question is whether the endoscope, when inside the nasal cavity, rotates about a fixed point. If so, it would justify choosing kinematics generating a kind of remote center-of-motion (RCM). The advantages of such architectures over conventional robots are their high accuracy, their intrinsic safety (the instrument passes through the RCM whatever happens), and the limited volume swept by the robot links outside the patient. This makes sense since the nose entrance is fairly constrained and the endoscope displacement inside the nasal cavity seems to comply with it, as shown in Fig. 5. In order to assess this assumption, the data were analyzed as follows: An algorithm looks for the point in space that minimizes the sum of the squared distances to the endoscope axis throughout the entire procedure. A gradient descent method with a fixed step (equal to 0.5 mm) was implemented in MATLAB, where the gradient is estimated by evaluating the squared distances at six points in the neighborhood of the test point. The starting point was estimated approximately using the CT images, somewhere inside the nose. We established that for each sinus, there is a fixed region that encloses the instantaneous center of rotation rather than a true fixed point, which can be represented by a box whose edges are a few millimeters long. An example is given in Fig. 6 (left). For a given nasal cavity, these sinus boxes (one for each sinus) can be included in a larger bounding box in which the instantaneous center of rotation will remain when exploring the whole nasal cavity. The corresponding position of such a bounding box inside the nose is shown in the three cross-section views of Fig. 6 (right). Experimentally, it has been found that the dimensions of the bounding boxes vary from 4.8 to 20.9 mm along \(x\), from 13.8 to 30.9 mm along \(y\), and from 7.2 to 34.9 mm along \(z\). Such variations are related to anatomical peculiarities of the head and also to the surgeon’s performance during experiments.
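The fixed-point search can be illustrated with the following Python sketch. It is not the MATLAB code used in the study: it assumes that each recorded endoscope pose is reduced to a point on the axis and a unit direction, that the six neighboring points are taken at plus and minus one step along each coordinate axis, and that the search stops as soon as a step no longer reduces the cost. All names and the synthetic data are hypothetical.

```python
import math


def sq_dist_to_line(x, point, direction):
    """Squared distance from x to the line (point, unit direction)."""
    v = [xi - pi for xi, pi in zip(x, point)]
    along = sum(vi * di for vi, di in zip(v, direction))
    return sum(vi * vi for vi in v) - along * along


def cost(x, axes):
    """Sum of squared distances from x to every recorded endoscope axis."""
    return sum(sq_dist_to_line(x, p, d) for p, d in axes)


def find_fixed_point(axes, start, step=0.5, max_iter=1000):
    """Fixed-step gradient descent; lengths in mm, step of 0.5 mm."""
    x = list(start)
    best = cost(x, axes)
    for _ in range(max_iter):
        # Gradient estimated from six neighbors (+/- step on each axis).
        grad = []
        for i in range(3):
            plus, minus = x.copy(), x.copy()
            plus[i] += step
            minus[i] -= step
            grad.append((cost(plus, axes) - cost(minus, axes)) / (2 * step))
        norm = math.sqrt(sum(g * g for g in grad))
        if norm == 0.0:
            break
        candidate = [xi - step * gi / norm for xi, gi in zip(x, grad)]
        candidate_cost = cost(candidate, axes)
        if candidate_cost >= best:
            break  # a further 0.5 mm step would not improve the estimate
        x, best = candidate, candidate_cost
    return x, best


# Synthetic check: three orthogonal axes crossing at (10, 20, 30) mm.
pivot = (10.0, 20.0, 30.0)
axes = [
    ((15.0, 20.0, 30.0), (1.0, 0.0, 0.0)),
    ((10.0, 12.0, 30.0), (0.0, 1.0, 0.0)),
    ((10.0, 20.0, 41.0), (0.0, 0.0, 1.0)),
]
estimate, residual = find_fixed_point(axes, start=(0.0, 0.0, 0.0))
```

With real data the axes do not intersect exactly, so the residual cost stays above zero; applying the same search to the poses recorded for each sinus would give one estimate per sinus, in line with the fixed regions reported above.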

Fig. 6 Fixed region analysis: boxes for the 4 sinuses and the 4 boxes enclosed in a bounding box (left); projections of the bounding box in the CT scan planes (right): a coronal view, b axial view, c sagittal view

Robot specification

Once the surgeon’s movement is characterized in terms of motion and force, it is necessary to specify the kinematic architecture of the endoscope positioner [18]. We have shown that for sino-nasal skull base surgery, the endoscope motion should not be reduced to a RCM type of task, since the center of motion position is not constant. However, this variation is limited, and the issue is to determine whether it can be practically negligible or not. This motivated us to experiment with an RCM architecture and a conventional serial one. We rapidly established that a different architecture combining RCM and serial components could be an interesting alternative and should also be considered. We therefore evaluated three types of kinematics on cadaver heads.

Kinematic description

The three architectures have the following properties:

  • The first one is EVOLAP, a prototype built for minimally invasive laparoscopic surgery at Université catholique de Louvain [19]. The main manipulator has two active degrees of freedom (DOF) and generates a RCM. Its particular kinematic structure consists of three orthogonal parallelograms translating the end-effector over the surface of a half-sphere (Fig. 7a, b). A local manipulator (Fig. 7c, d) composed of a gimbal with two passive DOFs is attached to the distal end of an adjustable passive arm attached to the main manipulator. The translation of the laparoscope to produce the in-out motion (zoom of the video images) and the rotation about its axis are not controlled in the version of the robot used for this experiment, although other versions embed these active DOFs in the local manipulator.

  • The second robot is a conventional six-active DOF industrial robot (VIPER s650, Adept Technology, footnote 7) with a controller from Cerebellum Automation (footnote 8). The endoscope is fixed to the robot via a 40 cm curved link intended to free the nostril region (Fig. 7e, f). The VIPER kinematics allows controlling both penetration and orientation of the endoscope.

  • The third kinematics (designated as HYBRID in the sequel) is a mix of the EVOLAP and the VIPER kinematics: The VIPER holds the local manipulator of EVOLAP. We modified the control of the VIPER to constrain the endoscope motion within the volume of a half-sphere centered on the nostril. This way the penetration of the endoscope (the zoom) is remotely actuated, which is not the case with the current version of EVOLAP.

Fig. 7 a EVOLAP [19], b EVOLAP kinematic chain in detail, c Local manipulator mounted at the distal part of EVOLAP and HYBRID solutions, d Schematic detail of the local manipulator, e VIPER s650 (Adept Technology), f VIPER s650 kinematic chain in detail with a terminal curved link holding the endoscope

Surgeon interface

Whatever the kinematics selected to hold the endoscope, the issue is to provide the surgeon with a guiding interface which is as intuitive as possible. One often refers to transparency, ideally meaning that the impedance of the robot should be close to zero such that the surgeon does not “feel” its weight, friction, or inertia.

Several types of interfaces (e.g. voice, head motion, foot pedal, joystick) can be used to control endoscope positioners in a so-called telemanipulation mode: The surgeon acts on a device that in turn acts on the robotic system. Another approach is to implement an automatic control of the robot based on visual servoing: Markers are stuck to the distal part of the instruments and are tracked by the endoscope so that they remain in the center of the image [20]. It is however difficult to assess the safety of such a method as markers may be temporarily hidden by blood, smoke, or another instrument. An interesting alternative is “co-manipulation.” In a co-manipulation mode, the surgeon manually moves the endoscope as in conventional surgery. The difference is that the robot follows the motion of the endoscope and maintains it in position when the surgeon no longer holds it. In our setup with a non-backdrivable robot, this requires a FT sensor and an appropriate force control law. The difficulty lies in properly tuning its parameters so as to obtain enough transparency and avoid impeding the natural motion of the surgeon.

Two modes have been tested:

  • Telemanipulation: A simple PlayStation joystick was mounted on the endoscope handle to allow finger-controlled omnidirectional displacements of the endoscope and real time velocity adjustment;

  • Co-manipulation: As mentioned in Sect. 2, a FT sensor is mounted between the endoscope and a dedicated handle held by the surgeon (Fig. 7d). The measured signal corresponds to the effort applied by the surgeon to the handle while trying to control the endoscope. This signal was then used in a force control law to generate robot trajectories in such a way that the surgeon could feel the endoscope as a free body.
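The force control law used for co-manipulation is not detailed here. One common way to obtain such behavior is a proportional (admittance-type) law, in which the commanded velocity follows the force measured at the handle. The sketch below illustrates that general idea only; the gain, the deadband, and the velocity limit are hypothetical values, not the parameters tuned in the experiments.

```python
import math


def admittance_velocity(force, gain=0.005, deadband=0.5, v_max=0.05):
    """Map the handle force (N) to a commanded velocity (m/s).

    gain:     velocity per newton above the deadband (assumed value).
    deadband: forces below this modulus are ignored, so that the robot
              holds the endoscope still when the handle is released.
    v_max:    saturation of the commanded speed (assumed value).
    """
    modulus = math.sqrt(sum(c * c for c in force))
    if modulus <= deadband:
        return (0.0, 0.0, 0.0)
    speed = min(gain * (modulus - deadband), v_max)
    # Move along the direction in which the surgeon pushes the handle.
    return tuple(speed * c / modulus for c in force)
```

A larger gain makes the endoscope feel lighter but can make the motion harder to control, which reflects the tuning difficulty mentioned above.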

Experimental comparisons

A first short experiment on cadaver heads was performed by four surgeons (three seniors and one resident) to select the interface most suitable for controlling the robots. Subsequently, a second experiment on cadaver heads was performed to determine which kinematics was the most adequate. During this second experiment, the surgeons were asked to navigate inside a head model and touch three different landmarks with the endoscope tip, while trying to maintain a natural and uniform speed. This task was performed four times with each kinematic architecture. A path analysis was then performed to compare the surgeon’s movement when manipulating the endoscope in free hand mode and when co-manipulating the endoscope through the 6-DOF VIPER or the HYBRID kinematics. The first repetition for each kinematics was considered a learning trial and was thus excluded from the analysis.

Results

We established that the best interface for sino-nasal endoscopic robotic surgery is the co-manipulation mode. The reason is that it is the most natural way for the surgeon to move the endoscope in and out of the nostril, a movement that is performed rather frequently (we counted an average of 50 occurrences during the ethmoidectomy). Moreover, it allows him/her to leave the endoscope in a stable position, freeing one hand when necessary. Figure 8 shows the surgeon controlling EVOLAP and HYBRID in this mode.

Fig. 8 Control of EVOLAP (a) and HYBRID (b) in co-manipulation mode

Regarding the kinematics, each solution has its own advantages and drawbacks, as summarized in Table 2. One important concern is safety. As the co-manipulation system does not provide haptic feedback, the surgeon cannot feel the interaction forces exerted by the endoscope on the head. This can lead to a situation where the efforts on the endoscope are too high and harmful to the patient. This situation is less likely to occur with the EVOLAP and HYBRID architectures, where the passive joints of the gimbal introduce compliance in the orientation of the endoscope. A second FT sensor has been mounted on the VIPER wrist flange to measure the external forces and torques applied, and to shut down the system when a safety threshold is exceeded.

Table 2 Comparison of the three kinematic architectures

From a usability point of view, as stated by Jarassé et al. [21], the system should be as transparent as possible in order to have the smallest impact on endoscope path and task duration. Figure 9 shows two projections of spatial trajectories, where the highlighted points correspond to the landmark positions that were expected to be touched by the surgeon. Average task duration and path length are summarized in Table 3. As can be seen, the paths obtained with the VIPER exhibit large distortions with respect to the free hand ones. On the other hand, the HYBRID co-manipulation system allows motions with smaller distortions. Despite the fact that the surgeon was urged to maintain a uniform speed, the task duration varies considerably according to the architecture used, as the systems are not perfectly transparent. Results of this preliminary experiment tend to show that the 6-DOF VIPER configuration is the one that most affects the surgeon performance. Conversely, an optimized version of the HYBRID configuration is likely to have a small enough impact on the endoscope handling while offering the possibility to perform surgical gestures with both hands.
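The two indicators used in this comparison, path length and task duration, can be computed from the tracked tip positions along the lines of the sketch below. It assumes that each trajectory is a list of timestamped 3D positions; the data layout, the function names, and the sample values are illustrative and are not the recorded data.

```python
import math


def path_length(points):
    """Sum of the distances between consecutive 3D positions (same unit)."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def task_duration(timestamps):
    """Elapsed time between the first and last samples."""
    return timestamps[-1] - timestamps[0]


def relative_increase(value, reference):
    """Relative change of a robot-assisted value versus the free hand one."""
    return (value - reference) / reference


# Hypothetical trajectory: positions in mm, timestamps in seconds.
positions = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
times = [0.0, 0.5, 1.5]

length_mm = path_length(positions)
duration_s = task_duration(times)
```

Comparing each architecture with the free hand trial through a relative increase is one simple way to express how much the system distorts the path or slows the task.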

Fig. 9 Coronal and sagittal views of the trajectories (Free hand: red curves; VIPER: green curves; HYBRID: blue curves)

Table 3 Comparative results: time and trajectory length

Conclusion

The objective of this work was to specify a compact, ergonomic, and safe endoscope positioner dedicated to sino-nasal tract and anterior and middle-stage skull base surgery. Analysis of the surgeon’s movements on sino-nasal tracts of cadavers allowed us to quantify motion and force ranges for design purpose. We have shown experimentally that a standard 6-DOF robot, to which the endoscope is attached via a gimbal-like passive remote manipulator, may offer a sufficient safety level and ensures a fair quality of motion. However, we can also state that a probably cheaper alternative could make use of a dedicated 4-DOF robot with 3 translations to position the local manipulator, the 4th DOF (locally actuated or remotely actuated through cable transmission) being used to control the endoscope rotation. Finally, we have shown that the co-manipulation mode was the most user-friendly interface for the surgeon. Future work needs to be performed to develop a prototype embedding and improving these features for further in vivo tests.