1 Introduction

Image-guided surgical navigation, together with remotely controlled robotics, is one of the main technologies supporting minimally invasive approaches in surgery. Compared to traditional open surgery, minimally invasive surgery aims at reducing incisions on the patient’s body and tissue retraction. It brings well-known advantages, such as a faster recovery of bowel functions and a reduced risk of wound-related complications. Further consequences are a shorter hospitalization and a faster recovery of the patient, who can quickly return to his/her normal activities. However, minimally invasive approaches also introduce new difficulties for surgeons. The indirect and restricted access to the operation area in the patient’s body limits both the vision of the organs and the mobility in handling surgical tools. The 2D images obtained by means of a camera inserted in the body, like the CT slices traditionally consulted by surgeons, do not allow estimating the depth of anatomical structures without moving the camera. Furthermore, surgeons have to cope with hand-eye coordination difficulties when synchronizing the instrument movements shown on a screen with the movements of their hands.

The low quality and the reduced field of view of intra-operative images often make the reconstruction of the underlying surgical scene difficult. Image guidance systems have been introduced precisely to support this task, usually by mapping preoperative data into the intraoperative environment [1].

In minimally invasive surgery, the use of images registered to the patient provides great support both for the planning phase and for guidance during the intervention. An efficient 3D reconstruction of the patient’s anatomy from medical images (MRI or CT) improves the standard slice view by visualizing 3D models of the organs. Important innovations have been introduced in image segmentation for the automatic [2] and interactive [3] extraction of 3D models from CT or MRI data.

Nowadays, several software packages are used in medicine for the visualization and analysis of scientific images and for the 3D modelling of human organs: Mimics [4], 3D Slicer [5], ParaView [6, 7], OsiriX [8], and ITK-SNAP [9] play an important role among these tools.

Image-guided surgery benefits from important technological innovations dealing with the identification of therapeutic targets in images, image registration with respect to the patient’s body, instrument tracking with respect to the patient’s body and the registered images, the detection of any discrepancy between the images and the patient’s body, the evaluation of the intervention accuracy, and the design of intuitive and usable interfaces that provide the surgeon with useful contextual information [10]. A large-scale survey of the state of the art and the challenges of computer-aided and image-guided interventions can be found in [10].

In surgery, Augmented Reality (AR) technology provides a sort of “X-ray vision” of the patient’s anatomy and pathology. It can augment the surgeon’s view with additional information consisting of virtual organs reconstructed from medical images of the area involved in the surgical operation. In particular, it can be useful to visualize tumors or vessels that surgeons could otherwise detect only by touch.

Compared to other image-guided techniques, AR-based laparoscopy reduces the surgeon’s cognitive load by sparing him/her from mentally associating information from different sources with objects in the real scene [11]. It can compensate for the limited field of view of laparoscopic interventions by enhancing the surgeon’s spatial awareness. In this way, AR technology can improve the surgeon’s efficiency and allow faster intraoperative decision making. Moreover, another important advantage is the possibility to integrate data retrieved from the quantitative analysis of preoperative images that would remain hidden in standard images. Besides preoperative, intraoperative, and endoscopic images, typical data used for scene augmentation are real-time interventional measurements, concerning the patient’s vitals, the position or even the force feedback of a surgical tool, and planning data, namely surgeons’ annotations such as labels or cutting lines [11]. Therefore, AR can support the surgeon in resections by showing cutting trajectories planned in the preoperative phase.

Systems for context-aware AR visualization [12] are also able to filter available information and show only data which are relevant to a specific recognized phase in the operating room.

A modern medical augmented reality system should exhibit the following features [13]:

  • Reliability to provide real-time accuracy control in any situation;

  • Usability for an effective interaction with the surgeon;

  • Interoperability for a good compatibility with other instruments.

The system accuracy mainly depends on the registration phase [14], which should ensure a good correspondence between virtual and real organs: even a small error in this phase may have serious consequences for the patient’s health. The aim of this procedure is to keep the corresponding anatomical structures in the preinterventional and intrainterventional datasets, which are defined in two different coordinate systems, aligned before and during an intervention [15]. In our application, registration accuracy is a key aspect for the correct working of the dynamic adaptive views. Moreover, the registration accuracy must be maintained in real-time during the intraoperative phase. An ideal AR environment supporting the intraoperative phase should perform real-time updates of the virtual objects and of the information visualized on the real objects [16]: to this end, it should reduce acquisition, tracking, and registration delays as much as possible. Maintz and Viergever [17] proposed a classification of image registration methods based on image modality, image dimensionality, registration basis, geometric transformation, user interaction, optimization procedure, and subject and object of registration.

The choice of a good tracking device, which allows detecting the position of surgical instruments inside the patient’s body, is a key aspect for the system accuracy. Tracking technology is commonly used in modern operating rooms and provides important help in enhancing performance during real surgical procedures. Tracking systems can be based on six different principles: time of flight (TOF), spatial scan, inertial sensing, mechanical linkages, phase-difference sensing, and direct-field sensing. Rolland et al. [18] report the advantages and limitations of adopting these different technologies. Koivukangas et al. [19] make a more detailed comparison in terms of technical accuracy, focusing only on optical and electromagnetic tracking systems: they conclude that the former category has a slightly better accuracy than the latter. The benefits and the limitations of the adoption of electromagnetic tracking in medicine are extensively discussed in [20].

Optical trackers can detect natural landmarks, which are scene features such as edges and corners that can be manually [21] or automatically [22] selected, or artificial landmarks, also called fiducial markers, which are specifically designed to be easily detected. The latter are a more robust and reliable solution, especially when surgical scenes lack distinctive features or in the presence of local deformations or illumination changes.

In general, the drawback of an optical system is that it requires a direct line-of-sight between its infrared optical camera and the fiducials on the surgical instruments. On the other hand, any conductive or ferromagnetic material inside the tracker’s field of view or near the field generator can interfere with electromagnetic tracking [10].

The work presented in this paper consists of two parts.

In the former part, we present a visualization and navigation system that supports surgeons not only in diagnosis and preoperative planning, but also in the subsequent intraoperative phase, turning the procedure into image-guided surgery. This system can present both the traditional patient data in the form of CT images and the 3D anatomical models derived from them. The system automatically reslices the orthogonal planes in order to provide the surgeon with an accurate visualization near the actual position of the instrument.

In the latter part, we present an augmented reality system which superimposes virtual organs over real-time images of the patient’s body according to the surgeon’s point of view and the medical instrument location. The system also enriches the visualization of virtual organs with depth and distance information and provides several other visual and audio cues to help the surgeon in improving the intervention accuracy.

1.1 Related work

Early works [23, 24] introduced computer-assisted approaches by presenting the results of stereotaxic neurosurgical interventions.

However, the first guidance system for surgeons based on augmented reality was developed to support the planning phase of a craniotomy [25]. That work already highlighted one of the most important challenges, namely tracking the organs’ position in real-time without attaching any device to the patient’s body.

Image-guided and AR-based interventions concerned almost exclusively neurosurgery for many years, because the rigid, stable frame provided by the skull makes the registration procedures easier. Nowadays, neurosurgery is still the most popular application field thanks to the presence of rigid structures that allow an easier registration between the virtual objects and the real scene. A recent work in this field is described in [26]. Previously, De Paolis et al. [27] presented a realistic virtual model of the human brain that could be used in a neurosurgical simulation for both educational and preoperative planning purposes. In order to obtain a realistic and useful simulation, the authors focused their research on the physical modelling of the brain as a deformable body and on the interactions with surgical instruments.

However, the application of augmented reality has also been proposed for several other intervention fields [1], including maxillo-facial surgery [28, 29], orthopedics [30], cardiac interventions, Holmium Laser Enucleation of the Prostate [31], and spinal [32], abdominal [33], and thoracoabdominal surgery.

Most of these interventions deal with removing tumors [34] through the radio-frequency thermal ablation (RFA) technique [35,36,37,38]. In this scenario, a guidance system should guarantee an accuracy better than 5 mm in centering the tumor with the needle tip, preferably with an insertion lasting less than 10 min, to reduce the possibility of destroying healthy cells [36]. To overcome the well-known hand-eye coordination problems during needle insertion, Wen et al. [39] proposed combining a projector-based augmented reality interface with a surgical robot that surgeons can guide through hand gestures.

Unfortunately, with the exception of neurosurgery, orthopedic surgery, and prostate therapy, some issues still slow down the actual clinical adoption and commercialization of image guidance systems [10]. Besides technical challenges, there are also concerns about the costs and the usability of these technologies. Indeed, the use of navigation systems is not always intuitive and sometimes requires specific training procedures. Some of them must be performed in dedicated high-end facilities such as the IRCAD Center [1], even though the fundamentals of navigation can also be taught through low-cost simulations [40].

A complete survey about the status of augmented reality in laparoscopic surgery can be found in [11]. A previous survey work [41] dealt with augmented reality in minimally invasive surgery.

Teistler et al. [42] described a system for presenting multiplanar reconstruction of CT images to medical laymen in a forensic investigation context. The user can interact with its interface through a cheap 3D game controller.

The vessel navigator presented in [43] exploits a topological map retrieved from the abdomen computed tomography to guide the surgeon along an optimal path during laparoscopic gastrectomy. The real-time vascular deformation system presented in [44] exploits a skeleton representation of the virtual vessel to improve the usual static virtual environment of endovascular navigation systems.

Chen et al. [45] proposed the use of see-through displays to show the virtual anatomical structures. For the special case of neuronavigation systems, Watanabe et al. [26] proposed the use of a tablet positioned on the patient’s head to visualize the augmented scene; in this way, the system spares surgeons from alternating their gaze between the patient’s body and the monitor.

Ricciardi et al. [28] presented a virtual reality medical viewer that allows an inspection of CT and MRI images superimposed on the 3D models of the organs. The system has a high scalability, since it can exploit a desktop, a head-mounted display, or a CAVE-like system. An augmented reality application based on ARToolKit [46], presented in a previous work [47], allows a selective visualization of only some organs of the abdominal area using specific markers.

Aloisio et al. [48] presented a computer-based simulator of a coronary angioplasty intervention, equipped with an appropriate tactile feedback device. Such a simulator can facilitate education and training, reducing their costs while providing realism with regard to tissue behavior and real-time interaction.

Some works introduced touchless interaction [39, 49,50,51] based on gesture recognition [52, 53]. For instance, De Paolis et al. [51] presented a system able to interpret the user’s movements in real-time for the navigation and manipulation of 3D models of the human body during surgical preoperative planning. The platform avoids any contact with the computer, so that the surgeon can interact with the models of the patient’s organs by moving a finger in free space.

Hisense CAS (Hisense Computer-Assisted Surgery System) is a commercial simulation platform for the 3D visualization of pediatric liver structure. It uses the DICOM data from CT and MRI images and exploits a gesture-controlled display. Zhang et al. [54] assess the use of Hisense CAS for preoperative planning and intraoperative support to surgical procedures on pediatric patients affected by hepatoblastoma.

The AR system for liver thermal ablation described in [36] uses two jointly calibrated cameras to track the needle inserted by the surgeon and to register the body model in the camera frame by exploiting 15 opaque markers placed on the patient’s abdomen. It also takes into account the organ deformations caused by normal respiratory activity thanks to guiding information retrieved during each expiratory phase. The experimental tests showed that such a system allows reaching a target with an accuracy of 2 mm during an insertion of less than 1 min. One of the main differences between such a system and the one presented in this paper concerns the adopted tracking modality and technology. While the system presented in [36] exploits opaque markers detected by two generic cameras, our system relies on reflective spheres detected by dedicated tracking devices (the Polaris Vicra [55] and the Vicon Bonita [56]). Compared to such a system, our work has slightly different goals and, consequently, different strengths. While the system developed by Nicolau [36] focuses on exploiting breathing data to improve registration accuracy, our work mainly addresses information presentation issues: it proposes visual and audio cues to represent important data about depth, distances, instrument orientation, and possible collisions with other organs.

Not even the HMD-based surgical navigator described in [45], which focuses on the pelvic area of the human body, provides such guiding features and perception cues.

The oldest form of augmented visualization employed in surgical theaters uses static video displays. Nowadays, it is still the most used, thanks to comfortable full-HD and 4K displays, which also include 3D vision features.

2 Methods

2.1 Workflow of image-guided surgery

The UML (Unified Modeling Language) activity diagram in Fig. 1 shows all the phases characterizing the image-guided surgery scenario for which the system has been designed. The main phases, such as the building of the 3D models of the organs, the image registration with respect to the patient’s body, and the tracking of the surgical tools, are described in detail in the following subsections.

Fig. 1 UML activity diagram of image-guided surgery workflow

2.2 3D models of the patient’s organs

We built the 3D models from medical images by means of segmentation and classification algorithms: we associated different colors with the organs to replace the gray levels of the medical images.

We used 3D Slicer [5], a multiplatform open-source software package for visualization and image analysis: it provides functionalities for segmentation, registration, and three-dimensional visualization of multimodal image data.
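
The segmentation itself was performed interactively in 3D Slicer, so the following is only an illustrative sketch of the kind of pipeline such tools implement: threshold-based selection of a CT volume followed by iso-surface extraction, written with VTK (one of the libraries used in this work). The DICOM directory, the Hounsfield window, and the output file name are placeholder assumptions, not the values used for the case study.

```cpp
#include <vtkSmartPointer.h>
#include <vtkDICOMImageReader.h>
#include <vtkImageThreshold.h>
#include <vtkMarchingCubes.h>
#include <vtkSmoothPolyDataFilter.h>
#include <vtkSTLWriter.h>

int main()
{
  // Read a CT series from a DICOM directory (path is a placeholder).
  auto reader = vtkSmartPointer<vtkDICOMImageReader>::New();
  reader->SetDirectoryName("/data/ct_series");

  // Keep only voxels inside a Hounsfield window roughly isolating the structure.
  auto threshold = vtkSmartPointer<vtkImageThreshold>::New();
  threshold->SetInputConnection(reader->GetOutputPort());
  threshold->ThresholdBetween(40, 160);   // example soft-tissue window
  threshold->SetInValue(255);
  threshold->SetOutValue(0);

  // Extract an iso-surface from the binarized volume.
  auto mc = vtkSmartPointer<vtkMarchingCubes>::New();
  mc->SetInputConnection(threshold->GetOutputPort());
  mc->SetValue(0, 127.5);

  // Light smoothing before export; further mesh editing was done in MeshLab.
  auto smooth = vtkSmartPointer<vtkSmoothPolyDataFilter>::New();
  smooth->SetInputConnection(mc->GetOutputPort());
  smooth->SetNumberOfIterations(30);

  auto writer = vtkSmartPointer<vtkSTLWriter>::New();
  writer->SetInputConnection(smooth->GetOutputPort());
  writer->SetFileName("organ.stl");
  writer->Write();
  return 0;
}
```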

Figure 2 shows the image processing steps for a case study of a 2-year-old child with a benign tumor of the right kidney; some organs (the liver and stomach) are completely hidden, and the skin and muscles of the abdominal region are displayed in total transparency; the tumor is shown in magenta. The mesh editing was carried out using the open-source MeshLab software application [57]. A radiologist validated the 3D models of the organs.

Fig. 2 An example of medical image processing

2.3 Tracking systems

In an early experimental phase, we chose the Polaris Vicra optical tracker from NDI Inc. [55] to detect, with negligible delay, the position and orientation of the surgical tool used by the surgeon within a defined coordinate system; the system was also used to permit the overlapping of the virtual organs on the real ones in the augmented visualization of the scene. The tracker uses two infrared cameras and a position sensor to detect infrared-emitting or retroreflective markers affixed to a tool or object; using the information received from the markers, the sensor (273 mm × 69 mm × 69 mm) determines the position and orientation of the tools within a specific measurement volume [58] (Fig. 3), with an accuracy of 0.2 mm in position and about a tenth of a degree in orientation. Data are updated at a maximum rate of 20 Hz.

Fig. 3 Polaris Vicra’s measurement volume

In a later experimental phase, we adopted the Vicon Bonita [56], another tracking system, consisting of four infrared cameras working at a frame rate of 240 fps. The technical specifications of the Bonita cameras [56] describe their measurement volume in terms of a wide angle of view (82.7° × 66.85°) and a narrow angle of view (32.7° × 24.81°). In any case, the measurement volume is no less than 4 m × 4 m × 1.5 m.

2.4 Visualization devices

Unlike other HMD-based systems [45], our system was designed for full-HD and 4K displays, which allow the surgeon to share the same vision and thus to cooperate during the preoperative and intraoperative phases.

2.5 The surgical navigator

2.5.1 The user interface

The developed application is supplied with a specific user interface that allows the user to take advantage of the features offered by the software. The application is organized into four sections that support the different steps of the surgical procedure: the CT slices in the axial, coronal, and sagittal planes are displayed within three different windows, and a fourth window shows the 3D model of the organs built from them. A slider bar in each window allows sliding through the different views of the medical image set.

In the 3D model window, the surgeon can add or remove organs to focus only on some regions of interest. Furthermore, some rendered organs (such as the muscles and skin) can be made transparent to allow the surgeon to view other organs behind them.

The outlines of the organs shown in the 3D model window can be superimposed on the slices in the windows of the three visualization planes. This allows the surgeon to assess the accuracy of the segmentation procedure. Moreover, the user can add the visualization of each of the three planes on the 3D model.

Figure 4 shows the user interface that visualizes the 3D model of the organs and the slices in the axial, coronal, and sagittal planes.

Fig. 4 The navigator interface

2.5.2 The registration phase

The registration procedure implemented in our application exploits point-based high-precision measurements provided by an optical tracker: it relies on a closed-form solution of a least-squares problem that uses unit quaternions to represent rotation [59]. The applied method is based on the choice of three fiducial points on the patient’s body: we mark them with semi-indelible ink on the patient’s skin before the CT scanning in order to retrieve the same points in the 3D model after the elaboration of the medical images. Then, in the intraoperative phase, the visualization software allows detecting the correspondence between the points on the patient’s skin and those appearing in the 3D model. Figure 5 shows the three fiducial points applied on the patient’s body, which are visualized in the reconstructed medical images and detected by means of the optical tracker.
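
As an illustration of the closed-form unit-quaternion solution [59] underlying this step, the sketch below estimates the rigid transform that maps the fiducials acquired with the tracked probe onto the corresponding points of the 3D scene built from CT. It is a minimal Eigen-based example, not the code used in the application, and it assumes the two point lists are given in corresponding order.

```cpp
#include <Eigen/Dense>
#include <vector>

// Closed-form absolute orientation with unit quaternions (Horn's method):
// finds R, t such that scene_i ~= R * probe_i + t for paired fiducials,
// i.e. the kind of rigid transform that plays the role of M_reg.
Eigen::Matrix4d RegisterFiducials(const std::vector<Eigen::Vector3d>& probe,   // acquired with the tracked probe
                                  const std::vector<Eigen::Vector3d>& scene)   // same fiducials in the 3D scene
{
  const size_t n = probe.size();
  Eigen::Vector3d cP = Eigen::Vector3d::Zero(), cS = Eigen::Vector3d::Zero();
  for (size_t i = 0; i < n; ++i) { cP += probe[i]; cS += scene[i]; }
  cP /= double(n); cS /= double(n);

  // Cross-covariance of the centered point sets.
  Eigen::Matrix3d S = Eigen::Matrix3d::Zero();
  for (size_t i = 0; i < n; ++i)
    S += (probe[i] - cP) * (scene[i] - cS).transpose();

  // Horn's symmetric 4x4 matrix; its dominant eigenvector is the rotation quaternion.
  Eigen::Matrix4d N;
  N << S(0,0)+S(1,1)+S(2,2), S(1,2)-S(2,1),        S(2,0)-S(0,2),        S(0,1)-S(1,0),
       S(1,2)-S(2,1),        S(0,0)-S(1,1)-S(2,2), S(0,1)+S(1,0),        S(2,0)+S(0,2),
       S(2,0)-S(0,2),        S(0,1)+S(1,0),       -S(0,0)+S(1,1)-S(2,2), S(1,2)+S(2,1),
       S(0,1)-S(1,0),        S(2,0)+S(0,2),        S(1,2)+S(2,1),       -S(0,0)-S(1,1)+S(2,2);

  Eigen::SelfAdjointEigenSolver<Eigen::Matrix4d> eig(N);
  const Eigen::Vector4d q = eig.eigenvectors().col(3);   // eigenvalues are sorted ascending
  const Eigen::Matrix3d R =
      Eigen::Quaterniond(q(0), q(1), q(2), q(3)).normalized().toRotationMatrix();
  const Eigen::Vector3d t = cS - R * cP;

  Eigen::Matrix4d T = Eigen::Matrix4d::Identity();
  T.block<3,3>(0,0) = R;
  T.block<3,1>(0,3) = t;
  return T;
}
```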

Fig. 5 Fiducial points for the registration phase

Unfortunately, any movement of the patient’s body could alter the correspondence between the real and virtual organs achieved through this preliminary registration phase. For this reason, we place on the patient’s body a reference tool provided with reflective spheres that are detected by the optical tracker, in order to maintain the registration during the surgical procedure and preserve the correct overlapping between the virtual and real organs. Moreover, the surgical instrument is equipped with reflective spheres arranged according to a specific geometry, so that its position and orientation can be detected by the optical tracker; we assume that the sensitive part of this instrument is the tip.

The entire system consists of four different reference systems:

  • The reference system of the optical tracker;

  • The reference system of the camera;

  • The reference system associated with the tool located on the camera;

  • The global reference system that identifies the position of the virtual object in the real scene.

Figure 6 shows the transformation chain that takes into account the different reference systems of each device used in the application. For an accurate overlapping of the 3D medical data onto video images of the real scene, it is necessary to calculate the relations between these coordinate systems.

Fig. 6 Transformations among the reference systems

The Tcomp transformation, shown in Eq. 1, is used to find the pose of the surgical instrument with respect to the reference rigid body.

$$ T_{comp}=T_{ref}^{-1}T_{raw}M_{cal} $$
(1)

This transformation is calculated by means of Traw (the transformation that specifies the surgical tool pose with respect to the tracker coordinate system), \(T_{ref}^{-1}\) (the relation between the reference rigid body coordinate system and the tracker coordinate system), and Mcal (the relation between the surgical tool pose and the position of its tip).

The T transformation, shown in Eq. 2, describes the position of the probe tip inside the 3D virtual scene.

$$ T=M_{reg}T_{ref}^{-1}T_{raw}M_{cal} $$
(2)

In Eq. 2, the Mreg transformation is the result of the registration phase and is used to define the reference rigid body position in the global 3D scene reference system.
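
In practice, Eqs. 1 and 2 are straightforward compositions of homogeneous 4 × 4 matrices. The sketch below mirrors the paper’s symbols and is only an illustration, assuming every transform is already expressed as a 4 × 4 homogeneous matrix.

```cpp
#include <Eigen/Dense>

// Eq. 1: surgical tool pose relative to the reference rigid body.
Eigen::Matrix4d ComposeTcomp(const Eigen::Matrix4d& T_ref,
                             const Eigen::Matrix4d& T_raw,
                             const Eigen::Matrix4d& M_cal)
{
  return T_ref.inverse() * T_raw * M_cal;
}

// Eq. 2: probe-tip pose in the global 3D scene, adding the registration result.
Eigen::Matrix4d ComposeT(const Eigen::Matrix4d& M_reg,
                         const Eigen::Matrix4d& T_ref,
                         const Eigen::Matrix4d& T_raw,
                         const Eigen::Matrix4d& M_cal)
{
  return M_reg * T_ref.inverse() * T_raw * M_cal;
}
```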

2.5.3 Reslicing and clipping visualization

The application provides automatic reslicing of the orthogonal planes, associating the tip of the surgical instrument with the intersection point of the coronal, sagittal, and axial planes.

The view of the medical images and of the virtual organs in the application depends on the actual position of the surgical instrument on the patient’s body.

This situation is shown in Fig. 7. The probe we chose as the surgical tool has reflective spheres arranged according to a well-defined geometry, and we consider its tip as the sensitive part. Therefore, the application dynamically changes the views in real-time according to the position of the tip of the surgical instrument, which is detected by the optical tracker. It provides the surgeon with an accurate visualization of the 3D model and of the CT slices near the surgical tool position by performing an automatic reslicing of the orthogonal planes. To this aim, the surgical tool tip is associated with the intersection point of the coronal, sagittal, and axial planes.
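
A minimal sketch of the reslicing logic is shown below: the tip position, expressed in the CT volume’s physical coordinates through the transform T of Eq. 2, is converted into the three slice indices used for the axial, coronal, and sagittal views. It assumes an axis-aligned volume described only by its origin and spacing; the actual application performs this step through the visualization toolkit.

```cpp
#include <array>
#include <cmath>

// Map the tracked tip position (in the CT volume's physical coordinates) to
// the three slice indices used to reslice the orthogonal views. Origin and
// spacing come from the DICOM header; an axis-aligned volume is assumed.
std::array<int, 3> TipToSliceIndices(const std::array<double, 3>& tipWorld,
                                     const std::array<double, 3>& origin,
                                     const std::array<double, 3>& spacing)
{
  std::array<int, 3> ijk{};
  for (int a = 0; a < 3; ++a)
    ijk[a] = static_cast<int>(std::lround((tipWorld[a] - origin[a]) / spacing[a]));
  return ijk;   // ijk[0] -> sagittal slice, ijk[1] -> coronal, ijk[2] -> axial
}
```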

Fig. 7 Automatic reslicing

In this way, the surgeon, during a minimally invasive surgical procedure, can have an accurate visualization of the 3D model and of the CT slices exactly next to the actual position of the surgical instrument.

In order to obtain a clearer visualization of the area of interest, it is possible to activate the clipping modality, which shows a section of the 3D model along a visualization plane pointed to by the surgical tool.

Figure 8 shows the clipping visualization modality; in this case, the cuts are applied along the sagittal and coronal planes, and the axial plane is not visualized. The clipping is dynamic, like the reslicing.
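
The following VTK sketch illustrates one way to obtain such a dynamic cut: a clipping plane is anchored at the tracked tip and re-applied at every tracker update. It is an illustration only; the plane normal passed in would be set to the sagittal or coronal direction to reproduce the cuts of Fig. 8.

```cpp
#include <vtkSmartPointer.h>
#include <vtkPlane.h>
#include <vtkClipPolyData.h>
#include <vtkPolyData.h>

// Clip the organ mesh with a plane anchored at the tracked tool tip; calling
// this at every tracker update makes the cut follow the instrument.
vtkSmartPointer<vtkPolyData> ClipAtTool(vtkPolyData* organ,
                                        const double tip[3],
                                        const double planeNormal[3])
{
  auto plane = vtkSmartPointer<vtkPlane>::New();
  plane->SetOrigin(tip[0], tip[1], tip[2]);
  plane->SetNormal(planeNormal[0], planeNormal[1], planeNormal[2]);

  auto clipper = vtkSmartPointer<vtkClipPolyData>::New();
  clipper->SetInputData(organ);
  clipper->SetClipFunction(plane);
  clipper->Update();
  return clipper->GetOutput();   // the half of the mesh kept by the plane
}
```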

Fig. 8 Clipping visualization modality

2.6 Augmented visualization in the patient’s body

The overlapping of the virtual organs on the patient’s body is based on the same rigid transformations reported in the previous section. We perform a registration phase relying on fiducial points placed on the patient’s body.

The simple augmentation of the real scene cannot provide information on the depth of the scene: even though the 3D models of the organs are accurately visualized on the patient’s body, they appear to be positioned on its surface, without giving the perception of depth, which is a crucial factor for scene realism. This situation is shown in Fig. 9, where the virtual organs are overlapped on a mannequin. Human depth perception relies on 16 different cues, characterized by different persuasive power, precision, and interaction mode [60, 61]. Moreover, in the special case of AR applications designed for surgical support, this aspect is very important to give a complete and correct idea of the patient’s anatomy.

Fig. 9 Augmented scene without occlusion

2.6.1 Scene occlusion and depth perception

To improve depth perception, we adopted a “dark matter” method [62], which provides a partial view of the 3D model of the patient’s organs through a sort of “window” visualized on the real skin as a gateway to the internal organs. This window limits the surgeon’s view to a specific region of interest and hides all the organs in the other parts of the patient’s body. Only through this window can the internal virtual organs be seen, giving the realistic impression that the virtual organs are really inside the abdominal area.

Unlike the models of the virtual organs, the 3D model of the external surface that covers the rest of the body is rendered only in the z-buffer (depth buffer), not in the color buffer. The 3D model of the external surface of the mannequin, built in order to occlude part of the organs’ model, is shown in Fig. 10.
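
In OpenGL terms, this corresponds to a two-pass rendering in which color writes are disabled while drawing the occluder surface. The sketch below only illustrates the idea; drawSurfaceWithWindow() and drawOrgans() are hypothetical placeholders for the application’s own drawing calls.

```cpp
#include <GL/gl.h>

// Hypothetical drawing routines provided elsewhere in the application.
void drawSurfaceWithWindow();
void drawOrgans();

// Two-pass rendering for the "window" effect: the body-surface mesh is drawn
// into the depth buffer only (no color writes), so it occludes the organs
// everywhere except where the virtual window cuts a hole in it.
void RenderAugmentedScene()
{
  glEnable(GL_DEPTH_TEST);

  // Pass 1: occluder surface, depth only.
  glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
  drawSurfaceWithWindow();   // external skin surface minus the window region

  // Pass 2: organs with normal color writes; fragments behind the surface
  // fail the depth test and are discarded.
  glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
  drawOrgans();
}
```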

Fig. 10 External surface used for the occlusion

In addition, it is possible to slide the visualization window onto the surface of the mannequin and locate it in a precise position that provides a view of the organs of interest.

This designed sliding window, which permits the visualization of a part of the organs, is shown in Fig. 11.

Fig. 11 Use of a virtual window to provide depth perception

2.6.2 Distance information

Occlusion techniques only improve the perception of organs’ position in the scene, but they provide no information about distances. For this reason, we completed our system interface by providing the surgeon with information about the distance between the surgical instrument and the organ of interest (Fig. 12).

Fig. 12 Distance information inside an informative box

The distance between the surgical tool and the organ of interest is reported inside an informative box on the screen. We also implemented a bar that grows as this distance decreases and changes its color from green to red when the distance falls below an alert threshold. When the surgical tool comes into contact with the organ, the virtual probe representing it in the augmented scene becomes red. Furthermore, during the surgical operations, a white line in the augmented scene shows the possible direction of the real instrument according to its orientation in the real world. At the same time, a red line shows the minimal-distance direction between the tip of the surgical tool and the organ of interest.

Besides the described video feedback, the application gives audio alerts to notify the proximity to the organ of interest: it emits audio impulses whose frequency is inversely proportional to the distance between the surgical tool and the organ. We tried to reproduce the acoustic warnings triggered by the park-assist sensors designed for cars; indeed, this scenario inspired the surgeons who suggested the audio feedback idea during preliminary talks. We implemented a 70-dB acoustic warning, emitted by the PC speakers, with an impulse rate of 1–7 Hz.
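
The mapping from the measured distance to these visual and audio cues can be summarized by a small function such as the one sketched below. The thresholds are illustrative assumptions, not the values tuned in the application.

```cpp
#include <algorithm>
#include <string>

struct ProximityCues {
  double barFill;        // 0..1, grows as the tool approaches the target organ
  std::string barColor;  // "green" above the alert threshold, "red" below it
  double beepPeriodSec;  // interval between audio impulses (shorter when closer)
};

// Map the tip-to-organ distance (in mm) to the visual and audio cues described
// above. Threshold values here are only placeholders.
ProximityCues CuesFromDistance(double distanceMm)
{
  const double farMm = 100.0;   // distance at which the bar starts growing
  const double alertMm = 20.0;  // alert threshold for the green-to-red switch

  ProximityCues cues;
  cues.barFill = std::clamp(1.0 - distanceMm / farMm, 0.0, 1.0);
  cues.barColor = (distanceMm > alertMm) ? "green" : "red";

  // Impulse rate between 1 and 7 Hz, inversely proportional to the distance.
  const double rateHz = std::clamp(7.0 * alertMm / std::max(distanceMm, 1.0), 1.0, 7.0);
  cues.beepPeriodSec = 1.0 / rateHz;
  return cues;
}
```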

The application also notifies the proximity to other organs close to the organ of interest by making them opaque, to help the surgeon avoid undesired collisions.

Another combined audio–video feedback was presented in [63], where spheres centered on the tip of the surgical instrument were proposed as a visual representation of distances: we implemented other visual cues because the rendering of these spheres could partially blur the visualization of the patient organs.

2.6.3 Trajectory guidance for tool insertion

We developed some additional features to enhance the precision and accuracy of the surgical intervention.

The application highlights the right trajectory the surgeon should follow to reach the specific point of the organ. This path is drawn as a blue line segment starting from the tip of the surgical tool (Fig. 13).

Fig. 13 Blue line showing the right trajectory to the target

Moreover, the application should help the surgeon find the correct tool position to reach the target point accurately and effectively. To address this issue, the application visualizes an extension of the tool; in the tests, we used an ablator for the radio-frequency treatment of tumors.

When the needle is aligned with the trajectory segment, both the needle extension and the trajectory line turn to a green color (Fig. 14).

Fig. 14 Green line showing the perfect alignment of the instrument with the trajectory to the target

Another issue concerns avoiding collisions with blood vessels inside the organ [43]. The application checks in real time whether the mesh model of other organs (for instance, blood vessels) is intercepted by the needle trajectory: in this case, the color of the trajectory changes from blue to red in order to advise the surgeon that he has to change the path to reach the specific organ (Fig. 15).
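
The sketch below illustrates the guidance logic just described: the planned path is tested against the triangles of the other organs’ meshes and against the needle orientation, and the corresponding color (blue, green, or red) is returned. The alignment tolerance and the flat triangle list are illustrative assumptions; the application performs the mesh queries through its own collision-detection code.

```cpp
#include <Eigen/Dense>
#include <array>
#include <cmath>
#include <vector>

enum class TrajectoryColor { Blue, Green, Red };

// Moller-Trumbore segment/triangle test, used here to detect whether the
// planned path crosses another organ's mesh (for instance, a vessel).
bool SegmentHitsTriangle(const Eigen::Vector3d& a, const Eigen::Vector3d& b,
                         const Eigen::Vector3d& v0, const Eigen::Vector3d& v1,
                         const Eigen::Vector3d& v2)
{
  const Eigen::Vector3d dir = b - a, e1 = v1 - v0, e2 = v2 - v0;
  const Eigen::Vector3d p = dir.cross(e2);
  const double det = e1.dot(p);
  if (std::abs(det) < 1e-12) return false;      // segment parallel to the triangle
  const Eigen::Vector3d s = a - v0;
  const double u = s.dot(p) / det;
  if (u < 0.0 || u > 1.0) return false;
  const Eigen::Vector3d q = s.cross(e1);
  const double v = dir.dot(q) / det;
  if (v < 0.0 || u + v > 1.0) return false;
  const double t = e2.dot(q) / det;
  return t >= 0.0 && t <= 1.0;                  // hit lies within the segment
}

// Color of the guidance line: red if the path crosses a forbidden mesh, green
// if the needle is aligned with the planned trajectory, blue otherwise.
TrajectoryColor GuidanceColor(const Eigen::Vector3d& tip,
                              const Eigen::Vector3d& target,
                              const Eigen::Vector3d& needleDir,
                              const std::vector<std::array<Eigen::Vector3d, 3>>& forbiddenTris,
                              double alignToleranceDeg = 2.0)
{
  for (const auto& tri : forbiddenTris)
    if (SegmentHitsTriangle(tip, target, tri[0], tri[1], tri[2]))
      return TrajectoryColor::Red;

  const double kPi = 3.14159265358979323846;
  const double cosTol = std::cos(alignToleranceDeg * kPi / 180.0);
  if (needleDir.normalized().dot((target - tip).normalized()) >= cosTol)
    return TrajectoryColor::Green;
  return TrajectoryColor::Blue;
}
```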

Fig. 15 Red line highlighting that other organs are intercepted by the needle trajectory

2.7 Software implementation

We developed the applications in C++ by using several open-source software libraries.

We used the PQP (Proximity Query Package) library [64] to compute the minimum distance between the surgical instrument tip and the vertices of the organ meshes [65].
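
As a naive stand-in for this proximity query, the quantity being computed is simply the minimum Euclidean distance between the tip and the mesh vertices; PQP accelerates the same query with bounding-volume hierarchies. The brute-force sketch below is only for illustration.

```cpp
#include <Eigen/Dense>
#include <algorithm>
#include <limits>
#include <vector>

// Minimum Euclidean distance between the instrument tip and the organ mesh
// vertices (the quantity returned far more efficiently by PQP).
double TipToMeshDistance(const Eigen::Vector3d& tip,
                         const std::vector<Eigen::Vector3d>& organVertices)
{
  double best = std::numeric_limits<double>::infinity();
  for (const auto& v : organVertices)
    best = std::min(best, (v - tip).norm());
  return best;   // in millimetres, if tip and vertices share the same unit
}
```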

We implemented image processing and visualization by means of the IGSTK (Image-Guided Surgery Toolkit) library [66, 67]: it is a set of high-level components integrated with low-level open-source software libraries and application programming interfaces. IGSTK provides several functionalities, such as the ability to read and display medical images and the possibility to interface with common tracking hardware. IGSTK builds on ITK (Insight Segmentation and Registration Toolkit) [68], an open-source software system that employs leading-edge segmentation and registration algorithms in two, three, and more dimensions, and VTK (Visualization Toolkit) [69], an open-source software system for 3D computer graphics, image processing, and visualization. The graphical interface was built using the FLTK (Fast Light Toolkit) library [70].

IGSTK is based on a state machine, which increases the safety and robustness of the toolkit [71]: indeed, a state machine makes it possible to limit and control the possible application behaviors, to preserve the states planned in the design phase, and to guarantee a reproducible and deterministic behavior. Each IGSTK component is based on a separation between public and private interfaces. When a user sends a request for an action through a public method call, this request is translated into an input for the state machine, which changes its state on the basis of its current state and of the received input, as expected in the design phase. In any case, the class will always be in a valid state for which each type of behavior has been programmed.
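
The following toy example, which is only an illustration and not IGSTK code, shows the request/input/transition pattern described above: public methods issue requests, and a fixed transition table keeps the component in one of the states planned at design time.

```cpp
#include <map>
#include <utility>

// Toy state machine in the spirit described above: requests become inputs,
// and a fixed table decides the next state, so the component can never reach
// an unplanned state. (IGSTK's own state machine is far more elaborate.)
class TrackerComponent {
public:
  enum class State { Idle, CommunicationOpen, Tracking };
  enum class Input { OpenRequest, StartRequest, StopRequest };

  TrackerComponent()
  {
    // Only the transitions planned at design time are allowed.
    table_[{State::Idle, Input::OpenRequest}] = State::CommunicationOpen;
    table_[{State::CommunicationOpen, Input::StartRequest}] = State::Tracking;
    table_[{State::Tracking, Input::StopRequest}] = State::CommunicationOpen;
  }

  // Public interface: requests, not direct actions.
  void RequestOpen()  { Process(Input::OpenRequest); }
  void RequestStart() { Process(Input::StartRequest); }
  void RequestStop()  { Process(Input::StopRequest); }

  State GetState() const { return state_; }

private:
  void Process(Input in)
  {
    auto it = table_.find({state_, in});
    if (it != table_.end())
      state_ = it->second;   // valid transition
    // otherwise the request is ignored and the component stays in a valid state
  }

  State state_ = State::Idle;
  std::map<std::pair<State, Input>, State> table_;
};
```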

IGSTK provides also an interface to common tracking hardware: we used it to interact with the Polaris Vicra [55], the first optical tracking system we chose for our application. Then, we developed a specific library, integrated with IGSTK, as a software interface to the Bonita Vicon tracker [56].

The igstk::Tracker class, which represents the tracking system, contains multiple igstk::TrackerTool objects, one for each surgical tool that has to be tracked. Vendor-specific subclasses that extend the igstk::Tracker and the igstk::TrackerTool classes can be instantiated to handle tracking devices and trackable objects, respectively. The igstk::Tracker class can assume different states indicating whether a tool has been attached, the communication has been established, or position data are ready to be sent.

The IGSTK platform is based on timer events synchronized by a clock generated by the igstk::PulseGenerator class. A communication thread, which runs asynchronously with respect to the rest of the platform, handles an igstk::Communication object included in the igstk::Tracker object: it receives the data about the surgical tool position from the tracking system and stores them into a buffer, while the main thread reads data from that buffer. In this way, application locks are prevented in case the connection with the tracker is lost.
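
The decoupling between the two threads can be pictured as a small shared buffer protected by a mutex, as in the simplified sketch below (an illustration of the pattern, not the IGSTK source): the communication thread overwrites the latest pose, and the rendering loop reads whatever is available without ever blocking on the tracker.

```cpp
#include <array>
#include <mutex>

// Shared buffer between the communication thread and the main thread.
class PoseBuffer {
public:
  void Write(const std::array<double, 7>& pose)   // x, y, z + unit quaternion
  {
    std::lock_guard<std::mutex> lock(mutex_);
    latest_ = pose;
    fresh_ = true;
  }

  bool Read(std::array<double, 7>& out)           // false until the first pose arrives
  {
    std::lock_guard<std::mutex> lock(mutex_);
    out = latest_;
    return fresh_;
  }

private:
  std::mutex mutex_;
  std::array<double, 7> latest_{};
  bool fresh_ = false;
};
```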

To calibrate the instrument used in the surgical application and obtain the transformation between the tool and its point of interest, we implemented the PivotCalibration algorithm: it allows computing this transformation simply by keeping the instrument tip fixed on a point and moving the tool in space while its poses are recorded [72].
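
Pivot calibration reduces to a linear least-squares problem: while the tip rests on a fixed point, every sampled pose (R_i, p_i) satisfies R_i t_tip + p_i = p_pivot. The sketch below solves the stacked system with Eigen; it illustrates the underlying math and is not the IGSTK PivotCalibration implementation.

```cpp
#include <Eigen/Dense>
#include <vector>

// Stack [R_i  -I] [t_tip; p_pivot] = -p_i over all sampled poses and solve in
// the least-squares sense; t_tip is the tip offset in the tool frame.
Eigen::Vector3d CalibrateTipOffset(const std::vector<Eigen::Matrix3d>& R,
                                   const std::vector<Eigen::Vector3d>& p)
{
  const int n = static_cast<int>(R.size());
  Eigen::MatrixXd A(3 * n, 6);
  Eigen::VectorXd b(3 * n);
  for (int i = 0; i < n; ++i) {
    A.block<3, 3>(3 * i, 0) = R[i];
    A.block<3, 3>(3 * i, 3) = -Eigen::Matrix3d::Identity();
    b.segment<3>(3 * i) = -p[i];
  }
  const Eigen::VectorXd x = A.colPivHouseholderQr().solve(b);
  return x.head<3>();   // t_tip; x.tail<3>() would be the pivot point
}
```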

2.8 Test protocols

We tested the application in three different scenarios: we performed an accuracy test in our laboratory, an overlapping test on a mannequin, and a real test in the surgical theater.

2.8.1 Accuracy test in the laboratory

For the first test, we prepared a wooden box with dimensions 300 × 300 × 126 (h) mm. We put three small wooden parallelepipeds of different sizes at different positions inside it (Fig. 16).

Fig. 16 Parallelepiped inside the wooden box used in the experimental tests

We loaded two models, built with Blender 3D [73], representing the box and an object inside it, respectively, and reduced the opacity level to make the internal object visible. The application can visualize the points needed to show the surgical tool entrance point and the target points; it can also show the fiducials that allow performing a calibration to align the virtual world with the real one.

In order to carry out the calibration phase, we positioned the tool tip on the point highlighted by a flashing red sphere (Fig. 17) and acquired the point coordinates. We performed these operations for three different fiducial points.

Fig. 17 Positioning the ablator on the first fiducial point (the red sphere on the virtual box)

After the calibration phase, we moved the tool to different positions to check the correspondence between the real and the virtual world.

Before switching to the augmented visualization, we made a first precision test by changing the scene viewpoint through rotations, translations, and zooms. In the augmented reality phase, we overlapped the virtual box on the real one, which allows seeing the objects inside it (Fig. 18).

Fig. 18 Overlapping of the real box on the virtual one in the augmented scene

We checked whether a target point reached in the augmented scene was also reached in the real world through an opening side of the wooden box.

A millimeter scale superimposed on the objects in the box allows estimating the accuracy in reaching the selected points and measuring the error. Figure 19 shows the augmented visualization of the wooden box with the objects inside.

Fig. 19 Augmented scene showing all the objects inside the box

The tests for the evaluation of the system accuracy were carried out by using the Starburst Xli enhanced RF ablator with a 120-mm needle. The goal of the tests was to reach the center of the three small wooden targets inside the box used as a testbed. The descriptions and sizes of the targets are reported below:

  • Target 1: Cube with a side of 20 mm;

  • Target 2: Parallelepiped with dimensions 20 × 40 × 22 (h) mm;

  • Target 3: Parallelepiped with dimensions 30 × 48 × 22 (h) mm.

2.8.2 Mannequin test

We carried out a test on a mannequin with the aim of overlapping the virtual 3D models of the organs on a replica of the patient’s body and evaluating the achieved alignment between virtual and real organs. We chose three fiducial points on the mannequin. We soon abandoned this test because the results were not successful enough, due to the different thoracic girths of the scanned patient and the mannequin. Figure 20 shows the augmented visualization over the mannequin.

Fig. 20 Augmented visualization on a mannequin

2.8.3 Test in the surgical theater

Later, we decided to carry out a qualitative test of the application in the operating room during the radiofrequency ablation (RFA) of a liver tumor [74]. This procedure consists in the insertion of a needle (Fig. 21) inside the liver parenchyma to reach the center of the tumor (Fig. 22): an array of electrodes is deployed from the tip of the needle and an RF electrical current is injected into the tumor tissue to cause tumor cell necrosis by hyperthermia.

Fig. 21 Tool for radiofrequency ablation of tumors

Fig. 22 Insertion of the tool for radiofrequency ablation

The superimposition of the virtual models of the patient’s anatomy (liver, cancer, etc.) on the real anatomy allows an easier and more accurate insertion of the needle: in this way, risks for the patient and surgery time should be reduced. For the visualization, surgeons used a 4K (8 megapixels, 58″ viewing area) ultra-high-definition wall screen already available in the operating room.

The test was planned in order to qualitatively evaluate the application precision and also to highlight all the possible issues related to a real use of the developed platform during a surgical procedure for the treatment of a liver tumor. The patient was operated on in open surgery and the tumor was located on the surface of the liver; a registration phase was carried out in order to overlap the virtual organs on the real ones, and the fiducial points were chosen in correspondence with easily recognizable anatomical points on the rib cage in order to avoid problems due to deformation of the skin. In that situation, the surgeon was able to know the correct position of the real tumor by touching it and to apply the ablator on it in order to verify the correct overlap between the real and the virtual tumor.

The test did not involve any risk for the patient because the application was not used to guide the surgeon during the tumor resection. In any case, the patient was informed about the nature of the experiment and signed an informed consent form. No ethics authorization was needed because the developed system was used in the surgical theater only to give precision evidence to the surgeon. Once the surgeon had identified the tumor by himself, he had the possibility to verify the system precision and accuracy. To this aim, the surgeon chose the clinical case of a patient with a tumor close to the liver surface. The intervention, which consisted in the removal of liver and colon tumors, was conducted in open surgery with traditional laparotomy techniques, without relying on the augmented reality data. Therefore, in this scenario, the system was employed only to provide the surgeon with a confirmation of his deductions, and it had no impact on the surgeon’s decisions.

The tests were carried out using the NDI Vicra as optical tracker: as shown in Fig. 23, the device was placed very close to the operating bed to ensure the proper detection of the surgical tool, taking into account the limited tracking volume.

Fig. 23 Test in the surgical theater

3 Results

In the following subsections, we present the results of the accuracy test and of the real test in the surgical theater. We leave out the test on the mannequin, which was immediately abandoned, as explained in Section 2.8.2.

3.1 Accuracy test

A good guidance system in surgery should allow the surgeon to reach an accuracy better than 3 mm. Using the previously specified testbed, six tests were carried out: the results of these measurements are shown in Table 1, which reports (in mm) the errors on the coordinates of the points reached during the tests and the distance between the center of the chosen real target and the point reached in the augmented visualization. The test in the laboratory proved that the system allows reaching a target point with good accuracy.

Table 1 Distances in reaching the targets expressed in mm

The root mean square (RMS) deviation is 0.03 mm.

3.2 Test in the operating room

During the test in the surgical theater, the tracker at the foot of the operating table was able to detect the surgical tool in real-time thanks to the reflective spheres (markers) that had been properly placed on the instrument. The surgeon confirmed the correct overlap between the real and the virtual tumor during the test in the operating room. However, as we explain in Section 4, the medical staff highlighted some usability issues related to the presence of the tracker in the operating room: to overcome these problems, we adopted another tracking system, namely the Vicon Bonita [56].

4 Discussion

The developed system and the conducted tests gave important food for thought from many points of view.

The experimental tests in the operating room addressed a specific case dealing with the RFA of a liver tumor [74]. However, the same methodology could be adopted, in general, for any parenchymal organ without significant mobility inside the human body. On the other hand, the described approach is not suitable for moving or hollow organs without a rigid structure (such as the stomach, intestine, or breast): the registration procedures would be impossible for them, as they would lose the alignment with the 3D models.

Since the first design phases, we expected that a first barrier to the adoption of a surgical navigator could be the skepticism of some surgeons. As surgeons are typically familiar with traditional CT medical images, our surgical navigator also gives the possibility to visualize them in addition to the 3D models, because sometimes 3D models alone may not convince the surgeon.

Of course, one of the key aspects of the developed system is accuracy, which is particularly important in the registration phase. We exploited the point-based measurements provided by an optical tracker, but other methods have also been proposed in the literature. Surface-based registration methods try to match corresponding surfaces from the virtual and the real world [11]. Unfortunately, passive surface-based techniques fail in complex scenes characterized by occlusions, non-homogeneous lighting, and lack of textures. On the other hand, active approaches would offer better performance, but they require significant modifications to surgical instruments [11].

A recent work [75] presented a registration method focused on liver surgery: it is based on view planning [76], which is a technique for detecting a short list of best views that could be used in a 3D model reconstruction. However, such implementation works only with rigid scenes, since it does not even consider motion from breathing or heartbeat, and with fixed camera parameters.

Volume-based methods exploit the whole intraoperative volume data to provide a deeper anatomy description. Unfortunately, they require very expensive 3D scanners and typically produce a huge amount of 3D data that may generate a heavy computational load [11].

Intentional and non-intentional movements can alter the position of the patient’s body during the surgical procedure. Moreover, the pressure caused by the introduction of a surgical instrument in the abdominal cavity [77] and the surgical actions on tissues generate shifts and deformations of the internal organs. Natural motions deriving from the patient’s physiological activities, such as breathing and heartbeat, can also partially induce these phenomena. Computational advances give a significant contribution, also allowing real-time intraoperative imaging [78]. Nowadays, multicore architectures or GPUs are often employed to cope with the computational load of some complex registration procedures [79, 80]. Recent works proposed non-rigid approaches, such as stereo vision-based [81] and 3D surface mosaic-based methods [82], to deal with soft tissues.

As shown in Table 1, our method, based on the measurements provided by the optical tracker, allowed us to achieve in most cases an accuracy of about 2 mm, which is a satisfactory result for a guidance system in surgery. However, the precision of a surgical procedure also depends on the surgeon’s correct perception of depth and distances. In particular, depth perception deriving from binocular vision plays an important role in the movements for reaching targets and grasping objects, which are crucial in typical surgical operations [83]. For these reasons, stereoscopic vision can improve accuracy in laparoscopic procedures and significantly reduce errors [84]. Stereoscopic shutter glasses can give the illusion of the 3D structure of the organs, but they can generate discomfort [85, 86]. As an alternative solution, our system provides some audio and visual cues to improve the perception of the spatial relations among the anatomical structures and guide the surgeon during the intervention. Besides improving the surgical precision, these application features can help the surgeon in hand-eye coordination [87]. Therefore, by compensating for the lack of depth perception, they can improve the skills of expert surgeons and shorten the learning curve of novices. We think audio-visual cues could integrate perfectly with haptic force feedback in future research work.

In Section 2.6.1, we highlighted how the implementation of a virtual window can overcome the lack of depth perception during augmented visualization. By funnelling the attention onto a limited region, virtual windowing [62] prevents the surgeon from becoming aware of what is happening elsewhere. This is a general drawback of all surgical navigation systems, known as “inattentional blindness” [88]. To mitigate this effect, some authors [89, 90] proposed the adoption of inverse realism to highlight the main features of occluded objects, because the knowledge of their position could anyway be important during a surgical intervention.

Compared to occlusion techniques, motion parallax provides a weaker depth perception, but it can estimate relative distances from motion [91, 92]. On the other hand, stereo vision can estimate absolute distances also in static scenes, but it can generate discomfort, visual fatigue, and headaches in surgeons [93]. Livatino et al. [94] assess the adoption of stereoscopic 3D visualization in endoscopic teleoperation.

Shadowing [91, 95] and perspective can also indirectly contribute to enhancing depth perception. However, such features are strongly related to shapes. Hansen et al. [96] proposed a correlation between the thickness of tumor contours and their depth: unfortunately, this solution does not suit more complex scenes well.

A combination of these techniques is not straightforward and can generate conflicts [97]. A recent work [98] tried to provide depth perception by reproducing the blur effect characterizing the depth of focus of a microscope: the authors state their solution is “suitable for microscopic neurosurgical applications with smaller working depth ranges.”

However, in a comparative evaluation conducted with 20 surgeons on seven rendering methods with a head-mounted display, semi-transparent surface rendering and the virtual window provided the best performance in terms of depth perception [60].

Finally, the test in the operating room gave some feedback about the ergonomics of the employed hardware tracking system. Even though the NDI Vicra tracker could properly detect the reflective spheres on the surgical tool, the surgeon’s body could obscure the tool whenever it was between the tracker and the instrument (Fig. 23): under these conditions, only two cameras may not be enough for a continuous tracking of the surgical instrument. Therefore, we decided to develop a new platform prototype using the Vicon Bonita tracking system [56], placing its four infrared cameras at the four corners of the operating bed. The choice of another tracking device not only allowed achieving a wider field of view, but also improved the resilience and robustness of the system. NDI provides another tracking device, namely the Polaris Spectra, which has a wider field of view [99]. Nevertheless, we preferred the Vicon Bonita, because the presence of four cameras allows good tracking even when one of the cameras is obscured by the body of a medical operator. Therefore, such a tracking system offers not only better resilience but also better usability, since it allows the medical staff to move around the operating table without worrying about hiding one of the tracking cameras.

5 Conclusions

In this paper, we presented an augmented surgical navigator based on the 3D modelling of the patient’s internal anatomy.

The developed AR application allows obtaining a correct positioning of the virtual organs built from CT images on the real ones. An appropriate chain of rigid transformations was implemented to achieve a proper integration of the virtual scene in the real one using specific fiducial points.

Furthermore, in order to provide the visual impression that virtual organs are properly positioned inside the body and not on its surface, a partial view of the organs is provided using a visualization window sliding on the mannequin surface; the remaining part of organs is occluded. The obtained result provides a realistic visualization and a correct impression of depth.

We implemented a combination of visual and audio cues to allow the surgeon to improve the intervention precision and avoid the risk of damaging anatomical structures. The test scenarios proved the good efficacy and accuracy of the developed system.

Some tests were carried out using a specific testbed in order to estimate the accuracy of the augmented visualization. Then, a real test in the operating room allowed the surgeon to check the correct overlap between the real and the virtual tumor. This test suggested changing the tracking device to obtain a configuration that is more robust with respect to occlusions.

In agreement with the surgeons, we are planning further experimental tests for a more accurate validation of the developed system, also in surgical interventions on other parenchymal organs. The mode and terms of the software distribution will be defined after this experimentation.

As future work, we are evaluating the adoption of more robust tracking methods able to track a laparoscopic instrument even in the presence of a noisy background [100].

A future version of the developed system could also provide touchless interaction [39, 49, 50] by integrating devices based on gesture recognition [52, 53], such as the Leap Motion [101, 102], based on infrared cameras, or the Myo [103, 104], a wearable armband that detects the movements of the arm muscles. For instance, we could associate gestures detected by such devices with control commands to move, shrink, or enlarge the sliding window.

While the Leap Motion limits the user’s movements to a defined pyramidal volume, the Myo allows a freer form of interaction, but it could become uncomfortable for the user’s arm when worn for a long time. Therefore, feedback from surgeons will be evaluated to choose the most suitable gesture device for the specific scenario.

Most of the interface components, as well as the module for distance measurement, could also be reused to develop a surgical simulator or a serious game for surgical training. In such a scenario, we could also introduce a haptic interface to provide force feedback when the virtual surgical instrument touches a virtual tissue [105].