Introduction

Background

Technological advances, including robot-assisted minimally invasive surgery (RAMIS), are reshaping modern healthcare. In less than a century since the Czech term “robota” (which translates to “servant”) was coined in 1921, robots have progressed to become surgeons’ assistants. They offer a plethora of benefits over traditional laparoscopic techniques: improved ergonomics, lower levels of fatigue, filtering of tremors and enhanced visualisation [1]. However, robots are often not intuitive and are limited by increased operating times [2, 3], high baseline and maintenance costs [4], lengthy learning curves [5] and lack of haptics [6]. Haptics (haptaesthai, Greek for “to touch”) describes the sensation of touch generated by kinaesthetic (i.e. force-related) and tactile information [7]. Receptors on the skin sense pressure, stretch, temperature, vibration and texture; because the surgeon’s hands never contact tissue directly, tactile feedback is compromised greatly in robotic surgery. Even in traditional laparoscopic operations, where the surgeon still manipulates the grasper by hand, its shaft pivots around the insertion point, resulting in friction that distorts the already limited sense of touch [8].

The importance of haptic feedback is widely recognised, as numbing fingers with local anaesthetic significantly reduces grasping ability [9]. Conveying tactile sensations to the operator in a non-visual display is challenging, though. Interestingly, kinaesthetic feedback per se reduces forces applied in block transfer [10]. Although King et al. recruited 20 subjects and did not assess precision in complex tasks, similar findings have been obtained from work in vivo [11].

Force scaling was proposed in 2002 to enhance telepresence and optimise force feedback [12]. It constitutes a series of algorithms that adjust the forces transmitted to the operator. In the absence of feedback, the scaling factor (SF) is 0, whereas if the feedback is relayed unchanged, SF = 1. Importantly, during ocular surgery or brain cortex stimulation, the forces exerted are very small; without upscaling, feedback can be masked by the inertial mass of the robot and friction between its joints [13]. To date, the optimal degree of force feedback in robotic surgery remains elusive.

IBIS VI

To address current limitations, the authors have developed a surgical robot prototype, IBIS [14]. The latest version, IBIS VI, comprises two robotic arms, each controlling a laparoscopic grasper (Fig. 1). IBIS VI is pneumatically actuated; removal of electric parts reduces cost and weight, while minimising safety concerns, such as risk of fault current. This technology achieves higher backdrivability and power-to-weight ratio. Moreover, it renders force feedback without the need for encoders or force sensors. Graspers are detachable from their potentiometers and pneumatic actuators, allowing for effective sterilisation. A pneumatic microcylinder at the tip of each grasper provides a greater range of movement. IBIS VI provides 7 intra-abdominal degrees of freedom (DoFs): the five of traditional laparoscopy (yaw, pitch, roll, zoom in/out and grip) plus yaw and pitch at the grasper tip. Furthermore, it maintains an immovable point of insertion of the grasper, hence preventing contamination of force feedback. IBIS is also equipped with position sensors that track movements of each arm. It is operated remotely, using the Phantom Desktop haptic device. In summary, IBIS VI constitutes an ideal robot for training surgeons, as it is more affordable, more compact and safer than its electric counterparts. For a detailed summary of the mechanics of IBIS, refer to the Online Appendix (A, Fig. 1).

Fig. 1. A lateral view of IBIS VI, a 7 degree-of-freedom pneumatically driven surgical robot, during needle insertion

Herein, the authors employ IBIS VI to determine the optimal degree of force feedback in simple and complex tasks. Block transfer is compared with needle insertion under four different SFs (0.0, 0.5, 1.0 and 2.0), chosen to represent a realistic range around physiological conditions (i.e. SF = 1.0). The authors hypothesise that increasing the scaling factor will reduce the forces exerted, until augmented feedback overwhelms operators and results in increased, dysregulated forces. Differences will become more prominent in complex tasks. User satisfaction and depth of insertion of the needle will follow a similar trend. Operating times may gradually increase with force feedback, according to the literature.

The primary aim is to investigate the effect of force scaling on forces exerted by operators in simple (i.e. block transfer) and complex tasks, in order to determine the optimal scaling factor. The secondary objectives are threefold: to test whether operating time differs when altering the scaling factor (in block transfer, BT) and to explore the depth of insertion (in needle insertion, NI), as well as user satisfaction. This is the first study to assess force scaling in a pneumatic surgical system.

Materials and methods

Participants

Two right-hand-dominant males, identifying themselves as novices, were recruited on February 28th, 2018. Both adults gave informed consent. Subject selection was based on age and experience in robotic surgery. In this single-blind, single-centre study, participants were asked to perform block transfer (BT) and needle insertion (NI) using IBIS VI under different SFs (0.0, 0.5, 1.0 and 2.0), set in random order. Upon completion of each task, participants were offered a questionnaire to assess satisfaction (Online Appendix A, Fig. 2). This study was conducted in the Institute of Biomaterials and Bioengineering, Tokyo Medical and Dental University, and was completed on the 16th of May 2018.

Fig. 2. Summary of the modified block transfer task: the initial position of the blocks at the start of the task is shown above (a) and the final position at the end of the task is shown below (b)

Task selection and setup

Participants received training through 20 trials of traditional laparoscopic BT and ten trials using IBIS VI. To operate the robot, compressed air was supplied by an air cylinder. The operator controlled IBIS via a commercial master device (Phantom Desktop Geomagic Touch X). A low-cost strain sensor (Micro Load Cell SC133, Sensorcon) (1.0 × 10⁻² g) was used to track force. The software used to quantify force was developed by the Tadano-Kawashima laboratory. For reference, a force of 5 N would feel identical on the master side when SF = 1.0, yet it would feel equal to 10 N if SF = 2.0. Finally, an endoscope (Olympus Endoeye Flex 3D) was supported by the world’s first pneumatic manipulator, the EMARO device (Riverfield Inc.), offering flexible views.
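The 5 N example above can be sketched as a simple multiplicative model. This is an illustrative assumption only: the function name `scale_feedback` is hypothetical, and the controller actually implemented on IBIS VI is not described here and may include filtering or compensation terms.

```python
def scale_feedback(instrument_force_n: float, scaling_factor: float) -> float:
    """Force rendered on the master side for a force sensed at the instrument.

    Assumes a purely linear relationship (hypothetical simplification).
    """
    if scaling_factor < 0:
        raise ValueError("scaling factor must be non-negative")
    return scaling_factor * instrument_force_n


# The four scaling factors tested in this study, applied to a 5 N force.
for sf in (0.0, 0.5, 1.0, 2.0):
    rendered = scale_feedback(5.0, sf)
    print(f"SF = {sf}: 5 N at the instrument is rendered as {rendered:.1f} N")
```

Under this model, SF = 0.0 removes feedback entirely, SF = 1.0 relays it unchanged and SF = 2.0 doubles it, matching the definitions given in the Background.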

Block transfer

Standard commercial BT, which involves 3 mm wide pegs and 9 mm hollow blocks, was modified to compensate for its simplicity. A 3D-printed base was tailored (using SolidWorks) to resemble the commercial board in dimensions, but with pegs 1 mm wider. Three blocks were made using tin-cured silicone and marked 1–3 with a pen. Blocks consisted of part A and part B silicone solutions (Dragon Skin®), mixed at a 1:1 ratio. The mix was placed inside a vacuum degasser for 1 h to remove air trapped during stirring. It was then poured onto a 3D-printed tray for moulding and left for 24 h to dry. Blocks were then extracted manually using a scalpel and scissors. A smooth, hollow, 3D-printed cylinder was inserted within each block, resulting in a final inner diameter of 4.5 mm.

Three blocks were placed in predetermined starting positions on the board (Fig. 2). Participants were asked to lift the blocks with one robotic grasper and transfer them to the designated pegs, labelled 1–3, in numerical order. Time, applied force and user satisfaction were recorded, but no time limit was set. If a block fell from the grasper during transfer, the task was restarted. BT was completed 20 times in total for each participant (i.e. five times for each of the 4 SFs) (Online Appendix A, Fig. 3). All repetitions occurred on different days, to minimise confounding from motor memory.

Fig. 3. An overview of the needle insertion task

Needle insertion

Two sheets of silicone, measuring 10 mm in height each, were fitted inside a 3D-printed cage, measuring 20 × 30 mm. The inferior sheet comprised hard silicone (Dragon Skin®), whereas the superior one contained soft silicone (Eco Flex®). The point of insertion was marked as a black dot in the middle of the upper sheet. A 27G needle (38 mm long) was chosen for insertion and was replaced after every task, to minimise confounding due to blunting of the tip. The needle was held loosely, using laparoscopic forceps (Karl Storz®), for subjects to grab with the robotic grippers. Subjects were instructed to insert the needle perpendicularly (as if administering an intramuscular injection) and to fully traverse the soft silicone layer without entering the hard sheet (Fig. 3) (Online Appendix A, Fig. 4). Penetration force was recorded and images were taken with a camera (Canon® Power Shot G1 X Mark II), set at an equal height to the cage, to identify the vertical depth of insertion by pixel count. The experiment was repeated five times for each SF. The task was restarted in case the needle was dropped.

Fig. 4. The average net force (N) exerted during robot-assisted, modified block transfer under different scaling factors (0.0, 0.5, 1.0 and 2.0). Error bars show the total quadrature uncertainty (± 0.3 N). *p < 0.05

Data analysis

The data was analysed using the IBM Statistical Package for Social Sciences (SPSS). The boundary for significance was set at p < 0.05. Adobe Photoshop was chosen for pixel counting, when assessing depth of needle insertion.

Results

Force analysis

Raw data for maximum force were converted from grammes to newtons by multiplying by g = 9.81 m/s² and dividing by 1000. Baseline recordings in the absence of force (i.e. noise) were subtracted from their respective force readings. The compensated forces along the three axes, Fx, Fy and Fz, were then combined into a net force (Fnet), the magnitude of which was calculated as follows:

$$F_{\text{net}} = \sqrt{F_x^2 + F_y^2 + F_z^2}.$$

The average Fnet for each subject was derived by adding the five values for each experiment in every SF and dividing by 5. Subsequently, the averages for each participant were added and divided by 2. Results are shown below for BT and NI, respectively (Figs. 4, 5). Error bars represent the uncertainty of the calculation (derived as shown in Online Appendix B).
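The processing steps described above (unit conversion, baseline subtraction, net force and averaging) can be sketched as follows. The function names and all sample readings are hypothetical placeholders, not data from this study.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2, as stated in the text


def grams_to_newtons(grams: float) -> float:
    """Convert a load-cell reading in grammes to newtons."""
    return grams * G / 1000.0


def net_force(raw_g, baseline_g) -> float:
    """Magnitude of the baseline-compensated force.

    raw_g and baseline_g are (x, y, z) readings in grammes.
    """
    fx, fy, fz = (
        grams_to_newtons(raw) - grams_to_newtons(base)
        for raw, base in zip(raw_g, baseline_g)
    )
    return math.sqrt(fx**2 + fy**2 + fz**2)


def mean(values) -> float:
    return sum(values) / len(values)


# Hypothetical readings: five trials per participant at one scaling factor.
baseline = (2.0, -1.0, 3.0)
trials_p1 = [(150.0, 80.0, 120.0), (140.0, 70.0, 130.0), (160.0, 90.0, 110.0),
             (155.0, 85.0, 125.0), (145.0, 75.0, 115.0)]
trials_p2 = [(170.0, 60.0, 140.0), (165.0, 65.0, 135.0), (175.0, 55.0, 145.0),
             (160.0, 70.0, 130.0), (180.0, 50.0, 150.0)]

# Average over the five trials for each participant, then over participants.
per_participant = [
    mean([net_force(trial, baseline) for trial in trials])
    for trials in (trials_p1, trials_p2)
]
print(f"Average net force: {mean(per_participant):.2f} N")
```

Because the conversion is linear, subtracting the baseline before or after converting to newtons gives the same result.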

Fig. 5. The average net force (N) exerted during robot-assisted needle insertion under different scaling factors (0.0, 0.5, 1.0 and 2.0). Error bars show the total quadrature uncertainty (± 0.3 N). *p < 0.05, **p < 0.001

For BT, increasing kinaesthetic feedback reduced the forces exerted. This reduction appeared greatest upon introduction of feedback (SF = 0.0 to SF = 0.5) but continued with further increments in SF. A one-way analysis of variance (ANOVA) revealed a significant difference [F(3,36) = 3.440, p = 0.027]. Homogeneity of variance was not violated, as evidenced by Levene’s test (p = 0.469, based on mean). Post-hoc analysis using Tukey HSD identified a significant difference between SF = 0.0 and SF = 2.0 [mean difference (MD) = 0.484, p = 0.04].

For NI, introduction of feedback using any SF reduced the forces exerted compared to baseline, but it was unclear whether further increments in SF reduced force. Although a one-way ANOVA revealed a significant difference [F(3,36) = 10.254, p < 0.001], Levene’s test yielded significant heterogeneity and thus, a Welch test was conducted, confirming the significant difference [F(3,19.23) = 4.795, p = 0.012]. Post-hoc analysis using Tukey HSD revealed significance between SF = 0.0 and all positive SFs (SF = 0.0/SF = 0.5: MD = 1.45, p < 0.001; SF = 0.0/SF = 1.0: MD = 1.32, p = 0.001; SF = 0.0/SF = 2.0: MD = 1.31, p = 0.001). No significance was detected between the positive SFs.
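To illustrate where the degrees of freedom in F(3,36) come from, the sketch below computes a one-way ANOVA F statistic by hand. It assumes four SF groups of ten observations each (five trials for each of two participants); the function name and the force values are hypothetical placeholders. The analysis reported here was run in SPSS, and this sketch does not reproduce its p-values, Levene’s test or the post-hoc comparisons.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical net forces (N): one list per scaling factor, ten values each.
forces_by_sf = [
    [2.4, 1.9, 2.2, 2.0, 1.8, 2.3, 2.1, 1.7, 2.5, 1.8],  # SF = 0.0
    [1.9, 1.6, 1.8, 1.7, 1.5, 2.0, 1.8, 1.6, 1.9, 1.7],  # SF = 0.5
    [1.8, 1.5, 1.7, 1.9, 1.6, 1.7, 1.8, 1.4, 1.6, 1.7],  # SF = 1.0
    [1.7, 1.4, 1.6, 1.5, 1.8, 1.6, 1.5, 1.3, 1.7, 1.6],  # SF = 2.0
]
f_stat, df_b, df_w = one_way_anova(forces_by_sf)
print(f"F({df_b},{df_w}) = {f_stat:.3f}")
```

With four groups and 40 observations in total, the between-group and within-group degrees of freedom are 4 − 1 = 3 and 40 − 4 = 36, respectively.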

Operating time

The average time to complete BT in different SFs was calculated by adding the times in each repetition and dividing by the number of experiments. Findings are presented in Fig. 6. Error bars depict standard deviation (SD).

Fig. 6. The average operating time during robot-assisted, modified block transfer for different scaling factors (0.0, 0.5, 1.0 and 2.0). Averages for both subjects were compiled (N = 2). Error bars represent one standard deviation (± 1 SD). *p < 0.05

Length of operation appeared largest in the absence of feedback and in the presence of augmented feedback, with spread of data (around the mean) largest for the latter group. A one-way ANOVA [F(3,36) = 3.747, p = 0.019] was not taken into consideration, because the homogeneity assumption was not tenable (Levene’s test, p = 0.003). A Brown–Forsythe test was chosen instead [F(3,14.61) = 3.747, p = 0.035]. Post-hoc analysis using Tukey HSD yielded significance between SF = 0.0 and SF = 0.5 (MD = 1.859, p = 0.023). Notably, the difference between SF = 0.0 and SF = 1.0 was equal to the cut-off for significance (MD = 1.859, p = 0.05).

Depth of insertion

Depth of insertion (d, in mm) was calculated as follows:

$$d = L_{\text{needle}} - \frac{l_{\text{needle}}}{h_{\text{cage}}} \times H_{\text{cage}},$$

where Lneedle is the total length of the needle (constant, 38 mm), Hcage is the height of the cage (constant, 30 mm), and hcage and lneedle are the sizes of the cage and visible needle, respectively, as measured by Adobe Photoshop. Although the results were not free of error, uncertainty could not be calculated directly due to the crude method of measurement (i.e. pixels). The average depth of insertion (of five trials per SF, for both participants) is shown below (Fig. 7). Error bars represent SD.
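The pixel-based measurement can be sketched as below. The sketch assumes that the known cage height serves as the scale reference, so that the visible needle length in millimetres is lneedle × Hcage / hcage, and that the needle lies in the same image plane as the cage. The function name and the pixel counts are hypothetical.

```python
NEEDLE_LENGTH_MM = 38.0  # total needle length, from the text
CAGE_HEIGHT_MM = 30.0    # physical cage height, from the text


def insertion_depth_mm(visible_needle_px: float, cage_height_px: float) -> float:
    """Depth of insertion from pixel counts taken in the same photograph."""
    if cage_height_px <= 0:
        raise ValueError("cage height in pixels must be positive")
    mm_per_px = CAGE_HEIGHT_MM / cage_height_px
    visible_needle_mm = visible_needle_px * mm_per_px
    # Whatever part of the needle is not visible is inside the silicone.
    return NEEDLE_LENGTH_MM - visible_needle_mm


# Hypothetical pixel counts: a cage spanning 600 px and 560 px of visible needle.
print(f"d = {insertion_depth_mm(560.0, 600.0):.2f} mm")
```

An oblique insertion would make this value overestimate the vertical depth, which is consistent with the caveat about insertion angle raised in the Discussion.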

Fig. 7. The average depth of insertion during robot-assisted needle insertion for different scaling factors (0.0, 0.5, 1.0 and 2.0). Averages for both subjects were compiled (N = 2). The ideal depth of insertion was 10 mm. Depth of insertion in each SF is shown in red. Error bars represent one standard deviation (± 1 SD). *p < 0.05, **p < 0.001

The results show that increments in SF reduced the depth of insertion. The largest reduction was observed upon introduction of feedback at SF = 0.5. A one-way ANOVA showed significance [F(3,36) = 38.385, p < 0.001] in the presence of a tenable assumption of homogeneity (Levene’s test: p = 0.678). Post-hoc analysis using Tukey HSD suggested multiple significant differences between groups: SF = 0.0/SF = 0.5: MD = 4.81, p < 0.001; SF = 0.0/SF = 1.0: MD = 5.49, p < 0.001; SF = 0.0/SF = 2.0: MD = 6.90, p < 0.001; SF = 0.5/SF = 2.0: MD = 2.10, p = 0.021. These findings confirm an important reduction in depth of insertion with increments in SF. The results also support a smaller, albeit significant, difference when increasing the SF from 0.5 to 2.0. Interestingly, participants managed to direct the needle as instructed only for the largest SF, penetrating into the hard sheet of silicone by merely 0.21 mm (d = 10.21 mm). In contrast, the insertion traversed most of the hard sheet for SF = 0.0 (7.11 mm; d = 17.11 mm). For SF = 0.5 and SF = 1.0, the needle entered the hard sheet by 2.31 mm (d = 12.31 mm) and 1.62 mm (d = 11.62 mm), respectively.

User satisfaction

Herein, the subject averages for all four questions were added, with a maximum possible score of 20. Scores reflect overall satisfaction. The questions addressed feeling in control, telepresence, speed of operation and preference over non-robotic alternatives for BT and NI, respectively (Figs. 8, 9). Error bars represent the cumulative score for each participant, serving as an assessment of how similar this experience was for the subjects.

Fig. 8. User satisfaction with IBIS in the modified block transfer under four different scaling factors (0.0, 0.5, 1.0 and 2.0). Data for each scaling factor is the average of the sum of the scores in four questions for N = 2. Error bars represent cumulative scores for each participant, with a maximum possible score of 20. *p < 0.05

Fig. 9. User satisfaction with IBIS in needle insertion under four different scaling factors (0.0, 0.5, 1.0 and 2.0). Data for each scaling factor is the average of the sum of the scores in four questions for N = 2. Error bars represent cumulative scores for each participant, with a maximum possible score of 20. *p < 0.05

Intermediate SFs, SF = 0.5 and SF = 1.0, were preferred in BT, with the latter scoring the highest. Minimum satisfaction was observed in the absence of feedback, but variability was highest for SF = 2.0. A Kruskal–Wallis test [H(3) = 10.852, p = 0.013, mean ranks 0.0: 9.50, 0.5: 18.50, 1.0: 23.75, 2.0: 14.25] revealed significance between SF = 0.0 and SF = 1.0 (p = 0.01), using the Bonferroni correction. In NI, satisfaction was similar across all conditions [Kruskal–Wallis: H(3) = 0.134, p = 0.99, mean ranks 0.0: 15.63, 0.5: 16.81, 1.0: 17.19, 2.0: 16.38]. Of note, the maximal and minimal variability in responses was present for SF = 2.0 and SF = 0.0, respectively. With regard to qualitative data, one subject commented that force feedback at SF = 0.5 was faint. Both subjects agreed that telepresence was minimal in the absence of feedback. Subjects also acknowledged that force felt “really strong” at SF = 2.0. Ultimately, one participant reported losing control at SF = 2.0 due to “unelicited, dysregulated motions” of IBIS.

Discussion

Effects of force scaling on operator force

The results partly confirm the original hypothesis, as increasing the SF reduced forces exerted, but without an increase in force at SF = 2.0. Intriguingly, significant reductions in force (2.07 N–1.58 N, p = 0.04) were present only between SF = 0.0 and SF = 2.0, indicating little benefit of force feedback at physiological magnitudes in BT (Fig. 4). A possible explanation is that despite the modifications, BT is not complex enough for feedback to offer a meaningful benefit. This calls into question the quality of the task as well as its widespread use in surgical training and competitions for medical students. However, the effects observed at SF = 2.0 might be the result of unblinding due to the prominent difference in sensation between experimental conditions. Although previous research suggests that haptic feedback is most useful in complex tasks, such as milling of hard tissue, the ideal SF remains elusive [15]. A reaction force observer, though, was used in the aforementioned study, rendering direct comparison with our experiments suboptimal [16]. Furthermore, the maximum force in BT was unexpectedly observed upon fitting blocks onto pegs, rather than on pushing down when making contact to lift them. In summary, only augmented force feedback significantly reduces the forces exerted during BT.

Reduction in force was significant between SF = 0.0 and all positive SFs in NI (Fig. 5), indicating the importance of force feedback in elaborate tasks. Still, findings do not verify our hypothesis, as force exerted at SF = 2.0 did not differ significantly from the minimum force (i.e. SF = 0.5: 1.73 N vs. SF = 2.0: 1.87 N, p = 0.96). Similar experiments, with limited sample size (N = 4), have reported reductions in force only in the vertical direction (i.e. z-axis) [17]. However, due to the acute angle of insertion in some trials herein, analysis of the net force, rather than force in the z-axis, was most meaningful. Whilst kinaesthetic feedback reduces the forces exerted, some argue it results in less tissue damage only when combined with tactile feedback [18]. Although neither precision of insertion nor damage to silicone was assessed, addition of tactile feedback would be of debatable benefit, as no direct palpation was involved in this experiment. Interestingly, one subject noted oscillations concurrent with loss of control at SF = 2.0. Studies suggest that augmented feedback conditions overwhelm surgeons and decrease performance, yet our results suggest otherwise [19]. Despite heterogeneity of variance in the data, the Welch test ensured conservative analysis, rendering statistical inaccuracies unlikely. Collectively, augmented feedback offers significant benefits in NI, comparable to those under physiological conditions.

Secondary outcomes

Time to complete BT followed an opposite trend to that hypothesised (Fig. 6). Force feedback (SF = 0.5) reduced operating time significantly compared to SF = 0.0 (23.6 vs. 29.2, p = 0.02). Physiological conditions (SF = 1.0) also reduced operating time, yet not significantly (24.2 vs. 29.2, p = 0.05). A likely explanation is that the task is inherently short and simple, and thus fails to introduce the fatigue normally present during surgery. Augmented feedback did not reduce operating time significantly, either (26.5 vs. 29.2, p = 0.48). Larger studies (N = 100) disagree [4, 20], while others suggest that lengthening of time in robot-assisted surgery is only significant for novices [4, 20]. Conversely, a recent phase 3, multicentre trial (N = 326) favoured robot-assisted laparoscopy versus open prostatectomy (246 min vs. 280 min, p < 0.001) [21]. Although the evidence in favour of higher operating times in robotic surgery is vast, a meta-analysis focusing on pyeloplasty in children attributes this debate to the lack of a universally accepted definition of operating time [22]. Indeed, it is unclear whether operating times include the lengthy preparations and docking of the robot prior to surgery. In summary, intermediate SFs reduce operating times in IBIS-assisted BT.

Introduction of feedback resulted in significant reductions in depth of needle insertion compared to baseline (p < 0.001) (Fig. 7). Interestingly, a further significant reduction was observed between SF = 0.5 and SF = 2.0 (12.31 mm vs. 10.21 mm, p = 0.02), rejecting the original hypothesis. Instead of constituting a distractor or opposing force, augmented feedback became the most useful SF [23]. This finding is not surprising, as precise tasks often require smaller forces. While IBIS has a lower mass than its electrically driven counterparts and is compensated for inertia, feedback from physiological scaling factors may be masked, to an extent, in delicate tasks (e.g. NI). Interestingly, the needle penetrated the hard silicone sheet (by 0.21 mm) at SF = 2.0, raising the question of whether the range of SFs selected was appropriate. While upscaling beyond 2.0 is theoretically possible, exacerbation of oscillations and risk of harm to the operator make that venture unlikely to be pursued. Perhaps an SF of 1.5 would be most meaningful to understand the differences between SF = 1.0 and SF = 2.0. Nevertheless, it is possible that the needle did not enter the hard silicone sheet, if one considers that insertion rarely occurred at exactly 90°. Overall, increments in SF reduce the depth of insertion.

Robotic surgery increases user satisfaction by improving ergonomics and reducing stress (physical and mental), yet the role of SF is less well described [24]. Results in BT confirm our hypothesis, with the highest satisfaction seen for SF = 1.0 (p = 0.02) (Fig. 8). In contrast, satisfaction did not change in NI (Fig. 9). Notably, Kruskal–Wallis is less powerful than ANOVA, and the Bonferroni correction can yield false negatives, so a significant difference between SF = 0.0 and SF = 0.5 (p = 0.28) may have been overlooked [25]. In addition, the significant reductions in force and depth of insertion suggest that haptic feedback offers benefits that operators may not be aware of during surgery [18]. Training of subjects in BT, but not NI, could also have been a source of confounding. Of note, the biggest variability in responses for both tasks was in “preference of IBIS over non-robotic alternatives”, where one subject consistently scored higher for SF = 0.0, SF = 0.5 and SF = 1.0, but lower for SF = 2.0. This highlights the need to tailor feedback not only to the task but to the surgeon, as well. Both operators were 21 years of age; hence, their sensory modalities were fully functional. Surgeons of different ages, though, have dissimilar thresholds for sensation. Taken together, physiological SFs increase satisfaction in BT, but not in NI.

Future direction

Determining the ideal degree of feedback is not straightforward, as SFs change depending on the task. The next step will be to compare different masters. Higher SFs may favour a delta-mechanism master, such as the sigma7 (Force Dimension), which provides stiffness at the expense of workspace, compared to the serial-linked Phantom Desktop utilised herein. The Kawashima laboratory is also developing a pneumatic master, in an attempt to construct a fully pneumatic surgical system. To address individual preferences in SF, another venture would be to incorporate detectors of a surgeon’s fatigue (e.g. via electromyography) that automatically adjust the SF during an operation. The future could see a system which adjusts its SF based on both the operator and task. In this way, common tasks (e.g. suturing) may become fully automated [26, 27]. Taking a step further, a more immersive system could receive input from the operator’s prefrontal cortex for an additional level of dynamic adjustment of the SF, based on the operator’s stress and concentration [28]. Furthermore, embedding active constraints (i.e. navigational systems that strictly allow movement solely through predetermined paths) could eventually achieve precision impossible by human hand [29, 30]. However, surgeons with experience perform better even in the absence of feedback and thus, standardisation of training needs to be addressed together with the fundamental issues of lack of non-proprietary platforms and health inequalities [31]. IBIS thus constitutes an affordable and portable prototype, ideal for surgical training and suited for the future of healthcare.

Conclusion

In conclusion, increments in SF reduce forces exerted in modified BT and NI when using IBIS, a pneumatic surgical robot prototype. Higher scaling factors (SF = 2.0) are not associated with negative outcomes, when compared to SF = 0.5 and SF = 1.0, and are superior to SF = 0.0. Operating time and depth of needle insertion are also reduced. User satisfaction is highest for SF = 1.0 in modified BT, yet no trend is present for NI. However, the results of this study are limited by our methodology, sample size and the mechanics of IBIS. In summary, the optimal magnitude of kinaesthetic feedback in robotic surgery differs depending on the task and the operator. It is of paramount importance to tailor the SF dynamically, in order to achieve the best operative results and advance surgical automation.