Introduction

Fatigue is defined as weariness or exhaustion from labor, exertion, or stress [1]. With regard to surgical performance, fatigue can be difficult to quantify, especially in a live theatre setting. It is believed that muscular and mental fatigue can develop during prolonged operating, leading to a reduction in a surgeon’s fine motor control and reduced precision of instrument movement [2].

Virtual reality (VR) simulators have been used extensively in the airline industry to objectively assess the effect of fatigue on pilot performance [3, 4]. The assessment of fatigue using laparoscopic surgery VR simulators has also been reported [5]. However, to date there has been no published data regarding fatigue in intraocular surgery.

One of the commercially available ophthalmic virtual reality surgical training systems is the Eyesi©, produced by VRmagic (Mannheim, Germany) (Fig. 1). Originally designed as a vitreo-retinal surgical training device, it now has a dedicated anterior segment training module. It allows repeated measurements of standardized surgical tasks. Feedback is provided in the following main categories: surgeon efficiency, achievement of surgical target or goal, surgeon error/tissue injury, and formative education/feedback [6]. The forceps training module has previously demonstrated construct validity [7], and we sought to use it to assess the effect of fatigue on intraocular surgical performance.

Fig. 1 Eyesi© Intraocular Surgical Simulator (VRmagic, Mannheim, Germany). Note the left phacoemulsification pedal, right microscope pedal, instructor screen, head prop with electronic eye, and viewing microscope on an adjustable platform

Materials and methods

The study was performed using a virtual reality cataract surgery simulator (Eyesi©, VRmagic, Mannheim, Germany). The simulator consists of a mannequin head prop with a mechanical eye that pivots and rotates when manipulated by the surgeon. Various probes inserted into the mechanical eye can virtually emulate different intraocular instruments. A virtual operating microscope complete with zoom/pan/focus foot pedal provides stereoscopic images of the eye and instruments to the surgeon. A separate phacoemulsification foot pedal can also be used. Images from the microscope are also relayed onto an instructor viewing screen allowing real-time task monitoring and viewing of historical performance data. The simulator is loaded with various modules such as anti-tremor, forceps, capsulorhexis, and phacoemulsification with their availability varying according to the software version. Each module has different difficulty levels to simulate increasingly complicated tasks.

The study participants included seven experienced ophthalmic surgeons from our department with a wide range of sub-specialist interests. All were classed as “experienced” because they had completed more than 350 phacoemulsifications and intraocular lens insertions, thereby meeting the Royal College of Ophthalmologists minimum criteria for completion of training [8]. The sample size was small due to the limited number of available experienced surgeons in our department. Ethics committee approval was obtained and all participants provided written consent. Each participant was also informed that the data collected would not be part of any professional appraisal and would be reported anonymously.

The Eyesi anterior segment forceps module requires the surgeon to grasp six objects from the periphery and place them in a net in the center (Fig. 2). The size, shape, and antero-posterior location of the objects within the anterior chamber vary according to the difficulty level of the module, which ranges from one to four. The module teaches surgeons to accurately grasp the edge of a capsulorhexis flap while keeping the eye centered and avoiding injury to the lens or cornea. We chose to use this module as it has previously shown construct validity [7]. For each attempt, the total score can range from 0 to 100. The simulator awards positive points for the percentage of the task completed and subtracts from this for reduced efficiency and errors. This can be represented as shown below.

Fig. 2 Eyesi© anterior segment forceps module, level 4 (VRmagic, Mannheim, Germany). Notice the thin triangular objects in the peripheral anterior chamber that the surgeon has to grasp with the forceps and then place into a central “net”. Successful placement is indicated by a change in color of the object to green

Forceps module total score = number of remaining objects score (0 remaining = 100 points) − excessive task time score (40 s = 0 points, 400 s = −20 points) − corneal injury score (corneal surface touched by instrument measured, 0 mm² = 0 points, 10 mm² = −100 points) − lens injury score (lens surface touched by instrument measured, 0 mm² = 0 points, 40 mm² = −25 points) − odometer score (measuring distance traveled by instruments in the anterior chamber, 100 mm = 0 points, 140 mm = −20 points) − operating without red reflex score (time counted until surgeon returns into red reflex, 0 s = 0 points, 400 s = −20 points) − interacting out of focus score (−5 points per out-of-focus interaction up to maximum −20 points) − non-horizontal insertion/removal of instruments score (−2 points per event up to maximum −20 points).
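To make the arithmetic of this scoring scheme concrete, the following is a minimal sketch in Python. It is not the simulator's implementation. The anchor points (for example, 40 s = 0 points and 400 s = −20 points) are taken from the formula above, but several details are assumptions made purely for illustration: that each penalty grows linearly between its two anchor points and is clamped outside them, that completion points are proportional to the number of objects placed out of six, and that the total is bounded to the 0 to 100 range. The names `Attempt`, `linear_penalty`, `capped_count_penalty`, and `forceps_total_score` are hypothetical.

```python
from dataclasses import dataclass


def linear_penalty(value, zero_at, max_at, max_penalty):
    """Penalty rising from 0 at `zero_at` to `max_penalty` at `max_at`.

    ASSUMPTION: linear interpolation between the two published anchor
    points, clamped outside them. The source only gives the anchors.
    """
    if value <= zero_at:
        return 0.0
    if value >= max_at:
        return float(max_penalty)
    return max_penalty * (value - zero_at) / (max_at - zero_at)


def capped_count_penalty(events, per_event, cap):
    """Fixed penalty per event, up to a maximum (as stated in the formula)."""
    return float(min(events * per_event, cap))


@dataclass
class Attempt:
    objects_remaining: int       # objects not yet placed in the net
    task_time_s: float           # total task time in seconds
    cornea_touched_mm2: float    # corneal surface touched by instrument
    lens_touched_mm2: float      # lens surface touched by instrument
    odometer_mm: float           # instrument path length in anterior chamber
    no_red_reflex_s: float       # time spent operating without red reflex
    out_of_focus_events: int     # out-of-focus interactions
    non_horizontal_events: int   # non-horizontal insertions/removals


def forceps_total_score(a: Attempt, n_objects: int = 6) -> float:
    # ASSUMPTION: completion points are proportional to objects placed.
    completion = 100.0 * (n_objects - a.objects_remaining) / n_objects
    penalties = (
        linear_penalty(a.task_time_s, 40, 400, 20)
        + linear_penalty(a.cornea_touched_mm2, 0, 10, 100)
        + linear_penalty(a.lens_touched_mm2, 0, 40, 25)
        + linear_penalty(a.odometer_mm, 100, 140, 20)
        + linear_penalty(a.no_red_reflex_s, 0, 400, 20)
        + capped_count_penalty(a.out_of_focus_events, 5, 20)
        + capped_count_penalty(a.non_horizontal_events, 2, 20)
    )
    # ASSUMPTION: the total is bounded to the stated 0-100 range.
    return max(0.0, min(100.0, completion - penalties))


# Illustrative attempt (made-up values, not study data): all objects placed,
# 220 s task time, 120 mm instrument travel, one out-of-focus interaction.
example = Attempt(0, 220, 0, 0, 120, 0, 1, 0)
example_score = forceps_total_score(example)
```

Under these assumptions the illustrative attempt loses 10 points for time, 10 for instrument travel, and 5 for the out-of-focus interaction. The sketch also shows why corneal contact dominates the score: touching 10 mm² of cornea alone removes all 100 points.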

A standard study format was used for all surgeons and they were supervised by one of two investigators (SW or JP). Age, sex, hand dominance, years of intraocular operating experience, and approximate number of cataract operations done to date were recorded for each surgeon.

The first set of data was collected immediately before a live theatre session. Details of the theatre list, including its duration and the number and type of listed procedures, were recorded. Each surgeon received a standardized orientation to the simulator with two practice sessions (level 1 and level 2 forceps module). Scores for these were not recorded. All surgeons then completed ten attempts at the level 4 forceps module. The parameters recorded for each attempt were total score, total time, total time score, corneal injury score, lens injury score, odometer score, and operating without red reflex score. A “plateau” score was calculated for each surgeon and every parameter as the average of their final four attempts. This was a practical attempt to reduce the effect of the steeper part of the learning curve, since it gave each surgeon a further six attempts after the orientation session before the attempts that counted toward the plateau.
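The plateau calculation described above can be sketched in a few lines of Python. The function name `plateau_score`, the parameter names in the dictionary, and all numbers are hypothetical and purely illustrative; they are not study data.

```python
from statistics import mean


def plateau_score(attempt_values, n_final=4):
    """Average of the final `n_final` attempts for one recorded parameter."""
    if len(attempt_values) < n_final:
        raise ValueError("not enough attempts to compute a plateau score")
    return mean(attempt_values[-n_final:])


# Illustrative values for one surgeon's ten level 4 attempts (made up).
session = {
    "total_score": [62, 70, 68, 75, 80, 78, 84, 86, 82, 88],
    "total_time_s": [95, 90, 88, 80, 76, 74, 70, 68, 72, 66],
}

# One plateau value per parameter: the first six attempts are ignored.
plateaus = {name: plateau_score(values) for name, values in session.items()}
```

Here the plateau total score is the mean of 84, 86, 82, and 88, so the earlier, lower attempts on the steep part of the learning curve do not pull the value down.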

The surgeons then returned immediately after their scheduled theatre lists to complete a further ten attempts on the forceps module (level 4), and the same scores were recorded.

Statistical advice was sought from the local research and development department. Normality was assessed using histograms and Q-Q plots. Parametric testing (paired t test) was used for normally distributed data and non-parametric testing (Wilcoxon signed-rank test) for skewed data. Multiple comparison corrections were not required for our sample size. A p value < 0.05 was considered statistically significant. No corneal injuries were recorded, so this parameter was excluded from the statistical analysis. All data were analyzed in SPSS and results are presented as mean ± SD [95% confidence interval].
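For readers unfamiliar with the two paired tests, the sketch below shows how each works on seven pre/post pairs, using only the Python standard library. It is not the SPSS procedure used in the study, and the plateau values are invented for illustration. The Wilcoxon p value is computed exactly by enumerating every possible assignment of signs to the ranks, which is feasible for n = 7 (128 assignments); this exact approach is a choice made for the sketch and may differ from the software's default. The function names are hypothetical.

```python
from itertools import product
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(pre, post):
    """Return (t, degrees of freedom) for a paired t test on post - pre."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


def wilcoxon_signed_rank_exact(pre, post):
    """Return (W+, two-sided exact p value) for the Wilcoxon signed-rank test."""
    # Zero differences carry no sign information and are dropped.
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Tied absolute differences share the average of their ranks.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = min(w_plus, total - w_plus)
    # Under the null hypothesis each sign is equally likely: count the sign
    # assignments that are at least as extreme as the observed one.
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            extreme += 1
    return w_plus, extreme / 2 ** len(ranks)


# Illustrative plateau total scores for seven surgeons (made up, not study data).
pre_theatre = [70, 65, 80, 72, 68, 75, 77]
post_theatre = [74, 70, 86, 73, 71, 77, 84]

t_value, df = paired_t_statistic(pre_theatre, post_theatre)
w_plus, p_exact = wilcoxon_signed_rank_exact(pre_theatre, post_theatre)
```

With seven pairs, the t statistic is compared against a t distribution with 6 degrees of freedom, for which the two-sided 5% critical value is approximately 2.447. The smallest two-sided exact Wilcoxon p value attainable with seven pairs is 2/128 ≈ 0.016, which occurs when all seven differences share the same sign, as in this illustrative data. Where SciPy is available, `scipy.stats.ttest_rel` and `scipy.stats.wilcoxon` cover the same tests.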

Results

Six of the surgeons were right-hand dominant and one was left-hand dominant. Other surgeon characteristics are shown in Table 1. The mean number of phacoemulsifications and IOL insertions performed by the surgeons to date was 3,800. Phacoemulsification with IOL insertion was also the most common procedure performed in the evaluated lists. A minority of other procedures were also undertaken, including one endoscopic removal of a naso-lacrimal duct silicone tube and one pterygium excision with conjunctival autograft.

Table 1 Surgeon characteristics

Pre- and post-theatre mean plateau scores for all parameters along with p values are shown in Table 2. Post-theatre session improvements in mean values for total score, total time score, total time, odometer score, and operating without red reflex score were noted. Of these, the improvements in total score and total time were found to be statistically significant (p = 0.028 and p = 0.033, respectively). Lens injury score worsened post-theatre but this difference was not statistically significant. Sample size was not suitable for meaningful statistical comparison of surgeon characteristics with simulator performance.

Table 2 Pre-theatre versus post-theatre performance with p values for experienced surgeons (n  = 7, result presented as mean ± SD [95% CI])

Discussion

Fatigue remains a concern amongst all professions requiring prolonged periods of focused concentration with minimal margins for error. Various approaches have been adopted to measure fatigue. The airline and space industries conducted pioneering studies combining in-flight physiological and performance indices with post-flight assessments on simulators [9]. These studies provided valuable data, which have led to fatigue-reducing pilot schedules.

Amongst surgeons, fatigue may be due to increased workload either in the operating environment (e.g., high-volume lists) or prior to operating (e.g., sleep deprivation). In general surgery, various measurement methods have been adopted including comparing patient outcomes with non-fatigued colleagues [10, 11], intra-operative assessment of muscle fatigue using surface electromyography [5], and the use of virtual reality simulators [12].

Our study represents a first attempt at measuring the effect of fatigue on intraocular surgery. It did not demonstrate a deleterious effect of fatigue on surgeon performance after a regular operating list in the National Health Service (NHS). A usual operating session within the NHS is 3.5 to 4 h with minor inter-departmental variations. This includes patient transfer times, during which the surgeon would be busy writing operation notes or reviewing the next patients’ files. A slight improvement in performance was noted in the majority of the measured parameters. We feel this represents a peak on the traditionally described “S-shaped” performance curve indicating increasing surgeon capability with repeated performance of a surgical task. It has been observed in psychomotor skill-acquisition studies that performance curves decrease in the presence of fatigue and distraction, which was not seen in our study [13]. The improvement could also be secondary to a learning curve effect. This is, however, best measured using a “retention test” whereby the learner, after resting for a period of time, is asked to perform the skill that had been practiced [14]. Our study design had no rest periods, making the performance curve explanation more likely.

Cataract surgery remains the most common intraocular procedure performed within the NHS. Data from the Royal College of Ophthalmologists shows that the average waiting time for a cataract operation in 2003/2004 was 190 days [15]. Heavy investment in cataract surgery pathways has subsequently been made to reduce this to less than 3 months. One of the discussed approaches has been to do “high-volume” lists. While the definition of this remains vague, it generally implies that the surgeon does many more procedures than they would normally be accustomed to doing within the same time span. Indeed, the exact number may vary from surgeon to surgeon but it does raise questions regarding patient safety and surgical outcomes. Our study provides a template to measure surgeon fatigue in high-volume lists. Future application could include measurement of the “fatigue-threshold” of each individual surgeon allowing the list to be adjusted to their capacity.

We would like to acknowledge some limitations to our study. Our sample size was limited by the number of available experienced surgeons in our department. Acquisition of similar simulators by more hospitals around the country could provide a larger sample size through multicenter collaboration. The theatre lists included in our study were not standardized for procedure type, with some extraocular operations included. Further studies could be designed to be more procedure-specific. We used only the forceps module because its construct validity had already been demonstrated. Once validated, modules with higher complexity could also be used to more accurately gauge the effect of fatigue. Post-theatre simulator data could also be combined with intraoperative measurements using innovative methods such as surgeon ocular and hand movement tracking to provide more information.