Introduction

Since the introduction of ultrasound (US) machines, the human–computer interaction (HCI) methods they employ have been constantly evolving and combine a variety of modalities [1, 18]. The current application scenarios for US scans range widely and are principally, but not exclusively, medical or diagnostic [3]. Regardless of where ultrasound machines are used, user productivity is always a primary concern in the design of the user interface [1].

Traditionally, a physical control panel (PCP) on the US machine provides the main HCI for machine control [17, 18]. A mix of physical buttons, keys, sliders, trackballs and knobs is commonly used on the PCP of modern US machines. More recent designs have added touch screens [13] or remote controllers [2] to the PCP. However, according to [5], having a PCP separated from the main screen can interfere with the workflow, as the operator needs to switch attention frequently between the main screen and the PCP. Furthermore, it was also found that work-related injuries are closely related to the repetitiveness of adjusting parameters on the PCP [5]. Moreover, some ultraportable machines lack some or all physical controls. A supplement or alternative to the PCP on US machines is voice control [12, 16, 19]. However, voice control during a US scan is not preferred, as it can disturb the natural conversation between the patient and the sonographer [5]. To solve the usage difficulties observed in current HCIs on US machines, we propose to integrate eye gaze tracking technology into the interface.

One’s eye gaze naturally reflects his or her locus of attention, and usually, people must fixate on or near a target in order to perform a task [20]. Therefore, it is natural to employ eye gaze tracking as an HCI method. With the advancement of gaze tracking technology, it has been applied to many fields, including social science, psychology, computer science and neuroscience [15]. Attempts to apply gaze tracking to the control of medical devices have also been made to facilitate effortless control of systems, such as in [7, 11, 14]. The first attempt to integrate gaze tracking into the control of US machines was made in [6], where image zooming and panning were integrated with gaze tracking. However, the “Midas touch” problem [10] was not properly handled, and thus, task completion time with the gaze-supported HCI was not reduced compared to the conventional HCI [6]. Despite that, [6] provides guidelines for future research on gaze-supported HCI designs for US machines.

In our proposed HCI, to increase examination efficiency, we intend to merge some aspects of the PCP into the main screen, such that the operator no longer needs to switch attention to the PCP to set certain parameters. This is meant as a proof-of-concept study. To achieve this without having to touch the screen, we use a simple handheld controller as a replacement for the buttons and keys on the PCP of the US machine. Building on the results of [6], a hybrid control mode is introduced to better handle the “Midas touch” problem. To avoid switching attention from the image to the controls, a pop-up menu appears on top of the US image at the gaze position so that machine settings can be altered easily.

A detailed demonstration of the proposed HCI is presented in “System design” section. A user study was conducted to test the validity of our design ideas. The methods and results of the user study are presented in “User study” section. We conclude our paper in “Conclusions” section by discussing the advantages and disadvantages of our HCI, as well as our ideas for future work.

System design

In this section, we provide the rationale for and the design details of the proposed gaze-supported HCI. We start by presenting the common features of HCI designs among several brands of frequently used US machines. Next, we summarize the difficulties with some of these features and use them to motivate our design goals. We then proceed to explain our specific HCI design.

Table 1 US machine details

Common HCI design features of US machines

We conducted a survey of the HCI designs on different models of US machines. A total of 7 different models, listed in Table 1, were studied. Despite differences in brand and year of manufacture, three common characteristics can be observed among these US machines. First, as shown in Fig. 1b, the HCIs consist of two components: a main screen, for the display of US images and machine status, and a PCP, for all control inputs to the machine. Second, even though there are differences in layout and shape, all PCPs include buttons, knobs, one trackball, and one touch screen. Third, the methods for adjusting frequently used parameters are very similar or even identical among the observed machines. Among the hundreds of functions available on US machines, only approximately 10 are used frequently (in more than \(90\%\) of US scans) [5]. The three most frequently used functions are image freezing, low-resolution zoom, and measurement. These are discussed next.

Fig. 1 System demonstration

Image freezing is always achieved by pressing a designated button near the trackball. For low-resolution zoom of the image, two machine models use a pair of buttons to zoom in and out; the others use the rotation of a knob. For measurement, the following procedure is followed on all HCIs:

  1. Press the “measurement” button on the PCP.

  2. Select the measurement type (distance, area) on the touch screen.

  3. Perform the measurement using the trackball and the selection button.

Based on these observations, we conclude that the general HCI methods for frequently used control inputs on US machines have remained relatively unchanged during recent years, and these HCI designs are adopted broadly. As a consequence, sonographers report little difficulty switching from one machine to another to perform US scans.

Usage difficulties and design ideas

According to [5], the PCP-based HCI suffers from several usage difficulties: (I) an excessive number of operations on the PCP is required to adjust certain settings, which prolongs the duration of the US scan; (II) adjusting some parameters on the PCP involves repetitive body movement, which not only adds unnecessary workload for the operators, but also contributes to work-related injuries; and (III) attention must frequently be switched between the main screen and the PCP, which can hinder the normal operation flow.

Therefore, the goal of our HCI system is to solve problems (I)–(III) and facilitate effortless interaction with US machines.

The motivation for employing a gaze tracker comes from the information conveyed in gaze signals. As the positional information in one’s eye gaze naturally reflects the region of interest when performing a task, the excessive steps needed on the PCP to inform the machine about the location of an operation can be reduced. In addition, if the gaze signal is used as a source of control input, the required body movement can also be decreased, since some of the required machine control can be achieved directly with eye gaze. Therefore, employing the gaze tracker addresses problems (I) and (II).

However, due to the “Midas touch” problem (the system must distinguish “passive” gaze at the ultrasound image from “active” gaze at a control, and the operator switches between the two), gaze tracking alone may not be a reliable source of control input. Therefore, we use a multimodal HCI, complementing eye gaze with a simple handheld controller. Unlike the PCP, a handheld controller is compact and flexible to use, and very little body movement is required for its operation. Thus, problem (II) can be addressed.

Fig. 2 Gaze-supported HCI demonstration

With the gaze tracker and the handheld controller, we can accomplish our design idea of supplementing or replacing the PCP. We propose to merge the PCP into the main screen and to use the gaze signal and the handheld controller for control inputs. With the PCP merged into the main screen, parameters can be adjusted by looking at the main screen only, and virtual buttons and menus on the main screen replace the physical ones on the PCP. Hence, switching attention is no longer a problem for the operator, and problem (III) is solved. However, we also do not want the virtual buttons and menus on the main screen to obstruct the US images. Therefore, a gaze-supported pop-up menu is designed that can be invoked or revoked by pressing a designated button on the controller. The pop-up menu appears at the gaze location for efficient setting of parameters at the point of attention.
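As an illustration of this behavior, the following Python sketch shows one plausible way to toggle a menu at the gaze point; the names (PopupMenu, toggle_menu, the clamping margin) are our own placeholders and are not taken from the actual implementation.

```python
from dataclasses import dataclass

@dataclass
class PopupMenu:
    x: int = 0
    y: int = 0
    visible: bool = False

def toggle_menu(menu: PopupMenu, gaze_x: float, gaze_y: float,
                screen_w: int, screen_h: int, menu_size: int = 200) -> None:
    """Invoke the menu centered at the gaze point, or revoke it if shown."""
    if menu.visible:
        menu.visible = False  # a second press of the designated button revokes
        return
    half = menu_size // 2
    # Clamp so the menu stays fully within the main screen.
    menu.x = int(min(max(gaze_x, half), screen_w - half))
    menu.y = int(min(max(gaze_y, half), screen_h - half))
    menu.visible = True
```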

HCI arrangement

To test our design ideas, we build the system whose component diagram is shown in Fig. 1a; the system setup is shown in Fig. 1c. The personal computer (PC) is the processing center that transmits the US images and control inputs between our HCI and the US machine. According to the study in [5], the following frequently used control inputs are implemented in our HCI: freeze, low-resolution zoom, and measurement. With these controls, we can test our design ideas, as our HCI can cope with a variety of routine US examinations.

As shown in Fig. 2a, the controller has a trackball and several buttons. The trackball makes it easy to move the cursor, and the buttons on the controller can be used in addition to the gaze signal to form a multimodal control input that avoids the “Midas touch” problem. The symmetrical design of the controller makes it easy for both left-handed and right-handed users. The key point is that the operator does not need to look at the controller; it can be operated by feel alone.

To adjust the above-mentioned parameters, two more key functionalities are introduced in our HCI alongside the gaze-supported pop-up menu: gaze-centered zooming and hybrid control mode.

Gaze-centered zoom is achieved by scrolling the trackball up or down after the pop-up menu has been invoked. Scrolling up zooms in on the image at the center of the pop-up menu, and scrolling down zooms out. The following procedure is used for gaze-centered zoom:

  1. Activate the pop-up menu at the point of gaze.

  2. Scroll the trackball up/down to zoom in/out of the image.

We can compare this with the low-resolution zoom of conventional HCIs, which follows this procedure:

  1. Reach for the knob/button for image zooming.

  2. Rotate/press the knob/button to zoom in or out of the image.

  3. Pan the image if necessary.

Based on this comparison, our design simplifies the operation of image zooming: no panning is needed, since the HCI knows the proper center for the zooming operation.
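A minimal sketch of why no panning is needed: if the displayed image follows a simple screen = offset + scale × image mapping, the offset can be updated so that the point under the menu center (the gaze point) stays fixed while the scale changes. The variable names below are illustrative and not taken from the actual system.

```python
def zoom_about_point(offset_x: float, offset_y: float, scale: float,
                     cx: float, cy: float, factor: float):
    """Return a new (offset_x, offset_y, scale) such that the image point
    currently displayed at screen position (cx, cy) remains at (cx, cy)."""
    new_scale = scale * factor
    # Image coordinates currently shown at the zoom center.
    img_x = (cx - offset_x) / scale
    img_y = (cy - offset_y) / scale
    # New offset that maps the same image point back to (cx, cy).
    new_offset_x = cx - new_scale * img_x
    new_offset_y = cy - new_scale * img_y
    return new_offset_x, new_offset_y, new_scale

# Example: zoom in by 20% about the gaze point (400, 300).
print(zoom_about_point(0.0, 0.0, 1.0, 400.0, 300.0, 1.2))
```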

The hybrid control mode is designed for pointing and selecting with our HCI. Because human eye movement has an inherent random component, most commercial gaze trackers have a tracking error of approximately \(1^{\circ }\) of visual angle [4]. Therefore, fine pointing and selecting based on gaze position alone are not practical. For that reason, we introduce a hybrid control mode. By pressing and holding the trigger button on the controller, the hybrid control mode is activated, and the cursor is set to the current gaze location. While holding the trigger button, the operator can move the trackball to fine-tune the location of the cursor. On release of the trigger button, a selection is made. This hybrid control mode allows the operator to quickly find the cursor and position it at the point of gaze.
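The following Python sketch captures the hybrid control mode as described above (warp to gaze on trigger press, trackball refinement while held, selection on release). It is a simplified model under our own assumptions about the event handlers; the actual implementation may differ.

```python
class HybridCursor:
    """Simplified state machine for the hybrid control mode."""

    def __init__(self):
        self.x, self.y = 0.0, 0.0
        self.engaged = False

    def on_trigger_press(self, gaze_x: float, gaze_y: float) -> None:
        # Warp the cursor to the current gaze estimate and begin fine-tuning.
        self.x, self.y = gaze_x, gaze_y
        self.engaged = True

    def on_trackball_move(self, dx: float, dy: float) -> None:
        # While the trigger is held, the trackball refines the position,
        # compensating for the roughly 1 degree gaze tracking error.
        if self.engaged:
            self.x += dx
            self.y += dy

    def on_trigger_release(self):
        # Releasing the trigger confirms the selection at the refined point.
        self.engaged = False
        return (self.x, self.y)
```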

Measurement of area or length can also be performed with the pop-up menu. Clicking the trigger button while the pop-up menu is invoked enters the measurement mode. On entering the measurement mode, the cursor changes color (to pink), as shown in Fig. 2b. Points can then be selected either via the hybrid control mode or by positioning the cursor with the trackball and clicking the trigger button to confirm each selection.
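The paper does not specify how the area is computed from the selected points; as one plausible approach, the selected boundary points can be treated as a polygon and its area obtained with the shoelace formula, as in the sketch below (the pixel spacing value is a made-up example).

```python
def polygon_area(points):
    """Area of a simple polygon given as [(x0, y0), (x1, y1), ...] vertices."""
    n = len(points)
    acc = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        acc += x0 * y1 - x1 * y0
    return abs(acc) / 2.0

# Example: points selected in pixel coordinates, converted to mm^2 using an
# assumed pixel spacing taken from the frozen image's scale.
pixels = [(120, 80), (200, 90), (210, 160), (130, 170)]
mm_per_px = 0.2
area_mm2 = polygon_area(pixels) * mm_per_px ** 2
```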

To freeze or unfreeze the image, the left button on the controller is pressed. Since freezing the US image is a time-sensitive operation, a dedicated button on the controller is assigned to it.

Figure 2b shows the main screen of our HCI with the pop-up menu invoked. The “M” symbol in the pop-up menu indicates the measurement mode. The “+” and “–” symbols indicate the zoom-in and zoom-out operations, respectively. When one of the operations in the pop-up menu is selected, the corresponding symbol is updated to reflect the operator’s selection. In this case, the zoom-in operation is selected, and the “+” symbol changes to, for example, “\(\times \,5.39\),” indicating that the image is zoomed to be around 5 times larger. The two concentric circles and a cross mark the gaze position. The status bar on the side shows the operation status of our HCI.

User study

In this section, we discuss the user study performed to compare our proposed HCI with a conventional US machine HCI. First, we specify our testing goal and hypotheses. Then, the user study is presented. Lastly, we analyze the data collected from the user study to test our hypotheses.

Goals and hypotheses

The goal of the user study is to test the design ideas of our HCI. Therefore, a comparative user study is performed with first-time users of our system. In our HCI prototype, we enabled only a limited set of menu selections; however, the US scanning scenarios achievable with these selections occur frequently. For evaluation, we select a typical tumor size measurement task. Similar measurement tasks are frequently performed during cardiac and fetal US scans. Efficiency and workload are the evaluation criteria for the HCIs. The metrics for assessing HCI efficiency and workload are the task completion time and the NASA task load index (TLI), respectively. The NASA TLI is a commonly used subjective survey for evaluating the workload of a given task [8, 9]. The following hypotheses are made:

$$\begin{aligned}&\mathcal {H}_1: \text {There is no difference in task completion time when using our HCI compared to the conventional HCI.}\\&\mathcal {H}_2: \text {There is no difference in workload when using our HCI compared to the conventional HCI.} \end{aligned}$$

Experiment

In the experimental setup, the SonixTOUCH US machine (BK Medical, Richmond, Canada) is used, both as a reference standard and to test our new multimodal, gaze-enabled prototype. As discussed in “Common HCI design features of US machines” section, the HCI method on the SonixTOUCH US machine for our US examination task is typical of commonly employed HCI designs. To achieve the connectivity and functionality of our HCI system as depicted in Fig. 1a, the Ulterius SDK and the Qt5 framework are used in our software. We employ the GP3 Eye Tracker (Gazepoint Research, Vancouver, Canada), which has an eye tracking accuracy of \(1^{\circ }\) of visual angle and a sampling rate of \(60\text{ Hz }\). A phantom (CAE, Montreal, Canada) with 5 ellipse-shaped tumors inside is used for the participants to perform the tumor size measurement task.

A total of 8 participants were involved in our experiment. They were first-time users of a US machine of any kind, had not used our HCI, and had no prior experience with gaze tracking. The Behavioural Research Ethics Board at the University of British Columbia approved our study (study number H15-02969-A004). The following protocol is followed with the participants.

First, participants are informed of the purpose and procedure of the experiment. A basic shooting game is used to familiarize each participant with the hybrid control mode. Once they are familiar with the gaze tracker and the hybrid control mode, demonstrations of how to use both our proposed HCI and the conventional HCI for tumor size measurement are presented. The tumor size measurement task is composed of three steps.

  1. Maneuver the US transducer on the phantom to look for tumors.

  2. Once a tumor is shown on the image, freeze the image, and then zoom the tumor to the center of the screen.

  3. Measure the area of the tumor.

Each participant is asked to perform 5 sets of measurements with each HCI. They are given 5 min to practice the tumor size measurement on each of the HCIs before their measurements are recorded for analysis. Once the participant freezes the image, the tumor measurement task completion time is recorded. After participants have completed all the measurement tasks, they are asked to play the shooting game again using each of the HCIs, and their performance is recorded to test the selection accuracy. Immediately after the game, the participants are asked to complete the NASA TLI form, and user feedback is collected.

Fig. 3 Demonstration of the shooting game

Fig. 4 One-way ANOVA for duration

Figure 3 shows the interface of the simple shooting game. There are 6 targets on the screen, and the user shoots a target by clicking near its center. Users are instructed to shoot all 6 targets in order. A new frame of randomly generated targets appears once the user has shot all current targets. The aim of this game is to familiarize the user with the hybrid control mode. Unlike the conventional method, where the cursor can only be moved continuously, the hybrid control mode enables the user to position the cursor at any location on the screen in an instant (i.e., from one target center to another) by gazing at the desired location. This shooting game can also be played without the hybrid control mode, in which case the cursor is operated with the trackball.
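A sketch of the hit test we assume the game uses: a click counts as a hit if it lands within a fixed radius of the target center, and the distance from the click to the center can serve as a simple selection-error metric. The radius value is illustrative.

```python
import math

def is_hit(click, target_center, radius: float = 30.0) -> bool:
    """True if the click lands within `radius` pixels of the target center."""
    return math.hypot(click[0] - target_center[0],
                      click[1] - target_center[1]) <= radius

def selection_error(click, target_center) -> float:
    """Distance (in pixels) from the click to the target center."""
    return math.hypot(click[0] - target_center[0],
                      click[1] - target_center[1])
```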

Data analysis and results

From the user study described above, the following data are collected: (1) the selection accuracy at the beginning of the experiment with our HCI, (2) the selection accuracy at the end of the experiment with our HCI, (3) the selection accuracy with the conventional HCI, (4) the task completion time with our HCI, (5) the task completion time with the conventional HCI, (6) the NASA TLI for our HCI, and (7) the NASA TLI for the conventional HCI. Based on these data, we can evaluate \(\mathcal {H}_1\) and \(\mathcal {H}_2\) accordingly. We choose a significance level of \(\alpha = 0.05\) for the hypothesis testing using one-way ANOVA. The MATLAB function “anova1” was used to generate the test results; outliers were detected and removed automatically when performing the test.
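The original analysis was done with MATLAB’s anova1; an equivalent test can be reproduced in Python with SciPy, as sketched below. The outlier rule here (dropping samples more than three standard deviations from the group mean) is our own assumption for illustration, not necessarily the rule used in the original analysis, and the completion times in the example are made up.

```python
import numpy as np
from scipy import stats

def drop_outliers(x, k: float = 3.0):
    """Remove samples more than k standard deviations from the group mean."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std(ddof=1)
    return x[np.abs(z) <= k]

def one_way_anova(group_a, group_b, alpha: float = 0.05):
    a, b = drop_outliers(group_a), drop_outliers(group_b)
    f_stat, p_value = stats.f_oneway(a, b)
    return f_stat, p_value, p_value < alpha  # True means reject the null

# Example with made-up task completion times (seconds):
ours = [14.1, 15.8, 16.2, 15.0, 14.9]
conventional = [19.5, 18.7, 20.1, 19.0, 18.9]
print(one_way_anova(ours, conventional))
```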

Analysis of HCI efficiency

The one-way ANOVA test is performed on the task completion time with each HCI to evaluate \(\mathcal {H}_1\). Figure 4a shows the result. At \(p = 3.00\times 10^{-4}\), the one-way ANOVA test rejects the hypothesis \(\mathcal {H}_1\). Thus, there is a difference in the task completion time when using different HCIs for the given task. Compared to the average task completion time of 19.2 s with the conventional HCI, our HCI reduces the average task completion time by 20.6% to 15.3 s. Furthermore, to test the effectiveness of the three key functionalities in our HCI, one-way ANOVA tests are also performed on the following hypotheses:

$$\begin{aligned}&\mathcal {H}_3: \text {There is no difference in the time to operate the zooming function when using our HCI compared to the conventional HCI.}\\&\mathcal {H}_4: \text {There is no difference in the time to perform tumor size measurements when using our HCI compared to the conventional HCI.} \end{aligned}$$

With \(p = 1.17\times 10^{-7}\), the one-way ANOVA test rejects \(\mathcal {H}_3\). Hence, the duration of the zooming operation with our HCI (average = 4.8 s) is reduced by \(35.7\%\) compared to that with the conventional HCI (average = 7.4 s). The result also confirms that the gaze-supported pop-up menu and the gaze-centered zooming are efficient designs. With \(p = 0.10\), the one-way ANOVA test fails to reject \(\mathcal {H}_4\), i.e., our HCI takes about the same amount of time as the conventional HCI when carrying out the measurement task. Hence, we claim that the hybrid control mode for measurement performs as well as the conventional HCI.

Fig. 5 One-way ANOVA for measurement and selection errors

In addition to the task completion time, measurement accuracy is another factor in evaluating the efficiency of the HCI. Therefore, the following hypothesis needs to be evaluated:

$$\begin{aligned}&\mathcal {H}_5: \text {There is no difference in measurement accuracy with different HCIs during the user study.} \end{aligned}$$

Considering image noise and the fact that the apparent area of the tumor differs when scanned from different angles, we cannot use the actual size of the tumor as the reference standard for measurement accuracy. Instead, we use the images recorded during the experiment after the user has frozen the image. The edge of the tumor on the recorded image is manually traced by an expert, and the area of the traced shape is used as the ground truth for evaluating measurement accuracy. Since the tumors on the images are of different sizes and are zoomed to different scales, rather than using the absolute difference between the measured size and the ground truth, we employ the relative measurement error, calculated as:

$$\begin{aligned} \text{relative measurement error} = \frac{|\text{measured area} - \text{ground truth}|}{\text{ground truth}}. \end{aligned}$$
(1)
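As a direct translation of Eq. (1) (the numbers in the example are made up):

```python
def relative_measurement_error(measured_area: float, ground_truth: float) -> float:
    """Relative measurement error as defined in Eq. (1)."""
    return abs(measured_area - ground_truth) / ground_truth

print(relative_measurement_error(4.8, 5.2))  # ~0.077, i.e., about a 7.7% error
```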

We use the relative measurement error for the one-way ANOVA test of hypothesis \(\mathcal {H}_5\). At \(p = 0.12\), the one-way ANOVA test fails to reject \(\mathcal {H}_5\), suggesting that the measurement accuracy is not affected by the differences in HCIs during the experiment. Figure 5a shows the test result.

Furthermore, we evaluate the accuracy of pointing and selecting using the hybrid control mode, compared to the conventional method. The selection accuracy in the shooting game allows us to perform this comparison. Therefore, the hypothesis below is proposed:

$$\begin{aligned}&\mathcal {H}_6: \text {There is no difference in the accuracy of pointing and selecting with different HCIs during the user study.} \end{aligned}$$

To test the hypothesis \(\mathcal {H}_6\), the one-way ANOVA test is carried out on three data sets: the selection accuracy at the beginning of the experiment with our HCI, the selection accuracy at the end of the experiment with our HCI, and the selection accuracy with the conventional HCI. Figure 5b shows the results. With \(p = 0.27\), the one-way ANOVA test fails to reject the hypothesis \(\mathcal {H}_6\), indicating that making selections with our HCI is as accurate as with the conventional HCI. Hence, we can verify that measurements made with each of the HCIs are equally accurate.

Analysis of HCI workload

To evaluate the hypothesis \(\mathcal {H}_2\), the NASA TLI data for each of the HCIs need to be evaluated. The NASA TLI divides the workload required for a given task into 6 components: mental demand, physical demand, temporal demand, performance, effort, and frustration. A NASA TLI evaluation assigns a score to each workload component, and the overall workload of the task is calculated as the sum of the component scores. Figure 6 shows the bar plot of the average score for each workload component. The large error bars indicate large individual differences in evaluating the workload. Nevertheless, Fig. 6 reveals that the workload is distributed differently between the two HCIs. The conventional HCI requires more physical activity, whereas our gaze-based HCI demands more mental activity and is more frustrating to the users. This result is in accordance with the user feedback at the end of the user study. When the overall NASA TLI scores are passed to the one-way ANOVA test, it fails to reject the hypothesis \(\mathcal {H}_2\) with \(p = 0.75\), suggesting that our HCI does not impose excessive workload on the operator compared to the conventional HCI. This result also suggests that our design of the gaze-supported HCI is intuitive, causing no extra effort for the operator to interact with the US machine. Figure 7 shows the ANOVA test result for hypothesis \(\mathcal {H}_2\).
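A small sketch of the overall-score computation as described above, i.e., an unweighted sum of the six component ratings; the example ratings are made up.

```python
TLI_COMPONENTS = ("mental demand", "physical demand", "temporal demand",
                  "performance", "effort", "frustration")

def overall_tli(ratings: dict) -> float:
    """Sum the six component scores into a single overall workload score."""
    return sum(ratings[c] for c in TLI_COMPONENTS)

example = {"mental demand": 55, "physical demand": 20, "temporal demand": 40,
           "performance": 35, "effort": 45, "frustration": 30}
print(overall_tli(example))  # 225
```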

Fig. 6 NASA TLI score for each workload component

Fig. 7 One-way ANOVA on NASA TLI overall score

Based on the user feedback, there is still room for improvement in our gaze-supported HCI. The most commonly reported problem concerns the use of the gaze tracker. Out of the 8 participants in our user study, 3 needed to re-calibrate the gaze tracker during the experiment due to observed deterioration in tracking accuracy. As for the calibration of the gaze tracker at the beginning of the experiment, 7 participants repeated the calibration, and one needed to calibrate three times to achieve a desirable tracking accuracy. Repeated calibrations take time and may hinder the workflow for US machine operators. Another reported issue with our HCI arises from natural eye blinking. Since the gaze location is lost when the operator blinks, and the gaze tracker takes some time to resume tracking after the operator reopens his or her eyes, large disturbances in tracking accuracy can occur.

Conclusions

In this paper, we present a novel approach to user–machine interaction on US machines. Statistical results indicate that our HCI can reduce the task completion time while maintaining the overall workload on the operators. Three novel functionalities were implemented in our HCI: gaze-centered zooming, a gaze-supported pop-up menu, and a hybrid control mode.

However, to test the design ideas, only a limited number of control inputs were enabled in our HCI. Our future work will consider a larger set of control inputs. The reported problems of repeated calibrations and loss of gaze position during eye blinking need to be solved. A delay is also observed in refreshing the US images in our HCI. Since all participants are asked to freeze the image before the measurement, the delay does not affect performance of the given task. Yet, for clinical practice, the delay in our HCI needs to be minimized.