1 Introduction

In recent years, the growing shortage of skilled labor [7] in manufacturing and logistics environments has led to an increased reliance on autonomous mobile robots (AMRs) to fill the gap. This growing AMR market [11] creates more hybrid environments where humans and robots coexist. These robots support industrial processes through their flexibility and decentralization, which make them crucial components of the fourth industrial revolution.

While integrating robots in these hybrid environments brings numerous benefits, it also presents challenges. One such challenge is ensuring smooth and efficient interactions between AMRs and human workers, which requires effective communication. A key aspect of this communication is the ability to anticipate the navigational intent of the AMR, as it allows humans to make informed decisions and adapt their behavior accordingly. Improving this communication is vital for enhancing system efficiency and the acceptance of robots. However, the communication tools of current industrial AMRs do not yet fulfill these communication needs. They primarily rely on car-like indicators or a projected ‘floor spot’ in front of the robot. Additionally, the movement of the robot itself is an implicit communication tool, whether intentional or not.

2 State of the Art

Engineers and designers face many choices on how to convey information to humans. Dey et al. [2] provide a taxonomy to distinguish design decisions. Designers need to decide whether only one addressee is targeted (unicast), everyone is targeted with unspecific messages (broadcast), or everyone is individually addressed (multicast). At the same time, designers must carefully choose the perspective a message is explained from: Is the message clearer from the robot’s (allocentric) or the pedestrian’s (egocentric) perspective? In all cases, designers need to ensure that their messages cannot be misunderstood as coming from the opposing perspective or as addressing the wrong person.

A choice commonly discussed in the context of external HMIs (eHMIs) for autonomous driving is the choice between an explicit and an implicit mode of communication, as in [15]. While explicit communication uses additional signaling, implicit communication is part of an agent’s behavior itself. A common argument favoring implicit communication is that human navigation mainly relies on it. At the same time, technical systems are said to require explicit messaging because they cannot express themselves like humans, and explicit messages might extend the system’s capabilities, which are limited when relying on movement alone [15].

Dragan et al. [4, 6] describe the differences between functional, predictable, and legible motion. While functional motion just requires the robot to reach its goal, predictable motion is created by moving according to the observer’s expectations, given that the observer knows the robot’s goal. Predictable motion is achieved by selecting the most efficient path [6]. Legible motion, in contrast, is motion that makes the robot’s goal easier for an observer to infer. It is achieved by selecting the path that maximizes the probability that the observer infers the goal given the path [6].
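In symbols, the two notions can be sketched as follows. The notation is an assumption introduced here for illustration and is not reproduced from [6]: \(\xi\) denotes a trajectory, \(S\) the start, \(Q\) the robot's current configuration, \(G\) the goal, \(\Xi_{S \to G}\) the set of trajectories from \(S\) to \(G\), and \(C\) a cost functional that encodes what the observer regards as efficient.

```latex
% Predictable motion: the most efficient (lowest-cost) trajectory to a known goal G
\xi^{*}_{\mathrm{pred}} = \arg\min_{\xi \in \Xi_{S \to G}} C[\xi]

% Legible motion: the trajectory whose observed part makes the goal G most probable
\xi^{*}_{\mathrm{leg}} = \arg\max_{\xi \in \Xi_{S \to G}} P\left(G \mid \xi_{S \to Q}\right),
\qquad
P\left(G \mid \xi_{S \to Q}\right) \propto P\left(\xi_{S \to Q} \mid G\right)\, P(G)
```

The proportionality on the right is Bayes' rule: the observer weighs how plausible the observed trajectory segment is for each candidate goal. This shows why a legible path may deliberately deviate from the most efficient one.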

When using driving behavior or the dynamic HMI (dHMI) [1] as a communication tool, there are two groups of approaches to constructing it. Firstly, specific situations can be addressed by specific behaviors. Examples of these are a “back-off” movement to clarify the priority in bottleneck scenarios with AMRs [10] or lateral movements and changes in driving speed in bottleneck scenarios for autonomous vehicles [12]. The second group is formed by human-aware navigation planners that attempt to create legible trajectories by implementing perception and prediction of humans in a mobile robot’s surroundings. An analysis of these techniques can be found in [13]. Both approaches promise to improve the interaction with humans. It is not clear, however, which is best suited for communicating the navigational intent of industrial AMRs.

Despite research in road traffic and robot interactions, we found no general consensus on which communication approach to use for AMRs. It is therefore necessary to further evaluate how the choice of communication mode influences the interaction with AMRs. This paper compares an implicit and an explicit communication approach for communicating the navigational intent of AMRs in corridor-style situations. We aim to address the following research questions:

  • RQ1: Which communication approach is best suited for communicating the future trajectory of industrial AMRs in terms of usability?

  • RQ2: How do the communication approaches influence participants’ expectations of the robot’s future trajectory?

By addressing these research questions, we seek to contribute to the development of better communication tools for human-robot interaction of intralogistics AMRs to increase their usability.

3 Methods

Two communication approaches were compared: an explicit one (A) and an implicit one (B). The comparison was performed in a crossover trial in which half of the participants experienced approach A first (sequence AB) and the other half approach B first (sequence BA). This design helped to avoid undetected carry-over effects that can occur in a within-subject design while yielding more qualitative feedback than a between-subjects design.

The tools were compared in three scenarios where the mode of communication and robot behavior varied: road crossing, intersection encounter, and bottleneck (see Fig. 1). Each participant experienced all three scenarios in the same order with one of the communication interfaces (A or B) in the first period and then the same sequence with the other mode of communication in the second period. In each scenario, there were two randomly assigned variants, such as driving straight or turning right at an intersection (see Fig. 2), to reduce sequence effects. Participants were led through the scenarios by assigning them brief picking tasks with a defined start and end point. This simulated the workload of workers and was used to synchronize the interaction repeatably.
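The trial plan described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the script used in the study: the function name `build_schedule`, the seed, and the choice to draw one of the two variants independently for every trial are hypothetical, since the text does not specify how variants were randomized.

```python
import random

# Scenario order is fixed for every participant, as described in the text.
SCENARIOS = ["road crossing", "intersection encounter", "bottleneck"]


def build_schedule(n_participants=32, seed=0):
    """Return one dict per trial for a balanced AB/BA crossover plan.

    Assumption: n_participants is even, and the variant (1 or 2) is drawn
    independently per trial. The study's actual randomization may differ.
    """
    rng = random.Random(seed)
    # Half of the participants start with tool A (AB), half with B (BA).
    sequences = ["AB", "BA"] * (n_participants // 2)
    rng.shuffle(sequences)

    schedule = []
    for participant, sequence in enumerate(sequences, start=1):
        for period, tool in enumerate(sequence, start=1):
            for scenario in SCENARIOS:
                schedule.append({
                    "participant": participant,
                    "sequence": sequence,
                    "period": period,
                    "tool": tool,
                    "scenario": scenario,
                    "variant": rng.choice([1, 2]),
                })
    return schedule
```

With 32 participants, two periods, and three scenarios, the plan contains 32 × 2 × 3 = 192 trials.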

Fig. 1.

Overview of interaction scenarios: intersection, crossing and bottleneck. ‘R’ marks the robot’s starting points and ‘P’ the participants’. Robot paths are shown in blue and approximate participant paths in dotted black. (Color figure online)

For each communication approach (explicit or implicit), a communication tool prototype was developed, communicating in a broadcast manner using allocentric (from the robot) messages (see Fig. 2). Both were implemented on the same Innok Heros AMR that drove on a predefined trajectory through ROS’s geometry twist parameters.

As an explicit interface (‘A’ in Fig. 2), the mobile robot was equipped with a short-throw projector (LG Allegro 2.0). A script running on a Raspberry Pi 4B computed, from the geometry twist parameters, an image of the predicted robot path, which the projector displayed on the floor (see Fig. 2).
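How a path can be predicted from twist values is sketched below. This is a minimal illustration, not the study's script: it assumes a unicycle (differential-drive) model with constant linear velocity `v` and angular velocity `omega`, and the function name `predict_path`, the 3 s horizon, and the number of samples are hypothetical. Rendering the points into a projector image is omitted.

```python
import math


def predict_path(v, omega, horizon=3.0, steps=30):
    """Predict (x, y) points in the robot frame for a constant twist.

    v:      linear velocity in m/s (x axis points forward)
    omega:  angular velocity in rad/s (positive turns left)
    Assumes a unicycle model; the robot starts at the origin heading along x.
    """
    points = []
    for i in range(1, steps + 1):
        t = horizon * i / steps
        if abs(omega) < 1e-9:
            # Straight-line motion.
            x, y = v * t, 0.0
        else:
            # Circular arc with signed radius v / omega.
            radius = v / omega
            x = radius * math.sin(omega * t)
            y = radius * (1.0 - math.cos(omega * t))
        points.append((x, y))
    return points
```

For example, `predict_path(0.5, 0.0)` yields points along a straight line ending 1.5 m ahead of the robot, while a non-zero `omega` bends the path into an arc that could then be drawn onto the floor image.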

The pre-programmed movement trajectories for the intent-expressive implicit communication tool were derived from theory-driven principles. The principles of generating legible motion by Dragan et al. [4,5,6], results of studies in the realm of human factors in autonomous driving research like [12], as well as common conventions in intralogistics like right-hand traffic were considered. The trajectory for the intersection scenario can be seen in Fig. 2.

The experiment was intended to answer two research questions. For RQ1, which addresses the comparison of usability, three two-sided hypotheses were formed, one for each dimension of usability as defined in the ISO standard [3] (effectiveness, efficiency, and satisfaction). The task completion time was computed as the time from the beginning of the robot movement to the moment participants arrived at the picking task target. With the same task for all participants, completion time is a viable measure of efficiency. Legibility was derived from the legibility section of Dragan’s questionnaire [4]. Legibility is defined as making the intent inferable, so it can be regarded as a measure of communication effectiveness. For satisfaction, a trustworthiness score was gathered using the trust in automation questionnaire by [9]. Although the paper mentions the ambiguity of compiling a unitary trust score, an average value of the three trustworthiness dimensions was compiled for comparability. While trust only represents a part of the entire scope of satisfaction, it was chosen for its critical role in acceptance. The resulting hypotheses are:

  • There is no difference in...

  • H\(_{1, 0}\) (Efficiency): ...task completion time...

  • H\(_{2, 0}\) (Effectiveness): ...legibility according to the corresponding section in Dragan’s questionnaire [4]...

  • H\(_{3, 0}\) (Trust): ...the trustworthiness score derived from Körber [9]...

  • ...when crossing paths with an AMR using communication tool A as opposed to communication tool B.

To evaluate the hypotheses of RQ1, Grizzle’s [8] approach to evaluating cross-over experiments was used.
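A two-stage analysis in the spirit of Grizzle's approach for an AB/BA crossover can be sketched as follows. This is an illustrative sketch, not the analysis code of the study: the function names `pooled_t` and `grizzle_two_stage` are hypothetical, equal variances are assumed, and only the t statistics and degrees of freedom are returned. The p-values would follow from a t distribution with `df` degrees of freedom.

```python
import math
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return t, df


def grizzle_two_stage(ab, ba):
    """ab, ba: lists of (period1, period2) measurements per participant.

    Sequence AB received tool A in period 1, sequence BA received tool B.
    """
    # Stage 1: carry-over test on the per-participant sums.
    carry_t, df = pooled_t([p1 + p2 for p1, p2 in ab],
                           [p1 + p2 for p1, p2 in ba])
    # Stage 2a (no carry-over): treatment test on the period differences.
    diff_ab = [p1 - p2 for p1, p2 in ab]   # A minus B
    diff_ba = [p1 - p2 for p1, p2 in ba]   # B minus A
    treat_t, _ = pooled_t(diff_ab, diff_ba)
    effect = (mean(diff_ab) - mean(diff_ba)) / 2  # estimated A minus B
    # Stage 2b (carry-over present): compare first-period data only.
    first_t, _ = pooled_t([p1 for p1, _ in ab], [p1 for p1, _ in ba])
    return {"carry_over_t": carry_t, "treatment_t": treat_t,
            "effect_A_minus_B": effect, "first_period_t": first_t, "df": df}
```

If the carry-over test is not significant, the within-subject comparison (`treatment_t`) is used. Otherwise only the first period is analyzed as a between-subjects comparison (`first_period_t`), which costs statistical power.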

RQ2 was intended to find differences in reception and to identify issues of both designs for future research. Behavior was classified by evaluating camera recordings, expected behavior was collected by letting participants draw the paths they expected, and semi-structured interviews were used to reconstruct participants’ thought processes. This approach allowed us to relate quantitative results to potential key issues.

The 32 participants were young individuals (mean age = 27.9, SD = 6.6) from mainly technical and university backgrounds. 62.5% of participants were female, 37.5% male. Before the study, written approval was obtained from an ethics committee (2022-655-S-KH).

Fig. 2.

Explicit (A, left) and implicit (B, right) communication tools compared in this study, shown by way of example for the intersection scenario. The two dotted lines (for B) are the two trial variants.

Table 1. Influence of mode of communication on measured quantities

4 Results

First, the existence of a carry-over effect between the test periods was tested. All hypotheses were tested using t-tests or, where t-test assumptions were violated, the Wilcoxon rank-sum test (results marked with “W”). Significant results (\(\alpha = 0.05\)) are marked in bold. The results are listed in Table 1. In summary, there was a significant effect for both legibility (H2) and trustworthiness (H3), while no effect could be found for the task completion time (H1).
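The rank-sum fallback can be illustrated with a short standard-library sketch. It is a simplified illustration rather than the procedure used for Table 1: it uses the normal approximation without tie or continuity correction, and the function names `midranks` and `rank_sum_test` are hypothetical. For small samples, an exact test from a statistics package would be preferable.

```python
import math


def midranks(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (W, z, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])                      # rank sum of the first sample
    mu = n1 * (n1 + n2 + 1) / 2              # expected W under H0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided normal tail
    return w, z, p
```

In a crossover analysis, `x` and `y` would be the same per-participant sums or period differences of the two sequence groups that a t-test would otherwise compare.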

The drawings of participants’ expectations were quantitatively clustered, and all images were overlaid. Figure 3 summarizes the results compiled from participants’ expectations. Each 2\(\,\times \,\)2 square composed of four subimages shows one scenario, with all combinations of scenario variations (periods 1 and 2) and mode of communication (A and B). Each image contains all drawings superimposed, clustered with frequency-size coded bubbles, and overlaid by a bar plot indicating the frequency of correct predictions.

Fig. 3.

Results of compiled expectations. Subimages show all overlaid expectations, superimposed with a bar plot of correct (yellow) and false (blue) expectations. (Color figure online)

The issues participants faced were gathered from the semi-structured interviews. The answers were transcribed and classified using a mixed inductive-deductive coding approach as in [14]. Of all the issues collected (n = 129), most (n = 26) criticized the implicit interface’s lack of communication. The motion of the implicit communication tool was also described as incomprehensible or unsafe (n = 16). A further 16 mentions concerned the robot’s missing reaction, for both the implicit (n = 11) and the explicit (n = 5) interface. The explicit interface was most commonly criticized for the uncertainty of the green color’s meaning (n = 14; precedence for robot or human?). Quality issues of the projection (n = 19) were also commonly expressed: lag (n = 10), flickering, barely visible projection, and invisible projection (each n = 3).

5 Discussion

5.1 Findings

The findings of our study indicate that explicit communication outperformed implicit communication in terms of legibility and trust. This suggests that explicit communication is the preferable choice for the investigated corridor-style scenarios in intralogistics. As implicit communication was described as lacking communication, it was possibly implemented too subtly or in the wrong way. Furthermore, the critique voiced about uncoordinated behavior might indicate that implicit motion requires more precise control systems. Perceived nonresponsiveness calls for designing systems that are more interactive, which probably requires dynamically generated trajectories and thus the robot’s ability to sense pedestrians and react accordingly.

For the explicit interface, the uncertainty of the meaning of the projection path’s green color supports a potential ambiguity between allocentric and egocentric messages that needs to be considered in explicit communication tools. The comments criticizing the unresponsiveness of the system uncover a need for designing the navigation and resulting projection to be dynamic. These findings call for further exploring implications and expectations associated with different communication methods.

In summary, our study revealed that implicit communication of the future trajectory was more challenging to interpret, while explicit communication carried a risk of unintended interpretation. Regarding usability metrics, we found that explicit communication led to higher satisfaction, inferred from trust, and improved communication effectiveness, inferred from legibility. Notably, there was no significant difference in participants’ efficiency, as deduced from their task completion time. When further comparing the two communication tools used here, one should note that the projection hardware incurs a higher direct cost per robot and consumes considerably more power, possibly making it a less viable option for companies using AMRs.

5.2 Limitations

Neither communication tool used in our study was fully developed or optimized. Therefore, the comparison between explicit and implicit communication should be interpreted cautiously, as it primarily compares a trajectory floor projection with one implementation of driving behavior. Secondly, our study focused on corridor-style interactions and did not involve open spaces, so the findings cannot be extrapolated to such scenarios. Other limitations include lag in the projection system, which was noticed by some participants (n = 10), and visibility issues with the floor projection due to poor lighting conditions (n = 6). Additionally, in some instances (n = 10 out of 192 trials), the trials were interrupted by manual emergency stops because the robot came too close to participants. As a result, the evaluated system comprised the robot together with the stop operator rather than the robot alone. Furthermore, our participant pool of young individuals with primarily technical backgrounds may have influenced the results. Future studies could include participants with diverse age profiles and professional backgrounds that are more representative of the real worker population to obtain a more comprehensive understanding.

The measurement of the interaction duration was possibly not precise enough; with a more precise measurement, differences in efficiency might have been significant. For a decision in an industrial application context, long-term learning effects are also relevant, as workers will have frequent contact with AMRs in their daily work.

5.3 Outlook

Areas for future research include improving current and developing new communication tools and interaction strategies for human-robot interaction in intralogistics, as well as extending the study setup to open-space scenarios with autonomous navigation implemented. Furthermore, the entire system’s efficiency (human + robot) should be considered, not only the human’s.