1 Collaborative Work, Teams, Teamwork, and Taskwork

Joint work and the division of work in groups, organizations, or teams are becoming increasingly important in everyday work and organizational contexts. Assigning the completion of a task to several entities can have immense benefits, especially for complex tasks that require a wide range of subject matter experts, knowledge, or skills to complete (Higgs et al. 2005). Joint work is not limited to the level of individuals but can also occur between groups, organizations, or teams (e.g., joint work of an environmental organization with a city or a government). The purpose of joint work is the mutual pursuit of one or more specific goals. To reach these common goals, all entities must actively engage in joint activities. This evolving process between social entities is defined as collaboration (Bedwell et al. 2012). Collaboration can therefore take place between individuals, groups, organizations, teams, and so on. In contrast, teamwork only includes the joint work between individuals in teams and thus represents a specific form of collaboration (Bedwell et al. 2012).

The label “team” usually refers to the association of at least two individuals who are pursuing one or more common goals, often within an organizational framework. To achieve these goals, teams must interact dynamically and adaptively (Kluge 2021; Morgan et al. 1986; Salas et al. 1992, 2015). This interaction is characterized by a high degree of interdependence among the team members, since achieving the common goals requires the execution of several subtasks, work steps, actions, or operations that have been distributed among different team members. In organizational contexts, teams are usually formed to perform specific tasks.

Here, one can distinguish between actions that are directed at executing the task itself and actions or processes that do not directly address task execution but are indispensable for several people working together effectively. Actions that specifically address task execution, such as interacting with tools, machines, or other equipment to process a workpiece, or executing and adhering to procedures and rules to complete a sales process, are generally referred to as taskwork (Kluge 2021). Taskwork therefore encompasses all the functions that each team member must perform to reach the common team goal(s) (Mathieu et al. 2008). When several people work together, however, it is not only about performing tasks. In order to work effectively as a team, the team members must interact with each other and “share knowledge, coordinate behaviors, and trust one another” (Salas et al. 2015, p. 600). These interactions, relationships, and interdependencies among the members of the team are generally referred to as teamwork (Kluge 2021). Taskwork and teamwork are intertwined, so that each benefits from the effectiveness of the other (Salas et al. 2015). For instance, imagine a pilot/co-pilot team whose goal is to land safely at the destination airport. If both focused only on operating their assigned instruments (taskwork) without interacting with each other or the tower (teamwork), a safe landing would be at greater risk than if they worked together. If the team focused solely on teamwork, the result would likely also be disastrous. As this example shows, teamwork is essential “for effective team performance, as it defines how tasks and goals are accomplished in a team context” (Salas et al. 2015, p. 600).

Teamwork encompasses a variety of processes, including communication, cooperation, and coordination: three essential processes for team behavior (Hagemann and Kluge 2017; Kozlowski and Ilgen 2006) that are strongly linked to team effectiveness (Brannick et al. 1995). Broken down to its essence, communication describes the exchange of information between two persons, most commonly called sender and receiver. In relation to team communication, this definition may need to be broadened to include that an information exchange can shape and change team attitudes, behaviors, and cognitions (Salas et al. 2015). Since, as already outlined, teams pursue one or more common goals, they need to work purposefully and cooperate. Team cooperation is decisively influenced by the attitudes, beliefs, and feelings of each team member (Salas et al. 2015), which determine whether or not the team member will engage in and support the collective goals (Sinclair 2003). Cooperation can be proactive, in that the team member intentionally does something that is conducive to attaining the common team goal (e.g., offering support to other team members). Cooperation can also be more implicit, in that the team member refrains from doing certain things that could impede the attainment of the common goal (e.g., putting their own interests aside in favor of team success) (Sinclair 2003). However, in order to work together effectively and efficiently as a team, it is not enough to communicate and cooperate; all related processes must also be adequately coordinated, meaning that all team- and taskwork-related actions, operations, processes, work steps, subtasks, etc., must be orchestrated in sequence and time (Marks et al. 2001). This can be done either explicitly, in that team members intentionally use processes such as communication for coordination purposes, or implicitly, by anticipating the team needs “and dynamically adjusting their behaviors accordingly without having to be instructed” (Rico et al. 2008, p. 164). The aforementioned processes of communication, cooperation, and coordination are central to taskwork and teamwork, as they determine how the team inputs (e.g., resources and knowledge) are converted into team outputs (e.g., satisfaction, performance, quality, and accuracy) (Driskell et al. 2018).

Seen from another perspective, good communication, cooperation, and coordination are prerequisites for a team to be content and perform successfully. Therefore, these processes have been studied extensively in team research, and many scientifically based training and development approaches exist (for an overview of existing training approaches, see Goldstein and Ford (2002)). In addition to optimizing the aforementioned processes by means of training, various tools and, above all, technologies can support team communication, coordination, and cooperation. For example, one of the simplest technologies, used for decades and still in use, is the telephone, which simplifies communication (not only) between and among team members. Currently, teams also have the option of using more advanced technologies such as email, messaging, or video conferencing systems as standard, depending on the communication requirements. A relatively recent approach to supporting teams in their teamwork processes is the use of augmented and mixed reality (AR/MR), although the development of this technology for the multi-user context started more than 20 years ago (Sereno et al. 2020). The next section provides a broad overview of existing possibilities to support teams using AR.

2 Supporting Teams Using AR—An Overview

By the late 1990s, the first researchers had identified the potential of AR and MR technology for application in the research area of computer-supported cooperative work (CSCW). In CSCW, collaboration is often classified into four categories along the dimensions of time and place, following the CSCW matrix (Fig. 1) introduced by Johansen (1998). The time dimension is denoted as either synchronous or asynchronous. In synchronous interaction, the information exchange (as outlined above) happens instantaneously, whereas asynchronous interaction takes place with a certain time delay. An example of synchronous collaboration is direct interaction via speech (including telephone or video calls), whereas the use of emails is an example of asynchronous collaboration. The place dimension denotes the actual spatial location of each collaborator. In remote collaboration, collaborators are in different geographical locations, whereas in a co-located situation the collaborators work together in the same physical space (e.g., the pilot/co-pilot example).

Fig. 1 CSCW matrix, as introduced by Johansen (1998)
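
The two dimensions of the matrix can be written down as a small lookup. The following Python sketch is purely illustrative: the names `Time`, `Place`, and `cscw_quadrant` are hypothetical and do not come from any of the cited systems, and the place assigned to the telephone and email examples is an assumption, since the text above classifies them only along the time dimension.

```python
from enum import Enum


class Time(Enum):
    SYNCHRONOUS = "synchronous"
    ASYNCHRONOUS = "asynchronous"


class Place(Enum):
    CO_LOCATED = "co-located"
    REMOTE = "remote"


def cscw_quadrant(time: Time, place: Place) -> str:
    """Return a label for one of the four cells of the time/place matrix."""
    return f"{time.value}, {place.value}"


# Examples from the text; the place of the call and email examples is assumed.
examples = {
    "pilot/co-pilot in the cockpit": (Time.SYNCHRONOUS, Place.CO_LOCATED),
    "telephone or video call": (Time.SYNCHRONOUS, Place.REMOTE),
    "email exchange": (Time.ASYNCHRONOUS, Place.REMOTE),
}

for name, (time, place) in examples.items():
    print(f"{name}: {cscw_quadrant(time, place)}")
```

Keeping time and place as separate enumerations mirrors the matrix: any collaboration scenario is described by one value from each dimension, which yields exactly four cells.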

As AR allows a seamless merging of the users’ real and virtual environments (Billinghurst et al. 2014), researchers considered the potential in collaboration by

  • integrating the user’s real environment in remote collaboration and

  • enhancing co-located collaboration by integrating virtual content.

As synchronous collaboration appears to benefit the most from these two enhancements, most research concerning AR in collaboration has addressed synchronous scenarios with only a few exceptions, such as the work presented by Irlitti et al. (2016), who proposed alternative opportunities for asynchronous collaboration using AR.

The factor of place has been represented equally in research. One of the first examples of a co-located application was the “Studierstube” by Szalavári et al. (1998a), an application that used AR to display the same 3D graph for both team members. The application allowed both collaborators to interact via a 3D mouse, and the authors showed that it was superior to a traditional desktop environment. Kuzuoka (1992) evaluated a system that used AR to share the view between two remote team members and allowed seamless collaboration by not only ensuring verbal communication but also sharing the gestures of the team members.

Apart from time and space, symmetry is another factor in AR-supported collaboration describing the team members’ authority or role, as defined by Feld and Weyers (2019). If every team member has the same authority and role, the collaboration is denoted as “symmetric.” An early example was the already mentioned “Studierstube,” as well as another work by Szalavári et al. (1998b), who proposed a concept where multiple users could have a high-quality video-game experience using AR, such as for playing mah-jongg, which was well accepted by the users.

In contrast, Bauer et al. (1999) evaluated an asymmetric AR teleconference system, in which one expert guided a field worker through a wiring task using a shared pointer, and found increased efficiency in the collaboration task.

In general, the early research considered solving technical challenges and system architecture for possible AR-collaboration applications, and when the hardware became less of an obstacle, the focus shifted more to the development of techniques and applications. A recent example was the work by Piumsomboon et al. (2018a), in which a small avatar (i.e., “Mini-Me”) of another team member was displayed and dynamically positioned in case the team member was not in view. This avatar was found to improve the performance of the collaboration and increased the social presence for the team member assisted by the avatar. This work was also aligned with another recent topic in AR collaboration: the virtual representation of remote team members. For example, Yoon et al. (2019) investigated a possible effect of different avatar representations on social presence in a remote collaboration application using AR. This topic was not exclusive to AR but was also important for VR collaboration.

Due to the increasing power of the available hardware, the concept of sharing a user’s real environment in 3D and real time was investigated. For example, in their work Superman versus Giant, Piumsomboon et al. (2018b) investigated the concept of sharing a user’s real environment via an unmanned aerial vehicle (UAV) to a remote VR user. This user could then traverse this reconstructed environment either via flying or increasing their size to increase the remote user’s spatial awareness and assist the local user more efficiently.

In their literature review, Ens et al. (2019) identified five research opportunities they expected to play a major role in future research:

  • complex collaboration structures in time, space, and symmetry

  • convergence and transitional interfaces

  • empathic collaboration

  • collaboration beyond physical limits

  • social and ethical implications

Most research today investigates or proposes solutions for a single factor of the dimensions time, space, and symmetry. However, it is likely that future research will surpass these boundaries and investigate how to support synchronous and asynchronous collaboration at the same time or how to allow co-located as well as remote team members to collaborate. Furthermore, a mixture of various user roles should be expected, such as in a virtual conference, where the current session chair, the presenter, and the audience all have different user roles that can change over the course of the conference. The work at hand focuses specifically on these three aspects and presents a use case, concept, and solution for such issues.

Due to the limits of current devices, users are usually bound to one technology (e.g., AR or VR) throughout the collaboration. This, however, is likely to change in the near future, as the industry is developing devices that support both AR and VR, such as the Lynx R1 (https://www.lynx-r.com/) or the Meta Quest devices (https://www.oculus.com/). This would allow users not only to use a single device for both AR and VR but also to switch seamlessly between the two during the collaboration.

Regarding the perception of other team members, current research has often considered direct task-related cues, such as pointing gestures and gaze direction. With additional hardware, such as heart-rate sensors and face cameras, various emotions of a user could be recorded and shared with the team members, which could allow for higher empathy among team members.

Another research opportunity is expanding collaboration beyond physical limits, where researchers do not recreate physical, co-located collaboration but implement features that would be impossible in a real collaboration. For example, the aforementioned Superman versus Giant concept by Piumsomboon et al. (2018b) allowed users to be scaled within a virtual representation of a real environment or to fly, neither of which is possible without the use of AR or VR.

Finally, an opportunity for future research concerns the social and ethical implications of using AR and VR, in collaboration and beyond. One major aspect is the privacy concerns that arise when using these technologies. For example, in the Superman versus Giant concept, the environment was streamed to a remote user. However, there may be people in this environment who do not agree to be part of the streamed content, or there may be places being streamed for which the users do not have the appropriate broadcast rights. Therefore, it is important to consider how to handle sensitive data in such a setting. This relates to another aspect, the social acceptance of using AR and VR. One example is Google Glass, from 2012. These glasses allowed the wearer to view basic information on the built-in display as well as to capture photos and videos. They were rejected by most users, as bystanders could be recorded unwillingly by the wearer. They were even banned in some places, such as in hospitals or banks.

These opportunities show not only that AR has the potential to enhance collaboration between team members, but also that there are still future research areas to consider. Therefore, the following section highlights the role of taxonomies to provide a tool for decision-making and how to proceed in research and application development.

3 Where to Start and What to Support? Considering Team, Task, and Technology in the Selection and Development of AR-Based Support

In the literature, a number of taxonomies have been presented that consider the classification of AR applications, their design, and/or their deployment (e.g., Feld and Weyers (2019)). Most taxonomies have been either technique-, user-, information-, or interaction-centered (Normand et al. 2012). Yet, if AR applications are to be implemented in an organization, which can be understood as an integral element of everyday contexts, it is particularly important not to consider people (user), technology (technique), and the organization independently, but rather holistically and as interacting entities in the sense of the socio-technical system design approach (Schweiß et al. 2019). According to this approach, the deployment of technology and the organization must be subject to joint optimization (Ulich 2013), meaning that it is not sufficient to develop technology or the organization without also adapting or optimizing the other part at the same time. This perspective was also reflected by the (hu-)man–technology–organization (MTO) concept, which emerged from this approach and further placed the work task as primary in the center of the socio-technical system (Ulich 2013). From this perspective, the technical and social subsystems of the organization are linked by the work task, that is, the work task creates the connection between people and the structure of the organization (Ulich 2013).

Since AR technologies in organizational contexts represent a form of socio-technical systems, their development (or selection) and implementation should follow the socio-technical system and the MTO concept, in that social aspects (e.g., team requirements, team capabilities), work tasks (or processes), and technological aspects (i.e., the system to be deployed) should be regarded as mutually interacting factors (Ulich 2013). Neglecting one of these factors (or their mutual interaction) in the development or selection of AR technologies could significantly impede the intended system benefits (Salas et al. 2008; Schweiß et al. 2019).

AR technologies in an organizational team context are typically deployed with a specific goal in mind. The aim is mostly to support or improve one or more specific teamwork process(es) (as outlined in the introduction of this chapter). Therefore, before defining system capabilities, one should consider the respective team skills and requirements as well as the requirements of the task or teamwork process(es) to be supported (Salas et al. 2008; Schweiß et al. 2019). Such a task-and-team analysis would provide the data to design a system in which the technology fits the team needs and requirements (team-technology fit, Schweiß et al. (2019), Thomaschewski et al. (2019, 2021)) as well as the team task and process(es) (task-technology fit, Goodhue and Thompson (1995a), Thomaschewski et al. (2021)).

Therefore, the authors have proposed a theory-driven taxonomy for the support of teams using AR, which provides a task analytical basis for the selection and design of AR assistance in teamwork (Thomaschewski et al. 2019). The taxonomy is intended to describe specific situations of teamwork and to select conducive augmentation possibilities. It can support the design of the teamwork processes where AR technologies would be used. For example, the taxonomy could be used as a checklist to anticipate and define different user groups and their needs before the development of the prospective AR technology. Furthermore, the taxonomy could be used in the very early developmental stages of AR technologies to evaluate whether all relevant constellations or potentials of AR application have been considered. The taxonomy could also serve to evaluate existing AR technologies in the research context, such as by identifying influencing factors (independent variables) or target (dependent) variables and varying or observing their expressions (Thomaschewski et al. 2019).

Based on contemporary as well as established findings from CSCW and organizational psychology research, we defined four dimensions in accordance with the MTO concept: (1) social aspects (representing the social system component), (2) technical aspects (representing the technical system component), (3) teamwork processes (representing the work task as a link between the social and technical aspects), and (4) teamwork benefits due to AR utilization (representing assumed and intended benefits resulting from AR use). The taxonomy is shown in Fig. 2.

Fig. 2 Teamwork taxonomy (Thomaschewski et al. 2019)

The first dimension, social aspects, can be used to determine the specific social context. The focus is on the supported team’s constellation, needs, and skills. This dimension serves to determine (1) how the team is qualitatively constructed (e.g., an expert-only team, or a mixed team of novices and experts), (2) how many team members (or users) will interact (e.g., one-to-one or many-to-one), (3) whether the team will work co-located or spatially dispersed, (4) the synchronicity of the different team tasks (whether the team members work synchronously or asynchronously), and (5) the degree of coupling between the team members (e.g., level of team member familiarity and collaborative history). Altogether, this dimension considers the team composition as well as other contextual factors.

The work task, and therefore the teamwork process, links the social and technical aspects. The second dimension, teamwork processes, serves to define the specific work tasks and teamwork processes that are to be supported or enhanced using AR. In accordance with the MTO concept and human-centered design approaches, the dimensions social aspects and teamwork processes should be considered first and serve as a baseline for defining the technical requirements of the system.

The third dimension, technical aspects, considers the design of the actual AR tool, based on the previously defined social aspects. With category (7), information representation, the augmentation(s) can be planned (e.g., audio or graphic, or both). Using category (8), information reception, together with (7), the perception of the augmentation(s) can be predicted. Finally, with category (9), the function of the augmentation can be determined (e.g., whether the tool is intended to create telepresence or to insert information/annotations).

In addition, the taxonomy offers a fourth dimension, teamwork benefits due to AR utilization, on the vertical axis, which is intended to specifically derive the assumed and intended benefits of the to-be-developed or deployed AR technology (e.g., whether the tool would increase the situation awareness in the team or enrich communication). Actively anticipating the intended benefits could assist in evaluating whether the technology could achieve this goal, along with the previously defined categories.
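
To illustrate how the taxonomy could serve as a checklist, the following Python sketch records one entry per category described above and reports which categories are still undefined. It is a minimal sketch under stated assumptions: the class `TeamworkArProfile`, its field names, and the function `open_items` are hypothetical paraphrases of the categories and are not part of the published taxonomy.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class TeamworkArProfile:
    """Hypothetical checklist record; field names paraphrase the taxonomy."""

    # Social aspects, categories (1) to (5)
    team_composition: Optional[str] = None    # e.g., "novices and experts"
    number_of_members: Optional[str] = None   # e.g., "one-to-one"
    place: Optional[str] = None               # "co-located" or "spatially dispersed"
    synchronicity: Optional[str] = None       # "synchronous" or "asynchronous"
    coupling: Optional[str] = None            # e.g., familiarity, collaborative history
    # Teamwork processes (link between social and technical aspects)
    teamwork_process: Optional[str] = None
    # Technical aspects, categories (7) to (9)
    information_representation: Optional[str] = None  # e.g., "audio", "graphic"
    information_reception: Optional[str] = None
    augmentation_function: Optional[str] = None       # e.g., "telepresence"
    # Teamwork benefits due to AR utilization
    intended_benefit: Optional[str] = None            # e.g., "situation awareness"


def open_items(profile: TeamworkArProfile) -> List[str]:
    """Return the names of all categories that have not been filled in yet."""
    return [f.name for f in fields(profile) if getattr(profile, f.name) is None]


# Example values are illustrative only.
profile = TeamworkArProfile(
    team_composition="novices and experts",
    number_of_members="one-to-one",
    information_representation="graphic",
    intended_benefit="situation awareness",
)
print(open_items(profile))
```

Listing the social and teamwork-process fields first reflects the recommendation above to settle these before the technical categories.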

Altogether, the taxonomy is specifically focused on the design of AR-supported teamwork. Based on user- and task-centered development approaches, it could be used to describe teamwork and to select appropriate team- and task-specific augmentation(s). Schweiß et al. (2019) presented a possible approach by applying the presented taxonomy in a user-centered design process. They discussed the role of the taxonomy specifically in the requirement analysis step of an iterative user-centered design process. It could be used to gather initial requirements at a very early stage of the process as well as at later stages, in which the first prototypes have been investigated by the user and potential adaptations to the requirements have been identified. Additionally, the taxonomy could be used to gather use cases relevant to the actual scenarios in the development process, suitable for the design and evaluation process. Finally, the taxonomy enables not only the application of design guidelines via its classification but also the creation of new guidelines if these are identified in the user-centered design process itself.

4 Make It Special: Spatially Dispersed Teams

Some work teams are spatially dispersed, a type of team that is becoming increasingly popular in companies and organizations (e.g., Boos et al. (2017)). These teams work in a remote, synchronous or asynchronous setting, which may be either asymmetric (as in the case of remote guidance, e.g., Bauer et al. (1999)) or symmetric, as in the use case outlined below. In the literature, spatially dispersed teams have been examined within the context of office work and are mostly referred to as virtual teams. There, the focus has been on their advantages (e.g., the possibility to access experts worldwide) and disadvantages (e.g., forced asynchrony due to time differences), adequate support interventions, and leadership. However, teams that do not work in one place are also common in other fields. Examples can be found in a variety of high reliability organizations and production settings, such as the military, aviation, fire departments, nuclear power plants, and refineries (Hagemann et al. 2012).

Due to the spatial dispersion, members of these teams do not have a mutual visual context or workspace and cannot communicate immediately without technological support. This can impede their team performance, especially when these teams execute interdependent team tasks, several team tasks in parallel, or simultaneous team and individual tasks (Bardram 2000; LePine et al. 2008; Marks et al. 2001). The lack of virtual and synchronous communication can lead to discordant assumptions concerning the current status of the team task in relation to the end goal; awareness of this status is what Kraut et al. (2002) refer to as task state awareness. Spatially dispersed teams therefore often show poor task state awareness, which, as a consequence, negatively affects the temporal coordination of the subtasks of their team task.

Temporal coordination, specifically relevant in synchronous scenarios, encompasses three main components: (1) the correct sequencing of the subtasks (Bardram 2000), (2) the correct timing of the subtasks (e.g., Hollnagel (1998)), and (3) the ability to adapt dynamically to variables in the team’s context (Kluge et al. 2018). To achieve good temporal coordination, teams have to orchestrate their subtasks according to these components. Due to their spatial dispersion, spatially dispersed teams are prone to scheduling errors (e.g., synchronization problems, inaccurate judging of duration, and low levels of shared temporal cognitions; McGrath (1991)), which leads to not being “on the same temporal page” (Mohammed and Nadkarni 2014, p. 405), creating discord regarding when subtasks should be started and finished as well as team members adhering to different schedules and pacing (Gevers et al. 2009).
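
The first two components, sequencing and timing, can be made concrete with a short Python sketch. It is an illustration under assumptions, not a measure used in the cited studies: the functions `sequence_mismatches` and `timing_deviations`, the step labels, and the times in seconds are all hypothetical. The third component, dynamic adaptation, is not covered.

```python
from typing import Dict, List, Tuple


def sequence_mismatches(
    prescribed: List[str], executed: List[str]
) -> List[Tuple[int, str, str]]:
    """List (position, expected, actual) wherever the two orders differ."""
    return [
        (i, expected, actual)
        for i, (expected, actual) in enumerate(zip(prescribed, executed))
        if expected != actual
    ]


def timing_deviations(
    planned: Dict[str, float], actual: Dict[str, float]
) -> Dict[str, float]:
    """Seconds by which each step started late (positive) or early (negative)."""
    return {step: actual[step] - planned[step] for step in planned if step in actual}


# Hypothetical team task with three subtasks.
prescribed = ["A", "B", "C"]
executed = ["A", "C", "B"]
planned_start = {"A": 0.0, "B": 30.0, "C": 60.0}
actual_start = {"A": 2.0, "B": 75.0, "C": 40.0}

print(sequence_mismatches(prescribed, executed))
print(timing_deviations(planned_start, actual_start))
```

In this made-up example, subtasks B and C were swapped, so both positions are reported as sequencing errors, and B started 45 s later than planned.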

One possibility to counteract poor temporal coordination is the use of coordination artifacts, such as clocks, specific software, schedules, checklists, and so on (Bardram 2000). These are suitable for supporting the temporal coordination, as they provide reliable information about the team process state and thus contribute to enhancing the task state awareness of the team. Therefore, it appears reasonable to use AR for supporting spatially dispersed teams in their temporal coordination since AR can provide information about the process state of the team task (e.g., by means of graphical annotations). Moreover, head-mounted AR displays (AR-HMDs), in particular, are well-suited to provide information in the visual periphery of the user and thus create an ambient awareness of the augmented information (Schmalstieg and Hollerer 2016). By using the peripheral instead of focal attention of the user (Cadiz et al. 2002), less attention is required (Downs et al. 2012), and fewer cognitive resources are used, so that the user has access to more cognitive capacities for the execution of the actual team tasks.

Altogether, we assume that AR technology allows for superimposing information about the teamwork process state in an ambient manner, which generates ambient awareness of the process state and in turn improves the task state awareness of spatially dispersed teams with the result of an optimized temporal coordination (Fig. 3). To test this assumption, the authors have conducted several laboratory studies. The next sections provide an overview of that research.

Fig. 3 Research assumption: how AR can enhance temporal team coordination

5 Setting Up an Experimental Team Task Context

Our overall objective is to increase the task state awareness of spatially dispersed teams with the help of AR. To this end, we wanted to build temporal coordination artifacts that support the temporal coordination of spatially dispersed teams in a production setting by enhancing the team’s task state awareness, and to investigate their effects empirically.

To test such a tool, a suitable test environment and use case were required. A test environment that encompasses a production setting and can be realized as a teamwork context is WaTrSim (i.e., wastewater treatment plant simulation; for previous studies with WaTrSim, please see Burkolter et al. (2009), Frank et al. (2017), Thomaschewski et al. (2021), Weyers et al. (2015)). WaTrSim is a digital simulation of a realistic wastewater treatment plant (Fig. 4, left panel) that is fully controllable via a graphical user interface (Fig. 4, right panel).

Fig. 4 Real facility of the WaTrSim located at TU Dresden (left panel) and part of the WaTrSim interface (right panel)

The simulated scenario in WaTrSim is to initiate the plant with the goal of generating a high production outcome (e.g., maximizing the amount of purified water and gas, minimizing the amount of wastewater). To achieve this goal, a sequence of 13 fixed steps has to be executed in the right order and with good timing. The steps consist of adjusting the settings at different parts of the plant (heaters, valves, and tanks). The users are supported by the gaze guiding tool (GGT; for further description, pretest, and evaluation, please see Frank et al. (2017), Kluge et al. (2013), Weyers et al. (2015)), which is a digital manual that guides the user’s gaze to the part of the plant where the setting has to be changed and displays the exact change specification. As shown in Fig. 5, the GGT is a semi-transparent overlay with a cutout highlighting the area to be controlled. The cutout/plant area is marked by a red–orange rectangle. Adjacent to the cutout, there is an additional info box that instructs the user on their task.

Fig. 5 Gaze guiding tool (GGT). Note: The original GGT is in German and has been translated for this chapter

Since we aimed at supporting spatially dispersed teams, this setting was extended to a team context (Fig. 6). For this purpose, three instances of WaTrSim were used, so that two instances could be run as individual tasks (IT; the task was controlled by a single user) and one instance as a team task (TT; the task was controlled by two users as a team, and each user had to perform a predefined subset of the 13 fixed steps to initiate the WaTrSim). To simulate spatial dispersion, the two team members were located in separate rooms, so that they had no mutual (physical) workspace and no possibility to communicate (Fig. 7). Each room was equipped with two on-wall projections representing the IT and TT, respectively. The projections were positioned on separate walls at a 90° angle, so that the user had to actively turn to see the process state of the other task. Both team members were equipped with AR glasses (HoloLens 1), which displayed the GGT. Additionally, the AR interface displayed information about the process state of the task the user was not currently working on. The latter is the ambient-awareness tool (AAT). The AAT is intended to increase the team’s task state awareness by implicitly presenting team members with ambient information about the current state of the task they are not currently working on, and it consists of iconic representations of the next three steps that must be performed in the respective task. The simulations were controlled via a tablet. AR glasses, tablets (used for controlling WaTrSim), and PCs (running the simulation and handling the communication) were connected via the local Ethernet network, as shown in Fig. 6.
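
The selection logic of the AAT described above can be sketched in a few lines of Python: determine the task the user is not currently working on and return the next three steps of that task. This is a minimal sketch, assuming that the number of completed steps per task is known; the functions `other_task` and `next_steps` and the placeholder step labels are hypothetical and do not describe the actual HoloLens implementation.

```python
from typing import List

WINDOW = 3  # the AAT shows the next three steps


def other_task(current_task: str) -> str:
    """Return the task whose state the AAT should display."""
    if current_task not in ("IT", "TT"):
        raise ValueError(f"unknown task: {current_task!r}")
    return "TT" if current_task == "IT" else "IT"


def next_steps(sequence: List[str], completed: int, window: int = WINDOW) -> List[str]:
    """Return up to `window` steps that follow the already completed ones."""
    if not 0 <= completed <= len(sequence):
        raise ValueError("completed must lie between 0 and len(sequence)")
    return sequence[completed:completed + window]


# Placeholder labels for the 13 fixed start-up steps.
steps = [f"step {i}" for i in range(1, 14)]

print(other_task("IT"))       # TT
print(next_steps(steps, 0))   # ['step 1', 'step 2', 'step 3']
print(next_steps(steps, 11))  # ['step 12', 'step 13']
```

Near the end of the sequence, fewer than three steps remain, so the returned list simply becomes shorter and is empty once all 13 steps are completed.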

Fig. 6. System architecture of the general study setup for the team context: connections between AR glasses, tablets (for controlling WaTrSim), and PCs (running the simulation and handling the communication) were implemented via local Ethernet. IT_1: PC providing the individual task for the participant in room 1; IT_2: PC providing the individual task for the participant in room 2; TT: PC providing the team task for both participants; Cont: server used to remotely start and control the ITs and TT. HoloLens and tablet are connected via local Wi-Fi connections provided by local access points

Fig. 7. General laboratory study diagram with WaTrSim

6 Pre-study I: Does Gaze Guiding Help?

In a first pretest in 2018, we tested the general study setting and the functionality of the tools as a proof of concept. For this purpose, we used a first prototype of the AAT (Fig. 8). To represent the next three steps, we used colored symbols that were displayed in the periphery of the AR interface. While the user worked on the IT, the symbols representing the process state of the TT were presented on the left side of the projected simulation surface of the IT. While the user worked on the TT, the symbols representing the IT were presented on the right side of the simulation surface of the TT (Fig. 8).

Fig. 8. Pre-study 2018. Left panels: team member A working on the IT; the symbols to the left of the simulation surface represent the AR-based symbols, showing the next three steps in the TT. Right panels: team member B working on the TT; the symbols to the right of the simulation surface represent the AR-based symbols, showing the next three steps in the IT

In total, 11 participants (all students, 9 female) took part in the pretest. Each dyadic team consisted of one participant and one female student assistant who had a high level of expertise in controlling WaTrSim. Thus, we tested a total of 11 teams. One study session lasted approximately 60 min and consisted of 15 min of HoloLens usage training, 15 min of WaTrSim operation training (IT only), and 30 min of actually operating WaTrSim (IT and TT; for further explanation of the task, see the section "Setting up an experimental team task context").

6.1 Results

The pretest showed that the implemented study design worked stably. Furthermore, additional questionnaire data showed that the participants considered the devices for the AAT attractive and helpful, perceived the HoloLens as supportive for the specific task (WaTrSim initialization), and did not find that working with the HoloLens required high mental effort.

To summarize, the preliminary study showed that the concept of the AAT could be successfully applied to spatially dispersed, interdependent production tasks. Since AR superimpositions are not effective per se, we conducted a further study that evaluated different interface arrangements of the AAT to identify the most supportive interface design. From our perspective, this step was necessary because different aspects such as perceptual issues (Drascic and Milgram 1996; Kruijff et al. 2010), the cluttering of superimpositions (Rosenholtz et al. 2007), misleading attentional cues (Veas et al. 2011), or information overload (Doswell and Skinner 2014; Irlitti et al. 2016; Irizarry et al. 2013; Krevelen and Poelman 2010) had to be considered. The next section describes our approach to identifying the final AAT interface.

7 Pre-study II—How and What to Superimpose?

To determine the most suitable interface, we conducted a two-part study with a paired sample in which we examined the usability and user experience (UX) of several interface configurations (see Thomaschewski et al. (2021)). The first part of the study was conducted in the laboratory and focused on evaluating different display options of several object properties for the ambient awareness objects (AAOs), considering their anticipated usability in the context of the interface. The objective of the first part of the study was to derive possible interface configuration clusters, which would also be evaluated for UX in the second part of the study. Since this study was focused on the evaluation of different interface designs and not on the coordination between team members, we did not invite dyadic teams but individuals for the evaluation.

In total, 22 participants (3 female) who were familiar with operating WaTrSim participated in the study. Because one participant withdrew during the second part of the study, the analyses were based on the data of 21 participants.

As shown in Fig. 8 (and the final version in Fig. 13), the AAT consists of three icons (tank, heater, or valve, depending on the next step) and indicates the next three steps for the task the user is currently not looking at. To configure an interface with the highest possible usability, we predefined eight object properties (Fig. 9) that would influence the usability of the interface, depending on their visualization (display mode). For example, placing the AAOs too far from the simulation surface could result in the objects not being in the field of view (FOV) and thus impair usability; placing them too close could distract the user.

Fig. 9. AAO properties that were evaluated in the first part of the study. Following a predefined hierarchy, evaluation started with choosing a display mode for abstraction level, followed by object size, object distance, object position, object-simulation distance, progress-bar position, and critical process state indication, and ended with choosing a background for the AAOs

7.1 Pre-study II, Part I: Usability Evaluation

For the usability evaluation, the authors developed an instrument, the usability cluster questionnaire (UCQ), which was specifically suited to the evaluation of the interface. Using a hierarchical forced-choice paradigm (HFC), the UCQ directs the participant to select one display mode for each AAO property, so that each run yields an individual interface configuration per participant. As illustrated in Fig. 9, the choice hierarchy ran from 1. abstraction level of the AAOs to 8. background. This means that each participant chose a display mode for the abstraction level in the first step (1, 2, or 3), the display mode for object size in the second step (small, medium, or large), and so on up to the display mode for background in the last step (none, grey, or white). For each AAO property, the participants were asked to select the subjectively “most helpful or appropriate display mode” (Thomaschewski et al. 2021). A more detailed description of the UCQ can be found in Thomaschewski et al. (2021).
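The hierarchical forced-choice procedure can be pictured as a fixed walk through the eight properties in which exactly one display mode must be chosen at each step. The following sketch illustrates this idea only. The property order follows Fig. 9, and the display modes for abstraction level, object size, and background are those named in the text; all other display modes are placeholders, and the function name `run_hfc` and the callback interface are assumptions rather than the UCQ's actual implementation.

```python
# Property hierarchy in the order given in Fig. 9. Display modes marked
# "placeholder" are NOT from the chapter and only stand in for unknown options.
HIERARCHY = [
    ("abstraction level", ["1", "2", "3"]),                      # from the text
    ("object size", ["small", "medium", "large"]),               # from the text
    ("object distance", ["mode A", "mode B"]),                   # placeholder
    ("object position", ["mode A", "mode B"]),                   # placeholder
    ("object-simulation distance", ["mode A", "mode B"]),        # placeholder
    ("progress-bar position", ["mode A", "mode B"]),             # placeholder
    ("critical process state indication", ["mode A", "mode B"]), # placeholder
    ("background", ["none", "grey", "white"]),                   # from the text
]


def run_hfc(choose):
    """Walk the hierarchy once and return one interface configuration.

    `choose(prop, modes, so_far)` stands in for the participant: it receives
    the current property, its display modes, and the choices made so far,
    and must return exactly one of the offered modes (forced choice).
    """
    config = {}
    for prop, modes in HIERARCHY:
        choice = choose(prop, modes, dict(config))
        if choice not in modes:
            raise ValueError(f"{choice!r} is not a display mode of {prop!r}")
        config[prop] = choice
    return config


# Example: a simulated participant who always picks the last offered mode.
print(run_hfc(lambda prop, modes, so_far: modes[-1]))
```

Because every run returns one complete configuration, the configurations of all participants can afterwards be compared and grouped.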

To allow the participants to evaluate the AAO properties using the UCQ, the simulation surfaces of WaTrSim were projected onto two different walls at a 90° angle, as in pre-study I (see above). In contrast to pre-study I, however, only images of the simulation surface were used, so that the simulation could not actually be controlled (Fig. 10). The AAO properties were displayed via HoloLens 1.

Fig. 10. Pre-study II, part one

To investigate whether the individual interface configurations could be assigned to specific patterns, a divisive hierarchical cluster analysis was subsequently performed. The results indicated the formation of three clusters, so that three interface configurations could be derived. The results are shown in Fig. 11.

Fig. 11. AAT configurations resulting from pre-study II, part I
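To illustrate how a divisive hierarchical cluster analysis can group categorical interface configurations, the following sketch implements a simple DIANA-style splitting procedure in plain Python. The chapter does not state which algorithm variant or distance measure was used, so the Hamming distance, the stopping rule (a fixed number of clusters), and the six example configurations are all assumptions made for illustration and do not reproduce the study data.

```python
def hamming(a, b):
    """Number of properties on which two configurations differ."""
    return sum(x != y for x, y in zip(a, b))


def avg_dist(item, group, data):
    """Average distance from `item` to all other members of `group`."""
    others = [j for j in group if j != item]
    if not others:
        return 0.0
    return sum(hamming(data[item], data[j]) for j in others) / len(others)


def diameter(cluster, data):
    """Largest pairwise distance inside a cluster."""
    return max((hamming(data[i], data[j]) for i in cluster for j in cluster),
               default=0)


def split(cluster, data):
    """Split one cluster into a splinter group and the remainder."""
    # Seed the splinter group with the most dissimilar item.
    seed = max(cluster, key=lambda i: avg_dist(i, cluster, data))
    splinter, rest = [seed], [i for i in cluster if i != seed]
    moved = True
    while moved and len(rest) > 1:
        moved = False
        best, best_gain = None, 0.0
        for i in rest:
            # Positive gain: the item is closer to the splinter group.
            gain = avg_dist(i, rest, data) - avg_dist(i, splinter + [i], data)
            if gain > best_gain:
                best, best_gain = i, gain
        if best is not None:
            rest.remove(best)
            splinter.append(best)
            moved = True
    return splinter, rest


def divisive_clustering(data, k):
    """Start with one cluster and split the widest one until k clusters exist."""
    clusters = [list(range(len(data)))]
    while len(clusters) < k:
        target = max(clusters, key=lambda c: diameter(c, data))
        if diameter(target, data) == 0:
            break  # nothing left to split
        clusters.remove(target)
        clusters.extend(split(target, data))
    return clusters


# Invented example configurations (four properties each), not study data.
configs = [
    ("1", "small", "near", "none"), ("1", "small", "near", "grey"),
    ("2", "medium", "far", "white"), ("2", "medium", "far", "none"),
    ("3", "large", "mid", "grey"), ("3", "large", "mid", "white"),
]
print(divisive_clustering(configs, k=3))
```

In practice, the number of clusters would be chosen from the resulting hierarchy rather than fixed in advance, as is done here for brevity.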

7.2 Pre-study II, Part II: User Experience Evaluation

Part II was conducted as an online study. The participants from Part I were invited via email to evaluate the UX of the three interface configurations in a follow-up online survey. To this end, the participants were shown video mock-ups of the three interface configurations derived in Part I (Fig. 11) and asked to evaluate the UX using the AttrakDiff (Hassenzahl et al. 2003) and five scales of the user experience questionnaire (UEQ; attractiveness, perspicuity, dependability, stimulation, novelty; Laugwitz et al. (2006)).

Subsequent replicated ANOVAs did not show significant differences among the perceived UX of the three interface configurations. However, the results did appear to indicate a minor preference for cluster 3 (Figs. 11, 12, and 13). A more detailed elaboration of the results can be retrieved from the work presented by Thomaschewski et al. (2021).

Fig. 12. Visualization of the UX total score means. Left panel: results from the AttrakDiff. Right panel: results from the UEQ. Error bars indicate the 95% confidence interval

Fig. 13. Simulation surface of WaTrSim and the final AAT to the left of the simulation surface

Based on these results, we designed the interface for the main study, in which we investigated the impact of the AAT on the temporal coordination of spatially dispersed teams. The next section outlines the design of the main study.

8 Main Study: Design to Evaluate the Impact of the AAT on the Temporal Coordination of Spatially Dispersed Teams

Based on the assumption proposed in the introduction (Fig. 3), the main study investigated whether the developed AAT could positively affect the temporal coordination of spatially dispersed teams. We investigated not only the main effect of the AAT on temporal coordination but also whether different design components of the AAOs differed in their supportive effect.

To this end, 110 dyadic teams were studied in five different groups (including a control group), among which the factors of dimensionality (2D vs. 2.5D) and dynamics (static = without progress bar vs. dynamic = with progress bar) of the AAOs were varied (Fig. 14).

Fig. 14. Experimental conditions of the main study, showing the alteration of the factors (1) dimensionality (2D vs. 2.5D) and (2) dynamics (static = without progress bar vs. dynamic = with progress bar). Note: The GGT (see section Setting up an experimental team task context) was available to all groups (including control group)
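The group structure of the main study can be written down compactly: the two factors are fully crossed, which gives four AAT groups, and a control group is added. The sketch below enumerates these five groups and shows one possible balanced random assignment of 110 teams. The assignment scheme (22 teams per group, shuffled round-robin) is an assumption for illustration, as the chapter does not report how teams were allocated; the names `build_conditions` and `assign_teams` are hypothetical.

```python
import itertools
import random
from collections import Counter

DIMENSIONALITY = ["2D", "2.5D"]
DYNAMICS = ["static", "dynamic"]  # static = no progress bar, dynamic = progress bar


def build_conditions():
    """Control group plus the four crossed AAO conditions."""
    conditions = [{"aat": False, "dimensionality": None, "dynamics": None}]
    for dim, dyn in itertools.product(DIMENSIONALITY, DYNAMICS):
        conditions.append({"aat": True, "dimensionality": dim, "dynamics": dyn})
    return conditions


def assign_teams(n_teams, n_conditions, seed=0):
    """Assumed scheme: shuffle team IDs, then deal them out round-robin.

    Returns a dict mapping team ID to the index of its condition.
    """
    rng = random.Random(seed)
    order = list(range(n_teams))
    rng.shuffle(order)
    return {team: pos % n_conditions for pos, team in enumerate(order)}


conditions = build_conditions()
assignment = assign_teams(110, len(conditions))
print(len(conditions))               # number of groups
print(Counter(assignment.values()))  # teams per group
```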

We report this study here to outline existing research concepts and to give as complete a picture as possible of our research agenda. At the time of writing, the evaluation of the effectiveness of the AAT is still being completed. However, some statistical results regarding the participants’ subjective assessment of the AAOs can already be reported:

Apart from the measure of the general usefulness of the AAOs for starting up WaTrSim, all of the following outcomes were recorded on a Likert scale from 1 to 5, where 1 corresponds to a low and 5 to a high rating. First, the participants rated the AAOs as rather attractive (single item, M = 3.38, SD = 1.21, no significant differences in the rating between the experimental groups) and in general useful for starting up WaTrSim (single binary item, 72.25% yes vs. 27.43% no). The user experience of the whole system (including AAOs, gaze guiding tool, tablet control, and WaTrSim GUI; 3 items, M = 3.48, SD = 0.95) as well as the subjective meaningfulness and relevance of the AAOs (7 items, M = 3.43, SD = 0.84) were rated rather positively. Again, we did not observe any significant differences between the groups. The support of the AAOs regarding the temporal coordination of team and individual task was rated as mediocre (single item, M = 2.56, SD = 1.40), with significant differences between the groups (F(3, 170) = 4.452, p = 0.005). Post-hoc tests showed that the group with 2DD AAOs rated the supportiveness significantly higher than the group with 2DS AAOs (p = 0.02) and the group with 3DS AAOs (p = 0.025). The support of the AAOs in terms of remembering the steps of the start-up procedure was rated rather low (single item, M = 2.02, SD = 1.24).

Taken together, the AAOs were perceived as attractive and comfortable to use. Additionally, the participants rated the AAOs as meaningful, relevant, and generally useful for operating WaTrSim, but only moderately supportive in terms of the temporal coordination between individual and team task. With respect to memorizing the steps of the start-up procedure, the AAOs were perceived as providing rather little support.

The previous sections explained the development and deployment of an AR-based support tool that uses graphical and abstract icons in its interface design to support spatially dispersed teams. The preliminary work suggests that productivity and accuracy in teams could be supported by such tools. However, what about the overall team feeling and experience? When team members work in different spaces, this can lead to a lack of team feeling/experience (for an overview, see Morrison-Smith and Ruiz (2020)). Apart from the socio-emotional components that should be taken into account in terms of good work design, a lacking sense of community can in turn have negative impacts on team performance.

A possible approach to maintaining the team feeling/experience despite (long) distances between team members may be the use of AR-based avatars. The next section presents another of our preliminary studies, in which we developed an AR-based assistance system that used avatars to support geographically distant teams.

9 A Within-Subject-Study: Development of an AR-Based Avatar Assistance System to Support Spatially Dispersed Teams

The ongoing study could be regarded as an extension to the aforementioned studies and was intended to provide evidence regarding how interaction and communication, and thus the team experience, could be enabled for spatially dispersed team partners using AR-based avatars.

Research already exists on the use of avatars in collaborative work processes. For instance, Piumsomboon et al. (2018a) investigated the deployment of an AR-based “Mini-Me” avatar in two different mixed-reality (AR and VR) collaboration scenarios and showed that social presence and the collaboration experience could be enhanced using avatars. Yoon et al. (2019) investigated the effects “of avatar appearance on social presence and user’s perception in AR” (p. 1) for collaborative tasks and showed that a realistic, full-body avatar was considered best suited for remote collaboration. Furthermore, Waldow et al. (2019) reported feedback from subjects, suggesting that visual cues to the avatar’s gaze direction may be relevant, but this has not been empirically evaluated.

Based on these findings and the study results previously described for the present work, further follow-up studies will investigate the deployment of avatars in symmetric (i.e., team members assigned to equal roles; Ens et al. 2019; Feld and Weyers 2019) and spatially dispersed team scenarios with regard to the teams’ temporal coordination. In a first step, a feasibility study was conducted to investigate whether the behavior of a full-body avatar representing the spatially dispersed team member, including context cues, could influence task performance (accuracy and processing time) and the perception of co- and social presence in the context of a collaboration task.

9.1 Study Design

Again, WaTrSim was adopted for the study. This time, the participant’s task was to read certain states (e.g., tank levels, heater temperatures, etc.; hereinafter region of interest (ROI)) on the wall-projected simulation surface and then report them verbally to the spatially dispersed team member (in this case the investigator, who was located in a different room). The support consisted of the participant seeing the team partner as an AR-based avatar through the HoloLens 1 (Fig. 15). The avatar controlled by the investigator used a full-body tracking system based on a combination of a Microsoft Azure Kinect as well as head and finger tracking by the Microsoft HoloLens 2. The investigator and participant were able to communicate verbally via microphone and speakers using Skype. Figure 16 depicts the technical layout in detail.

Fig. 15. Study design. Left panel: investigator controlling the avatar. Right panel: participant’s view of the WaTrSim surface and the AR-based avatar

Fig. 16. Technical design: for the full-body tracking of the investigator, we used a combination of Microsoft Azure Kinect for the upper and lower body with additional tracking data from Microsoft HoloLens 2 for hand and head tracking. The resulting animation stream was provided via a server to the HoloLens 1 worn by the participant to animate the avatar. For spatial registration (positioning of investigator, avatar, and participant), a mannequin was used to show the investigator where the participant was located; the participant had to stand in one specific position in the room

The avatar control was handled by the experimenter and realized through an external tracking system, which tracked the movements of the experimenter and mapped them to the avatar. In order to have a reference point for the physical orientation, gaze direction, and pointing gestures, a mannequin was used (Fig. 16), which represented the relative position of the participant.

To observe the effects, the factors (1) avatar behavior (avatar actively pointing to the ROI vs. avatar not pointing to the ROI) and (2) contextual cues (additional highlighting of the ROI vs. no additional highlighting of the ROI) were varied in a within-subject design (Fig. 17). As dependent variables, co- and social presence as well as processing time and accuracy were measured for each condition. Therefore, each participant underwent all experimental conditions, with each condition consisting of five trials that were randomized to counterbalance learning and memory effects.

Fig. 17. Experimental conditions showing the alteration of the factors (1) avatar behavior and (2) contextual cues
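The within-subject design can be sketched as a per-participant trial schedule: two factors with two levels each give four conditions, and each condition comprises five trials. The text does not specify whether the randomization applied to the order of conditions, the order of trials within a condition, or both, so the sketch below assumes both; the function name `trial_schedule` and the use of the participant ID as a random seed are likewise assumptions.

```python
import itertools
import random

AVATAR_BEHAVIOR = ["pointing", "not pointing"]
CONTEXT_CUE = ["highlight", "no highlight"]
TRIALS_PER_CONDITION = 5


def trial_schedule(participant_id):
    """Return a list of (avatar behavior, contextual cue, trial number) tuples.

    Assumption: the order of the four conditions and the order of the five
    trials within each condition are both shuffled per participant.
    """
    rng = random.Random(participant_id)  # reproducible per participant
    conditions = list(itertools.product(AVATAR_BEHAVIOR, CONTEXT_CUE))
    rng.shuffle(conditions)
    schedule = []
    for avatar, cue in conditions:
        trials = list(range(1, TRIALS_PER_CONDITION + 1))
        rng.shuffle(trials)
        schedule.extend((avatar, cue, trial) for trial in trials)
    return schedule


for row in trial_schedule(participant_id=1)[:5]:
    print(row)
```

Every participant thus completes 20 trials in total, and each of the four conditions is measured five times.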

9.2 Results

For analysis, the data of 23 students (12 female, mean age: 24.09 (SD = 3.27)) were used. A total of 30.43% of them indicated that they had prior VR or AR experience, and 43.48% indicated prior experience with avatars but not in a vocational context.

Social presence was measured with the 5-item scale by Bailenson et al. (2003). To survey co-presence, we applied the co-presence subscale by Harms and Biocca (2004). Team performance was operationalized by measuring the time the participants needed to read the ROI aloud. To measure accuracy, error rates were calculated.

For co-presence, a one-way ANOVA indicated a significant difference between the groups (F(3, 66) = 11.048, p < 0.001, η² = 0.179). Post-hoc tests showed significant differences between the (a) no support condition and avatar condition (p = 0.009), (b) no support condition and avatar + highlight condition (p < 0.001), (c) highlight condition and avatar condition (p = 0.016), and (d) highlight condition and avatar + highlight condition (see Fig. 18). Comparing the groups regarding social presence, a one-way ANOVA showed no significant differences (F(2.02, 44.35) = 4.19, p = 0.021, η² = 0.058). Processing time appeared to improve with additional highlighting of the upcoming state: A one-way ANOVA indicated significant differences between the groups (F(3, 66) = 6.167, p < 0.001, η² = 0.167). Subsequently, post-hoc tests revealed significant differences between highlight condition and avatar condition (p = 0.007), and between avatar condition and highlight + avatar condition (p = 0.033) (see Fig. 19). Processing errors were so few that a comparison of the accuracy was not meaningful.

Fig. 18. Boxplots for the co-presence measure. Scale range: 1 to 7

Fig. 19. Boxplots for the processing time. Y-axis shows processing time in seconds

9.3 Discussion

The ongoing study may be the first to examine the role of an avatar in a team collaboration task in a simulated work environment. Although the expressiveness of the data was limited, the insights could provide guidance for designers. First, the avatar was shown to increase co-presence, which was higher when the avatar was actively engaged in the task. Second, highlighting specific regions of interest had a positive influence on processing time. Third, AR-based support enabled the participants to perform almost error-free. These findings provide a foundational direction for the design of AR-based avatar interfaces. The publication of the ongoing study’s results is currently in preparation.

10 General Discussion and Conclusion

In this chapter, we introduced the topics of collaborative work, teams, teamwork, and taskwork and showed the potential of AR applications for the support of teams. We reviewed the state of the research concerning team-supportive AR applications and presented a self-developed taxonomy for the use and/or the implementation of AR applications in teamwork environments. Furthermore, we provided insights into past and current research in the field of team support by means of AR, focusing on the AR-based support of spatially dispersed teams. Our intention in writing this chapter was to outline our research agenda of the past years. To discuss our findings to this point, we summarize the topics covered in the chapter below, focusing on our experience-based conclusions. Additionally, we outline the contribution we believe our research allows us to provide.

Already in 1995, Goodhue and Thompson proposed that the technology used must have a good fit to the task it is supposed to support (Task-Technology Fit) (Goodhue and Thompson 1995b). However, what we can conclude from our ongoing studies so far is that it is not only of great importance to establish a fit between the task and the technology that is as good as possible. It is also of great importance to consider the fit between the team to be supported and the respective technology used (team-technology fit, Thomaschewski et al. (2019, 2021)). Nevertheless, from the view of current research presented in this chapter, we were able to infer that during the development of AR-based assistance systems, the task is still in the focus. When it comes to the support of spatially dispersed teams, the specifics of the team to be supported are not considered (e.g., experience level of the team members, number of team members, team history, etc.).

In the context of the aforementioned research review, we have only been able to identify one paper (Piumsomboon et al. 2018a) in which the team aspect was at least considered, but the application was then discussed more from a task, interaction technique, and technical communication perspective. Neglecting the needs of and demands on the team to be supported can negatively affect the performance as well as the well-being of single team members and/or the general team experience.

Therefore, we would like to encourage developers and researchers to pay attention not only to the respective task but also to the team that is to be supported. To take these person-related variables into account, we advocate that the development of AR-based support tools should focus more on the experience and behavior of the team members. To assist developers, researchers, and users in this regard, we have presented a taxonomy in this chapter. This taxonomy is intended to help consider both task- and team-related factors when developing and/or deploying AR-based technologies.

With the presentation of pre-study II, we demonstrated how team and task can be taken into account as reciprocally dependent and equally significant factors in the development of AR-based applications. To this end, we based our interface configuration on UX as well as usability measures that we derived from assessing participants who were familiar with the actual task (operating WaTrSim). The reported results for the main study confirmed the effectiveness of this approach: On the one hand, we showed that the superimpositions used in the interface were perceived as attractive, relevant, and generally supportive for starting up WaTrSim. On the other hand, the supportiveness of the superimpositions in relation to the temporal coordination of individual and team task was rated as rather mediocre. In relation to remembering the next steps of the start-up procedure, the support level was rated as rather low. However, the reported results are based on the subjective perception of the participants. Further analysis of objectively assessed performance markers will shed more light on the actual effectiveness of the developed tool.

To demonstrate a further approach for supporting spatially dispersed teams, we additionally outlined a study in which we developed an AR-based avatar assistance system. Here, we showed that guiding the gaze by means of AR-based highlighting of relevant interface parts can significantly reduce processing time in a search task. Furthermore, we showed that the perception of co-presence could be significantly increased by active avatar behavior, whereas we found no significant influence on the perception of social presence. By primarily focusing on the avatar’s influence on co- and social presence, we could show that avatars can contribute positively to the team feeling and teamwork experience. Regarding limitations, it should be noted that the influence of the avatars’ sex on the participants’ perception of the team experience was not evaluated. Here, preliminary work showed that the avatars’ sex can have a significant influence on help-seeking behavior: Lehdonvirta et al. (2012) showed that female avatars were more likely to be asked for help than male avatars.

Our research of the past few years has led us to the view that physiological differences between male and female users should be considered much more in AR-related research. For instance, on average, women have a smaller interpupillary distance (Dodgson 2004) as well as weaker musculature in the head and neck region (Côté 2012), which might significantly lower the usability of head-mounted displays for them. Additionally, recent research has shown that, on average, women are also more prone to different types of motion sickness (Garcia et al. 2010), which might be related to the female hormonal cycle (Matchock et al. 2008). Taken together, we highly recommend paying more attention to possible effects of the avatar’s sex as well as to sex-related differences among users in future AR research. Against this background, our future research will focus on the avatars’ sex as well as on gender identification as influencing factors in AR-related research. Considering these differences will generate results that can be generalized to a broader, more diverse (and thereby more realistic) population.

In conclusion, this chapter provides several contributions to AR-related team research: By reviewing the state of research on AR-based support of spatially dispersed teams, we provide an up-to-date overview of current and past developments, which, together with the provided taxonomy, might serve as a framework. This framework can be helpful for guiding future developments in the context of AR-based support of spatially dispersed teams. We further illustrated the implementation of that framework by outlining a set of empirical studies. These studies cover the topics of AR-based interface design, visual guidance technology, and the implementation of AR-based avatars for spatially dispersed teams.

Still, more research is needed in the field of supporting spatially dispersed teams, as they are becoming increasingly prevalent due to their high potential. However, since the behavior and experience, and accordingly the performance, of those teams can differ from teams working on-location, more research is needed to develop and offer adequate support methods/applications for spatially dispersed teams.