1 Introduction

Embodied virtual agents designed as animated 2D or 3D characters resembling human beings have become a well-established research area. Embodied agents have been used to inform, explain and demonstrate sequences of activities in educational, training, commerce or entertainment settings [1]. A special type of virtual agent, the embodied virtual trainer or coach, has been applied in the context of physical exercising and exergames. Some examples include an animated virtual fitness trainer applied to motivate and supervise fitness exercising [2], a virtual coach for health self-management of chronic patients [3], a conversational agent applied to teach lifestyle modifications [4], and a 3D trainer avatar in virtual balance therapy for older people [5].

The potential of embodied virtual agents for delivering engaging and motivating experiences has been addressed in research in relation to a wide range of attributes, including non-verbal communication and facial expressions [6], multimodal strategies and appearance [7], proximity and attentional focus [8], facial similarity between the user and the agent [9], gaze, gesture and body orientation [10], body movements [11], humour [1], and smile [12]. The results of a recent literature review show that relational behaviour positively affects the perception of the virtual agent’s characteristics, the relation with the agent, as well as intention to use [13]. Research in digital game-based learning environments suggests that users can benefit from interacting with embodied agents in terms of motivation, enjoyment and perceived usefulness [14]. Related research shows that smiles tend to enhance positive social evaluations of embodied agents [12]. The appearance of embodied agents also affects likeability, interaction, performance and the feeling of rapport [7, 13, 15].

Rapport has been explored in the context of interaction with virtual embodied agents in a number of recent research studies. Rapport was defined as a dynamic structure of mutual attentiveness, positivity, and coordination [16]. Research beyond virtual environments and virtual agents has shown that a trainer's rapport with trainees is a significant predictor of their satisfaction [17]. The sense of rapport is linked to effective communication, persuasion, greater liking and trust, interaction and engagement [18]. Based on the findings from research on human-to-human rapport, it has been hypothesised that rapport can be established in a similar manner between humans and virtual agents, especially through responsive behaviour of the agent, e.g. listening, feedback, gazing, facial expressions, gestures and postures [18].

Consequently, designing for rapport with embodied agents has been seen as one of the key success factors for delivering engaging and motivating virtual experiences. The potential of virtual agents to establish rapport with humans through verbal and nonverbal communication has already been demonstrated in a number of studies [18,19,20]. Research has shown that the forms of communication that contribute most effectively to establishing rapport with virtual agents are those that indicate positive emotions (e.g. head nods), attention (e.g. verbal feedback), mimicry and coordination (e.g. synchronised movements) [18, 24]. However, research on rapport with virtual trainers in the context of physical exercising has been scarce and has included mostly early experimental studies, e.g. on rapport between a virtual and a human dancer [25].

This paper describes the design of the virtual embodied agent “Anna”, who acts as a virtual trainer in an immersive gamified mixed reality (MR) environment for physical training of older patients with hypertension developed in the bewARe project. The bewARe project is a three-year R&D project funded by the German Federal Ministry of Education and Research (BMBF). The bewARe system is designed to be part of a non-drug hypertension therapy and exploits a range of AR/VR/MR technologies and wearable sensors to measure vital parameters. The bewARe system offers a series of gamified mini-exergames delivered to senior trainees with hypertension with the support of the virtual trainer to enhance motivation, positive experience and engagement in mini-exergames. The embodied, anthropomorphic virtual trainer “Anna” is the key component in the multimodal interaction design of the bewARe system. “Anna” is designed as an animated 3D full-body female trainer silhouette (Fig. 1). The virtual trainer supports the user in performing a set of physical exercises following the training sequence with five main phases displayed in the user interface: welcome, warm-up, mini-exergames, cool-down and evaluation. A gamification approach is combined with these techniques to enhance positive behavioural changes. Based on the sensor data acquired during training (e.g. blood pressure, heart rate), the system adapts to the individual training needs of senior users.

Fig. 1. The design of the virtual trainer “Anna” in bewARe exergames for senior users.

2 Method

The primary goal of our research was to evaluate rapport with the virtual trainer “Anna”. Our research question was: Does the human-like, embodied virtual trainer that we designed as part of the multimodal interaction design in the bewARe system support the building of rapport with senior users, and how do users perceive our virtual trainer compared to a real trainer? All participants provided informed consent. The study was approved by the Ethics Committee of the Charité – Universitätsmedizin Berlin (No. EA1/019/20).

2.1 Design

The exploratory pilot study included 22 participants over 65 years old with diagnosed essential hypertension. The study was conducted in a simulated mixed-reality setting in a mobile living lab (VITALAB.mobile) between October and November 2020. All participants tested two mini-exergames on two appointments (strength endurance and endurance workout). Inside the exergames, participants were guided by a rapport agent embodied by the virtual trainer “Anna”. Both exergames lasted between 20 and 25 min but differed in their level of gamification. The strength endurance-based game was a low gamified personal training, whereas the endurance-based game was highly gamified. The study was conducted in the Living Lab Vita.Lab mobile, which is a mobile VR/AR lab in a truck. Due to the limitations of currently available MR headsets, e.g. a limited field of view, a VR headset (HTC Vive Pro) was used to enable interaction in a virtual environment that was very similar to the physical space. By recreating the interior of the truck in VR, it was possible to simulate a nearly ideal MR environment with a VR headset. In both exergames, the focus was on the motor movement patterns and the perception and imitation of the movement information represented by the agent. In the course of the exercises, the virtual agent “Anna” acted as a personal trainer and provided feedback in the form of hand-clapping, verbal praise and questions.

During the first visit, the strength-endurance exergame was tested with 22 participants. Each participant performed five guided exercises: squat, overhead press, diagonal pull, leg raise, toe stance. Each exercise was repeated up to 20 times per set in two sets. Between the sets there was an active break of one minute. The virtual trainer demonstrated the exercises, gave advice on what to pay attention to and performed the exercises simultaneously with the participant. During the second visit, the endurance exergame was tested with 22 participants. Each participant performed three guided exercises: ball game, high five and hustle dance. In the ball game, using the Valve Index controllers, the participant throws a virtual ball into a virtual ring held by the virtual trainer. In the high five game, the participant mirrors the movements of the virtual trainer and the touch points are displayed in the form of purple circles. In the hustle dance, the participant imitates dance steps and moves of the virtual trainer.

2.2 Measures

In order to investigate which design features of the virtual trainer “Anna” contribute to a positive rapport (on the cognitive, affective, and interaction levels) in the different types of exergames with senior users, we adapted the rapport scale by [18,19,20]. The rapport scale is a self-reported post-questionnaire and was applied after both exergames. The adjusted rapport scale was measured on a 9-point scale (0 = Disagree Strongly; 8 = Agree Strongly) following the original format used by [18,19,20].

The rapport scale proposed by [19, 20] was adapted to the bewARe project. Since the bewARe virtual trainer cannot respond to the user individually on a verbal level, e.g. no conversation is possible, two items related to the conversation with the agent (e.g. “I felt that the listener was interested in what I was saying”) were deleted. On the other hand, since the bewARe project focuses on motor movements, two new items on interactional rapport (items 6 and 11) were added to the scale to capture the perception and imitation of the movements represented by the virtual agent. These items are related to motor simulation and motor learning through movement observation. Grounded in research results showing that positive rapport is associated with “likableness”, another item (item 1) was added to the scale: “I find the agent likeable”, similar to [19]. All items from the rapport scale are summarised in Fig. 3 below.

Furthermore, qualitative data was collected in semi-structured interviews at the end of the evaluation. The participants' replies were recorded, transcribed and analysed using summative content analysis following [26]. The occurrences of words related to the virtual trainer were quantified by counting their frequency. Finally, a latent content analysis was conducted to interpret underlying meanings. Verbal and nonverbal cues were assessed as descriptors of rapport [25]. All items were measured with a nine-point Likert scale (0 = Strongly unpleasant; 8 = Strongly pleasant).
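The frequency-counting step of a summative content analysis can be illustrated with a minimal sketch. The term list, the example transcripts and the function name `count_trainer_terms` are hypothetical and do not come from the study; the actual coding scheme is the one described in [26].

```python
import re
from collections import Counter

# Hypothetical terms of interest; the study's actual coding scheme is not reproduced here.
TRAINER_TERMS = {"trainer", "voice", "face", "movement", "movements"}


def count_trainer_terms(transcripts, terms=TRAINER_TERMS):
    """Count how often each term of interest occurs across all transcripts."""
    counts = Counter()
    for text in transcripts:
        # Lower-case the transcript and split it into alphabetic word tokens.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(token for token in tokens if token in terms)
    return counts


# Invented example replies, for illustration only.
transcripts = [
    "I liked the voice of the trainer.",
    "The face of the trainer looked neutral, but the movements were clear.",
]
term_counts = count_trainer_terms(transcripts)
```

Such counts only describe the manifest content; interpreting what participants meant still requires the latent analysis described above.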

2.3 Participants

The mean age of the participants was 75 years (SD: 3.6; 59.1% female). The participants had no increased risk of falling (Tinetti 27.6) and had no cognitive impairments (TICS 37.3) [21, 22]. 45.5% reported university as their highest education qualification. 54.5% already had previous experience with virtual reality and all of the included participants had essential hypertension.

3 Results

3.1 Rapport Scale

In both exergames, high ratings were reached for likeability (6.59 ± 1.40/6.32 ± 1.46), movement imitation (7.00 ± 1.11/7.05 ± 1.13) and involvement (6.18 ± 1.71/6.00 ± 1.85). Significant differences between strength endurance and endurance workout occurred in relation to the interactional (item 15: p = 0.016) and affective (item 12: p = 0.045) aspects of rapport. Accordingly, during the strength endurance exergame participants were less frustrated when interacting with the agent (0.68 ± 1.13/1.91 ± 2.27) and could behave as they wanted (4.73 ± 3.01/3.36 ± 2.67). Figure 3 summarises the results.

Fig. 3. Mean values and standard deviation of the items of the rapport scale in comparison between strength endurance and endurance VR workout. *p < 0.05.
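As an illustration of how a "mean ± SD" summary of a single item on the 0 to 8 scale can be obtained, consider the following minimal sketch. The ratings are invented, the helper names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which variant was reported.

```python
from statistics import mean, stdev

# Bounds of the 9-point response format described in the Measures section.
SCALE_MIN, SCALE_MAX = 0, 8


def summarise_item(ratings):
    """Return (mean, sample SD) for one item after validating the range."""
    for rating in ratings:
        if not SCALE_MIN <= rating <= SCALE_MAX:
            raise ValueError(f"rating {rating} is outside {SCALE_MIN}..{SCALE_MAX}")
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return mean(ratings), stdev(ratings)


def format_summary(ratings):
    """Format the summary in the 'mean ± SD' style used in the results."""
    m, sd = summarise_item(ratings)
    return f"{m:.2f} ± {sd:.2f}"


# Invented ratings, for illustration only; not data from the study.
example_ratings = [6, 7, 8, 5, 7]
summary = format_summary(example_ratings)
```

Validating the range before averaging guards against data-entry errors, which matter in a small sample where a single out-of-range value can shift the mean noticeably.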

The study also explored the perception of senior trainees of selected characteristics of the virtual agent such as body, head, hand and facial movement, voice quality and pauses in speaking. The evaluation of the characteristics of the virtual trainer showed consistently high scores, with the exception of facial expression. There were no significant differences between the two exergames (Table 1).

With regard to rapport and characteristics of the participants, similar results were obtained in both exergames for female and male participants. The comparison of participants with and without prior VR experience showed that the participants with prior VR experience rated items 4 (p = 0.011) and 14 (p = 0.025) as well as facial expressions of the virtual trainer (p = 0.018) significantly better than persons without prior VR experience. The comparison of different age groups revealed that younger seniors rated item 12 (p = 0.030) and pauses in speaking (p = 0.029) significantly higher.

Table 1. Results of the evaluation of the characteristics of the virtual agent.

3.2 Interviews

The results of the qualitative interviews show that about 60% of the participants expressed positive opinions about the virtual trainer. Some of the positive aspects were explicitly related to rapport, e.g. a number of participants stated that they enjoyed being guided by and following the movements of the virtual trainer, which indicates positive rapport on the interactional/kinaesthetic level. Some participants mentioned they liked the voice and the bodily motion of the virtual trainer, which are also relevant for rapport. About 40% of the participants expressed negative opinions about the virtual trainer, and some of these opinions were explicitly related to rapport, e.g. a number of participants felt they could not establish rapport with the virtual trainer and thought that they could build a more personal relation with a real trainer. The latent content analysis revealed that the design of the virtual trainer’s face was rather controversial. While 59% liked the abstract, neutral face, 41% perceived the face as unfriendly and expressionless because they could not recognise facial expressions or a smile.

4 Discussion

The primary goal of this study was to evaluate the rapport of senior users with the virtual trainer in the context of mixed reality exergames for patients with hypertension. The results from the survey and the interviews indicate that the embodied virtual trainer “Anna” was effective in establishing rapport with senior users. Most participants liked the neutral design, voice and guidance of the virtual trainer and enjoyed following the movements. Some participants, however, found it harder to build rapport with the virtual trainer than with a real trainer, which may be related to specific attributes of the virtual trainer, such as the low level of emotion expressed through its face and its voice quality. Further studies could take a more in-depth look at the relationship between the level of expressed emotion, as well as further attributes of virtual trainers, and their effectiveness for establishing rapport with different users.