1 Introduction

1.1 Background

Current commercial games, such as Assassin’s Creed®, exemplify the incredible level of realism in virtual environments and the characters residing within them. However, that realism fades during in-depth interactions with characters employing artificial intelligence. Real-time puppeteering is one strategy to increase the realism of those interactions within virtual environments. Puppeteering is a technique that allows the movements, expressions, and actions of a virtual avatar to be fully controlled by an individual through a motion capture system [1] (Fig. 1).

Fig. 1. Characters from Assassin’s Creed® Origins: The Hidden Ones [25]

1.2 Avatar Realism

An avatar is a virtual representation of a human being that is fully controlled by an actual human [2]. The aesthetics or visual representation of an avatar can vary based upon the artist’s depiction of their creation (see Stylization).

A study conducted at Trinity College Dublin [3] explored how the appearance realism and animation realism of an individual’s own virtual face contribute to character appeal. The study concluded that, whether an avatar was realistic or displayed abstract, cartoon-like features, participants were not affected by the animation realism [3]. Realistic faces were as appealing as cartoon faces in both animation realism conditions. However, with respect to perceived emotions, a study [4] analyzing different stylization techniques concluded that avatar stylization can affect how an expression appears and how people respond to the avatar.

The Threshold Model of Social Influence [5] states that the social influence of a real person is high, whereas the influence of an Artificial Intelligence (AI) character depends on the character’s realism. Building on this research, Von der Putten [2] found that participants experienced greater negative feelings towards artificial agents than towards avatars that displayed more realistic features, in addition to a feeling of greater social presence. Their findings essentially state that avatars that display more realistic features and behaviors will create a greater social awareness and perception of the virtual character.

By leveraging our previous work on avatar puppeteering [1], we seek to expand upon the study findings listed above to see whether stylization affects how participants perceive emotions, or form an emotional connection, with an avatar controlled by a puppeteer within a training atmosphere. We are also interested in whether the look of an avatar controlled by a puppeteer changes the perceived virtual presence and what biases may be exposed.

1.3 Motion Capture Technology

Motion capture systems are frequently used in the entertainment industry to create animations that give characters a seamless, elegant, and fluid appearance (see Fig. 2) [6]. The initial motion capture sequences are polished by animators to ensure movements appear clean and natural [7]. The use of this technology in real time, without the clean-up sequence, is still in its infancy. Participants interacting with a character controlled by a puppeteer see the character display gestures and facial expressions as if they were speaking to someone in real life. The individual controlling the character can use a web camera to view the person with whom they are interacting. This can change the participant’s role from that of a witness to that of someone who is drawn into the experience. Interacting with an avatar played by a human in real time may create a more personal and immersive emotional experience.

Fig. 2. War for the Planet of the Apes motion capture – Twentieth Century Fox [26]

1.4 Puppeteering

The use of technology-based communication can pose unique challenges [8, 9, 10]. Specifically, the use of technology can lead to errors in interpretation and decreases in performance [8, 9]. One area that can be especially challenging is the interpretation of human expressions. The human face has forty-six unique action units that are capable of making more than ten thousand unique configurations, and these configurations allow for greater understanding of the communicated message [11]. For example, a close friend may use a subtle eyebrow raise to accompany a statement. The statement alone might be considered offensive, but in conjunction with the eyebrow raise, it would be perceived as a pleasant and playful jest. In addition to facial expressions, changes in voice tone can also change the context of a statement [11, 12].

The realism of interpersonal interactions in virtual training domains can be an important factor in influencing decision making in a learning population [13]. Some training tasks require very detailed realism, including the ability to recognize small-motor movements, eye movements, voice intonation, and gesticulations.

Interactions with automated virtual agents do not yet approach human-to-human experiences. This drives our team’s interest in increasing the realism of human interaction in virtual environments by using real-time puppeteering. The focus is on merging state-of-the-art commercial technologies to support human-dimension training for force effectiveness. The working prototype provides a platform for feedback from end-users and informs the requirements and procurement communities. This capability is possible at a low cost, though the technology is still in its infancy.

To date, research on real-time puppeteering is limited. Even more limited is research on the impacts of puppeteering in a training environment. This is problematic as more organizations are relying on technology for training [14]. The goal of this research task is to explore the training implications of interacting with a character controlled by a puppeteer when training certain human interactions in virtual environments. This paper examines: the areas of realism that might provide the greatest payoff per investment; the impact of stylization on perceptions of avatar realism and the overall effect of these perceptions on important organizational outcomes related to training engagement and training performance; what expressions immerse the learner enough to increase engagement in the training activity; and finally, the link between appearance and emotional connection to the avatar.

1.5 Stylization

Stylization is the act of representing an object in a non-natural form [15]. Emoticons are an extreme example of stylization. The stylization of avatars, specifically virtual humans or virtual characters, can range from realistic to cartoon-like [16]. Stylization might be done to accentuate specific features such as facial expressions, or it can enhance the ambiance of the situation (see Fig. 3). Japanese anime is a good example of stylized characters, with larger-than-life, expressive eyes and almost childlike expressions [17]. The level of realism or stylization of a character is an important design decision that affects development time, cost, and user perceptions. Design decisions such as this need to be made early in the development process to ensure consistency with other features such as the environment [18]. In addition, the specific learning task must be considered to determine the desired level of realism of interpersonal interactions to support decisions of a learning population [19].

Fig. 3. Artist’s rendition of character stylization [27]

2 Stakeholder Goals

2.1 U.S. Army Research Laboratory Simulation and Training Technology Center (ARL STTC)

The Army Research Laboratory’s Simulation and Training Technology Center has the mission to execute the Army’s science and technology program for simulation and training to accelerate learning and optimize human performance. The mission of this team within ARL STTC is to explore strategies to improve soldier training by leveraging commercial game technology. Puppeteering is a rapidly developing technology area that can provide functionality that AI is not yet able to provide within virtual environments.

2.2 Impact to the Army

Puppeteering has the potential to support a wide range of training capabilities within the Army. Specifically, the U.S. Army Warfighters’ Science and Technology Needs Bulletin, published by the U.S. Army Capabilities Integration Center at the U.S. Army Training and Doctrine Command (TRADOC) [20], describes the need for virtual humans as follows: (1) they should be capable of supporting “natural language processing to enable interactions (verbal and non-verbal);” (2) a virtual human must be able to “understand, reason and make assumptions about the environment;” and (3) “Virtual humans will populate large-scale simulations to expand the range of on-demand, interactive training opportunities and reduce human overhead support” (Pg. 40). While the goal is to have much of this functionality available via AI, that technology is not yet able to support these needs. Puppeteering is expected to bridge the gap until AI can more seamlessly meet the need.

The U.S. Army Learning Concept for Training and Education (TRADOC Pamphlet 525-8-2) [21] describes the future Army learning environment in this way:

“Replicating the complex global environment within the learning context and conditions is critical to providing tough and realistic training and education. This complex global environment involves operations among human populations, decentralized and networked threat organizations, information warfare, and true asymmetries stemming from unpredictable and unexpected use of weapons, tactics, and motivations across all of the training domains. Adversaries are likely to employ information warfare to degrade mission command capabilities or conduct global perception management and influence campaigns. Army training and education must account for these and other factors during training and education activities. Adaptability is paramount; the learning system must provide training and education solutions to teams, Soldiers, and Army civilians synchronized to the operational tempo. To meet these challenges, Army training and education must do the following:

  (1) Portray the complex environment to develop leaders, Soldiers, Army civilians, and teams that understand the situation, apply appropriate judgment, adapt to changing conditions, and transition effectively between operations. Army training and education prepares Soldiers, and Army civilians to exercise mission command to exert influence on key individuals, organizations, and institutions through cooperative and persuasive means.

  (2) Create situations allowing individuals and teams to master fundamentals and hone skills.

  (3) Present complex dilemmas forcing leaders to think clearly about war to match tactical actions with operational and strategic objectives.

  (4) Create situations allowing individuals and teams to experience, become comfortable, and eventually thrive in ambiguity and chaos and then provide meaningful feedback on their performance.

  (5) Provide the required repetition, under the right conditions and with the right level of rigor, to build mastery of both fundamental and advanced warfighting skills (Pg. 11).”

The Army currently conducts live training events that, in some cases, require paid actors to interact with soldiers as part of a training scenario to establish a sense of realism. Soldiers are able to study the patterns of certain individuals to look for anomalies. Then they might engage various individuals in a mock-up of an operational environment. For example, they may meet with a key leader or a law enforcement officer. These live training events can be costly and logistically complex. Similar training within a virtual environment could significantly reduce the overall cost of training for the Army. Characters could be controlled by AI until a soldier approaches one, at which point an actor or actress could step into that role and provide a rich interaction. A small number of role-players, maybe one female, one male, and possibly a native speaker or two, could hop into specific characters. Cost savings are important, but they cannot come at the expense of training capability. This represents just one use case of how this technology can benefit Army training.

2.3 Defense Equal Opportunity Management Institute (DEOMI)

The Defense Equal Opportunity Management Institute (DEOMI) was established by the Department of Defense (DoD) in 1971 as the Government’s premier institution for education, training, and research in human relations, equal opportunity, equal employment opportunity, and diversity. DEOMI is responsible for training Equal Opportunity Advisors (EOAs) for all branches of the uniformed services and civilian Equal Employment Opportunity (EEO) counselors for civil servants. Additionally, DEOMI provides policy and strategy guidance to the DoD for equal opportunity, sexual harassment, and sexual assault.

2.4 Impact to DEOMI

Building on previous research [22], this project will serve as the initial phase of a test plan to determine the usability of this platform as a DEOMI training tool. The scope of this investigation is to determine how realistic avatars are perceived by DEOMI students. The long-term goal of this project is to shape the future of EOA/EEO training by developing an immersive training environment for DEOMI facilitators and students. We aim to build a platform that overcomes the limitations of static multimedia training materials (e.g., training videos, computer programs), is capable of real-time adaptive flex-training, and provides comprehensive play-by-play verbal, postural, and interpersonal engagement feedback. Such a platform will expand DEOMI’s capabilities, creating more robust and comprehensive programs.

3 Method

3.1 Participants

Participants will include 200 active duty Service Members enrolled in the Equal Opportunity Advisor Course (EOAC) or reserve Service Members enrolled in the Equal Opportunity Advisor Reserve Component Course (EOARCC) at DEOMI. Participants will receive credit that can be used to partially fulfill the EOAC/EOARCC requirement to complete volunteer hours.

3.2 Experimental Design

Using a within-subjects design, the present study will manipulate avatar stylization to determine how stylization influences perceptions of avatar realism. Participants will watch two videos, each with differently stylized avatars, welcoming them to the program they will be attending at DEOMI. After each video, participants will answer questions that will measure perceived avatar realism.

3.2.1 Avatar Videos

The team created two videos of a single avatar welcoming the EOAC/EOARCC students to DEOMI. To determine which avatar characteristics will result in the highest ratings of perceived realism, the stylization of the avatar was varied between the two videos such that one contained a stylized avatar (see Fig. 4) and the other contained an avatar with more realistic features (see Fig. 5).

Fig. 4. Stylized avatar male civilian

Fig. 5. Stylized uniformed female

The script (Table 1), background, and all other aspects of the videos are identical. In the videos, the avatar is standing up in an office-like setting, making the entire body of the avatar visible to participants.

Table 1. Script stated to participants within study

This will allow all aspects (e.g., facial expressions, posture, hand gestures) of the avatar to be rated for realism. In the video, the avatar faces the viewer (appearing to make eye contact) and verbally delivers a message containing the welcoming statement, what students can expect to learn from the course, and general instructions on when and where to show up for class. Participants will watch both versions of the welcome message. The order of presentation will be counterbalanced to guard against confounding order effects.
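The counterbalancing described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the study: the function name `assign_video_order`, the condition labels, and the fixed random seed are all assumptions introduced here. Participants are shuffled and then alternately assigned to the two possible viewing orders, so each order is seen first by half of the sample.

```python
import random


def assign_video_order(participant_ids, conditions=("stylized", "realistic"), seed=0):
    """Counterbalance two video conditions across participants.

    Hypothetical helper: shuffles the participants with a fixed seed, then
    alternates between the two possible viewing orders so that each order
    is used for half of the sample (give or take one if the count is odd).
    """
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)  # randomize who receives which order
    orders = [tuple(conditions), tuple(reversed(conditions))]
    return {pid: orders[i % 2] for i, pid in enumerate(ids)}


# Example with the planned sample size of 200 participants (IDs are made up).
schedule = assign_video_order(range(1, 201))
stylized_first = sum(1 for order in schedule.values() if order[0] == "stylized")
```

With an even number of participants, exactly half are scheduled to watch the stylized avatar first and half the realistic avatar first.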

3.2.2 Perceived Realism

At the conclusion of each video, participants will rate each avatar’s realism. We will measure realism across two sub-dimensions: visual and behavioral realism (See Fig. 6).

Fig. 6. Taxonomy of avatar characteristics

Within the sub-dimension of visual realism, we will measure fidelity and anthropomorphism. Fidelity refers to the quality of the avatar image and its surroundings. Anthropomorphism refers to the extent to which the avatar truly represents a human-like figure. Within the sub-dimension of behavioral realism, we will measure kinetic conformity and social appropriateness. Kinetic conformity refers to the extent to which the avatar’s movement resembles that of a real human being. Social appropriateness refers to the extent to which the avatar’s verbal and nonverbal responses align with communicative norms.
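As a rough illustration of how this taxonomy could be represented when scoring responses, the sketch below stores the two sub-dimensions and their four facets in a dictionary and averages the facet ratings. The rating values and the choice to use simple unweighted means are assumptions made for this example; the paper does not specify the rating scale or how the facets will be combined.

```python
from statistics import mean

# Two sub-dimensions of realism, each with two facets, as named in the text.
REALISM_TAXONOMY = {
    "visual": ("fidelity", "anthropomorphism"),
    "behavioral": ("kinetic_conformity", "social_appropriateness"),
}


def score_realism(ratings):
    """Average facet ratings into sub-dimension scores and an overall score.

    Unweighted averaging is an assumption for illustration only.
    """
    scores = {
        dimension: mean(ratings[facet] for facet in facets)
        for dimension, facets in REALISM_TAXONOMY.items()
    }
    scores["overall"] = mean(scores[dimension] for dimension in REALISM_TAXONOMY)
    return scores


# Hypothetical ratings from one participant for one avatar video.
example = score_realism({
    "fidelity": 6,
    "anthropomorphism": 4,
    "kinetic_conformity": 5,
    "social_appropriateness": 3,
})
```

Keeping the taxonomy in one structure means a facet can be added or renamed without touching the scoring logic.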

3.2.3 Analyses

Paired-samples t-tests will be conducted to determine which of the two avatar styles appears more realistic to the participants.
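For readers unfamiliar with the test, a paired-samples t-test works on the per-participant difference between the two ratings: t is the mean difference divided by its standard error, with n − 1 degrees of freedom. The sketch below computes the statistic using only the Python standard library. The function name and the ratings are hypothetical and serve only to illustrate the calculation; they are not study data.

```python
import math
import statistics


def paired_t_statistic(ratings_a, ratings_b):
    """Return (t, degrees_of_freedom) for a paired-samples t-test."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("each participant needs a rating in both conditions")
    diffs = [a - b for a, b in zip(ratings_a, ratings_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two participants")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical realism ratings from six participants (not study data).
realistic = [5, 6, 4, 7, 5, 6]
stylized = [4, 5, 4, 5, 3, 5]
t, df = paired_t_statistic(realistic, stylized)
```

The resulting t value would then be compared against the t distribution with the returned degrees of freedom. In practice, a statistics package such as SciPy (`scipy.stats.ttest_rel`) would typically be used to obtain the p-value as well.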

4 Way Ahead

Moving forward with this study, we will investigate how the characteristics and look of an avatar affect participants’ opinions and attitudes. We will also examine how a virtual character controlled by a puppeteer affects the way participants perceive emotions or form an emotional connection within a training atmosphere.

Future research will build upon this work to explore the role bias plays in EEO tasks and interactions.

5 Conclusion

Technology for, and research into, using a real-time human-controlled virtual character for EEO purposes are in their infancy. The research described in this paper seeks to expand upon the study findings reviewed above to see whether stylization of virtual characters, as compared to more photo-realistic characters, produces a noticeable effect on how participants perceive emotions and how they build rapport with a puppeteered avatar within a training atmosphere. This paper has described the planned research and outlined where similar research might go next.