1 Introduction

Social interaction is an intricate subject. Its study must extend beyond entities in isolation to encompass natural, dynamic and developing interrelationships, and must account for the context and environment in which interaction unfolds. The chapters in this part approach interaction from the perspective of emotion and related processes. The term emotion is used loosely throughout the part, and many of the concepts dealt with are more properly described as affective in nature; that is, they concern a wide gamut of affect-related phenomena that may take place over a variety of time scales and in a variety of contexts.

An important underlying methodology throughout the part, in the spirit of affective computing (Picard 1997, 2003), is the investigation of interaction and emotional processes through computational models embedded in real-time systems capable of interacting with humans. Here, it is worth noting some of the views regarding the place and utility of affect and emotion research in the modern world. Affect has long been unfairly deemed an insignificant byproduct of biological reasoning processes, or even a destructive obstacle to controlled, logical reasoning and intelligent behaviour. Of all the domains of application, it is in computation and machines that a newcomer to the area is most likely to question the practicality, role and utility of affective computing research. Such a newcomer may, for example, expect emotion research to be concerned with abstract or vague concepts and with questions that, while intellectually stimulating, are not of a practical nature and yield no foreseeable benefits for creating concrete systems. The issue of subjective feeling, and of whether and how machines may be endowed with such capacities, may for example be wrongly perceived to dominate this area.

Thankfully, views on the role and utility of emotion have changed remarkably. As the reader will hopefully experience throughout this part, fundamental questions such as those above remain of great importance, but the aims are directed towards highly practicable systems with concrete day-to-day applications in the here-and-now. Perhaps the foremost of these involve attempts to endow machines with forms of emotional intelligence, so that they can better recognise, understand, adapt to and, ultimately, interact with human users. Contemporary systems are, by any measure, remarkably ignorant and unintelligent when it comes to the human interactant, who is expected to expend effort in conforming to the interaction methods of the machine. More natural interaction should reduce the effort demanded of the user, and the associated frustration, during human–computer interaction.

1.1 Agents and Interaction

The computerised systems dealt with here most often take the form of embedded agents: graphical representations, often of real humans, that can express themselves graphically, and sometimes verbally, when interacting. These agents are useful both as vehicles for testing theories of emotion and interaction and as practical applications where human–computer interaction is desired.

The term intelligent virtual agents (IVAs), also referred to in the following texts as virtual agents or agents, is used to refer to computerised entities that utilise a variety of computational models to enhance their decision-making capabilities, so that they operate in a more intelligent, or at least behaviourally credible, manner. Such agents are embedded within a virtual environment, although they are not limited to interacting solely within it, and are often embodied, i.e. they have a graphical appearance that may help to convey their state within the environment.

Embodied conversational agents (ECAs) are a specific type of IVA whose purpose is to interact directly with users. Represented by an animated character on the screen, ECAs typically employ a number of important capabilities for interacting with users, for example speech, gesture and facial expressions.

Agents are an important focal point for affective research. To the emotion theorist, they provide a concrete way of testing theories and controlling experimentation: attempting to reconstruct a system and comparing the reconstruction a posteriori with its real counterpart is perhaps the best way of establishing the actual nature of the real system. In turn, emotion theorists and computational modellers use their expertise to provide updates and solutions that improve real-world applications. Such agents should be able to keep track of facial expressions, gestures, speech and other important interaction details, such as who is speaking, in order to update their inner state, adapt and produce appropriate behaviour.

1.2 Key Capabilities

When studying a single entity, a road-map such as that provided in Fig. 1 is useful for gaining an idea of the capabilities of importance for creating the aforementioned agents. These can be enumerated as follows:

  1. Detection and recognition of aspects of the user, the environment and contextual features of importance. This relates to the sensing of important data, perception and interpretation for use in higher-level processes. It may range from determining who the current speaker is, to detecting the change in facial expression of a user from one of joy to frustration.

  2. Planning based on intention, internal state and externally detected factors and events.

  3. Generation and expression to ensure that appropriate behaviours are generated to express the state and intentions of the agent.

  4. Internal representation and memory as important concepts underlying the operation of all of the above.

Fig. 1 Diagram (Moore 2005) of core behaviours for spoken language processing. These behaviours are also more generally applicable as a road-map of the relationships between categories of core behaviours and capabilities desirable for an intelligent conversational agent.

Many more specific elements and capabilities can be placed in one of these categories, for example, managing the start of interaction and associated communicative behaviours (Peters et al. 2007).
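To make the relationship between the four capability categories more concrete, the following is a minimal Python sketch of one pass through a detect, plan and generate cycle that shares a memory. It is an illustration only, not the architecture of Moore (2005) or of any system described in the following chapters: every name in it (`Percept`, `Memory`, `detect`, `plan`, `generate`, `step`), the dictionary format of the raw input and the three intentions are hypothetical and were chosen only for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Percept:
    """Hypothetical output of the detection and recognition stage."""
    speaker: Optional[str]   # who is currently speaking, if anyone
    user_expression: str     # e.g. "joy", "frustration", "neutral"


@dataclass
class Memory:
    """Internal representation and memory shared by the other stages."""
    history: List[Percept] = field(default_factory=list)

    def remember(self, percept: Percept) -> None:
        self.history.append(percept)

    def previous_expression(self) -> Optional[str]:
        # Expression stored just before the most recent percept, if any.
        if len(self.history) < 2:
            return None
        return self.history[-2].user_expression


def detect(raw: dict) -> Percept:
    """Capability 1: interpret raw (assumed, dictionary-shaped) sensor data."""
    return Percept(speaker=raw.get("speaker"),
                   user_expression=raw.get("expression", "neutral"))


def plan(percept: Percept, memory: Memory) -> str:
    """Capability 2: choose an intention from the percept and stored state."""
    changed_to_frustration = (
        percept.user_expression == "frustration"
        and memory.previous_expression() != "frustration"
    )
    if changed_to_frustration:
        return "reassure"
    if percept.speaker == "user":
        return "listen"
    return "idle"


def generate(intention: str) -> dict:
    """Capability 3: map an intention onto behaviour in each modality."""
    behaviours = {
        "reassure": {"face": "concern", "speech": "Let me help with that."},
        "listen": {"face": "attentive", "speech": ""},
        "idle": {"face": "neutral", "speech": ""},
    }
    return behaviours[intention]


def step(raw: dict, memory: Memory) -> dict:
    """One pass through the cycle; memory (capability 4) is updated first."""
    percept = detect(raw)
    memory.remember(percept)
    return generate(plan(percept, memory))
```

In this sketch, calling `step` first with a joyful and then with a frustrated user expression yields an attentive listening behaviour followed by a reassuring one, mirroring the joy-to-frustration example given in the first capability. Keeping memory as a separate object passed to the planning stage is one possible design choice; it simply reflects the description of internal representation as underlying all of the other capabilities.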

2 Structure

With the key capabilities described above in mind, this part consists of five chapters.

Chapter “Fundamentals of Agent Perception and Attention Modelling” investigates practical approaches for constructing real-time agents capable of perceiving and attending to their environment and users.

Chapter “Generating Listening Behaviour” considers face-to-face conversations in more detail, examining the joint activity of speakers and listeners through the construction of agents that are able to show that they are listening to the speaker.

Chapter “Coordinating the Generation of Signs in Multiple Modalities in an Affective Agent” looks at issues of coordination, so that an agent is capable of providing appropriate behaviour over available modalities, such as facial expression, speech and gestures.

Chapter “Representing Emotions and Related States in Technological Systems” is concerned with the inner representation required when systems operate on emotions and related states.

Chapter “Embodied Conversational Characters: Representation Formats for Multimodal Communicative Behaviours” deals with requirements for representation languages used for planning and displaying behaviours.