1.1 Motivation and Overview

When looking at the advanced technical systems we constantly use in our everyday lives, we make a striking observation: Although these systems provide increasingly complex and “intelligent” functionality, as modern household appliances, smart phones, cars, machines, and countless electronic services do, there is often a considerable lack of comfort and convenience in use. Extensive (or tenuous) operating instructions have to be downloaded from the Internet; lengthy menu prompts have to be navigated; and in many cases the user is even left with no option but to explore the system’s functionalities by him- or herself. Depending on the particular user, the situation, and the system at hand, these obstacles may not only impede an exhaustive use of these innovative products and services, but may cause frustration and a reluctant attitude, and consequently the user may even lose interest in employing the system any further.

In other words, there is a wide gap between the growing functional intelligence of technical systems on the one hand and the lacking intelligence in providing this functionality to the user on the other hand. One reason, illustrated in Fig. 1.1, lies in the fact that technical systems offer their functionality in a strictly uniform way. They make no distinction between user types or even individual users, whether they are experienced with the system or not, request just a specific function, or have needs that demand some smart explanation of particular aspects of the system’s functionality.

Fig. 1.1 Present-day human-technology interaction

Companion-technology aims to bridge this gap by complementing the expanding functional intelligence of technical systems with an equivalent intelligence in interacting with the user and to integrate the two. It does so by enabling the realization of arbitrary technical systems as Companion-systems—cognitive technical systems that smartly adapt their functionality to the individual user’s requirements, abilities, preferences, and current needs. They take into account the user’s personal situation, emotional state, and disposition. They are always available, cooperative, and reliable and present themselves as competent and trustworthy partners to their users. Companion-systems are technical systems that exhibit so-called Companion-characteristics, namely competence, individuality, adaptability, availability, cooperativeness, and trustworthiness. These characteristics are implemented through the well-orchestrated interplay of cognitive processes based on advanced perception, planning, reasoning, and interaction capabilities.

In this chapter, we give an introduction to Companion-technology. We present the underlying theory and discuss its conceptual constituents. They include the acquisition, management, and use of comprehensive knowledge; the abilities to reason, to decide, and to recognize a user’s context, emotion, and disposition; and the capacity for dialogue with individual users.

Up to now, the notion of a technical or artificial Companion has appeared in the literature only in a few contexts. The most prominent work is reported by Wilks [22]. Here, Companions are supposed to be conversational software agents, which accompany their owners over a (life-)long period. Rather than “just” providing assistance they are intended to give companionship by offering aspects of real personalization. In recent years, the paradigm of Robot Companions emerged in the field of cognitive robotics [1, 6, 13]. Those Companions are autonomous embodied systems, which accompany and assist humans in their daily life. Here, the main focus of research lies in the development of advanced training and learning processes to enable the robots to continuously improve their capabilities by acquiring new knowledge and skills.

In contrast, Companion-technology builds upon wide-ranging cognitive abilities of technical systems. Their realization and synergy have been, for roughly one decade, investigated under the research theme of cognitive systems or cognitive technical systems. The theme is focused on capabilities such as environment perception, emotion recognition, planning, reasoning, and learning, and their combination with advanced human-computer interaction. An overview on cognitive technical systems was initially published by Vernon et al. [19], while Putze and Schultz give a more recent introduction [15]. A comprehensive survey on the current state of the art in research and development towards Companion-technology is presented by Biundo et al. [5] in the special issue on Companion Technology of the KI journal [4].

However, up to now a systemized definition of the essence of Companion-technology or companionable systems has been lacking. The first attempt to come up with such a definition was made when establishing the interdisciplinary Transregional Collaborative Research Centre “Companion-Technology for Cognitive Technical Systems” [2, 3, 21]. In this chapter, we elaborate on this definition and draw the big picture of a novel technology.

1.2 The Big Picture

As the illustration in Fig. 1.1 shows, it is obvious that two important prerequisites for an individualized and context-sensitive functionality of technical systems are already given. First, both the system and the user are embedded in the environment. Provided with suitable perception capabilities the system would thus be able—in a way similar to human users—to perceive and recognize those context parameters that are relevant for the system’s correct functioning and its interaction with the user. The system would also be able to observe its user and sense parameters that give an indication of the user’s contentment and his or her emotional state and disposition. Second, the user has a mental model of the technical system he or she is using and, in particular, has individual expectations concerning the system’s behavior, including the way it should present its functionality and the interaction modes it should use to do so.

These prerequisites are utilized to provide input for the cognitive processes that establish a Companion-technology. Figure 1.2 shows the components that are required. Corresponding to the user’s mental model of a technical system, a Companion-system will be equipped with a comprehensive knowledge base. It holds knowledge about the system itself such as declarative descriptions of its technical functionality and operation conditions as well as knowledge about the individual user, his or her abilities, preferences, and requirements. Based on this knowledge, advanced planning and reasoning facilities implement the technical functionality. Plans of action are automatically generated according to the user’s profile. Depending on the application at hand, these plans either serve to directly control the system, or are passed on to the user as recommendation for action.

Fig. 1.2 Future human-technology interaction

The situational context is perceived through various sensors, as are the user and his or her behavior. The emotional state and disposition are recognized by analyzing multiple modalities such as speech, facial expressions, hand and body gestures, and physiological measurements. With that, it is feasible to dynamically adapt the system’s technical functionality according to sudden unexpected changes of the world and the user state.

A system’s knowledge base does not only support the generation and adaptation of technical functionality, but also determines how the system and the user interact. Just as humans interact with their environment by employing various cognitive and motoric skills, Companion-technology enables systems to select appropriate communication devices and modalities according to both the current situational context and the user’s tasks, preferences, emotional state, and disposition.
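The selection of an interaction modality from situational context and user preferences can be illustrated with a small sketch. It is not the mechanism used in Companion-systems; the class `Context`, the function `select_modality`, the modality names, and the penalty values are all hypothetical and chosen only to show the idea of combining both sources of information.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Hypothetical situational context, as it might be derived from sensors."""
    noisy: bool        # loud environment
    hands_busy: bool   # user cannot operate a touchscreen right now
    private: bool      # no bystanders, so spoken output is acceptable


def select_modality(context, preferences):
    """Return the modality with the highest context-adjusted score.

    `preferences` maps modality names to user-specific weights in [0, 1].
    The penalties are illustrative assumptions, not values from the chapter.
    """
    scores = dict(preferences)  # copy, so the user model is not modified
    if context.noisy:
        scores["speech"] = scores.get("speech", 0.0) - 0.5
    if context.hands_busy:
        scores["touchscreen"] = scores.get("touchscreen", 0.0) - 0.5
    if not context.private:
        scores["speech"] = scores.get("speech", 0.0) - 0.3
    return max(scores, key=scores.get)


user_preferences = {"speech": 0.8, "touchscreen": 0.6, "text": 0.4}
chosen = select_modality(Context(noisy=True, hands_busy=False, private=True),
                         user_preferences)
```

In this toy setting a user who normally prefers speech is offered the touchscreen as soon as the environment becomes noisy, while the stored preferences remain untouched.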

Companion-technology gives Companion-characteristics to technical systems. Competence, individuality, adaptability, availability, cooperativeness, and trustworthiness are realized through a three-stage approach. The first stage is advanced cognitive functions including perception, knowledge-based planning and reasoning, dialog management, and multi-modal interaction. By means of these cognitive functions, the second stage implements a number of cognitive competences. They include a robust recognition of the environmental situation and the user’s emotional state; an individualized technical functionality and user-system interaction by continuously taking individual user characteristics and preferences into account; a consistent consideration of location, time, and behavioral context; and a robust activity recognition and plan execution monitoring.

In a third stage, a variety of meta-functions build upon the above-mentioned capabilities, thereby manifesting the Companion-characteristics. These meta-functions include:

  • supporting the user with motivating comments and confirmation;

  • sustaining the dialogue with the user and conducting meta-dialogs;

  • recognizing a user’s intentions;

  • explaining the system’s behavior and the system’s recommendations;

  • detecting erroneous situations and reacting appropriately;

  • convincing the user of overarching goals;

  • generating, presenting, and explaining possible alternatives for action;

  • recognizing and accounting for changes in users’ behavioral strategies;

  • clarifying ambiguous user reactions through appropriate system intervention.

Companion-technology aims to lend Companion-characteristics to technical systems of all kinds: technical devices such as ticket vending machines, digital cameras, espresso machines, dishwashers, cars, and autonomous robots; electronic support systems such as navigation systems or fitness-apps; and complex application systems or electronic services such as travel and booking agents or planning assistants which help users in the accomplishment of a range of everyday tasks.

To give an impression, Fig. 1.3 shows the so-called Companion-space—a systematic view on application perspectives of Companion-technology. It indicates classes of prospective Companion-systems using the three dimensions of technical realization, Companion-task and application domain. Although not every point in this space describes a meaningful Companion-system, it does nonetheless demonstrate the great breadth of variation possible in such systems. For each application there exist various technical realizations and various tasks for which different cognitive functions, competences, and meta-functions are relevant. When analyzing user and situation parameters for the application “navigation”, for example, it may be essential—depending on the actual device used—to first determine whether the user is traveling by car, by bicycle or on foot. Furthermore, the Companion-task in this context could be to give instructions on how to configure the navigation system or on how to operate the system in order to find a particular way to a certain destination. In a similar fashion, the implementation of the complex Companion-task of “monitoring” would require a whole range of cognitive functions and competences to be realized across a variety of devices.

Fig. 1.3 Application perspectives of Companion-technology: the Companion-space

By providing a novel paradigm for the operation of and interaction with technical systems of any kind, Companion-technology addresses important societal concerns. As increasingly complex technical systems continue to find their way into ever more areas of our lives, the requirements placed on individual users when using these systems also increase. At the same time, developments in technology continue to open up new and unforeseen opportunities for technical support and digital assistance. In this field of tension, especially with regard to the future of our aging society, Companion-technology is poised to make further important contributions. The areas of potential application range from new types of individualized user assistance in operating technical devices, through new generations of versatile organizational assistants and electronic service providers, to innovative support systems, for instance for persons with limited cognitive abilities.

1.3 The Role of Knowledge

In order to function in a companionable manner, Companion-systems need to rely on comprehensive and multifaceted knowledge. Depending on the application domain at hand and the tasks to be performed, various kinds of knowledge are relevant. In order to provide this knowledge in a systematic way, knowledge bases of Companion-systems are structured along two lines. The static Knowledge Model holds a number of declarative models to supply the high-level cognitive functions of planning, reasoning, and decision-making. The dynamic World Model represents the current states of both the environment and the user and updates these states over time. Figure 1.4 shows the structure of these models.

Fig. 1.4 Knowledge architecture for Companion-systems

The knowledge model has three components: the ontology, the planning model, and the user model. The ontology represents static knowledge in terms of hierarchies of concepts and relations that characterize the application domain. The planning model holds the action portfolio. Single actions and entire courses of action describe the various possible ways of acting in the domain in order to achieve certain goals or accomplish certain tasks.

The user model enables the individualized and user-adaptive functionality of Companion-systems. It includes profiles indicating a user’s technical knowledge level, his or her expertise with respect to the system’s functionality, and preferences regarding ways of acting and interacting. Furthermore, information on the user’s personality, abilities, general disposition, and motivational background is provided, as are individual emotion patterns that help to assess the user’s current emotional state when certain emotion parameters are sensed.

In order to ensure an effective use of the static knowledge as a whole, the various models are properly synchronized. This includes the coherent naming, the implied semantics, and the use of concepts and relations, which have to be established by a co-ordinated, tool-supported construction and maintenance of these models.
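A minimal sketch may illustrate what such a synchronization check could look like. It assumes, purely for illustration, that the ontology is a set of concept names and that the planning model and the user model refer to concepts by name; the class `KnowledgeModel` and the function `undefined_concepts` are hypothetical and do not describe the tools actually used.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeModel:
    """Hypothetical container for the three static models."""
    ontology: set = field(default_factory=set)          # concept names
    planning_model: dict = field(default_factory=dict)  # action -> concepts it mentions
    user_model: dict = field(default_factory=dict)      # profile attribute -> concept


def undefined_concepts(km):
    """Return concepts used by the planning or user model but missing in the ontology."""
    used = set()
    for concepts in km.planning_model.values():
        used.update(concepts)
    used.update(km.user_model.values())
    return used - km.ontology


km = KnowledgeModel(
    ontology={"Cable", "Device", "Expert"},
    planning_model={"connect": {"Cable", "Device"}},
    user_model={"expertise": "Novice"},  # "Novice" is not declared in the ontology
)
missing = undefined_concepts(km)
```

A non-empty result signals a naming inconsistency that a modeler would have to resolve before the models are used together.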

The dynamic world model reflects the current states of the application and the environment, the user’s emotional, dispositional and motivational situation, and their development over time. Beyond that, the world model embodies the connection between the sub-symbolic processing of signal streams from various sensors, which collect audio, visual, and physiological data, and the inference-based information processing on the symbolic level. It consists of a Markov Logic Network and a multi-layered Markov-Model. The network encodes rules that are derived from the symbolic knowledge model and represent relevant information about the user, the application, and the environment. Its role is twofold. On the one hand, it enables the multi-layered Markov-Model that analyzes and interprets sensor data to put the recognized data in context, thereby improving the quality of recognition results. On the other hand, perception can be initiated and guided this way. This is particularly important when active sensing is required to support higher-level decision making on the symbolic level or when recognition results are ambiguous and need to be specified.

Figure 1.4 presents the knowledge architecture and the processing of knowledge in Companion-systems (see also Chap. 2). There is a close mutual interaction and exchange of information not only between the static and dynamic models, but also between the functional components such as Planning and Acting and Interaction and Dialog and the models. Once individualized assistance is requested, information from the user model serves to configure the planning component as well as the interaction and dialog components by providing respective user information. This way, it is guaranteed that the functional behavior of the system, its dialog strategies, the modalities, and media for interaction are geared to the needs of the particular user.

The sensor data processing modules recognize current parameters of the user and the environment over time. This information is further processed and combined with input from the plan execution, dialog, and interaction components. It leads to declarative descriptions of the environmental state and the user situation, which are stored and continually updated within the world model, thereby enabling the system to immediately react to changes of the state and the user situation.

Initially, the entire knowledge model is set up by a modeler. The knowledge stored in this model is not genuinely static, however. It may change in the long run and therefore needs to be updated from time to time. As far as the user model is concerned, this process is supported by the system itself. The world model stores a history of the user’s reactions to the system’s actions as well as to its interaction behavior. If for quite a number of episodes it is observed that the user behavior deviates from what is known about his or her expertise or preferences, for example, the user model is updated accordingly.
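The update rule described above can be sketched as follows. The window size, the threshold, and the class name `ExpertiseTracker` are assumptions made for this example; the chapter does not specify how many deviating episodes trigger an update.

```python
from collections import Counter, deque


class ExpertiseTracker:
    """Hypothetical tracker that revises one user-model entry from observed episodes."""

    def __init__(self, assumed_level, window=5, threshold=4):
        self.level = assumed_level           # current entry in the user model
        self.history = deque(maxlen=window)  # most recent observations (world model)
        self.threshold = threshold           # deviating episodes needed for an update

    def observe(self, observed_level):
        """Record one episode and update the user model if deviations accumulate."""
        self.history.append(observed_level)
        deviating = [lvl for lvl in self.history if lvl != self.level]
        if len(deviating) >= self.threshold:
            # Adopt the most frequently observed deviating level.
            self.level = Counter(deviating).most_common(1)[0][0]
            self.history.clear()
        return self.level


tracker = ExpertiseTracker("novice")
levels = [tracker.observe("expert") for _ in range(4)]
```

With the default values, three deviating episodes leave the user model unchanged and the fourth one triggers the update.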

1.4 Planning and Decision Making

Companion-systems provide their technical functionality in a way such that each user is served individually, according to his or her specific needs, requirements, abilities, expertise, and current situation. This demands flexibility from the system’s functional behavior, its responsiveness, and its ability to reason and reflect on its own behavior as well as on the user’s reaction. Companion-systems meet these requirements by being provided with the high-level cognitive abilities of planning, reasoning, and decision making. Here, Artificial Intelligence planning technology [7] plays an essential role. Based on declarative descriptions of states, actions, and tasks, it allows for the construction of plans—courses of action—that are appropriate to reach a specified goal or accomplish a certain task.

Plans serve different purposes, depending on the particular application and the current Companion-task. They may be executed automatically to control a technical system directly; they may be used to instruct a human user on how to operate a technical system; or they may function as a guide when assisting a user in the accomplishment of a complex task (cf. Chaps. 5 and 6). Furthermore, the ability to plan enables Companion-systems to generate plans of action in close co-operation with a user by following a mixed-initiative strategy (see Chap. 7). Figure 1.5 shows an application example, where a Companion-system and a user co-operatively develop a workout plan.

Fig. 1.5 A Companion-system and a user co-operatively generate a workout plan

Basically, plans are generated through causal reasoning. Actions are described by pre- and postconditions. The preconditions indicate in which states the action is applicable; the postconditions specify the effects of the action, i.e. the state changes it raises. Starting from a given goal and a description of the initial state, the planning component of the system selects appropriate actions from the action portfolio of the planning model. Appropriate actions are those whose effects coincide with the goal or with subgoals. Subgoals are preconditions of actions that in turn enable the execution of actions relevant to achieve the goal. By (partially) ordering the actions according to their causal dependencies, a plan is automatically generated. Executing this plan in the initial state finally leads to the goal. Action selection and the ordering of actions are determined by planning strategies and heuristics. They account for the system’s ability to show a functional behavior that is customized to the individual user, his or her personal situation, and the current environmental situation.
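The notions of preconditions, effects, initial state, and goal can be made concrete with a small sketch. Note that it deliberately swaps in a plain breadth-first forward search over states instead of the partial-order, heuristic-guided causal reasoning described above; the action names and the `Action` type are hypothetical.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    pre: frozenset     # facts that must hold before execution
    add: frozenset     # facts made true by the action
    delete: frozenset  # facts made false by the action


def plan(initial, goal, actions):
    """Return a shortest action sequence from `initial` to `goal`, or None."""
    start, goal = frozenset(initial), frozenset(goal)
    queue, seen = deque([(start, [])]), {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:
            return steps
        for a in actions:
            if a.pre <= state:  # action is applicable in this state
                successor = (state - a.delete) | a.add
                if successor not in seen:
                    seen.add(successor)
                    queue.append((successor, steps + [a.name]))
    return None


portfolio = [
    Action("switch_on", frozenset({"tv_connected"}), frozenset({"tv_on"}), frozenset()),
    Action("plug_cable", frozenset({"have_cable"}), frozenset({"tv_connected"}), frozenset()),
]
steps = plan({"have_cable"}, {"tv_on"}, portfolio)
```

Here `tv_connected` plays the role of a subgoal: it is a precondition of `switch_on` and is established by `plug_cable`, which therefore has to be ordered first.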

An Artificial Intelligence planning approach particularly well suited for Companion-systems is Hybrid Planning, which combines causal reasoning with reasoning about hierarchical dependencies between actions (cf. Chap. 5). Here, the planning model distinguishes between abstract and primitive tasks. Primitive tasks are actions that can be executed immediately, while abstract ones have to be refined over a cascade of hierarchy levels. For each abstract task the model provides one or more methods for refinement. A method represents a course of abstract and/or primitive tasks suitable to accomplish the respective abstract task. This way, predefined standard or individual solutions for problems and tasks can be specified in the planning model. This provides even more flexibility for the planning component of a Companion-system. It can decide to just use a predefined standard plan and thus speed up its response time, for example; it can modify a standard plan to meet specific user requests; or it can build a completely new plan from scratch.
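The refinement of abstract tasks by methods can be sketched in a few lines. The example covers only the hierarchical part of Hybrid Planning and omits causal reasoning entirely; the task names, the `methods` table, and the `choose` hook are hypothetical, and the method table is assumed to be free of cycles.

```python
def decompose(task, methods, choose=lambda task, options: options[0]):
    """Refine `task` into a sequence of primitive tasks.

    `methods` maps each abstract task to a list of alternative refinements.
    A task without an entry is treated as primitive. `choose` selects one
    refinement; by default the first one, standing in for a predefined
    standard solution.
    """
    if task not in methods:
        return [task]  # primitive task: directly executable
    result = []
    for subtask in choose(task, methods[task]):
        result.extend(decompose(subtask, methods, choose))
    return result


methods = {
    "set_up_home_theater": [["connect_devices", "switch_on_tv"]],
    "connect_devices": [["plug_hdmi_cable"], ["plug_scart_cable"]],
}
standard_plan = decompose("set_up_home_theater", methods)
individual_plan = decompose("set_up_home_theater", methods,
                            choose=lambda task, options: options[-1])
```

Passing a different `choose` function is one simple way to picture how a user-specific strategy could select an individual solution instead of the standard one.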

Based on the plans of action a Companion-system creates and uses for support, feedback on the appropriateness of the system’s functional behavior can find its way back into the underlying model. If it turns out, for example, that users regularly change their strategy of action or deviate from the procedures the system proposes, a careful analysis of this behavior may induce a modification of the planning model or the underlying user models, respectively. Chapter 8 discusses the issues of strategy change from a neuro-biological perspective.

One of the most prominent proficiencies that distinguishes Companion-systems from conventional technical systems as well as from today’s cognitive systems is the ability to explain their own behavior. This ability is essential for implementing the Companion-characteristics of competence and trustworthiness. Explanations of the system’s operations or the instructions for action it presents to the user are automatically generated by deriving and verbalizing information about causal and hierarchical relationships between actions. This information is obtained by analyzing the underlying plan of action and its generation process. The plan explanation technique is introduced in Chap. 5, whereas Chap. 7 presents a most useful combination of plan and ontology explanations.
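How a causal relationship between actions can be turned into a verbal explanation is sketched below. This is a strongly simplified illustration and not the plan explanation technique of Chap. 5: it looks only for the first later action that consumes an effect, ignores hierarchical relationships and intervening deletions, and uses a hypothetical plan representation.

```python
def explain(step, plan, goal):
    """Justify `step` by an effect that a later action or the goal requires.

    `plan` is a list of (name, preconditions, effects) tuples in execution
    order, with preconditions and effects given as sets of facts.
    """
    names = [name for name, _, _ in plan]
    index = names.index(step)
    effects = plan[index][2]
    for name, pre, _ in plan[index + 1:]:
        shared = sorted(effects & pre)
        if shared:
            return (f"'{step}' is needed because it achieves {shared[0]}, "
                    f"which '{name}' requires.")
    shared = sorted(effects & goal)
    if shared:
        return f"'{step}' is needed because it achieves the goal {shared[0]}."
    return f"No causal justification found for '{step}'."


theater_plan = [
    ("plug_cable", {"have_cable"}, {"tv_connected"}),
    ("switch_on", {"tv_connected"}, {"tv_on"}),
]
text = explain("plug_cable", theater_plan, {"tv_on"})
```

Chaining such justifications from an action up to the goal would yield a longer explanation of why the system recommends that action.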

Another essential functionality of Companion-systems is to adequately react if the execution of a plan fails. The reasons for an execution failure can be manifold and need to be ascertained carefully. To this end, information from various sources is used. It includes sensed data provided via the dynamic world model and information obtained through a multi-modal dialog with the user. Depending on the reason for failure the user is accordingly instructed and the plan is automatically repaired so as to provide a way out of the failed situation and to finally reach the original goal.

Chapter 24 describes a prototypical Companion-system where the functionalities of plan generation, plan explanation, and plan repair are integrated with components for multi-modal user interaction and dialog. This system provides advanced assistance to users in the task of setting up a complex home theater.

1.5 Interaction and Dialog

A main asset of Companion-systems is their dialogic nature. This characteristic reaches far beyond simple slot-filling interaction and entails sustaining the dialogue with the user and conducting meta-dialogs.

Researchers therefore investigate the cognitive abilities that determine the design of the interaction and dialog between a human user and a technical system. Humans interact with their environment in multiple ways and, in doing so, they may use almost all of the senses, cognitive abilities, and motor skills available. Consequently, a Companion-system, as a peer communication and interaction partner to the human, is able to interact with its users through different modalities and a variety of input and output devices [8, 9], cf. Chap. 10. Modalities and media are determined according to the current situation and the individual user model that indicates the user’s interaction preferences, cf. Chap. 11. This addresses the Companion-characteristics of individuality and adaptability. A prominent example is information seeking behavior, cf. Chap. 3.

Small latency in interaction is vital to ensure availability and cooperativeness of the perceived interaction. In a functional imaging study, it was observed that an unexpected delay of feedback by only 500 ms has an equally strong effect on brain activation as a complete omission of feedback [11]. Hence additional neural resources are needed in such potentially irritating situations, which also leads to further cognitive load and therefore should be avoided.

Understanding the interaction between a user, or multiple users, and a Companion-system as an adaptive dialogue is the natural choice, as it is made up of a sequence of consecutive interaction steps, including meta-dialogues when the train of mutual understanding is interrupted (cf. Chap. 9). An example of such a scenario is shown in Fig. 1.6. This cumulative interaction structure forms the basis for the determination of user intentions by the Companion-system. Under laboratory conditions, this calls for the development of an experimental paradigm involving the interaction history and presenting dedicated and reproducible stages of interaction, as presented in Chaps. 12 and 13. In a very practical industrial setting, Companion-systems have been used and evaluated as machine–operator assistance systems (Chap. 23).

Fig. 1.6 Users interacting with each other and with a Companion-system

For an effective and constructive dialog, the system is not only able to recognize the current dialog situation and user’s disposition, but can choose among various strategies to keep the dialog going. Therefore, Companion-systems change the interaction strategy in the course of action, leading to evolving search user interfaces (cf. Chap. 4). Neurobiological fundamentals of strategy change are a basis for understanding and designing the dialog accordingly, see Chap. 8.

In this context, the Companion-characteristic of trustworthiness is of particular importance. One means to show trustworthiness is the ability to conduct explanation dialogs [14], i.e., a Companion-system is able to explain its own behavior and the situational circumstances that can be considered as a cause (cf. Chap. 7). The nature and effect of such explanatory interventions can be measured by comparison with non-intervening situations, which was the subject of large field studies [12], cf. Chap. 13. Here, the focus was on identifying strategies that avoid mistrust and resistance. A main aspect was to investigate which intentional stance the user ascribes to the Companion-system.

1.6 Recognizing Users’ Situation and Disposition

To ensure that the functionality of Companion-systems is customized to the individual user, adapting to his or her emotional state and current behavioral disposition, a pivotal facet consists of the cognitive abilities of perception and recognition of the users’ situation and disposition. The technology must be able to recognize and appropriately interpret any relevant changes in the environmental conditions as well as the user’s state on a continuous basis.

Changes in behavioral disposition and emotion occur in various ways, hence a wide range of parameters are used to detect them. They include prosodic and linguistic characteristics (Chap. 20), articulated motion (Chap. 17), head and body positioning and gestures (Chap. 16), facial expressions (Chap. 18), as well as psychobiological data. In total, a fully multimodal interpretation [17, 18] of the situation is required, see Chaps. 10 and 19. The dynamic evolution and prediction of emotions, dispositions, and moods is best captured under modeling hypotheses as detailed in Chaps. 4 and 21.

It is vital that the multi-modal recognition processes include location and time components, take into account the operational context and consider background information. The latter includes, among other things, typical behaviors and emotional patterns of the individual users and their environmental and situative disposition. Interactions between users and objects are modeled on the basis of the knowledge base of the Companion-system (Chap. 15) and serve as an environmental perception system. The environmental conditions of the user and the specific user parameters are then captured reliably and dynamically, interpreted and subsequently transformed into a total state description in a cascade of recognition and fusion processes (Chap. 19). In a dedicated demonstration scenario of a ticket vending task, the interplay of the various modalities and the subsequent information fusion aspects have been carefully studied (Chap. 25). It was revealed how stepwise dialogs are sensitive and adaptable within processing time to signals and background data, resulting in a user-adaptive and very efficient Companion-system.
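The idea of fusing the results of several recognizers into one state description can be illustrated with a reliability-weighted late fusion. This is a generic textbook scheme chosen for brevity and not the fusion architecture of Chap. 19; the modality names, labels, probabilities, and weights are invented for the example.

```python
def fuse(predictions, reliabilities):
    """Combine per-modality label distributions into one distribution.

    `predictions` maps modality -> {label: probability}; modalities whose
    sensor currently delivers no data are simply absent. `reliabilities`
    maps modality -> non-negative weight (assumed to be given).
    """
    fused, total = {}, 0.0
    for modality, distribution in predictions.items():
        weight = reliabilities.get(modality, 0.0)
        total += weight
        for label, probability in distribution.items():
            fused[label] = fused.get(label, 0.0) + weight * probability
    if total == 0:
        raise ValueError("no reliable modality available")
    return {label: score / total for label, score in fused.items()}


predictions = {
    "speech": {"frustrated": 0.7, "neutral": 0.3},
    "face": {"frustrated": 0.4, "neutral": 0.6},
}
state = fuse(predictions, {"speech": 1.0, "face": 1.0})
```

Because absent modalities are skipped and the weights are renormalized, the same function still yields a distribution when one sensor drops out temporarily.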

Realization of Companion-systems must be based on real-world situational aspects and emotional processes in interactions between humans and computers, and it must make available system elements for realization of these effects. This is achieved through investigation and provision of decision-relevant and actionable corpora. The experimental settings must include non-linguistic human behaviors, which are induced by a natural language dialog with delay of the commands, non-execution of the command, incorrect speech recognition, offer of technical assistance, lack of technical assistance, and request for termination and positive feedback [20]. Data acquisition is designed in such a way that many aspects of User-Companion interaction that are relevant in mundane situations of planning, re-planning, and strategy change (e.g. conflicting goals, time pressure, …) are experienced by the subjects, with huge numbers and ranges in quality of recorded channels, additional data from psychological questionnaires, and semi-structured interviews [16] (Chap. 13). Established Wizard-of-Oz techniques as well as fully or semi-automated interactions have been employed, leading to general insights into the design and annotation of emotional corpora for real-world human-computer-interaction [10]. As multi-modal annotation is a novel and demanding task, software support systems such as ATLAS and ikannotate have been developed and tested (Chap. 19).

Eminently, corpora are a rich source for studying general feedback, planning and interaction activities in multiple modalities in real-world Human-Machine Interaction (HMI), see Chap. 14. Main assets of data for designing Companion-systems are elaborate hardware synchronicity over many modalities recorded in multiple sensory channels, and a setup with dedicated and standardized phases of subject-dispositional reactions (interest, cognitive underload and cognitive overload) as well as standardized HMI-related emotional reactions (such as fear, frustration, joy). Figure 1.7 shows such a multi-sensorial setup of a data recording during natural interaction. Corpora entailing these standards with up to ten modalities have been realized [18]. A careful system design serves as a model architecture for future Companion-systems, as detailed in Chap. 22.

Fig. 1.7 Multi-sensorial setup of a data recording where a user interacts with a Companion-system