Introduction

The study of nonverbal communication has grown rapidly over the last several decades. Thousands of empirical studies have focused on the role of nonverbal communication in person perception, emotions, altruism, liking and love, aggression, prejudice and discrimination, influence, and other social processes (see handbooks by Hall and Knapp 2013; Matsumoto et al. 2016; Manusov and Patterson 2006). And there has been a long history of theories addressing different aspects of nonverbal communication. Some of these theories have focused on the give-and-take of nonverbal communication in face-to-face interactions. That is, how do we explain and predict the course of nonverbal interactions? In fact, it is now over 50 years since Argyle and Dean (1965) published their seminal theory of nonverbal exchange. Their affiliative-conflict theory (aka equilibrium theory) laid the foundation for subsequent formulations (Patterson 2006).

The present article builds on the extensive empirical and theoretical work in nonverbal communication to outline a systems model of dyadic nonverbal interaction. More specifically, this effort is designed to provide a broad framework for identifying and describing the processes and factors shaping the course of dyadic nonverbal interactions. Although some of these issues have received brief, earlier treatments (Patterson 2013, 2014), this article elaborates on the relevant processes and integrates them into a comprehensive systems model. Because this new model builds on the previous theoretical work, it is helpful to examine how the evolution of earlier theories provided the foundation for this systems model.

Changing Theoretical Landscape

Early Theories

Empirical research on nonverbal communication grew rapidly in the 1950s and 60s. Most of the published studies focused on single behaviors in isolation. That is, some studies measured or manipulated gaze, others interaction distance (personal space), and still others facial expression or touch. Often, this research focused on the effects of culture, gender, or personality on a specific behavior (Patterson 2014). Thus, this kind of work examined group and individual differences on specific nonverbal behaviors. Argyle and Dean (1965) were the first researchers, however, to take an interactive perspective on the role of multiple behaviors in their equilibrium theory of interpersonal intimacy.

Equilibrium theory proposed that individuals signaled their intimacy with a partner through several behaviors, including distance, gaze, smiling, and self-disclosure. The core prediction of equilibrium theory was that, if there were a deviation from the comfortable level of behavioral intimacy in a particular relationship, one person (or potentially both) would compensate by adjusting one or more of the component behaviors to restore the equilibrium. For example, a stranger who approached too closely might lead the partner to reduce both gaze and smiling. In this way, the behavioral intimacy between the partners moves toward equilibrium. Although there was considerable empirical support for equilibrium theory (Patterson 1973), the theory was unable to explain situations in which partners did not compensate for changing nonverbal intimacy, but reciprocated instead. For example, good friends or lovers often match or intensify the partner’s initial close approach, gaze, and touch. This issue prompted new theories that could explain both compensation and reciprocation.

By the 1970s, the cognitive revolution in psychology was beginning and its impact was evident in developing theories of nonverbal interaction. First, intimacy-arousal theory (Patterson 1976) proposed that a substantial change in one person’s behavioral intimacy precipitated an arousal-attribution process (e.g., Schachter and Singer 1962). Specifically, if a partner’s change in nonverbal intimacy were sufficient to precipitate arousal, then a cognitive labeling or attribution process identified the type of emotional experience. Negative affect, such as anxiety or fear, led to compensation, whereas positive affect, such as liking or love, led to reciprocation. Next, expectancy-violations theory proposed that, when a partner violated expectations with too much or too little behavioral intimacy, arousal was precipitated, leading to an evaluation of the violation (Burgoon 1978). Positive evaluations produced reciprocation of the partner’s behavior, whereas negative evaluations produced compensation for the partner’s behavior.

A different approach to expectations was advanced in discrepancy-arousal theory (Cappella and Greene 1982). According to Cappella and Greene, the kinds of cognitive processing described in both the intimacy-arousal and expectancy violations theories were too slow to account for the almost immediate behavioral adjustments people made in response to a partner’s behavioral change. Rather, arousal alone, determined by an immediate discrepancy evaluation between the expected and actual behavior of the partner, precipitated an automatic sequence determining compensation or reciprocation. Specifically, as the discrepancy between expected and observed behavioral intimacy increased, so did arousal. For example, moderate discrepancy precipitated moderate levels of arousal that, in turn, were experienced positively, leading to reciprocation. In contrast, high discrepancy precipitated high levels of arousal that were, in turn, experienced negatively, leading to compensation. In this way, Cappella and Greene’s rapid mediating processes anticipated the emphasis on automaticity of social behavior that became prominent in the 1990s.
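The decision sequence described in discrepancy-arousal theory can be made concrete with a minimal sketch. This is only an illustration under stated assumptions, not Cappella and Greene's own formalization: the 0-to-1 intimacy scale, the two numeric thresholds, and the "no change" outcome for small discrepancies are hypothetical and do not come from the theory as summarized above.

```python
from enum import Enum


class Adjustment(Enum):
    RECIPROCATE = "reciprocate"
    COMPENSATE = "compensate"
    NO_CHANGE = "no change"


# Hypothetical cut-offs on an assumed 0-1 intimacy scale; the theory as
# summarized in the text gives no numeric values.
MODERATE_THRESHOLD = 0.2
HIGH_THRESHOLD = 0.6


def predicted_adjustment(expected: float, observed: float) -> Adjustment:
    """Map the gap between expected and observed intimacy to a predicted response."""
    # Arousal is assumed to rise with the size of the discrepancy.
    discrepancy = abs(observed - expected)
    if discrepancy >= HIGH_THRESHOLD:
        # High arousal, experienced negatively, leads to compensation.
        return Adjustment.COMPENSATE
    if discrepancy >= MODERATE_THRESHOLD:
        # Moderate arousal, experienced positively, leads to reciprocation.
        return Adjustment.RECIPROCATE
    # Assumption: a small discrepancy produces no marked adjustment.
    return Adjustment.NO_CHANGE
```

Under these assumed values, a partner who is somewhat more intimate than expected, such as predicted_adjustment(0.5, 0.8), is predicted to be met with reciprocation, whereas a much larger gap, such as predicted_adjustment(0.1, 0.9), is predicted to produce compensation.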

All of these early theories were, however, distinctly limited in three important ways. The first, and most basic, limitation was that the theories failed to address the broader utility of each person’s nonverbal behavior in interaction. That is, they focused only on how partners managed comfortable levels of nonverbal involvement through patterns of compensation and reciprocation. In other words, how do partners negotiate appropriate nonverbal involvement in their interactions? Expressing and managing nonverbal involvement is one issue in interactions, but only part of the adaptive utility in the nonverbal give-and-take between partners.

A second limitation flows from the limited scope of the early theories. Specifically, all of the theories were affect-driven. That is, the valence of experienced affect determined the adjustment to a partner’s change in nonverbal involvement. Positive affect precipitated reciprocation of the partner’s behavior, whereas negative affect precipitated compensation. But occasionally, people behave in a fashion inconsistent with their underlying affect because it is useful to do so. For example, the demanding boss might approach closely, gaze intently, and put a hand on the employee’s shoulder as a “helpful reminder” that the report is due later in the day. The target of such a tactic might feel quite uncomfortable but still reciprocate by returning a gaze and smile, while reassuring the boss that the report will be finished on time. This kind of nonverbal adjustment would run counter to the compensation predicted by the early theories.

Finally, all of the theories were reactive in nature. That is, they attempted to explain and predict how one person might make a nonverbal adjustment in response to a partner’s behavior. Consequently, they started with a given behavior pattern of one person and proposed mediating processes that determined the partner’s nonverbal adjustment. Thus, the theories did not address the broader utility of initiating nonverbal interaction.

Functional Approach

To address the limitations of the early theories, the functional model of nonverbal behavior (Patterson 1982, 1983) proposed that people initiate patterns of nonverbal interaction to serve different social functions. That is, the focus of the functional model was on how people initiated patterns of nonverbal behavior and sought to modify their partners’ behavior to achieve those goals. For example, several basic functions of nonverbal behavior were proposed, including the following: (1) providing information, (2) regulating interaction, (3) expressing intimacy, (4) exercising influence, and (5) managing impressions. Thus, people are doing more than managing their levels of nonverbal involvement in interactions. They are behaving, consciously or unconsciously, to achieve specific goals in interaction.

In addition, the functional model proposed that biology, culture, gender, and personality are all determinants of our patterns of interaction. In emphasizing that individuals both initiate behavior and react to the behavior of their partners, this model moved beyond simply predicting reactive patterns of compensation or reciprocation (Patterson 1982, 1983). Instead, the outcome metric in the functional model was the stability of nonverbal exchange. In general, as interaction partners were more similar in biological hardwiring, cultural background, personality, and immediate goals, they were more likely to share common expectancies and behavioral predispositions that facilitated increased stability.

A different framework, enlisting a kind of functional emphasis, was proposed in interaction adaptation theory (Burgoon et al. 1995). In this theory, biological drives and needs, rather than explicit goals, were central determinants of nonverbal behavior. In addition, past experience, individual characteristics, and expectancies about the partner shaped the dominant behavioral predispositions. But over the course of an interaction, individuals also adapted their behavior to that of their partners. Subsequent behavioral adjustments, either reciprocating or compensating for the partner’s behavior, reduced the behavioral discrepancy between partners and facilitated stability in the interaction.

All of the theories discussed up to this point focused on explaining the behavioral side of nonverbal communication in interactions. One functionally-based theory addressed the receiving or social judgment side of nonverbal communication. The ecological theory of social perception (McArthur and Baron 1983) proposed that many rapid judgments of others focused on social affordances. Specifically, facial appearance provided important information about other people. These rapid judgments were selected over the course of evolution because they were relatively accurate and adaptive for survival of the species. For example, the baby face characteristics of infants—relatively large forehead, big eyes, and rounded chin—signal dependence and a need to be nurtured to adult caretakers (Zebrowitz 1997). The functional approach of these newer theories addressed some important limitations of the earlier theories. Nevertheless, dramatic advances in social cognition research and theory led to a new direction in explaining the complexities of nonverbal interactions.

Parallel Processes

In psychology, the dominance of social cognition research and theory in the 1980s and 1990s shifted attention away from social behavior to covert cognitive processes and social judgments. By the mid-2000s, the trend over time was evident in the types of articles published in major journals, such as the Journal of Personality and Social Psychology and Personality and Social Psychology Bulletin. Specifically, the percentage of studies measuring or manipulating behavior in these two premier journals dropped precipitously from the mid-1960s to the mid-2000s (Baumeister et al. 2007; Patterson 2008). Ironically, however, this new emphasis on social cognition was clearly relevant for nonverbal communication research and theory. Although social cognition researchers typically did not use the term “nonverbal communication,” one major focus of their work (how impressions were formed from appearance and behavior) coincided with the receiving side of nonverbal communication. Furthermore, there was a growing recognition of the ubiquity, and often primacy, of automatic social judgments (e.g., Bargh 1989, 1990; Gilbert and Krull 1988) over the controlled, effortful ones that were the focus of earlier attribution theories (e.g., Jones and Davis 1965; Kelley 1967). It is also important to appreciate that these automatic judgments were not considered arbitrary, but rather, relatively accurate and adaptive (Fiske 1992; McArthur and Baron 1983). The utility of rapid judgments was reinforced in the growing research on “thin slices of behavior” (Ambady and Rosenthal 1992).

Whereas social cognition theories focused on the receiving side of nonverbal communication, the interaction theories, discussed in the last section, focused primarily on the sending or behavioral side of nonverbal communication. It became apparent to me that a comprehensive theory of nonverbal interaction required an integrated view of both sending and receiving, one that provided for automatic and controlled processes in both tracks. The parallel process model of nonverbal communication was an attempt at such an integration (Patterson 1995). The parallel process model combines the encoding and decoding processes of nonverbal communication into a single system, driven by a common set of determinants and mediators (Patterson 1995, 2001). The basic factors and processes of the model can be seen in Fig. 1. First, the determinants on the left side of Fig. 1, borrowed from the functional model (Patterson 1982, 1983), identify the most important factors shaping habitual patterns in the sending and receiving of nonverbal communication. In effect, the combination of biological predispositions, culture, gender, and personality influences not only our choice of settings and partners, but also our expectancies, affect, goals, and behavioral tendencies in interactions. Goals are particularly important in directing the parallel processes and are a reflection of the functional basis of nonverbal communication.

Fig. 1. The parallel process model of nonverbal communication

The parallel processes are illustrated in the right side of Fig. 1, with the social judgment track (receiving) on the top and the behavioral track (sending) on the bottom operating simultaneously in interactions. In general, automaticity is the default setting for both behavioral and social judgment tracks. But depending on the availability of cognitive resources and the motivation to apply them, the social judgment and behavioral tracks can engage the more effortful, controlled processes in the middle of the model. Because the same pool of cognitive resources serves not only interaction processes, but also other everyday activities, resources applied to worrying about bills, children, or problems at work are not available for controlled sending and receiving processes. In turn, decreased availability of cognitive resources increases the probability of automatic sending and receiving processes. In addition, it is important to appreciate the interdependence of the two tracks because they are typically driven by common goals. Thus, the parallel process model represented the dynamic relationship between the goal-driven sending and receiving sides of nonverbal communication and the fluctuating role of automatic and controlled processes in the pursuit of goals. In spite of these improvements over earlier theories, the parallel process model was deficient in two important ways: (1) it did not adequately represent a dyadic-level analysis and (2) it lacked a framework that specified the broader environmental context for nonverbal interactions. Both of these issues are addressed in a systems approach to nonverbal interactions.

Systems Approach

The earlier models represented the factors and processes involved in nonverbal communication from the perspective of one person in an interaction. For example, in illustrating the parallel process model in Fig. 1, the context and determinant factors affecting a person might be identified on the left side of the figure, covert mediating processes in the middle, and outcomes (nonverbal behavior or social judgments) on the right side. Feedback loops, indicated by the lines with arrows, might link the outcomes back to changes in the mediators, and even to the determinants. Of course, this represents only one side of an interaction. It is assumed that a partner’s contribution to an exchange could be represented in a comparable, but separate, flow of factors, processes, outcomes, and feedback loops. But how do we represent the contributions of both individuals that capture the separate, yet coordinated, elements of nonverbal communication from each person?

In addition, earlier theories failed to consider adequately the important role of the environment in nonverbal interactions. That is, face-to-face interactions necessarily occur in a surrounding environmental context that helps to shape the course of interaction. This neglect of the environment’s influence is not surprising, given the cognitive emphasis in psychology over the last several decades. That is, the primary focus of research and theory was inward on the unseen cognitive processes, rather than outward on the reciprocal relationship between social behavior and the environment (Oishi and Graham 2010; Patterson and Quadflieg 2016). Thus, a second question provides a further focus in laying the foundation for this systems model: How do we integrate the central role of the environment in explaining nonverbal interactions? First, let us consider how the behaviors and covert processes of both parties in an interaction might be represented.

Dyadic Representation

The key to representing a dyadic framework lies in recognizing the complementary roles of both partners in interactions. That is, each person is simultaneously sending and receiving nonverbal signals. Thus, the appearance and nonverbal behavior in the sending track of Person 1 constitute the stimuli that register in the receiving track of Person 2. And in a similar fashion, the appearance and nonverbal behavior in the sending track of Person 2 constitute the stimuli that register in the receiving track of Person 1. It is important to recognize, however, that the interaction per se is behavioral. Although cognitive processes are important, partners do not interact with their cognitions and social judgments. With this foundation, the core dyadic processes of this systems model can be seen in Fig. 2. Specifically, the perceptual processes of each person refer to the reception of incoming information from the partner’s appearance and behavior, that is, the receiving track. In turn, the appearance and behavior of each person constitute the sending track. The simultaneous coordination of both parties’ behavior is represented in the interaction cell. The particular outcomes in the core processes vary widely depending on what each person brings to the interaction and the important role of the immediate environment. These latter influences that shape the core processes might be called peripheral factors and processes. That is, in spite of their critical impact on the core processes, it might be useful first to describe them separately and then integrate their links to the core processes.

Fig. 2. Core dyadic processes

Before addressing the other components and processes in the model, let us take a closer look at the nature of an interaction. For most people, a face-to-face interaction means a conversational exchange. The sociologist Erving Goffman (1963) proposed, however, that conversations were only one form of interaction. He used the term focused interaction to identify these verbal exchanges because the conversation was the focus of the interaction. In contrast, Goffman emphasized that many interactions do not involve conversations. These unfocused interactions involve individuals who simply share a common setting and make mutual behavioral adjustments to the close presence of others. Unfocused interactions are common in a wide range of public and private social settings. For example, unfocused interactions are frequent in the aisles of a grocery store, in the checkout lane at a drugstore, choosing a seat on public transportation, or sharing an elevator ride with a stranger. One of the most common settings for unfocused interactions is sidewalks. Strangers approaching one another often make rapid changes in gaze direction and facial expression as they are passing one another (Patterson et al. 2002). In fact, for most people who live in relatively populated areas, unfocused interactions are probably more common than focused interactions. Thus, the interactions represented in the core processes involve both focused and unfocused interactions. They may be long-lasting or as fleeting as the two seconds of strangers approaching and passing one another on the sidewalk.

Power of the Environment

Prior to the ascendance of the cognitive paradigm, ecological psychology (e.g., Barker 1968; Wicker 1979) provided a perspective on the environment that is still relevant today for understanding social behavior (Stokols 2018). In general, ecological psychology emphasizes studying human behavior in its naturally-occurring physical and social environments. The basic assumption was that the immediate environment was central in shaping behavior. The construct of a behavior setting from ecological psychology is particularly relevant for the present systems model of nonverbal communication. A behavior setting is a bounded geographical area in which human and nonhuman (physical environment and social norms) components interact in a coordinated fashion to facilitate an ordered series of events and behaviors over a limited period of time (Wicker 1979, Chapter 1).

Most of our everyday activities can be viewed in terms of behavior settings. These might include a trip to the grocery store, a workout at your local gym, a church service on the weekend, or a parent-teacher meeting at your child’s school. In each of these settings, a specific physical environment combines with a set of social norms to constrain behavioral options. Thus, doing a “workout” at the church service would typically be censured. A given individual with particular personality characteristics and attitudes is an actor in a wide variety of settings, but her behavior would change across different settings (see Shanahan et al. 2014, for an example dealing with personality and context effects on health across the life-span). In addition, people select settings and settings select people. Differing interests, behavioral predispositions, and goals steer one person to attending a hockey game, another to a chess tournament, and a third to a church service. But settings can be selective of people with particular experiences, competencies, or even substantial resources. For example, in spite of my interest in attending Amazon’s corporate board meeting, there is a high probability that Amazon will not welcome me to the inner circle. Thus, the cumulative constraints of the physical environment, social norms, and complementary self- and setting pressures increase the likelihood that people within a given setting are more similar to one another than are those sampled from a wide range of settings.

An important complement to ecological psychology from the same era was the more focused analysis provided by Altman’s (1975) insightful book, The environment and social behavior: Privacy, personal space, territory, and crowding. Altman proposed that social behavior, much of it nonverbal in nature, was closely linked to a wide range of environmental features. The management of privacy was a central theme, with reciprocal processes linking verbal and nonverbal behavior on one side to architecture, design, and spatial configuration on the other. Furthermore, privacy was viewed not in black and white terms of strict avoidance and isolation, but rather as a continuum from low to high social contact. Over time, as individuals’ momentary preferences for privacy varied, they changed their verbal and nonverbal behavior and manipulated environmental features to achieve their momentary level of desired privacy. Thus, in this dynamic system, the environment and social behavior were interdependent in negotiating everyday life.

Specific physical features of behavior settings include more than the obvious constraints of available space and the design and arrangement of rooms. Many of these features are discussed at greater length in a recent chapter on the physical environment and nonverbal communication (Patterson and Quadflieg 2016), but a brief sample is appropriate here. First, ambient lighting in settings affects both social behavior and judgments of others. For example, dimmer lighting can lead to greater distance between strangers, but to closer, more intimate interactions between friends and lovers. Dimmer lighting can also decrease the visual detail in perceiving others and increase the probability of stereotyping, rather than seeing others as distinct individuals (Cloutier and Macrae 2007). Transparent glass barriers physically separate people, but have mixed visual effects. That is, people generally prefer the feeling of visual spaciousness that glass barriers provide but, at the same time, experience less privacy from surrounding others (Marquardt et al. 2015). Next, warmer (but not hot) temperatures increase perceived similarity with a stranger. Pleasant, citrus odors can facilitate trust and reciprocity between strangers (Liljenquist et al. 2010). Most people probably prefer acoustically-pleasant or quiet environments in interactions. But when people do have to interact in noisy environments, they are likely to move closer to compensate for the increased difficulty in understanding speech (Lloyd et al. 2009). And the physical traces and objects left by occupants of a setting provide information about their attitudes and interests (Gosling et al. 2008; Webb et al. 1966). In turn, this information affects visitors’ impressions and expectancies regarding the setting’s occupants. Thus, a wide range of ecological features affects the sending and receiving processes in nonverbal communication in subtle, and sometimes not so subtle, ways.

Finally, it is important to appreciate that the environment-behavior link is a two-way street—with feedback loops. The environment not only shapes nonverbal communication but, sometimes, people also act on their settings to effect changes in interaction processes. Later, we will see how various manipulations of our environments can affect subsequent interactions. Let us turn now to the peripheral components of the model, that is, the elements that provide the context for the core processes. Then the various components of the total system can be assembled and their dynamic links described.

Peripheral Components and Processes

For both partners, the primary determinants listed earlier (biology, culture, gender, and personality) have some influence in individuals selecting particular settings and pursuing specific social goals. But settings can also vary in their accessibility to different types of individuals. That is, settings can select individuals. Some settings, such as an expensive country club, are very exclusive, whereas other settings, such as your local grocery store or library, are much less exclusive. This mutual selection is represented in the bidirectional link between the determinant factors of each person and the setting in Fig. 3. Note that, in Fig. 3, links between components and processes that are primarily one-way are represented with arrows, whereas bidirectional links lack an arrow.

Fig. 3. Peripheral processes in the systems model

Next, goals are especially important, not only for the core dyadic processes but also in the selection of settings. It is rare that we are in a particular setting by pure chance. We routinely select settings in the pursuit of specific goals. The job interview requires going to the personnel manager’s office, whereas coffee-on-the-run means the drive-thru window at Starbucks. Sometimes the goals directing these choices are deliberate, but often such goals are activated and implemented without conscious awareness (Bargh et al. 2001). Furthermore, once people are in a specific setting, various setting characteristics (e.g., physical design, seating arrangements, acoustics, and lighting) can precipitate conscious and unconscious goals (Patterson and Quadflieg 2016).

The use of facial expressions to achieve social goals is the core process in Fridlund’s (1994, 2017) Behavioral Ecology View (BECV) of expressions. BECV stands in contrast to the cardinal assumption of Basic Emotions Theory (BET, Ekman 1999) that facial expressions spontaneously signal interior emotional states. That is, according to BECV, signaling that forecasts one’s next interaction move and so influences a partner to act in a desired way is much more salient and adaptive than signaling emotional states. For example, an “angry” face in BET is a deterrent in BECV—stop what you are doing, or else. A “happy” smile in BET is, in BECV, a signal to influence the partner to expect affiliation and cooperation from the signaler and to act accordingly. Thus, no link between emotions and facial expressions is assumed in BECV. Rather, the sending and receiving of facial expressions are the means through which partners “negotiate” what is likely to happen in the pursuit of their social goals. BECV is consistent with the proposed dynamics of the present systems model. Nevertheless, the emphasis on patterns of behavior, and not just isolated behaviors, in the systems model would extend the signaling of social goals beyond just facial expressions. Thus, the threat face is likely to be supplemented by increased body tension and staring at the partner. In other words, although facial expressions may be primary, they are still just part of a broader pattern of nonverbal signals operating in the service of social goals.

In addition, people can have multiple goals, for example, wanting to be liked by a partner, but also trying to influence the partner to a particular course of action. Our patterns of nonverbal communication may be constrained also by the broader meta-goals of efficiency and appropriateness in communication (Berger 1997, Chapter 2). The meta-goal of efficiency is similar to Fiske and Taylor’s (1995, Chapters 4–7) characterization of perceivers as “cognitive misers” relying on automatic judgments of others that minimize cognitive effort. Of course, efficiency also applies to the sending track, with automaticity dominating on the behavioral side too. The meta-goal of appropriateness is reflected in most people being constrained by social norms and customs in interacting with others, consistent with the power of settings. Finally, goals can change over the course of an interaction.

Next, cognitive-affective processes represent both what individuals bring to the interaction and, later, the processes activated in the interaction. The former processes might be described as residual or baseline, whereas the latter might be described as dynamic, that is, changing during the course of interaction. Each person enters the interaction setting with particular moods, attitudes, expectancies, and behavioral predispositions shaped by the determinants and past experience. In addition, the dynamic changes in cognitive-affective processes over the course of interactions are particularly important for determining adjustments in both behavior patterns and social judgments. These dynamic changes are elaborated in the next section. At this point, it is useful to appreciate that these dynamic processes can be automatic, controlled, or some combination of both. The automatic-controlled distinction is especially relevant for the last component—cognitive resources.

The availability of cognitive resources affects how each person’s contribution to the dyadic exchange evolves. A basic assumption here is that there is a finite pool of cognitive resources for managing all of our activities, whether they relate to communication or to other matters. In addition, the availability and activation of cognitive resources are influenced by goals and cognitive-affective processes, as seen in Fig. 3. The case of goals might be exemplified by a job interview, in which an explicit goal of impression management is likely to pull more cognitive resources than an implicit goal of having a casual lunch with a friend. Within this interview situation, negative thoughts and feelings are likely to draw more cognitive resources than are positive ones. Because these links are bidirectional, a lack of adequate cognitive resources can limit both a re-evaluation of goals and an elaboration of ongoing cognitive-affective processes.

Thus, these peripheral component processes and their relationships to one another, on the part of both individuals, set the stage for the dyadic interaction. In the next section, I discuss the links among these components and processes and how they combine to create the dyadic interaction. That is, the pieces of the systems model are assembled and what they imply for dyadic nonverbal communication is elaborated.

Dynamics of the Systems Model

In assembling the components of the systems model, a brief summary of the peripheral links is useful. Each person’s biology, culture, gender, and personality have some influence on the choice of settings, goals, baseline cognitive-affective processes, and behavioral predispositions. In turn, settings exert variable selection pressures on individuals who might enter them. That is, some individuals will gain access to particular settings and others will not. In addition, setting features and social norms can restrict behavioral options, activate and/or modify goals, and shape cognitive-affective processes. Furthermore, because the link between settings and goals is bidirectional, specific goals can lead to manipulating the setting design and features to facilitate achieving a goal. For example, an office holder can structure seating options for visitors that are more friendly and open, or ones that signal the office holder’s dominance (Patterson 2011, Chapter 3). Next, the links between cognitive resources and both the active goals and the ongoing cognitive-affective processes are bidirectional. That is, goals and cognitive-affective processes influence the availability of cognitive resources and, in turn, cognitive resources affect the potential elaboration of cognitive-affective processes and the modification of goals. These links are represented by bidirectional lines between the components in Fig. 4. Now, as individuals are “primed” by the coordinated peripheral processes, they share a common setting and the stage is set for the interaction. Let us return to the core dyadic processes and consider the unfolding interaction.

Fig. 4 The systems model of dyadic nonverbal communication

Perception is the Key

For the sake of simplifying Fig. 4, perceptual processes and cognitive-affective processes are collapsed into a single cell. Each person’s contribution to an interaction first requires some perceptual input from the other person. Typically, this would involve some relatively coincident appearance cues and behavior. It is possible, however, that appearance and behavior might register separately. When the unseen stranger bumps into you from behind, you notice the bump before you pull away and turn to see the offender. Of course, one might argue that any appearance input also involves some behavioral information.

In the typical case, the initial appearance and behavioral cues of one person register at roughly the same moment in the other person’s perceptual processes. This can be seen for both persons in the middle section of Fig. 4. But both parties’ perceptual processes are influenced by the cognitive-affective processes they bring to the interaction. For example, the socially anxious person is more likely to perceive ambiguous behavior from a partner as rejection than a non-anxious person would (Patterson and Ritts 1997). Someone prejudiced toward an out-group may be more likely to attribute anger to an out-group partner’s poker-face than would someone who is not prejudiced (Hugenberg and Bodenhausen 2003). Thus, the link between perceptual processes and cognitive-affective processes is bidirectional.

The utility of thinking in managing our social worlds is generally accepted. For example, Fiske (1992) echoed William James’s (1983/1890, pp. 959–960) observation that “thinking is for doing.” Nevertheless, even before “thinking” starts, we might also say that perception is for doing (e.g., Brunswik 1955; McArthur and Baron 1983). Consequently, before sequences involving both automatic and controlled cognitive processes are addressed, a simpler and more immediate interaction sequence merits attention. That is, there is substantial evidence for the sufficiency of perception in precipitating behavioral reactions (e.g., Bargh 1997; Bargh et al. 1996).

One explanation for this automaticity of behavior is a two-stage process, starting with the immediate perceptual categorization and interpretation of social and environmental stimuli (Chartrand and Bargh 1999). That is, some features of the environment and/or the appearance and behavior of another person trigger an automatic perception. In the second stage, the automatic perception precipitates a behavioral response, again, without any conscious effort or control. This automatic sequence has been described as a “perception-behavior expressway” (Dijksterhuis and Bargh 2001). Specifically, the appearance and behavior of each person register in the automatic perceptual processes of the partner. Then each person’s perception can be sufficient to precipitate other-directed behavior, as shown in the direct link between each person’s perceptual processes and behavior. In turn, the coordinated behavior of both parties is represented in the interaction cell. Thus, this sequence might be represented for each person as Perceptual Processes > Behavior > Interaction. An interesting and important example of this kind of automatic interaction is behavioral mimicry (Lakin 2006, 2013). In many situations, people subtly mimic a wide range of partner behaviors without conscious awareness. One suggestion for the neural basis of behavioral mimicry is the simultaneous activation of mirror neurons in the behavior by one person and its perception by an interaction partner (Rizzolatti and Craighero 2004). Nevertheless, over an extended interaction, the initiation of some cognitive-affective processes is likely.

Activating Cognitive-Affective Processes

Although perception alone often can be sufficient to precipitate behavior, there are other sequences that do engage some kind of cognitive-affective processing. Three relatively distinct routes might be suggested. First, cognitive-affective processes, either automatic or controlled, are activated in parallel (non-causal) to the perception-behavior link. That is, while the Perceptual Processes > Behavior > Interaction expressway is running at the speed limit, parallel automatic social judgments are formed in a Perceptual Processes > Cognitive-Affective Processes sequence. If controlled processes are activated, then the sequence would be supplemented by activating Cognitive Resources. Of course, even if such automatic social judgments do not affect the co-occurring automatic behavior, they may well affect a subsequent cycle of behavior directly, or indirectly, by sensitizing a person to selective aspects of new perceptual input.

In the second route, automatic cognitive processes mediate the perception-behavior link. The rapid nature of social judgments from minimal perceptual input has been documented for over two decades (Ambady and Rosenthal 1992). Many judgments from “thin slices of behavior,” lasting only a few seconds or less, are made reliably across a variety of personal traits and characteristics (Ambady et al. 2000; Carney et al. 2007). Facial appearance cues alone, often presented for only a fraction of a second, result in relatively accurate judgments of some personality traits, sexual orientation, and leadership (Re and Rule 2016). Such automatic judgments can be causal in directing a specific behavior pattern. This would be reflected in the following sequence: Perceptual Processes > Cognitive-Affective Processes > Behavior > Interaction. The third route, involving controlled processing, is modified by engaging Cognitive Resources to elaborate or to correct the initial cognitive-affective judgments. Thus, this sequence would be Perceptual Processes > Cognitive Resources + Cognitive-Affective Processes > Behavior > Interaction.
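For readers who find it easier to compare the three routes side by side, they can be written out as ordered lists of stages. The following is only an illustrative sketch: the stage names are taken from the sequences given in the text, whereas the route labels (`parallel`, `automatic_mediated`, `controlled`) and the helper functions are hypothetical names introduced here, not part of the model itself.

```python
# Illustrative sketch only. Stage names follow the sequences in the text;
# the route labels and helper names are hypothetical conveniences.

# Route 1: behavior follows perception directly, while a separate,
# non-causal judgment track runs alongside it.
PARALLEL_JUDGMENT_TRACK = ["Perceptual Processes", "Cognitive-Affective Processes"]

ROUTES = {
    "parallel": [
        "Perceptual Processes",
        "Behavior",
        "Interaction",
    ],
    # Route 2: automatic judgments mediate the perception-behavior link.
    "automatic_mediated": [
        "Perceptual Processes",
        "Cognitive-Affective Processes",
        "Behavior",
        "Interaction",
    ],
    # Route 3: cognitive resources are engaged to elaborate or correct judgments.
    "controlled": [
        "Perceptual Processes",
        "Cognitive Resources + Cognitive-Affective Processes",
        "Behavior",
        "Interaction",
    ],
}


def describe(route: str) -> str:
    """Return the route in the 'A > B > C' notation used in the text."""
    return " > ".join(ROUTES[route])


def judgments_mediate_behavior(route: str) -> bool:
    """True when cognitive-affective processes sit between perception and behavior."""
    return any("Cognitive-Affective Processes" in stage for stage in ROUTES[route])


if __name__ == "__main__":
    for name in ROUTES:
        print(f"{name}: {describe(name)} (mediated: {judgments_mediate_behavior(name)})")
```

Laid out this way, the only difference among the routes is what, if anything, sits between Perceptual Processes and Behavior; in the first route the judgment track exists but stays outside the causal chain.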

There are some circumstances that are likely related to a higher probability of automatic processes in interactions. For example, non-anxious individuals, people familiar with the setting, and those with more power are more likely to proceed on automatic pilot than are individuals with contrasting characteristics. To the extent that controlled processes are engaged, it is possible that the Cognitive Resources + Cognitive-Affective Processes > Behavior link might have a slightly longer latency (e.g., perhaps only a fraction of a second) and be less smooth than an automatic link. In turn, this might be experienced as a kind of instability at the Interaction level.

The suggestion that stability-instability provides a kind of outcome reading of the ongoing interaction does not mean that people are consciously aware of stability. Rather, stability typically may be assumed and individuals only become aware of instability when the Interaction sequence falls somewhere outside an expected range (Patterson 1982). This might involve a partner’s unexpected or puzzling behavior or even a person’s recognition of his own awkward behavior. Such an assessment is consistent with Berger’s (1997, Chapter 2) meta-goal of appropriateness in interactions. Although automatic perception of instability may lead to an automatic behavioral adjustment, some controlled assessment of the instability is also possible. And the adjustment sequences described here may not always bring individuals closer to their goals and facilitate stability. This kind of situation may require a reassessment of specific goals.

Modifying Goals

Initial goals of partners are not immutable; they may change throughout the course of an interaction. In addition, features of the environment can affect people’s expectations about others and the goals they pursue in interactions (Loersch and Payne 2011). If the kinds of adjustment sequences described in the last section are insufficient to achieve the initial goals and stability, individuals can readily modify their goals. For example, an employee seeking a raise might approach the boss with added courtesy and expressiveness. If the boss were immediately nonresponsive, the employee may decide quickly to moderate her approach and switch the topic to next week’s product-marketing agenda. Here, the employee set out to ask for a raise, and when that failed, she recovered by modifying her goal. This kind of exchange might be described in the following sequence, starting in the interaction cell of Fig. 4: Interaction 1 > Perceptual Processes 2 > Cognitive-Affective Processes 2 > Goal 2 > Behavior 2 > Interaction 2. Of course, this is still just one side of the interaction and the boss would typically, but not always, have more power in determining the course of the interaction. Over time, the interaction should move toward greater stability, or else become more susceptible to early termination (Patterson 1982).

A second example highlights the systems model’s emphasis on the environment. The earlier discussion of the power of the environment focused primarily on the effects of setting selection, social norms, physical design, and other setting features on interaction. Nevertheless, the environment-behavior link is a two-way relationship, as seen in Fig. 4. An earlier example described how an office holder might change seating options to facilitate friendly versus power-driven interactions. In such a case, these adjustments by the office holder would be anticipatory in nature, that is, made before the interaction begins. But manipulating the immediate environment also can occur mid-stream, in the course of an interaction. For example, after greeting a visitor and talking in the hallway of her home, the owner might suggest going into the kitchen for a cup of coffee. Nor would it be unusual for a dinner-party host to invite the guests to leave the dining room for the living room where they “can be more comfortable.” Further adjustments, such as reducing the volume on the TV, manipulating the lighting in the room, or changing the setting on the thermostat can affect in subtle ways the course of an interaction (Patterson and Quadflieg 2016).

Because the systems model and its dynamic relationships are complex and speculative in nature, its strengths and weaknesses merit further consideration. In addition, this model is relevant not only for interactions among humans but also for the rapidly emerging variations of “social” interactions that humans are starting to have with embodied computerized agents and other forms of AI. This issue is also addressed in the Discussion section.

Discussion

This systems model builds on both the earlier theories of nonverbal interaction and the extensive growth of empirical research over several decades. Complementary advances in theory and research have provided a foundation for elaborating the systems model. Among the critical factors and processes in this foundation are: (1) the role of biology, culture, gender, and personality in shaping patterns of nonverbal communication; (2) a focus on simultaneous, parallel behavioral and social judgment processes; (3) the dominance of automatic processes, complemented by conditional controlled processes; and (4) the importance of conscious and unconscious goals in nonverbal interactions (Patterson 1995, 2001). Despite these advances, there has been a failure to integrate these processes into a broader, ecologically-grounded framework—a systems model (Patterson 2014).

The two new, central features of the systems model are a focus on the dyadic level of interaction, rather than the individual level, and identifying the reciprocal relationships between the environment and interaction processes. These two features are consistent with a growing appreciation of the social ecology of behavior, typically neglected in the social cognition era in psychology (Oishi and Graham 2010; Patterson and Quadflieg 2016; Stokols 2018). Individuals are active in their selection of settings and their choices are influenced by their own characteristics and goals. Settings, however, also select individuals, and these complementary selection processes tend to increase the similarity between setting inhabitants. Thus, the intersection of two individuals is rarely by chance. Furthermore, the resulting interactions are understood best at a dyadic level that traces the separate, but complementary, processes of partners leading to face-to-face encounters. Although the interaction per se is behavioral, what directs the course of the exchange are perceptual and cognitive-affective processes, often automatic and outside of awareness. Setting design, other environmental features, and social norms constrain behavioral options and help to shape individual goals over time. In general, the course of interactions moves toward goal attainment and stability. But the failure to achieve goals and/or instability in the interaction increases the probability of both controlled cognitive and behavioral processes and the manipulation of setting features. Thus, the systems model attempts to describe and explain the dynamic interplay among individual, dyadic, and environmental processes in nonverbal interactions.

Specificity Versus Breadth

The earlier discussion of the evolution of interaction theories showed a clear pattern of increased breadth and complexity over time. That is, the early affect-driven theories were relatively focused in making fairly specific predictions for compensation versus reciprocity in response to a partner’s change in nonverbal involvement (Burgoon 1978; Cappella and Greene 1982; Patterson 1976). For a limited range of circumstances, the predictions worked relatively well. The early theories ignored, however, the broader utility of nonverbal communication, relied too much on affect in mediating nonverbal behavior, and focused only on reactive nonverbal adjustments to a partner’s behavior. Subsequent theories became more comprehensive by adding the influence of determinant factors, goals, parallel behavioral and social judgment tracks, and automatic versus controlled processes (e.g., Burgoon et al. 1995; Patterson 1982, 1995). This increasing breadth in the theories, however, made formulating simple, testable hypotheses more difficult.

As a more comprehensive explanatory theory, the systems model is even more vulnerable to the testability criticism. Indeed, the model is best understood as an explanatory framework under which more nuanced hypotheses can be formulated. That is, specific predictions can be generated by focusing on particular links in the model. For example, the greater the setting selection pressures, the higher the probability that setting users will share attitudes and expectations that facilitate stable interactions. Likewise, greater familiarity with the interaction setting and its related social norms should increase the probability of automatic behavioral and cognitive processes. In addition, the compatibility of partners’ goals should facilitate more stable interaction sequences. In turn, stable interactions should be characterized by more reciprocation (e.g., behavioral mimicry) and automaticity in behavior. In contrast, unstable interactions are likelier to be characterized by compensation and a higher probability of controlled social judgments and behavior. Of course, such controlled processes are dependent on the availability of adequate cognitive resources and the motivation to apply them (Gilbert and Krull 1988). These are all examples of testable hypotheses that might be derived from the systems model. Because this is a dyadic model, these hypothesized links might be examined through the application of new approaches to dyadic data analysis (Lederman and Kenny 2017).

Mobile Devices

Foundational theories of nonverbal interaction could not anticipate the rapidly-growing role of digital communication. The ubiquity of cell phone use in calls, texts, tweets, and various other social media greatly expands contacts with distant others. Although these digital communications are not the type of interactions discussed in the systems model, they constitute an environmental feature highly relevant for nonverbal interactions. In particular, cell phone use reduces the quality of our direct face-to-face contacts (Bilton 2013; Bugeja 2005). That is, cell phones can interrupt and divert users from ongoing face-to-face interactions and reduce opportunities for impromptu contacts with others in public settings (Patterson et al. 2014).

Other potential long-term effects of pervasive cell phone use are notable. An increasing number of people, especially younger individuals, seem addicted to their mobile devices (e.g., Leung and Liang 2015), and the proportion of time they spend in face-to-face interactions likely decreases. To the extent that these face-to-face encounters decrease, individuals have less experience with the sending and receiving processes described in the systems model. Will this result in greater difficulty in nonverbal interactions with others? And how might this affect the quality of personal relationships? A colleague of mine recounted a story told by the maître d’ of a well-known Hawaiian hotel. Specifically, star-crossed newlyweds were now commonly spending romantic dinners, not gazing into each other’s eyes, but glued to the screens of their smartphones. More generally, might empathy with surrounding others be reduced because some of the cues simply are missed or misperceived? And could difficulties in nonverbal interactions exacerbate current problems of anxiety and stress associated with cell phone addiction (De Sola Gutierrez et al. 2016)?

Different kinds of communication issues are developing in another important area of digital technology: social robots. Humans increasingly interact with embodied computerized agents (ECAs) with voices, such as Amazon’s “Alexa” or Apple’s “Hi Siri,” that understand and answer our requests for information. Soon, these ECAs will be ambulatory and have the android equivalent of our eyes, ears, and tactile senses.

Human–Robot Interaction (HRI)

The rapidly evolving field of HRI involves the design, implementation, and evaluation of robotic systems that communicate with human operators or companions (Goodrich and Schultz 2007). In recent years, considerable research has focused on developing robots capable of social interaction (Weir 2018). The intense interest in social robots is sparked by the promise of potential applications in human–robot cooperative tasks (e.g., collaborative assembly in industrial settings), clinical interventions (e.g., helping children with autism, care for the elderly), and entertainment (e.g., robotic pets) (Corrales et al. 2012; Dautenhahn 2002). Successful social interaction requires that the robot track the human partner’s behaviors and respond according to its designed role.

Early research focused primarily on robots’ linguistic capability but, more recently, research has turned to nonverbal channels in the robot’s social repertoire (Mavridis 2014). On the sending side of interaction, some social robots are now able to initiate facial expressions, gestures, postural changes, and even touch, and on the receiving side of interaction, social robots are able to decode a similar range of behaviors from human partners (Park et al. 2015). The issuance and detection of facial displays of social robots and their human partners has been an important focus of HRI research. The meaning and significance of these displays in HRI has been explained primarily in terms of Basic Emotions Theory (BET) (Breazeal 2003). From this perspective, an effective social robot should mimic facial displays of emotion (e.g., happiness, surprise, fear, disgust, anger, sadness, and contempt) and, in turn, decode those same expressions into the appropriate emotions when expressed by their human partners. Social robots, however, do not experience emotion—at least, not yet. In addition, advocates of the Behavioral Ecology View (BECV) of facial displays argue that a functional approach to such displays is more productive than assuming a role for emotions (Fridlund 1994; Crivelli and Fridlund 2018). That is, what is relevant in such facial displays is signaling what the displayer is likely to do and what is wanted from the partner.

Thus, both robots and human partners have goals that are signaled through nonverbal displays. Facial expressions are an important component of such displays, but distance, gaze, touch, posture, and movement are also crucial to these goal-oriented exchanges. That is, these nonverbal interactions between a robot and human partner are analogous to the human dyadic interactions in the present systems model. And a similar theoretical structure might be proposed for human–robot dyads. Thus, in Fig. 4, the robot can replace Person 2. Instead of the determinants of biology, culture, gender, and personality, the robot would be constrained by its architecture and programmed predispositions related to both signaling and signal detection. And the programmed predispositions could be adjusted to fit particular setting features and social norms, e.g., a nursing home or a therapy setting for children. Furthermore, the robot could be programmed with social goals that might complement the goals of the human partner. In the place of cognitive resources, processing capacity and speed would determine the computing power for the behavioral and social judgment tracks of nonverbal interaction. The core processes illustrated in Fig. 2 should be similar for human–human and human–robot interactions. In addition, the feedback processes illustrated in Fig. 4 might also be similar for both types of interactions. Although all of this is highly speculative, the systems model might provide a useful approach to further research with social robots.

Beyond Dyads

As complex as the sequential processes of nonverbal communication may be in dyads, the potential permutations in five-, ten-, twenty-person or even larger groups seem daunting indeed. Nevertheless, the systems model has clear relevance for nonverbal communication in groups of any size. Let us consider a few of the elements. First, the environment in the form of specific behavior settings and their related social norms, along with a variety of physical features (e.g., available space, design, lighting, temperature, soundscape) affects nonverbal exchanges, whether they are only among a dyad or much larger groups (Patterson and Quadflieg 2016). Of course, this does not mean that everyone in a larger group is equally affected by a “shared environment,” because specific features within an environment vary by location. Some people may be more centrally positioned and more visible to surrounding individuals; others may face an expansive window rather than a beige wall; and some individuals may be closer to distracting elements in the environment than others in the group (Marquardt et al. 2015). In addition, individuals with specific disabilities may experience environments differently, e.g., visually-impaired individuals may be more affected by ambient noise levels but less by visual distractions.

The course of nonverbal exchanges is probably more predictable in leader-driven, structured interactions (e.g., classrooms, workplace meetings, or religious services) than in nominally-leaderless interactions (e.g., holiday dinners, wedding receptions, or parties). In the former case, there is often a specific goal or agenda managed by the leader who commands most of the attention. In these kinds of gatherings, verbal communication is often asymmetric, with leaders initiating most of the comments and members responding to them. Even though audience members may say little or nothing in some group meetings, their nonverbal behavior can signal attention, enthusiasm, or the opposite—boredom, frustration, or confusion. In addition, a range of expressive reactions from audience members can indicate approval or disapproval of the speaker’s comments that, in turn, may affect the speaker’s subsequent verbal and nonverbal reactions.

Many multiple-person exchanges involve unfocused interactions. As people walk the aisle of a busy store, enter a line at a boarding gate, or choose a seat in a nearly-full waiting room, others nearby often make compensatory adjustments to the newcomer’s presence. In these situations, most people attempt, consciously or not, to position themselves according to their preferred level of privacy (Altman 1975). When additional people enter a gathering, the momentary levels of privacy established by each individual are disrupted, especially for those closest to the intrusion. Like birds who sequentially reposition themselves on a telephone line when a new bird arrives, individuals in unfocused interactions make similar adjustments to a newcomer in their midst. The compensatory adjustments are greatest at the point of intrusion and then rapidly diminish toward the periphery, like the ripples from a pebble tossed into a still pond. Knowles (1983) noted this dynamic in his social gravity model of perceived crowding, proposing that crowding increases as the square root of the audience size and decreases as the square root of its distance from the group’s center.
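The two square-root relationships in the preceding sentence can be combined into a single illustrative expression. The sketch below is not Knowles's own equation. It assumes the two effects multiply, that distance is measured in arbitrary positive units, and that a scaling constant `k` (hypothetical, defaulting to 1) sets the overall level; the function name is also introduced here only for illustration.

```python
import math


def perceived_crowding(audience_size: int, distance: float, k: float = 1.0) -> float:
    """Relative crowding index: rises with sqrt(audience_size), falls with sqrt(distance).

    Assumptions (not from the source): the two effects combine multiplicatively,
    distance is in arbitrary positive units, and k is an arbitrary scaling constant.
    """
    if audience_size < 0:
        raise ValueError("audience_size must be non-negative")
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * math.sqrt(audience_size) / math.sqrt(distance)


if __name__ == "__main__":
    # Quadrupling the audience doubles the index; quadrupling the distance halves it.
    for n, d in [(4, 1.0), (16, 1.0), (16, 4.0)]:
        print(f"audience={n:2d}, distance={d}: {perceived_crowding(n, d):.2f}")
```

Under these assumptions, a fourfold increase in audience size is needed to double the index, and a fourfold increase in distance is needed to halve it, which is one way to picture how quickly the "ripple" fades toward the periphery.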

In the context of the present systems model (see Fig. 4), perceived crowding would be subsumed under the perceptual and cognitive-affective processes that, in turn, mediate behavioral adjustments to the intrusion. The behavioral coordination with surrounding others in the interaction cell feeds back to goals. If the goal of desired privacy is attained, there is momentary stability; if not, additional cycling through these processes ensues. And, consistent with the “ripple effect” and Knowles’ (1983) social gravity model, perceived crowding and subsequent behavioral adjustments would diminish with distance from the intrusion. Of course, the particular patterns are affected by the determinants listed earlier (biology, culture, gender, and personality) and the constraints of the immediate environment.

Conclusion

The systems model provides a comprehensive, integrative approach to conceptualizing dyadic nonverbal interaction. The model’s emphasis on the social ecology of interactions directs particular attention to the reciprocal links between the physical environment and both individual and dyadic processes. In general, nonverbal interactions may be viewed as goal-oriented behavioral exchanges between partners shaped by interdependent perceptual, cognitive, and affective processes. Although automaticity is the default setting for both behavioral and social judgment processes, controlled processing is possible when adequate cognitive resources are available and there is sufficient motivation to engage those resources. Controlled processes are likelier when individuals’ social goals are not attained and when there is instability in the interactions. Although the systems model is broad in scope and descriptive in nature, specific links in the model should be amenable to hypothesis-testing. Furthermore, the systems model has special relevance for rapidly-developing communication technologies, including human–robot interactions. Future research will determine its ultimate utility and its place in the evolution of new theories.