Introduction

Interpersonal sensitivity (IS) is a broad construct that can include both perceiving others accurately and engaging in interpersonally appropriate behavior (Bernieri 2001). The present meta-analysis concerns the perception side of this definition. It is difficult to imagine social life without skill in processing the behavior and appearance of others. In the course of a day, a person notices countless details about others’ speech, facial and bodily movements, vocal tone, physiognomy, and dress, among other things, and then draws countless inferences based on this information, even though such information is often fleeting and incomplete. Psychologists have long believed that interpersonal sensitivity matters in daily life (e.g., Allport and Kramer 1946; Kanner 1931; Taft 1955; Vernon 1933), and it remains a timely topic of study (e.g., Ambady et al. 2000; Ames and Kammrath 2004; Elfenbein et al. 2007; Hall and Bernieri 2001; Hall et al. 2005; Nowicki and Duke 1994; Pickett et al. 2004).

Most often, interpersonal sensitivity tests measure accuracy in judging affective states or personality traits, though many other constructs are tested such as truthfulness, intelligence, status, or the intimacy of the relationship between two people. Most of the time, the stimuli are nonverbal cues conveyed by the face, body, and/or voice, but sometimes linguistic cues are included as well. Occasionally, IS has been defined as accuracy in noticing and recalling another’s nonverbal cues, speech content, or physical appearance. Accuracy that is based on making interpretational judgments has been called “inferential” and accuracy that is based on recall has been called “attentional” (Hall et al. 2001, 2006), corresponding to “utilization” and “detection” in Funder’s Realistic Accuracy Model of personality expression and judgment (Funder 1995). Whatever definition is used, IS is tested by having perceivers make assessments based on the behavior of one or more expressors (targets) and then scoring these assessments based on independent scoring criteria.

Authors have generally considered IS to be a valuable skill (e.g., Izard 1971; Nowicki and Duke 1994; Rosenthal et al. 1979). IS, in the form of judging others’ emotions from nonverbal cues, has been included as one of the defining elements of the emotional intelligence construct (Mayer et al. 2003). However, the IS field is underdeveloped theoretically (Zebrowitz 2001), one reason for which is the lack of a complete picture of the correlates of IS. According to the Realistic Accuracy Model (Funder 1995), individual differences in IS contribute to interpersonal accuracy, along with various message and target characteristics, but the model does not go deeply into the characteristics of the good judge.

The present meta-analysis contributes to the goal of theory development in two ways. The first way is to summarize the large domain of correlates that we term psychosocial, which we define broadly to include personality, social and emotional functioning, life experiences, values, attitudes, and self-concept. This broad definition encompasses essentially all person variables besides cognitive ability and cognitive style. The second way that the present review contributes to theory is to suggest how the obtained results might be compatible with different possible paths of causation.

Researchers who have focused on IS in children have been particularly concerned that deficits in IS put a child at risk for a range of intra- and interpersonal dysfunctions. Low IS is regularly found to be associated with worse social and personal adjustment in children (e.g., McClure and Nowicki 2001; Nowicki and Mitchell 1998; Rothman and Nowicki 2004; Russell et al. 1993), and both children and adults who are clinically diagnosed with a variety of psychopathologies typically have IS deficits compared to nonclinical comparison groups (e.g., Baron-Cohen et al. 2001a; Edwards et al. 2002; Lembke and Ketter 2002; Levine et al. 1997; Nowicki and Duke 1994; Rosenthal et al. 1979; Surguladze et al. 2004).

In the present review, we considered IS among adolescents and adults—mainly college age and older—who were not clinically diagnosed. Our interest therefore was in understanding better the role of IS in typically functioning individuals. Though the great majority of these people negotiate their personal, social, academic, and working lives with adequate interpersonal skills, they do not all score alike on tests of IS. The question, therefore, is what this variation means in terms of a wide range of variables that are likely to be relevant in daily life.

Causal paths between IS and psychosocial variables are difficult to establish because the vast majority of studies are based on simple cross-sectional correlations, as pointed out by Rothman and Nowicki (2004) and many others. Thus, in the case of psychological health, high IS could be a cause or a consequence of better functioning (Grinspan et al. 2003; McClure and Nowicki 2001). Alternatively, both paths could co-exist, or a relation between IS and psychological health could be due to third variables. For a given psychosocial variable, one causal path might be more plausible than another, but still one can only speculate about causation most of the time. Causal speculation is valuable, however, because it helps to pinpoint what kinds of additional variables would have to be investigated before a firmer understanding about causality can be achieved. In other words, a conceptual analysis can help to guide hypothesis development and the design of future research.

Fortunately, the IS domain provides fertile ground for hypothesis development and the specification of potential mediating variables. For example, a positive correlation between IS and better psychological functioning (e.g., Nowicki and Duke 1994) could occur because noticing and accurately assessing others’ cues is a precursor to being able to respond appropriately, which then enables a person to avoid social rejection and promotes positive changes in personal growth, self-concept, and so forth. Along the same lines, a positive correlation between IS and favorable instrumental outcomes related to leading, managing, selling, and negotiating (e.g., Byron et al. 2007; Elfenbein et al. 2007) could occur because higher IS renders a person able to predict others’ intentions, needs, and future behaviors, which then informs one’s own tactical and strategic decisions.

Hypotheses about reverse causal paths are also easy to develop. A positive correlation between IS and job rank (e.g., Hall and Halberstadt 1994; Rosenthal et al. 1979) could mean that higher rank causes increases in IS because the job provides relevant opportunities for skill building. Certainly the interpersonal situations faced on the job by managers are likely to be more complex and consequential than those faced by, say, cashiers or truck drivers. In daily life, many personal traits, attributes, and experiences could contribute to the development of IS. When discussing personality correlates of IS, researchers have often held the tacit assumption that IS is the consequence of having certain personality characteristics (e.g., Davis and Kraus 1997). Davis and Kraus’s meta-analysis on adults’ IS in relation to a number of psychosocial variables found that higher IS was associated with significantly less rigidity/dogmatism, more internal locus of control, more positive adjustment, higher emotional empathy, higher scores on scales of social intelligence, higher ratings of IS by acquaintances, greater interpersonal trust, and higher self-monitoring including its three component factors of extraversion, acting, and other-directedness. Some variables were not significantly associated with IS, notably participants’ self-assessments of IS.

There are many unanswered questions in the IS field (Zebrowitz 2001). The present review does not resolve all of the questions nor does it propose a theory. It has the more modest goals of summarizing the evidence relating to one broad class of potential correlates of IS and discussing the findings in terms of possible causal paths.

Overview

The present review, on nonclinical samples of adolescents and adults, draws on many different IS tests. In all cases, perceivers’ assessments or interpretations of live or recorded excerpts of strangers’ behavior and/or appearance were scored for accuracy according to criteria developed by the original investigators. For example, in the facial expressions test of the Diagnostic Analysis of Nonverbal Accuracy (DANVA; Nowicki and Duke 1994), perceivers view 24 2-s presentations of photographs of emotional facial expressions and guess which of four emotions is being conveyed by each. Responses are scored against a scoring key. In tests included in the present review, IS is always scored with reference to an independent criterion, such as the targets’ intentions or self-reports, objective facts about the targets, or expert judgment. The criterion is never the test-takers’ claims or beliefs about their own accuracy. Indeed, finding out whether such self-claims are correct is one of the goals of the present review.

The literature that was retrieved contained a very large number of individual psychosocial variables. To permit a manageable analysis, these were reduced to a smaller number of categories based on consensual judgment of the present authors. For transparency, all category assignments are documented in tables presented in the Appendix.

We limited the meta-analysis to published works, a decision that was guided by several considerations. First, Richard et al.’s (2003) summary of 322 meta-analyses in personality and social psychology (involving eight million research participants) found little evidence that the strength of effects varied with the proportion of unpublished works included in the meta-analyses. Those that included only published studies had an average effect size (r) of .22, while those that included more than 25% unpublished studies had an average effect size (r) of .20, a trivial difference suggesting little publication bias in the field as a whole. Second, a substantial proportion of the studies included in the present review came from one source, a monograph describing an IS test called the Profile of Nonverbal Sensitivity or PONS (Rosenthal et al. 1979). Because that monograph reported all of the authors’ findings regardless of magnitude or p value, there could, by definition, be no publication bias for them. Therefore, publication bias was not an issue for a large proportion of the present studies. The decision to limit the present review to published findings was further supported by an analysis reported later in this article showing only minor differences in the average magnitude of effects reported in the Rosenthal monograph compared to effects coming from other published sources, a conclusion also reached by Davis and Kraus (1997) in their earlier meta-analysis of IS and personality. Finally, in many of the published articles the results that we extracted were of secondary or tertiary importance in terms of the main thrust of the article, meaning that the publishability of the article would likely not depend on whether those results were large in magnitude or significant.

Method

Inclusion Criteria

Criteria for including studies were the following: (1) Perceivers had an average age of 13 or older. (2) Perceivers were cognitively and psychosocially typical (i.e., not clinically diagnosed or in a group labeled as psychologically impaired). (3) The sample size of perceivers was at least 10, in order to minimize inclusion of effects with large sampling error (this exclusion needed to be implemented only a handful of times). (4) The study was published in an English-language article, book, or test manual. (5) The investigators scored perceivers’ accuracy of judging strangers using an independent criterion. (6) The IS test stimuli could consist of purely nonverbal content (e.g., photographs of facial expressions or content-masked vocal clips) or mixed nonverbal and verbal content (e.g., videotape clips played with the sound on). (7) The IS test stimuli could be recorded or live. (8) The IS test stimuli had to be human (i.e., not drawings or synthesized tones). (9) Studies were excluded if IS was measured in a dyadic design in which, by definition, one person’s IS was confounded with the other person’s expressive clarity (e.g., Ickes et al. 1990). (10) IS was examined in relation to one or more individual-difference variables defined as personality, social and emotional functioning, life experiences, social skills, values, attitudes, and self-concept. Sources of these variables could be self-report, reports by other people such as friends or supervisors, or behavioral measurements.

Because several literatures that fell within the purview of the current article have already been the subject of meta-analytic reviews, the following steps were taken to avoid duplication. A meta-analysis of status-dominance in relation to IS (Hall et al. 1997) was not updated because insufficient new results were located. Two other topics were not included because these have been summarized in very recent meta-analyses. These were personality in relation to IS defined as accurate lie detection (Aamodt and Custer 2006) and out-group prejudice in relation to IS defined as accurate identification of Jewish versus non-Jewish faces (Andrzejewski et al. 2009). Elfenbein et al.’s (2007) meta-analysis of IS in relation to workplace effectiveness overlaps conceptually with our summary in the present article. However, because Elfenbein et al. included in the workplace effectiveness category several kinds of psychosocial variables that we assigned to different categories, we did not consider the two reviews to be redundant (though they did reach very similar conclusions). Therefore, we included workplace effectiveness as a category in the present meta-analysis.

Search Method

The following procedures were used to locate studies: (1) PsycINFO search from earliest possible year through 2006 using a list of terms that included interpersonal sensitivity, decoding accuracy, and nonverbal recognition. (2) PsycINFO search of names of specific IS tests (e.g., Diagnostic Analysis of Nonverbal Accuracy, Interpersonal Perception Task). (3) PsycINFO search of names of key authors known to conduct IS research. (4) Manual search of the first author’s issues of the Journal of Nonverbal Behavior. (5) Search of bibliographies of relevant sources. (6) Search of the authors’ reprint files.

Units of Analysis

In the present article, each publication from which results were extracted is called a source; there were 96 sources. A given source could contain one or more studies, defined as independent groups or subsamples (k = 215). In the great majority of cases, such groups were identified as separate studies by the original investigators or were separate male and female subsamples. When results for a given study were reported in more than one source, these were amalgamated under the same study identifier.

Study Characteristics

Table 1 gives a basic description of the 215 studies, most of which were conducted on college students in mixed-sex samples where the authors did not report results for the sexes separately. The vast majority of the studies were conducted in the U.S. (data not in table), presumably with predominantly Caucasian samples.

Table 1 Study characteristics

Table 2 shows the different IS tests represented in the meta-analysis, according to their cue channels and content. The majority of results were based on the full-length PONS test (Rosenthal et al. 1979), and a substantial minority came from shorter versions of that same test. Most of the PONS test results came from the monograph alluded to earlier (Rosenthal et al. 1979). The full-length PONS test presents 220 2-s clips of the face, body, and content-masked speech of a female encoder, in single cue channels and combinations (11 altogether, for example, face only, face plus voice). Perceivers make judgments about affective states and/or the situations appropriate to the expressions (20 altogether, for example talking to a lost child, expressing gratitude, and talking about the death of a friend). Other standard tests include the IPT, DANVA, and CARAT (see table for explanation of acronyms). Criteria applied by the test developers for scoring answers varied from test to test. Examples include intention of the encoder (DANVA, PONS), objectively determined contextual or background information (IPT, CARAT), expert judgment (POFA), and psychometric measurements (e.g., personality) of the encoders. The table shows that most of the tests measured accuracy in judging affect.

Table 2 Interpersonal sensitivity tests: cue channels, constructs judged, and frequency

Grouping of Psychosocial Variables

It was essential to reduce the many psychosocial variables to a smaller number of categories. This was done by consensus of the authors, blind to study results. The categories developed by this inductive method had good face validity. Appendix Tables 8, 9 and 10 list the specific variables that went into the categories that are summarized in Tables 3, 4, 5 and 6, along with the source information for the psychosocial variable if it was a developed instrument. As the Appendix tables indicate, variables were grouped into those with positive valence (positive personality traits and social competencies), negative valence, and ambiguous valence (i.e., variables for which it was not clear whether one pole was “better” than the other). A category had to have at least three studies in it. Variables that were measured in only one or two studies were treated separately, as explained later. In some instances, the polarity of a variable was switched (by switching the sign on the correlation) for consistency with other variables in its category.
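The two bookkeeping rules in this paragraph (switching the sign of a correlation so that a variable matches the polarity of its category, and retaining only categories with at least three studies) can be sketched as follows. This is a minimal illustration and not the authors' procedure; the function names, the `reversed_scale` flag, and the example data are hypothetical.

```python
def align_polarity(r, reversed_scale):
    """Flip the sign of r when the scale is keyed opposite to its category."""
    return -r if reversed_scale else r


def retained_categories(studies_by_category, min_studies=3):
    """Keep only categories that contain at least `min_studies` studies."""
    return {
        category: studies
        for category, studies in studies_by_category.items()
        if len(studies) >= min_studies
    }


# Hypothetical example: study identifiers grouped by category.
example = {
    "empathy": ["s1", "s2", "s3", "s4"],
    "rare_variable": ["s5", "s6"],  # fewer than three studies, treated separately
}
kept = retained_categories(example)
```

Under these assumptions, a scale keyed in the opposite direction from its category would enter the analysis with its sign reversed, and `rare_variable` would be set aside for the separate treatment of variables measured in only one or two studies.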

Table 3 Correlations between interpersonal sensitivity and positive personality traits
Table 4 Correlations between interpersonal sensitivity and social competencies
Table 5 Correlations between interpersonal sensitivity and negative personality traits
Table 6 Correlations between interpersonal sensitivity and ambiguously valenced psychosocial categories

Statistical Methodology

The Pearson correlation, r, was the effect size indicator, and it was available directly in the great majority of studies. When r was not presented, standard formulas were used to calculate it (e.g., from a t test) (Rosenthal 1991). On rare occasions, partial rs and standardized regression coefficients were used when the ordinary Pearson r was not available. Analysis was aided by the Comprehensive Meta-Analysis Software program (Borenstein et al. 2005). Both fixed and random effects models were calculated, as well as a heterogeneity statistic. Also presented is the “file drawer N” or number of null results (averaging r = .00) required to bring significant combined p values into nonsignificance by a one-tail test (Rosenthal).
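The quantities named in this paragraph can be illustrated with a short sketch. It assumes the conventional conversion of t to r, a fixed-effects mean computed on Fisher's z with weights of n − 3, and Rosenthal's file drawer formula at the one-tailed .05 level (critical Z = 1.645). The function names are hypothetical, and the Comprehensive Meta-Analysis program may differ in its details.

```python
import math


def r_from_t(t, df):
    """Convert a t statistic to a Pearson r, keeping the sign of t."""
    return math.copysign(math.sqrt(t * t / (t * t + df)), t)


def fixed_effects_mean_r(rs, ns):
    """Weighted mean r via Fisher's z, with weights n - 3 (assumed scheme).

    Returns (mean_r, Z), where Z tests the weighted mean against zero.
    """
    zs = [math.atanh(r) for r in rs]
    weights = [n - 3 for n in ns]
    mean_z = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    z_test = mean_z * math.sqrt(sum(weights))  # mean_z divided by its standard error
    return math.tanh(mean_z), z_test


def file_drawer_n(study_zs, z_crit=1.645):
    """Number of additional null results (mean Z = 0) needed to bring the
    combined one-tailed p to the level implied by z_crit."""
    k = len(study_zs)
    return sum(study_zs) ** 2 / z_crit ** 2 - k
```

For example, `r_from_t(2.0, 96)` gives r = .20, because 4 / (4 + 96) = .04 and the square root of .04 is .20.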

Effect sizes were almost always retrievable, meaning there were few instances where r was unknown (this happened when the author said “not significant” without giving sufficient information for calculation of effect size). Entering the unknown effect sizes into their respective psychosocial categories with an estimated Z of 0.00 made no difference in the combined probability conclusion regarding any category. Therefore, unknown effect sizes are not discussed further.
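The sensitivity check described here can be sketched as follows, assuming an unweighted Stouffer combination of standard normal deviates (sum of Z divided by the square root of k). The source does not state which combination rule was used, so this rule, the function names, and the example values are assumptions made for illustration.

```python
import math


def stouffer_z(zs):
    """Combine independent standard normal deviates: sum(Z) / sqrt(k)."""
    return sum(zs) / math.sqrt(len(zs))


def combined_with_unknowns(known_zs, n_unknown):
    """Combined Z after entering each unknown result as Z = 0.00."""
    return stouffer_z(list(known_zs) + [0.0] * n_unknown)


# Hypothetical category: four known results, then the same four plus one unknown.
known = [2.0, 2.0, 2.0, 2.0]
before = stouffer_z(known)
after = combined_with_unknowns(known, n_unknown=1)
```

Adding zeros leaves the sum of Z unchanged and increases k, so the combined Z can only shrink toward zero. The conclusion for a category is unaffected as long as the reduced value still exceeds the critical Z.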

Maintenance of Independence

Independence was maintained by the following procedures. If a study included more than one variable from the same psychosocial category (see Appendix tables), the effect sizes for these variables were averaged before proceeding with further analysis. Similarly, if for the same psychosocial category a study reported results for more than one IS test, or subscores within one IS test, these too were averaged before further analysis. These procedures guaranteed that in the analysis of a given psychosocial category, all effect sizes were independent. However, because a given study often reported results for more than one psychosocial category, analyses of different categories are not necessarily independent because the same study could appear in more than one analysis.
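The averaging rule can be sketched as a grouping step that leaves one effect size per study within each psychosocial category. The sketch assumes a simple arithmetic mean of the correlations (the source does not say whether r or Fisher's z was averaged), and the record layout and names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_within_study(results):
    """Collapse (study_id, category, r) rows to one r per study and category.

    Multiple variables, IS tests, or subscores from the same study and
    category are averaged so that each study contributes a single effect.
    """
    grouped = defaultdict(list)
    for study_id, category, r in results:
        grouped[(study_id, category)].append(r)
    return {key: mean(rs) for key, rs in grouped.items()}


# Hypothetical rows: study s1 reports two empathy results and one extraversion result.
rows = [
    ("s1", "empathy", 0.10),
    ("s1", "empathy", 0.30),
    ("s1", "extraversion", 0.05),
    ("s2", "empathy", 0.20),
]
independent = average_within_study(rows)
```

In this example, study s1 still appears in both the empathy and the extraversion analyses, which is why analyses of different categories are not independent of one another.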

Another rule for maintaining independence was that although the same study could appear in more than one analysis as just described, the same data could appear only once across all analyses. Specifically, Snyder’s (1974) Self-Monitoring Scale produces both a total score and subscale scores, one of which is extraversion. If we included the total score in the self-monitoring category and the subscale score in the extraversion category, this would mean that the extraversion data were being counted in two separate analyses. To maintain independence, the extraversion subscale data were not put into the main extraversion analysis but were analyzed separately, meaning that the main analyses of self-monitoring and extraversion had no shared data.

Check for Publication Bias

Because we searched only for published studies, it is important to address the possibility of publication bias. Several arguments were made in the introduction for why publication bias was not considered to be an important problem. We performed an empirical test as well, comparing the results from the PONS test monograph (Rosenthal et al. 1979), which reported all available results without regard to p value or magnitude, to results based on other sources. If there were publication bias, the results from the PONS monograph would be smaller than those published elsewhere. This analysis was done for all of the psychosocial categories for which there were results from both of these sources (n = 22 categories, each of which contained multiple studies). Averaging across nine categories of positive personality traits, the PONS monograph results were identical to those published elsewhere (M = .11 and .11, respectively). There was also no appreciable difference averaging across six categories of social competencies (M = .13 and .16 for monograph and findings published elsewhere, respectively), nor for three categories of negative personality traits (M = −.03 and −.05 for monograph and findings published elsewhere, respectively). For four “ambiguously valenced psychosocial” categories, there was a difference indicating that the monograph findings were weaker (M = −.07 and .16 for monograph and findings published elsewhere, respectively). On balance, however, these analyses give little evidence of publication bias.
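The comparison can be laid out with the four pairs of means reported in this paragraph. The sketch only tabulates the reported values and their differences; it does not reproduce the underlying category-level analysis, and the variable and function names are hypothetical.

```python
# Mean r values as reported in the text: (PONS monograph, published elsewhere).
REPORTED_MEANS = {
    "positive personality traits": (0.11, 0.11),
    "social competencies": (0.13, 0.16),
    "negative personality traits": (-0.03, -0.05),
    "ambiguously valenced": (-0.07, 0.16),
}


def monograph_gap(monograph_r, elsewhere_r):
    """Difference (elsewhere minus monograph), rounded to two decimals."""
    return round(elsewhere_r - monograph_r, 2)


gaps = {group: monograph_gap(*pair) for group, pair in REPORTED_MEANS.items()}

for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

Only the ambiguously valenced group shows a sizable gap, which matches the statement that the difference was confined to those four categories.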

Results

Positive Personality Traits

Table 3 shows that though the effects were not strong, IS was significantly positively associated with seven of the self-rated positive trait categories—empathy, affiliation, extraversion, conscientiousness, openness, tolerance, and internal locus of control. Three additional results (not in table) were available for extraversion, based on the extraversion subscale of Snyder’s (1974) Self-Monitoring Scale (see above). These three had a weighted mean r of .34 (fixed effects Z = 4.18, p < .001), a stronger relation with IS than found for other extraversion instruments summarized in Table 3. Also, as the table shows, IS was significantly correlated with the category called other-rated miscellaneous positive traits. This latter category contained two notable outliers (r = .74 and .78 in two studies that correlated teachers’ IS with their encouragement towards pupils in a classroom; Rosenthal et al. 1979). With these two results removed, the weighted mean r was still significant for the other-rated miscellaneous positive traits category (fixed effects Z = 4.02, p < .001). Finally, although the table shows that the category called warmth-prosociality showed no overall relation to IS, an article published after this review’s search period found that accuracy in judging facial expressions of fear was correlated with behavioral measures of prosociality in three separate studies (Marsh et al. 2007).

Social Competencies

The self- and other-rated social competencies shown in Table 4 range from very narrow to very broad. The variables shown in Table 4 are self-assessed unless stated otherwise. The first two lines of the table reveal that participants’ self-assessed nonverbal decoding skill positively predicted their measured IS; the first line is for studies of test-specific accuracy (in which participants’ self-assessments of their performance on an IS test they had just taken were correlated with their actual performance on that test), and the second line is for studies in which participants’ more general assessments of their IS (not made in the context of taking a particular test) were correlated with their performance on an IS test. For both kinds of self-assessment, there was significant evidence that self-assessments were correlated with actual IS performance.

Table 4 also shows that ratings of participants’ IS made by other respondents correlated positively with participants’ performance on IS tests, and both self-rated and other-rated social-emotional competence (a category that included more than just rated skill in IS; see Appendix Table 8) predicted IS performance as did self-reported relationship quality. Two categories reflecting competence in occupational settings (other-rated clinical-counseling effectiveness and other-rated workplace effectiveness) also showed positive correlations with IS. Finally, IS was positively related to participants’ cultural adjustment to living in a new country.

Indeed, the average correlations in Table 4 were significantly positive for all categories except for other-rated relationship quality, which showed a marginally significant effect. Two additional studies, which appeared too late for inclusion, fit with the findings for workplace variables. In a sample of Chinese business students, those who scored higher on an IS test succeeded in obtaining more favorable outcomes for themselves as sellers in a behavioral negotiation exercise, a finding that may have more general application in workplace settings (Elfenbein et al. 2007). And in a sample of managers, Byron (2007) gathered ratings both by their subordinates and by their supervisors and gave the managers a test of ability to decode emotions in several nonverbal channels. Subordinates’ ratings of the managers’ supportiveness and persuasiveness were significantly positively correlated with the managers’ IS. In that study, the correlations for subordinates’ satisfaction and superiors’ performance ratings were significant for female but not male managers.

Two studies, not included in Table 4 because they were based on behaviorally defined social competence rather than self- or other-ratings, also had positive associations with IS. These were accurately estimating a friend’s IS (Carney and Harrigan 2003) and accurately diagnosing patients with anxiety and depression (Robbins et al. 1994), which together had a mean weighted r of .24, fixed effects Z = 2.61, p < .01. Studies such as these that employ behaviorally defined outcomes are especially valuable in suggesting the utility of IS in everyday life.

Negative Personality Traits

As shown in Table 5, though the effects were small in magnitude, IS was significantly negatively correlated with neuroticism, shyness, depression, and self-rated miscellaneous negative traits.

Ambiguously Valenced Psychosocial Categories

Table 6 shows the psychosocial categories for which valence could not clearly be assigned. In other words, the present authors were reluctant to say that these traits or background variables were either undesirable or desirable. Having more artistic interests was significantly positively correlated with IS. This is consistent with some older studies that were not included in the meta-analysis (because effect sizes could not be calculated), which compared artistic groups defined as visual arts, theater, and dance against other groups such as unselected college students (Buck 1976; Rosenthal et al. 1979).

Psychological masculinity, and the conceptually related trait autonomy, were not related to IS, but psychological femininity was significantly positively related, consistent with the conceptually related categories of affiliation and empathy shown in Table 3.

Two scales of the California Psychological Inventory (Gough 1957), Communality (sees self as average and as fitting in) and Socialization (accepts norms, finds it easy to conform), had notably positive correlations with IS as shown in Table 6. For Communality there was an outlier value of r = .68 but even with that value removed, the fixed effects Z was significant (Z = 2.52, p < .01). These two scales both suggest a positive attunement to social values and expectations but they are in the “Ambiguous” category because they did not seem like unalloyed positive traits to the present authors. However, these two scales are considered favorable qualities by the CPI’s developer (Gough 1987). As for why they are correlated with IS, one can speculate that in order to know what the social values and expectations are, and in order to monitor one’s success in fitting in and living up to social norms, it is necessary to be both interested in others’ cues and a good judge of them.

Table 6 also shows a positive result for the Social Sensitivity Scale of the Social Skills Inventory (SSI; Riggio 1989). The Social Sensitivity Scale (SSI-SS) is one of six scales comprising the SSI, the other five of which were included in the category called self-rated social-emotional competence (Table 4). The SSI-SS scale was not included in that category because its items tap concern with others’ opinions and social norms more than social skill; indeed, the scale is positively related to neuroticism and is negatively related to the other SSI scales, suggesting a hypersensitivity to social norms (Riggio and Carney 2003). As such, it bears a resemblance to the Communality and Socialization scales just discussed. However, though there was a significant overall relation of the SSI-SS scale to IS, only one result (r = .35) was substantially positive and the remaining five were negligible in magnitude, raising some doubt about the stability of the association of the SSI-SS scale to IS.

The final line of Table 6 shows results for participants’ retrospective descriptions of emotional expressiveness in their family, all based on Halberstadt’s (1983, 1986) findings on this topic. Coming from a less emotionally expressive family is associated with higher IS, reflecting Halberstadt’s hypothesis that IS is more likely to develop in a family environment in which people’s feelings, desires, and intentions are expressed in subtle and hard to decode ways than in a family environment where cues are very overt and easy to read.

Individual Psychosocial Variables

There remained quite a few psychosocial variables that did not fit into the categories shown in the Appendix tables and Tables 3, 4, 5 and 6. These results are shown in Table 7, grouped into categories called formative experiences, behavioral outcomes, observer-coded nonverbal behavior, and “other.” We use the terms formative experiences and behavioral outcomes with caution, of course, because the direction of causality is not known. The results in the table sometimes came from the same studies and therefore independence should not be assumed.

Table 7 Correlations between interpersonal sensitivity and psychosocial variables not previously classified and measured in only one or two studies

Formative Experiences

This category contains psychosocial variables pertaining to events predating IS testing and that might have a causal—formative—impact on IS. Each result is shown with its corresponding r and Z, in descending order based on Z. Father and mother strictness and warmth, and childhood temperament, were all measured in one study and were truly longitudinal in that they were measured in childhood whereas IS was tested in adulthood (Hodgins and Koestner 1993). Most of the other variables in the table were reported retrospectively by participants at the time of IS testing.

The results suggest that coming from a secure, moderately strict, and nonconflictual family is associated with higher IS. Though the other “formative” predictors of IS in Table 7 are conceptually disparate, they can be linked by the fact that they all imply relevant learning experiences. Thus, the results for American Sign Language (ASL) proficiency, dance and athletic experience, musical training, foreign travel, and having a prelinguistic toddler all suggest a common underlying causal process whereby experiences in which expressive meaning must be encoded and decoded largely within a nonverbal medium have a beneficial effect on IS. This interpretation is consistent with the views of the original authors of those studies (ASL: Goldstein and Feldman 1996; dance: Pitterman and Nowicki 2004; athletics: Wyner 2000; music: Thompson et al. 2004; travel: Swenson and Casmir 1998; toddler: Rosenthal et al. 1979). All of these causal interpretations are, of course, speculative, not only because they are based on correlations but also because the studies were not actually longitudinal.

Behavioral Outcomes

For the variables in the behavioral outcomes category in Table 7, we reasoned that correlations with IS could reflect a causal influence of IS on that variable. For example, higher IS in physicians may cause them to be significantly more on the lookout for emotional cues and therefore more likely to “see” signs of affective disturbance (Robbins et al. 1994, Table 7). Interestingly, for the same physicians, the correlation of IS with actual accuracy of detecting anxiety and depression was only marginally significant, suggesting that what they “see” is not always there (on the other hand, it could also mean that they see valid cues that were missed in the criterion diagnostic assessment; Robbins et al. 1994).

By the same logic, other results categorized as “behavioral outcomes” in Table 7 can also be interpreted as indicating that IS is causally antecedent to the variable. Thus, it is plausible that higher IS leads people to choose people-oriented over thing-oriented occupations (Trimboli and Walker 1993) and that higher IS enables salespeople to earn higher salary raises due to their sales performance (Terranova 2002). Higher IS could also contribute to learning more in a face-to-face situation if the higher-IS person attends especially well to cues of reinforcement or correction (Bernieri 1991). Finally, in a study published after our search period, Byron et al. (2007) found that car salespersons who scored higher on an IS test sold significantly more cars per month than those who scored lower. Thus, a number of studies are consistent with the hypothesis that higher IS is an asset in achieving practical goals of daily life.

Observer-Coded Nonverbal Behavior

A small number of studies examined participants’ actual behavior during interaction in relation to their IS. As Table 7 shows, there were no significant overall associations.

Other Psychosocial Variables

There still remained a list of miscellaneous, uncategorized variables. Instead of presenting all of the available results as we have previously, for these “other” variables we show only the results reaching p < .05, two-tailed (bottom section of Table 7). Some of these variables are interpretable in terms of the categories discussed earlier. For example, “feeling” (Morand 2001) may be akin to empathy, which showed a positive relation with IS (Table 3); and being less rebellious and nonconforming (Funder and Harris 1986) may be akin (in reverse) to socialization, communality, and the SSI-SS scale, all of which showed a positive relation with IS (Table 6). Being more hurried was associated with lower IS. This finding could indicate a trait of inattention to others’ cues in general, which could impede development of IS skills, or it could indicate that the hurried person is inattentive to the IS test stimuli at the time of being tested.

Discussion

In this review, interpersonal sensitivity (IS), as measured by a wide array of objectively scored tests of accuracy in perceiving others’ states, traits, and other characteristics, had a remarkably consistent and coherent relation to many psychosocial variables. IS was positively related to empathy, affiliation, extraversion, conscientiousness, openness, tolerance, internal locus of control, and varied social competencies and other indicators of positive adjustment. IS was negatively related to neuroticism, shyness, depression, and miscellaneous other negative personality traits. The variables that were related to IS came from self reports, peer and supervisor reports, and behavioral measurements. Nearly all of the significant effects were significant even when tested with the more conservative random effects model, which permits wider generalization to new studies than the fixed effects model does (Lipsey and Wilson 2001). There can be no doubt that accuracy in interpersonal perception is connected to healthy psychological functioning that is manifested in both intrapersonal and interpersonal domains, including work settings. Confirming this conclusion, Carter and Hall (2008) found in a study done subsequent to this meta-analysis that higher IS, as measured with a test not previously used, was related to higher tolerance for ambiguity, more openness, more extraversion, and better psychological adjustment.
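The distinction between the two models can be illustrated with a minimal sketch. It pools correlations through the Fisher z transformation, once under a fixed effects model and once under a random effects model that adds a DerSimonian-Laird estimate of between-study variance. The choice of estimator, the function name `pooled_r`, and the example correlations and sample sizes are illustrative assumptions, not the computations or data of the present review.

```python
import math

def pooled_r(rs, ns, model="fixed"):
    """Pool correlations via Fisher z.

    Returns (pooled r, standard error of the pooled Fisher z).
    """
    zs = [math.atanh(r) for r in rs]
    vs = [1.0 / (n - 3) for n in ns]      # sampling variance of each Fisher z
    w = [1.0 / v for v in vs]             # fixed effects weights
    if model == "random" and len(rs) > 1:
        mean_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
        # Q statistic: weighted squared deviations from the fixed effects mean
        q = sum(wi * (zi - mean_fixed) ** 2 for wi, zi in zip(w, zs))
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - (len(rs) - 1)) / c)   # between-study variance
        w = [1.0 / (v + tau2) for v in vs]         # random effects weights
    mean_z = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    return math.tanh(mean_z), se

# Hypothetical study results (not taken from the review)
example_rs = [0.30, 0.02, 0.15, -0.05]
example_ns = [50, 120, 80, 200]
for model in ("fixed", "random"):
    r, se = pooled_r(example_rs, example_ns, model)
    print(model, round(r, 3), round(se, 3))
```

Because the random effects weights include the between-study variance, the standard error of the pooled estimate can only stay the same or grow relative to the fixed effects model, which is why a result that remains significant under the random effects model is the more conservative finding.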

Possible Causal Paths

A number of psychosocial variables were identified that could plausibly be formative in the development of IS, such as early family climate and extended experiences that draw on nonverbal encoding and decoding. The idea that learning experiences can influence IS is consistent with studies showing positive effects of interventions specifically designed to improve IS (Beck and Feldman 1989; Costanzo 1992; Jenness 1932; Rosenthal et al. 1979). Interventions based on feedback rather than didactic instruction alone appear to work the best (Ambady et al. 2000; Gillis et al. 1995), consistent with the hypothesis that naturally occurring experiences and the feedback that would be associated with them can be formative with regard to IS.

Also supportive of a causal process were the variables that we classified as possible outcomes of IS, such as getting higher salary raises based on sales. The present groupings of variables as either determinants or outcomes of IS are only suggestive, however, because various causal paths can be imagined for all of these relations—not only those grouped as “formative” versus “outcome” in Table 7 but for all of the variables reviewed in Tables 3, 4, 5 and 6. It is premature to advance a comprehensive theoretical account of IS, because the research consists overwhelmingly of simple correlations. Future research could advance insight greatly by tackling causal issues through modeling techniques, measuring potentially mediating variables, controlling for covariates, and using longitudinal and experimental designs.

Another ambiguity about causation stems from the fact that the great majority of studies reported findings for men and women together, without controlling for possible confounding by gender. This would be a problem if, for example, women scored higher on both IS (which, in fact, they generally do; Hall 1978, 1984) and the psychosocial variable in question. In such a case, the positive association between IS and that variable could conceivably be due to gender, and it would disappear if gender were controlled statistically or if only men or only women were used as participants. By this logic, a gender confound would manifest itself as larger correlations for mixed-sex samples than for single-sex samples. In fact, the grand mean correlation for samples that were composed of both sexes combined or in which the gender composition was unspecified (for which it is reasonable to assume both were represented) was not larger (weighted mean r = .08) than the correlations for male-only (weighted mean r = .07) or female-only (weighted mean r = .14) samples. The effects for male and female samples were both significantly above chance (p < .001 for female samples and p < .01 for male samples), and the difference between the male-only and female-only results was marginally significant according to a fixed effects contrast, p < .10. This suggests that IS may be somewhat more connected to psychosocial functioning in women than in men.
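A fixed effects contrast of this kind can be sketched as follows, assuming correlations are pooled within each group through the Fisher z transformation with weights of n - 3 and the two pooled values are then compared with a Z test. The function name `contrast_two_groups` and all sample sizes below are hypothetical, because the per-study values behind the male-only and female-only means are not listed in this section.

```python
import math

def contrast_two_groups(rs_a, ns_a, rs_b, ns_b):
    """Fixed effects contrast between two sets of correlations.

    Returns (Z statistic, two-tailed p value).
    """
    def pool(rs, ns):
        w = [n - 3 for n in ns]   # inverse sampling variance of Fisher z
        mean_z = sum(wi * math.atanh(r) for wi, r in zip(w, rs)) / sum(w)
        return mean_z, sum(w)

    za, wa = pool(rs_a, ns_a)
    zb, wb = pool(rs_b, ns_b)
    z_stat = (za - zb) / math.sqrt(1.0 / wa + 1.0 / wb)
    # Two-tailed p from the standard normal distribution
    p_two_tailed = math.erfc(abs(z_stat) / math.sqrt(2.0))
    return z_stat, p_two_tailed

# Hypothetical female-only and male-only study results
z_stat, p = contrast_two_groups([0.16, 0.12], [150, 210], [0.05, 0.09], [180, 140])
print(round(z_stat, 2), round(p, 3))
```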

Effect Magnitude

The average correlations were fairly weak in absolute magnitude, even when significant. However, several factors should be kept in mind when appraising effect magnitude. First, there was essentially no shared method variance between IS and the psychosocial variables, because IS was measured as performance on a test while the correlated variables were self-rated (including personality and attitude scales and estimates of one’s social competencies), rated by others (e.g., dyadic partners, group members, supervisors, patients, friends, or family), or measured with a behaviorally based method. Second, IS instruments sometimes suffer from weak internal consistency (Hall 2001), which would attenuate the correlations. Therefore, this may be a situation in which even small effects are impressive (Prentice and Miller 1992). As Ozer and Benet-Martínez (2006) and many others point out, small effects can accumulate and can have greater practical significance than suggested by the correlation per se (Rosenthal and Rubin 1982).
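Two of these points lend themselves to a short numerical sketch. The first function applies the classical correction for attenuation, in which an observed correlation is divided by the square root of the product of the two measures' reliabilities. The second gives the binomial effect size display of Rosenthal and Rubin (1982), which re-expresses a correlation as the difference between two "success" rates centered on 50%. The function names and the reliability and correlation values are hypothetical, chosen only for illustration.

```python
import math

def disattenuate(r_observed, rel_x, rel_y=1.0):
    """Estimate the correlation that would be observed with perfectly
    reliable measures, given reliabilities rel_x and rel_y (0 < rel <= 1)."""
    return r_observed / math.sqrt(rel_x * rel_y)

def besd(r):
    """Binomial effect size display: success rates in the higher and
    lower groups implied by a correlation r."""
    return 0.5 + r / 2.0, 0.5 - r / 2.0

# Hypothetical values: observed r = .10 with an IS test reliability of .25
corrected = disattenuate(0.10, rel_x=0.25)
high, low = besd(corrected)
print(round(corrected, 2), round(high, 2), round(low, 2))
```

Under these assumed values, a weakly reliable IS instrument would halve the observable correlation, which illustrates why small observed effects may understate the underlying relation.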

Effect magnitude should also be evaluated relative to effects that are typically found for individual-differences variables in social-personality psychology. In a quantitative review of 227 meta-analyses in which social behavior was related to individual difference variables (demographic, personality, or other dispositional variables), the average effect size was r = .19 (Richard et al. 2003), a figure that includes many studies using only self-report measures, for which much higher effect sizes can be expected. Seen in this light, the small effects found in the present review do not seem so small, and the fact that the effects were pervasive across so many different psychosocial variables adds credibility to the conclusion that a real phenomenon is being described. Similarly, though the number of unretrieved studies required to nullify the significant combined effects was often rather small (last column of Tables 3, 4, 5, 6), finding significant effects across so many categories and variables helps to reduce worry that the overall effect is fragile.
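The number of unretrieved null studies needed to nullify a combined effect is commonly computed with Rosenthal's fail-safe N, which asks how many studies averaging Z = 0 would bring the combined result down to the one-tailed .05 threshold. The sketch below assumes that formula; the function name `fail_safe_n` and the Z values are hypothetical and do not reproduce any entry in Tables 3, 4, 5, or 6.

```python
def fail_safe_n(z_values, z_crit=1.645):
    """Rosenthal's fail-safe N: the number of additional studies averaging
    Z = 0 needed to reduce the combined Z to z_crit (one-tailed p = .05)."""
    k = len(z_values)
    n_fs = (sum(z_values) ** 2) / (z_crit ** 2) - k
    return max(0.0, n_fs)   # a nonsignificant combined result needs no extra studies

# Hypothetical per-study Z values
print(round(fail_safe_n([2.1, 0.4, 1.8, 1.2, 0.9]), 1))
```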

Comparison with Previous Reviews

The present review can be compared to the meta-analysis of Davis and Kraus (1997) that covered many of the same variables, though it is difficult to interpret differences because of methodological differences between the two reviews (see Footnote 3). Despite these differences, the two reviews concur for positive psychological adjustment, reputational social sensitivity, internal locus of control, social desirability, the extraversion subscale of the Self-Monitoring Scale, and Machiavellianism. There is also conceptual concurrence for our categories of openness and tolerance, which overlap with Davis and Kraus’s rigidity/dogmatism. Finally, there was partial concurrence for empathy because Davis and Kraus found a positive effect for emotional empathy and not for cognitive empathy, whereas we found a positive effect for an inclusive definition of empathy. Davis and Kraus found significant effects for self-monitoring and trust, whereas the present review did not. The present review found significant effects for negative psychological adjustment, extraversion based on all scales, and femininity, whereas Davis and Kraus did not. For femininity, the present review was also discrepant with the meta-analysis by Hall and Halberstadt (1981), which did not overlap with the present review. Hall and Halberstadt did not find an overall effect for femininity whereas the present review did. For masculinity, the present review found no overall effect, whereas Hall and Halberstadt found that for male samples the relation was positive. The present findings could not be merged with the Hall and Halberstadt studies to form a comprehensive analysis of studies on masculinity and femininity, because Hall and Halberstadt’s studies were unpublished and can no longer be retrieved. At present, we do not have an explanation for discrepancies among the reviews with respect to the masculinity and femininity construct.

Finally, the present review was discrepant with Davis and Kraus (1997) with respect to whether people can accurately assess their own IS. Davis and Kraus concluded that people cannot do this, as did many of the authors of the individual studies we included (e.g., Ames and Kammrath 2004; Patterson et al. 2001; Realo et al. 2003; see also review by Riggio and Riggio 2001). However, the present review found a significant positive effect both for ability to rate one’s performance right after taking a test and for ability to judge one’s nonverbal judgment skills more generally. For comparison, these effects were larger on average than comparable correlations summarized in a meta-analysis on people’s accuracy in assessing their own ability to detect deception (DePaulo et al. 1997), where the average effect of r = .04 indicated that people had essentially no self-insight into this kind of IS. However, though our effects were not zero, they were still very modest (Table 4) and far smaller than would be required for self-assessments to be a viable substitute for actual IS testing. Therefore, one can still conclude that people are not very accurate in estimating their own abilities.

Conclusions and Future Directions

Interpersonal sensitivity (IS) can be measured with tests, and it is related to many aspects of intrapersonal and interpersonal functioning. The literature gives broad support for the validity of IS tests as well as for the relevance of IS in everyday functioning. The present review covers many relations not previously summarized and addresses a specific gap identified by Ozer and Benet-Martínez (2006) in their assessment of consequential correlates of personality. Ozer and Benet-Martínez commented that “openness as yet has no well-documented effects in the interpersonal domain” (p. 416). Though IS is a tested skill rather than an interpersonal behavior per se, the positive association between openness and IS documented in the present review has clear implications for the interpersonal domain.

One important direction for future research is to ascertain whether the particular operational definition of IS—in terms of content, cue channels, and so forth—interacts with the type of psychosocial variable. For example, the research alluded to earlier by Marsh et al. (2007) suggests that accurate perception of fear, but not other emotions, is related to prosocial behavior. Is it possible that there exist many such optimal matchings between IS test characteristics and the variables to be predicted? One of many ways to approach this question would be in terms of content relevance; would, for example, a test of physicians’ ability to judge the cues of actual patients be a better predictor of patient satisfaction than a test of physicians’ ability to perform well on a generic IS test?

Related to this issue is the fact that the majority of studies used only one of the many IS tests in existence, namely the PONS test, in its full-length or short forms. Should we generalize about IS under these circumstances? The comparison reported earlier between results from the PONS monograph (containing many validity studies) and results not from that monograph, which were mostly not based on the PONS, suggested that results from the PONS monograph did not stand out as being either larger or smaller. As a different approach to the generalization question, we compared the average effect size for all studies using any form of the PONS (regardless of whether the study was published in the PONS monograph or not) with that for all studies that used a different test. The average correlation for PONS results was .07 and the average correlation for other tests was .09, not much of a difference. Thus, the PONS produces results similar, on average, to those of other IS tests. However, as noted in the preceding paragraph, it may yet be shown that different tests (or different kinds of test content) have different degrees of predictive validity, depending on what variable is being predicted.

In the absence of such knowledge, investigators are often in a quandary when choosing an IS test for a research study. Which test is best for a given purpose? Researchers find themselves choosing a test based on nontheoretical factors such as convenience, cost, or precedent. Furthermore, they face the choice of using an existing instrument that has a good track record but that may not be optimal for their purposes, or creating a new one that might be more on target in terms of content but has unknown validity. Other things being equal, prediction should be maximized when the test and the variable to be predicted match in terms of content domain (see discussion in Rosip and Hall 2004). It would be good if researchers could develop sound criteria for choosing an IS instrument.

Another goal for future research is to understand better the causal relations between IS and psychosocial variables, as the present literature consists almost entirely of simple correlations. It will also be important to rule out potentially confounding variables and to search for mediating variables. For example, though research has shown that physicians with higher IS earn higher satisfaction ratings from their patients (DiMatteo et al. 1979), it is not known what the more sensitive physicians do that impacts favorably on their patients’ opinions of them, if it is indeed a causal relation. Elfenbein et al. (2007) called for an inquiry into this proverbial “black box”: what are the individual or dyadic mechanisms of behavior that might mediate between IS and social, personal, and occupational outcomes? A more sophisticated inquiry into causal issues would be of great benefit in this field.