If you have ever read a book or watched a television show depicting lie detection, or have had a casual conversation regarding the best way to figure out if someone is lying, you have likely heard many different theories, some of which conflict with each other. Even social psychologists who read journal articles on deception detection studies may find the literature confusing; a PsycInfo search on the term “deception detection” produces 1694 sources, including 1251 peer-reviewed journal articles. A systematic summary of results across the many studies on this topic may be the best way to make sense of the findings. Meta-analyses are one of the most effective methodological tools for summarizing and quantifying scientific effects across studies.

In a meta-analysis, the statistical findings from multiple studies are combined in order to examine how robust an effect is across a variety of experimental paradigms. Researchers can examine not only the size of an effect, but also whether the effect is moderated by participant traits (e.g., gender, age), experimental variables, and study settings. The main advantage of a meta-analysis over a literature review is that meta-analyses are less subjective and allow for more precise quantitative conclusions (Rosenthal & Rosnow, 1991). Meta-analyses are also generally considered superior to single empirical studies because of their increased statistical power and the fact that they usually do not rely on a particular researcher or experimental paradigm. In meta-analyses, effect sizes are typically presented using either a Pearson’s r correlation coefficient (to assess the degree of relationship between two variables) or Cohen’s d (to assess the difference between conditions, calculated as the difference between means divided by the pooled standard deviation). Roughly, an effect size of d = .20 (or r = .10 to .30) is considered small, an effect size of d = .50 (or r = .30 to .50) is moderate, and an effect size of d > .80 (or r > .50) is large (Cohen, 1988). These effect size measures allow for easy comparison across different types of studies and give readers a sense of how strong or weak a relationship or effect is. Thus, meta-analyses are uniquely useful in providing precise estimates across an entire literature, and are far more useful than simply relying on whether or not a finding is statistically significant within a given study or set of studies (Lipsey & Wilson, 2001).
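As a rough illustration of the two effect size metrics described above, the following Python sketch computes Cohen’s d from two sets of scores and converts it to r. The function names and the sample scores are hypothetical and invented for illustration only. The conversion r = d / sqrt(d² + 4) assumes two groups of equal size; it is a standard approximation and is not taken from any of the meta-analyses reviewed in this chapter.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Difference between means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def d_to_r(d):
    """Convert d to r, assuming equal group sizes (an assumption of this sketch)."""
    return d / math.sqrt(d ** 2 + 4)


# Hypothetical scores for two conditions, invented purely for illustration.
liars = [5.0, 6.0, 7.0, 8.0, 9.0]
truth_tellers = [4.0, 5.0, 6.0, 7.0, 8.0]

d = cohens_d(liars, truth_tellers)
print(round(d, 2), round(d_to_r(d), 2))
```

With these invented scores, d comes out at roughly .63 and r at roughly .30, which by the rough guidelines above would be a moderate effect on either metric.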

Indeed, one reason we wanted to write this chapter is to have a single source to provide to our students that can summarize social science’s best answers to the “big” questions in deception detection research, such as: How accurate are people at detecting deception? Can experience, training, or circumstances make “perceivers” (people attempting to discriminate between truthful and deceptive communications) more accurate? What are the actual signs that a “sender” (the person who produces a truthful or deceptive communication) is lying, and what signs do perceivers think indicate that a sender is lying? Are polygraph machines, brain-imaging techniques, or other tools effective ways to enhance lie detection? Below, we summarize and interpret meta-analyses conducted to address these questions.

Deception Detection Accuracy and Moderators of Accuracy

For laypeople interested in deception detection, perhaps no question is more important than knowing how likely they are to detect deception. Can perceivers discriminate between truthful and deceptive communications at substantially above-chance levels? If so, under what circumstances are perceivers more or less accurate? The meta-analyses below address these questions.

Meta-Analyses of Deception Detection Accuracy

Three of the earliest meta-analyses on deception detection accuracy (DePaulo, Zuckerman, & Rosenthal, 1980; Kraut, 1980; Zuckerman, DePaulo, & Rosenthal, 1981) found results that have been replicated over the years. First, these analyses found that deception detection accuracy is only slightly better than chance. Second, the analyses showed that, contrary to popular belief, voice and body cues were more useful than facial cues in detecting signs of deception (Zuckerman et al., 1981). In an unpublished doctoral dissertation, Kalbfleisch (1985) confirmed these findings and found early evidence of the tendency to see most communications as truthful, a finding subsequently labeled the truth bias (Levine, Park, & McCornack, 1999). Subsequent summaries of deception detection accuracy (e.g., Vrij, 2000) reported similar findings to these early analyses.

In 2006, Bond and DePaulo conducted a large-scale meta-analysis, including 206 studies and 24,483 perceivers of truthful vs. deceptive communications. This paper has been cited 1333 times as of April 2018, according to Google Scholar, and is generally considered the gold standard when it comes to measuring deception detection accuracy. Bond and DePaulo (2006) systematically gathered every known analysis (both published and unpublished) on perceivers’ accuracy at discriminating between truthful and deceptive communications of strangers; studies in which judges received experimental training, instructions about how to detect deception, or special aids (such as a polygraph reading or behavioral codings) were excluded. There were 177 independent samples of senders and 384 independent samples of perceivers. Twelve percent of the perceivers had occupational expertise in detecting deception (i.e., about 2842 experts).

Across all 292 samples used in the meta-analysis, the mean accuracy in discriminating truthful from deceptive communications was approximately 54% (when 50% is chance), with an effect size of d = .40 when deceptiveness was measured on a continuum. The highest mean percentage correct attained in any sample was 73% and the lowest was 31%. As found in earlier meta-analyses, perceivers demonstrated a truth bias; they correctly classified 61.3% of truthful messages as truthful, but only 47.6% of deceptive messages as deceptive. Thus, accuracy rates in any given study may depend on the number of truthful versus deceptive statements made by senders, with higher accuracy rates when senders tell relatively few lies. Bond and DePaulo (2006) also confirmed that deception detection accuracy was lower when judgments were made via video alone rather than via an audiovisual medium (d = −.44), audio-only medium (d = −.37), or from written transcripts (d = −.28); accuracy did not differ significantly between transcript, audiovisual, or audio presentations. Accuracy rates may be lower when only visual cues are provided because senders make conscious attempts to control the way they appear when lying; on the other hand, accuracy rates may be higher when audio cues are provided because audio cues may be more difficult for senders to control. Ironically, senders who were motivated to get away with their lies were actually slightly more likely to be detected than senders who were not motivated (d = .17), possibly because motivated senders display more signs of fear or nervousness while lying. Perceivers were more accurate in judging unplanned rather than planned messages (d = −.14). Additionally, planned messages appeared more truthful than unplanned messages (d = .13). Contrary to popular opinion, people with occupational expertise (e.g., law enforcement personnel, psychiatrists, auditors) were not found to be superior to non-experts (e.g., college students) in discriminating lies from truths (d = −.02).
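The dependence of overall accuracy on the mix of truths and lies can be made concrete with a small worked example. The Python sketch below treats overall accuracy as a weighted average of the two classification rates reported by Bond and DePaulo (2006): 61.3% for truthful messages and 47.6% for deceptive messages. The function name is hypothetical, and the sketch assumes, as a simplification, that these two rates stay fixed regardless of how many messages in a study are truthful.

```python
def expected_accuracy(prop_truthful, truth_hit_rate=0.613, lie_hit_rate=0.476):
    """Weighted average of the two hit rates, weighted by the share of truthful messages.

    The default rates are the figures reported by Bond and DePaulo (2006);
    holding them constant across base rates is an assumption of this sketch.
    """
    return prop_truthful * truth_hit_rate + (1 - prop_truthful) * lie_hit_rate


for prop in (0.25, 0.50, 0.75):
    print(f"{prop:.0%} truthful messages -> expected accuracy {expected_accuracy(prop):.1%}")
```

Under these assumptions, an even split of truths and lies yields an expected accuracy of about 54%, matching the overall figure, while a study in which three quarters of the messages are truthful would yield about 58%.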

Are There Individual Differences in Judgments of Deception?

One of the earliest meta-analyses of deception detection (Zuckerman et al., 1981) found no relationship between sex, self-monitoring, or Machiavellianism and perceivers’ ability to discriminate truthful from deceptive communications. A subsequent meta-analysis by Aamodt and Custer (2006), including 206 studies and 16,537 participants, examined several other individual difference measures and found similar null results. Deception detection accuracy was not substantially related to confidence (r = .05), age (r = −.03), education (r = .03), or sex (d = −.03). Aamodt and Custer were particularly interested in the role of occupational expertise in detecting deception, but once again found no meaningful effect. Law enforcement officers (including police, detectives, secret service agents, parole officers, and judges) were not significantly more accurate (M = 55.5%) than college students (M = 54.2%). Even among law enforcement personnel, years of experience did not predict deception detection ability (r = −.08).

One might think that adults are at least more accurate in detecting the lies of children. But surprisingly, a recent meta-analysis by Gongola, Scurich, and Quas (2017), which included 45 experiments with 7893 adult perceivers and 1858 child senders, found that adults detect only 54% of children’s lies (i.e., no higher than the rate at which adults detect other adults’ lies).

Aamodt and Custer’s (2006) finding regarding confidence replicated a meta-analysis by DePaulo, Charlton, Cooper, Lindsay, and Muhlenbruck (1997), which extensively examined the relationship between deception detection judgments and confidence in those judgments. DePaulo et al. assessed 18 studies (including one unpublished manuscript) that reported correlations between continuous measures of confidence and accuracy. The finding across the 2972 perceivers, who included both college students and law enforcement personnel, was that the confidence–accuracy correlation did not significantly differ from zero (r = .04). In the six studies in which mean levels of confidence and accuracy could be compared, confidence was higher than accuracy. Thus, it appears people tend to be overconfident in their deception judgments, and their level of confidence says nothing about their accuracy. However, perceivers’ confidence was related to other aspects of their deception judgments. Perceivers who were more confident in their judgments were more likely to perceive sender communications as truthful (r = .17). Perceivers’ confidence in their judgments increased with the closeness of their relationship to the sender (r = .19), as predicted by theories that interpersonal perception in close relationships is based on an implicit sense of trust (McCornack & Parks, 1986). Men were significantly more confident about their deception judgments than women (r = .15), which is consistent with research showing that men are more confident, but not more accurate, than women in a variety of interpersonal judgments (Patterson, Foster, & Bellmer, 2001). Finally, 8 studies demonstrated that perceivers were significantly more confident in their judgments when viewing truthful rather than deceptive communications (r = .15). This result supports the notion that lies can be detected indirectly (DePaulo & Morris, 2004; but see criticism of theories on unconscious lie detection, Street & Vadillo, 2016).

Might certain individuals differ in their ability to detect deception, even if these abilities are not aligned with such obvious traits as sex, age, personality measures, education, occupational expertise, or confidence? The notion that a small proportion of people are lie detection “wizards” is a tantalizing idea, supported by the research of O’Sullivan and Ekman (2004) and popularized in prime-time television shows such as Lie to Me (Baum, 2009). Bond and DePaulo addressed this notion in their 2008 meta-analysis of individual differences in deception detection ability. They developed sophisticated statistical techniques to determine whether the variation in perceivers’ deception detection ability across studies was due to real differences in perceivers’ ability or whether the variation was a result of random measurement error. Their analysis included 247 studies drawn from 89 published and 53 unpublished manuscripts. In total, the participants included 2945 senders and 19,801 perceivers. This large analysis indicated that the range in ability to detect deception was no greater than would be expected by chance. While some perceivers were much better (or much worse) than the standard 54%, lie detection accuracy was not a reliable individual difference. Of course, it is possible that a tiny fraction of lie detection wizards do exist, or that particular situations can produce high levels of accuracy without special training among a select few—but evidence for such claims has not yet been demonstrated meta-analytically.
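The logic of comparing observed variation in accuracy with the variation expected by chance can be illustrated with a simple binomial calculation. The Python sketch below is not Bond and DePaulo’s (2008) actual technique, which is more sophisticated; it only shows the underlying idea. The function names and the five accuracy scores are hypothetical, and the sketch assumes every perceiver judged the same number of independent messages.

```python
import math
from statistics import pstdev


def chance_sd(mean_accuracy, n_judgments):
    """SD of proportion-correct scores expected if every perceiver had the same true accuracy."""
    return math.sqrt(mean_accuracy * (1 - mean_accuracy) / n_judgments)


def spread_ratio(accuracies, n_judgments):
    """Observed SD of perceivers' accuracy divided by the SD expected from chance alone."""
    overall = sum(accuracies) / len(accuracies)
    return pstdev(accuracies) / chance_sd(overall, n_judgments)


# Hypothetical proportion-correct scores for five perceivers who each judged 20 messages.
scores = [0.40, 0.50, 0.55, 0.60, 0.65]
print(round(chance_sd(0.54, 20), 3), round(spread_ratio(scores, 20), 2))
```

A ratio near or below 1 means the spread of scores is no wider than sampling error alone would produce, even though individual scores in this invented example range from 40% to 65%. A ratio well above 1 would instead point to real individual differences.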

Bond and DePaulo (2008) did find individual differences regarding aspects of deception judgments other than accuracy. For example, perceivers differ from each other in terms of how likely they are to label senders’ communications as truthful (i.e., truth bias); the observed range for their judgments is 40% wider than would be expected by chance alone. In addition, senders differ in the ability to lie successfully. Most senders do not display obvious signs of deceptiveness, but there is a small proportion of senders who are extremely poor liars. The greatest individual difference among senders is their degree of credibility regardless of whether they are telling the truth or not; this range is 2.4 times as wide as what would be expected due to chance. Some people generally appear to be very honest or very deceptive, regardless of whether or not they are lying.

Can Deception Detection Accuracy Be Improved?

Readers who are seeking a “Pinocchio’s nose,” or surefire method to detect lies at near-100% rates, will be disappointed; while individual studies may claim to have found a technique to improve deception detection accuracy substantially without requiring perceivers to go through any special training, most of the discrepancy across studies can be explained as random variation (Bond & DePaulo, 2006). However, Hartwig and Bond (2014) note that Bond and DePaulo’s meta-analysis demonstrates the accuracy rate by human observers, rather than the “objective detectability” of lies. According to Hartwig and Bond, lies could theoretically be detected at a rate much higher than 54% if perceivers took multiple cues into account at once. Hartwig and Bond examined the degree to which lies could be detected if perceivers used all of the available behavioral cues. The researchers conducted a meta-analysis of 92 published and 33 unpublished studies (totaling 26,866 messages) that described a statistical prediction of deception from two or more visible, written, speech, vocal, or “global impression” cues. Lies could be objectively detected (using statistical algorithms and multiple cues) approximately 67% of the time on average, substantially higher than human perceivers’ actual accuracy of 54%. However, this high detection rate only works in situations in which a large number of communications take place under similar circumstances, and is dependent on the ability to observe multiple (sometimes numerous) cues at once.

Might the context in which senders tell lies enable perceivers to obtain accuracy rates similar to those of the statistical algorithms? According to Hartwig and Bond (2014), this is unlikely; they found that lies are equally detectable regardless of senders’ degree of motivation, whether senders are students or non-students, whether senders are communicating about feelings versus facts, or the setting in which the senders’ communication takes place. Hartwig and Bond interpret this finding as evidence that deception detection accuracy rates are not an artifact of laboratory settings, because the detectability of lies remains consistent across a variety of settings and situational variables. In other words, the low accuracy rate of human lie detection without special training is stable and generalizable.

What happens when perceivers do receive special training? This was the topic of a meta-analysis by Frank and Feeley (2003), who conducted an initial analysis of 20 studies (11 of which were published) on lie detection training. This meta-analysis compared 1072 participants who were trained in lie detection techniques during experiments to 1161 untrained perceivers. Frank and Feeley found that training did lead to a small gain in accuracy (r = .20). However, they note that there was considerable variance around this mean r value, with some studies showing much higher gains and some studies showing no gains in accuracy whatsoever due to training. Frank and Feeley suggested that future analyses should differentiate training that meets rigorous criteria from training that does not. Nine years later, this is exactly what Driskell (2012) did in his analysis of 16 published journal articles with 30 studies (total N = 2847). Using this updated dataset, Driskell found that training led to a moderate gain in accuracy (d = .50), but this gain was moderated by certain aspects of the training. First, training programs that included three components (instruction regarding signs of deception, practice in recognizing signs of deception, and feedback on perceivers’ guesses about senders’ truthfulness) led to higher gains in accuracy (d = .59). Second, Driskell investigated the effects of the training content being taught to perceivers, looking specifically at actual cues to deception documented by DePaulo et al. (2003), a meta-analysis discussed in more detail in the next section. Driskell found positive effects on accuracy when perceivers were correctly taught that senders are more likely to be lying if they exhibit tension or fidgeting, if their statements seem illogical, and if they make speech errors. Third, Driskell compared the effectiveness of training on perceivers with no special expertise (mostly college students) versus perceivers with experience in deception detection (mostly law enforcement personnel). Training was actually more effective for the perceivers without any experience. This may seem surprising, but Driskell points out that law enforcement training frequently focuses on stereotypical signs of deception, such as gaze aversion (Vrij, 2000), rather than empirically supported cues, which may actually lead law enforcement personnel to focus on some incorrect cues. Finally, Driskell found that training was more effective in teaching perceivers to detect lies about feelings and opinions than lies about transgressions. Most recently, Hauch, Sporer, Michael, and Meissner (2016) conducted an updated meta-analysis on training, based on 55 studies; unlike Driskell’s analysis, Hauch et al. included unpublished findings and analyzed lie accuracy and truth accuracy separately. Hauch et al. found a small-to-moderate effect of training on detection accuracy of lies, but not truths. They also found that training was most effective when based on verbal content cues rather than nonverbal or paraverbal feedback. In sum, training that focuses on instruction regarding documented cues to deception, practice, and verbal content cues can be useful for detecting at least some lies. However, the degree to which training is effective in detecting lies in real-world interpersonal and forensic settings is less certain and a useful topic for future meta-analyses.

Actual Cues to Deception

Perhaps the most intriguing question about deception for researchers, law enforcement, and laypeople, beyond how to detect lies, is what the real cues to deception are. Are there reliable cues to deception, and if so, how strongly do they distinguish between truths and lies? The meta-analyses below address these issues, though the answers are less straightforward than the questions.

Nonverbal and Paraverbal Cues to Deception

The earliest meta-analyses on deception detection accuracy (e.g., Kraut, 1980; Zuckerman et al., 1981) documented some actual cues to deception, but we will focus on the updated and thorough meta-analysis of actual cues to deception conducted by DePaulo et al. (2003); this paper has been cited 2031 times as of April 2018, according to Google Scholar. This analysis included 120 independent samples (including 3 unpublished works), in which 1338 estimates of 158 verbal, paraverbal, and nonverbal cues to deception were assessed.

DePaulo et al. (2003) found dozens of cues that significantly differentiated between truthful and deceptive communications. In this review, we focus only on the most reliable findings: those that were statistically significant, based on at least 6 studies, and had an effect size d of at least .20. Compared to senders who told the truth, senders who lied exhibited more vocal tension (d = .26), spoke in a higher pitch (d = .21), and appeared more tense and nervous (d = .27). Vocal displays of tension and nervousness are among the most reliable “paraverbal” signs of possible deception. Paraverbal cues are vocal cues that accompany speech.

Sporer and Schwandt (2006) conducted a meta-analysis to analyze a small number of senders’ paraverbal behaviors in great depth. Specifically, they examined message duration, number of words, speech rate, response latency, pauses, speech errors, speech repetitions, and vocal pitch across 41 manuscripts. Only two of these paraverbal cues were significantly related to deception overall: liars spoke in a higher pitch (r = .10) and took longer to begin responding to questions (i.e., greater response latency) (r = .11). Sporer and Schwandt also noted that the relationship of many cues to deception was heterogeneous—that is, they differed substantially depending on several moderators. For example, the aforementioned relationships of vocal pitch and response latency to deception were greater when senders talked at least in part about their feelings, rather than only facts. The relationship of paraverbal cues to deception also varied with the amount of senders’ preparation and degree of motivation, as well as the type of experimental design.

In a similar meta-analysis, once again using data from 41 articles (54 studies), Sporer and Schwandt (2007) conducted an in-depth examination of the relationship between 11 nonverbal behaviors (blinking, eye contact, gaze aversion, head movements, nodding, smiling, self-touching, hand movements, illustrators, foot/leg movements, and postural shifts) and deception. Three of these behaviors occurred less often when people were lying than when they were telling the truth: nodding (r = −.09), hand movements (r = −.19), and foot/leg movements (r = −.07). Contrary to popular belief (Global Deception Research Team, 2006), averting one’s gaze was unrelated to deception. As is the case with paraverbal cues, the effect sizes of the relationships between nonverbal cues and deception tend to be small and heterogeneous. Sporer and Schwandt found that the relationship between nonverbal cues and deception varied substantially with the content of the lie, whether or not senders were motivated, the degree to which senders prepared their statements, the type of experimental design, and the operationalization of the behaviors. For both paraverbal and nonverbal cues, context matters in their relationship to deception, and the correlations between these cues and deception are generally far smaller than most people expect.

Verbal and Content-Related Cues to Deception

Many of the cues that DePaulo et al. (2003) found differentiated truths from lies were not facial expressions, body movements, or tone of voice, but rather characteristics of the actual wording senders used, as well as general impressions perceivers had of senders. Compared to truthful senders, deceptive senders were perceived as displaying less verbal and vocal “immediacy,” or signs of being clear and direct (d = −.55). Liars also seemed more uncertain (d = .30) and less emotionally involved in their statements (d = −.21). Liars made statements that seemed less plausible (d = −.23), less logical (d = −.25), and more internally discrepant or ambivalent (d = .34). DePaulo et al. interpreted these six findings as indicative that liars’ communications are less compelling than those of truth-tellers. Liars also provided fewer details in their statements (d = −.30) and were perceived as making more negative statements and complaints (d = .21), leading perceivers to have a slightly more negative impression of liars than truth-tellers. Ironically, cues to deception are more obvious when senders are more motivated to succeed.

Computer-Identified Linguistic Cues to Deception

Can computers detect lies? This was the question asked by Hauch, Blandón-Gitlin, Masip, and Sporer (2015) in their meta-analysis of the linguistic cues to deception that can be detected by computer programs. Hauch et al. identified 79 cues from 44 studies (17 unpublished; total N = 3780 senders) in which computer software programs (e.g., the Linguistic Inquiry and Word Count; Pennebaker, Booth, Boyd, & Francis, 2015) had been used to identify words indicative of deception. Hauch et al. found that liars used fewer words, as well as less-varied and complex words, supporting the notion that liars experience greater cognitive load (see Vrij, Fisher, & Blank, 2015). Liars also used more negative words (as well as more emotion words), which fits with DePaulo et al.’s (2003) finding that liars make more negative statements. Liars used fewer first-person pronouns and more second- and third-person pronouns, possibly indicating that liars are more likely to distance themselves from the events they discuss. Liars used fewer sensory and perceptual words, as well as fewer words related to their cognitive processes. As in other meta-analyses of actual cues to deception (DePaulo et al., 2003; Sporer & Schwandt, 2006, 2007), effect sizes were generally small and heterogeneous. Effects were moderated by the type of event senders discussed, the degree of personal involvement, whether events discussed were positive or negative, the degree of interaction senders had with perceivers, and senders’ level of motivation. As with nonverbal and paraverbal cues, context figures prominently in the relationship between linguistic cues and deception.

Do People Use Valid Cues to Detect Deception?

People generally perform only slightly better than chance at detecting lies, as demonstrated by the aforementioned Bond and DePaulo (2006) meta-analysis. There are two possible explanations for this finding. First, it is possible that people are unable to detect many lies because they rely on invalid cues to deception (perceived cues). Second, it is possible that people rely on valid cues to deception, but the dearth of valid behavioral cues and the small effect sizes associated with those cues lead to poor accuracy. Hartwig and Bond (2011) conducted a series of meta-analyses to evaluate which of these explanations receives greater empirical support.

Hartwig and Bond (2011) assessed the relationship between perceived cues to deception and actual cues to deception, examining 66 cues across 153 samples. The overall correlation between perceived and actual cues was r = .59, a moderate to strong relationship. When Hartwig and Bond limited their meta-analysis to “within-study evidence”—i.e., studies in which perceived cues and actual cues were measured within the same sets of perceivers and senders—the correlation between perceived and actual cues rose to r = .72, a very strong relationship. Hartwig and Bond found that deception detection accuracy was much more constrained by the lack of valid cues than by perceivers’ tendency to use incorrect cues. In other words, perceivers mostly use the right cues to detect deception; limited lie detection accuracy can be attributed mostly to the fact that valid cues to deception are not very reliable.

Interestingly, Hartwig and Bond (2011) also found that the cues perceivers rely on when making deception judgments differ from the cues perceivers claim to rely on when making deception judgments. For example, perceivers frequently say that they use lack of eye contact to determine that a sender is lying; however, in actuality, lack of eye contact is only weakly related (r = −.15) to perceivers’ judgments of deceptiveness. Consistent with classic findings that people are often misguided when reporting on their internal (often unconscious) cognitive processes (Nisbett & Wilson, 1977), people don’t seem to know what cues they use when making deception judgments.

Interrogation Techniques Used by Law Enforcement

Law enforcement would be a much easier job if there were a highly accurate method to discern whether a suspect is lying or telling the truth. In the following section, we will describe the deception detection techniques used by law enforcement and provide meta-analytic data regarding the accuracy of each technique.

Polygraph—Control Question Test

The polygraph is often referred to as a “lie detector,” implying that it can distinguish between truths and lies with a high degree of accuracy. More specifically, the polygraph measures certain physiological responses such as respiration, pulse, blood pressure, and the skin’s electrodermal response. The polygraph can be used with different types of questioning techniques. One such technique is the control question test (CQT) which is commonly used by law enforcement and government agencies in the US. In the CQT, a suspect’s physiological responses during questions relevant to the crime are compared to their physiological responses during control questions that are unrelated to the crime. If the two patterns of physiological responses are significantly different from each other, the polygraph examiner is likely to conclude that the suspect is lying.

Kircher, Horowitz, and Raskin (1988) conducted a meta-analysis of the accuracy of the CQT which included 14 mock crime studies (N = 765), including 2 unpublished studies. They focused their meta-analysis on mock crime studies rather than field studies because mock crime studies allow the researchers to know with certainty which participants are lying and which are telling the truth. Participants in mock crime studies are randomly assigned to commit a mock crime or are given information about a mock crime which they did not commit. In these studies, there is often an incentive, in the form of money or the avoidance of punishment, motivating both the guilty and innocent people to appear convincingly truthful in their claims of innocence. The meta-analysis of mock crime studies found that the overall detection accuracy of the CQT was 66%.

Because accuracy rates for the CQT in mock crime studies vary widely (21–87%), the purpose of the meta-analysis by Kircher et al. (1988) was to test whether the variability in accuracy is due to how ecologically valid the mock crime studies are—that is, how similar the conditions of mock crime studies are to conditions in the field. The meta-analysis found that the CQT is more accurate when the participants have some incentives or motivation to appear truthful (r = .73). The CQT was also more accurate when the samples in the mock crime studies were not predominantly college students but instead included members of the community, ex-offenders, and prison inmates (r = .61). Kircher et al. argue that the CQT may be less accurate with college students because college students may care less about the monetary incentives than would a prisoner or ex-offender. Finally, the CQT is more accurate when the polygraph examiners make decisions based on standard criteria used in the field such as using numerical coding and at least 3 charts of physiological data (r = .67).

Polygraph—Guilty Knowledge Test

Another questioning technique used with the polygraph is the guilty knowledge test (GKT), most frequently used in Japan and Israel (Ben-Shakhar & Elaad, 2003). The GKT is a series of multiple-choice questions such as “What type of gun was used in the crime?” If the suspect’s pattern of physiological responses is different when the correct answer is mentioned than when the incorrect answers are mentioned, this pattern would indicate that the suspect has personal knowledge of the details of the crime. The GKT can only be used if the investigator knows what the answer to the question is (e.g., what type of gun was used) and if there is no conceivable way that an innocent person would have that information.

In a meta-analysis of 22 published studies (N = 1247; unpublished studies were excluded from the analysis) conducted by MacLaren (2001), the overall accuracy rate of the GKT was 76%. When the meta-analysis was limited only to studies which included mock crimes, the accuracy rate increased to 82%. Similarly, in a meta-analysis of 80 studies conducted by Ben-Shakhar and Elaad (2003), the effectiveness of the GKT was higher in mock crime studies (d = 2.09) than it was in the overall meta-analysis of all studies (d = 1.55). The GKT was significantly more accurate in studies in which the GKT was implemented under conditions the researchers considered optimal. Those conditions were that the participants had a higher motivation to succeed, the participants had to verbalize a “no” response to the unselected options, and there were at least 5 guilty knowledge questions asked (d = 3.12).

The Strategic Use of Evidence Technique

Law enforcement officials can also use deception detection techniques which do not rely on using a polygraph. When interrogators are in possession of highly incriminating evidence, they can use this information to their advantage when trying to detect deception. The Strategic Use of Evidence (SUE) technique involves not informing suspects that the interrogators are aware of the incriminating evidence until after the suspect has already provided his or her own version of events (Hartwig, Granhag, & Luke, 2014). The premise behind the SUE technique is that guilty people will avoid mentioning any information that could possibly be incriminating, whereas an innocent person will willingly share all information of which they are aware. For example, if there were a robbery at a store in a mall and suspects are asked to describe their recollection of that day, an innocent suspect would be more likely than a guilty person to mention shopping at the mall on the day of the crime. To a guilty person, this incriminating piece of information would be considered too aversive to mention and should be concealed.

When interrogators implement the SUE technique, they begin by asking open-ended questions such as, “Where were you on February 11th?” If the interrogator has evidence from a security camera that the suspect stopped at the mall, a failure to mention that detail would be considered a possible indication of deception. An interrogator using the SUE technique would look for two signs of deception: failure to mention incriminating information during the first telling of the story and inconsistencies between the suspect’s statement and the known evidence. The SUE technique is most effective when the incriminating information is revealed to the suspect later in the interrogation process (Hartwig et al., 2014). If suspects already know that they were captured on a security camera, they can incorporate that detail into their story in a way that doesn’t make them look guilty. For example, they could say that they went shopping at the mall that day but fail to mention the particular store they went to. Withholding the incriminating evidence until late in the interrogation is likely to lead to more avoidance of the incriminating information and more inconsistencies between their statement and the evidence. Hartwig et al. (2014) conducted a meta-analysis (N = 599) comparing the effectiveness of the SUE technique to a non-SUE technique using 8 mock crime studies (including 1 unpublished study). For both the SUE and non-SUE techniques, the statements of guilty people were more inconsistent with the evidence than the statements of innocent people, but the effect size was much larger when the SUE technique was used (non-SUE technique d = 1.06; SUE technique d = 1.89).

Although the SUE technique is very effective, its use is limited because it can only be used when the interrogator has incriminating evidence and the suspect is unaware that the evidence is known to the interrogator. In such situations where both of these requirements are met, SUE may be the most effective deception detection technique currently available which does not require a polygraph.

Increasing Cognitive Load

Lying requires more cognitive resources than telling the truth because a liar must actively create untruths, whereas a truth-teller simply has to describe existing memories (Vrij et al., 2015). Cognitive approaches to lie detection are based on the assumption that nonverbal cues may distinguish liars from truth-tellers precisely because those cues appear when a liar is using a lot of cognitive resources, a state known as cognitive load. When implementing a cognitive approach to deception detection, the interrogator or experimenter intentionally does certain things to make the task even more demanding for liars. For example, an interrogator might increase cognitive load by asking the suspect to tell the story backward, to make unwavering eye contact with the interrogator while telling the story, or to tell the story while doing another task simultaneously. An interrogator can also increase the difficulty of the task by asking suspects to add more details to their stories; a truth-teller can do this easily because there are many possible details to share, but a liar would need to create those details on the spot. Another way to increase cognitive load is to ask the suspect unanticipated questions. Liars often plan their answers to anticipated questions in advance which reduces their cognitive load when giving those answers during the interrogation. Suspects will be under higher cognitive load if they are asked questions they did not anticipate having to answer.

Vrij et al. (2015) conducted a meta-analysis of 14 studies to compare the effectiveness of the cognitive approach to a standard lie detection approach in which cognitive load is not intentionally increased. Their meta-analysis indicated that the cognitive approach was more accurate than the standard approach in detecting lies (67% vs. 47%, d = .53), detecting truths (67% vs. 57%, d = .24), and overall (71% vs. 56%, d = .42). Because cognitive approaches increase the difficulty of the task, suspects “leak” twice as many verbal and nonverbal cues to deception as when a standard approach is used (Vrij, Fisher, Blank, Leal, & Mann, 2016).

Content-Based Techniques

Content-based techniques were designed to differentiate between truthful and deceptive statements by examining the specific content shared by suspects. Content-based techniques are based on the assumption that statements about personal experience will include more detail than statements not based on actual experience. For example, a truthful story may be more likely to contain details about the context of the event, conversations that occurred, and recollections of one’s mental state. Furthermore, because speakers of fabricated stories are especially concerned with appearing truthful, they may be less likely to correct their stories spontaneously or admit to forgetting some aspect of the event than would a truthful speaker. Fabricating a story and trying to appear truthful through self-presentation both require cognitive resources which leave fewer resources available to add extensive details to the story. Two types of content-based techniques are criteria-based content analysis (CBCA) and reality monitoring (RM). CBCA was developed to distinguish between true and fabricated statements, and it is considered admissible evidence in a court of law in the US and Western Europe. Although RM was originally developed as a way to distinguish between true and false memories, the technique has also been used in lie detection. With both of these techniques, there is a list of criteria that are used to judge the statements. A recent meta-analysis of 56 studies tested the effectiveness of CBCA and RM in detecting deceptive statements (Oberlader et al., 2016). Overall, the accuracy rate for these techniques was 70% (d = 1.03) and there was no significant difference between the effectiveness of the two techniques.

Limitations of Studying the Effectiveness of Lie Detection Techniques in Lab-Based Settings

While the above meta-analyses are very useful in giving us an idea of how accurate each method is, we must be very cautious about assuming these accuracy rates will be the same outside of the laboratory, in the field settings where professionals use the techniques to solve crimes. Experiments testing the accuracy of lie detection techniques typically take place in highly controlled laboratory environments so that the experimenter can randomly assign participants to be guilty or innocent, which enables experimenters to know with certainty whether a particular deception detection technique was accurate in each case. This level of certainty in testing the accuracy of a lie detection technique is not possible in the field because professional law enforcement officials usually cannot know definitively which suspects are innocent and which are guilty. In an effort to increase the external validity of laboratory studies, researchers have used mock crime experiments which attempt to mirror real-world conditions by including incentives to appear innocent or punishments if one is found guilty (Ben-Shakhar & Elaad, 2003; Hartwig et al., 2014). However, given that the incentives and punishments associated with a real criminal investigation are far greater, the accuracy rates of the lie detection techniques above should be considered estimates rather than definitive conclusions.

Neuroscientific Techniques for Lie Detection

Technological advances over the past 20 years have provided new tools for studying brain activity during deception (Christ, Van Essen, Watson, Brubaker, & McDermott, 2009). Unlike older techniques for studying brain activity (via scalp-recorded event-related potentials), positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) allow researchers to study the specific brain regions being activated while participants engage in various forms of deceptive communication (Kozel et al., 2005). In their meta-analysis on the use of fMRI and PET techniques in 12 studies, including a total of 173 activation foci, Christ et al. sought to determine: (a) which regions of the brain were activated during deception, and (b) which aspects of executive control (working memory, inhibitory control, or task switching) were most important during the act of deception. Christ et al. found that 13 brain regions were more active during deceptive than truthful communication, and 8 of these 13 brain regions are located in or near the prefrontal cortex (PFC). This finding supports the theory that executive control processes play an important role in deception, because the PFC has a strong role in executive functioning. Some of these regions (specifically, the right and left inferior frontal gyrus [IFG] and insula, as well as the anterior cingulate cortex [ACC]) contribute to executive control generally, and thus, it is difficult to know if these regions have a specific role in deception per se, rather than just a role in all executive control functions. However, the left dorsolateral PFC, the right anterior PFC, and right posterior parietal cortex were associated with both deception and working memory, but not other executive functions. This indicates that working memory, more than inhibitory control or task switching, may play a particularly important role in deception. 
Furthermore, the insula and nearby (parainsular) regions of the brain, which are known to play a role in visceral responses such as blood pressure and heart rate, were activated during deception. As discussed in the section on polygraph techniques, these functions frequently accompany deception; thus, it is unsurprising that these regions of the brain are active during deception. Finally, two regions of the brain unrelated to executive control were activated during deception—specifically, the left and right inferior parietal lobules. These regions of the brain have previously been implicated in selective attention and detection of important low-frequency events. These areas of the brain may play a role in maintaining attention in order to detect contexts in which deception is required.

There are various kinds of lies, involving various types of cognitive and emotional processes (DePaulo, Kashy, Kirkendol, Wyer, & Epstein, 1996); thus, it is likely that different regions of the brain are utilized for these different types of lies. Lisofsky, Kazzer, Heekeren, and Prehn (2014) conducted a meta-analysis of neuroimaging studies using PET and fMRI in an attempt to differentiate the regions of the brain utilized during socially interactive vs. non-interactive lies. Socially interactive lies, which the authors consider more ecologically valid, include tasks such as deceiving an interrogator about autobiographical information, making false promises to behave cooperatively in a trust/prisoner’s dilemma game, and concealing memories or knowledge when asked. Non-interactive lies include tasks such as lying about whether an object or word is recognized, or whether an everyday act has been performed correctly. Twenty-four studies, including 26 contrasts between truthful and deceptive statements (N = 416), were included in the meta-analytic comparison between socially interactive vs. non-interactive studies. Studies were classified as having “social interactive” (as opposed to “non-interactive”) deception paradigms if (a) an interaction partner who gets deceived was present or imagined by the participant, or (b) there was a cover story designed to simulate a real-world interpersonal deception. The analysis compared the neural activity for the 15 social interactive study paradigms vs. the 11 non-interactive study paradigms. Consistent with the authors’ predictions, the regions of the brain involved in social interactive (vs. non-interactive) deception were regions associated with working memory and inhibitory control. One of the regions of the brain that was more active during social lies was the ACC, which is known to play a role in detecting or monitoring conflict (a situation that may occur when people lie in social situations). 
The posterior superior temporal and angular gyrus was also more active during social lies; this region of the brain has been associated with social cognition and moral decision-making, including theory-of-mind processes (i.e., inferring the mental states of others). It is likely that people engage in social deception designed to fool an interaction partner by making inferences about the partner’s mental state. The role of a part of the brain associated with moral decision-making may indicate that telling a lie in a social setting is considered a moral transgression, whereas non-interactive lies may not elicit that same sense. In addition to replicating many of the patterns found in the meta-analysis by Christ et al. (2009) regarding brain activity indicative of deception, Lisofsky et al. found that regions of the brain responsible for theory of mind and moral reasoning are utilized more in social interactive than non-interactive deception.

Findings from the aforementioned neuroscience meta-analyses show a good deal of consistency, perhaps unsurprisingly, because they included many of the same studies in their analyses. Brain-imaging techniques for deception detection have been heavily marketed as a potential cutting-edge tool to be used in business negotiations, protection against terrorists, and criminal trials. Can fMRI or PET technology serve as a useful lie detector in forensic or negotiation settings? Farah, Hutchinson, Phelps, and Wagner (2014) conducted a meta-analysis of lie detection studies using fMRI, with a focus on the practical and ethical implications of using brain-imaging tools in these applied settings. Their sample, which included 23 studies comparing responses to deceptive and truthful statements, indexed 321 foci in the brain. Farah et al. delineate several reasons why this technology may not be reliable in applied settings. First, although the meta-analytic findings are consistent, there is considerable variability in findings from study to study; no single brain region was active during deception in all the studies, or even almost all of the studies. Consistent with this finding, Gamer (2011) conducted a meta-analysis of fMRI studies using two different deception detection paradigms (22 studies, N = 408) and found that the brain regions most active during deception depended heavily on the type of paradigm used to elicit deception. Second, for all deception detection paradigms, it is extremely difficult to determine whether differences in brain activity between the “lie” and “truth” conditions are due to the degree of truthfulness or to some other subtle difference between conditions. For example, Farah et al. point out that the frequency of motor response tended to be greater during deceptive than truthful statements; therefore, differences in neural activity may actually reflect brain activity associated with differences in motor actions. 
Similarly, differences in neural activity may be due to the greater cognitive load imposed by deceptive versus truthful statements. This could lead to false positives when participants are under cognitive load for reasons other than telling a lie. Another problem for practical applications is that fMRI studies may be particularly vulnerable to countermeasures—possibly much more so than other physiological lie detection measures such as the polygraph. For example, in one study Farah et al. reviewed, if participants made imperceptible finger and toe movements during their truthful and deceptive statements, accuracy fell to chance.

Even when lie detection via fMRI is reasonably accurate, it may still have low specificity, meaning that it is not useful in spotting low-probability events because too many false positives will occur (Farah et al., 2014).
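
The base-rate problem described above can be made concrete with Bayes' rule. The following is a minimal sketch, not a calculation reported by Farah et al. (2014): the function name and the sensitivity, specificity, and base-rate values are hypothetical and were chosen only for illustration.

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """Probability that a person flagged as lying is actually lying.

    sensitivity: proportion of liars the test correctly flags
    specificity: proportion of truth-tellers the test correctly clears
    base_rate:   proportion of tested people who are actually lying
    """
    true_positives = sensitivity * base_rate
    false_positives = (1.0 - specificity) * (1.0 - base_rate)
    return true_positives / (true_positives + false_positives)


# Hypothetical test: flags 90% of liars and clears 90% of truth-tellers.
for base_rate in (0.50, 0.10, 0.01):
    ppv = positive_predictive_value(0.90, 0.90, base_rate)
    print(f"base rate of lying = {base_rate:.2f} -> P(lying | flagged) = {ppv:.2f}")
```

Under these assumed values, a flag is correct 90% of the time when half of the people tested are lying, but only about 8% of the time when 1 in 100 is lying, because the false positives drawn from the large truthful majority outnumber the true positives.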

Yet another issue with applying fMRI research to criminal investigations is that almost all participants in laboratory studies were college students with no diagnosed psychopathologies. Some violent criminal offenders, on the other hand, can be diagnosed with traits related to psychopathy and anti-social personality disorder. These diagnoses have been linked to structural and functional differences in brain activity, calling into question whether fMRI findings would differ for these populations (Farah et al., 2014).

The idea of sophisticated brain imaging as a tool to detect deception is appealing to the public at large. Indeed, at least two companies (No Lie MRI and Cephos) have recently started to offer fMRI lie detection services in business, personal, criminal, and national security settings (Farah et al., 2014). However, use of fMRI for lie detection in these real-world settings is almost certainly premature, and, thus far, fMRI evidence of deception is not generally accepted as evidence in criminal or civil court cases.

Statistical Limitations to Meta-Analyses on Deception

As we discussed earlier, meta-analyses are an exceptionally useful tool for quantitatively summarizing an entire field of research (Rosenthal & Rosnow, 1991) and this is particularly helpful in the study of deception detection, given the breadth of literature and the wide variety of experimental paradigms used. Nevertheless, meta-analytic techniques are not without flaws—flaws which are largely the result of problems with the entire paradigm of significance-testing across multiple areas of science (Simonsohn, Nelson, & Simmons, 2014).

One of the earliest documented flaws is the file drawer problem, which is the phenomenon that statistically significant results are much more likely to be published than nonsignificant results (Rosenthal, 1979). If, for example, one study demonstrates that a given manipulation improves perceivers’ ability to detect deception, while ten other studies find no such effect, it is possible that the study showing the significant effect will be published, while the ten studies with null findings will go unpublished (i.e., sit in a file drawer), leading readers to infer that the manipulation does indeed improve deception detection. Meta-analyses are potentially quite useful in exposing these false positives, but only if all studies—including unpublished studies—are included in the analysis. A meta-analysis that systematically eliminates null results gives a skewed picture of the literature. Where possible, we noted when the meta-analyses we discussed made an effort to attenuate the file-drawer problem by including unpublished findings.
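
The effect of leaving null results in the file drawer can be illustrated with a small calculation. The sketch below pools effect sizes with a simple fixed-effect (inverse-variance weighted) average, which is one common pooling method and is not necessarily the one used in any meta-analysis discussed in this chapter. All study values are hypothetical and mirror the one-versus-ten example above.

```python
def pooled_effect(effect_sizes, variances):
    """Fixed-effect pooled estimate: inverse-variance weighted mean."""
    weights = [1.0 / v for v in variances]
    weighted_sum = sum(w * d for w, d in zip(weights, effect_sizes))
    return weighted_sum / sum(weights)


# Hypothetical literature: one published study with d = 0.50 and
# ten unpublished studies with d = 0.00, all with the same sampling variance.
published_d, published_var = [0.50], [0.04]
unpublished_d, unpublished_var = [0.00] * 10, [0.04] * 10

published_only = pooled_effect(published_d, published_var)
all_studies = pooled_effect(published_d + unpublished_d,
                            published_var + unpublished_var)

print(f"published only: d = {published_only:.2f}")
print(f"all studies:    d = {all_studies:.2f}")
```

With these assumed numbers, the published record alone suggests a moderate effect (d = 0.50), whereas pooling all eleven studies gives an estimate close to zero (d of about 0.05).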

In addition to the file-drawer problem, other common researcher practices have been found to add to the rate of false positives. Many of these questionable practices were illustrated dramatically in a paper by Simmons, Nelson, and Simonsohn (2011) who demonstrated that flexibility in the collection, analysis, and reporting of data leads to a skewed publication record. When individual studies are biased in favor of a certain outcome, the meta-analysis of those studies will be biased as well (McShane, Böckenholt, & Hansen, 2016). Researchers have developed statistical techniques specifically to correct for these biases when conducting meta-analyses (e.g., Hedges, 1992; the “p-curve analysis” by Simonsohn et al., 2014). However, other researchers have found that these efforts to correct for publication biases in meta-analyses are not always adequate (Inzlicht, Gervais, & Berkman, 2015). In addition, even when meta-analyzing the exact same literature, different researchers may use slightly different methods for choosing precisely how to combine effect sizes from multiple studies, leading to slightly different outcomes (e.g., see DePaulo et al., 2003; Sporer & Schwandt, 2006). Nevertheless, despite these flaws, most researchers agree that meta-analyses are still far more useful than single studies in ascertaining average effect sizes, the degree of heterogeneity within findings, and the role of moderators in study results (McShane et al., 2016).

Potential Future Meta-Analyses on Deception

There are at least three research topics in the deception literature that have not been meta-analyzed yet, but we hope they will be in the near future. First, to the best of our knowledge, there are no meta-analyses on the frequency with which people lie, despite some intriguing studies in this area (e.g., DePaulo et al., 1996). Second, we could not find meta-analyses on the role of relationship closeness in deception detection accuracy, although there are a number of published manuscripts in this area (e.g., Anderson, DePaulo, & Ansfield, 2002; Boon & McLeod, 2001; Levine & McCornack, 1992; McCornack & Levine, 1990; Morris et al., 2016; Sternglanz & DePaulo, 2004; also see Sternglanz & Morris, 2014, for a brief review of deception in friendships). Third, the efficacy of implicit or indirect deception detection is a hotly debated topic (DePaulo & Morris, 2004; Levine & Bond, 2014), and a meta-analysis may provide clarity on this issue. Finally, there are numerous topics related to deception in psychology (e.g., embellished resumes, infidelity, academic dishonesty, children’s understanding of false belief tasks) and behavioral economics (e.g., game theory paradigms such as prisoner’s dilemma) with broad literatures, some of which have been meta-analyzed. Integrating findings from these analyses with meta-analytic findings on deception detection would advance our understanding of deception.

Conclusions

Meta-analyses have been conducted on a wide variety of topics related to deception detection; see Table 16.1 for a brief quantitative summary. Findings from these analyses indicate that people are generally only slightly better than chance at detecting deception, regardless of their personality traits, career experience, or confidence in their judgments. There are cues that probabilistically indicate when people may be lying, but only a small minority of liars display obvious cues. Limited deception detection accuracy can be attributed mostly to the fact that valid cues to deception are not highly reliable. Nevertheless, training programs that focus on documented cues to deception, verbal content cues, and practice can improve perceivers’ ability to detect at least some lies. Additionally, computer programs and statistical algorithms can detect lies better than human perceivers under specified conditions. Law enforcement tools such as polygraphs, strategic use of evidence, and increasing senders’ cognitive load also improve lie detection, at least in controlled experimental settings. Neuroscientific techniques such as fMRI have demonstrated that areas of the brain associated with working memory are more active during deception; however, despite high levels of accuracy under specific highly controlled conditions, brain imaging is, at present, an unreliable tool for detecting real-world lies. In spite of some limitations, meta-analyses have been highly useful in summarizing the effect sizes, degree of heterogeneity, and moderators for the scientific study of deception detection.

Table 16.1 Deception detection accuracy for meta-analyzed techniques