Research has consistently found that people are mediocre lie detectors (Bond and DePaulo 2006). Even presumed lie experts, such as police officers, achieve accuracy rates only slightly above the level of chance. The literature on behavioral differences between liars and truth tellers provides an explanation: there are few objective cues to deception, giving lie-catchers little diagnostic information to rely on (DePaulo et al. 2003; Hartwig and Bond 2011).

In police investigations, assessments of veracity are often made during interviews with suspects. However, research has until recently neglected to provide empirically validated information about how interviews should be carried out to facilitate deception detection. Police interrogation manuals offer plenty of suggestions (Inbau et al. 2001), but empirical evidence provides little support for them (e.g., Vrij et al. 2007).

Contrary to the emotional perspective on deception (Ekman 2001) and widespread beliefs (Strömwall et al. 2004), liars do not reveal themselves through leakages of cues to nervousness or discomfort (DePaulo et al. 2003). In contrast, an emerging wave of research supports the notion that there may be cognitive differences between liars and truth tellers, and that lie detection can be improved by exploiting such differences (Colwell et al. 2007; Levine et al. 2010; Vrij et al. 2006; Vrij et al. 2009). For example, a recent line of research suggests that increasing cognitive load during the interview may amplify verbal as well as non-verbal differences between liars and truth-tellers (Vrij et al. 2008). The present study explores another possible way of amplifying the differences between liars and truth-tellers by varying the time at which the evidence against them is disclosed.

Deception Detection via Strategic Use of Evidence (SUE)

The Strategic Use of Evidence (SUE) technique is a method of interviewing based on the assumption that liars and truth tellers have different counter-interrogation strategies in response to interrogation (Colwell et al. 2006; Granhag and Hartwig 2008; Hartwig et al. 2010; Hartwig et al. 2011; Hines et al. 2010). Specifically, theoretical notions about the psychology of guilt and innocence predict that guilty suspects will use avoidant strategies with respect to possibly self-incriminating information (Strömwall et al. 2006), whereas innocent suspects will use much more forthcoming strategies (Kassin 2005). For example, innocent people may volunteer being at the crime scene, while guilty people tend to omit or deny such information. Importantly, recent work shows that these basic assumptions are empirically supported. In brief, guilty mock-suspects have been found to avoid mentioning possibly self-incriminating information (Strömwall et al. 2006), and deny holding possibly self-incriminating information more often than innocent suspects (Hartwig et al. 2006). Thus, when the SUE technique is used, guilty and innocent suspects are found to differ markedly in the extent to which their statements are consistent with the evidence held by the interviewer.

A full-scale use of the SUE technique demands considerable pre-interview planning that includes an assessment of the evidence at hand and tactical considerations with respect to both the questions asked and the potential disclosure of the evidence. However, the core of the technique is to encourage suspects to tell their story (in order to open up for ‘avoidance’) and, in the next phase, to close in on the critical information by asking open and specific questions (in order to open up for ‘denials’). There is now a series of empirical studies, with various samples, including children, showing that if the basic steps of the SUE technique are used, it is possible to elicit diagnostic cues to deception in the form of inconsistencies between the suspects’ statements and the evidence (e.g., Clemens et al. 2010; Clemens et al. 2011; Granhag 2010; Granhag et al. in press; Hartwig et al. 2005; Hartwig et al. 2011). That is, suspects who are innocent generally adopt a “tell it like it happened” strategy, thus remaining consistent with the evidence (e.g., confirming that they touched the briefcase from which the wallet was stolen), whereas guilty suspects, in the absence of information about existing evidence, tend to deny such information.

In previous SUE research, the evidence has typically been withheld fully until the end of the interview. A question that remains largely unanswered is whether this disclosure strategy is the optimal one for increasing verbal differences between liars and truth tellers. Hartwig et al. (2005) stated that “it might be worthwhile to investigate how more sophisticated ways of disclosing evidence such as different drip-feeding procedures in which parts of the evidence are disclosed throughout the interrogation, may moderate deception detection performance.” (p. 483). Other researchers have also suggested a variation of the SUE technique (Dando and Bull 2011) where evidence is disclosed gradually to the suspect throughout the interview. The present study is an investigation of how varying the timing of evidence disclosure will affect the elicitation of verbal cues to deception (i.e., statement-evidence inconsistencies). Mapping the effectiveness of different evidence disclosure tactics is an important enterprise, as field research shows that evidence disclosure is common in interviews (e.g., Leo 1996; Soukara et al. 2009).

As described above, the theoretical basis for the SUE technique is that innocent and guilty suspects employ different counter-interrogation strategies (Granhag and Hartwig 2008). However, a question that is still unexplored is whether suspects’ counter-interrogation strategies change during the course of an interview. In the present study, we tested the notion that these strategies may in fact be dynamic. Hartwig et al. (2007) found that guilty suspects claimed not to have a strategy coming into the interview because they did not know what would happen. This suggests that suspects may gradually revise their strategy during an interrogation, as a function of the interview tactics employed by the interviewer. In order to investigate this question, we manipulated evidence disclosure tactics and measured statement-evidence consistency at different stages during an interview, in contrast to previous research that has measured this cue on the basis of the whole interview.

In line with the reasoning above, we hypothesized that gradual disclosure of evidence may not be as effective as the late presentation of evidence used in the SUE technique, as suspects may come to realize that evidence against them does exist and, therefore, try to stay as close as possible to the actual event. In other words, it is possible that disclosing the evidence gradually would allow guilty suspects to tailor a counter-strategy to avoid inconsistencies with the evidence. This may decrease the number of verbal cues to deception in their statements compared to the late disclosure strategy.

In order to test our hypotheses, we developed a complex mock-terrorism paradigm that included multiple actions and generated multiple pieces of evidence, thus creating the opportunity for a lengthier and more realistic interview.

Hypotheses

Hypothesis 1

Differences in omissions as a function of veracity. In line with theory and previous research (e.g., Granhag and Hartwig 2008; Hartwig et al. 2006), we predicted that innocent suspects would volunteer more information during a free recall compared to guilty suspects. More specifically, we expected that liars would omit crime-relevant information (e.g., being close to where the crime occurred).

Hypothesis 2

Differences in statement-evidence inconsistencies as a function of veracity. In line with previous research on the SUE technique (e.g., Clemens et al. 2010), we predicted that guilty (vs. innocent) suspects would be more prone to contradict the evidence.

Hypothesis 3

Differences in statement-evidence inconsistencies as a function of interview style. We predicted that differences in statement-evidence inconsistencies between liars and truth-tellers would be greater when the evidence was disclosed late compared to when it was presented early or gradually. More specifically, we hypothesized that in the gradual evidence presentation condition, liars would become increasingly consistent with the evidence once they realized that evidence against them existed (i.e., once the first piece of evidence was disclosed), and thus that the differences between liars and truth-tellers in this condition would become progressively smaller, whereas in the late evidence presentation condition the differences would remain constant throughout the interview.

Method

Overview

The study was conducted in two phases. In the first phase, participants were randomly assigned to commit a mock terrorist act or a non-criminal act. In the second phase, participants were randomly assigned to one of three evidence disclosure conditions.

Phase 1

Participants

Eighty-six undergraduate students (58 females and 28 males, mean age 20.6, SD = 3.3) enrolled in a psychology course participated in this study for course credit. The sample included 48.8 % Hispanic, 17.4 % White, 12.8 % African-American, 9.3 % Asian, and 11.6 % other participants.

Design & Procedure

Upon arrival at the laboratory, participants were informed that they would take part in a study on interviewing techniques. After signing the informed consent forms, participants were assigned to one of six conditions (resulting from the 2 [Suspect: Guilty vs. Innocent] x 3 [Interview: Early vs. Late vs. Gradual evidence disclosure] design).

In the Guilty condition, participants were informed that they were going to pick up a package containing materials for assembling a bomb and drop the package off at another location (see Footnote 1). They were instructed to go to a room located in one of the college buildings where they were to pick up the package (Station 1), after which they were to go to the library’s reference desk and wait for an “agent” to approach them. They were told that the agent would ask “What time does the library close?” and that they should answer “It is always open”. After this exchange, the agent provided them with further instructions (Station 2). The agent told the participants to go to another room (in the same building as Station 1) to drop off the package (Station 3).

In the Innocent condition, participants were told that they were going to pick up a package with a book inside and drop it off at another location. They were instructed to go to a room (same room as the guilty participants) where they were to pick up the book (Station 1), after which they were to go to the library’s reference desk and ask a librarian whether they have another copy of that book (at this point, they were also approached by the confederate “agent” who asked them “What time does the library close?” – the participants in the Innocent condition, however, were not advised of this event and were not told what to answer) (Station 2). They then needed to go to another room (the room number was given to them with the initial instructions) and drop off the package (Station 3).

Evidence

In line with previously employed paradigms (e.g., Hartwig et al. 2005), the events were designed to generate evidence against both innocent and guilty suspects. This evidence was ambiguous in the sense that it suggested that the person may have been involved in the crime, but not conclusively so. For both innocent and guilty suspects, the task generated a total of 9 pieces of evidence (3 from each station). The evidence for each station included: security records indicating that the person had entered the particular building at the time the suspected mock terrorism act was committed; surveillance camera footage showing the person outside the room where the package was; witnesses reporting seeing the suspect enter the library and talk to the agent; and/or fingerprints matching the suspect found on the boxes from which the package was retrieved and where it was dropped off. In line with earlier studies, no actual fingerprints were taken, but it was established that all suspects did in fact touch these boxes (e.g., in order to retrieve the package or book they had to touch the box; in order to put the package or the book in the designated mailbox, they also had to open it, and so their fingerprints would definitely have been there). The plausibility of this evidence was rarely questioned by participants. In the few cases where this did happen, the interviewer was instructed to tell the suspect that the fingerprints lifted from the box were compared to those lifted from an instruction sheet that the suspects handled before the event took place. The few suspects who questioned the plausibility of the evidence were satisfied with this explanation.

Phase 2

Interview

Upon returning to the lab, all participants were informed that they were suspected of a terrorist activity, and that they would be interviewed regarding their recent whereabouts. They were instructed to convince the interviewer of their innocence, and they would have a chance to win $200 in a lottery if they succeeded. Participants were allowed to take their time to prepare for the interview and knock on the door when they thought they were ready. All interviews consisted of a) free recall (participants were asked: Could you please tell me, in as much detail as possible, what you did today at xx (time when the event occurred)?), b) specific questions about the evidence (e.g. Did you visit room Y?; What did you do in room Y?; Did you see a box in room Y? If yes, where was it? Did you look in the box? If yes, what was in it?; Did you touch the box?; Did you take anything from the box?; etc.) and c) evidence disclosure. In the Early disclosure condition, the evidence was disclosed to the mock suspect at the beginning of the interview (followed by free recall and specific questions). In the Late disclosure of evidence condition, the interview started by asking the suspect to provide a free recall, followed by specific questions, and finally, all evidence was disclosed at the end of the interview. In the Gradual evidence disclosure condition, the interview started with free recall, and the evidence was disclosed after each specific questioning part pertaining to each Station (i.e. after all specific questions about station one – the pickup location – participants were informed of the evidence that exists against them pertaining to being at this place. Then, the next set of questions was asked and next set of evidence disclosed, etc.). See Table 1 for an outline of the interviews in each condition.

Table 1 Outline of interviews for each condition
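
The ordering of phases described for the three disclosure conditions can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `interview_outline`, the phase labels, and the per-station breakdown of the specific questions in the Early and Late conditions are assumptions made for readability, not part of the original protocol materials.

```python
from typing import List

# Station 1 = pickup room, Station 2 = library reference desk, Station 3 = drop-off room
STATIONS = (1, 2, 3)


def interview_outline(condition: str) -> List[str]:
    """Return the ordered interview phases for one evidence disclosure condition.

    Phase labels are hypothetical; only the ordering follows the text.
    """
    questions = [f"specific questions: station {s}" for s in STATIONS]
    evidence = [f"disclose evidence: station {s}" for s in STATIONS]

    if condition == "early":
        # All evidence first, then free recall, then specific questions.
        return evidence + ["free recall"] + questions
    if condition == "late":
        # Free recall, then specific questions, then all evidence at the end.
        return ["free recall"] + questions + evidence
    if condition == "gradual":
        # Free recall, then questions and evidence alternate station by station.
        phases = ["free recall"]
        for q, e in zip(questions, evidence):
            phases += [q, e]
        return phases
    raise ValueError(f"unknown condition: {condition!r}")


if __name__ == "__main__":
    for name in ("early", "late", "gradual"):
        print(name, "->", interview_outline(name))
```

Every condition contains the same seven phases; only their order differs, which is the manipulation of interest.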

Post-Interview Questionnaire

After the interview, participants completed questionnaires in which they reported their age, sex as well as ratings (on a Likert scale ranging from 1 to 10) of how deceptive they were, how motivated they were to give a credible impression and how difficult they thought the interview was.

Analyses of the Interviews

The interviews were transcribed verbatim from the videotapes and coded in two steps in line with previous research (Hartwig et al. 2005), resulting in two dependent measures: a) omissions of incriminating information during free-recall and b) statement-evidence inconsistency throughout the interview. In order to establish inter-rater reliability, a random 20 % of interviews were independently coded by two coders who were blind to the veracity condition. Agreement rates ranged from 95 % (statement-evidence inconsistency) to 98 % (omissions). The remaining interviews were split in half and coded by these same coders.

Omissions of Relevant Information in Free Recall

We coded the amount of information the suspect gave during the free recall phase that pertained to the nine (3 for each station) pieces of evidence. If a suspect mentioned going to the building where the package was picked up, he or she provided information pertaining to one piece of evidence (i.e., security records showing that they have entered that building). Accounting for one piece of evidence was scored as 1 point for a possible total score of 9, if all 9 pieces of evidence were accounted for.
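
As a rough illustration of this scoring rule, the sketch below counts how many of the nine pieces of evidence a free recall accounts for. The evidence identifiers (e.g., `S1-E1`) and the function names are hypothetical, and treating the number of omissions as nine minus the number of pieces accounted for is an assumption about how the two quantities relate, not a rule stated in the coding scheme.

```python
from typing import Iterable

# Hypothetical identifiers: three pieces of evidence for each of the three stations.
EVIDENCE_IDS = [f"S{station}-E{item}" for station in (1, 2, 3) for item in (1, 2, 3)]


def accounted_for(mentioned: Iterable[str]) -> int:
    """Number of distinct evidence pieces the free recall accounts for (0 to 9)."""
    mentioned_set = set(mentioned)
    unknown = mentioned_set - set(EVIDENCE_IDS)
    if unknown:
        raise ValueError(f"unknown evidence identifiers: {sorted(unknown)}")
    return len(mentioned_set)


def omissions(mentioned: Iterable[str]) -> int:
    """Assumed complement: evidence pieces NOT accounted for in the free recall."""
    return len(EVIDENCE_IDS) - accounted_for(mentioned)


if __name__ == "__main__":
    recall = ["S1-E1", "S1-E3", "S2-E2"]  # made-up coding of one free recall
    print(accounted_for(recall), omissions(recall))
```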

Statement-Evidence Inconsistency During Interview

The second dependent measure we coded was the inconsistency of statements throughout the interview (i.e., during free recall as well as specific questions phase) with the evidence against the suspect. For each piece of evidence, we checked if the suspect gave information that was completely inconsistent with the evidence (score of 3), possibly consistent (score of 2), or completely consistent (score of 1). For example, for the fingerprints evidence on the box from which the package was taken, if a suspect mentioned opening the box, it would be completely consistent with the evidence; if he/she mentioned looking around where the box was, it would be scored as possibly consistent, and if the suspect claimed that they have never even gone near there, it would be scored as completely inconsistent. This coding was done for each piece of evidence, resulting in a range of possible scores from 9 (if the suspect was completely consistent for each piece of evidence) to 27 (if they were completely inconsistent for each piece of evidence).
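
The arithmetic of this measure can be made concrete with a short sketch. The constant and function names are illustrative assumptions; the 1-to-3 codes, the nine pieces of evidence, and the resulting 9-to-27 range follow the description above.

```python
from typing import Sequence

# Per-evidence codes as described in the text.
CONSISTENT = 1           # statement completely consistent with the evidence
POSSIBLY_CONSISTENT = 2  # statement possibly consistent with the evidence
INCONSISTENT = 3         # statement completely inconsistent with the evidence

N_EVIDENCE = 9  # three pieces of evidence for each of the three stations


def inconsistency_total(codes: Sequence[int]) -> int:
    """Sum the per-evidence codes into a total score between 9 and 27."""
    if len(codes) != N_EVIDENCE:
        raise ValueError(f"expected {N_EVIDENCE} codes, got {len(codes)}")
    valid = (CONSISTENT, POSSIBLY_CONSISTENT, INCONSISTENT)
    if any(code not in valid for code in codes):
        raise ValueError("each code must be 1, 2, or 3")
    return sum(codes)


if __name__ == "__main__":
    # Made-up coding: consistent for station 1, mixed for station 2, denials for station 3.
    example = [1, 1, 1, 2, 1, 2, 3, 3, 3]
    print(inconsistency_total(example))
```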

Results

Preliminary Analyses

On average, self-reported motivation to give a credible statement was high (M = 7.23; SD = 2.3), and there was no difference in motivation across conditions. A one-way ANOVA revealed a significant difference in self-reported nervousness during the interview between innocent (M = 4.36) and guilty (M = 5.86) participants (F (1, 85) = 5.74; p = 0.019; d = 0.51), and a trend toward a significant difference in the perceived difficulty of the interview between innocent (M = 3.34) and guilty (M = 4.6) participants (F (1, 85) = 3.7; p = 0.058; d = 0.4). There were no differences in these measures as a function of interview strategy.

Hypothesis 1

Differences in omissions as a function of veracity. We predicted that innocent suspects would volunteer more information during the free recall compared to guilty suspects. More specifically, we expected that liars would omit crime-relevant information (e.g., being close to where the mission occurred). Table 2 presents the mean and standard deviation scores of omissions (higher scores mean more omissions) in guilty and innocent suspects. An independent sample t-test confirmed our hypothesis (t (84) = -5.27; p < 0.0001; d = 1.14).

Table 2 Mean scores of omissions and statement-evidence inconsistencies for guilty and innocent suspects

Hypothesis 2 & 3

Differences in statement-evidence inconsistencies as a function of veracity. Differences in statement-evidence inconsistencies as a function of interview style. We predicted that guilty (vs. innocent) suspects would be more prone to contradict the evidence, and that these inconsistencies would vary as a function of interview style. Table 2 presents the mean and standard deviation scores for statement-evidence inconsistencies (higher scores = more inconsistencies) in guilty and innocent suspects. A 2 (liar vs. truth-teller) x 3 (early, late, gradual evidence presentation) ANOVA revealed a main effect of veracity (F (1, 86) = 23.41; p < 0.0001; d = 1.03), where innocent participants had significantly lower statement-evidence inconsistency scores (M = 13.88) than guilty participants (M = 19.08), thus confirming our second hypothesis. We also predicted that differences in statement-evidence inconsistencies between liars and truth-tellers would be greater when the evidence was disclosed late than when it was disclosed early or gradually. Neither the main effect of strategy (F (2, 86) = 1.43; ns; d = 0.35) nor the interaction (F (3, 86) = 1.37; ns; d = 0.33) was significant. However, in line with previous studies (Hartwig et al. 2005; Hartwig et al. 2006), we conducted a series of planned comparisons (comparing liars to truth-tellers within each interview condition), using independent sample t-tests with a Bonferroni correction, meaning that a significance level of 0.0017 was required. Figure 1 presents the mean scores of liars and truth-tellers broken down by interview condition, with error bars representing the 95 % confidence intervals. Liars in the late evidence presentation condition had significantly more statement-evidence inconsistencies than truth-tellers (t (28) = -4.39; p < 0.001; d = 1.61). In the gradual evidence presentation condition, the difference was not significant under the Bonferroni correction (t (27) = -2.5; p = 0.02; d = 0.93), and no significant difference was found in the early evidence presentation condition (t (25) = -1.69; p = 0.10; d = 0.65). Examination of the plot and the means suggests that, contrary to our prediction, the difference between liars and truth-tellers in the gradual evidence presentation condition was reduced because truth-tellers became less consistent rather than because liars became more consistent.
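
For readers who wish to see how the effect sizes for the planned comparisons relate to the reported t statistics, the following sketch applies the standard conversion for an independent-samples t-test, d = |t| × sqrt(1/n1 + 1/n2). The per-group sample sizes used here are assumptions: only the degrees of freedom (n1 + n2 - 2) are reported, so a near-equal split of each condition is assumed, and the resulting values should be read as approximations of the reported effect sizes rather than exact reproductions.

```python
import math


def cohens_d_from_t(t: float, n1: int, n2: int) -> float:
    """Cohen's d implied by an independent-samples t statistic and the two group sizes."""
    return abs(t) * math.sqrt(1.0 / n1 + 1.0 / n2)


# (t, assumed n1, assumed n2, reported d); group sizes are NOT given in the text
# and are chosen only so that n1 + n2 - 2 matches the reported degrees of freedom.
PLANNED_COMPARISONS = {
    "late": (-4.39, 15, 15, 1.61),
    "gradual": (-2.5, 14, 15, 0.93),
    "early": (-1.69, 13, 14, 0.65),
}

if __name__ == "__main__":
    for condition, (t, n1, n2, reported) in PLANNED_COMPARISONS.items():
        d = cohens_d_from_t(t, n1, n2)
        print(f"{condition}: d from t = {d:.2f} (reported d = {reported})")
```

Under these assumed group sizes, the converted values fall within about 0.01 of the reported effect sizes, and the ordering late > gradual > early is preserved.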

Fig. 1 Statement-evidence inconsistencies as a function of interview style in guilty and innocent suspects. Error bars represent 95 % confidence intervals. Only the differences in the late evidence presentation condition are statistically significant.

Discussion

The purpose of the present study was to identify the optimal evidence disclosure strategy for eliciting cues to deception. More specifically, our study was designed to test differences in verbal cues to deception as a function of the timing of evidence disclosure (early vs. gradual vs. late) using a novel mock-terrorist act paradigm. The results provide further support for the utility of late disclosure of evidence (Hartwig et al. 2005; 2006) in creating diagnostic verbal cues to deception, by showing that withholding the evidence strategically until the end of the interview produces the largest verbal differences between liars and truth-tellers. While gradual evidence disclosure seems to be better than disclosing all the evidence early on in the interview (as evidenced by the effect size differences found in those conditions), our results indicate that it does not appear to be as effective as late evidence disclosure. Moreover, comparing the results from the late and gradual evidence disclosure conditions, it appears that the decrease in the magnitude of the statement-evidence inconsistency cue in the gradual disclosure condition (compared to the late disclosure condition) was largely due to an increase in inconsistency in the statements of innocent participants, rather than a decrease in inconsistencies among the guilty suspects, as we had predicted. This suggests that the gradual disclosure approach could possibly affect innocent suspects in a way that puts them at a greater risk of being mistaken for guilty. Speculatively, innocent suspects in the gradual disclosure condition may become wary when they realize there is evidence against them, and thus adopt a more avoidant strategy. It is possible that when evidence is disclosed, albeit ambiguous, innocent suspects feel compelled to distance themselves further from the crime scene. That is, even though there is a perfectly legitimate reason for their fingerprints to appear on the mailbox where the package was deposited, they may believe that their “story” is not good enough and be compelled to deny being in the mailroom altogether. This finding warrants further empirical exploration.

Dando and Bull (2011) also compared early and late disclosure of evidence to a gradual disclosure approach. While they did not analyze cues to deception, their analysis of deception detection accuracy indicated that the gradual disclosure tactic produced higher lie detection accuracy rates compared to the early and late disclosure of evidence. We found that the late disclosure tactic was more effective in terms of eliciting cues to deception; thus, assuming that deception judges attend to those cues, they should obtain higher accuracy in the late disclosure condition. However, it is difficult to explain the discrepancy between our findings and theirs, as Dando and Bull did not offer any analysis of verbal cues to deception. We suggest that future research on the effects of gradual disclosure of evidence include coding of verbal cues to deception in order to clarify these contradictory patterns.

It is interesting to note that while innocent suspects had significantly fewer statement-evidence inconsistencies than guilty suspects (irrespective of evidence disclosure tactic), inconsistencies were still present in their statements. It is beyond the scope of the present study to determine what sort of inconsistencies these were and how they differed from inconsistencies in the liars’ statements. However, a qualitative analysis of these statements may present an interesting future direction to this line of research.

It is worth noting that in the present study, we employed a novel and relatively complex mock-terrorism paradigm. By providing the interviewer with as many as nine pieces of evidence, this paradigm allowed for a longer, more substantial interview with the suspect, giving him or her more opportunities to lie (i.e. come up with an alternative account of the events) or tell the truth, and brought the interview to a higher level of ecological validity. It proved to be successful in producing rich data for analysis, and showed its utility in yielding highly significant differences in verbal cues elicited from liars and truth-tellers.

We believe it is worth noting that evidence disclosure is a complicated issue that demands considerations beyond the timing of disclosure. For example, the way in which the evidence is framed when it is disclosed (i.e., whether it is presented in a general, vague manner or a more specific one) may very well have an impact on cues to deception (see Granhag et al. in press). Also, it is conceivable that the effectiveness of various disclosure tactics will be moderated by the strength of a particular piece of evidence. Due to the importance of evidence disclosure tactics in interviews, and potential of these tactics to affect cues to deception, we encourage future research to further explore these issues.

As with any experimental study, this one is not without limitations. Specifically, the sample consists of undergraduate students and the “criminal event” is artificial in that no real crime is committed. While the “mock crime” paradigm has been successfully used in numerous studies and is believed to be an effective way of studying real world phenomena in controlled settings, critics may question its generalizability to real life interrogation, arguing that the event is not nearly as stressful and the motivation to be believed is not nearly as great. However, Vrij et al. (2010), in their detailed review of the current research in this area, concluded that “although high-stakes lies may be harder for liars to tell, their behavioral signs are neither obvious [and] may simply not be more extreme than those of lower-stakes lies” (p. 110). While we see no compelling reason why the stakes would alter the patterns obtained here, future studies may address this issue by testing the effects under higher stakes conditions. For a further discussion and a summary of research on the issue of differences between liars and truth-tellers under various high stakes and low stakes conditions, as well as people’s ability to detect deception using naturally occurring as well as elicited cues, see Bond and DePaulo (2006), DePaulo et al. (2003), Vrij et al. (2010) and Vrij and Granhag (2012).

In sum, the results of the present study suggest that the late evidence disclosure tactic works better in eliciting verbal cues to deception than either early or gradual evidence disclosure. We encourage future research to replicate these findings and explore the possible explanations for why this may be the case. In particular, future research ought to consider explanations for the pattern displayed by innocent suspects in the gradual disclosure condition.