Introduction

The concealed information test (CIT), or guilty knowledge test (GKT), examines whether a participant has knowledge of crime-related information, usually by means of physiological measures (Ben-Shakhar and Elaad 2003; Iacono 2007). Until recently, autonomic measures such as electrodermal activity and respiration have been used in the CIT, but numerous studies have also been done with event-related brain potentials (ERPs). In particular, it is suggested that a late positive wave called P3 or P300 can be an index of a participant’s recognition of a specific item (e.g., Allen and Iacono 1997; Allen et al. 1992; Farwell and Donchin 1991; Rosenfeld 2005; Rosenfeld et al. 1988), although it may be vulnerable to false memory (Allen and Mertens 2008).

In the P300-based CIT paradigm, participants are usually presented with three types of items (relevant, irrelevant, and target) in random order. The relevant and target items are presented infrequently (e.g., 10–15% of the array), whereas the irrelevant items are presented frequently (e.g., 70–80%). A relevant item is crime-related information that only a person involved in the crime (e.g., perpetrator, eyewitness) would know. Irrelevant items are similar to the relevant item but not related to the specific crime. Usually, four or more different irrelevant items are presented, each with the same probability as the relevant item. Only the guilty person involved can differentiate between them. The target item is embedded in the sequence of relevant and irrelevant items, and participants are asked to respond differently to it than to the other items. This procedure ensures that participants attend to the stimulus sequence. It has been assumed that P300 amplitude is determined by subjective probability and the meaning of the eliciting stimulus when the stimulus is unequivocal and the level of attention constant (Johnson 1986, 1988). For a participant who has knowledge of a specific crime, the relevant item is more meaningful than irrelevant items, and is presented at a lower probability than multiple irrelevant items. Because the target stimulus also occurs infrequently and has a task-relevant meaning, it elicits a large P300 that can be compared statistically with the response to the relevant item (Farwell and Donchin 1991).

Generally, the CIT is considered a test for assessing recognition of relevant information (Allen et al. 1992; Lykken 1959, 1998); however, it is controversial whether mere recognition is sufficient for the larger P300 amplitude for relevant items in the CIT. For example, Rosenfeld (2006) showed that a larger P300 amplitude and a higher detection accuracy resulted when autobiographical information (e.g., a participant’s name) was used as the relevant item than when incidentally acquired information (e.g., the experimenter’s name) was used, although both items were remembered perfectly. Meijer et al. (2007) conducted CIT experiments using face pictures and found that mere recognition of the relevant face was not sufficient for a larger P300. They suggested that a factor like high familiarity might be the key to successful detection of concealed information. Moreover, several studies have reported that motivation to conceal led to a larger P300 for the relevant item. Allen and Iacono (1997) suggested that incentive to deceive increased the accuracy of detection in the P300-based CIT, at least when using a bootstrapping procedure. Recently, Verschuere et al. (2009) reported that when participants responded deceptively (i.e., by pressing “I don’t recognize this name” in response to their own name), a larger difference in P300 between the relevant and irrelevant items and a higher detection accuracy were obtained than when participants gave a truthful response (i.e., by pressing “I recognize this name” when shown their own name), where a smaller but detectable P300 difference occurred. These findings agree with findings of the CIT based on autonomic responses (Elaad and Ben-Shakhar 1989; Gustafson and Orne 1965), which showed that motivation to avoid detection was ineffective and often increased detection. However, it has been thought that accurately assessing the contribution of motivation to avoid detection is difficult (Ben-Shakhar and Elaad 2003).

The effect of intention to conceal can be interpreted in two ways. One possibility is that the intention to conceal elicits a deception-specific process during the CIT, and ERPs reflect it. The existence of such a deception-specific process has been demonstrated in ERP studies (e.g., Johnson et al. 2003, 2004, 2005) and in brain imaging studies (e.g., Abe et al. 2006; Ganis et al. 2003; Langleben et al. 2002; Phan et al. 2005). The other possibility is that a general process not specific to deception, such as need for additional processing or increased significance/salience of an item, is responsible for the P300 increase. If the second possibility is the case, not only deception but also a different mental operation, such as informing the experimenter of the item by physiological signals, should cause a larger P300. This kind of instruction, the opposite of concealing, has been used in the field of brain-computer interface (BCI) to enable paralyzed or control participants to communicate not through muscle movement but through brain activity, such as the P300 elicited by an item, a voluntary increase or decrease of a slow cortical potential, or modulation of spontaneous EEG activity such as the mu rhythm (Birbaumer 2006; Wolpaw et al. 2002). In effect, the CIT and BCI are two sides of the same coin. Both techniques probe information hidden either intentionally or unintentionally from the outside world.

In the present study, we examined whether an increase in P300 amplitude caused by instruction to conceal is specific to deception. To this end, we did a within-participants experiment under four conditions using a card test in which a participant chose one card from five and hid it before the CIT. The card test is simple, but it shares central characteristics of the CIT and has been used to examine the effect of motivation to conceal information in the CIT (Furedy and Ben-Shakhar 1991). More realistic mock-crime procedures could be used, but laboratory settings can never replicate real crime scenes (Elaad 2009); therefore, we selected the card test procedure. First, participants performed a typical CIT without additional instruction (control condition). Then, three conditions were introduced in counterbalanced order. In the conceal condition, participants were instructed to make an effort to leave the card they chose undetected by suppressing brain response to it. In the transmit condition, they were instructed to make an effort to inform the experimenter of the chosen card by enhancing brain response to it. In the no secret condition, participants showed the chosen card to the experimenter beforehand and lost their motivation to conceal it during the CIT. Based on the previous studies examining the effect of motivation to conceal information on ERPs (Allen and Iacono 1997; Verschuere et al. 2009), we predicted two outcomes. First, instruction to conceal the chosen card would enhance the difference in P300 amplitude between the chosen card and the unchosen cards. Second, if the P300 is enhanced by a deception-specific process, it would increase only in the conceal condition. Conversely, if the P300 is affected by a more general process such as additional processing of the item, it would increase not only in the conceal condition but also in the transmit condition.

Method

Participants

Eighteen students at Hiroshima University participated in the experiment (nine women and nine men; mean age 21.6 years). All participants were right-handed, as assessed by the Edinburgh Handedness Inventory (Oldfield 1971), and had normal or corrected-to-normal vision according to self-report. They gave written informed consent. The protocol was approved by the Research Ethics Committee of the Graduate School of Integrated Arts and Sciences, Hiroshima University.

Stimuli and Tasks

Each participant performed a similar card-test CIT four times with different instructions. At the beginning of each condition, participants were shown five playing cards (2, 3, 4, 5, and 6 of the same suit; different suits were used in different conditions) and told to choose one card. Then they were asked to remember it, and to put it in an envelope when the experimenter was absent. The chosen card was regarded as the relevant item, and the unchosen cards were regarded as the irrelevant items. After this memorization phase, the pictures of the five cards and a joker (target) were presented one by one on a cathode ray tube screen placed 170 cm in front of the participant. Each card extended 1.6° in width and 2.6° in height. The task was to respond selectively to the five cards and to the joker by pressing either the left or the right button as quickly and accurately as possible with the left or right index finger. The exact instructions are described in the next section. The stimulus duration was 300 ms, and the interstimulus interval varied between 1,500 and 1,900 ms (mean 1,700 ms). Participants performed 180 trials per condition with a short break after 90 trials. The chosen card and the target card were presented in 30 trials each (p = .17 each). The unchosen cards were presented in 120 trials (30 trials each of the 4 unchosen cards, p = .67).
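The trial structure described above can be illustrated with a short Python sketch. It is not the presentation software used in the experiment. The function name `make_sequence`, the uniform distribution of the interstimulus interval, and the absence of any constraint on stimulus repetition are assumptions made here for illustration, because the text specifies only the range and mean of the interval and that six items were each shown 30 times.

```python
import random

def make_sequence(chosen, unchosen, target="joker", reps=30, seed=0):
    """Return a shuffled list of (item, isi_ms) pairs for one condition.

    Each of the six items (chosen card, four unchosen cards, target)
    appears `reps` times, giving 180 trials when reps=30.
    """
    rng = random.Random(seed)
    items = [chosen, target] + list(unchosen)
    trials = [item for item in items for _ in range(reps)]
    rng.shuffle(trials)  # assumption: simple shuffle, no repetition constraint
    # Assumption: ISI drawn uniformly from 1,500-1,900 ms (expected mean 1,700 ms).
    return [(item, rng.uniform(1500, 1900)) for item in trials]

seq = make_sequence(chosen=4, unchosen=[2, 3, 5, 6])
n_total = len(seq)
p_chosen = sum(item == 4 for item, _ in seq) / n_total             # 30/180
p_unchosen = sum(item in (2, 3, 5, 6) for item, _ in seq) / n_total  # 120/180
```

With these settings the chosen card and the target each account for 30 of 180 trials, and the four unchosen cards together account for 120 of 180 trials.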

Procedure

Each participant performed the card test in four conditions. First, all participants performed a typical CIT without additional instruction (control condition). They were instructed as follows: “Now we try to detect the card you chose. Please press the right (or left) button to the joker and press the left (or right) button to both the chosen and unchosen cards as quickly and accurately as possible with respective index fingers.” After this condition, participants were told explicitly that the chosen card would usually elicit a larger brain activity than the unchosen cards and the experimenter could detect the card on the basis of this information. They performed the remaining three conditions in counterbalanced order. In the conceal condition, participants were instructed as follows: “Please make an effort to leave the chosen card undetected by suppressing brain response to it. Do not fake an unchosen card as the chosen card. You should only make a mental effort to suppress.” In the transmit condition, they were instructed as follows: “Please make an effort to inform the experimenter of the chosen card by enhancing brain response to it. You should only make a mental effort without extra physical movements.” In the no secret condition, participants were asked to show the chosen card to the experimenter before the CIT, which extinguished participants’ motivation to conceal the card. Before starting each condition, participants were encouraged to ask a question if they had uncertainty about the task. The task was initiated only after they declared that they understood the instruction clearly. After each condition, participants were asked to rate the levels of difficulty, concentration, arousal, and efficacy they felt during the block on 9-point scales (1 = minimum, 9 = maximum). Then they were asked to verbally report the chosen card. All participants remembered it correctly in all conditions.

Physiological Recording

An electroencephalogram (EEG) was recorded from 38 scalp sites (8 midline: AFz, Fz, FCz, Cz, CPz, Pz, POz, Iz; 30 lateral: Fp1/2, F3/4, F7/8, FC1/2, FC5/6, FT9/10, C3/4, T7/8, CP1/2, CP5/6, TP9/10, P3/4, P7/8, PO9/10, O1/2, according to the 10% system) using an elastic cap (EASYCAP, Munich, Germany) with Ag/AgCl electrodes. A high-pass filter of 0.016 Hz (i.e., a time constant of 10 s) and a low-pass filter of 60 Hz were used at recording. Horizontal and vertical electrooculograms (EOGs) were recorded from the outer canthi and from above and below the left eye. The ground electrode was fixed at the forehead (Fpz). The sampling rate was 500 Hz. Electrode impedance did not exceed 5 kΩ.

Data Reduction

The EEG data were re-referenced to the nose tip. A digital bandpass filter of 0.05–30 Hz was applied, and ocular artifacts were corrected using the method of Gratton et al. (1983) implemented in Brain Vision Analyzer 1.05 (Brain Products, Germany). ERP waveforms were calculated separately for each participant, stimulus type (chosen, unchosen, and target), and condition (control, conceal, transmit, and no secret). The period between 200 ms before and 1,000 ms after the stimulus onset was averaged. Each ERP waveform was aligned to the 200-ms pre-stimulus baseline by subtracting the mean amplitude of this period from each point of the waveform. Because the peaks of the P300s elicited by the chosen and unchosen cards were obscure for most participants, P300 amplitude was scored as the mean amplitude between 450 and 650 ms after stimulus onset, based on our visual inspection of the grand mean waveforms. On the other hand, the peak of the P300 elicited by the target card could be clearly determined, so that the peak amplitude and latency of the most positive deflection in the latency range of 300–600 ms were measured at the most dominant site, Pz. Because P300 amplitude was more than two times larger for the target than for the chosen and unchosen cards, they were analyzed separately. The trials in which reaction times were shorter than 100 ms or longer than 1,000 ms and trials with incorrect button responses were regarded as error trials, and eliminated from the reaction time (RT) and ERP analyses. The mean numbers of averaged trials in each condition were 25, 27, and 107 for the target, chosen, and unchosen cards, respectively. At least 20 trials were averaged for each individual waveform.
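The averaging, baseline alignment, and mean-amplitude scoring described above can be sketched in Python with NumPy. This is an illustration only; the actual analysis was run in Brain Vision Analyzer. The function names, the 600-sample epoch layout, and the half-open treatment of the 450–650 ms window are assumptions introduced here, and the input is synthetic.

```python
import numpy as np

FS = 500        # sampling rate in Hz, as stated in the recording section
T_MIN = -0.2    # epoch start in seconds relative to stimulus onset

def sample_index(t_sec):
    """Convert a latency in seconds to a sample index within the epoch."""
    return int(round((t_sec - T_MIN) * FS))

def erp_from_epochs(epochs):
    """Average single trials (n_trials x n_samples), then subtract the
    mean of the 200-ms pre-stimulus period from every point."""
    erp = epochs.mean(axis=0)
    return erp - erp[:sample_index(0.0)].mean()

def mean_amplitude(erp, t0=0.45, t1=0.65):
    """Mean amplitude between t0 and t1 (assumption: t1 sample excluded)."""
    return erp[sample_index(t0):sample_index(t1)].mean()

# Synthetic check: a 3 µV DC offset plus a 5 µV plateau at 450-650 ms.
epochs = np.full((20, 600), 3.0)
epochs[:, sample_index(0.45):sample_index(0.65)] += 5.0
amp = mean_amplitude(erp_from_epochs(epochs))
```

In this synthetic case the DC offset is removed by the baseline step, so the scored amplitude reflects only the plateau.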

Statistical Analysis

Repeated measures analyses of variance (ANOVAs) were applied to subjective, behavioral, and ERP data using SPSS 14.0. When degrees of freedom in the numerator were greater than 1, Greenhouse-Geisser ε correction was applied to control Type I error. The effect sizes in ANOVAs were shown as partial eta squared (partial η²). Post hoc comparisons were made by paired t tests with Bonferroni correction. Significance level was set at .05 for all analyses.
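For readers who wish to check the reported effect sizes, partial η² can be recovered from an F ratio and its uncorrected degrees of freedom through the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The Python sketch below applies it to two F values reported in the Results; the function name is introduced here for illustration and is not part of SPSS.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom.

    Follows from F = (SS_effect / df_effect) / (SS_error / df_error)
    and partial eta^2 = SS_effect / (SS_effect + SS_error).
    """
    return f_value * df_effect / (f_value * df_effect + df_error)

# Main effect of stimulus type on P300 amplitude, F(1, 17) = 22.17
eta_stimulus = round(partial_eta_squared(22.17, 1, 17), 2)   # 0.57
# Main effect of condition on reaction time, F(3, 51) = 19.38
eta_condition = round(partial_eta_squared(19.38, 3, 51), 2)  # 0.53
```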

Individual Bootstrapping

To determine whether the P300 elicited by the chosen card was larger than the P300 elicited by unchosen cards on an individual basis, a bootstrapping method similar to that of Rosenfeld et al. (2004) was applied to the P300 amplitude data (mean of the 450–650 ms period) at Pz. Bootstrapping generates multiple averages from a fixed set of single-trial data (Wasserman and Bockenholt 1989). Suppose that there are n single-trial EEG traces for the chosen cards and i single-trial EEG traces for the unchosen cards after eliminating trials with artifacts and/or incorrect responses. Ordinary ERP waveforms are calculated by averaging the single-trial EEG traces without repetition. In the bootstrapping method, a large number of ERP waveforms are calculated by averaging n single-trial EEG traces sampled randomly from the n chosen card trials with replacement, and similarly by averaging i single-trial EEG traces from the i unchosen card trials with replacement. Then, P300 amplitude is measured in each ERP waveform and the amplitude difference between the chosen and unchosen cards is calculated. In the present study, this procedure was repeated 1,000 times, and the mean and standard deviation (SD) of the difference amplitude value were calculated. When the mean value was larger than 1.28 SD, the chosen card of that participant was judged to be detected. This criterion means that P300 amplitude is reliably larger for a chosen card than for unchosen cards with 90% confidence (one-tailed, Rosenfeld et al. 2004).
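The bootstrapping decision rule can be sketched in Python with NumPy as follows. This is a simplified illustration, not the code used in the study. It resamples single-trial mean amplitudes (450–650 ms at Pz) rather than whole EEG traces; because averaging and mean-amplitude scoring are both linear, the two approaches give the same amplitude differences. The function name, the use of the sample SD (`ddof=1`), and the synthetic input values are assumptions.

```python
import numpy as np

def bootstrap_detect(chosen, unchosen, n_iter=1000, criterion=1.28, seed=0):
    """Bootstrap test of whether the chosen-card P300 exceeds the unchosen one.

    chosen, unchosen: 1-D arrays of single-trial mean amplitudes (µV).
    Returns (detected, mean_diff, sd_diff).
    """
    rng = np.random.default_rng(seed)
    chosen = np.asarray(chosen, dtype=float)
    unchosen = np.asarray(unchosen, dtype=float)
    diffs = np.empty(n_iter)
    for k in range(n_iter):
        # Resample n chosen and i unchosen trials with replacement.
        c = rng.choice(chosen, size=chosen.size, replace=True).mean()
        u = rng.choice(unchosen, size=unchosen.size, replace=True).mean()
        diffs[k] = c - u
    mean_diff = diffs.mean()
    sd_diff = diffs.std(ddof=1)  # assumption: sample SD
    return bool(mean_diff > criterion * sd_diff), mean_diff, sd_diff

# Synthetic example: 27 chosen-card and 107 unchosen-card trials.
chosen_amps = np.linspace(9.0, 11.0, 27)
unchosen_amps = np.linspace(3.0, 5.0, 107)
detected, mean_diff, sd_diff = bootstrap_detect(chosen_amps, unchosen_amps)
```

The criterion of 1.28 corresponds to the one-tailed 90% point of the standard normal distribution, so the rule treats the bootstrap distribution of differences as approximately normal.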

Results

Behavioral and Subjective Data

The mean rates of error responses were 9.7, 0.8, and 0.4% for the target, chosen, and unchosen cards, respectively. These trials were removed from further analyses. Figure 1 shows the mean RTs in the four conditions. A two-way ANOVA with factors of condition (control, conceal, transmit, and no secret) and stimulus type (target, chosen, and unchosen cards) showed significant main effects of condition and stimulus type, F(3, 51) = 19.38, p < .001, ε = .72, partial η² = .53; F(2, 34) = 161.10, p < .001, ε = .70, partial η² = .90, respectively. Post hoc comparisons showed that the mean RT was longer in the transmit condition than in the other three conditions, and that the mean RT was longer for the target card than for the chosen and unchosen cards, p < .05. The interaction was also significant, F(6, 102) = 7.70, p < .001, ε = .88, partial η² = .31. In all conditions, the mean RT was longer for the target card than for the chosen and unchosen cards, p < .005. Moreover, the mean RT was longer for the chosen card than for the unchosen cards in the transmit and control conditions, p < .01, but not in the other conditions, p > .60. For the target and chosen cards, the mean RT was longer in the transmit condition than in the other conditions, p < .01. For the unchosen cards, the mean RT was longer in the transmit condition than in the conceal and no secret conditions, p < .01.

Fig. 1 Mean reaction times for the target, chosen, and unchosen cards in the four conditions (N = 18). Error bars indicate standard errors of the means across participants; asterisks indicate a significant difference (p < .05)

Table 1 summarizes the results of subjective measures. Each measure was submitted to a one-way ANOVA with a factor of condition. A significant effect was obtained only for task difficulty, F(3, 51) = 16.41, p < .001, ε = .79, partial η² = .49. Post hoc comparisons showed that participants rated the conceal and transmit conditions more difficult than the control and no secret conditions, p < .01.

Table 1 The peak amplitude and latency of the P300 elicited by the target card and summary of subjective ratings (N = 18)

P300s for the Chosen and Unchosen Cards

First, we analyzed the amplitude data of all 38 electrodes. For the eight midline electrodes, a three-way ANOVA with factors of condition, stimulus type, and site (AFz, Fz, FCz, Cz, CPz, Pz, POz, and Iz) was conducted. For the 30 lateral electrodes, a four-way ANOVA with factors of condition, stimulus type, hemisphere (left and right), and site (Fp1/2, F3/4, F7/8, FC1/2, FC5/6, FT9/10, C3/4, T7/8, CP1/2, CP5/6, TP9/10, P3/4, P7/8, PO9/10, and O1/2) was conducted. Because the main effect of hemisphere and the interaction effects including the factor of hemisphere were not significant, we will not report the results of the lateral ANOVA. In the midline ANOVA, all the main effects and interactions were significant. We focused on the amplitude data at the dominant site, Pz, which most previous P300-based CIT studies have dealt with. Analysis of the amplitude data at Cz or CPz yielded virtually the same results, although we do not report them.

Figure 2 shows grand mean ERP waveforms at Pz in all conditions. Both the chosen and unchosen cards elicited a P300, and the difference between these P300s is prominent in the period of 450–650 ms after stimulus onset. The P300 amplitude appeared to be larger for the chosen card than for the unchosen cards in the conceal and transmit conditions, but not in the control and no secret conditions. Figure 3 shows the P300 amplitudes for the chosen and unchosen cards. A two-way ANOVA with factors of condition and stimulus type (chosen and unchosen) showed significant main effects of condition and stimulus type, F(3, 51) = 13.20, p < .001, ε = .94, partial η² = .44; F(1, 17) = 22.17, p < .001, partial η² = .57. The interaction was also significant, F(3, 51) = 6.07, p < .001, ε = .88, partial η² = .26. When each condition was analyzed separately by paired t test, P300 amplitude was larger for the chosen card than for the unchosen cards in the conceal and transmit conditions, p = .008 and p < .001, respectively. The effect was marginally significant in the control condition, p = .068, and not significant in the no secret condition, p = .187. When the chosen card and the unchosen cards were analyzed separately by one-way ANOVA with a factor of condition, a significant effect was found for the chosen card, F(3, 51) = 11.99, p < .001, ε = .95, partial η² = .41, but not for the unchosen cards, F(3, 51) = 1.28, p = .293, ε = .81, partial η² = .07. Post hoc comparisons showed that P300 amplitude for the chosen card was larger in the transmit condition than in the other three conditions, p < .05.

Fig. 2 Grand mean ERP waveforms elicited by the chosen and unchosen cards (Pz recording, N = 18)

Fig. 3 Mean amplitude of 450–650 ms at Pz for chosen and unchosen cards in the four conditions (N = 18). Error bars indicate standard errors of the means across participants; asterisks indicate a significant difference (p < .05)

Bootstrapping analysis showed that detection rates (i.e., the percentage of participants who showed a reliably larger P300 amplitude for the chosen card than for the unchosen cards) were 33% (6/18), 33% (6/18), 61% (11/18), and 17% (3/18) for the control, conceal, transmit, and no secret conditions, respectively. The detection rate was generally low, but it was higher when participants had instruction to conceal or transmit than when they had no motivation to hide (i.e., no secret condition).

P300 for the Target Card

Figure 4 shows the grand mean ERP waveforms for the target card. Similar waveforms were elicited in all conditions, but slight differences appear. Table 1 shows peak amplitude and latency of the P300 elicited by the target card. A one-way ANOVA with a factor of condition on P300 amplitude did not show a significant effect, F(3, 51) = 1.05, p = .376, ε = .91, partial η² = .06, although P300 amplitude appears to be smaller in the conceal and transmit conditions. A similar ANOVA on P300 latency showed a significant effect, F(3, 51) = 4.80, p = .007, ε = .89, partial η² = .22. Post hoc comparisons showed that peak latency of the P300 elicited by the target card was longer in the conceal and transmit conditions than in the control and no secret conditions, p < .05.

Fig. 4 Grand mean ERP waveforms elicited by the target card in the four conditions (N = 18)

Discussion

This study investigated the effects of instructed intention to conceal on P300 and detection rate in the ERP-based CIT. We found that the chosen card elicited a larger P300 than the unchosen cards when participants were instructed to conceal it or transmit it. If participants had no instruction to conceal the chosen card, P300 amplitude did not differ between chosen and unchosen cards. Instruction type also affected P300 latency for the target card. When participants were instructed to conceal or transmit the chosen card, the peak latency of the P300 elicited by the target card increased.

The results of the present study are consistent with previous studies in several respects. First, consistent with Meijer et al. (2007), this study showed that mere recognition of the relevant item was not sufficient to elicit a reliable P300 amplitude difference between the relevant and irrelevant items. Even when participants remembered the chosen card perfectly, P300 amplitude did not differ between chosen and unchosen cards when participants had declared the chosen card before the CIT and lost their motivation to conceal it.

Second, consistent with Allen and Iacono (1997) and Verschuere et al. (2009), instruction to conceal the chosen card enhanced the difference in P300 amplitude between the relevant and irrelevant items. This result is also consistent with previous CIT research using autonomic measures (Elaad and Ben-Shakhar 1989; Gustafson and Orne 1965). Although previous studies have reported better detection accuracy as well as larger P300 for the relevant item, the detection rate in the conceal condition did not increase in the present study. This might have been caused by use of the joker, a highly salient stimulus, as the target card. It is known that P300 amplitudes for multiple stimuli are not determined by categories defined by the experimenter but by the categories perceived by a participant (Johnson and Donchin 1980; Rosenfeld et al. 2005). When the target stimulus is highly salient, the other two stimuli are prone to be grouped into the same category, and the P300 amplitude difference between the two stimuli may become smaller (Katayama and Polich 1998). As far as we know, no study has examined the effect of target salience in the P300-based CIT. The present result suggests that including a distinct target item in a stimulus sequence may overshadow the differential processing of the relevant and irrelevant items and deteriorate detection accuracy. Therefore, it may be better not to use any target item if the examiner can ensure examinees’ attention to the stimulus sequence. Indeed, a two-item paradigm without a target stimulus has been popular in the autonomic-based CIT (Ben-Shakhar and Elaad 2003; Iacono 2007). Rosenfeld et al. (2006) showed that the P300 difference between the relevant and irrelevant items could be detected even without a target stimulus and behavioral classification. Moreover, Rosenfeld et al. (2008) have proposed a new experimental paradigm, called the Complex Trial Protocol (CTP), in which a relevant or irrelevant item is followed by a target item. This two-stage protocol has been shown to be effective and tolerant of various countermeasures. The CTP can reduce the possibility of the target overshadowing effect and improve detection accuracy.

Importantly, the present study demonstrated that the P300 amplitude increase in the conceal condition was not caused by a deception-specific process. Even when participants attempted to transmit the chosen card to the experimenter instead of concealing it, a similar increase was observed. Therefore, it can be concluded that the P300 amplitude increase was due to additional processing irrespective of concealing or transmitting. Intention to conceal the chosen card paradoxically made the card more significant, and this process was reflected in a larger P300 (Allen and Iacono 1997). Evidence of such additional processing can be found in subjective, behavioral, and ERP results. First, the conceal and transmit conditions were rated more difficult than the control and no secret conditions. Second, the mean RT was longer for the chosen card than for the unchosen cards in the transmit condition. This result suggests that the chosen card was processed more extensively than the unchosen cards when participants attempted to convey it to the experimenter. This difference was not found in the conceal condition, where participants probably adopted a strategy of ignoring any distinction between the chosen and unchosen cards. Third, the peak latency of the P300 elicited by the target card was longer in the conceal and transmit conditions than in the control and no secret conditions. Moreover, the P300 amplitude for the target card tended to be smaller in these additional instruction conditions, although this difference was not statistically significant. Previous studies showed that the instruction to react deceptively reduced the amplitude of the late positive component (LPC), probably because cognitive resources were depleted by the attempt to deceive (Johnson et al. 2003, 2004, 2005). Although we did not instruct participants to react deceptively, the instruction to suppress or enhance brain activation to the chosen card may have served as a secondary task, and the cognitive resources allocated to the target card may have been reduced in the conceal and transmit conditions. Given that P300 latency is an index of the time required for stimulus evaluation (Magliero et al. 1984), this additional processing delayed the evaluation process of the target card in the conceal and transmit conditions.

The attempts to beat a deception detection test can be divided into two categories: inhibitory and active countermeasures (Honts and Amato 2002). It should be noted that the instruction to conceal the chosen card in the present study corresponds to an inhibitory mental countermeasure, because participants were forbidden to use an active countermeasure like faking an unchosen card as the chosen card. Previous studies have suggested that the P300-based CIT is tolerant of inhibitory countermeasures (e.g., count backward by sevens; Sasaki et al. 2001), but vulnerable to active countermeasures (Rosenfeld et al. 2004). If participants are allowed to use an active countermeasure, the intention to deceive may have different effects on P300 and decrease detection accuracy.

In conclusion, the present study confirmed that the P300 in the CIT reflects neither a deception-specific process nor mere recognition. Although a deception-specific process may occur in the CIT, a larger amplitude of the P300 elicited by the relevant item is not due to this process but rather to more general additional processing of the relevant item. Although the instructions in the conceal and transmit conditions were apparently different, chosen items elicited larger P300s in both conditions. This result suggests that any manipulation that enhances attention to the relevant items would increase P300 amplitude. Moreover, such a manipulation can overcome the target overshadowing effect. Although the CIT assesses a participant’s knowledge of a certain item, the instruction to do elaborative processing on test items may enhance the response to the relevant item, and thereby improve detection accuracy. This implies that detection results of the CIT will be improved if future research can specify an optimal task instruction.