Introduction

A wealth of research demonstrates that misinformation exposure can distort eyewitness memory (Loftus, 2005). The source monitoring framework (SMF; Johnson, Hashtroudi, & Lindsay, 1993) explains that when people are asked about misinformation, they likely imagine that misinformation while retrieving the original event. As the misinformation images become more detailed and sensory, they become more familiar and similar to memories of the original event. The SMF posits that because we employ heuristics (e.g., familiarity) to determine whether we experienced an event, we inadvertently incorporate misinformation into our original memory. However, additional research is needed into whether memories for negative and potentially traumatic stimuli behave in the same way as other memories, given that past findings have been mixed. While the literature, overall, agrees that memories for negative events are malleable, some studies suggest that they may be more malleable than positive or neutral memories (Monds, Paterson, & Kemp, 2016; Porter, Bellhouse, McDougall, ten Brinke, & Wilson, 2010) while other studies have found the opposite (Peace & Porter, 2004; Monds, Paterson, Kemp, & Bryant, 2013). Emerging research on a phenomenon called the “amplification effect” has consistently found that people tend to remember more trauma over time than they actually experienced; that is, they report having experienced trauma events at follow-up that they did not report at initial assessment (see van Giezen, Arensman, Spinhoven, & Wolters, 2005). Therefore, our first aim was to broadly examine whether negative, potentially traumatic memories distort, especially after misinformation exposure. To address this aim, we adapted Strange and Takarangi’s (2012) trauma analogue film paradigm. 
We chose this paradigm for three reasons: (a) it allows us to investigate whether misinformation can lead people to falsely remember entire parts of an event over time, rather than small details, similar to findings from field research (van Giezen et al., 2005); (b) film content depicting actual or perceived threat and serious injury—events listed as traumatic in the DSM-5 (American Psychiatric Association, 2013)—has been found to elicit responses analogous to symptoms experienced after actual trauma (e.g., intrusions, physiological arousal, negative cognitions and mood; see James, Lau-Zhu, Clark, Visser, Hagenaars & Holmes, 2016), and (c) the trauma analogue film paradigm is a common and accepted method of investigating trauma: as of 2016, 74 peer-reviewed articles (with a total of 87 experiments) have used traumatic or negative film stimuli within an experimental or prospective study design (James et al., 2016). However, we acknowledge that a trauma analogue is unlikely to replicate the extreme stress experienced during a real trauma and can only approximate the conditions under which an eyewitness to a traumatic event would be exposed to post-event misinformation.

Importantly, many studies have found that for emotional events, peripheral or contextual details may be forgotten or confused, while emotionally arousing details are enhanced (e.g., Burke, Heuer, & Reisberg, 1992; Christianson & Loftus, 1991; Mather et al., 2006; Rubin, Berntsen, & Bohni, 2008). There are a few explanations for this pattern. We know from past research, for example, that, compared to neutral stimuli, emotional stimuli capture attention faster and are more likely to reach conscious awareness (see Levine & Edelstein, 2009). Further, although trauma events may comprise both emotional and unemotional elements, when people rehearse trauma events—which we know they typically do extensively, both unintentionally via intrusions and intentionally (Schacter, 2001; Ehlers & Clark, 2000)—that rehearsal tends to focus on the emotional elements. For example, a car accident victim may frequently experience intrusive images of headlights approaching and voluntarily discuss those intrusions with his or her therapist (see Ehlers & Clark, 2000). Little is known, however, about the effects of emotional or even traumatic misinformation, given that trauma analogue studies have usually used unemotional misinformation test items (e.g., that the event occurred on Thursday when it actually occurred on Tuesday; see Monds et al., 2013, 2016; Porter et al., 2010). Because attention narrows to emotional stimuli, it is plausible that people would be more likely to notice, encode, and rehearse emotional misinformation, leading to increased false memories for emotional moments of an event compared to the unemotional moments. As such, our second aim was to examine whether memory distortion was increased for emotional (vs. unemotional) misinformation.

In reality, people are likely exposed to the same misinformation repeatedly. For example, after witnessing a car accident involving a red car, Witness A may wrongly inform Witness B that the car was pink. The police, working from this inaccurate information, may also suggest to Witness B that the car was pink. Witness B may also read media reports reporting the car as pink. Only a few studies have systematically examined the effect of such repeated misinformation exposure on memory (Foster, Huthwaite, Yesberg, Garry, & Loftus, 2012; Mitchell & Zaragoza, 1996; Zaragoza & Mitchell, 1996). In Zaragoza and Mitchell (1996), participants watched a home burglary video and answered questions, some embedded with misleading suggestions (e.g., the thief wore gloves). These suggestions were given zero, one, or three times in different questions. Repeated suggestion increased false memories for seeing suggested details in the video. The researchers argued that with repetition, images of suggested details become more detailed and similar to the original event memory. Therefore, repetition decreases people’s ability to monitor the misinformation source and, thus, increases the likelihood that people will incorporate the misinformation into their original event memory. Importantly, we found no empirical studies investigating the effect of repeated misinformation for highly negative, potentially traumatic events. Addressing this gap is important because there are additional opportunities for repeated exposure to the same piece of misinformation among trauma victims—for example during medical treatment and therapy—as well as exposure to more ‘typical’ sources via co-witness discussion, police interviews, when reading media reports, and so on. 
If Zaragoza and Mitchell’s (1996) findings extend to highly negative and potentially traumatic memories, repeated misinformation exposure could have deleterious consequences on the accuracy of trauma victims’ testimony and perhaps mental health. Therefore, our third aim was to investigate the effect of repeated misinformation exposure on memory for highly negative events.

In two experiments (Footnote 1), participants watched a trauma analogue (car accident) film with some scenes removed. After 24 h, participants read three “eyewitness” reports describing the film’s events, with some containing descriptions of removed scenes (see Foster et al., 2012). Participants then received a memory test containing Old (i.e., previously seen), Missing (removed), and New (control) clips. To examine whether people more frequently rehearse and incorporate emotional (vs. unemotional) misinformation into their original event memory, Old and Missing clips were divided (based on Strange and Takarangi, 2012) into crux clips (scenes crucial to the film’s meaning, e.g., cars colliding) and non-crux clips (less crucial, more peripheral scenes, e.g., a rescue helicopter arriving). Importantly, pilot testing in Strange and Takarangi (2012) found that crux clips were rated as more traumatic than non-crux clips and participants’ ratings of cruciality were highly correlated with how traumatic they found the clip (r = 0.95, p < 0.01).

After running Experiment 1, we wondered whether participants in the Single and Repeated Misinformation conditions may report Missing clips as “Old” for reasons other than because they genuinely falsely remembered the clips. Indeed, emerging research suggests that multiple mechanisms can lead to false memory reports, only some of which reflect “real” false memories (see Betz, Skowronski, & Ostrom, 1996; Wagner & Skowronski, 2017; Zhu et al., 2012). Although these studies confirm that people genuinely make memory errors, they also suggest some false memory responses are not authentic. For example, rather than truly recollecting false misinformation as having appeared in the original stimulus, participants can believe post-event information reports are completely accurate or use the reports to fill in forgotten details, thus leading them to report false information. Therefore, in Experiment 2, we probed “Old” responses further with a source-monitoring test where participants clarified whether or not they said “Old” to a clip purely because they saw it in the film.

Experiment 2 was pre-registered, and data for both experiments can be accessed at https://osf.io/6y9mt/ (Footnote 2). Because method and results were similar across both experiments, we discuss the experiments together.

Method

Participants

Participants were recruited from Amazon Mechanical Turk. Strange and Takarangi (2015), using the same film stimuli, found that participants exposed to Missing clip descriptions (similar to our Single Misinformation condition) during encoding exhibited increased memory distortion compared to those not exposed to the descriptions (d = 0.45). Given that our misinformation would be repeated and delivered in a separate phase (allowing memory to fade), we expected to find a larger difference in memory distortion between the Repeated Misinformation and No Misinformation conditions than in this prior study. Mitchell and Zaragoza (1996) and Zaragoza and Mitchell (1996) did not provide effect sizes, but Foster et al. (2012) found a large difference between repeated and single misinformation conditions in memory accuracy for misleading claims (d = 0.64). Therefore, we used Foster et al.’s (2012) effect size to power Experiment 1. Our a priori power analysis for a two-tailed, two-group t test (using G*Power) with an alpha of 0.05, power of 0.80, and effect size of 0.64 revealed that the desired sample size for our study was at least 40 per condition (Footnote 3). We ran extra participants at Time 1 in case of attrition. In Experiment 1, we found a basic misinformation effect, that is, a difference between the No Misinformation and Single Misinformation conditions on errors for Missing clips, d = 0.70. To ensure we had enough power to find this misinformation effect again in Experiment 2 (given that we did not find a misinformation effect in our intervening supplementary experiment; see Supplementary Materials for all data from that experiment), we ran an a priori power analysis for a two-tailed, two-group t test with α = 0.05, power = 0.95, and d = 0.70. The analysis revealed that the desired total sample size for our study was at least 165 (55 per condition).
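The sample-size targets in this paragraph can be roughly checked by hand. The sketch below is not the G*Power computation reported in the text; it assumes the common large-sample normal approximation, n per group = 2(z for 1 − α/2 + z for power)² / d², which tends to land about one participant per group below an exact noncentral-t calculation. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-tailed, two-group t test.

    Assumption: normal approximation, not G*Power's exact
    noncentral-t method, so results can be slightly smaller.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = z.inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Inputs reported for Experiment 1: d = 0.64, alpha = 0.05, power = 0.80
exp1_n = n_per_group(0.64, alpha=0.05, power=0.80)
# Inputs reported for Experiment 2: d = 0.70, alpha = 0.05, power = 0.95
exp2_n = n_per_group(0.70, alpha=0.05, power=0.95)
print(exp1_n, exp2_n)
```

Under this approximation both values fall within one participant of the reported targets of 40 and 55 per condition, which is the size of gap one would expect between the approximation and an exact t-based analysis.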

Participant selection procedures included the following in both experiments. First, workers were blocked if they had already participated in one of the experiments described in this paper or in any of our past experiments using this film stimulus. Second, to ensure participants would later understand our post-event information reports, they had to pass an English test at the start of Session 1 by answering at least four out of five fill-in-the-blank multiple-choice questions correctly (see our Supplementary Materials for these questions). If they failed this test, they were not allowed to continue with the experiment. Third, participants’ data were excluded if they: had seen the film before, restarted either session or admitted to watching the clips more than once, left the session to do something else, did not pass all attention checks (see Supplementary Materials for the attention checks; see Berinsky, Margolis, & Sances, 2013; Oppenheimer, Meyvis, & Davidenko, 2009; Paas, Dolnicar, & Karlsson, 2018; Thomas & Clifford, 2017 for reviews on the importance of using screeners such as attention checks to boost statistical power), or completed Session 2 after the 24-h deadline. In Experiment 1, participants were allowed to spend as much time as they needed to read the reports so we could collect timing data and exclude participants who skimmed. Participants in Experiment 1 were excluded if they skimmed the post-event information reports by reading them faster than 318 words per minute (according to their response time clicking “Next” for each report). In a pilot study, reading time data revealed that many participants read the reports unrealistically fast, indicating lack of exposure to the manipulation. 
We found no misinformation effects in our pilot until we excluded those who only skimmed the misinformation, at which point we found that participants who were exposed to misinformation repeatedly falsely remembered more stimuli compared to participants who were not exposed to misinformation at all. According to Trauzettel-Klosinski and Dietz (2012), average reading time for online English text is 228 words per minute, plus or minus 30 words. We, therefore, chose 318 words per minute (i.e., 228 words plus three standard deviations) as our cut-off because it would account for 99.7% of the reading speed data in a normal distribution. For context, participants had to spend at least 100 s reading a 530-word report. It is important to note that Amazon Mechanical Turk workers are more attentive to attention instructions than undergraduate subjects, suggesting that the failure rate could have been even higher if the experiments had been conducted in the laboratory rather than online (Hauser & Schwarz, 2016). In Experiment 2, in an attempt to prevent skimming, we gave participants a minimum period of time to read each report. Specifically, the ‘Next’ button at the end of each report did not appear until participants were given enough time to read the reports at no faster than 318 words per minute.
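The cut-off arithmetic described here is simple enough to write out. The following is a minimal sketch, assuming the "plus or minus 30 words" is treated as one standard deviation, as the text does; the function names and the example timings are hypothetical and are not taken from the authors' survey code.

```python
MEAN_WPM = 228  # average online reading speed cited in the text
SD_WPM = 30     # the "plus or minus 30 words", treated as one SD
CUTOFF_WPM = MEAN_WPM + 3 * SD_WPM  # mean plus three SDs


def min_reading_seconds(word_count: int, cutoff_wpm: float = CUTOFF_WPM) -> float:
    """Shortest reading time in seconds that stays at or below the cut-off speed."""
    return word_count / cutoff_wpm * 60


def skimmed(word_count: int, seconds_on_page: float) -> bool:
    """True if the report was read faster than the cut-off speed."""
    return seconds_on_page < min_reading_seconds(word_count)


print(CUTOFF_WPM)                       # cut-off in words per minute
print(round(min_reading_seconds(530)))  # seconds needed for a 530-word report
print(skimmed(530, 60), skimmed(530, 120))  # hypothetical timings
```

The same minimum time could serve either purpose described in the text: as an exclusion rule applied after the fact (Experiment 1) or as the delay before the ‘Next’ button appears (Experiment 2).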

Four hundred and twelve participants completed Experiment 1. We excluded 263 participants: 15 said they had seen the trauma film before, 13 repeated Session 2 and, therefore, potentially read reports more than once, nine failed to accurately complete instructional manipulation checks, 34 admitted to watching the test clips more than once, seven admitted to leaving the study session, seven completed Session 2 beyond the 36-h deadline (we kept one person who completed it 50 min early), and 178 read at least one report faster than 318 words per minute. Of the 149 participants who met our inclusion criteria, 82 were female; participants were aged 20 to 72 (M = 40.52, SD = 12.59). A one-way ANOVA found no difference in age and a chi-square analysis found no difference in gender between conditions, ps = 0.904–0.930. Participants completed Session 2 M = 29.38 h (SD = 6.70) after Session 1.

Two hundred and two participants completed Experiment 2. We excluded and replaced 37 participants: 10 said that they had seen the trauma film before, seven repeated Session 2 and, therefore, potentially read reports more than once, 11 failed to accurately complete instructional manipulation checks, five admitted to leaving the study session, and four completed Session 2 beyond the 36-h deadline. Of the 165 participants (55 per condition) who met our inclusion criteria, 94 were female; participants were aged 21 to 76 (M = 38.09, SD = 10.72). There were no significant differences in age and gender between conditions, ps = 0.298–0.679. Participants completed Session 2 M = 32.08 h (SD = 9.49) after Session 1.
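As a quick consistency check on the exclusion counts reported for both experiments, the tallies can be summed and subtracted from the number of completers. This is only an illustrative sketch; the dictionary keys are paraphrased labels, and the helper name is hypothetical.

```python
# Exclusion counts as reported in the text for each experiment.
exclusions = {
    "Experiment 1": {
        "completed": 412,
        "reasons": {
            "had seen the film before": 15,
            "repeated the second session": 13,
            "failed instructional manipulation checks": 9,
            "watched test clips more than once": 34,
            "left the study session": 7,
            "missed the Session 2 deadline": 7,
            "read a report faster than 318 wpm": 178,
        },
    },
    "Experiment 2": {
        "completed": 202,
        "reasons": {
            "had seen the film before": 10,
            "repeated the second session": 7,
            "failed instructional manipulation checks": 11,
            "left the study session": 5,
            "missed the Session 2 deadline": 4,
        },
    },
}


def retained(record: dict) -> int:
    """Participants remaining after all listed exclusions."""
    return record["completed"] - sum(record["reasons"].values())


for name, record in exclusions.items():
    print(name, sum(record["reasons"].values()), retained(record))
```

The totals agree with the excluded and retained counts stated in the text for each experiment.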

Materials

Positive and negative affect schedule (PANAS)

Subjects rated how they felt before and after the trauma analogue stimuli (1 = very slightly or not at all; 5 = extremely) on 10 positive (e.g., excited, enthusiastic) and 10 negative (e.g., distressed, upset) mood adjectives. The scales have excellent convergent correlations with other mood measures (0.76–0.92) and correlate with other measures of distress and psychopathology, including the Hopkins Symptom Checklist (Negative Affect subscale: r = 0.65–0.74; Positive Affect subscale: r = −0.29 to −0.19) and Beck Depression Inventory (Negative Affect subscale: r = 0.56–0.58; Positive Affect subscale: r = −0.36 to −0.35; see Watson, Clark, & Tellegen, 1988).

Trauma analogue

The trauma analogue stimulus was the same film used in Strange and Takarangi’s (2012, 2015; see also Segovia, Strange & Takarangi, 2016) trauma analogue paradigm. The film was a United Kingdom public service announcement warning against texting while driving. Briefly, a teenage driver, while looking at her phone, collides with another vehicle head-on. Another car then crashes into them. Emergency services deal with the situation while the driver screams in distress at her injuries and upon noticing the dead passengers. The injuries and fatalities are graphically depicted (e.g., passenger’s neck snaps, dead baby). Participants in Strange and Takarangi (2012) rated crux clips as not pleasant and moderately traumatic, indicating that the film is an appropriate negative, traumatic analogue. The film was cut into clips (separated by 2 s of blank screen) with six clips removed before encoding.

Post-event information

Three research assistants acting as mock witnesses each wrote a report describing the events depicted in the film. The research assistants watched the film cut into 28 clips (see Strange & Takarangi, 2012), identifying chunks that depicted a discrete event within the larger event. They were informed that the report should be written in past tense and sound continuous to participants. Therefore, each mock witness described the same details in the same order (as in the film) in their own words. Sentences describing removed scenes were deleted to create the accurate reports. We edited these reports to remove repetition (e.g., “…the driver looked at her friend in the passenger seat. The two girls in the front were still conscious, they looked at each other” describe the same moment in the film, so the second sentence was removed) and shortened them where possible without removing key information (e.g., “An ambulance approached the scene of the crash” was changed to “An ambulance arrived”; please see our Supplementary Materials for these reports). The accurate versions (word count range 420–435) only described what participants had seen in the film. The misinformation versions (word count range 530–544) described what participants had seen in the film but also the removed scenes they had not seen.

We counterbalanced and randomized reports so that participants read one report from each mock witness. In Experiment 1, each report was ‘chunked’, that is, presented 2–5 sentences at a time. Participants clicked ‘Next’ to move between sections. We chose this method to make reading easier and skimming the text (and thus not encoding the reports) harder. In Experiment 2, instead of ‘chunking’ the reports, we attempted to prevent skimming by ensuring that the ‘Next’ button did not appear until participants were given enough time to read each report at no faster than 318 words per minute.

Recognition test

The memory test consisted of six Old (previously seen), six New (never seen, control), and six Missing (removed) clips. Three of the Missing clips and three of the Old clips were cruxes and three of the Missing and three of the Old clips were non-cruxes. Crux clips depicted scenes crucial to the film’s overall story (e.g., cars colliding) while non-crux clips depicted less crucial scenes (e.g., a rescue helicopter arriving). A pilot study found that crux clips were more traumatic than non-crux clips (see Strange & Takarangi, 2012). The New clips were from online sources and depicted different car accidents and their aftermath and, therefore, were not split into crux/non-crux categories. We added the New clips simply to ensure participants were paying attention during the test. As such, these clips were analyzed separately from Old and Missing clips. All clips were approximately equal in length (Old: M = 8.65 s, SD = 2.15 s; Missing: M = 7.27 s, SD = 2.74 s; New: M = 8.15 s, SD = 2.16 s). In both Experiments 1 and 2, participants were asked if each clip was Old (“if it appeared in the film you watched during Session 1 yesterday”) or New and rated their confidence in their decision (1 = not at all, 5 = extremely confident; see Supplementary Materials for confidence ratings data).

Note that clips were not counterbalanced (as Old or Missing). When constructing the test, Strange and Takarangi (2012) ensured that Old and Missing clips were equally memorable (pilot data available on request) and were not consecutive or the first or last clips in the film (to avoid primacy and recency effects). The clips also could not have received a memorability rating in pilot testing at the anchor points of the scale. The short length of the film prevented the researchers from creating two sets of Old and Missing clips that satisfied these rules. We randomized test items to control for order effects.

Analogue trauma symptoms

The 15-item Impact of Event Scale (IES; Horowitz, Wilner, & Alvarez, 1979) has two subscales: Intrusions and Avoidance. Participants rated items (e.g., “I thought about it when I didn’t mean to”) on 4-point scales (0 = not at all, 5 = often) in relation to the film. The scale is internally consistent (Intrusions: M αs = 0.72–0.92; Avoidance: M αs = 0.65–0.90) and has strong validity (see Sundin & Horowitz, 2002). In Experiment 1, participants completed the IES at the end of Session 2. However, we later realized that the impact of seeing the film more than 24 h earlier may have faded by that point, making it harder to determine whether participants found the film distressing or traumatic. Therefore, in Experiment 2, we also asked participants to complete the IES at the end of Session 1 (Footnote 4).

Authenticity questions

In Experiment 2, we measured the authenticity of participants’ memory errors in two ways. First, after completing the recognition test, participants completed a source-monitoring test where they selected the reason they said Old for each applicable clip (adapted from Wagner & Skowronski, 2017; Zhu et al., 2012): (1) It appeared in the film I watched yesterday, (2) I read it in the eyewitness report(s) and that was only memory I had, (3) I read it in the eyewitness report(s) and I trust the report(s), (4) I read it in the eyewitness report(s) and I didn’t want to contradict the report(s), (5) It appeared in the film I watched yesterday and in the eyewitness report(s), and (6) I guessed. Second, at the end of the survey, we asked participants whether they believed the eyewitness accounts were accurate.

Design and procedure

See Fig. 1 for an outline of the study design for these experiments. Both experiments were a 3 (Misinformation Condition: No Misinformation, Single Misinformation, Repeated Misinformation; between subjects) × 3 (Clip Type: New, Old, Missing; within subjects) design. Old and Missing clips were analyzed split into crux and non-crux clips (Crux Old, Non-Crux Old, Crux Missing, Non-Crux Missing). Therefore, our main analyses were 3 (Misinformation Condition: No Misinformation, Single Misinformation, Repeated Misinformation; between subjects) × 2 (Clip Type: Old, Missing; within subjects) × 2 (Crux Type: crux, non-crux; within subjects) with New clips analyzed separately. Our measure of memory distortion was the proportion of Missing Clips falsely identified as “Old”. In Experiment 2, participants also completed a source monitoring test to allow us to determine if their “Old” responses were authentic (see Supplementary Materials for responses on this test). We analyzed responses for Old and Missing clips again with inauthentic responses excluded (similar to Betz et al., 1996; Zhu et al., 2012). Given the limited number of clips, we did not run signal detection analyses.
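The memory distortion measure defined here (the proportion of Missing clips called “Old”, split by crux type) can be sketched in a few lines. This is an illustration only: the data layout, function name, and the example responses are invented for the sketch and are not the authors' data or analysis code.

```python
from collections import defaultdict


def old_proportions(responses):
    """Proportion of "Old" responses per (clip_type, crux_type) cell.

    responses: iterable of (clip_type, crux_type, said_old) tuples
    for one participant (a hypothetical layout).
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_old, n_total]
    for clip_type, crux_type, said_old in responses:
        counts[(clip_type, crux_type)][0] += int(said_old)
        counts[(clip_type, crux_type)][1] += 1
    return {cell: n_old / n_total for cell, (n_old, n_total) in counts.items()}


# Invented responses for one hypothetical participant:
# three clips in each Old/Missing x crux/non-crux cell, as in the test.
example = (
    [("Missing", "crux", True)] * 2 + [("Missing", "crux", False)]
    + [("Missing", "non-crux", True)] + [("Missing", "non-crux", False)] * 2
    + [("Old", "crux", True)] * 3
    + [("Old", "non-crux", True)] * 3
)

props = old_proportions(example)
for cell, proportion in sorted(props.items()):
    print(cell, round(proportion, 2))
```

For Missing clips, the proportion is the distortion score (false “Old” responses); for Old clips, the same proportion reflects correct recognition. New clips would be scored the same way but analyzed separately, as the design describes.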

Fig. 1 Outline of the study design

Participants first provided informed consent; we warned them that participation would involve viewing a potentially distressing film depicting a road traffic accident, and that they could withdraw their participation at any time without penalty. Our cover story was that we were working with a government agency to determine whether graphic material should be used as part of a new Drivers Education campaign. Participants were told that they would be asked some questions about the film and their responses to it; they were not told about the memory test. There were three phases across two separate sessions. Session 1 included the encoding phase, during which participants completed the PANAS, watched the film, answered a question about whether they had seen the film before, and then completed the PANAS again (as well as the IES in Experiment 2). Session 2, emailed out 24 h later with a 24-h completion deadline, included the post-event information phase and the memory test phase. During the post-event information phase, participants were told they were going to read three eyewitness reports about the film they saw the previous day. They were told that reading the reports was expected to take around 7–10 min and that it was critical that they read each of these reports very carefully. To manipulate misinformation repetition, the No Misinformation condition read three accurate reports, the Single Misinformation condition read two accurate reports and one misinformation report, and the Repeated Misinformation condition read three misinformation reports, all containing descriptions of the removed scenes. We also tweaked our programming to make it impossible for participants to watch the film or the clips more than once. All participants then completed a 5-min filler task (mazes), the PANAS (to ensure there were no mood differences between conditions that may affect performance on the memory test), the memory test, and the IES, in that order. 
In Experiment 1, participants were asked if they watched the clips more than once, and in both experiments they were asked if they left the task at any point. Participants were fully debriefed and paid for their participation.
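The repetition manipulation described in the procedure can be illustrated with a small assignment routine. This is a sketch under stated assumptions: the witness labels are hypothetical, and the code simply randomizes both the report order and which witness supplies the misinformation version, whereas the text says reports were counterbalanced as well as randomized.

```python
import random

# Number of misinformation reports read in each condition (from the text).
N_MISINFO_REPORTS = {
    "No Misinformation": 0,
    "Single Misinformation": 1,
    "Repeated Misinformation": 3,
}
WITNESSES = ("A", "B", "C")  # hypothetical labels for the three mock witnesses


def assign_reports(condition: str, rng: random.Random) -> list:
    """Return an ordered list of (witness, version) pairs for one participant.

    Each participant reads exactly one report from each mock witness.
    Assumption: simple randomization stands in for counterbalancing.
    """
    order = list(WITNESSES)
    rng.shuffle(order)  # randomize presentation order
    misinfo = set(rng.sample(order, N_MISINFO_REPORTS[condition]))
    return [(w, "misinformation" if w in misinfo else "accurate") for w in order]


rng = random.Random(2024)  # fixed seed so the illustration is repeatable
for condition in N_MISINFO_REPORTS:
    print(condition, assign_reports(condition, rng))
```

Whatever the random draw, each condition receives the same three witnesses and differs only in how many of the three reports describe the removed scenes.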

Results

Emotional impact of trauma analogue film

We first ran a 2 (Mood: positive mood, negative mood) × 2 (Time: before film, after film) × 3 (Condition: No Misinformation, Single Misinformation, Repeated Misinformation) repeated measures analysis of variance to ensure our film acted as a highly negative stimulus (Table 1). There was a Mood × Time interaction in both experiments (Experiment 1: F(1, 146) = 217.27, p < 0.001, ηp² = 0.598; Experiment 2: F(1, 162) = 260.01, p < 0.001, ηp² = 0.616): participants reported a decrease in positive mood and an increase in negative mood from before to after watching the trauma analogue film, pairwise comparisons ps < 0.001 (Footnote 5). Pairwise comparisons showed that ratings were higher for positive than negative mood at both times (ps < 0.001), supported by a main effect of Mood, Experiment 1: F(1, 146) = 417.93, p < 0.001, ηp² = 0.741; Experiment 2: F(1, 162) = 283.26, p < 0.001, ηp² = 0.636 (Footnote 6). We also found a main effect of Time, with ratings higher after the film than before, likely because of the increase in negative mood, Experiment 1: F(1, 146) = 12.66, p = 0.001, ηp² = 0.080; Experiment 2: F(1, 162) = 21.11, p < 0.001, ηp² = 0.115. Therefore, our film stimulus was a successful analogue for a highly negative event, leading to increased negative mood and decreased positive mood.
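For readers who want to check the reported effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom using the standard identity: partial eta squared = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to two of the values reported above; the function name is hypothetical, and this is a reader's check, not the authors' analysis code.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from F and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Mood x Time interaction, Experiment 1: F(1, 146) = 217.27
exp1 = partial_eta_squared(217.27, 1, 146)
# Mood x Time interaction, Experiment 2: F(1, 162) = 260.01
exp2 = partial_eta_squared(260.01, 1, 162)
print(round(exp1, 3), round(exp2, 3))
```

Both results round to the values reported in the text (0.598 and 0.616).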

Table 1 Means (standard deviations in brackets) for PANAS and IES scores

Next, we examined whether the film stimulus led participants to experience trauma symptoms. In Session 2, participants reported having experienced some analogue avoidance and intrusion symptoms in relation to the film within the previous 24 h (see Table 1). Horowitz (1982; see Joseph, 2000) suggested thresholds for symptom levels corresponding to levels of clinical concern using the IES total score, with scores between 8.6 and 19 (as we found) indicating medium clinical concern. Further, our participants reported equal or greater distress than populations who had experienced real-life highly negative stressors or trauma (e.g., survivors of an avalanche, some survivors of childhood sexual abuse, firefighters who had experienced stresses such as corpse handling, and freshman medical students confronting cadaver dissection for the first time; Bryant & Harvey, 1996; Elliott & Briere, 1995; Horowitz, Wilner, & Alvarez, 1979; Johnsen, Eid, Lovstad, & Michelsen, 1997). Therefore, our film likely acted as an appropriately negative, potentially traumatic stimulus, similar to a real-life negative event.

In Experiment 1, there was no difference for total or subscale IES scores in Session 2 between conditions (ps = 0.167–0.721), suggesting that reading misinformation did not affect reported symptoms. This finding is perhaps unsurprising because all participants had to read highly negative reports, with participants in the Single and Repeated Misinformation conditions reading only a few more negative (crux-related) sentences. In Experiment 2, there was no difference in IES scores at Session 1 (ps = 0.063–0.577). However, there was a significant difference for total IES scores at Session 2, F(2, 164) = 3.11, p = 0.047, ηp² = 0.037 (but not for subscale scores: ps = 0.061–0.153), with participants in the Repeated Misinformation condition reporting more overall distress than those in the No Misinformation condition. This finding needs to be interpreted with caution given that there were no significant differences in the underlying subscales. After reading all three reports and before taking the memory test at Session 2, there were no significant differences in positive or negative mood between conditions, Experiment 1: ps = 0.203–0.304; Experiment 2: ps = 0.182–0.187. Therefore, overall, there were few to no differences in trauma symptoms between conditions.

Memory distortion

We first analyzed New clips separately from Old and Missing clips. For both experiments, we found, using one-way ANOVAs, that participants were highly successful at recognizing that New clips were not part of the trauma film, suggesting that they were paying attention during test and remembered the general characteristics of the trauma analogue film (Table 2). There were no differences between conditions, Experiment 1: F(2, 146) = 0.29, p = 0.749, η² = 0.005; Experiment 2: F(2, 162) = 1.86, p = 0.159, η² = 0.022.

Table 2 Proportion of “Old” Responses for each clip type

We next examined memory distortion for Old and Missing clips divided into crux vs. non-crux Old and Missing clips with a 2 (Clip Type: Old, Missing) × 2 (Crux Type: crux clip, non-crux clip) × 3 (Condition: No Misinformation, Single Misinformation, Repeated Misinformation) repeated measures ANOVA for each experiment. First, we note that both experiments found Clip Type main effects, with participants responding “Old” to more Old clips than Missing clips, Experiment 1: F(1, 146) = 337.28, p < 0.001, ηp² = 0.698; Experiment 2: F(1, 162) = 353.80, p < 0.001, ηp² = 0.686. Even with a 24-h delay period, participants almost always remembered Old clips, further suggesting that participants remembered what they saw in the trauma analogue film well.

Moving on to our main research questions, our first aim was to examine whether negative, potentially traumatic memories distort, especially after misinformation exposure. In both experiments, we found that, even without misinformation, participants falsely remembered approximately 35% of Missing clips across both crux and non-crux clips. Therefore, our findings fit with previous work (e.g., Strange & Takarangi, 2012) showing that memories for our trauma analogue are malleable, even in the absence of external suggestion. In Experiment 1, our repeated measures ANOVA found a Clip Type × Condition interaction (F(2, 146) = 4.54, p = 0.012, ηp² = 0.059): although there were no significant differences between conditions for Old clips (pairwise comparisons ps = 0.224–1.000), participants in the Single (p = 0.004) and Repeated Misinformation (p < 0.001) conditions made significantly more errors for Missing clips compared to participants in the No Misinformation condition. This interaction was supported by a main effect of Condition, F(2, 146) = 8.42, p < 0.001, ηp² = 0.103. In Experiment 2, there was no Clip Type × Condition interaction (F(2, 162) = 1.35, p = 0.262, ηp² = 0.016), but there was a main effect of Condition (F(2, 162) = 6.78, p = 0.001, ηp² = 0.077). Specifically, participants in the Repeated Misinformation condition responded “Old” more than participants in the No Misinformation condition (p = 0.001); there were no other pairwise differences, ps = 0.085–0.451. Therefore, in both experiments, we found evidence that misinformation increases memory distortion for our trauma analogue.

Our second aim was to examine whether participants were more likely to encode emotional misinformation (i.e., descriptions of the crux clips in the reports) compared to less emotional misinformation (describing non-crux clips), thus leading to increased false memories for emotional scenes from the trauma film. Memory distortion was increased for emotional (vs. unemotional) aspects of the film. Our repeated measures ANOVAs found main effects of Crux Type in both experiments, with participants responding “Old” to more crux (vs. non-crux) clips, Experiment 1: F(1, 146) = 28.71, p < 0.001, ηp² = 0.164; Experiment 2: F(1, 162) = 29.93, p < 0.001, ηp² = 0.156. Therefore, participants were more likely to recollect emotional aspects of the film. However, the Single and Repeated Misinformation conditions did not make more errors on crux clips than the No Misinformation condition (i.e., there was no Crux Type × Condition interaction; Experiment 1: F(2, 146) = 0.05, p = 0.951, ηp² = 0.001; Experiment 2: F(2, 162) = 0.36, p = 0.701, ηp² = 0.004), suggesting they did not specifically incorporate more emotional misinformation into their original event memory compared to misinformation about non-emotional aspects.

Our third aim was to investigate the effect of repeated misinformation exposure on memory for highly negative events. We found no significant differences for Missing clips between Single and Repeated Misinformation conditions (Experiment 1 pairwise comparison p = 1.000; Experiment 2 pairwise comparison not examined due to the lack of interaction). In other words, exposure to repeated misinformation within our paradigm did not lead to increased memory distortion compared to single exposure. Despite it being non-significant, however, our raw means indicate that participants in the Repeated Misinformation conditions made more errors than participants in the Single Misinformation conditions, suggesting that we need to interpret the null finding with some caution.

Authentic memory distortion

In Experiment 2, we measured the authenticity of memory errors with a source-monitoring test where participants selected the reason they said “Old” for each applicable clip. We analyzed our memory distortion data using only those Old responses classified as “authentic”, that is, when participants clicked “it appeared in the film I watched yesterday” when asked why they responded “Old” to clips (see Footnote 7). A 2 (Clip Type: Old, Missing) × 2 (Crux Type: crux, non-crux) × 3 (Condition: No Misinformation, Single Misinformation, Repeated Misinformation) repeated measures ANOVA found main effects of Clip Type, F(1, 162) = 173.29, p < 0.001, ηp² = 0.517, and Crux Type, F(1, 162) = 7.56, p = 0.007, ηp² = 0.045, but no Clip Type × Condition interaction or main effect of Condition, ps = 0.261–0.474. Therefore, once inauthentic false memory responses were removed, we no longer found any misinformation effects. However, this finding needs to be interpreted with caution because removing responses from an already limited number of clips may inherently make it more difficult to find an effect. We also asked participants whether they believed the eyewitness accounts were accurate. The majority (88.5%) of participants believed that the reports accurately described the trauma analogue film. A chi-square analysis showed no difference between conditions, p = 0.386.
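The “authentic” classification described above can be illustrated with a short sketch. It is a hedged illustration rather than the authors' scoring procedure: the data layout, the function name `authentic_old_rate`, the placeholder alternative reason, and the choice of denominator (all test clips of a given type) are assumptions, and the responses are toy values, not study data.

```python
# The only response option quoted in the text; other options are not specified there.
AUTHENTIC_REASON = "it appeared in the film I watched yesterday"


def authentic_old_rate(responses):
    """Proportion of clips given an "Old" response for the authentic reason.

    `responses` is a list of (recognition, reason) tuples, one per test clip,
    where `reason` is None whenever the recognition response was "New".
    """
    if not responses:
        raise ValueError("responses must not be empty")
    authentic = sum(
        1
        for recognition, reason in responses
        if recognition == "Old" and reason == AUTHENTIC_REASON
    )
    return authentic / len(responses)


# Toy responses to four Missing clips from one hypothetical participant.
toy_missing_clips = [
    ("Old", AUTHENTIC_REASON),              # counted as an authentic error
    ("Old", "placeholder for any other reason"),  # excluded as inauthentic
    ("New", None),
    ("New", None),
]

print(authentic_old_rate(toy_missing_clips))  # 0.25
```

Under this scoring, an “Old” response to a Missing clip counts as a memory error only when the participant attributes it to the film itself, which is why the authentic error rate can only be equal to or lower than the unfiltered rate.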

Discussion

Overall, our results add to the growing body of literature that has found that memories for potentially traumatic experiences are malleable and prone to distortion like other, more mundane memories. These findings provide support for Rubin et al.’s (2008) memory-based model of PTSD, which suggests that memory of the trauma event, rather than objective trauma exposure, predicts the development of PTSD symptoms. Because memory is reconstructive and influenced by factors such as current goals, attitudes, concerns, and emotions, trauma memories will change and distort. Further, consistent with previous trauma analogue research (e.g., Monds et al., 2013), participants in both experiments falsely remembered misinformation as being included in the trauma analogue. Source monitoring errors may be a key mechanism underlying this memory distortion (Lindsay, 2008). We expect that participants likely imagined the misinformation provided in reports and as these images became more detailed and sensory, those details became more familiar and similar to memories of the original trauma film. At test, participants may have used simple mental shortcuts such as familiarity of the clip’s content to determine whether the clips were shown at encoding, leading to their inaccurate Old responses. Put differently, participants failed to monitor the source of the clip and their memory expanded to include the misinformation.

We also consistently found that participants reported more crux clips as “Old” than non-crux clips. Therefore, our results support the proposition that emotional details (in this case, crux clips) can be enhanced in memory compared to less emotional details (non-crux clips; see Rubin et al., 2008). Importantly, other features of crux clips (e.g., how much they stood out in memory; see Strange & Takarangi, 2012) may also have contributed to our finding. Given that the film depicts a highly negative, potentially traumatic event, it is unsurprising and likely unavoidable that crux clips consisted of these other features along with emotionality. We did not find any differences in memory for crux clips between conditions, however. In other words, our finding that emotional aspects of the film were enhanced in memory may not have occurred because participants in the misinformation conditions remembered more emotional misinformation items. Indeed, all participants may have rehearsed the film during the delay period regardless of misinformation exposure. Participants may have recognized that there were gaps in the films and mentally generated content to fill in those gaps during the rehearsal. Because people tend to rehearse emotional elements of an event, it is plausible that they would also generate emotional material, similar to the crux clips, during the rehearsal process. At test, the crux clips would have likely felt familiar to participants, leading them to falsely remember more of those clips as coming from the original film compared to non-crux clips (Johnson et al., 1993; Lindsay, 2008).

Importantly, we found that repeated misinformation exposure did not lead to more false memories for Missing clips compared to single misinformation exposure in our paradigm. How, then, do we reconcile our data with previous findings showing that repeated misinformation exposure enhances memory distortion? Foster et al. (2012), Zaragoza and Mitchell (1996), and both of our experiments exposed participants to written verbal misinformation items. However, Foster et al.’s (2012) and Zaragoza and Mitchell’s (1996) tests comprised written verbal items and audio verbal items, respectively, while our test items were film clips. Past research has shown that people may falsely recognize more new items when those items are conceptually or perceptually similar to studied items (e.g., Koutstaal & Schacter, 1997; Schacter, Verfaellie, & Anes, 1997). Therefore, in our experiments, there may have been little perceptual overlap between our written misinformation and film test items, limiting the number of false memories that could be created even with repeated misinformation.

However, while there was little perceptual similarity between our misinformation and test items, the conceptual features of the film and subsequent reports overlapped considerably. Indeed, the film and reports are nearly identical because both sources of information are about the same event. It is possible, therefore, that reading and imagining one misinformation report increased distortion to the point where it could not increase further even with exposure to more reports containing the same misinformation. In other words, memory distortion may have reached ceiling after a single report. However, the fact that there was no difference in memory distortion between the No Misinformation and Single Misinformation conditions in Experiment 2 casts doubt on this explanation. Alternatively, there may be an even simpler explanation for our findings based on raw means: the number of errors for Missing clips made by the Single Misinformation condition fell between the numbers made by the No Misinformation and Repeated Misinformation conditions. That is, the Single Misinformation condition made more errors than the No Misinformation condition but fewer than the Repeated Misinformation condition. Therefore, the repeated vs. single misinformation effect may exist but be very small, as suggested by our effect sizes.

It is important to note, however, that our results from Experiment 2 (when only looking at authentic errors) suggest that misinformation may have had no effect on memory for Missing clips in our paradigm. Across conditions, participants had an authentic memory error rate of 19–30% for Missing clips. This rate is similar to the one found by Strange and Takarangi (2012; 26%), whose paradigm we adapted, suggesting that our data may reflect a bias to respond “Old” for emotional stimuli (Dougal & Rotello, 2007). For example, all participants may have been biased to say that any clip that fit the gist or emotional tone of the film was “Old”. Indeed, this possibility may explain why we consistently found that participants, across conditions, were more likely to falsely remember crux (emotional, more traumatic) clips compared to non-crux clips. We also acknowledge that if new clips seemed very different from Missing and Old clips at test, participants may have been biased to rate Missing clips as “Old” because the content would be similar to Old clips by comparison. Again, examining different test formats could be a way to investigate this possibility in future. Regardless of the explanation, our results suggest researchers need to be aware that some false memory responses may not reflect authentic false memories. Probing the authenticity of false memory reports should be considered when designing future studies, to avoid potentially exaggerated effect sizes for false memories.

Of course, our study has limitations. The film likely did not replicate the stress and emotionality experienced during real-life trauma, thus our results may not generalize to real-world scenarios. However, analogue trauma film paradigms do elicit analogue PTSD symptoms (James et al., 2016). Furthermore, repeated or extreme exposure to aversive details of a traumatic event through electronic media, television, movies or pictures can be a Criterion A stressor if it is work-related (see DSM-5; American Psychiatric Association, 2013). Therefore, it is important to investigate the effect of film content itself on trauma memory and symptomology. We also found it difficult to prevent participants from skimming the post-event information reports. Future studies could investigate other methods of presenting the reports (e.g., via audio clips or physical copies) to see if alternative formats lead to increased encoding of the misinformation items. However, we acknowledge that other methods of report presentation may not necessarily increase attention to the reports (e.g., participants could simply remove their headphones or “zone out”). Indeed, in a laboratory pilot study where participants read physical copies of the reports, we found no difference between conditions, with errors for Missing clips remaining around M = 0.36–0.40, suggesting that participants may have skimmed and not been exposed to the misinformation items. Other explanations are also very plausible, however (e.g., participants read the physical reports more carefully and recognized that some items were false) and should be investigated in future. We also do not believe that presenting the reports at a fixed encoding rate would prevent skimming, given that online participants could just as easily “zone out” or surf the internet in another tab and return a few minutes later once all chunks had automatically skipped through.

In summary, our findings have important theoretical implications. Our data suggest that, statistically, repeated misinformation exposure does not result in significantly more memory errors compared to single misinformation exposure in our analogue paradigm. Across conditions, the emotional and more traumatic elements of the stimulus produced more memory distortion compared to the unemotional elements. But misinformation did not lead to more PTSD symptomology. Although these results seem encouraging, it is critical to note that any degree of misinformation exposure led to a 36–59% memory error rate. Thus, our findings have implications for victim and eyewitness accuracy. If people are exposed to misinformation, it appears likely that their memory will be distorted.