Exploring the Single-Trial-Type-Dominance-Effect in the IRAP: Developing a Differential Arbitrarily Applicable Relational Responding Effects (DAARRE) Model

Finn, Martin; Barnes-Holmes, Dermot; McEnteggart, Ciara

doi:10.1007/s40732-017-0262-z

Original Article
Published: 21 December 2017

Volume 68, pages 11–25, (2018)

The Psychological Record

Abstract

The Implicit Relational Assessment Procedure (IRAP) has been used as a measure of implicit cognition and has been used to analyze the dynamics of arbitrarily applicable relational responding. The current study uses the IRAP for the latter purpose. Specifically, the current research focuses on a pattern of responding observed in a previously published IRAP study that was difficult to explain using existing conceptual analyses. The pattern is referred to as the single-trial-type dominance effect because one of the IRAP trial types produces an effect that is significantly larger than that of the other three. Based on a post hoc explanation provided in a previously published article, the first experiment in the current series explored the impact of prior experimental experience on the single-trial-type dominance effect. The results indicated that the effect was larger for participants who reported high levels of experimental experience (M = 32.3 previous experiments) versus those who did not (M = 2.5 previous experiments). In the second experiment, participants were required to read out loud the stimuli presented on each trial and the response option they chose. The effect of experimental experience was absent, but the single-trial-type dominance effect remained. In the third experiment, a different set of stimuli than those used in the first two experiments was used in the IRAP, and a significant single-trial-type dominance effect was no longer observed. The results obtained from the three experiments led inductively to the development of a new model of the variables involved in producing IRAP effects—the differential arbitrarily applicable relational responding effects (DAARRE) model—which is presented in the General Discussion.

The study of derived stimulus relations has been used by many behavior analysts as a conceptual basis for analyzing behaviors that appear to be closely related to human language and cognition. Perhaps the clearest and most self-conscious example of this approach is provided by relational frame theory (RFT; Hayes, Barnes-Holmes, & Roche, 2001). Drawing on the seminal work of Sidman (1971; see Sidman, 1994, for a book-length treatment) on equivalence relations, RFT argues that the functional units of human language and cognition involve a wide range of generalized relational operants, known as relational frames, each possessing three core properties. The first property is mutual entailment and involves a bidirectional relation between two stimuli, such that if A is related to B, then B is related to A. The second property is combinatorial entailment and involves three or more stimuli, such that if A is related to B and B is related to C, then A is related to C and C is related to A. The third property is the transformation of functions, which recognizes that any mutual or combinatorial entailment will involve specific behavioral functions. Thus, if A is related to B and B acquires a mildly appetitive function, the function of A may be transformed based on the type of relation between A and B. For example, if you are told that “A is better than B,” then the appetitive function acquired by A may be larger than the appetitive function that was initially acquired by B.

According to RFT, relational frames are always under two types of contextual control. One type of contextual control is denoted as Crel, which refers to any contextual cues that determine the specific mutual and combinatorial entailed relations. In natural language, these may be words or phrases such as bigger than, smaller than, same, opposite, different, and so on. The other type of contextual control is Cfunc, which refers to any contextual cues that select the behavioral functions that are transformed in accordance with an entailed relation. For example, if two stimuli (A and B) are entailed in a “same” relation, then a Cfunc determines the specific behavioral function that “expresses” the entailed relation. If A is an actual chocolate brownie and B is the phrase chocolate brownie, then different Cfuncs will evoke different responses, although the entailed relation remains the same. For example, the Cfunc “tastes like” and the Cfunc “looks like” will evoke the gustatory and visual properties of chocolate brownies, respectively.

According to RFT, many of the functions of stimuli that we encounter in the natural environment may appear to be relatively basic or simple but have acquired those properties due, at least in part, to a history of relational framing. Even a simple tendency to orient more strongly toward one stimulus rather than another in your visual field may be based on relational framing. Identifying the name of your hometown or city from a random list of place names may occur more quickly or strongly because it coordinates with other stimuli that control strong orienting functions (e.g., the many highly familiar stimuli that constitute your hometown). Such functions may be defined as Cfunc properties because they are examples of specific stimulus functions (i.e., orienting) that are acquired based on—but are separate from—the entailed relations among the relevant stimuli. The reader should note that the use of the term orienting (as an example of a Cfunc property) is one that we will use frequently later in this article.

Since the publication of the Hayes et al. (2001) volume on RFT, a large number of studies have explored and tested the basic account (see Dymond & Roche, 2013; Hughes & Barnes-Holmes, 2016a, b, for recent reviews), and the evidence thus far has been largely supportive. In more recent years, some researchers have shifted their attention from testing the basic RFT model to analyzing the relative strength of relational responding as defined by RFT (see Barnes-Holmes, Finn, McEnteggart, & Barnes-Holmes, 2017). The development of the Implicit Relational Assessment Procedure (IRAP), which grew directly out of the theory, was instrumental in pursuing this line of research (see Hughes & Barnes-Holmes, 2011). The IRAP is a computer-based task that presents label stimuli at the top of the screen, such as pictures of spiders and puppies, and target stimuli that appear in the middle of the screen, such as Scares me, Frightens me, I like it, and Makes me calm. Which labels and targets appear on screen at any point in the task is quasirandom, with the constraint that the two types of labels will appear with the two types of targets an equal number of times within a predetermined block of trials. The labels and targets thus form a 2 × 2 crossover design that yields four different trial types on the IRAP (in the current example, spider–negative, spider–positive, puppy–negative, and puppy–positive). On each trial, two response options are provided that require participants to confirm or deny specific relationships between the label and target stimuli. For example, a spider picture and Scares me might appear on a given trial with the response options yes and no; in this case, participants would be required to confirm (i.e., pick yes) or deny (i.e., pick no) that spiders scare them.

The IRAP operates by requiring opposing patterns of responding across successive blocks. For example, a spider picture and Scares me would require the response Yes on one block and No on the next block. The IRAP operates on the assumption that, all things being equal, history-consistent response patterns will be emitted more readily than history-inconsistent patterns; this basic assumption was formalized in the context of the relational elaboration and coherence (REC) model, the details of which have been presented in numerous other articles (see Barnes-Holmes, Barnes-Holmes, Stewart, & Boles, 2010a, for a detailed treatment of the REC model). According to the REC model, an individual with a history of relatively high fear responses toward spiders is more likely to emit patterns of responding on an IRAP indicative of this history. Broadly speaking, the IRAP is scored by subtracting the mean response latency for one pattern of responding from the mean response latency of the opposite pattern. Any resultant difference is deemed to be reflective of the differential behavioral history involved in the two patterns of responding. In this case, the difference score should indicate that the spider-fearful individual responded more quickly when asked to confirm (respond Yes) rather than disconfirm (respond No) that the pictures of spiders are scary.

In concluding that an IRAP performance is indicative of specific histories of relational responding, it might be assumed that each of the four trial types of the IRAP would be equally sensitive to these histories. However, growing evidence suggests that this is not the case. Perhaps one of the clearest examples of the way in which the four trial types of the IRAP may not be equally sensitive to pre-experimental history was provided in a series of experiments reported by Finn, Barnes-Holmes, Hussey, and Graddy (2016). Across three experiments, they used what we will call here a shapes and colors IRAP. Across trials, the two label stimuli were color and shape and the target words were red, green, blue, square, circle, and triangle. As such, the IRAP involved presenting four different trial types that could be designated as (a) color–color, (b) color–shape, (c) shape–color, and (d) shape–shape. The experiments they reported focused on the impact of specific types of instructions on the individual trial-type effects, the details of which are not critical here. The important result was a persistent finding, which we will refer to as the single-trial-type dominance effect for the color–color trial type. That is, even when all four trial-type effects were history-consistent, the size of the color–color trial-type effect was significantly larger than for the shape–shape trial type. The puzzling issue about this difference is that these two trial types share the same response option within blocks of trials (i.e., true during history-consistent blocks and false during history-inconsistent blocks), and thus the difference between the trial types cannot be explained by a simple tendency to respond true more readily than false, which is common in natural language (Dodds et al., 2015).

In speculating about the single-trial-type dominance effect, Finn et al. (2016) suggested that perhaps prior experimental experience with the IRAP might play a role in generating the effect. Since the Finn et al. data were collected, our research team has moved to a different university that runs a large research participant pool populated with volunteers who vary considerably in the number of experiments they have completed previously using a range of latency-based measures. Access to this participant pool afforded us the opportunity to explore the potential impact of prior experimental experience on the single-trial-type dominance effect. On the grounds of intellectual honesty, we have presented the studies herein as they were conducted and contextualized them with a narrative that reflects our thinking as we moved from one experiment to the next. Although the research may initially appear to be somewhat trivial, it served to generate a new conceptual model of the key variables that appear to be involved in generating IRAP performances. Again, on the grounds of intellectual honesty, this new model is presented toward the end of the article rather than at the beginning (i.e., because it emerged inductively from the experimental work).

Before proceeding, it is important to emphasize that research using the IRAP may be separated into two categories. One category contains studies in which the IRAP has been used largely as a type of psychometric instrument (e.g., to predict some criterion variable such as treatment outcome; e.g., Carpenter, Martinez, Vadhan, Barnes-Holmes, & Nunes, 2012). The other category contains studies that have used the IRAP as an experimental context for exploring the dynamics of relational framing (e.g., Finn et al., 2016; Maloney & Barnes-Holmes, 2016). The current study is very much in the latter category. Although it is important for the reader to bear this distinction in mind, developing an improved understanding of the dynamics of behavior that are typically produced by the IRAP may be beneficial in using it as a psychometric instrument in future research.

Experiment 1

The first study aimed to test the extent to which different levels of experimental experience affected the relative sizes of the four trial types in an IRAP. Generally speaking, the methodologies used by the various research groups in the Department of Experimental, Clinical, and Health Psychology are latency-based procedures ostensibly similar to the IRAP. Examples of these procedures include the Implicit Association Test (IAT), the Affect Misattribution Procedure (AMP), variations of the Stroop task, and reaction time–based procedures delivered as part of EEG studies. Although not all experiments involve procedures of this kind, they are the predominant mode of investigation. In this context, amassing a sample of completely experimentally naive individuals would have been extremely difficult. We therefore decided to compare differences in IRAP performances after minimal exposure versus repeated exposure to these tasks. In this context, participants sometimes complete more than one experiment within a single visit to the research laboratory, and thus it would also have been difficult to limit a “low experience” group to just one or two experiments. Therefore, we operationalized experimental experience as follows: Participants who had taken part in five or fewer experiments were categorized as low experience (LE), whereas participants who had taken part in six or more experiments were categorized as being high experience (HE).
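
The grouping rule described above can be written down in a few lines. The following is only an illustrative sketch; the function name `experience_group` is hypothetical and is not part of the study's materials.

```python
def experience_group(n_previous_experiments: int) -> str:
    """Classify a participant by self-reported prior experiments.

    Five or fewer experiments -> low experience ("LE");
    six or more experiments  -> high experience ("HE").
    """
    if n_previous_experiments < 0:
        raise ValueError("number of experiments cannot be negative")
    return "LE" if n_previous_experiments <= 5 else "HE"
```

For example, `experience_group(5)` falls in the LE group and `experience_group(6)` in the HE group.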

In Experiment 1, all participants completed a shapes and colors IRAP. The performance of each participant on the IRAP was established initially using a verbal feedback procedure rather than through the provision of block-specific rules (described in more detail in the following sections). The purpose of the verbal feedback procedure was to avoid providing any rules or instructions that might specify or emphasize any of the relations the IRAP was assessing because, as noted previously, Finn et al. (2016) showed that such instructions may affect IRAP performances.

Method

Participants

Sixty-five individuals between the ages of 18 and 48 (M = 22.6 years) participated in the experiment in return for payment of €5. Informed consent was provided by all participants. The experiment was approved by the ethics committee of the Faculty of Psychology and Educational Sciences at Ghent University.

Materials

The materials comprised an IRAP and a questionnaire.

IRAP

The IRAP was presented on a standard laptop computer. The IRAP software was used to present the instructions and the stimuli and to record responses.^{Footnote 1} Each trial presented one of two labels: color or shape. The label stimulus was presented along with 1 of 12 target stimuli. The target stimuli were all words, six denoting colors—red, green, blue, pink, yellow, and orange—and the other six denoting shapes—triangle, circle, square, rectangle, octagon, and oval. The following instruction appeared on screen between each block of trials: “Important: During the next phase, the previously correct and wrong answers are reversed. This is part of the experiment. Please try to make as few errors as possible—in other words, avoid the red X.”

Questionnaire

A questionnaire composed of demographic information (age) was presented to participants after they had completed the IRAP. In addition, the questionnaire asked participants to estimate approximately the number of previous psychology experiments in which they had participated.

Procedure

The experiment was conducted on a one-on-one basis in cubicles frequently used by the Department of Experimental, Clinical, and Health Psychology at Ghent University. Upon entering the cubicle, participants were seated in front of a laptop that had the IRAP program initiated. The first stage of the experiment involved shaping participants’ performance on the IRAP in a manner similar to that used in research that was conducted prior to the introduction of specific preblock rules that were made possible with the 2012 version of the IRAP program. Before interacting with the IRAP, the experimenter described to participants what they would see during the task. Specifically, they were told that on each trial they would see a word appear at the top of the screen, that they would see a second word appear at the middle of the screen, and that they would be required to respond to the combination of words by choosing one of two response options—Yes or No—that appeared at the bottom of the screen, one to the left and one to the right. Participants were also informed that pressing the d key would select the response at the bottom left of the screen and that pressing the k key would select the response option at the bottom right of the screen.

After the basic details of the task had been outlined, participants were told that their goal or objective in the task was to discover the pattern of responses required by the computer program. They were told that this meant that for each trial, one response would be accurate and the other inaccurate. Selecting the accurate response would allow them to progress to the next trial, whereas selecting an inaccurate response would produce a red X below the target stimulus. At this point, participants were encouraged to press the space bar and interact with the task, being told that all they could do on the first trial was to guess which response option was the accurate one. Following their initial response, the experimenter noted the feedback for that trial. If participants emitted a response deemed inaccurate according to the program (i.e., the response produced a red X), the experimenter told them that that particular combination of labels and targets did not go together on this occasion. If participants emitted an accurate response, the experimenter provided verbal reinforcement (e.g., by saying “Well done”) and then invited them to continue with the next trial that appeared on screen. The aim of the verbal feedback procedure was to ensure that participants attained the latency and accuracy criteria required on the IRAP without specifying a formal rule or instruction (e.g., “Please respond as if shapes are shapes and colors are colors”).

On each trial of the IRAP, four words appeared on screen: (a) a label at the top center of the screen (color or shape); (b) a target at the center of the screen (e.g., red, green, blue, circle, square, or triangle); and (c) the two response options yes and no at the bottom left and right of the screen, respectively. Participants responded on each trial using either the d key for the response option on the left or the k key for the response option on the right. The locations of the response options (the words yes and no) alternated from trial to trial in a quasirandom order, such that they did not remain in the same left or right locations for more than three successive trials. Examples of each type of trial to which participants were exposed are shown in Fig. 1.
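
The constraint on response-option locations (no more than three successive trials in the same position) can be illustrated with a short sketch. This is one simple way to satisfy the constraint and is an assumption on our part; how the IRAP software itself generates the order is not described in the text. The names `yes_positions` and `longest_run` are hypothetical.

```python
import random


def yes_positions(n_trials=24, max_run=3, rng=None):
    """Return the side ('left' or 'right') on which 'yes' appears on each trial.

    Sides are drawn at random, but a side is never allowed to repeat
    more than `max_run` times in a row.
    """
    rng = rng or random.Random()
    positions = []
    for _ in range(n_trials):
        choice = rng.choice(["left", "right"])
        # If the previous `max_run` trials already used this side, switch sides.
        if len(positions) >= max_run and all(p == choice for p in positions[-max_run:]):
            choice = "right" if choice == "left" else "left"
        positions.append(choice)
    return positions


def longest_run(seq):
    """Length of the longest run of identical consecutive items in `seq`."""
    best = run = 0
    prev = None
    for item in seq:
        run = run + 1 if item == prev else 1
        best = max(best, run)
        prev = item
    return best
```

The position of the no option on each trial is simply the opposite side.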

When participants selected the response option that was deemed correct within that block of trials, the label, target, and response option stimuli were removed immediately from the screen for an intertrial interval of 400 ms, after which the next trial was presented (i.e., a label, target, and two response options appeared simultaneously). When participants selected the response option that was deemed incorrect for that block of trials, the stimuli remained on screen and a red X appeared beneath the target stimulus. The participants were required to select the correct response option, and only then did the program proceed directly to the 400-ms intertrial interval (followed immediately by the next trial). This pattern of trial presentations, with corrective feedback, continued until the entire block of 24 trials was presented. The IRAP program presented the trials in a quasirandom order within each block; each label was presented twice with each target stimulus across the 24 trials. Consistent with the majority of previously published IRAP studies, the trials presented within each block may be described as consisting of four different trial types. In the current study, the four different combinations of label and target stimuli may be denoted as (a) color–color, (b) color–shape, (c) shape–color, and (d) shape–shape (see Fig. 1).

When participants completed a block of trials, the IRAP program provided them with feedback on their performance during that block. The feedback consisted of a message informing them how accurately they had responded in terms of the percentage correct and how quickly they had responded in terms of the median response latency. The latter was calculated from stimulus onset to the first correct response across all 24 trials within the block. Participants were required to achieve a minimum accuracy of 79% correct and a maximum median latency of no more than 2,000 ms on each block within a pair. The IRAP program was set to allow participants up to a maximum of four pairs of practice blocks to achieve these criteria. Only when participants achieved these criteria across both Blocks 1 and 2 (or Blocks 3 and 4, Blocks 5 and 6, or Blocks 7 and 8) were they permitted by the IRAP program to continue to the critical test blocks. The test phase of the IRAP involved six test blocks with no accuracy or latency criteria in order for participants to progress from one block to the next. However, percentage correct and median latency were presented at the end of each block to encourage participants to maintain the accuracy and latency levels they had achieved during the practice blocks.
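
As a rough illustration of the practice-block criteria, the sketch below checks a pair of blocks against the accuracy and latency thresholds stated above. It assumes that percentage correct is based on the first response of each trial, which the text does not state explicitly; the function names and the data layout are hypothetical.

```python
from statistics import median

MIN_ACCURACY_PERCENT = 79.0     # minimum percentage correct per block
MAX_MEDIAN_LATENCY_MS = 2000    # maximum median latency per block


def block_passes(first_response_correct, latencies_ms):
    """Check one block.

    first_response_correct: one boolean per trial (assumed scoring basis).
    latencies_ms: time from stimulus onset to first correct response, per trial.
    """
    accuracy = 100 * sum(first_response_correct) / len(first_response_correct)
    return accuracy >= MIN_ACCURACY_PERCENT and median(latencies_ms) <= MAX_MEDIAN_LATENCY_MS


def pair_passes(block_a, block_b):
    """Both blocks in a pair must meet the criteria; each block is a
    (first_response_correct, latencies_ms) tuple."""
    return block_passes(*block_a) and block_passes(*block_b)
```

With 24 trials per block, 19 correct first responses (79.2%) would meet the accuracy criterion under this reading, whereas 18 (75%) would not.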

Two types of feedback contingencies were applied across the practice and test blocks of the IRAP, denoted here as consistent versus inconsistent with the natural contingencies operating in the wider verbal community. The contingencies deemed consistent required participants to choose yes on color–color and shape–shape trial types and to choose no on color–shape and shape–color trial types. The contingencies deemed inconsistent required participants to respond in an opposite pattern, choosing no on color–color and shape–shape trial types and choosing yes on color–shape and shape–color trial types. The IRAP program typically applies the feedback contingencies in one of two patterns. For one pattern, the first block and all subsequent odd-numbered blocks used the consistent feedback; the second and all subsequent even-numbered blocks used the inconsistent feedback. For the second pattern, the first block and all odd-numbered blocks used the inconsistent feedback, and the second and all even-numbered blocks used the consistent feedback. The use of these two patterns of feedback contingencies was counterbalanced (approximately) across the participants in the current study. In other words, half of the participants were presented with an IRAP that commenced with consistent feedback and then alternated from inconsistent to consistent across all subsequent blocks; the other half were presented with an IRAP that commenced with inconsistent feedback and then alternated from consistent to inconsistent across blocks thereafter. After participants completed the IRAP, they completed the questionnaire with the experimenter.
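
The two feedback contingencies and the two block orders can be summarized in a short sketch. The target words are those listed in the Materials section; the function names are hypothetical and the code is illustrative only, not the IRAP software's implementation.

```python
COLORS = {"red", "green", "blue", "pink", "yellow", "orange"}
SHAPES = {"triangle", "circle", "square", "rectangle", "octagon", "oval"}


def trial_type(label, target):
    """Return the trial type, e.g. 'color-shape' for label 'color' and target 'oval'."""
    if label not in ("color", "shape"):
        raise ValueError(f"unknown label: {label!r}")
    if target in COLORS:
        category = "color"
    elif target in SHAPES:
        category = "shape"
    else:
        raise ValueError(f"unknown target: {target!r}")
    return f"{label}-{category}"


def correct_response(label, target, contingency):
    """Response reinforced as correct under the given feedback contingency."""
    matches = trial_type(label, target) in ("color-color", "shape-shape")
    if contingency == "consistent":
        return "yes" if matches else "no"
    if contingency == "inconsistent":
        return "no" if matches else "yes"
    raise ValueError(f"unknown contingency: {contingency!r}")


def block_contingencies(starts_consistent, n_blocks):
    """Alternating contingencies across blocks for one of the two counterbalanced orders."""
    first, second = (
        ("consistent", "inconsistent") if starts_consistent
        else ("inconsistent", "consistent")
    )
    return [first if i % 2 == 0 else second for i in range(n_blocks)]
```

For instance, `correct_response("shape", "red", "consistent")` gives `"no"`, and the same trial under the inconsistent contingency gives `"yes"`.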

Results and Discussion

Questionnaire Data

As noted previously, completing five or fewer psychology experiments was used as the criterion for defining the LE group, whereas completing six or more psychology experiments defined the HE group. These criteria were chosen because it was deemed impractical to attempt to gather data from a completely experimentally naive participant group within Ghent University’s participant recruitment system. In the LE group, 28 of 29 participants met the IRAP performance criteria and the estimated average number of experiments participated in was 2.3, whereas in the HE group, 29 of 35 participants met the IRAP performance criteria and the estimated average number of experiments participated in was 32.5. The vast majority of these experiments involved exposure to some form of latency-based measure.

IRAP Data Processing

The primary datum of the IRAP is the response latency, defined as the length of time in milliseconds from stimulus presentation to a correct response on a particular trial. If participants maintained the accuracy and latency performance criteria across all six test blocks, the data from all blocks were included in the analyses. If, however, a participant failed to maintain the criteria on one or both blocks within a given test-block pair (Blocks 1 and 2, Blocks 3 and 4, Blocks 5 and 6), the data from that pair of test blocks were removed from the analyses. If a participant failed to maintain the criteria on two or more blocks from different test-block pairs, all of the data from that participant were removed. This practice has been used previously to avoid higher attrition rates (Leech, Barnes-Holmes, & Madden, 2016; Nicholson & Barnes-Holmes, 2012). Application of these criteria resulted in the exclusion of 7 of the 65 participants from the analyses.
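
The exclusion rule for test blocks can be expressed as a small decision function. This is a sketch of our reading of the rule above (failures confined to one pair remove that pair; failures spread over two or more pairs remove the participant); the name `pairs_to_keep` and the input format are hypothetical.

```python
def pairs_to_keep(block_ok):
    """Decide which test-block pairs enter the analysis.

    block_ok: six booleans, one per test block (Blocks 1-6), True if the
    block met the accuracy and latency criteria.
    Returns the indices of retained pairs (0 = Blocks 1 and 2, 1 = Blocks 3
    and 4, 2 = Blocks 5 and 6), or None if the participant is excluded.
    """
    if len(block_ok) != 6:
        raise ValueError("expected results for exactly six test blocks")
    # A pair is retained only if both of its blocks met the criteria.
    pair_ok = [block_ok[i] and block_ok[i + 1] for i in (0, 2, 4)]
    if pair_ok.count(False) >= 2:
        return None  # failures in different pairs: drop the participant
    return [index for index, ok in enumerate(pair_ok) if ok]
```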

Each participant who completed the current IRAP produced 24 response latencies for each test block. For participants who maintained the accuracy and latency criteria across all three pairs of test blocks, the D-IRAP scores were calculated as follows:

1. If 10% of a participant’s response latencies were less than 300 ms, all of the data were removed (no participant had his or her data removed on this basis).
2. All latencies over 10,000 ms were removed.
3. Twelve standard deviations for the response latencies, calculated for each trial type, were obtained across the three pairs of test blocks (i.e., Blocks 1 and 2, Blocks 3 and 4, and Blocks 5 and 6).
4. Twenty-four mean latencies were calculated, one for each trial type in each block.
5. A difference score was calculated for each trial type, in each test-block pair, by subtracting the mean latency in the consistent block from the mean latency in the inconsistent block, thus producing 12 difference scores.
6. The difference score for each trial type in each test-block pair was divided by the standard deviation for that trial type from the corresponding test blocks, resulting in 12 D-IRAP scores, one for each trial type in each pair of test blocks.
7. Four D-IRAP scores were calculated, one for each trial type, by averaging scores across the three pairs of test blocks.

The same general method for calculating D-IRAP scores was also applied to the data from participants who had data from a pair of test blocks removed, except the algorithm was adjusted accordingly (e.g., eight standard deviations were calculated in Step 3 and 16 mean latencies were calculated in Step 4).
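
The scoring steps listed above can be sketched in code as follows. This is a minimal illustration, not the software used in the study. Several details are assumptions: the input format (one tuple per trial), the use of the sample standard deviation (the text does not say whether the sample or population formula was used), pooling the latencies of both blocks in a pair when computing each standard deviation, and treating "10%" in Step 1 as "at least 10%". The function name `d_irap_scores` is hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def d_irap_scores(trials, fast_ms=300, slow_ms=10_000, fast_share=0.10):
    """Compute one D-IRAP score per trial type for a single participant.

    trials: iterable of (pair, contingency, trial_type, latency_ms) tuples,
            where contingency is 'consistent' or 'inconsistent'.
    Returns {trial_type: score}, or None if the participant is removed in Step 1.
    """
    trials = list(trials)

    # Step 1: drop the participant if too many latencies are under 300 ms
    # (">=" is an assumed reading of the threshold).
    n_fast = sum(1 for *_, latency in trials if latency < fast_ms)
    if n_fast >= fast_share * len(trials):
        return None

    # Step 2: remove latencies over 10,000 ms.
    kept = [t for t in trials if t[3] <= slow_ms]

    # Group latencies by (trial type, test-block pair) and by contingency.
    cells = defaultdict(lambda: {"consistent": [], "inconsistent": []})
    for pair, contingency, trial_type, latency in kept:
        cells[(trial_type, pair)][contingency].append(latency)

    per_type = defaultdict(list)
    for (trial_type, _pair), cell in cells.items():
        # Step 3: SD over all latencies for this trial type in this pair
        # (sample SD, pooled across both blocks: an assumption).
        sd = stdev(cell["consistent"] + cell["inconsistent"])
        # Steps 4-5: mean per block, then inconsistent minus consistent.
        difference = mean(cell["inconsistent"]) - mean(cell["consistent"])
        # Step 6: scale the difference by the SD.
        per_type[trial_type].append(difference / sd)

    # Step 7: average across the available test-block pairs.
    return {trial_type: mean(scores) for trial_type, scores in per_type.items()}
```

Because the final step averages over whatever pairs are present, the same function covers participants who had one pair of test blocks removed. A positive score indicates faster responding in the consistent block than in the inconsistent block.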

IRAP Data Analyses

The aforementioned calculations yielded four mean D-IRAP scores for each participant, one for each trial type (no main effect for block sequence was identified). The overall mean D-IRAP scores, divided according to level of experience, are presented in Fig. 2. All eight scores were positive, which indicates that both groups responded more quickly during history-consistent than history-inconsistent blocks for each of the trial types. In effect, participants tended to respond yes more quickly than no when presented with the label color and the name of a color and when presented with the label shape and the name of a shape; conversely, participants tended to respond no more quickly than yes when presented with the label color and the name of a shape and when presented with the label shape and the name of a color. In comparing the two levels of experience, the HE group produced an effect for the color–color trial type that differed dramatically from the remaining three trial-type scores; the LE group produced a broadly similar pattern, but the differences between the color–color trial type and the other three trial types were far less pronounced. A 2 × 4 mixed repeated-measures analysis of variance (ANOVA) failed to produce a main effect for experience (p > .6) but did yield a significant interaction between trial type and experience, F(3, 55) = 3.39, p = .02, η_p² = .06. Two follow-up one-way repeated-measures ANOVAs yielded a large effect size for the HE group, F(3, 28) = 18.34, p < .001, η_p² = 0.4, with a much smaller effect size for the LE group F(3, 27) = 5.82, p = .001, η_p² = 0.18. Four follow-up independent t tests, one for each trial type, were nonsignificant (ps > .1). Eight one-sample t tests indicated that each of the D-IRAP effects differed significantly from zero (ps < .03). The inferential statistics therefore confirmed the descriptive analyses presented in Fig. 2, in that the effect size for trial-type differences for the HE group was over twice that of the LE group.

How might we explain the interaction between experimental experience and trial types on the IRAP? As noted previously, Finn et al. (2016) examined the impact of providing rules before each block of trials on the IRAP. The study revealed that rules that specified particular relations (e.g., “respond as if colors are colors and shapes are shapes”) generated more pronounced differences among the trial types than general rules (e.g., “respond correctly to the stimuli”). Although speculative, perhaps large amounts of experience in broadly similar latency-based experimental studies facilitated the production of self-generated rules for completing the current IRAP that were more specific than those generated by participants with less experience. Or more informally, perhaps generating and following relatively specific rules for completing reaction-time tasks had been established across the many experiments completed by the HE group. Indeed, in many of these previous studies, it is likely that participants had been presented with quite specific rules for completing latency-based measures, and thus they simply continued to do in the current study what they had been trained to do in very similar contexts (i.e., in the same laboratory). If this was the case, then the relatively large differences between the color–color trial type and the other three trial types for the HE group would be expected (because experience functioned as a proxy for the provision of relatively specific rules).

In the research reported by Finn et al. (2016), the relatively large differences between the color–color trial type and the other three trial types were attenuated considerably when the specific rules, which focused on specific trial types, were replaced with more general rules. The rationale for using more general rules was to avoid emphasizing specific trial types before participants started responding on each block of trials. A reasonable question at this point, therefore, was how to attenuate the impact of experience on differential trial-type effects on the IRAP. Given that it was not possible to “undo” the effects of experience simply by providing general rules (because we suspected that experienced participants were self-generating specific rules), it was important to create a context for performance on the IRAP that served to undermine the dominance of a single trial type over the other three. One way that this might be achieved would be to require participants to engage with each trial type in a broadly similar manner during exposure to the task. We were unaware at the time of any obvious method that would work in this regard, but one method did suggest itself. Specifically, requiring participants to verbally report exactly what appears on screen on each IRAP trial, and the response they make, may reduce any tendency to treat one trial type differently from the others. We referred to this method as the “read-aloud procedure.” On the grounds of intellectual honesty, we must acknowledge that adopting the read-aloud procedure was, to some extent, an intuitive leap rather than a systematic solution to the problem we faced. We return to this issue in greater detail in the General Discussion section.

Experiment 2

The design of Experiment 2 mirrored that of Experiment 1, except for the addition of a read-aloud procedure. Experimental experience was operationalized in the same way as it was in Experiment 1. The IRAP in the second experiment was identical to the IRAP used previously.