The importance of the stimulus equivalence paradigm for the study of the symbolic processes associated with human behavior is widely recognized in the field of the experimental analysis of human behavior (e.g., D. Barnes-Holmes, Y. Barnes-Holmes, Smeets, Cullinan, & Leader, 2004; Billinger & Norlander, 2011; Sidman, 1994; Travis, Fields, & Arntzen, 2014). A typical stimulus equivalence study is characterized by the training of some conditional arbitrary relations between a set of stimuli and the testing for the emergence of other conditional relations that show the properties of reflexivity, symmetry, and transitivity, as Sidman and Tailby (1982) initially specified.

Most research in this area employs the matching-to-sample (MTS) procedure, in which the selection of a comparison stimulus from a set of stimuli is conditional on the presence of a particular sample stimulus. In this preparation, a relation is typically said to be established between each sample stimulus and the comparison stimulus whose selection is reinforced in its presence; this is called a ‘positive,’ ‘sample-S+,’ or ‘select’ relation. Likewise, the relation between each sample stimulus and the comparison stimulus whose choice is extinguished or punished in its presence is called a ‘negative,’ ‘sample-S–,’ or ‘reject’ relation (Carrigan & Sidman, 1992; Dixon & Dixon, 1978; Johnson & Sidman, 1993; McIlvane, 2013; McIlvane, Withstandley, & Stoddard, 1984b; Stromer & Osborne, 1982).

In studies where the MTS procedure is employed to teach the conditional relations serving as the baseline for the emergence of equivalence relations, positive relations are trained between those stimuli pre-experimentally defined as belonging to the same class (within-class relations), while negative relations are trained between those stimuli pre-experimentally defined as belonging to alternative classes (between-class relations). We call this the standard MTS procedure. For example, in a typical three-choice trial with the standard MTS procedure like A1-B1/B2, B3 (where the alphanumerics correspond to the sample, the S+, and the two S– stimuli, in that order), a positive relation is established between stimuli A1 and B1, which eventually will belong to the same class. Simultaneously, in this trial type a negative relation can be established between the A1 and B2 stimuli. In other trial types, the same B2 stimulus will be a positive comparison for sample A2, so that the A1 and B2 stimuli will eventually belong to alternative classes. The same will happen between A1 and B3. This standard MTS procedure has a high probability of yielding equivalence relations in children (e.g., Smeets, Barnes-Holmes, & Cullinan, 2000), human adults (e.g., Clayton & Hayes, 2004; Kinloch, Anderson, & Foster, 2013), and humans with intellectual disabilities (e.g., Carr, Wilkinson, Blackman, & McIlvane, 2000; O’Donnell & Saunders, 2003).

Carrigan and Sidman (1992) proposed that equivalence classes could be formed in the context of the standard MTS procedure by the exclusive training of positive relations without the training of any negative relations. They proposed a test involving a modified MTS procedure that would establish exclusively positive within-class relations: negative relations would be trained with many stimuli that did not belong to any class, and between-class negative relations would appear in only one trial for each sample in a block of training trials. Although this procedure has not been tested, some studies have shown that equivalence classes have formed after the establishment of high positive within-class and high negative between-class baseline relations (Arantes & de Rose, 2015; Carr, Wilkinson, Blackman, & McIlvane, 2000; de Rose, Hidalgo, & Vasconcellos, 2013; Grisante, de Rose, & McIlvane, 2014; Kato, de Rose, & Faleiros, 2008; Tomonaga, 1993). Furthermore, varying the S– comparison stimuli to establish high levels of control in a sample-S+ relation in the absence of reinforcement did not promote accurate performance in transitivity test trials (Harrison & Green, 1990).

More recently, a direct test of Carrigan and Sidman’s hypothesis was described by Plazas and Peña (2016). They implemented a modified three-choice MTS procedure, called the altered MTS procedure, in which trials like A1-B1/X1, X2 were trained so that the same positive relations as in the standard MTS procedure were established, but the negative relations were established to comparison stimuli (X1 and X2) that were not positive to any sample. Positive conditional control established by this procedure was assessed by employing the training trials of the standard procedure. They found that participants trained with this altered MTS procedure displayed high positive control, but this was followed by a low probability of establishing equivalence classes. Another group of participants was trained with the standard MTS procedure, and a third group was trained with a procedure called the semi-standard MTS procedure. In this last procedure each training trial established one between-class negative relation and one negative relation with an X stimulus (e.g., A1-B1/B2, X1). Participants trained with the standard and semi-standard MTS procedures who displayed both high within-class positive control and high between-class negative control had a high probability of forming equivalence relations. Contrary to Carrigan and Sidman’s theoretical predictions, these results seem to show that exclusively positive within-class relations are not sufficient for the emergence of equivalence classes and that between-class negative relations are necessary in verbally sophisticated human subjects, at least when an MTS format is used for training.

The three matching procedures used in the Plazas and Peña (2016) study established the same positive relations, but differed in the number of between-class negative relations that were established in each training trial: two for the standard procedure, one for the semi-standard procedure, and none for the altered procedure. Thus, it is possible to find a linear function between the number of negative between-class relations that the standard, semi-standard, and altered procedure establishes and the probability of formation of equivalence classes. The present study was designed to evaluate this hypothesis.

Three methodological limitations in the Plazas and Peña study must be noted. First, the standard, semi-standard, and altered procedures were compared in the context of the one-to-many (OTM) training structure. Some studies have reported differential probability of equivalence class formation according to the training structure employed (e.g., Arntzen, 2006; Arntzen, Grondahl, & Eilifsen, 2010; Arntzen & Holth, 1997, 2000; Arntzen & Vaidya, 2008; Fields, Hobbie-Reeve, Adams, & Reeve, 1999; Hove, 2003; Saunders, Chaney, & Marquis, 2005), which suggests the possibility that the differences in the probability of equivalence class formation found among the three matching procedures might be moderated if the many-to-one (MTO) and the linear-series (LS) structures are used.

Second, in the Plazas and Peña study the X stimuli presented in the training trials for the altered and semi-standard MTS procedures were always the same for each trial type. Some participants in these groups displayed high negative conditional control between the samples and the X stimuli, and this negative control was related to a low probability of establishing equivalence relations. However, the original procedure proposed by Carrigan and Sidman (1992) to bias exclusive positive control involved varying the X stimuli for each presentation of each training trial type. In the present experiment, we varied the X stimuli across the training trial types of the altered and semi-standard procedures under the assumption that this change would decrease the negative control exerted by these stimuli, which should increase the probability of emergence of the equivalence classes if the Carrigan and Sidman hypothesis were correct.

Finally, in the Plazas and Peña study the positive conditional control established by the training with the standard procedure was tested by using the baseline trials of the altered procedure, and vice versa; for the semi-standard procedure, the test consisted of changing the negative B/C or X stimulus for each sample-S+ relation. The negative conditional control for the three procedures was probed by using novel stimuli as positive comparisons.

Yet, some doubts have been raised about the suitability of novel stimuli for assessing positive and negative control, to the extent that there could be a bias towards selecting the novel stimuli (Carrigan & Sidman, 1992; Johnson & Sidman, 1993; McIlvane et al., 1987; Stromer & Osborne, 1982). As an alternative, McIlvane and colleagues (e.g., Costa, McIlvane, Wilkinson, & De Souza, 2001; McIlvane, Bass, O’Brien, Gerovac, & Stoddard, 1984; McIlvane et al., 1987; McIlvane, Kledaras, Lowry, & Stoddard, 1992; Wilkinson & McIlvane, 1997; Wilkinson, Rosenquist, & McIlvane, 2009) developed the blank comparison procedure to assess the positive and negative conditional control yielded by the training. In this procedure a blank comparison stimulus is introduced in the positive and negative control test trials in place of a correct or incorrect comparison stimulus. In this experiment we compared novel stimuli and the blank comparison stimulus as means of assessing positive and negative conditional control, regarding their sensitivity to the differences among the matching procedures as well as to the differences in the performances of participants who formed equivalence relations and those who did not.

In summary, this study is a systematic replication and extension of the Plazas and Peña study, addressing the following four questions: (1) Is the probability of equivalence class formation a function of the number of negative between-class baseline relations established by the three different matching procedures (standard, semi-standard, and altered)? (2) Are there differential effects of the matching procedures on equivalence class formation as a function of the training structure (many to one, one to many, and linear series) used? (3) Do all three matching procedures (standard, semi-standard, and altered) yield the same amount of positive and negative baseline control? (4) Which of two procedures to test positive and negative control of baseline relations (i.e., employing novel comparisons or a blank comparison stimulus) is more closely related to the matching procedures and better distinguishes participants who establish the equivalence classes from those who do not?

Method

Participants

Ninety undergraduate psychology students at Fundación Universitaria Konrad Lorenz in Bogotá, Colombia, served as participants. Their ages ranged from 16 to 22 years. Participants were randomly assigned to nine groups of ten, and they received academic credit for their participation. Before the experimental session, each participant read and signed an informed consent form; for participants younger than 18 years, a parent signed the form.

Setting, Apparatus, and Stimuli

Participants sat in a module with panels that separated them and prevented them from seeing the performance of the other participants. They sat in front of a personal computer with a polychromatic monitor and were requested to wear headphones to listen to oral instructions and auditory feedback. A program designed in Visual Basic controlled the stimuli presentation and response recording. Participants responded to each trial by clicking with the left mouse button on one of the visual stimuli on the computer screen. The stimuli were presented as black-line drawings on a 3 × 3-cm white square background over a uniform gray screen. The sample stimulus appeared centered and 5.6 cm below the upper border of the screen, and the comparison stimuli appeared in a row 2.8 cm below the lower border of the sample stimulus, separated from one another by 2.8 cm.

Figure 1 shows the stimuli employed in the experiment. The stimuli consisted of letters from different alphabets. The A, B, and C stimuli were used in the training trials to establish positive relations between the sample and correct comparisons and negative relations in the standard and semi-standard MTS procedures. The X stimuli were negative comparisons in the training trials of the semi-standard and altered MTS procedures. The N stimuli were used as novel stimuli in the positive and negative control test trials. The P stimuli were used in the pretraining phases, and the blank stimulus (herein called the K stimulus) was used as a positive or negative comparison in the pretraining trials and the positive and negative control test trials.

Fig. 1 Stimuli set employed

Design

This experiment consisted of a 3 × 3 design, which compared three MTS procedures (standard, semi-standard, and altered) and three training structures (OTM, MTO, and LS) regarding the probability of formation of equivalence relations between dissimilar visual stimuli. The acronyms OTM, MTO, and LS were employed to refer to the groups according to their training structure and the acronyms STD, SEMI, and ALT to denote the standard, semi-standard, and altered MTS procedures, respectively.

Procedure

Each trial began with the presentation of the sample stimulus in the top-middle area of the screen. Participants had to click on it, and then three comparison stimuli appeared simultaneously in a row in the bottom area of the screen. Participants had to select one of the comparison stimuli by clicking on it. The sample stimulus remained present for the duration of the presentation of the comparison stimuli (simultaneous matching). If a participant selected the correct comparison stimulus, a ‘ta-dah’ tone was played through the headphones, but if an incorrect comparison was selected then a ‘chord’ tone was played. Then, all stimuli were removed and a 1-s intertrial interval started before the sample stimulus of the next trial was presented. In each trial of all the pretraining phases (phases 1-4) and in each of training phases 5 to 7 the participants’ choices were followed by auditory feedback.

Before the start of the first experimental phase, each participant was presented with written instructions, which indicated that they would hear two different sounds depending on whether they made a correct or an incorrect choice, and with two buttons labeled “CORRECT” and “INCORRECT,” which participants were to click to hear the sounds. Each session consisted of four pretraining phases, four training phases, and two testing phases. Experimental sessions lasted an average of 45 min.

Pretraining

The pretraining phases were designed to teach the participants to respond to the blank comparison stimulus, which was to be used later in the test trials that assessed positive and negative control. The procedure developed by McIlvane and colleagues (McIlvane, 2013; McIlvane et al., 1987) was followed, which employs a fading procedure to gradually introduce the blank comparison in the context of an identity matching trial. The pretraining consisted of four phases (phases 1 to 4) in which the P stimuli were employed (see Fig. 1). Throughout the phases, the blank comparison was progressively introduced from a small black square partially hiding a figurative stimulus to a large black square totally occupying the space in which a stimulus was presented. In a pretraining trial, for example, the trial type P1-P1/P2, P3 was presented, and the blank comparison hid one of the comparison stimuli, which could be the correct one (P1) or one of the incorrect ones (P2 or P3). Responses in each trial were followed by feedback, according to the criteria of identity matching. In this way, a participant learned to reject the blank comparison if the correct choice was available and to select the blank comparison if a correct choice was not available and the other choices were incorrect. Figure 2 presents some samples of trials employed in phases 1 to 4.
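The feedback rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the Visual Basic program used in the experiment; the function name `blank_trial` and the constant `BLANK` are hypothetical, and the sketch models only the final fading step (phase 4), in which the covered comparison is fully hidden.

```python
BLANK = "K"  # label for the blank comparison, following the text's K stimulus


def blank_trial(sample, comparisons, hidden):
    """Build one identity-matching trial with one comparison fully covered.

    Returns the displayed comparisons and the response that would be
    reinforced: the identical comparison when it is visible, otherwise
    the blank.
    """
    displayed = [BLANK if c == hidden else c for c in comparisons]
    reinforced = sample if sample in displayed else BLANK
    return displayed, reinforced


# Trial type P1-P1/P2, P3 with an incorrect comparison (P2) covered:
print(blank_trial("P1", ["P1", "P2", "P3"], hidden="P2"))
# Same trial type with the correct comparison (P1) covered:
print(blank_trial("P1", ["P1", "P2", "P3"], hidden="P1"))
```

In the first call the identical stimulus is still visible, so selecting the blank would be an error; in the second, the blank is the only response consistent with identity matching.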

Fig. 2 Examples of trial types in the different phases for the group with SEMI procedure and MTO structure

In pretraining phase 1, participants were presented with blocks of 12 trials of identity MTS training trials, and a 100% mastery criterion was needed to move to the next phase. In this phase the blank comparison was not introduced yet. Translated from the Spanish, the instructions in this phase were:

You’re going to start with the first phase of this experiment. There are four white squares on the screen, one above and three below. In the upper square a letter from a foreign alphabet will appear. If you click on that letter, three letters will appear in the squares below. You must choose one of these letters. If you choose correctly, the computer will tell you with the sound for a correct answer. If you choose a wrong one, the computer will play the sound for a wrong answer. If you do all the exercises correctly, you can pass to the next phase. If you make a mistake the whole phase will be repeated.

Phase 2 presented the same training trials, but one of the comparison stimuli was covered by a small black square, which partially covered it. The covered stimulus could be either the positive comparison or one of the negative ones. Phase 3 trained the same identity relations, but one of the comparison stimuli (positive or negative to the sample) was covered by a medium-sized black square. Phase 4 presented the same trials, but with a positive or negative comparison stimulus completely hidden by a big black square. Phases 2-4 consisted of 12-trial blocks with a mastery criterion of 100% to advance to the next phase.

Training

Each participant was trained in six arbitrary conditional discriminations for the formation of three three-member classes, with a three-choice matching procedure. Table 1 shows the six sample/comparison combinations used in the training trials for each of the nine groups. Participants in the nine groups were trained in the same positive relations, and according to the MTS procedure, the difference was in the negative relations that were trained. Participants in the STD conditions were presented with training trials establishing two between-class negative relations. Participants in the SEMI conditions were presented with training trials that established one between-class negative relation and one negative relation with an X stimulus, which did not belong to any class. Participants in the ALT conditions were presented with trials in which no between-class negative relation was trained, and both negative comparison stimuli were X stimuli. Throughout the training trials of the semi-standard and altered groups, the X stimuli varied semi-randomly from 12 available stimuli (X1-X12, see Fig. 1). Training occurred through phases 5 to 8. Table 2 describes the configuration of training phases for each training structure, the number of trials by block, and the mastery criteria for each. Figure 2 shows a sample of a baseline trial.
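To make the difference among the three procedures concrete, the following Python sketch builds one training trial under each of them. It is a hypothetical illustration rather than the software used in the study: the function names are invented, the choice of which between-class stimulus appears in a SEMI trial is an assumption (Table 1 specifies the actual combinations), and plain random sampling stands in for the semi-random rule, whose constraints the text does not describe.

```python
import random

CLASSES = (1, 2, 3)
X_POOL = [f"X{i}" for i in range(1, 13)]  # X1-X12, as stated in the text


def build_trial(sample_set, comp_set, cls, procedure, rng):
    """Return (sample, S+, [S-, S-]) for one three-choice training trial."""
    sample = f"{sample_set}{cls}"
    s_plus = f"{comp_set}{cls}"
    # Comparisons that are S+ for the other classes (between-class stimuli).
    between = [f"{comp_set}{c}" for c in CLASSES if c != cls]
    if procedure == "STD":
        negatives = between                       # two between-class S-
    elif procedure == "SEMI":
        # Assumption: the between-class S- is drawn at random here.
        negatives = [rng.choice(between), rng.choice(X_POOL)]
    elif procedure == "ALT":
        negatives = rng.sample(X_POOL, 2)         # two distinct X stimuli
    else:
        raise ValueError(f"unknown procedure: {procedure}")
    return sample, s_plus, negatives


def count_between_class(negatives):
    """Number of between-class negative relations in a trial."""
    return sum(not s.startswith("X") for s in negatives)


rng = random.Random(0)
for proc in ("STD", "SEMI", "ALT"):
    sample, s_plus, negs = build_trial("A", "B", 1, proc, rng)
    print(proc, f"{sample}-{s_plus}/{', '.join(negs)}", count_between_class(negs))
```

Whatever the random draws, the three procedures yield two, one, and zero between-class negative relations per trial, which is the quantity the design manipulates.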

Table 1 Training trials of the nine experimental groups
Table 2 Structure of the training and testing phases

Testing

Testing of baseline control relations and emergent relations was conducted through phases 9 to 11. Table 2 describes the configuration of these phases. Phase 9 included test trials evaluating the positive and negative relations established in the baseline performances. There were four types of test trials, two for the positive relations and two for the negative relations. Twelve trials assessed positive relations by using novel stimuli as negative choices (sample-S+/N, N; N stands for a novel stimulus). Twelve further trials assessed positive relations by using the blank comparison stimulus as one of the negative choices and a novel stimulus as the other negative choice (sample-S+/K, N; K is the blank comparison stimulus). Twelve trials assessed negative relations by presenting a novel stimulus as the correct choice (sample-N/S-, S-). Finally, 12 trials assessed negative relations by employing the blank comparison stimulus as the correct choice (sample-K/S-, S-).

The negative control test trials for the semi-standard and altered groups contained X stimuli that were presented semi-randomly in each trial in the same way as in the training trials. Figure 2 shows some samples of trials for each of the four control test trials.

Phases 10 and 11 included test trials that assessed emergent symmetry and equivalence relations, respectively. Test trials in phases 9 to 11 were presented randomly intermixed along with baseline trials. Figure 2 depicts samples of symmetry and equivalence test trials.

Data Analysis

We treated the three matching procedures as a ratio variable, depending on the number of negative between-class relations established for each procedure: two in the STD procedure, one in the SEMI procedure, and none in the ALT procedure. Training structure was treated as a categorical variable. To provide an answer to the research questions presented above, we took the probability of equivalence class formation as the dependent variable for the first and second questions. The criterion to assume that a participant had established equivalence relations was a score above 83% (10/12 correct responses) in both the symmetry and equivalence test trials. Regarding the third and fourth questions, we took the percentage of correct responses in the four test trials for positive and negative control as the dependent variable.
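As a minimal sketch of the coding scheme just described, the Python snippet below encodes the matching procedure as the number of between-class negative relations and applies the equivalence criterion. The names are hypothetical, and the criterion is implemented under the assumption that "above 83% (10/12 correct responses)" means at least 10 correct responses out of 12 in each test.

```python
# Ratio coding of the matching procedure, as described in the text.
BETWEEN_CLASS_NEGATIVES = {"STD": 2, "SEMI": 1, "ALT": 0}


def formed_equivalence(symmetry_correct, equivalence_correct, min_correct=10):
    """True when both tests reach the criterion (assumed: >= 10 of 12)."""
    return symmetry_correct >= min_correct and equivalence_correct >= min_correct


print(BETWEEN_CLASS_NEGATIVES["SEMI"])   # 1
print(formed_equivalence(12, 10))        # True
print(formed_equivalence(12, 9))         # False
```

Requiring the criterion in both tests means that a participant with perfect symmetry scores but weak equivalence scores is not counted as having formed the classes.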

Results

There were no differences between the groups in the pretraining phases (phases 1-4) regarding the number of blocks required to meet the criterion. As pretraining phases progressed, participants required fewer blocks to meet the criterion. In phase 1, 15 participants required two blocks, three participants required three blocks, and one participant needed six blocks to meet the criterion. The remaining participants needed a single block. In phase 2, five participants required two blocks to meet the criterion, and the remaining ones needed a single block. In phase 3, three participants required two blocks and the others a single block. In the last pretraining phase, four participants required two blocks, and one participant needed four blocks; the remaining participants passed the phase with a single block.

Figure 3 shows the differences in the mean number of blocks required to meet the criterion of the training phases (phases 5 to 8) for each of the matching procedures. Table 3 shows the differences that were statistically significant. In the acquisition of the AB relations a linear relation was found between the number of between-class negative relations trained by each matching procedure and the speed of acquisition: Acquisition was slowest for participants in the STD condition, faster for participants in the SEMI condition, and fastest for participants in the ALT condition. The more between-class negative relations the matching procedure included, the slower acquisition was. Acquisition of the AC, CB, or BC baseline relations in each condition was faster than acquisition of the AB relations. However, the function was different insofar as the STD and SEMI conditions presented the same acquisition rate, while it was faster for the ALT condition. The integration in phase 7 of both kinds of previously acquired relations was even faster in the three conditions, but the function was similar to that for the acquisition of the second set of relations. Almost all participants showed maintenance of the trained baseline relations by the first block of phase 8. It seems, therefore, that the removal of feedback did not have any effect on the maintenance of the baseline relations. Baseline maintenance in the testing phases was generally high (phase 9: M = 95.2%, SD = 9.21%; phase 10: M = 93.8%, SD = 11.03%; phase 11: M = 91.81%, SD = 14.84%), although it decreased slightly as phases advanced.

Fig. 3 Mean number of blocks to meet the criterion in phases 5 to 8 in each matching procedure

Table 3 Statistical analyses of blocks to criterion in training

The upper panel of Fig. 4 shows the percentage of participants in each of the matching procedures who formed equivalence relations. Twenty-three participants in the STD condition, 15 participants in the SEMI condition, and 4 participants in the ALT condition met the criterion for equivalence class formation. These differences were statistically significant, χ²(2, N = 42) = 13.0, p = 0.002, indicating a positive linear relation between the number of between-class negative relations embedded in the baseline trials and the probability of equivalence class formation. The middle panel of Fig. 4 presents the percentage of participants for each of the training structure conditions who formed equivalence relations. Eleven participants in the OTM condition established equivalence relations, as did 18 participants in the MTO condition and 13 in the LS condition. These differences, however, were not significant, χ²(2, N = 42) = 1.85, p = 0.39.
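The text does not state which chi-square test was used. The reported values are, however, consistent with a goodness-of-fit test that compares the counts of participants who formed equivalence classes against equal expected frequencies (42 / 3 = 14 per condition). The Python sketch below works through that arithmetic under this assumption; the function names are hypothetical, and the p-value formula used is the closed form that holds only for 2 degrees of freedom.

```python
import math


def chi_square_gof(observed):
    """Goodness-of-fit statistic against equal expected counts."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def p_value_df2(chi2):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-chi2 / 2)


by_procedure = chi_square_gof([23, 15, 4])    # STD, SEMI, ALT
by_structure = chi_square_gof([11, 18, 13])   # OTM, MTO, LS

print(round(by_procedure, 2), round(p_value_df2(by_procedure), 4))
print(round(by_structure, 2), round(p_value_df2(by_structure), 3))
```

Under this reading, the statistic for the matching procedures is 13.0 with p of about 0.0015, and the statistic for the training structures is about 1.86 with p of about 0.395, close to the reported 1.85 and 0.39.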

Fig. 4 Percentage of participants who met the criteria for the formation of equivalence relations for the matching procedure (upper panel), training structure (middle panel), and training structure across the matching procedures (bottom panel)

The bottom panel of Fig. 4 shows the effects of the matching procedures on the probability of equivalence class formation for each training structure. The function differed somewhat across the training structures. In the OTM structure, a clear direct relation was found between the number of between-class negative relations embedded in the trials and the formation of equivalence relations. In contrast, in the MTO structure, a very slight decrease was found in the probability of equivalence class formation between the STD and the SEMI conditions and then an abrupt decrease with the ALT procedure, close to that obtained for the ALT procedure in the OTM structure. As for the LS structure, a decrease in the probability of equivalence class formation was found from the STD to the SEMI conditions, a pattern very similar to that obtained in the MTO structure, but then the probability stayed at the same level for the ALT procedure and above that obtained with the other training structures. In general, a positive relation was found between the probability of equivalence class formation and the number of between-class negative relations involved in the training trials. No difference was found, however, between training two versus one between-class negative relations with the MTO structure or between training one versus no between-class negative relations with the LS structure.

Figure 5 presents the results of each of the positive and negative control tests of phase 9 for each of the matching procedures, comparing the performance of those participants who formed equivalence relations and those who did not. For the positive control test trials that employed novel stimuli as negative choices (sample-S+/N, N), results were high for all participants in the three matching procedures, irrespective of whether they established equivalence relations or not. As for positive control test trials including the blank comparison stimulus as a negative choice (sample-S+/K, N), results were high for participants in the three matching procedures, but they were slightly lower for participants who did not form equivalence relations. In brief, the three matching procedures did not differ in the level of positive control that they yielded, and the positive control did not appear to depend on the number of between-class negative relations involved in the trials for each matching procedure.

Fig. 5 Percentage of correct responses in positive and negative control test trials for participants who formed equivalence relations (Part-Eqv) and participants who did not (Part-No-Eqv) for each matching procedure

Regarding negative control, in the test trials that employed novel stimuli as the correct choice (sample-N/S-, S-), there were differences depending on the matching procedures and on whether participants established equivalence relations or not. For participants who formed equivalence relations, higher negative control was observed in the STD condition than in the SEMI and ALT conditions. For participants who did not form equivalence relations, higher negative control was observed in the STD condition than in the other conditions, but lower negative control was found for participants in the SEMI condition as compared to participants in the other conditions. Participants in the STD condition who formed equivalence relations had significantly higher results than participants who did not form equivalence relations: t(28) = 2.65, p = 0.013. Significant differences were also observed between participants in the SEMI condition who formed equivalence relations and those who did not: t(28) = 2.38, p = 0.024. As for the ALT condition, no differences were found between participants who formed equivalence relations and those who did not. Consequently, negative control as assessed by the sample-N/S-, S- test trials was higher when the training trials included two between-class negative relations than when they included only one or none of these relations. Further, negative control was related to equivalence class formation for the STD and SEMI conditions.

As for negative control test trials that presented the blank comparison stimulus as correct choice (sample-K/S-, S-), participants in the three matching procedures who formed equivalence relations showed very high performances. Participants who did not form equivalence relations showed significantly lower performances only in the SEMI condition: t(28) = 3.218, p = 0.003. Performances in this test did not vary in accordance with the number of between-class negative relations involved in the trials of each matching procedure, and differences between those who formed equivalence relations and those who did not were evident only when baseline trials involved one between-class negative relation.

Discussion

The probability of equivalence class formation was directly related to the number of between-class negative relations included in the baseline relations training trials. These results replicate those of Plazas and Peña (2016) as well as other studies that have highlighted the importance of negative relations for equivalence relations (e.g., Arantes & de Rose, 2015; Carr et al., 2000; de Rose et al., 2013; Grisante et al., 2014; Harrison & Green, 1990; Kato et al., 2008; Tomonaga, 1993; Urcuioli, 2008). The present results also dispute Carrigan and Sidman’s (1992) hypothesis that the exclusive training of positive conditional relations is sufficient for equivalence class formation. Specifically, although the matching procedures used in the present experiment did not differ in the positive baseline control, the various training procedures differed in their probability of inducing the emergence of the equivalence classes. This was particularly the case with the ALT procedure, in which the within-class positive baseline control was high, but the probability of equivalence class formation was found to be very low.

The relation between the number of between-class negative relations embedded in the trials and the probability of equivalence class formation was, however, slightly different for each training structure. It was clearly linear for the OTM structure, but not so for the other structures. In the MTO structure, the probability of equivalence class formation was high regardless of the number of between-class negative relations present in the training trials. In this experiment the MTO structure presented the highest probability of equivalence class formation. Other studies also have reported a higher probability of establishing equivalence classes with the MTO structure (Arntzen & Vaidya, 2008; Fields, Hobbie-Reeve, Adams, & Reeve, 1999; Hove, 2003; Saunders, Chaney, & Marquis, 2005; Saunders, Wachter, & Spradlin, 1988; Spradlin & Saunders, 1986). Consequently, it is possible that this higher probability might have prevented a decreasing effect from two to one between-class negative relations in the baseline trials. It was not sufficient, however, to prevent such a decreasing effect when no between-class negative relations were included in the baseline.

The case for the LS structure is different. The probability of equivalence class formation decreased when the number of between-class negative relations in baseline went from two to one, but it remained at the same level with no between-class negative relations. Some studies have shown that this structure is the weakest for the establishment of equivalence relations (Arntzen & Holth, 2000; Arntzen et al., 2010; Eilifsen & Arntzen, 2009; Reilly, Whelan, & Barnes-Holmes, 2005; Saunders & McEntee, 2004). In this study, in contrast, the LS structure showed an overall higher probability of equivalence class formation than the OTM structure. Nevertheless, this advantage occurred in the ALT procedure, while the other studies only tested this with a STD procedure; thus, the results of this experiment do not directly contradict them. It is not clear why this structure would present an advantage over the other structures in the ALT procedure; more research is necessary to elucidate this issue.

The present study also tried to solve the methodological issue of which of the procedures used to assess positive and negative baseline control (employing novel stimuli or a blank comparison stimulus) would be more sensitive to the differences in matching procedures and to the formation of equivalence relations. Our results show that the assessment of the positive baseline relations employing a blank comparison stimulus as an incorrect choice was the more sensitive measurement for differentiating participants who established equivalence relations from those who did not. The very high scores in the sample-S+/N, N test trials for all matching conditions could be accounted for by the fact that these test trials allowed participants to select the correct stimulus on the basis of which comparison selection was reinforced in the establishment of the baseline, irrespective of the sample stimulus. In this sense, this test cannot assess truly positive conditional control, and this fact accounts for the lack of discriminability between participants who formed equivalence relations and those who did not.

In the case of negative control tests, the use of a novel stimulus as the correct choice was more sensitive to the changes in the between-class negative baseline relations as compared to the use of the blank comparison stimulus as the correct choice. The use of novel stimuli in these tests discriminated best between participants who formed equivalence relations and those who did not. Hence, for future research attempting to assess positive and negative conditional control yielded by baseline relations, the use of the blank comparison stimulus as a negative choice for test trials for positive control and the use of novel stimuli as correct choices for test trials for negative control are recommended.

Two methodological issues to be considered about this experiment are the possible effects of pre-training and of the positive and negative control tests on the final probability of establishing equivalence relations. Pre-training phases were included to teach participants how to respond to the blank comparison stimulus and were conducted in the context of identity matching to sample. Their immediate effect could be reflected in the baseline training trials, probably making their acquisition more difficult because of the change from identity to arbitrary conditional matching and the withdrawal of the blank comparison stimulus. Although this might have been the case, this manipulation should have affected all experimental conditions in the same manner, and hence it does not account for the differences in acquisition rate of the baseline relations for each matching condition. Further, the differences across matching procedures in the probability of equivalence class formation were very similar to those found by Plazas and Peña (2016), although they did not include this manipulation in the pre-training phases. In consequence, it would be very difficult to attribute any effect on the baseline acquisition and the formation of equivalence relations to the pre-training phases. It might be of interest to explore any such effect in equivalence class formation if baseline and testing trials include the blank comparison stimulus as a correct or incorrect choice.

In regard to the inclusion of the positive and negative control baseline tests before symmetry and equivalence testing, we can distinguish two possible effects. First, the inclusion of a phase with 60 trials, with some baseline trials, but without feedback, might decrease the baseline maintenance at the time of the symmetry and equivalence testing. Although some decrease was observed in the baseline maintenance throughout the testing phases, it did not appear significant, and it equally affected all experimental conditions; thus, this could not account for the wide differences across matching procedures on the establishment of equivalence relations.

Second, the introduction of positive and negative control test trials, along with the introduction of the novel stimuli and the blank comparison stimulus, might have introduced a new responding context that in turn affected the responses during symmetry and equivalence test trials. Nevertheless, the content of the control test trials was not incompatible with the formation of equivalence relations. It is possible that the introduction of the positive and negative control test trials might have enhanced the effect of the baseline trials associated with each experimental condition, making the responding more idiosyncratic for each condition and amplifying the differences between conditions in symmetry and equivalence test trials. However, in Plazas and Peña’s (2016) Experiment 1, wide differences in the probability of equivalence class formation were evident between the STD and the ALT procedure, despite the fact that no previous assessment of stimulus control was associated with the baseline performance. A systematic replication of the present experiment not including a phase of positive and negative control testing might be necessary to clarify this issue.

The present study shows that both positive within-class and negative between-class baseline relations are important factors for determining the probability of establishing equivalence relations. However, some participants trained with the STD procedure did not establish equivalence relations, whereas some participants trained with the ALT procedure did. In consequence, more research seems necessary to establish what other determinants of equivalence class formation might be responsible for these discrepancies. On the other hand, this research was conducted in the context of classes that were minimal in size, that is, they contained only three members. It might be of interest to establish whether the results found in this study would be similar with larger OTM, MTO, and LS classes, all of which would include one nodal stimulus, or in larger classes with some baseline relations established with the STD procedure and others with the ALT procedure.