Introduction

Simple arithmetic facts are stored in semantic memory as an associative network whose nodes are interrelated (Campbell and Graham 1985). When a simple problem is presented (i.e., the addition problem 2 + 4), the nodes that represent the operands (i.e., 2 and 4) and the solution (i.e., 6) of the problem are activated automatically. Furthermore, due to the principle of spreading activation, other related nodes become activated too (i.e., 8, the result of multiplying the operands 2 and 4) (Ashcraft 1992). This concurrent activation produces competition between arithmetic facts (Winkelman and Schmidt 1974). For instance, when individuals resolve an addition problem (i.e., 2 + 4), the arithmetic fact associated with the multiplication (i.e., 8) produces interference and slows down the time needed to select the correct answer (i.e., 6).

There is empirical evidence of this interference effect in arithmetic problem verification tasks (Grabner et al. 2013; Lemaire et al. 1991; Winkelman and Schmidt 1974; Zbrodoff and Logan 1986). In this task, a simple addition problem is presented (i.e., a pair of one-digit operands and a proposed result) and participants have to decide whether the proposed result is correct or not. The critical trials are those associated with negative responses (incorrect addition problems). In these trials, participants take more time to respond when the incorrect result is the product of the operands (e.g., 2 + 4 = 8) than when it is unrelated to them (e.g., 2 + 4 = 10). This so-called associative confusion effect (Winkelman and Schmidt 1974) has been taken as an index of the simultaneous activation of addition and multiplication facts in semantic memory (Grabner et al. 2013; De Visscher et al. 2015).

The associative confusion effect described in mental arithmetic involves two processes: coactivation of several arithmetic facts (e.g., multiplications and additions) in semantic memory and competition between the one needed to solve the problem (e.g., 2 + 4 = 6) and others that are related but irrelevant (e.g., the multiplication counterpart, 2 × 4 = 8). Behavioral results capture the competition process (behavioral interference). However, latency measures remain silent about the coactivation process in semantic memory because they reflect the culmination of multiple stages of processing. Hence, behavioral indexes of the associative confusion effect would be complemented by the inclusion of temporally precise methods to index the time course of mental activity in simple arithmetic. Event-related potentials (ERPs) provide this temporal acuity.

In our study, we recorded electrophysiological activity when participants resolved addition problems in order to examine the time course of processes underlying the associative confusion effect. Specifically, we focused on the N400, a negative-going waveform peaking at approximately 350–450 ms after stimulus onset. Importantly for the current study, the amplitude of this component is sensitive to the processing of semantic information (Domahs et al. 2007; Jost et al. 2004; Macizo et al. 2012; Niedeggen and Rösler 1996, 1999; Niedeggen et al. 1999). For instance, in psycholinguistic studies, it has been corroborated that N400 amplitude is attenuated (less negative) when a target stimulus is preceded by a semantically related context relative to an unrelated context (Kutas and Hillyard 1980, 1984). This N400 attenuation has been interpreted as due to the spreading of activation in semantic memory which facilitates the processing of target stimuli preceded by related primes (Kutas and Federmeier 2011). The N400 is not specific to the processing of linguistic stimuli. In fact, N400-like potentials have been found when individuals process meaningful stimuli in the nonverbal domain (pictures, faces, etc.), suggesting that members of this family of N400-like potentials are found whenever stimuli tap into semantic memory (Kutas and Federmeier 2011).

Although N400-like attenuation is an index of coactivation of related information in semantic memory, this ERP modulation has been interpreted also as reflecting competition and inhibition of irrelevant contents (e.g., Debruille et al. 2008; Shang and Debruille 2013). For example, when participants have to indicate that a stimulus is not a word, N400 amplitude is more negative for pseudowords that look-alike real words (Holcomb et al. 2002; Debruille 1998); an effect that has been interpreted as reflecting the inhibition of irrelevant words activated by look-alike stimuli.

To our knowledge, ERPs have not been used to study the associative confusion effect in simple arithmetic. However, amplitude modulations of N400-like components have been reported in studies of the processing of multiplication problems (Domahs et al. 2007; Jost et al. 2004; Niedeggen and Rösler 1996, 1999; Niedeggen et al. 1999). These studies seem to suggest that the N400 reflects coactivation in the network of arithmetic facts. To illustrate, Niedeggen and Rösler (1999) asked participants to decide whether simple multiplication problems were correct or not. The result of incorrect multiplication problems could be related (i.e., a multiple of either the first or the second operand, e.g., 5 × 8 = 32) or unrelated (e.g., 5 × 8 = 34). The authors found behavioral interference: related problems were solved more slowly than unrelated problems. In contrast, when the ERP pattern was considered, an attenuation of the N400-like component was obtained for related results relative to unrelated results. Hence, the authors observed a dissociation between decision times and ERP measures, in which behavioral interference was accompanied by an attenuation of the N400-like amplitude. The authors concluded that the N400-like effect indexed the spreading of activation in the network of arithmetic facts, so that related results facilitated the retrieval of the correct multiplication results. In contrast, behavioral interference was interpreted as the consequence of a late competition process that was not captured in the ERP measures but was observed in response times.

When we revisit the associative confusion effect, we observe that even when people take more time to verify a problem whose result is the one of multiplying its operands (i.e., 2 + 4 = 8), they are able to resolve it correctly most of the time (i.e., to say that 2 + 4 = 8 is incorrect). It has been proposed that the conflict produced by the coactivation of arithmetic facts is solved by an inhibitory mechanism (Campbell and Dowd 2012; Campbell and Thompson 2012; Megías et al. 2014; Megías and Macizo 2015a, b). In a recent study, Megías et al. demonstrated that this inhibitory mechanism acts in a continuous manner (on a trial-by-trial basis) in order to reduce interference when competition between arithmetic facts takes place. To address this issue, Megías and Macizo (2015a) designed a new paradigm in which additions were presented in blocks of two trials and participants had to decide whether the proposed result of an addition problem was correct or not. In the first trial, participants took more time to respond to an incorrect addition problem whose result was the one of multiplying the operands (i.e., 2 + 4 = 8) relative to an unrelated condition (i.e., 2 + 4 = 10). This interference effect suggested that participants activated multiplication facts when they verified addition problems. In the second trial, participants took more time to respond to another addition problem whose result was the one of multiplying the operands of the previous trial (i.e., 2 + 6 = 8 preceded by 2 + 4) relative to an unrelated condition (i.e., 4 + 6 = 10 preceded by 2 + 4). This interference effect obtained in the second trial was interpreted as a consequence of inhibiting the irrelevant multiplication result when participants responded to the first trial. Hence, participants required additional time to reactivate the inhibited result (i.e., 8) when it was presented again in the second trial and it was the one needed to perform the task (i.e., 2 + 6 = 8).

The second goal of the current study was to gather electrophysiological evidence of the consequences associated with the selection of arithmetic facts. To this end, we focused on the P200 potential, a complex component peaking at about 200 ms after stimulus onset. This component is sensitive to several cognitive processes such as the analysis of facial expressions (Paulmann and Pell 2009), the early processing of lexical stimuli (Dehaene 1995; McCandliss et al. 1997), and the encoding and retrieval of the meaning of stimuli in semantic memory (Chapman et al. 1978; Dunn et al. 1998; Friedman et al. 1981). Therefore, the cognitive interpretation of the P200 is not straightforward and it depends on what is being studied. In the current research, we considered the sensitivity of the P200 potential to index the difficulty to encode and retrieve information from semantic memory (Raney 1993; Smith 1993). For instance, when participants with high and low recall of a list of words are compared (Dunn et al. 1998), low recall participants show larger P200 amplitude in anterior regions and smaller posterior amplitudes than high recall participants. The authors suggest that frontal P200 would be associated with the ease of encoding a stimulus whose meaning has to be retrieved, while the posterior P200 would be linked to the complete access to long-term memory.

In the field of arithmetic cognition, the P200 component has been related also to the difficulty of encoding and retrieval of semantic information with numerical stimuli (Kong et al. 1999; Muluh et al. 2011; Szücs and Csépe 2004). For example, when participants have to verify the correctness of addition problems, the P200 amplitude is larger in frontal regions when the addition problem is difficult (i.e., large addition problems with carrying in solution; e.g., 7 + 8 =) relative to easy addition problems (small addition problems without carrying in solution; e.g., 2 + 4 =) (Kong et al. 1999). This problem size effect seems to indicate that arithmetic facts associated with large problems are less accessible than those associated with small problems (Ashcraft 1992). Therefore, the results of these studies suggest that P200 component can be considered an index of the difficulty in resolving simple arithmetic problems. Concretely, previous studies have shown frontal P200 modulations related to the encoding of digits when participants solve additions (Iguchi and Hashimoto 2000; Szücs and Csépe 2004). Hence, in the context of the associative confusion effect, P200 could be sensitive to the difficulty underlying the encoding of an addition result which was presented previously in an associative confusion trial (i.e., it was the result of multiplying the operands of the addition presented before).

The current study

The goal of the current study was twofold. The first goal was to explore the time course of the processes underlying the associative confusion effect reported in the past when participants perform arithmetic tasks (arithmetic facts stored in semantic memory; Winkelman and Schmidt 1974; Zbrodoff and Logan 1986). To this end, participants verified the correctness of addition problems. In a first trial, we expected to corroborate the behavioral interference effect reported in previous research (Megías et al. 2014; Megías and Macizo 2015a, b), so that participants would take more time to verify an incorrect addition problem whose proposed result was the product of the operands (related 1 condition, e.g., 2 + 4 = 8) relative to an unrelated condition. This effect would capture the automatic coactivation of addition and multiplication facts in long-term memory. Moreover, if participants coactivate arithmetic facts, an N400-like attenuation would be observed in the related 1 condition due to the spreading of activation in the associative network of arithmetic facts, which would facilitate the activation of related nodes. Conversely, if the N400-like component is sensitive to the competition produced by the coactivation of irrelevant arithmetic facts, N400 amplitude would be more negative in the related 1 condition relative to the unrelated 1 condition.

Our second goal was to examine the consequences associated with the selection of arithmetic facts in order to resolve the addition problems. We expected to observe longer reaction times in a second trial when participants verify a correct addition problem whose result is the one of multiplying the operands of the previous trial (related 2 condition, i.e., 2 + 6 = 8, preceded by 2 + 4 =) compared to an unrelated condition. This interference effect has been interpreted as due to the inhibition of the irrelevant result in the previous trial, so the difficulty to encode and retrieve the result increases when it is presented afterward (Megías et al. 2014; Megías and Macizo 2015a, b). If this argument is correct, the behavioral interference effect in the second trial would be accompanied by a modulation of the P200 component which reflects the difficulty of encoding stimuli for retrieving information in long-term memory.

Methods

Participants

Seventeen students from the University of Granada (ten women and seven men) took part in the study. Their mean age was 22 years (SD = 4.12). Sixteen participants were right-handed, and one participant was left-handed. All participants had normal or corrected-to-normal visual acuity. None had any reported history of neurological or psychiatric disorders. The experiment was undertaken in accordance with the Declaration of Helsinki. The Ethics Committees of the University of Granada approved the experimental procedures, and each subject provided written informed consent before performing the experiment. Their participation was remunerated with academic credits. Before the experimental task, they completed a questionnaire to determine their use of simple arithmetic (Colomé et al. 2011) (see Table 1). The percentage of calculation of addition problems on a daily basis was 43.8 % (SD = 11.80). Moreover, 81.18 % (SD = 27.36) of the participants learned the multiplication tables orally.

Table 1 Use of simple arithmetic

In order to control that participants had a good knowledge about multiplication tables, they performed a production multiplication task. In this task, tables from 1 to 4 were presented (i.e., 2 × 4 = ?) and participants had to say aloud the correct result (i.e., 8). This task was performed at the end of the experiment. The EEG was not recorded when participants performed this task. Participants showed a good knowledge of simple multiplication problems with 92.84 % of correct responses (SD = 5.74). Mean reaction time in correct multiplication problems was 1080 ms (SD = 339.34).

Design and materials

We used an arithmetic problem verification task (see Fig. 1) in which participants received addition problems and they decided whether they were correct or incorrect. The addition problems were presented in blocks of two trials. In the first trial, the variable Relation 1 was manipulated as a within-subject factor with two conditions: the related 1 condition included an incorrect addition problem whose result was that of multiplying the operands (i.e., 2 + 4 = 8), and the unrelated 1 condition contained an incorrect addition problem whose result was not the one of multiplying the operands (i.e., 2 + 4 = 10). In the second trial, the variable Relation 2 was manipulated as a within-subject factor with two conditions: the related 2 condition contained a correct addition problem whose result was the one of multiplying the operands of the previous trial (i.e., 2 + 6 = 8), and the unrelated 2 condition included a correct addition problem with a result which was not the one of multiplying the operands of the previous trial (4 + 6 = 10).
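The four conditions of the design can be stated as a simple rule over the operands and the proposed result. The following is a minimal sketch of that rule, not the authors' stimulus-generation code; the function names are ours and were introduced only for illustration.

```python
def classify_first_trial(a, b, proposed):
    """Label an incorrect first-trial addition a + b = proposed."""
    if proposed == a + b:
        raise ValueError("first-trial experimental problems are incorrect additions")
    # Related 1: the proposed result is the product of the operands.
    return "related 1" if proposed == a * b else "unrelated 1"


def classify_second_trial(prev_a, prev_b, a, b):
    """Label a correct second-trial addition a + b, given the operands
    (prev_a, prev_b) of the first trial in the same block."""
    # Related 2: the correct sum equals the product of the previous operands.
    return "related 2" if a + b == prev_a * prev_b else "unrelated 2"


print(classify_first_trial(2, 4, 8))       # related 1
print(classify_first_trial(2, 4, 10))      # unrelated 1
print(classify_second_trial(2, 4, 2, 6))   # related 2
print(classify_second_trial(2, 4, 4, 6))   # unrelated 2
```

The four calls reproduce the example problems given in the text (2 + 4 = 8, 2 + 4 = 10, 2 + 6 = 8 and 4 + 6 = 10).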

Fig. 1

The arithmetic verification task was presented in blocks of two trials. The first trial started with a fixation point of 500 ms followed by an addition problem. Two types of addition problems could be presented: related 1 addition problems (e.g., 2 + 4 = 8) or unrelated 1 addition problems (e.g., 2 + 4 = 10). After the participant’s response, the second trial started with a fixation point of 500 ms followed by the second addition problem, which could belong to the related 2 condition (e.g., 2 + 6 = 8) or the unrelated 2 condition (e.g., 4 + 6 = 10).

The experimental material used in the current experiment was exactly the same as that employed in previous studies (Megías et al. 2014; Megías and Macizo 2015a, b). To construct the experimental blocks of trials, 20 false addition problems were selected for the first trial (10 related 1 addition problems and 10 unrelated 1 addition problems), and 20 correct addition problems were used in the second trial (10 related 2 addition problems and 10 unrelated 2 addition problems; see Table 3 in “Appendix”). Across participants, each problem in each condition of the first trial (related 1 and unrelated 1 addition problems) was followed half of the time by related 2 addition problems and the other half by unrelated 2 addition problems. Therefore, the related 2 and unrelated 2 addition problems were preceded an equal number of times by related 1 and unrelated 1 trials. Each participant received the experimental block of trials (20 trials per condition in trial 1 and 20 trials per condition in trial 2) three times in order to have more trials per condition. Therefore, the total number of observations was 60 in each condition of the first trial (related 1 and unrelated 1) and in each condition of the second trial (related 2 and unrelated 2).

The addition problems used in the experimental task were carefully selected to equate them in several factors that might determine possible differences between conditions in the first and second trials of the experiment. All addition problems were composed of single-digit operands, and the two operands of each problem were presented in ascending order (i.e., 2 + 6). The parity (even and odd digits) of operands and results was equally distributed across the conditions of the first and second trials of the experimental blocks. In each trial, the solution corresponded to multiplication tables from 1 to 4 and it was never one of the two operands presented in the problem (i.e., 2 + 1 = 2 was not presented).

In the first trial, the related 1 condition and the unrelated 1 condition were equated in problem size (the sum of the two operands in both conditions was exactly the same: M = 7.40). The size of the incorrect results presented in the related 1 condition and the unrelated 1 condition was also similar (M = 11.80 and M = 11.60, respectively), t(18) = 0.12, p = .90. Furthermore, the distance between the incorrect result presented to the participants and the correct result of the addition problem in the two conditions of the first trial was exactly the same (M = 4.40). In the second trial, the problem size was equated in the related 2 condition (M = 11.80) and the unrelated 2 condition (M = 11.60), t(18) = 0.12, p = .90. In order to maintain the same problem size in the two conditions of the second trial, one addition problem in the related 2 condition (7 + 9 = 16) and one addition in the unrelated 2 condition (4 + 6 = 10) were repeated. The selection of these problems was random.
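The matching variables described here are simple functions of each problem. The sketch below computes them for the example problems mentioned in the text rather than for the full stimulus set, which is listed in the Appendix; the function names are ours and purely illustrative.

```python
def problem_size(a, b):
    """Problem size as defined in the text: the sum of the two operands."""
    return a + b


def result_distance(a, b, proposed):
    """Absolute distance between the proposed result and the correct sum."""
    return abs(proposed - (a + b))


def mean(values):
    return sum(values) / len(values)


# Illustrative first-trial items only: (operand 1, operand 2, proposed result).
related_1 = [(2, 4, 8)]
unrelated_1 = [(2, 4, 10)]

print(mean([problem_size(a, b) for a, b, _ in related_1]))          # 6.0
print(mean([result_distance(a, b, p) for a, b, p in related_1]))    # 2.0
print(mean([result_distance(a, b, p) for a, b, p in unrelated_1]))  # 4.0
```

These two example items differ in distance (2 versus 4). The matching reported in the text holds at the level of condition means over the ten items per condition, not necessarily for any single pair of items.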

Moreover, we controlled for the degree of similarities between the addition problems presented in the first trial and those corresponding to the related 2 and the unrelated 2 condition of the second trial. The numerical distance between the incorrect result presented in the first trial and the second trial was exactly the same in the related 2 condition and the unrelated 2 condition (M = 1.40). The difference between the problem size in the first trial and the second trial was exactly the same in the related 2 condition and the unrelated 2 condition (M = 4.40). The number of repetitions between the digits presented in the first trial and the second trial (i.e., 2 was repeated in the block composed of the first trial 2 + 3 = 6 followed by 2 + 4 = 6) was exactly the same in the related 2 condition and the unrelated 2 condition (8 repetitions).

In order to check that there were no differences in response latency and accuracy when individuals answered to the addition problems used in the related 2 and unrelated 2 condition without any manipulation, we performed a pilot study (Megías and Macizo 2015b). Participants performed a production task that contained the addition problems presented in the related 2 and unrelated 2 conditions. There were no differences in the percentage of errors associated with related 2 addition problems (13.53 %) and unrelated 2 addition problems (11.59 %), F < 1. Furthermore, there were no differences in reaction times associated with the related 2 condition (990 ms) and the unrelated 2 condition (984 ms), F < 1. Therefore, the two conditions of the second trial were equated.

To prevent the participants from noticing the structure of the experimental blocks (a sequence of an incorrect operation in the first trial and a correct operation in the second trial), each list of experimental blocks was randomly intermixed with ten filler blocks of trials which were repeated four times. We selected here the same filler blocks employed the first time the experimental paradigm was used (Megías et al. 2014, Experiment 1). The correct responses in the first and second trials of these blocks were ‘yes’–‘yes,’ ‘no’–‘no,’ and ‘yes’–‘no,’ respectively. Therefore, the sequence of responses within each block of two trials was unpredictable through the experiment. The filler blocks included 6 addition problems and 4 multiplication problems. It could be argued that the inclusion of multiplication filler problems might foster the coactivation of multiplication facts. However, it has been corroborated (Megías et al. 2014, Experiment 2) that the inclusion of multiplication filler problems does not influence the processing of the experimental additions employed here, probably because small simple additions are used with rapid and automatic access to the arithmetic facts network.

Before starting the arithmetic problem verification task, the participants performed four blocks of practice trials (two pairs of additions and two pairs of multiplications) with problems that were not used in the main experiment.

Procedure

The experiment was designed and controlled by E-Prime experimental software (Schneider et al. 2002). The stimuli were always presented in the middle of the screen in black color (Arial font, 40 point size) on a white background. Participants were tested individually, and they were seated at approximately 60 cm from the computer screen.

The experimental task was an arithmetic problem verification task presented in blocks of two trials. Participants had to decide whether the result of each problem was correct or incorrect. We used the same procedure described by Megías et al. (2014), Megías and Macizo (2015a, b) in order to make comparable the current electrophysiological experiment with behavioral studies previously done with the same paradigm: The first trial began with a fixation point in the middle of screen for 500 ms, followed by the arithmetic problem until the participant’s response. After giving the answer, the second trial appeared with the same sequence of events as that of the first trial: a fixation point for 500 ms and the arithmetic problem until the participant’s response. After each block of two trials, the participants were instructed to press the space bar to continue with the following block. Participants were instructed to respond by pressing the keys ‘M’ and ‘Z,’ which were labeled as ‘correct’ and ‘incorrect.’ The ‘correct’ and ‘incorrect’ position assignment was counterbalanced across participants. The duration of the complete experimental session was approximately 90 min.

Electrophysiological recording and analysis

The EEG was recorded from 15 scalp electrodes (left frontal, F3, F1; medial frontal, FZ; right frontal, F2, F4; left central, C3, C1; medial central, CZ; right central, C2, C4; left parietal, P3, P1; medial parietal, PZ; and right parietal, P4, P2) mounted on an elastic cap according to the international 10–20 system (Jasper 1958). The continuous electrical activity was recorded with Neuroscan Synamps2 amplifiers (El Paso, TX). The EEG was initially recorded against an electrode placed in the midline of the cap (between Cz and CPz) and re-referenced off-line against a common average reference. To control for vertical and horizontal eye movements, two additional pairs of electrodes were used: a) Bipolar pairs of electrodes placed above and below the left eye and on the outer canthi allowed blink artifact to be corrected, and b) two electrodes placed in the external canthi, with one electrode on the left and another on the right side, allowed eye movements to be rejected. Each EEG channel was amplified with a band pass of 0.01–100 Hz and digitized at a sampling rate of 500 Hz. Impedances were kept below 5 kΩ.

Trials contaminated by eye movements or amplifier saturation artifacts were rejected. Eye blinks were identified in the following way: the activity in the electrodes placed above and below the left eye was visually inspected for each participant separately in order to determine the voltage range associated with blinks. Then, a voltage threshold was individualized for each participant so that epochs exceeding the voltage criterion were captured as blink artifacts (the mean voltage threshold across participants was 100 μV). Afterward, blinks were averaged for each participant separately, using a minimum of 73 blinks per participant, and later corrected with linear regression in the time domain (Neuroscan Scan 4.5 software, El Paso, TX). Individual epochs were extracted for each experimental condition, beginning with a 100-ms pre-stimulus baseline. Average ERP waveforms were time-locked to the presentation of the arithmetic problem. Trials with incorrect responses in the arithmetic verification task were excluded from the average ERPs and submitted to the behavioral analysis of accuracy (2.01 % of the data in the first trial and 3.14 % of the data in the second trial). Averages in each condition of the study comprised a mean of 58.46 trials out of 60 trials (with a minimum of 58 trials per condition).

Statistical analyses were performed on the mean amplitude in two time windows. These time windows were established after visual inspection and were intended to evaluate two ERP components: the 170- to 230-ms time window was used to assess the P200 component (Jiang and Zhou 2009; Paulmann et al. 2013), and the 350- to 450-ms time window was used to evaluate the N400 component (Carreiras et al. 2009; Galfano et al. 2009). For the repeated-measures analyses of variance (ANOVAs), the Greenhouse–Geisser correction (Greenhouse and Geisser 1959) for nonsphericity of variance was used for all F ratios with more than one degree of freedom in the numerator; we report the original df, the corrected probability level and the ε correction factor.
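The mean-amplitude measure can be illustrated with a short sketch. It assumes epochs stored as plain lists of voltages sampled at 500 Hz with a 100-ms pre-stimulus baseline, as described above; the data layout, the function names and the synthetic data are our assumptions, and the actual analysis was carried out in the Neuroscan software.

```python
from statistics import fmean

SFREQ_HZ = 500     # sampling rate reported in the text
BASELINE_MS = 100  # pre-stimulus baseline reported in the text


def ms_to_samples(ms, sfreq=SFREQ_HZ):
    """Convert a duration in milliseconds to a number of samples."""
    return int(round(ms * sfreq / 1000))


def mean_amplitude(epochs, start_ms, end_ms):
    """Mean amplitude of the averaged, baseline-corrected ERP in a window.

    epochs: list of trials, each a list of voltages whose first sample lies
    BASELINE_MS before stimulus onset. The window [start_ms, end_ms) is
    expressed relative to stimulus onset.
    """
    n_base = ms_to_samples(BASELINE_MS)
    corrected = []
    for trial in epochs:
        baseline = fmean(trial[:n_base])  # mean pre-stimulus voltage
        corrected.append([v - baseline for v in trial])
    # Average across trials, sample by sample.
    erp = [fmean(samples) for samples in zip(*corrected)]
    first = n_base + ms_to_samples(start_ms)
    last = n_base + ms_to_samples(end_ms)
    return fmean(erp[first:last])


# Synthetic example: 1 µV before onset and 3 µV afterwards, 700-ms epochs.
trial = [1.0] * 50 + [3.0] * 300
epochs = [trial, trial, trial]
p200 = mean_amplitude(epochs, 170, 230)  # P200 window
n400 = mean_amplitude(epochs, 350, 450)  # N400 window
print(p200, n400)  # 2.0 2.0
```

In the synthetic data the post-stimulus voltage exceeds the baseline by 2 µV, so both windows return 2.0 after baseline correction.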

Results

Behavioral

The reaction times (RTs) associated with correct responses were trimmed following the procedure described by Tabachnick and Fidell (2001) to eliminate univariate outliers (data points that after standardization were 3 SD outside the normal distribution of the data in each trial): 5.45 and 6.28 % of the data were excluded in the first and second trials, respectively. Since we were interested in possible differences between conditions within each trial, the two conditions of the first trial and the second trial were analyzed separately. Therefore, we report firstly the results obtained in the first trial (related 1 condition vs. unrelated 1 condition) and then the results found in the second trial (related 2 condition vs. unrelated 2 condition).
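The trimming criterion can be sketched as follows. This is a minimal illustration assuming that z scores are computed with the sample standard deviation over whichever set of RTs is passed in; the exact grouping used for standardization (for example, per participant or per condition) is not specified beyond "in each trial", and the function name and data are hypothetical.

```python
from statistics import mean, stdev


def trim_rts(rts, criterion=3.0):
    """Keep RTs whose z score lies within +/- criterion SDs of the mean."""
    m, sd = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - m) / sd <= criterion]


# Hypothetical RTs in ms: 19 values between 910 and 1090, plus one at 3000.
rts = [1000 + 10 * i for i in range(-9, 10)] + [3000]
trimmed = trim_rts(rts)
print(len(rts), len(trimmed))  # 20 19
```

With small samples a single extreme value inflates the standard deviation, so a 3-SD criterion can only flag outliers when enough observations are available.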

First trial

We performed ANOVAs on the RTs and percentage of errors with the variable Relation 1 (related 1 and unrelated 1) as a within-subject factor. The RT analysis showed a marginal main effect of Relation 1, F(1, 16) = 4.29, p = .06, η² = .21, so that responses to related 1 trials (1074 ms, SE = 46) were slower than responses to unrelated 1 trials (1051 ms, SE = 45; see Table 2). Moreover, the ANOVA on the percentage of errors showed a significant difference between the related 1 trials (3.14 %, SE = 1.05) and the unrelated 1 trials (0.88 %, SE = 0.48), F(1, 16) = 6.08, p = .03, η² = .28.

Table 2 Behavioral results

Second trial

We performed ANOVAs on the RTs and percentage of errors with the variable Relation 2 (related 2 and unrelated 2) as a within-subject factor. In the RT analysis, we found significant differences between these two conditions, F(1, 16) = 23.73, p < .001, η² = .60, such that responses to related 2 trials (1239 ms, SE = 67) were slower than responses to unrelated 2 trials (1140 ms, SE = 62; see Table 2). However, the ANOVA on the percentage of errors did not show significant differences between the related 2 (3.33 %, SE = 0.77) and unrelated 2 conditions (2.94 %, SE = 0.72), F < 1.
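The effect sizes reported for the two trials are consistent with the standard conversion from an F ratio to partial eta squared, F × df_effect / (F × df_effect + df_error). The sketch below applies that conversion to the reported F values as a check; whether the authors computed their effect sizes this way is an assumption on our part, and the function name is ours.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios reported in the text, all with (1, 16) degrees of freedom.
print(round(partial_eta_squared(4.29, 1, 16), 2))   # 0.21 (RTs, first trial)
print(round(partial_eta_squared(6.08, 1, 16), 2))   # 0.28 (errors, first trial)
print(round(partial_eta_squared(23.73, 1, 16), 2))  # 0.6  (RTs, second trial)
```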

We evaluated the possible relationship between the interference effect found in trial 1 (related 1 minus unrelated 1 condition) and that obtained in trial 2 (related 2 minus unrelated 2 condition). The correlation was not significant, r = −.18, p = .48. In addition, we performed further analyses in order to explore two factors that might determine the effects found in the first and second trials of the study. In the first analysis, we addressed the possible influence of repeating the block of trials across the experiment. A reduction in the interference effect from the beginning to the end of the experiment would indicate that participants adapted to the conflict produced by the coactivation of arithmetic facts. In the first trial, the main effect of problem presentation (first presentation, second presentation and third presentation) was significant, F(2, 32) = 85.55, p < .001, ηp² = .84. This was a practice effect: participants were faster as the experiment advanced (first presentation, M = 1227 ms, SE = 49; second presentation, M = 1033 ms, SE = 45; third presentation, M = 942 ms, SE = 45). However, the Problem presentation × Relation 1 interaction was not significant, F < 1. The same pattern of results was found in the second trial. The main effect of problem presentation was significant, F(2, 32) = 47.01, p < .001, ηp² = .74. Mean reaction times were 1344 ms (SE = 65) in the first presentation, 1163 ms (SE = 65) in the second presentation, and 1071 ms (SE = 68) in the third presentation. The observation of similar interference effects in trials 1 and 2 across the course of the experiment seems to indicate that coactivation is an automatic process which is not subject to adaptation due to practice.

The second factor we considered was problem size. Previous researchers have shown that the associative confusion effect is modulated by problem size with small problems having more automatic and rapid access to the network of arithmetic facts relative to large problems (Lemaire et al. 1994, 1996). In order to consider this variable, we classified the ten problems used in the first trial as small and large based on the size of the operands. The small condition was composed by five problems in which both operands were smaller than five. The larger condition included five problems in which one operand was smaller than five and the other operand was between 5 and 8. In the first trial, the main effect of problem size was significant, F(1, 16) = 9.09, p = .008, η 2p  = .36. Small problems were solved faster (1033 ms, SE = 41) than large problems (1091 ms, SE = 51). Importantly, the Problem size × Relation 1 interaction was significant, F(1, 16) = 11.64, p = .003, η 2p  = .42. When participants solved small problems, the Relation 1 effect was significant, F(1, 16) = 15.67, p < .001, η 2 = .49, with slower responses in the related 1 condition (1067 ms, SE = 44) relative to the unrelated 1 condition (999 ms, SE = 40). However, when participants solved large problems, no significant differences were found between the related 1 condition (1079 ms, SE = 49) and the unrelated 1 condition (1103 ms, SE = 54), F(1, 16) = 1.89, p = .19, η 2 = .10. The fact that the relation 1 effect was modulated by problem size replicates previous studies suggesting that small problems have an automatic access to the network of arithmetic facts and they produce a rapid coactivation of related multiplication problems in memory (Lemaire et al. 1994, 1996). In the second trial, the problem size effect was significant again, F(1, 16) = 32.44, p < .001, η 2p  = .67, with faster responses to small problems (1062 ms, SE = 50) than to large problems (1278 ms, SE = 79). 
The Problem size × Relation 2 interaction was significant, F(1, 16) = 18.55, p < .001, ηp² = .54. When participants solved small problems, they were slower in the related 2 condition (1157 ms, SE = 58) relative to the unrelated 2 condition (967 ms, SE = 50), F(1, 16) = 23.45, p < .001, η² = .59. However, for large problems, the difference between the related 2 condition (1278 ms, SE = 80) and the unrelated 2 condition (1279 ms, SE = 80) was not significant, F < 1. The Problem size × Relation 2 effect found in trial 2 indicates that inhibition was applied when there was coactivation of arithmetic facts (for small problems in trial 1).
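The classification rule for problem size described above can be written as a short sketch. The function name is hypothetical, and returning `None` for operand pairs that fit neither definition is an assumption, since the text does not say how such pairs would be treated.

```python
def classify_problem_size(a, b):
    """Classify an addition problem a + b by the size of its operands.

    small: both operands are smaller than five.
    large: one operand is smaller than five and the other is between 5 and 8.
    Returns None when the pair fits neither definition (an assumption of this
    sketch; the source does not describe such problems).
    """
    low, high = sorted((a, b))
    if high < 5:
        return "small"
    if low < 5 and 5 <= high <= 8:
        return "large"
    return None

# Example problems, chosen for illustration only.
sizes = {pair: classify_problem_size(*pair) for pair in [(2, 4), (3, 7), (6, 7)]}
```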

Event-related potentials

Analyses are reported in the same order in which the components are discussed in the Introduction: the N400-like component first, then the P200. As with the behavioral data, for each component we report the analysis of the first trial and then that of the second trial.

N400-like component

First trial

We performed an ANOVA on the mean amplitude in the 350- to 450-ms time window, with Relation 1 (related vs. unrelated conditions) and ROIs (left frontal, medial frontal, right frontal, left central, medial central, right central, left parietal, medial parietal and right parietal) as within-subject factors. The analysis showed a main effect of Relation 1, F(1, 16) = 4.31, p = .05, ηp² = .21. Furthermore, there was a main effect of ROIs, F(8, 128) = 14.43, p < .001, ε = .20, ηp² = .47. Importantly, the Relation 1 × ROIs interaction effect was significant, F(8, 128) = 10.82, p < .001, ε = .38, ηp² = .40. A posteriori analysis with Bonferroni correction for multiple comparisons was performed to evaluate the Relation 1 effect in all ROIs. The N400-like amplitude was less negative when participants responded to related 1 trials relative to unrelated 1 trials in the left frontal region (p = .004), the medial frontal region (p = .001), the right frontal region (p = .008) and the medial central region (p = .05). The Relation 1 effect was not significant in other regions (all ps > .53; Fig. 2).
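The dependent variable in this ANOVA is a mean amplitude within a time window. A minimal sketch of that measure follows; it assumes a waveform stored as parallel lists of time points and voltages, and treats the window boundaries as inclusive. The function name, the sampling grid and the voltage values are hypothetical.

```python
def mean_amplitude(times_ms, voltages_uv, start_ms, end_ms):
    """Average voltage over all samples with start_ms <= time <= end_ms."""
    window = [v for t, v in zip(times_ms, voltages_uv) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples fall inside the requested time window")
    return sum(window) / len(window)

# Hypothetical waveform sampled every 50 ms, for illustration only.
times = [300, 350, 400, 450, 500]
volts = [1.0, -2.0, -4.0, -3.0, 0.5]
n400_like = mean_amplitude(times, volts, 350, 450)
```

In practice one such value would be obtained per participant, condition and region of interest before running the ANOVA.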

Fig. 2
Grand average ERPs for related 1 condition (i.e., 2 + 4 = 8) and unrelated 1 condition (i.e., 2 + 4 = 10) of the first trial

Second trial

The ANOVA in the second trial with Relation 2 and ROIs as within-subject factors did not show a main effect of Relation 2, F < 1. There was a main effect of ROIs, F(8, 128) = 8.63, p = .002, ε = .20, ηp² = .35. The Relation 2 × ROIs interaction effect was not significant, F(8, 128) = 1.43, p = .24, ε = .44, ηp² = .08.

P200 component

First trial

The ANOVA on the mean amplitude in the 170- to 230-ms time window with Relation 1 and ROIs as within-subject factors did not show a main effect of Relation 1, F < 1. There was a main effect of ROIs, F(8, 128) = 16.46, p < .001, ε = .22, ηp² = .51. The Relation 1 × ROIs interaction was not significant, F < 1.
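The partial eta squared values reported alongside each F test can be recovered from the F value and its uncorrected degrees of freedom with the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the ROIs main effect reported above; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# ROIs main effect in the P200 window of the first trial: F(8, 128) = 16.46
eta_rois = round(partial_eta_squared(16.46, 8, 128), 2)
```

The same identity can be used to cross-check the other effect sizes in this section against their F values.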

Second trial

We performed an ANOVA on the mean amplitude with Relation 2 and the ROIs as within-subject factors. The analysis did not show a main effect of Relation 2, F(1, 16) = 2.39, p = .14. There was a main effect of ROIs, F(8, 128) = 13.57, p < .001, ε = .19, ηp² = .46. Moreover, the Relation 2 × ROIs interaction showed a trend toward significance, F(8, 128) = 2.57, p = .07, ε = .34, ηp² = .14. A posteriori analysis with Bonferroni-corrected probabilities showed a marginal Relation 2 effect in the medial frontal region (p = .07). The amplitude of the P200 component seemed to be more positive in the related 2 condition compared to the unrelated 2 condition. The Relation 2 effect was not significant in any other region (all ps > .90; see Fig. 3).

Fig. 3
Grand average ERPs for related 2 condition (i.e., 2 + 6 = 8) and unrelated 2 condition (4 + 6 = 10) of the second trial

We explored the possible relationship between the N400-like attenuation associated with the Relation 1 effect and the increased P200 positivity associated with the Relation 2 effect. To this end, we computed the N400-like effect in the first trial (related 1 vs. unrelated 1) and the P200 effect found in the second trial (related 2 vs. unrelated 2). There was a positive correlation between these two electrophysiological indexes (r = .75, p = .02). Thus, when the N400-like component increased its attenuation in the first trial, the P200 potential increased its positivity in the second trial.

Nonparametric permutation testing

Nonparametric permutation tests were performed to evaluate possible differences that were not captured, or were difficult to index (e.g., the marginal P200 effect in trial 2), with the standard parametric tests reported above. To this end, we compared the averaged ERP waveforms of the conditions in trial 1 (related 1 vs. unrelated 1) and trial 2 (related 2 vs. unrelated 2) at a single electrode (Fz) that had been sensitive to the differences between conditions in the parametric tests. We examined possible differences between conditions over time (10- to 550-ms time window, time-locked to stimulus onset) by performing a separate test every 10 ms. The Type I error rate associated with the large number of statistical comparisons (55 contrasts in trial 1 and 55 contrasts in trial 2) was controlled with nonparametric cluster-based permutation tests (Maris and Oostenveld 2007). The permutation procedure began in the same way as the usual parametric tests, by computing a test statistic for the observed data in the related and unrelated conditions of trial 1 and trial 2 (the nonparametric Wilcoxon t test was used). All cluster-level statistics, defined as the sum of t values within each cluster, were evaluated under the permutation distribution of the maximum (minimum) cluster-level statistic. This permutation distribution was approximated by drawing 9999 random permutations of the observed data. The obtained p values represented the probability, under the null hypothesis (no difference between the related and the unrelated conditions in trial 1 and trial 2), of observing a maximum or minimum cluster-level statistic that was larger or smaller (respectively) than the observed cluster-level statistics. Figure 4 shows the results of these analyses (a detailed table of Wilcoxon t test values and permutation p values approximated by Monte Carlo estimation is provided in the electronic supplementary material, Online Resource 1).
The pattern of results was similar to that found with parametric tests, but it delimited the time windows in which the N400-like effect in trial 1 and the P200 effect in trial 2 were found. Specifically, differences between the related 1 and unrelated 1 conditions in the N400-like component were restricted to time sample-specific contrasts from 400 ms to 440 ms. In the second trial, differences between the related 2 and unrelated 2 conditions associated with the P200 component were limited to contrasts from 190 ms to 220 ms. Hence, the marginal P200 effect found in trial 2 with parametric analyses became significant with permutation testing.
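The general logic of a cluster-based permutation test on a single-electrode waveform can be sketched as follows. This is an illustration under stated assumptions, not the analysis pipeline used in the study: it substitutes a paired t statistic for the Wilcoxon statistic, builds the null distribution by randomly flipping the sign of each participant's difference waveform, and refers each cluster to the distribution of the maximum absolute cluster sum instead of handling maximum and minimum separately. All function names, the threshold value and the data are hypothetical.

```python
import math
import random

def t_values(diffs):
    """Paired t statistic at each time point (rows = participants)."""
    n = len(diffs)
    out = []
    for column in zip(*diffs):
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / (n - 1)
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out

def cluster_sums(tvals, threshold):
    """Sum t values over runs of adjacent, same-sign, supra-threshold points."""
    sums, current, sign = [], 0.0, 0
    for t in tvals:
        s = (t > threshold) - (t < -threshold)  # +1, -1 or 0
        if s != 0 and s == sign:
            current += t  # extend the open cluster
        else:
            if sign != 0:
                sums.append(current)  # close the open cluster
            current, sign = (t, s) if s != 0 else (0.0, 0)
    if sign != 0:
        sums.append(current)
    return sums

def cluster_permutation_test(diffs, threshold, n_perm=999, seed=0):
    """Return (cluster sum, Monte Carlo p value) for each observed cluster."""
    rng = random.Random(seed)
    observed = cluster_sums(t_values(diffs), threshold)
    null_max = []
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            flip = rng.choice((-1.0, 1.0))  # swap condition labels for one participant
            flipped.append([flip * v for v in row])
        sums = cluster_sums(t_values(flipped), threshold)
        null_max.append(max((abs(s) for s in sums), default=0.0))
    return [
        (s, (1 + sum(m >= abs(s) for m in null_max)) / (n_perm + 1))
        for s in observed
    ]

# Hypothetical difference waveforms (related minus unrelated), 8 participants
# by 6 time points, with a consistent effect at time points 2 and 3 only.
data = []
for i in range(8):
    noise = 1.0 if i % 2 == 0 else -1.0
    effect = 2.0 + 0.1 * i
    data.append([noise, noise, effect, effect, noise, noise])

# 2.365 is the two-tailed critical t for df = 7 at alpha = .05.
results = cluster_permutation_test(data, threshold=2.365)
```

Each entry of `results` pairs one observed cluster with its permutation p value. Because only the largest cluster statistic of each permutation enters the null distribution, the p values are corrected for the number of time points tested.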

Fig. 4
Nonparametric permutation testing of ERP waveforms at a single electrode (Fz) obtained in trial 1 (upper graph) and trial 2 (lower graph). Time sample-specific contrasts were performed every 10 ms (10- to 550-ms time window, time-locked to stimulus onset). Vertical lines represent the contrasts for which the statistic value exceeded the critical value corresponding to an alpha level of .05

Discussion

During the 1970s, an associative confusion effect was observed in mental arithmetic: verifying an addition problem presented with an incorrect result that was the product of the operands (2 + 4 = 8) was difficult (Winkelman and Schmidt 1974; Zbrodoff and Logan 1986). It was assumed that this effect reflected the existence of an interrelated network of arithmetic facts in semantic memory: multiplication facts are activated even when individuals resolve addition problems (Ashcraft 1992). Although this assumption has been widely accepted in cognitive arithmetic (Grabner et al. 2013; Lemaire et al. 1991; Winkelman and Schmidt 1974; Zbrodoff and Logan 1986), direct empirical evidence was still needed.

In the first trial of our study, we replicated the associative confusion effect at the behavioral level. Participants took more time to verify an incorrect addition problem whose result was that of multiplying the operands (2 + 4 = 8, the related 1 condition) compared to an incorrect addition problem whose result was unrelated (2 + 4 = 10, the unrelated 1 condition). We also observed that the associative confusion effect was modulated by problem size (Lemaire et al. 1994, 1996), suggesting that small problems had automatic and rapid access to the network of arithmetic facts. Importantly, electrophysiological analyses helped us to determine the time course of the processes underlying the associative confusion effect. For the N400-like component, two opposing predictions were established in the Introduction: coactivation in semantic memory would be associated with N400 attenuation, while competition in the network of arithmetic facts would produce a large N400 effect. The results found in the 350–450 ms time window showed that the N400-like amplitude was less negative in frontal–central regions in the related 1 condition relative to the unrelated 1 condition. This pattern of results corroborates that the associative confusion effect involves the coactivation of related addition and multiplication facts in semantic memory. Moreover, in order to strengthen the conclusion that the results found in trial 1 were associated with coactivation of addition and multiplication facts, we considered the frequency with which participants performed additions and multiplications in everyday life and the degree to which they had learned the multiplication problems by rote (orally). These variables might foster the coactivation of arithmetic facts. We found that a high frequency of addition resolution was associated with a large N400 effect in the right anterior region (r = .49, p = .04; no other correlations were significant).
Hence, N400-like effects in trial 1 seem to be related to activation in the network of arithmetic facts.

To our knowledge, this is the first study in which the associative confusion effect has been indexed with electrophysiological markers. However, other studies have reported N400-like modulations as evidence of coactivation of arithmetic facts (Domahs et al. 2007; Jost et al. 2004; Niedeggen and Rösler 1996, 1999; Niedeggen et al. 1999). In these studies, an attenuation of the N400 amplitude was found along with behavioral interference when individuals resolved a multiplication whose result was incorrect but related (it was a multiple of one operand; 5 × 8 = 32) compared to an unrelated condition (5 × 8 = 34; see Footnote 2). The critical difference between this previous work and the research presented here is that the former offered evidence of coactivation within operations (several related multiplication facts are activated together), while our study demonstrates coactivation of related arithmetic facts across operations (additions and multiplications). It is important to note that the N400-like attenuation found in our study and in those exploring coactivation effects in mental arithmetic is accompanied by behavioral interference (slower responses in related problems relative to unrelated problems). The behavioral interference is interpreted as a consequence of a late competition process after the coactivation of irrelevant multiplication facts, a process that is not captured with EEG measures. The same dissociation between N400 amplitudes and response times, and a similar interpretation of this dissociation (semantic coactivation and late competition), have been offered in other fields (language production, Blackford et al. 2012). The main point to highlight from the first trial of our study is that coactivation of related arithmetic facts across operations (additions and multiplications) underlies the associative confusion effect in simple arithmetic.

In our study, we also wanted to gather electrophysiological evidence of the consequences of selecting arithmetic facts. The results found in the second trial showed that participants were slower to verify an addition problem whose result was that of multiplying the operands of the first trial (the related 2 condition: 2 + 6 = 8, preceded by 2 + 4) compared to an unrelated condition (the unrelated 2 condition: 4 + 6 = 10, preceded by 2 + 4). This interference effect has been found in previous research (Megías et al. 2014; Megías and Macizo 2015a, b), and it has been interpreted as the result of inhibiting irrelevant arithmetic facts: to resolve the competition between addition and multiplication facts in the first trial, the incorrect multiplication result (8) was inhibited in order to select the correct addition result (6). Hence, when the inhibited result was presented again and was relevant for performing the second trial (2 + 6 = 8), additional time was required to retrieve it from semantic memory. Importantly, the interference effect in trial 2 was modulated by the size of the addition problems, indicating that inhibition was applied when there was coactivation of arithmetic facts in the first trial (e.g., when participants solved small additions).

When the electrophysiological pattern was considered in the second trial, we observed that the P200 amplitude was larger in the medial frontal region in the related 2 condition relative to the unrelated 2 condition. As stated in the Introduction, it is difficult to offer a unique interpretation of P200 modulations since this component is related to several cognitive processes. To illustrate, the P200 amplitude varies as a function of the visual complexity of the stimulus in language processing (Dehaene 1995; McCandliss et al. 1997). However, this factor cannot account for the P200 pattern found here since the addition problems were presented in the same visual format (Arabic digits) in all conditions. It could also be argued that the differences we found in the P200 amplitude were related to magnitude processing. For example, P200 amplitudes are sensitive to the distance effect in comparison tasks, with numbers close to the numerical standard eliciting a larger P200 amplitude than numbers far from the standard (Turconi et al. 2004; see also Hyde and Spelke 2009; Hyde and Wood 2011; for P200 modulations in non-symbolic comparison tasks). Nevertheless, this explanation would not account for the results found in our study since the magnitude of the addition results presented in the second trial was equated in the related 2 and unrelated 2 conditions (problem size), as was the distance between these results and those presented in the previous trial.

Although tentative, we suggest that the P200 modulations found in our study were associated with the difficulty in the encoding and retrieval of arithmetic facts when they were irrelevant in the previous trial (see Footnote 3). As we explained in the Introduction, P200 modulations have been related to the ease with which semantic information is retrieved from semantic memory (Dunn et al. 1998; Raney 1993; Smith 1993). A large P200 amplitude in anterior regions is associated with difficulty in encoding stimuli to access semantic memory, while a posterior P200 seems to reflect the complete retrieval process in long-term memory. The medial frontal P200 effect found in the second trial of our study hence suggests that it is difficult to encode the result of an addition problem that was presented previously in an irrelevant condition (i.e., it was the result of multiplying the operands of the addition presented before). Support for this interpretation comes from the correlation between the N400 modulations found in the first trial and the P200 effect observed in the second trial. A greater N400 modulation was associated with a greater P200 effect. As noted above, the N400 attenuation found in trial 1 did not reflect competition but coactivation of arithmetic facts. Thus, the N400–P200 correlation seems to indicate that the difficulty in encoding the result of an arithmetic problem in trial 2 depends on the degree to which it was activated in an irrelevant context (it was an incorrect result) in the preceding trial.

In previous studies, other ERP components have been associated with inhibition in mental arithmetic. Galfano et al. (2011) conducted an ERP study to evaluate inhibition during the resolution of multiplication problems. The authors observed a retrieval-induced forgetting (RIF) effect: participants took longer to verify multiplication problems (e.g., 3 × 5 = 15) that included an operand previously practiced with a different multiplication (e.g., 2 × 5 = 10), relative to a condition with problems whose operands were not presented before (e.g., 3 × 4 = 12). This RIF effect was interpreted as a consequence of inhibiting related multiplication problems in the practice phase. The forgetting condition was associated with a reduced P350 component. A close visual examination of the results found in the second trial of our study seemed to suggest P350 modulations in left parietal regions. However, voltage amplitudes were in the opposite direction (more positive in the related 2 condition) and the differences were not significant. Differences between studies might be related to the fact that Galfano et al. evaluated a long-term inhibitory effect (inhibition was applied in a practice phase, and it was indexed in a subsequent test phase). In contrast, we evaluated the consequences of applying inhibition trial by trial, and they were measured directly after inhibition was applied to resolve competition. Future research is needed to disentangle similarities and differences between the inhibition found with the retrieval practice paradigm (Galfano et al. 2011) and that evaluated in the current study with a negative priming-like paradigm.

To conclude, this study shows that the presence of an associative confusion effect in decision times is related to N400-like modulations, which support the underlying coactivation of arithmetic facts in semantic memory. Moreover, once the addition problem is resolved, P200 modulations might suggest that it is difficult to encode a subsequent addition problem whose result was previously irrelevant.