1 Introduction

There is no universally accepted definition of mathematical giftedness (e.g., Davis & Rimm, 2004). Usually the concept of mathematical giftedness is linked either to high performance in mathematics or to the construct of general giftedness (Leikin, 2014). Analysis of previous studies on giftedness in general, and on mathematical giftedness in particular, demonstrates that the studies differ in sampling criteria (if there are any) and that in most cases high achievement in school mathematics is considered an indicator of mathematical giftedness (Ziegler & Raul, 2000). The educational literature treats the terms “mathematical giftedness”, “high mathematical abilities”, and “excellence in mathematics” as synonymous (Juter & Sriraman, 2011). The qualities of individuals with high mathematical abilities are usually linked to qualities in mathematical processing and problem solving, including advanced logical reasoning (Krutetskii, 1976). Following these observations, our study draws a distinction between general giftedness (G) and excellence in school mathematics (EM) as two constructs that are interrelated but different in nature (see also Leikin, 2013; Leikin, Waisman & Leikin, 2013; Paz-Baruch, Leikin, Aharon-Peretz & Leikin, 2014).

Studies that focus on cognitive characteristics of generally gifted (G) students demonstrate that they are significantly better than non-gifted (NG) individuals on tasks associated with analogical thinking, acquisition of new information, application of different problem-solving strategies (Steiner & Carr, 2003), short-term memory, and visual matching, and that they exhibit a higher speed of information processing (Paz-Baruch, Leikin, & Leikin, 2013; Paz-Baruch et al., 2014). Jensen (2006) demonstrated a high correlation between IQ and reaction time (RT) on chronometric tasks and maintained that these tasks can be used as measures of IQ.

Some studies demonstrate distinctions between thinking processes of moderately and highly intellectually gifted children. For example, Silverman (1989) defined super gifted (S-G) individuals as demonstrating significantly higher performance on different types of tasks. The abovementioned differences between G and NG individuals have been shown to be even more pronounced in the group of S-G individuals (Gross, 2009). Additionally, S-G individuals exhibit significantly higher performance than moderately gifted individuals on tasks requiring working memory and pattern recognition (Leikin, Paz-Baruch & Leikin, 2014). Leikin et al. (2014) found three types of cognitive characteristics of super mathematically gifted (S-MG) students: unique, G-dependent and accumulative characteristics. Unique characteristics are those in which no significant differences were identified when comparing generally gifted students who excel in mathematics (G-EM) with non-gifted students who excel in mathematics (NG-EM), whereas S-MG participants differed significantly from students in the other two groups. G-dependent characteristics revealed significant differences between NG-EM students and the two other groups of students who excel in mathematics, with no significant differences between G-EM and S-MG students. In accumulative characteristics, S-MG participants outperformed G-EM students, and G-EM students outperformed those from the NG-EM group, in both cases non-significantly; the differences only became significant when comparing the S-MG students with their NG-EM counterparts.

Over the years, several studies have been carried out to investigate the neurophysiologic basis of giftedness in general and mathematical giftedness in particular (with the mathematically gifted population mainly identified through high achievements in mathematics). Studies using fMRI methodology show that the brains of gifted individuals exhibit enhanced development and activation of the right hemisphere (Prescott, Gavrilescu, Cunnington, O’Boyle & Egan, 2010) as well as enhanced brain connectivity (Desco et al., 2011) and an ability to activate task-appropriate regions in a well-orchestrated and coordinated manner (O’Boyle, 2005) when performing mental rotations or other non-mathematical tasks. Using EEG, Gevins and Smith (2000) showed that values of the amplitude of brain activation, as well as topographic patterns, are associated with levels of general intelligence.

There is strong evidence that individuals of higher intelligence mostly exhibit lower frontal brain activation than individuals of lower intelligence (Neubauer & Fink, 2009); this is known as the neural efficiency hypothesis. The majority of these studies were conducted using EEG to measure brain activation (ERD signifiers). Nevertheless, Grabner, Neubauer, & Stern (2006) argue that whereas the generality of the neural efficiency effect has been qualified during performance of a variety of cognitive demands, “a moderating variable seems to be the degree to which the tasks draw on contents stored in long-term memory… (as) good memory performers displayed a higher ERD in the upper alpha band, while the opposite pattern was observed when presenting tasks which drew less on long-term memory” (ibid., p. 423). Note also that the neural efficiency effect seems to be related to task difficulty. That is, neural efficiency might arise when individuals are faced with tasks of (subjectively) low to moderate difficulty; however, when the task complexity rises to a high level of difficulty, highly intelligent individuals probably devote more cortical resources than less intelligent individuals (Dunst et al., 2014). Neubauer and Fink (2009) argued that neuroscientific training studies suggest that neural efficiency seems to be a function of the amount and quality of learning. Additionally, studies using ERP methodology reported that more intelligent individuals had shorter ERP latencies (e.g., Jolij et al., 2007) and lower ERP amplitudes (Polich & Kok, 1995), whereas Hill et al. (2011) and Zhang et al. (2007) found larger ERP amplitudes (especially P300) in individuals of higher intelligence. Based on this literature review we might expect each of the G and EM factors to lead to neural efficiency when solving mathematical problems. However, in our previous study (Waisman, Leikin, Shaul, & Leikin, 2014) we found the lowest electrical potentials (that we connect to the neural efficiency effect) in the group of generally gifted students who excel in mathematics when solving function-related tasks.

De Smedt and Verschaffel (2010) argued that since there are large individual differences in mathematics-related cognitive processes, cognitive neuroscience might be used to help gain a better understanding of these differences in mathematical performance. This may be even more critical for the investigation of such a previously undefined phenomenon as mathematical giftedness. Specific components of event-related potentials (ERPs) are considered to be related to different cognitive processes. Since the ERP technique facilitates tracking the continuity of participants’ on-line cognitive activity (Neville, Coffey, Holcomb, & Tallal, 1993), we employed ERPs with a focus on the behavioral and electrophysiological measures associated with logical inference processing in geometry by groups of participants that were distinct in levels of G and EM characteristics.

One of the characteristics of students with high mathematical ability is the logical reasoning they exhibit with regard to qualitative and spatial relationships (Krutetskii, 1976; Deal & Wismer, 2010; Gardner, 2011). At the same time, previous studies have demonstrated that the superior reasoning ability among mathematically gifted individuals is supported by a differently organized brain network (e.g., Desco et al., 2011). Following these observations, we decided to examine the brain activity of students when solving geometric tasks that require logical inference.

Development of logical reasoning plays an important role in the learning of mathematics and vice versa (Durand-Guerrier, Boero, Douek, Epp, & Tanguay, 2012). In this context, mathematics educators and researchers acknowledge the importance of geometry in the mathematics curriculum because of its power to foster logical and visual thinking (Ayalon & Even, 2010; Polya, 1973). There is a close relationship between logical and deductive reasoning—the process by which a person draws conclusions (logically necessary inferences) based on previously known facts (givens or premises) (Nunes et al., 2007) as, for example, in mathematical proofs and proving (Hanna & De Villiers, 2012).

Some studies in mathematics education have focused on students’ development of deductive reasoning (e.g., Hoyles & Kuchemann, 2002) or classification of geometrical objects (quadrilaterals) as an instance of deductive reasoning in geometry (Fujita, 2012). Different studies suggest that learners often have difficulty with the formal definitions of shapes, including an understanding of the hierarchical relationships between them (Erez & Yerushalmy, 2006; Fujita, 2012). It was also found that students’ geometrical reasoning is often influenced considerably by concept images of geometric figures (Vinner, 1991).

In the present study, we attempted to analyze the brain processing associated with logical reasoning in geometry (problems on quadrilaterals), which has been generally overlooked in current neuro-cognitive research. Previous neuro-cognitive research on logical inferences (mostly fMRI studies) has mainly used tasks requiring deductions from premises that are not specific to mathematics, for example, “If a figure is a square then it is red. The figure is a square. Therefore it is red” (Qiu et al., 2007). The results pointed to variability in the location and the strength of brain activity associated with logical reasoning (Mathieu, Booth, & Prado, 2015).

However, it appears that the noted brain activity depends on the experimental procedure (Prado, Chadha & Booth, 2011). Only a few EEG studies have addressed the neural correlates of deductive reasoning (Qiu et al., 2007; Bonnefond & Van Der Henst, 2013). The majority of them attempted to characterize the ERP components elicited by the processing associated with deductive arguments. Bonnefond and Van Der Henst (2013) found N2 to be most pronounced and P3b to be least pronounced in response to linguistic mismatches between the minor premise and the conclusion, while P3b was more pronounced when the minor premise and the conclusion matched. They interpreted their results by suggesting that participants had already drawn the conclusion, based on the major premise presented.

Note that most of the previous studies that examined neural correlates of logical reasoning focused on adult participants. They did not address logical reasoning in geometry, nor did they examine individual differences related to varying levels of mathematical abilities. Our study attempts to fill these gaps.

1.1 The study hypotheses

Based on the literature review presented above and the results of our previous studies we hypothesized that G and EM are characteristics which are interrelated but different in nature, and we examined the following hypotheses:

Hypothesis 1:

G and EM factors have a significant effect on accuracy (Acc), reaction time for correct responses (RTc) and ERP measures: G students outperform NG students, while EM students outperform students who do not excel in mathematics (NEM). This outperformance will be reflected in (a) higher Acc, (b) shorter RTc, (c) shorter latencies of early components, and (d) lower absolute ERP amplitude values of late potentials (as related to the neural efficiency effect);

Hypothesis 2:

Varying combinations of G and EM factors lead to differences in performance in the four major groups of participants both in behavioral and electrophysiological measures. Students in the G-EM group outperform students from the NG-EM, G-NEM and NG-NEM groups. This outperformance will be reflected in (a) higher Acc, (b) shorter RTc, (c) shorter latencies of early components, and (d) lower absolute ERP amplitude values of late potentials;

Hypothesis 3:

Among EM students, S-MG students will demonstrate superior performance reflected in (a) higher Acc, (b) shorter RTc, and (c) lower absolute ERP amplitude values of late potentials.

2 Methods

This study aimed to find similarities and differences in behavioral and electrophysiological measures associated with solving short geometric problems that require logical inferences in students who differed in the combination of general giftedness (G) and excellence in mathematics (EM). As presented in Sect. 2.1, our study included four major groups of participants, who differed in combinations of G and EM factors, as well as a small fifth group of students whom we called super-mathematically gifted (S-MG). In the behavioral dimension, we examined Accuracy (Acc) and Reaction time for correct responses (RTc), while in the electrophysiological dimension we examined amplitudes, latencies, and scalp topographies identified with the ERP procedure.

2.1 Participants

This paper reports results for 74 participants from five research groups, as described below.

The sampling procedure was aimed at forming groups that differed in EM and G factors. High-school mathematics in Israel is taught at three levels: high, regular and low, while the level of instruction is determined by students’ mathematical achievements in earlier grades. A research population of 1200 students who studied mathematics in 10th and 11th grades (aged 16–18) at high level (HL) or regular level (RL) were examined using Raven’s Advanced Progressive Matrix Test (RPMT) (Raven, Raven & Court, 2000) for general intelligence and the SAT-M (Scholastic Assessment Test in Mathematics) for mathematical excellence. We used a shortened Raven test containing 30 items with a 15-min time limit and a short version of the SAT-M that contained 35 items with a time limit of 30 min (Zohar, 1990).

G students were chosen from high-school classes for gifted students (identified by a national examination in the third grade as having an IQ > 130). Additionally, we used the Raven test to validate the G factor in students from the classes for gifted students and to include students from regular classes in the G group. Students from classes for gifted students were included in the research sample if their Raven scores were above 26 (of 30). To exclude gifted students from the group of excelling students who were not identified as gifted, we used a Raven score of 28 (of 30) as the lower bound for inclusion in the G group (1 % of the general population). Students included in the EM group studied HL mathematics with scores above 90 (of 100, based on their teachers’ evaluation). An SAT-M score above 26 was chosen as a control measure for the EM sampling (2 % of the general population); when the school mathematics score at HL was above 94, the SAT-M criterion was waived. Students grouped as NEM were those who studied mathematics either at HL with scores below 80 or at RL with scores above 90, and who received an SAT-M score below 22.
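The inclusion criteria above can be summarized as a small decision rule. The following is only an illustrative sketch, not the authors' procedure: the `Student` record, the function names, and the treatment of every student who misses the G criterion as NG are assumptions, as is the reading that the SAT-M cutoff of 22 applies to both NEM branches.

```python
from dataclasses import dataclass


@dataclass
class Student:
    """Hypothetical record; field names are illustrative only."""
    gifted_class: bool  # attends a class for gifted students
    raven: int          # shortened Raven score (0-30)
    math_level: str     # "HL" (high level) or "RL" (regular level)
    math_score: int     # teacher-assigned school score (0-100)
    sat_m: int          # short SAT-M score (0-35)


def is_g(s: Student) -> bool:
    # Gifted-class students need a Raven score above 26;
    # students from regular classes need at least 28 (assumed reading).
    return s.raven > 26 if s.gifted_class else s.raven >= 28


def em_status(s: Student):
    """Return "EM", "NEM", or None when neither criterion is met."""
    if s.math_level == "HL" and s.math_score > 94:
        return "EM"  # SAT-M criterion waived for very high HL scores
    if s.math_level == "HL" and s.math_score > 90 and s.sat_m > 26:
        return "EM"
    low_hl = s.math_level == "HL" and s.math_score < 80
    high_rl = s.math_level == "RL" and s.math_score > 90
    if (low_hl or high_rl) and s.sat_m < 22:
        return "NEM"
    return None


def assign_group(s: Student):
    """Combine both factors into one of the four major group labels."""
    status = em_status(s)
    if status is None:
        return None  # student falls outside the sampling criteria
    return ("G" if is_g(s) else "NG") + "-" + status
```

For example, under these assumptions a gifted-class student with a Raven score of 27, an HL score of 92 and an SAT-M score of 30 would be labeled G-EM.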

The sample of 200 students (male and female) comprised four major groups categorized by varying combinations of EM and G factors.

In this paper we report findings for 67 right-handed male adolescents (mean age 16.01, SD = 0.73, with no significant between-group differences) for whom the EEG data were collected without excessive noise and after exclusion of outliers: 16 G-EM students who were identified as generally gifted and excelling in school mathematics; 19 G-NEM students who were identified as generally gifted but did not excel in mathematics; 17 NG-EM students who excelled in mathematics but were not identified as generally gifted; and 15 NG-NEM students who were identified as neither generally gifted nor excelling in mathematics.

The fifth research group included 7 “super mathematically-gifted” (S-MG) right-handed male students who were characterized by their mathematics professors as having extraordinary mathematical abilities. While attending senior high school (grades 10–12), these students additionally studied mathematics or computer science courses at a university, achieving scores above 95 (of 100), or were members of the International Computer Science Olympiad team. All S-MG participants were identified as generally gifted. These students could be considered to represent 1/20,000 of the general population.

2.2 Materials and procedure

A computerized test of 60 tasks, based on content from the Israeli curriculum that was taught to all study participants in similar ways, was designed using E-Prime software (Schneider, Eschman, & Zuccolotto, 2002). The task design was motivated by Polya’s (1973) theory of problem-solving strategies, which includes four stages: understanding the problem, devising a plan, carrying out the plan, and looking back. In our study each task was presented in three stages: S1–Problem condition, S2–Question presentation, and S3–Suggested answer. While S1 and S2 correspond to understanding the problem and devising a plan, S3 corresponds to the looking-back stage in Polya’s model. The stage of carrying out the plan/performing a solution was analyzed through ERP measures. All tasks were presented visually at the center of the computer screen and were displayed in black characters on a grey background within a white 5 cm × 5 cm square. The sequence of events and an example of the task are presented in Fig. 1.

Fig. 1 The sequence of events and a task example

At S1 a set “X” of properties of a family of quadrilaterals was presented verbally. At S2, in order to design a logical chain, one of two types of questions appeared: (i) “the quadrilateral is…” or (ii) “the quadrilateral can be…”. At S3 the suggested answer, a quadrilateral “Y”, was displayed. For questions of type (i), participants were required to examine whether X is a sufficient condition for Y. For questions of type (ii), participants were asked to ascertain whether Y can belong to the family of quadrilaterals determined by X. That is, at S3 participants examined the correctness of the logical inference formed across the three stages: S1, S2, and S3.

S1 was allotted 4000 ms followed by a 1000 ms inter-stimulus interval with a grey computer screen. S2 was allotted 1500 ms followed by a 1000 ms inter-stimulus interval. S3 was allotted 5000 ms during which each subject had to decide whether or not the suggested answer was correct by pressing an appropriate button on the keyboard. Reaction time was defined as the period between the onset of S3 and pressing the appropriate button.
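The trial timing described above can be laid out as a simple timeline. The sketch below is illustrative only: the names `STAGES`, `build_timeline` and `reaction_time` are hypothetical, and it assumes that each stage lasts its full allotted duration, so that the S3 window is an upper bound on response time.

```python
# Stage durations in milliseconds, taken from the description above.
STAGES = [
    ("S1", 4000),     # problem condition
    ("ISI-1", 1000),  # grey screen
    ("S2", 1500),     # question presentation
    ("ISI-2", 1000),  # inter-stimulus interval
    ("S3", 5000),     # suggested answer (response window)
]


def build_timeline(stages):
    """Map each stage name to its (onset, offset) in ms from trial start."""
    timeline = {}
    t = 0
    for name, duration in stages:
        timeline[name] = (t, t + duration)
        t += duration
    return timeline


TIMELINE = build_timeline(STAGES)


def reaction_time(press_ms, timeline=TIMELINE):
    """Reaction time relative to S3 onset, or None outside the S3 window."""
    onset, offset = timeline["S3"]
    if onset <= press_ms <= offset:
        return press_ms - onset
    return None
```

Under these assumptions S3 begins 7500 ms after trial onset, and a key press 9000 ms into the trial corresponds to a reaction time of 1500 ms.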

Time periods and the sequence of events were determined by a pilot study (Leikin, Leikin, Waisman, & Shaul, 2013). Cronbach’s alpha, computed on the accuracy data, was found to be sufficiently high (α = .722).
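For readers unfamiliar with the reliability coefficient, the standard formula for Cronbach’s alpha is α = k/(k − 1) · (1 − Σ item variances / variance of total scores), where k is the number of items. The sketch below applies it to a tiny hypothetical matrix of 0/1 accuracy scores (rows are participants, columns are tasks); the data and the function name are illustrative and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a participants-by-items score matrix."""
    k = len(scores[0])                       # number of items
    items = list(zip(*scores))               # one tuple of scores per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical data: 4 participants, 3 tasks scored correct (1) or not (0).
example_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
example_alpha = cronbach_alpha(example_scores)
```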

Scalp voltages were continuously recorded using a 64-channel BioSemi ActiveTwo system (BioSemi, Amsterdam, The Netherlands) and ActiveView recording software. Pin-type electrodes were mounted on a customized BioSemi head-cap, arranged according to the 10–20 system. Two flat electrodes were placed on the sides of the eyes to monitor horizontal eye movement. A third flat electrode was placed underneath the left eye to monitor vertical eye movement and blinks. During the session, electrode offset was kept below 50 µV. The EEG signals were amplified and digitized with a 24-bit AD converter at a sampling rate of 2048 Hz (approximately 0.5 ms time resolution). Figure 2 depicts the location of the electrodes and the selected electrode sites (see Sect. 2.3 for data analysis).

Fig. 2 The selected electrode sites used for the statistical analysis

2.3 Data analysis

Both behavioral and electrophysiological analyses were performed for trials with correct responses in two steps: Step A and Step B.

At Step A the analysis focused on the similarities and differences of the four major groups of participants: G-EM, NG-EM, G-NEM and NG-NEM. We analyzed between-group differences using ANOVA for G and EM factors. The subsequent pair-wise comparisons (G vs. NG in EM and NEM groups and EM vs. NEM in G and NG groups separately) were performed only when significant interactions were obtained.

For all analyses, p values were corrected for deviation from sphericity according to the Greenhouse–Geisser method, when appropriate.

At Step B the analysis was directed at the identification of the specific characteristics of the S-MG participants. This analysis was performed for three groups of students, all of whom excel in mathematics: S-MG, G-EM and NG-EM students. We performed a non-parametric (Kruskal–Wallis) test with subsequent Mann–Whitney tests for pair-wise comparison between the groups.
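To make the Step B omnibus test concrete, the sketch below computes the Kruskal–Wallis H statistic with the usual tie correction in plain Python. It illustrates the general technique only; the function names and the example values are hypothetical, and the software actually used by the authors is not stated. In practice a statistics package (for instance `scipy.stats.kruskal` and `scipy.stats.mannwhitneyu`) would also supply the p values.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of samples."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    offset = 0
    for g in groups:
        rank_sum = sum(ranks[offset:offset + len(g)])
        weighted += rank_sum ** 2 / len(g)
        offset += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


# Hypothetical amplitudes for three groups (not data from the study).
example_h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```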

For pair-wise comparisons at Step A and Step B, p values were adjusted for multiple comparisons according to the Bonferroni adjustment (i.e., the significance cutoff is .05 divided by the number of comparisons performed).
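The Bonferroni rule described above amounts to a one-line computation. The sketch below is illustrative: the function names are hypothetical, and the example assumes three pair-wise comparisons (as would arise among three groups), with made-up p values.

```python
def bonferroni_cutoff(alpha, n_comparisons):
    """Per-comparison significance cutoff under the Bonferroni adjustment."""
    return alpha / n_comparisons


def significant(p_values, alpha=0.05):
    """Flag each p value that falls below the adjusted cutoff."""
    cutoff = bonferroni_cutoff(alpha, len(p_values))
    return [p < cutoff for p in p_values]


# Hypothetical p values for three pair-wise comparisons.
example_flags = significant([0.01, 0.02, 0.04])
```

With three comparisons the cutoff is .05 / 3 ≈ .0167, so only the first of the three hypothetical p values would count as significant.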

2.3.1 Behavioral analysis

Accuracy (Acc) and Reaction time for correct responses (RTc) were analyzed both at Step A and Step B. Acc was determined by the participant’s percentage of correct responses to the 60 tasks on the test. RTc was calculated, per person, as the mean time spent on verification of an answer (S3) across all correctly answered trials on the test.
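The two behavioral measures can be sketched as follows. This is an illustration under stated assumptions, not the authors' scoring script: the trial representation (a pair of correctness flag and reaction time in ms, with `None` for a missing response) and the function name are hypothetical, and trials without a response are assumed to count as incorrect.

```python
from statistics import mean


def acc_and_rtc(trials):
    """Return (Acc in percent, RTc in ms) for one participant.

    trials: list of (correct, rt_ms) pairs; rt_ms is None if no response.
    """
    n_correct = sum(1 for correct, _ in trials if correct)
    acc = 100.0 * n_correct / len(trials)
    correct_rts = [rt for correct, rt in trials if correct and rt is not None]
    rtc = mean(correct_rts) if correct_rts else None
    return acc, rtc


# Hypothetical participant with four trials (the actual test had 60 tasks).
example_acc, example_rtc = acc_and_rtc(
    [(True, 1200), (True, 1800), (False, 2500), (False, None)]
)
```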

2.3.2 Electrophysiological analysis

ERPs were analyzed offline using the Brain Vision Analyzer software (Brain Products). ERPs were Zero Phase Shift filtered offline (bandpass: 0.53–30 Hz) and referenced to the common average of all electrodes. Epochs with amplitude changes exceeding ±80 µV on any channel were rejected. Ocular artifacts were corrected using the Gratton, Coles & Donchin (1983) method. The ERP waveforms were time-locked to the onset of S1, to the onset of S2 and to the onset of S3. The averaged epoch for ERP, including a 200 ms pre-trigger baseline, was 1200 ms for S1, 2200 ms for S2 and 5000 ms for S3 (for which only the correct answers were averaged). The resulting data were baseline-corrected, and the grand average waveform was calculated for each stage. Trials in which the subjects failed to respond within 5000 ms (i.e., the duration of the display of the answer window, S3) were also excluded from analysis. About 40 trials were available for each participant, per stage.
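Two of the preprocessing steps above, artifact rejection and baseline correction followed by averaging, can be illustrated in a few lines. The sketch is not a reimplementation of the Brain Vision Analyzer pipeline: it omits filtering, re-referencing and ocular correction, it reads the ±80 µV criterion as an absolute-voltage threshold (an assumption), and the function names and toy epochs are hypothetical.

```python
def is_clean(epoch, threshold_uv=80.0):
    """True if no sample on any channel exceeds +/- threshold_uv."""
    return all(
        abs(v) <= threshold_uv for samples in epoch.values() for v in samples
    )


def baseline_correct(epoch, n_baseline):
    """Subtract each channel's mean over its first n_baseline samples."""
    corrected = {}
    for channel, samples in epoch.items():
        baseline = sum(samples[:n_baseline]) / n_baseline
        corrected[channel] = [v - baseline for v in samples]
    return corrected


def average_erp(epochs, n_baseline, threshold_uv=80.0):
    """Average baseline-corrected epochs that pass artifact rejection."""
    kept = [
        baseline_correct(e, n_baseline)
        for e in epochs
        if is_clean(e, threshold_uv)
    ]
    if not kept:
        raise ValueError("no epochs survived artifact rejection")
    return {
        ch: [sum(col) / len(kept) for col in zip(*(e[ch] for e in kept))]
        for ch in kept[0]
    }


# Toy epochs in microvolts: one channel, two baseline samples each.
example_epochs = [
    {"Pz": [1.0, 3.0, 12.0, 22.0]},
    {"Pz": [0.0, 0.0, 90.0, 5.0]},   # rejected: 90 exceeds the threshold
    {"Pz": [2.0, 2.0, 6.0, 8.0]},
]
example_erp = average_erp(example_epochs, n_baseline=2)
```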

The time frames of early components (P100) and late potentials were identified according to the preliminary examination of grand average waveforms and corresponding topographical maps. We focused on the time frames with obvious between-group differences. Since the peaks were not clear in the grand average waveform after 220 ms, the analysis of late potentials was based on mean amplitude. Because of the distinction between the different stages in problem solving (Polya, 1973), we analyzed the ERPs for each stage separately.

The scalp surface was divided into nine electrode sites: Anterior left (AL), anterior middle (AM), anterior right (AR), central left (CL), central middle (CM), central right (CR), posterior left (PL), posterior middle (PM), posterior right (PR) (see Fig. 2). In each of the four time frames and at each of the nine electrode sites, the mean amplitude was found as an average of the mean amplitude at each single electrode within the site.
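The site-level measure described above is an average of per-electrode window means. The sketch below shows that computation; the electrode labels, sample times and amplitudes are hypothetical (the actual electrodes in each site are those shown in Fig. 2), and treating the window boundaries as inclusive is an assumption.

```python
def window_mean(samples, times_ms, start_ms, end_ms):
    """Mean amplitude of one electrode within [start_ms, end_ms]."""
    values = [v for v, t in zip(samples, times_ms) if start_ms <= t <= end_ms]
    return sum(values) / len(values)


def site_mean_amplitude(erp, times_ms, electrodes, start_ms, end_ms):
    """Average the per-electrode window means over the electrodes of a site."""
    per_electrode = [
        window_mean(erp[e], times_ms, start_ms, end_ms) for e in electrodes
    ]
    return sum(per_electrode) / len(per_electrode)


# Hypothetical averaged waveforms (microvolts) sampled at five time points.
example_times = [200, 250, 300, 350, 400]
example_waveforms = {
    "P3": [0.0, 2.0, 4.0, 6.0, 100.0],
    "P5": [0.0, 4.0, 6.0, 8.0, 100.0],
}
example_site_mean = site_mean_amplitude(
    example_waveforms, example_times, ["P3", "P5"], 220, 350
)
```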

Early components (P100): We followed the suggestion of Luck (2014, pp. 314–315) to ‘include only the sites where the ERP component only is actually present rather than including electrodes from the entire scalp’ and ‘analyze only the sites at which the component of interest is large’. Therefore we restricted the analysis of P100 to posterior electrode sites and performed it only at Step A. In addition to between-group differences, Laterality was examined as a within-subjects factor for the amplitudes and latencies of the P100 component (Table 1).

Table 1 Summary of statistical analysis of electrophysiological data

Late potential components: At Step A, the mean amplitudes of late components were computed in 350–650 ms for S1 and 220–350 ms for S2 and S3. In addition to the analyses described at Step A (for G and EM factors), a Repeated Measures ANOVA was performed on the ERP mean amplitude considering the two orthogonal factors: Caudality and Laterality as within-subject factors. Analysis was performed for each of the three stages of a task (S1, S2 and S3).

At Step B, the late components were analyzed in 300–500 ms for S2 and 500–650 ms for S3 at the nine selected electrode sites.

Table 1 summarizes the statistical analyses performed.

3 Results

Due to the space limitations of the paper we report only on significant results.

3.1 Processing in the four major groups of participants

3.1.1 Behavioral data

Table 2 depicts means and standard deviations for Acc and RTc of participants’ performance. It also presents results of ANOVAs for the effects of the G and EM factors. Both the G and EM factors had a significant effect on Acc (see Table 2): Acc of G students was significantly higher than that of NG students, and Acc of EM students was significantly higher than that of NEM students. No significant effects of G and EM factors on RTc were found. Thus, among Hypotheses 1a, 1b, 2a, 2b only Hypothesis 1a was confirmed for both G and EM factors.

Table 2 Acc and RTc in the four major groups of participants

3.1.2 Electrophysiological data

Figure 3 depicts the grand average waveforms in the major groups of students at the nine electrode sites. As mentioned above, we determined the time frames for analysis of the between-group differences based on the analysis of the graphs presented here.

Fig. 3 Grand average waveforms in the major groups of students at the nine electrode sites

Early components: We detected the P100 on posterior electrodes at S1, S2, and S3 (Table 1). The significant main effects and interactions are presented in Table 3.

Table 3 Significant results associated with P100

The G factor had a significant effect on both the latency and the amplitude of P100 at S3, on the amplitude alone of P100 at S2 and on the latency alone of P100 at S1. P100 appeared later and had higher amplitude in G students than in NG students. A significant interaction of the G factor with the EM factor was detected on the amplitude of P100 at S2 and S3. The amplitude of P100 in NG-NEM students was significantly lower than that of G-NEM students [F(1, 63) = 12.546, p = .001, ηp² = .166 and F(1, 63) = 11.929, p = .001, ηp² = .159 for S2 and S3, respectively], while the amplitude of P100 in NG-EM students was similar to that in G-EM participants. Additionally, we found a significant interaction of the EM factor with laterality on the amplitude of P100 at S1. The amplitude of P100 at the PL and PM electrode sites was higher in EM students than in NEM students, while the amplitude at the PR electrode site was higher in NEM than in EM participants.

Thus, the findings related to latencies of P100 partly contradict Hypotheses 1c and 2c, since latencies of P100 evoked in G students were significantly longer than those in NG students. Even though we did not refer to the strength of early components in the research hypotheses, we would like to stress that P100 amplitudes appeared to be significantly higher for G students.

Late potentials: Table 4 depicts significant results associated with late potentials that were found at S1, S2 and S3 in the defined time frames.

Table 4 Significant main effects and interactions associated with analysis of mean amplitudes related to late potentials at the nine electrode sites

At Stage S1 in the 350–650 ms time frame, a significant effect of the EM factor was found. The amplitude for EM students was significantly higher than that for NEM students. Additionally, a significant interaction of Caudality with the EM factor was found. The amplitude for EM students was higher than for NEM students at the anterior and posterior parts of the scalp, while the amplitude for NEM students was similar to that for EM students at the central parts. The significant difference in amplitude between EM and NEM students was obtained at the anterior parts [F(1, 63) = 7.854, p < .01, ηp² = .111]. Moreover, a significant interaction of Laterality with the G factor was found. The amplitude for G students was higher at the left and right sites than that for NG students, while NG students elicited higher amplitude at the middle sites of the scalp. These results at S1 contradict Hypotheses 1d and 2d: the ERP amplitudes of EM students were higher than those of NEM students. Additionally, no main effect of the G factor was found.

At Stages S2 and S3, a significant interaction of G with EM factors was found in the 220–350 ms interval. The mean amplitude of NG-NEM students was lower than that of G-NEM students, but the amplitude of G-EM students was similar to that of NG-EM students. The significant difference in the mean amplitude between NG-NEM and G-NEM students was found at S3 [F(1, 63) = 7.222, p < .01, ηp² = .103]. Additionally, a significant interaction of Caudality with G and EM factors appeared at S3. Further investigation of this interaction revealed that the mean amplitude in G-NEM students was significantly higher than that of NG-NEM students at the posterior electrode sites [F(1, 63) = 8.136, p < .01, ηp² = .114]. Thus, the results of our study did not confirm Hypothesis 2d, since the G-EM group did not exhibit significantly lower ERPs than the other research groups.
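As a consistency check on the effect sizes reported in this section, partial eta squared can be recovered from an F statistic and its degrees of freedom through the standard identity ηp² = F·df1 / (F·df1 + df2). The short sketch below applies it to the F(1, 63) values quoted above; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# F(1, 63) values reported in the text for the pair-wise contrasts.
reported_f = [12.546, 11.929, 7.854, 7.222, 8.136]
recovered = [round(partial_eta_squared(f, 1, 63), 3) for f in reported_f]
```

The recovered values agree with the reported effect sizes to three decimal places.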

Figure 4 shows topographical maps of late potentials for the four major groups at S1, S2 and S3 in the examined time frames.

Fig. 4 Topographical maps of late potentials for the four groups of participants at S1, S2 and S3 within examined time frames

Additionally, in order to explore the general relationship between behavioral performance and the ERP effects in NEM students, we performed correlation analyses between the mean amplitudes at the posterior electrode sites at S3 and the Acc and RTc. A positive correlation was observed between the mean amplitude at the PM electrode site in the 220–350 ms time frame and Acc (r = .343, p < .05). This correlation indicates that higher mean amplitude is associated with higher Acc.

3.2 Processing in three groups of participants who excel in mathematics

3.2.1 Behavioral data

Table 5 depicts Acc and RTc in the three groups of participants who excel in mathematics. We found that the accuracy of S-MG students was non-significantly higher than that of G-EM students, which was in turn non-significantly higher than the accuracy of NG-EM students. The difference became significant when comparing the accuracy of S-MG and NG-EM students. We consider this increase in accuracy an accumulative characteristic of S-MG students (cf. Leikin, Paz-Baruch, & Leikin, 2014). These results support Hypothesis 3a: S-MG students demonstrated superior performance of an accumulative nature. Hypothesis 3b was not confirmed, since no significant differences were found for RTc among the three EM groups of participants.

Table 5 Acc and RTc in the three groups of participants who excel in mathematics

3.2.2 Electrophysiological data

Figure 5 depicts the scalp topographies and statistically significant results related to the between-group differences in the three groups of students excelling in mathematics (G-EM, NG-EM and S-MG). Significant differences in the means of absolute ERP amplitude values, exhibited by the three groups of participants who excel in school mathematics, were found at S2 during the 300–500 ms interval in the CR and AR electrode sites, and at S3 during the 500–650 ms interval in the CR electrode site. In all of these cases the mean of absolute ERP amplitude values of S-MG students was lower than that of G-EM students, which was, in turn, lower than that of NG-EM students, non-significantly. The differences only become significant when comparing the mean of absolute ERP amplitude values between S-MG and NG-EM students in the CR region. Thus, we argue that the lower mean of absolute ERP amplitude values found for the S-MG students can be considered an accumulative characteristic of this specific group, which is related to neural efficiency of brain activity. These findings confirm Hypothesis 3c.

Fig. 5 Scalp topographies and statistical results related to the significant results in the three groups of students excelling in mathematics

4 Discussion

This study was directed towards developing a better understanding of the concept of high mathematical abilities. To this end we examined problem-solving performance associated with logical inferences in relation to the topic of quadrilaterals by students who differed in the combinations of general giftedness (G) and excellence in mathematics (EM). Problem-solving performance was analyzed for accuracy (Acc) and reaction time for correct responses (RTc) as well as for electrophysiological measures such as amplitudes and scalp distribution of electrical potentials evoked by the problem-solving activity. Following Polya’s (1973) model of problem-solving strategies, the tasks were divided into three stages and the electrophysiological measures were examined at each stage of problem solving. To achieve the study goal, we focused on five groups of participants that differed in the combination of their excellence in school mathematics and in their level of general giftedness.

The choice of logical inference tasks was influenced by the fundamental studies of Krutetskii (1976) and Polya (1973), who stressed connections between high mathematical abilities and logical reasoning as well as between logical thinking and expertise in problem solving. From our perspective, high achievements in school mathematics reflect problem-solving expertise. Previous findings emphasize the aptitude of mathematically able students to think logically about qualitative and spatial mathematical objects (Krutetskii, 1976).

We raised three main hypotheses that were only partially confirmed by our research.

4.1 Accuracy and reaction time

In this study, both G and EM factors significantly affected the Acc of problem solving, and a combination of the G and EM factors strengthened this effect. Thus, G-EM students exhibited the highest accuracy among all four major groups of participants. When comparing only the three groups of students who excel in mathematics, S-MG students exhibited even higher Acc than G-EM students, with a significant difference compared to NG-EM participants. We consider accuracy to be an accumulative characteristic of S-MG students since the increases in accuracy from NG-EM to G-EM and from G-EM to S-MG students were non-significant, while S-MG participants were significantly more accurate than NG-EM students. These findings are consistent with previous findings of Leikin et al. (2014), who demonstrated that pattern recognition and working memory are accumulative characteristics of S-MG students.

No significant effects in RTc were found for either EM or G factors. This could be due to the fact that RT was measured only at the stage of answer verification, which can to some extent be considered a limitation of the study. While all participants were allotted identical times at the stages of introducing the situation and presenting the question, we can assume that their processing time varied at those stages. Although the differences were not significant, the shortest RTc was shown in the G-EM group, whereas the longest RTc was found among the G-NEM students (Table 2). These findings may indicate trends in the time required for solving the tasks by each of the different groups of participants. Our results related to Acc in solving tasks requiring logical inferences in geometry are consistent with findings of the seminal study performed by Krutetskii (1976), who demonstrated a strong connection between high mathematical abilities and the ability to think logically.

When we consider only behavioral measures, it seems that EM and G factors have similar effects on problem-solving performance. However, electrophysiological measures demonstrate that different neuro-cognitive processes are involved in solving logical problems by students who differ in their level of general giftedness or in their level of excellence in school mathematics. These differences are reflected in the strength of the brain electrical potentials and their topographical maps.

4.2 Strength of electrical potential when solving the task

In the analysis of behavioral data, EM and G factors appear to be relatively independent since they have different effects on the problem-solving processing. A combination of EM and G factors (in S-MG and G-EM groups) seems to cause an increase in Acc of an accumulative type. Moreover, the hypothesis that excellence in mathematics and general giftedness are interrelated but different in nature becomes even more pronounced when we analyze the electrophysiological data.

Analysis of the early and late potentials of the brain electrical activity during the solving of short geometry problems requiring logical inference revealed interesting phenomena associated with G and EM factors at all three stages of coping with the task (S1, S2, and S3). Both G and EM factors significantly affected amplitudes and latencies of early potentials, as well as mean amplitudes of late potentials (see Tables 3, 4).

The main effect of G factor was obtained in relation to P100. A significant interaction between G and EM factors in the amplitude and latency of P100 component was obtained, too. Analysis of the early potentials demonstrated that at all stages of problem solving, P100 had higher amplitude and longer latency in G students compared to NG students (the statistically significant differences appeared at S1 and S3 for latencies and at S2 and S3 for amplitudes). Note that P100 is typically associated with attention and primary visual processing of the stimulus (Doyle et al., 1996). Accordingly, we explain the effect of G factor by early analytical activation and more focused attention related to the stimuli that lead to the P100 delay and its higher amplitude.

At S2 and S3 we found a significant interaction between the G and EM factors with respect to the amplitude of P100. The G factor significantly strengthened amplitude in the NEM group but did not affect amplitude in EM students.

Thus, the main effect of G factor appeared only in early potentials, more precisely in the P100 component at S1, S2 and S3, while the main effect of EM factor was evident only in late potentials at S1. More specifically, the amplitude of P100 in G students was higher than that of NG students, and the mean amplitude of EM students at S1 was higher than that of NEM participants. At the same time, a significant interaction between these two factors was obtained at S2 and S3 both in P100 and in the 220–350 ms time frame (late potentials). With regard to Hypothesis 2c, the mean amplitude of participants from the NG-NEM group was lower than that of G-NEM students, while the mean amplitude of NG-EM students was slightly higher than that of G-EM participants (Table 4). Specifically for S3, the G factor significantly increased the ERP mean amplitude in the NEM group, while the EM factor significantly increased the ERP mean amplitude in G students. Additional correlation analyses in the NEM group revealed a positive correlation between the mean amplitude in the PM electrode site at S3 and Acc (i.e., higher Acc was associated with higher amplitudes), which may indicate that the effect of giftedness on mean amplitude in NEM students might depend on Acc. These findings support our hypothesis with regard to the distinction and interrelation between G and EM factors.
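The direction of this G × EM interaction can be written as a 2 × 2 interaction contrast. The notation is introduced here for illustration only and is not part of the original analysis: \bar{A}_{X} denotes the ERP mean amplitude of group X, and the signs under the braces restate only the direction of the group means described above (Table 4), not their statistical significance.

```latex
\[
\underbrace{\left(\bar{A}_{\text{G-NEM}} - \bar{A}_{\text{NG-NEM}}\right)}_{>\,0}
\;-\;
\underbrace{\left(\bar{A}_{\text{G-EM}} - \bar{A}_{\text{NG-EM}}\right)}_{<\,0}
\;>\; 0
\]
```

A contrast of zero would mean that giftedness shifts the mean amplitude equally in EM and NEM students; the opposite signs of the two bracketed differences are what make the two factors interact rather than simply add.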

As noted above, we analyzed the mean amplitude of the late potentials (time frames of 350–650 ms at S1 and 220–350 ms at S2 and S3) due to the absence of pronounced peaks in the ERP grand average wave. Although the P300 ERP component was not identified, we suggest that it is embedded in the mean amplitude during the two time frames noted above. Generally speaking, P300 is associated with working memory, updating, encoding, and/or retrieval from long-term memory (Donchin & Coles, 1988). Moreover, this component seems to be sensitive to the amount of attention resources allocated to a stimulus, with larger amplitudes reflecting greater amounts of attention (e.g., Polich, 2012). Accordingly, it may be suggested that the highest mean amplitude in the G-NEM student group may be linked to stimulus evaluation and classification (Leikin, 2002). In this connection, gifted individuals have been reported to elicit higher P300 amplitudes than non-gifted ones (Jaušovec & Jaušovec, 2000). In our study, G-NEM students exhibited significantly higher mean amplitudes than NG-NEM students; this may be interpreted as an allocation of more mental resources for processing the question (S2) and for final conclusion and verification of a solution (S3). At the same time, at S1, EM students demonstrated higher mean amplitude than NEM students. Note that at S1 the properties of quadrilaterals are presented to participants. Accordingly, we assume that the increase of amplitude in the EM group at S1 could indicate that EM students started processing the properties on the basis of previously learned geometric material directed at identification of a corresponding family of quadrilaterals.
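When no clear peak is present, a mean-amplitude measure simply averages the waveform over a fixed time frame. The sketch below illustrates this for a single averaged waveform. It rests on several assumptions that the text does not specify: the function name, the sampling rate, the epoch onset, and the option to average absolute values are all hypothetical, and the sketch is not the processing pipeline used in the study.

```python
def mean_amplitude(samples_uv, sampling_rate_hz, start_ms, end_ms,
                   epoch_onset_ms=0.0, absolute=False):
    """Average amplitude in the window [start_ms, end_ms) of one waveform.

    samples_uv: amplitudes in microvolts, one value per sample.
    epoch_onset_ms: time of the first sample relative to stimulus onset
        (assumed 0 here; a pre-stimulus baseline would make it negative).
    absolute: if True, average absolute values instead of signed values.
    """
    ms_per_sample = 1000.0 / sampling_rate_hz
    first = int(round((start_ms - epoch_onset_ms) / ms_per_sample))
    last = int(round((end_ms - epoch_onset_ms) / ms_per_sample))
    if first < 0 or last > len(samples_uv) or first >= last:
        raise ValueError("time window falls outside the epoch")
    window = samples_uv[first:last]
    if absolute:
        window = [abs(v) for v in window]
    return sum(window) / len(window)


# Synthetic waveform sampled at an assumed 1000 Hz: flat, except for a
# 2-microvolt plateau between 220 and 350 ms (not real ERP data).
waveform = [0.0] * 220 + [2.0] * 130 + [0.0] * 50
late_mean = mean_amplitude(waveform, 1000, 220, 350)
```

Because the measure averages over the whole window, a component such as P300 can contribute to it even when it does not form a distinct peak, which is the sense in which the component may be "embedded" in the mean amplitude.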

It should be noted that, at the specific time intervals, the electrical activity of S-MG students in the anterior right (AR) electrode site was non-significantly lower than in the other two groups of EM students, and in the central right (CR) electrode site it was significantly lower than in NG-EM students. This finding for S-MG students seems to be in line with research which supports the claim that neural efficiency may be pronounced in frontal regions of the brain (Grabner, Neubauer, & Stern, 2006; Neubauer & Fink, 2009). Additionally, the literature points out the connection between the neural efficiency effect and task difficulty (Dunst et al., 2014). In our previous study that employed function-related tasks, which were relatively easy for the participants in the study (Waisman et al., 2014), a significant decrease of the mean amplitude was observed in G-EM students. Accordingly, we assume that the lack of significance of such an effect in G-EM students in the present study can be explained by the relative difficulty of problems that require logical inference (e.g., Fujita, 2012). These tasks appear to be quite simple for S-MG participants and, therefore, the significant decrease in mean amplitude emerged in these students as compared to the other two groups of students who excel in mathematics. Note that this neural efficiency effect seems to be of an accumulative nature.

4.3 Distribution of electrical potentials

The study also demonstrated that G and EM factors differ in their effects on the distribution of electrical potentials evoked when solving geometrical tasks. These differences follow from the qualitative observation of topographical maps as well as from statistically significant interactions of Laterality or Caudality with G and EM factors. For example, we found an interaction of Laterality with EM factor at S1 for P100 that is reflected in higher amplitude for EM students than for NEM students in the left and middle electrode sites, and higher amplitude for NEM students than for EM students in the right electrode sites. P100 seems to reflect low-level processing (Doyle et al., 1996). Thus, even at initial processing levels, a factor such as excellence in mathematics may influence the scalp distribution of brain potentials. Additionally, a significant interaction of Laterality with G factor was obtained in the 350–650 ms time frame at S1. At the same time, at S1, the interaction of Caudality with EM factor was found to be significant, with an increase in amplitude for EM as compared to NEM students at anterior and posterior parts of the scalp. The frontal and prefrontal cortices are known to be involved in mathematical problem solving and deductive reasoning, whereas the posterior cortex was found to be essential in visuo-spatial processing (e.g., Prado et al., 2011). Thus, it seems that EM students activate electrode sites which are more relevant and better suited to the nature of the solved tasks.

In summary, not surprisingly, G and EM factors appear to be interrelated personal characteristics. EM and G factors affected behavioral measures similarly: EM and G students were more accurate than their non-EM and non-G counterparts, respectively. At the same time, G and EM factors affected electrophysiological measures differently, with a significant interaction between G and EM factors associated with absolute ERP amplitudes at some of the solution stages. Thus we maintain that G and EM factors are different in nature. Moreover, our findings demonstrate that general giftedness is neither a necessary nor a sufficient condition for excellence in school mathematics, and vice versa.

This observation has very important and positive implications for mathematics education: mathematical expertise can be developed in a much broader population than that of generally gifted students. Generally gifted students (like all other students) should choose their major subjects according to their preferences, and for many of them these subjects can and should be other than mathematics. Our findings demonstrate that some of the characteristics of S-MG students are an extreme (accumulative) expression of the combination of G and EM factors. We argue that this population deserves special teaching and learning approaches when studying mathematics, more so than any other group in the general population. Consistent with findings of Leikin et al. (2014), these students can benefit from a distinctive learning environment that provides them with a quicker learning pace, more independent work and more challenging mathematical tasks.