Introduction

Object concepts denoted by nouns (e.g., fruit) have been studied more extensively than event concepts, which are typically denoted by verbs (e.g., to walk). Events differ from objects in that they unfold in time: The beginning and ending of an event cannot be observed simultaneously. This temporal attribute may explain the greater perceived abstractness of verbs relative to object nouns (Bird et al. 2000), and the conceptual difficulty of verbs relative to object nouns in early lexical development (Gentner 1982). Further, among all verbs, those that denote non-physical events, e.g., to infer and to consider, appear even more abstract and conceptually challenging due to the scarcity of visual–perceptual attributes (Rips and Estin 1998).

Some studies on verbs have demonstrated activation of sensory–motor experiences in the processing of physical verbs in contrast to non-physical verbs. For example, Lo Gerfo et al. (2008) applied rTMS to the left prefrontal region and the primary motor cortex to examine its potential impact on retrieval of hand action verbs versus abstract verbs. Interference was found only for hand action verbs, but not for abstract verbs, implicating motor experience in the representation of an action event. In a PET study, Vigliocco et al. (2006) compared words referring to motor events (e.g., twirl, skate) versus sensory events (e.g., tickle, taste). Motor words led to preferential activation in the left motor cortical regions, while sensory words led to preferential activation in the higher-order visual association cortex adjacent to cortical regions processing multisensory input. In an fMRI study, Grossman et al. (2002) found distinct activation patterns for motion verbs (e.g., fall) compared with one particular type of abstract verb, i.e., cognition verbs (e.g., think). Specifically, cognition verbs were associated with activities in the left posterolateral temporal cortex, a supramodal region responsible for storage of propositional knowledge (Mesulam 1985), thus indicating greater involvement of propositional knowledge in the representation of cognition verbs relative to motion verbs (Grossman et al. 2002).

Differential recruitment of brain regions revealed in the aforementioned studies highlights the essential semantic features in representations of different types of event verbs. ERP studies on verb processing, though relatively limited, have offered further insight with regard to relative salience of different semantic features in event representation. Kellenbach et al. (2002) examined three types of verbs differentiated by visual and motor attributes: cognition verbs (e.g., consider), motion verbs (e.g., flow), and action verbs (e.g., sit). By design, action verbs referred to events with visual and motor attributes, motion verbs referred to events with visual, but not motor attributes, and cognition verbs referred to events without visual or motor attributes. In the N400 time window, both action and motion words generated greater negativity than cognition words. Since action and motion verbs both possessed visual attributes in contrast to cognition verbs, the greater N400 therefore seemed to indicate greater salience of visual imagery in the representation and processing of event verbs (Kellenbach et al. 2002).

In another ERP study, Barber et al. (2010) examined the processing of motor words versus sensory words. In the N400 time window, sensory words elicited greater negativity than motor words. Intuitively, sensory events (e.g., to taste) do not possess more salient visual attributes than motor events (e.g., to twirl). In addition, the study controlled imageability between sensory and motor words. Findings from Barber et al. (2010) thus suggest that other semantic features, possibly secondary to visual imagery, also come into play during the processing of event verbs. Lee and Federmeier (2008) argued that the N400 may reflect not only the degree to which a word elicits an imagery process, but also the extent to which its associated semantic network is activated (also see Holcomb et al. 1999; Kutas and Federmeier 2000, 2011). That is, activation of a stimulus word (e.g., smell) leads to activations of words (e.g., odor, aroma, scent, stink, fragrance, etc.) semantically related to the stimulus word. As words can vary in density of their respective semantic networks, the amplitude of the N400 may reflect such variation. Therefore, the greater N400 associated with sensory words revealed by Barber et al. (2010) might, to some extent, reflect the greater numbers of semantic associates of sensory words relative to motor words.

As noted earlier, due to greater perceived abstractness, event verbs appear more conceptually challenging than object nouns, and non-physical verbs pose even greater challenge than physical verbs. Yet, there have been reports on selective semantic impairment of not only verbs but also nouns (e.g., Luzzatti et al. 2002), not only cognitive verbs but also physical verbs (e.g., Bushell and Martin 1997), and further, not only abstract words but also concrete words (e.g., Cipolotti and Warrington 1995). Therefore, systematic assessment and analysis with regard to relative salience of different semantic features, such as imageability and density of semantic network, across different word categories may have implications not only for research but also for instruction and intervention in practical settings. For example, a more refined word categorization based on feature analysis may be instrumental to diagnosis of and treatment for selective semantic impairment.

The present study utilized the ERP technique to examine sensory, cognitive, and motor verbs and also included speech verbs. Speech communication seems to entail both mental processes, such as thinking and comprehending, and physical processes, such as facial muscle movement and gesture. Salience of mental attributes and physical attributes of speech verbs may differ from the mental attributes of cognitive verbs and the physical attributes of sensory and motor verbs. Consequently, visual imageability could demonstrate a graded variation across cognitive, speech, sensory, and motor verbs. In addition, the level of semantic association of speech verbs may also systematically differ from the other verb types. Therefore, ERP responses generated by speech verbs will provide an additional comparison and help to decipher the relative salience of imageability and semantic association in verb processing. Furthermore, speech verbs seem to have garnered less attention than other verb types in past research. Inclusion of speech verbs in the present study was also expected to improve our understanding about the representation and processing of this verb category.

In line with past ERP studies on sensory, motor, and cognitive verbs (e.g., Barber et al. 2010; Kellenbach et al. 2002), the primary interest of this study was in activities within the N400 time window. Specifically, the level of imageability and the level of semantic association of sensory, cognitive, speech, and motor verbs were assessed. ERP activities in the N400 time window generated by these four types of verbs were then evaluated in relation to their levels of imageability and semantic association. As researchers have pointed out, both factors are involved in word processing (Holcomb et al. 1999; Kutas and Federmeier 2000, 2011; Lee and Federmeier 2008). However, their respective contributions may differ in verb processing. More specifically, if processing a verb entails mainly activation of visual–perceptual attributes, the amplitude of N400 would largely correspond to the level of imageability, and the contribution from semantic associates might be secondary. In this case, motor verbs, having the most salient observable physical attributes, should elicit a higher level of negativity relative to cognitive verbs. Sensory and speech verbs might fall in between motor and cognitive verbs. Alternatively, if processing a verb involves primarily activations of semantic associates, the amplitude of N400 would mostly match the level of semantic association, and the activation of imagery information might play a lesser role. However, as discussed earlier, it should be noted that neither factor alone, imageability or semantic association, can fully account for the variation in N400 amplitude as both factors and, possibly, other unidentified factors contribute to word processing (e.g., Holcomb et al. 1999; Kutas and Federmeier 2011). That is, the amplitude of N400 is unlikely to perfectly agree with either the level of imageability or the level of semantic association. Instead, if one factor relative to the other makes a more substantial contribution to the processing of these verbs, the N400 amplitude would associate more evidently with one of the two factors. Alternatively, if the two factors make comparable contributions to the processing of these verbs, the N400 amplitude might be associated, to a similar degree, with both factors.

In summary, as the objective of this study is to evaluate the relative salience of imageability versus semantic association in verb processing, the following outcomes are possible with regard to the relations of the N400 amplitude to the levels of imageability and semantic association. First, across the four verb categories, the variation in N400 amplitude might show a greater association with the level of imageability relative to the level of semantic association. Second, the variation in N400 amplitude across verb categories might show a greater association with the level of semantic association relative to the level of imageability. Finally, the variation in N400 amplitude across verb categories might relate equally to the levels of both factors.

As a final note, current literature about word processing has reported, in addition to the N400, a late frontal negativity also modulated by word concreteness (e.g., Lee and Federmeier 2008; West and Holcomb 2000). West and Holcomb (2000) named this component N700 and interpreted it as imagery processing evoked by more concrete words. The aforementioned study by Kellenbach et al. (2002) revealed that words with salient visual attributes, e.g., action verbs, generated greater amplitude of this late component than did cognition words. However, in a more recent study, Barber et al. (2013), after controlling imageability, found that the N700 was still more pronounced for concrete words than for abstract words. The specific role of N700 therefore remains unclear and may signify multiple underlying processes. Conspicuously, the four verb categories examined in this study vary considerably in imageability level, providing an opportunity to further examine its effect on the N700. Thus, in addition to the primary focus, amplitude of the N400, analysis of the present study also explored potential amplitude differences in the N700 across verb categories.

Methods

Participants

Thirty-four university students (15 males, mean age = 22.26 years, ranging from 17 to 25) participated in this study. All were native speakers of Chinese, right-handed, with normal or corrected-to-normal vision. They were paid a small amount of money for participation. Three additional participants were excluded from statistical analyses because too few trials remained in one or more verb conditions after incorrect responses and movement artifacts were removed.

Materials

Four types of two-character Chinese verbs were sampled for the purpose of this study: 20 sensory verbs (e.g., listen, stare), 20 cognitive verbs (e.g., think, infer), 20 speech verbs (e.g., tell, speak), and 20 motor verbs (e.g., lean, jump). The four types of verbs were matched in number of strokes, F(5) = 0.130, p = 0.985, and in logarithmic frequency of Google hits, F(5) = 0.505, p = 0.772 (Table 1). In addition, there were 40 two-character filler verbs all referring to common physical activities (e.g., mail, borrow). Lastly, 120 pseudowords were created matching with the verbs for number of strokes. Each pseudoword consisted of two real Chinese characters, forming a two-character meaningless combination.

Table 1 Properties of sensory, cognitive, speech, and motor verbs

A group of 30 university students, who did not participate in the ERP experiment, evaluated all verbs with regard to level of imageability and level of semantic association. Specifically, for level of imageability, these raters were informed that some words could easily and quickly evoke mental imagery, whereas other words may do so with difficulty or not at all (Toglia and Battig 1978). They rated the ease with which each given verb could evoke a mental image, using a 7-point scale where “1” was labeled as “not easy at all” and “7” was labeled as “extremely easy.” For level of semantic association, the raters were instructed to simply assess the number of words meaningfully related (Nelson et al. 1998) to each verb, using a 7-point scale where “1” was labeled as “none” and “7” was labeled as “a lot.” They were encouraged to skim through all verbs on the list before starting to rate the first verb so that they could better utilize the scales. Inter-rater reliability was excellent for imageability ratings (α = 0.95) and good for semantic association ratings (α = 0.82). Figure 1 plots the mean ratings of both factors for all verb conditions.
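The reliability coefficients above are reported only as α. Assuming this denotes Cronbach's alpha computed with raters treated as items and verbs as cases (the text does not state this), the computation can be sketched as follows. The ratings matrix is a made-up toy example, not data from the study, and the function name is hypothetical.

```python
from statistics import variance

def cronbach_alpha(ratings):
    """ratings[v][r] is the rating given to verb v by rater r."""
    n_raters = len(ratings[0])
    # Sample variance of each rater's scores across verbs.
    rater_vars = [variance([row[r] for row in ratings]) for r in range(n_raters)]
    # Sample variance of the per-verb rating totals.
    total_var = variance([sum(row) for row in ratings])
    return n_raters / (n_raters - 1) * (1 - sum(rater_vars) / total_var)

# Toy data: 5 verbs rated by 3 raters on a 1-7 scale (illustrative only).
toy_ratings = [
    [6, 7, 6],
    [2, 1, 2],
    [4, 4, 5],
    [7, 6, 7],
    [3, 3, 2],
]
toy_alpha = cronbach_alpha(toy_ratings)
```

Alpha approaches 1 when raters order the verbs in the same way, which is the sense in which a value such as 0.95 indicates high agreement.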

Fig. 1 Average semantic association and imageability ratings of four verb categories

Procedure

All participants provided written informed consent prior to the experiment. To ensure that they understood the instructions, they completed two 18-trial practice blocks, with a set of words and pseudowords different from those used in the subsequent task.

The experiment included two sessions, with the second session being a repetition of the first in order to ensure adequate signal-to-noise ratio (Picton et al. 2000). A session was divided into two blocks, each consisting of 120 trials (10 sensory verbs, 10 cognitive verbs, 10 speech verbs, 10 motor verbs, 20 filler verbs, and 60 pseudowords) in a random order. Each trial began with a fixation cross (+) presented for 300 ms at the center of the computer screen, followed by a black screen for 200 ms. Then, a verb or pseudoword was displayed for 1000 ms. Participants needed to indicate whether it was a meaningful word by pressing either the YES key or the NO key as quickly and accurately as possible. If a response was made within 1000 ms, the stimulus disappeared immediately. Otherwise, it would disappear at 1000 ms after its onset. A black screen was then displayed for a randomly determined duration (1200–1800 ms) before the next trial began.
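The block composition and trial timing described above can be sketched in a few lines. This is an illustration only, not the presentation software used in the study. It assumes the inter-trial interval was drawn uniformly in whole milliseconds, which the text does not specify, and all names are hypothetical.

```python
import random

# Trial counts per block, as described in the Procedure.
BLOCK_COMPOSITION = {
    "sensory": 10, "cognitive": 10, "speech": 10, "motor": 10,
    "filler": 20, "pseudoword": 60,
}
FIXATION_MS = 300
BLANK_MS = 200
STIMULUS_MAX_MS = 1000
ITI_RANGE_MS = (1200, 1800)  # assumption: uniform over whole milliseconds

def build_block(rng):
    """Return one randomized block of 120 trials with a jittered interval."""
    conditions = [c for c, n in BLOCK_COMPOSITION.items() for _ in range(n)]
    rng.shuffle(conditions)
    return [{"condition": c, "iti_ms": rng.randint(*ITI_RANGE_MS)}
            for c in conditions]

def trial_duration_ms(iti_ms, rt_ms=None):
    """Total trial length; the stimulus is removed at response or at 1000 ms."""
    responded_in_time = rt_ms is not None and rt_ms < STIMULUS_MAX_MS
    stimulus_ms = rt_ms if responded_in_time else STIMULUS_MAX_MS
    return FIXATION_MS + BLANK_MS + stimulus_ms + iti_ms

block = build_block(random.Random(0))
# 10 trials per block x 2 blocks per session x 2 sessions
trials_per_verb_type = BLOCK_COMPOSITION["motor"] * 2 * 2
```

Under this design each verb type contributes 40 trials per participant before any trials are rejected.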

The experiment was conducted in a soft-lighted and soundproof recording room. Participants sat about 100 cm from the computer screen. All stimuli were presented white-on-black, 3.5 cm high and 6.6 cm wide, in the middle of the screen. The two response keys were counterbalanced across participants, and the order of blocks was counterbalanced across sessions. The order of trials within each block was randomized. There was a break between two blocks within each session and between two sessions. Participants determined the duration of each break.

EEG recording and ERP data analysis

Participants’ electroencephalograms (EEG) were recorded from a 32-channel Quik-Cap (NeuroScan, Inc.) with the right mastoid as reference. Vertical eye movements were recorded by two electrodes attached above and below the left eye. Horizontal eye movements were monitored by two electrodes placed on the left and right outer canthi. Impedances of all electrodes were kept below 5 kΩ. The sample rate was 500 Hz with a band pass of 0.05–100 Hz. The data were re-referenced off-line to linked mastoids.

The continuous data were segmented from 100 ms pre-stimulus to 800 ms post-stimulus for the four verb conditions. Data were filtered off-line with a low pass of 30 Hz (24 dB). The mean voltage of the 100 ms pre-stimulus interval acted as a baseline for ERP measurement. Trials contaminated by eye blinks, eye movements, or muscle potentials exceeding ±100 μV at any electrode and trials associated with wrong responses were excluded from data analysis, resulting in exclusion of 6.75 % of trials. There were at least 30 trials of each verb condition for each participant to compute average ERPs so that adequate signal-to-noise ratio could be achieved. As indicated earlier, three participants were excluded according to this criterion. The segmented data were then averaged for each word condition within each participant.
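The epoch-level steps above (baseline correction, amplitude-based rejection, averaging, and the 30-trial minimum) can be sketched as follows. This is a simplified illustration, not the analysis software used in the study. It assumes the ±100 μV criterion is applied to baseline-corrected data, which the text does not state, and it omits filtering and the exclusion of incorrect responses. All names are hypothetical.

```python
SAMPLE_RATE_HZ = 500
BASELINE_MS = 100
BASELINE_SAMPLES = BASELINE_MS * SAMPLE_RATE_HZ // 1000  # 50 samples
REJECT_UV = 100.0
MIN_TRIALS = 30

def baseline_correct(samples):
    """Subtract the mean of the pre-stimulus interval from one channel."""
    baseline = sum(samples[:BASELINE_SAMPLES]) / BASELINE_SAMPLES
    return [v - baseline for v in samples]

def is_clean(trial):
    """Keep a trial (list of channels) only if no sample exceeds +/-100 uV."""
    return all(abs(v) <= REJECT_UV for channel in trial for v in channel)

def average_erp(trials):
    """Baseline-correct, reject, and average trials; returns (erp, n_kept)."""
    corrected = [[baseline_correct(ch) for ch in trial] for trial in trials]
    kept = [trial for trial in corrected if is_clean(trial)]
    if not kept:
        raise ValueError("no artifact-free trials left")
    n_channels, n_samples = len(kept[0]), len(kept[0][0])
    erp = [[sum(t[c][s] for t in kept) / len(kept) for s in range(n_samples)]
           for c in range(n_channels)]
    return erp, len(kept)

def enough_trials(n_kept):
    """Participant-level inclusion check for one verb condition."""
    return n_kept >= MIN_TRIALS

def toy_trial(pre_uv, post_uv):
    """One-channel toy epoch: flat baseline followed by a flat response."""
    return [[pre_uv] * BASELINE_SAMPLES + [post_uv] * 10]

# Toy data: the second trial exceeds the threshold and is rejected.
toy_trials = [toy_trial(5.0, 15.0), toy_trial(0.0, 200.0), toy_trial(1.0, 21.0)]
toy_erp, toy_kept = average_erp(toy_trials)
```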

Figure 2 illustrates the grand average ERP waveforms of the four verb conditions from nine sites within the frontal and central areas, where the N400 appeared most pronounced. The fact that the N400 was more evident in these regions seemed to be consistent with the notion and reports that the neural substrate of verb processing is more anterior relative to that of noun processing (e.g., Caramazza and Hillis 1991; Tyler et al. 2004). Based on visual inspection, two time windows (250–350 and 350–450 ms) were delineated to examine the amplitude of the N400. There did not appear to be evident frontal negativities following the N400. In fact, one late positive going component was present. Additional analysis was also conducted to examine any potential verb effects on this positive going component. Three additional time windows were delineated for this purpose: 450–550, 550–650, and 650–800 ms. As the preliminary analysis did not reveal any repetition effect, data were collapsed over the two sessions to compute average ERPs of each time window.
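The mean amplitude in each of the five time windows can be computed from an averaged waveform as sketched below. The sketch assumes a 500 Hz sampling rate and an epoch starting at −100 ms, as described earlier, and treats each window as half-open (start included, end excluded), which is an assumption. The window labels are hypothetical names introduced for illustration.

```python
SAMPLE_RATE_HZ = 500
EPOCH_START_MS = -100

# Window boundaries from the text; the labels are illustrative only.
WINDOWS_MS = {
    "n400_early": (250, 350),
    "n400_late": (350, 450),
    "late_1": (450, 550),
    "late_2": (550, 650),
    "late_3": (650, 800),
}

def ms_to_index(t_ms):
    """Convert a latency relative to stimulus onset into a sample index."""
    return (t_ms - EPOCH_START_MS) * SAMPLE_RATE_HZ // 1000

def window_mean(erp, start_ms, end_ms):
    """Mean of one channel's ERP over the interval [start_ms, end_ms)."""
    segment = erp[ms_to_index(start_ms):ms_to_index(end_ms)]
    return sum(segment) / len(segment)

# Toy waveform whose value equals its latency in ms (one sample every 2 ms).
toy_erp = [EPOCH_START_MS + 2 * i for i in range(450)]
toy_means = {name: window_mean(toy_erp, *w) for name, w in WINDOWS_MS.items()}
```

With real data the same function would be applied per participant, condition, and electrode before the values enter the ANOVA.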

Fig. 2 Grand average ERP waveforms for four verb categories from nine representative electrode sites in the frontal and central areas

Analyses were then conducted separately for midline sites and lateral sites. For midline sites, a 4 (verb type: sensory, cognitive, speech, and motor) × 6 (electrode site: FZ, FCZ, CZ, CPZ, PZ, and OZ) repeated-measures ANOVA was performed on the mean amplitude of each time window. For lateral sites, a 4 (verb type: sensory, cognitive, speech, and motor) × 2 (hemisphere: left and right) × 11 (electrode site: F7/8, F3/4, FT7/8, FC3/4, T7/8, C3/4, TP7/8, CP3/4, P7/8, P3/4, and O1/2) repeated-measures ANOVA was performed on the mean amplitude of each time window. The Geisser–Greenhouse correction for non-sphericity was applied when appropriate, with uncorrected degrees of freedom and corrected probabilities presented. Given the primary interest of the study, only main effects or interactions involving the factor of verb type are reported. To tease apart significant effects, pair-wise comparisons were conducted using Bonferroni-corrected t tests to control the family-wise error rate. All reported differences from these comparisons had p values less than 0.05. Finally, Spearman’s rank order correlation (r_s) analysis examined the relations of mean N400 amplitudes to ratings of imageability and semantic association.
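To make the structure of the verb-type test concrete, the following is a minimal sketch of a one-way repeated-measures ANOVA on verb type. It is a simplification of the design described above: it assumes amplitudes have already been averaged over electrode sites, and it applies no Geisser–Greenhouse correction. It is not the authors' analysis code, and the data are toy or randomly generated values.

```python
import random

def rm_anova_oneway(data):
    """data[s][c] is the mean amplitude of subject s in condition c.

    Returns (F, df_effect, df_error) for the within-subject factor.
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[c] for row in data) / n for c in range(k)]
    subj_means = [sum(row) / k for row in data]
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    # The error term is the condition-by-subject interaction.
    ss_error = ss_total - ss_cond - ss_subj
    df_effect, df_error = k - 1, (k - 1) * (n - 1)
    f_value = (ss_cond / df_effect) / (ss_error / df_error)
    return f_value, df_effect, df_error

# Toy data: 3 subjects x 3 conditions (illustrative only).
toy = [[1, 2, 4], [2, 3, 6], [3, 5, 7]]
toy_f, toy_df1, toy_df2 = rm_anova_oneway(toy)

# Random data with the study's dimensions: 34 participants x 4 verb types.
rng = random.Random(1)
fake = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(34)]
_, df_verb, df_error = rm_anova_oneway(fake)  # degrees of freedom: 3 and 99
```

With 34 participants and four verb types, the degrees of freedom for the verb main effect are 4 − 1 = 3 and (4 − 1) × (34 − 1) = 99.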

Results

Behavioral data

One-way ANOVA showed that the four verb types differed in response time, F(3, 99) = 535.60, p < 0.001, partial η² = 0.94. Responses to sensory verbs (M = 494 ms, SD = 39 ms) were faster than to the other three types of verbs: cognitive verbs (M = 612 ms, SD = 44 ms), speech verbs (M = 597 ms, SD = 38 ms), and motor verbs (M = 610 ms, SD = 42 ms), all ps < 0.001. In addition, responses to cognitive verbs and motor verbs were slower than to speech verbs, both ps < 0.01.

Response accuracy rates were also different, F(3, 99) = 26.82, p < 0.001, partial η² = 0.45. Response accuracy rate was lower for sensory verbs (M = 90 %, SD = 6 %) than for cognitive verbs (M = 96 %, SD = 4 %), speech verbs (M = 97 %, SD = 3 %), and motor verbs (M = 97 %, SD = 4 %), all ps < 0.001.

ERP data

250–350 ms

Numerically, motor verbs elicited the greatest negativities, followed by sensory verbs, speech verbs, and then cognitive verbs. Figure 3 illustrates the mean amplitudes of the four verb conditions within this time window. At the six midline sites, the 4 (verb) × 6 (site) repeated-measures ANOVA revealed a significant main effect of verb type, F(3, 99) = 4.70, p = 0.004, partial η² = 0.13. Specifically, motor verbs and sensory verbs led to more negative-going activity than cognitive verbs. Speech verbs appeared as an intermediate case, not significantly different from motor, sensory, or cognitive verb conditions.

Fig. 3 Mean amplitudes of four verb categories in the 250- to 350-ms time window

Similarly, at the lateral sites, the 4 (verb) × 2 (hemisphere) × 11 (site) repeated-measures ANOVA revealed a significant main effect of verb type, F(3, 99) = 4.96, p = 0.003, partial η² = 0.13. Motor verbs showed significantly more negative-going activity than cognitive verbs. Sensory verbs and speech verbs fell in between and did not differ from each other or from the other two verb conditions. That is, the amplitude difference between sensory and cognitive verbs reached significance only at the six midline sites, not at the lateral sites. This topographic difference might reflect distinctions in neural representations among semantic categories, e.g., sensory and cognitive verbs (Binder et al. 2009; Kutas and Federmeier 2011; Wang et al. 2010).

350–450 ms

The main effect of verb type persisted into the 350- to 450-ms time window, with smaller effect sizes, F(3, 99) = 3.14, p = 0.03, partial η² = 0.09 at midline sites and F(3, 99) = 3.17, p = 0.03, partial η² = 0.09 at lateral sites. There was also a significant interaction of verb and hemisphere, F(3, 99) = 6.17, p = 0.001, partial η² = 0.16. Simple effect analysis indicated that cognitive verbs showed significantly lower negativities relative to motor and sensory verbs only in the right hemisphere, but not in the left hemisphere (Fig. 4). Further, the three-way interaction, F(30, 990) = 1.89, p = 0.003, partial η² = 0.05, showed that these differences were mostly evident in the lateral anterior region of the right hemisphere (i.e., F8, FT8, T8, and TP8).

Fig. 4 Scalp distributions of amplitude differences between verb categories in two time windows

Later time windows

No verb-related effect was detected. Because past research reported that the N700 was primarily evident in the frontal region, to increase the chance of detecting any verb effects, an ANOVA including rostrality as a factor was also performed on mean amplitudes within each of the three later time windows. However, no verb effect emerged in any time window, and posterior sites showed greater negativity than anterior sites.

Correlation analysis

The above analysis showed that verb type effects were most pronounced in the 250- to 350-ms time window and then abated and became right lateralized in the 350- to 450-ms time window, as evidenced by effect sizes and simple effect analysis. Correlation analysis therefore focused on the 250- to 350-ms time window, where variations in mean N400 amplitude across verb types were evidently present. (We also conducted correlation analysis in the 350- to 450-ms time window, which revealed essentially the same outcomes.)

Based on mean amplitudes, the four verb types were ranked as follows from the lowest negativities to the greatest negativities: cognitive verbs (1), speech verbs (2), sensory verbs (3), and motor verbs (4). Then, Spearman’s rank order correlation (r_s) analysis examined the relations of the rank of negativities with ratings of imageability and semantic association. Results showed the negativities were significantly correlated with ratings of imageability, r_s = 0.81, p < 0.001, n = 80, but not ratings of semantic association, r_s = −0.12, p > 0.30, n = 80. That is, the variation in mean N400 amplitudes across four verb categories largely corresponded to ratings of imageability.
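The correlation above can be illustrated with a small sketch. Because n = 80, it appears that each of the 80 verbs was assigned the negativity rank of its category and paired with its own mean rating. That reading is an assumption, and the item ratings below are invented for illustration. They are not the study's data.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Category ranks from the text: cognitive 1, speech 2, sensory 3, motor 4.
NEGATIVITY_RANK = {"cognitive": 1, "speech": 2, "sensory": 3, "motor": 4}

# Hypothetical (verb type, imageability rating) pairs, two per category.
toy_items = [
    ("cognitive", 2.1), ("cognitive", 2.8),
    ("speech", 3.5), ("speech", 3.0),
    ("sensory", 4.1), ("sensory", 3.9),
    ("motor", 6.4), ("motor", 6.0),
]
toy_rs = spearman([NEGATIVITY_RANK[t] for t, _ in toy_items],
                  [rating for _, rating in toy_items])
```

Because every verb in a category shares one rank, the coefficient reflects how well item ratings follow the category ordering, not an item-by-item relation with amplitude.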

Discussion

The present study examined ERP responses generated by four different types of event verbs: sensory, cognitive, speech, and motor verbs. Within the first half of the N400 time window (250–350 ms), motor verbs elicited the greatest negativity, whereas cognitive verbs elicited the smallest. Sensory and speech verbs appeared intermediate and did not differ from each other. In addition, sensory verbs produced greater negativity than cognitive verbs at midline sites. In the second half of the N400 time window (350–450 ms), amplitude differences across verb types decreased and became significantly lateralized, only evident in the right hemisphere. As shown by correlation analysis, the variation in mean N400 amplitudes across verb types largely corresponded to the level of imageability (Figs. 1, 3), suggesting that imageability was the primary factor accounting for N400 amplitude variations across these verb categories.

The role of imageability in verb processing seems to be corroborated by the right lateralization of N400 amplitude differences in the later portion of the N400 time window. Kounios and Holcomb (1994) found right hemisphere differences in N400 between concrete words and abstract words, which mainly differed in level of imageability (also see Nittono et al. 2002). The present study therefore provides evidence that level of imageability appears to be a salient, characterizing feature that not only distinguishes abstract words from concrete words in general, but also plays an important role in the semantic specification of different categories of event verbs, which are generally considered more abstract than object nouns (Bird et al. 2000). In Grossman et al. (2002), motion verbs, relative to cognition verbs, were linked to a right-lateralized predominance in caudate activation, which Grossman et al. argued reflected the spatial attributes of motion verbs. This seems to provide a potential explanation for the right hemisphere differences between motor/sensory verbs and cognitive verbs revealed in the present study, considering that both motor verbs and sensory verbs possess spatial attributes. However, further research is certainly needed to identify the specific source of the right hemisphere differences found in the present study.

Figures 1 and 3 indicate that motor verbs possessed the lowest level of semantic association, yet the greatest mean amplitude of N400. It is possible that at this particular stage, activation of semantic network is a secondary factor, relative to activation of imagery information, contributing to verb processing. For example, Barber et al. (2010) found that after controlling for imageability, sensory words elicited greater N400 than did motor words, which might be an effect of this secondary factor, i.e., different levels of semantic association between the two types of words. Further, the average imageability rating in the present study for motor verbs was 6.29 on a 7-point scale, whereas sensory, speech, and cognitive verbs fell on the lower part of the scale, all below 4.30. Therefore, it is possible that, when imageability level is low, verb processing starts to draw heavily upon information stored in the association with semantic neighbors. Findings in the present study with regard to greater negativities of sensory verbs relative to cognitive verbs in the 350- to 450-ms time window thus may represent an integration of a greater amount of information drawn from semantic networks of sensory verbs. Taken together with the findings reported by Barber et al. (2010), this appears to be a topic worth continued investigation.

This study included speech verbs, e.g., tell, talk, and describe. Event concepts denoted by these verbs appear to possess both non-observable mental attributes, such as cognitive process, and observable physical attributes, such as facial muscle movement. Intuitively, visual–perceptual attributes are essential for concepts denoted by motor verbs, less essential for speech verbs, and even less so for cognitive verbs, which can be completely void of physical attributes. This was confirmed by the intermediate level of mean imageability rating for speech verbs between motor and cognitive verbs. Furthermore, the N400 generated by these three verb types corresponded to their respective imageability ratings. On the one hand, as discussed earlier, these findings indicate salience of visual–perceptual attributes in the representation and processing of speech verbs relative to cognitive verbs. On the other hand, they implicate the significance of non-physical attributes, e.g., mental attributes, in the representation and processing of speech verbs relative to motor verbs. In fact, evidence from imaging research suggests that meaning representation of speech verbs differs from that of motor verbs. For example, Kemmerer et al. (2008) compared different types of action verbs: running, speaking, hitting, and cutting. Their fMRI data showed that, for the processing of running, hitting, and cutting verbs, activation patterns somatotopically mapped sectors in the primary motor and pre-motor areas, but no significant activation in response to speaking verbs was found in the mouth sector of these brain regions. Speech verbs have been understudied compared with the other types of verbs. Findings of the present study seem to indicate a unique semantic specification of speech verbs and thus the need for more research to develop a further understanding of this verb category.

To detect a potential late ERP component, the N700, this study also examined verb effects within a later time frame, 450–800 ms post-verb onset. The N700 has generally been considered a reflection of imagery processing (Lee and Federmeier 2008; West and Holcomb 2000). However, despite variation in imageability, analysis of the present data did not reveal verb-related effects. The lack of an N700 parallels an earlier study on the concreteness effect also conducted in Chinese (Zhang et al. 2006), thus indicating that language might be one of the contributing factors to the discrepancy across studies. Alternatively, as Barber et al. (2013) argued, the late negativity related to concreteness might not reflect an imagery process, but instead a controlled process to maintain a mental representation with a greater amount of multimodal attributes possessed by concrete words. In that case, the absence of N700 from studies using Chinese words may indicate a difference in mental representations at the late stage of word processing between Chinese and morphologically rich languages such as English (Lee and Federmeier 2008) and German (Kellenbach et al. 2002). This is obviously a speculation in need of future evaluation.

The present study focused on two semantic features, imageability and semantic association, and revealed greater salience of imageability relative to semantic association in verb processing. However, the different types of verbs utilized in this study might also systematically differ in other dimensions. If their differences in one of these dimensions, e.g., valence, coincide with those in imageability, it would not be plausible to consider imageability as the prominent factor contributing to variations in N400 amplitude across verb categories. Further, Barber et al. (2013) found that imageability and semantic association, combined with a set of other lexical and semantic factors (e.g., familiarity), still could not fully account for N400 variation during word processing. Therefore, future research needs to continue to explore and take into consideration other potentially relevant factors in order to gain a more comprehensive understanding of verb processing. In an effort to do so, a group of 15 university students was recruited to evaluate the verbs in this study with regard to two additional features: valence and age of acquisition. For valence, they rated on a 7-point scale the degree to which each verb could evoke a positive or negative emotion. On the scale, “1” was labeled as “very negative” and “7” was labeled as “very positive.” For age of acquisition, they indicated, as accurately as they could, the age when they first learned the meaning of the verb. Inter-rater reliability was good for both types of ratings, α = 0.84 for valence and α = 0.88 for age of acquisition. Table 2 presents mean ratings of four verb conditions.

Table 2 Ratings of valence and age of acquisition for four verb categories

Spearman’s rank order correlation analysis showed that negativities in the N400 time window were not correlated with valence ratings, r_s = −0.15, p = 0.20, but were negatively correlated with age of acquisition, r_s = −0.60, p < 0.001. That is, verbs learned earlier in life tended to elicit greater N400. As the words learned earlier in life tend to show greater imageability, imageability might be partially responsible for this correlation. After controlling for imageability, the correlation between the N400 and age of acquisition was weakened, but still significant, r_s = −0.44, p < 0.001. Conversely, after controlling for age of acquisition, the correlation between the N400 and imageability was also still significant, r_s = 0.73, p < 0.001. These analyses suggest that imageability and age of acquisition, though overlapping with one another, make unique contributions to the processing of these verbs. This, on the one hand, supports the prominent role of imageability in verb processing and, on the other hand, points to the multifaceted differences across different verb categories, suggesting a need for future research to explore potential interplays among different factors in verb representation and processing.
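The text does not state how one variable was controlled for when correlating the other with the N400. One common approach, assumed here purely for illustration, is a first-order partial correlation computed from the three pairwise rank correlations. The correlation between imageability and age of acquisition is not reported, so the coefficients below are hypothetical and the study's values cannot be reproduced from this sketch.

```python
from math import sqrt

def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z from both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical pairwise coefficients, for illustration only.
toy_partial = partial_corr(r_xy=0.6, r_xz=0.5, r_yz=0.5)
```

When the controlled variable is unrelated to both of the others, the partial correlation equals the original coefficient. The more it overlaps with them, the more the coefficient changes.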

One limitation of the present study is that it examined only simple event concepts such as to smell, to think, to talk, and to walk. There are complex event verbs in our lexicon representing more complex events, especially social events such as to marry, to invest, and to defend. Relative salience of semantic features may vary considerably across different conceptual domains. Further research with more refined conceptual categorization is clearly needed in order to develop a better understanding about the semantic features involved in concept representation. Another limitation of the study is that, although the addition of speech verbs, as a fourth verb category, provided the opportunity to examine graded variations in imageability and in the N400 across verb categories, an item analysis might be able to offer more insight on the strength of the relationship between N400 amplitude and imageability. Furthermore, an item analysis with a regression approach would be able to quantify the relative salience of imageability, semantic association, and age of acquisition. Unfortunately, the procedure employed by this study did not allow an item analysis, and the conclusion was based on the common practice of category-level comparison. The value of item-level analysis should be acknowledged. Future ERP studies with procedures incorporating item analysis should be able to further benefit research on concept representation and language processing.

In sum, this study evaluated the relative contribution of imagery processes versus semantic associates in the processing of sensory verbs, cognitive verbs, speech verbs, and motor verbs. For these verb categories, the level of imageability appeared to be a more salient factor than the level of semantic association within the N400 time window. Research and analysis on semantic features involved in verb processing and concept representation may provide the basis for a more refined categorization of event verbs in particular and words in general, which may have implications for learning, research, and even clinical practice in the future.