Introduction

In cognitive psychology, one viewpoint holds that the body plays a crucial role in the way the mind processes and interacts with the world. This movement, labelled the embodied cognition theory, suggests that human cognition is deeply grounded in the sensorimotor system (Barsalou 2008). Over the last decade, a growing number of studies have been devoted to supporting this theory, considering higher cognitive functions such as memory, reasoning or language.

According to the embodied cognition theory, language is related to the sensorimotor system. Following this view, many studies have aimed to understand the role of the sensorimotor system in the comprehension and production of language describing actions (Pulvermüller 2005). For example, this motor involvement has been demonstrated in imaging studies, in which the activation of motor areas during action verb processing was observed (Hauk et al. 2004). This indicates that the motor system is recruited during verb perception, thus reflecting the embodied nature of language. Behavioural studies have also provided arguments in favour of this crosstalk. One example is the study by Glenberg and Kaschak (2002), who demonstrated an interaction between the movement described in a sentence and the subsequent execution of a movement. Comprehending a sentence that implies action in one direction (e.g., “close the drawer” is a statement that involves a movement away from the body) interfered with real action in the opposite direction (movement toward the body). These data support the view that language understanding is grounded in bodily action. Thus, action word processing can automatically affect motor processes, and this occurs even when the words are processed subliminally (Boulenger et al. 2008).

Significantly, some researchers have shown that the relationship between action and language is not restricted to action production but also appears when action is simply observed (Bidet-Ildei et al. 2011, 2016; Springer and Prinz 2010), thus reflecting the use of the mirror system during action language processing (Tettamanti et al. 2005). One example of the relationship between action observation and action word processing was studied by Bidet-Ildei and her colleagues (Bidet-Ildei et al. 2011). In this study, the authors investigated the influence of reading action verbs on the capacity to detect a point-light human action embedded in a scrambled mask. Participants saw a verb used as a prime (run, kick or throw) and then had to judge whether a human action was present in a visual sequence. The results showed a facilitation effect whenever the prime and the biological movement represented the same action. These results were interpreted as suggesting that action perception and action language processing are subtended by common sensorimotor representations. In the same vein, Springer and Prinz (2010) showed that action semantics modulate the judgment of observed actions. Specifically, they demonstrated that the capacity to anticipate the end of an action sequence was influenced by the prior presentation of a word. This effect was modulated by the type of word (action verbs vs non-action verbs, dynamic vs static actions and fast vs slow actions). Taken together, these data support the idea that language affects action observation and action production.

If action observation is strongly affected by action words, it might be hypothesised that the reverse pattern also holds and that observing an action will affect the recognition of action words. Nevertheless, the majority of studies in this field have considered the effect of language (reading or listening to a word or a sentence) on action, and only a few have asked whether this effect could be bi-directional. Liepelt and his colleagues (Liepelt et al. 2012) aimed to assess this hypothesis of bi-directionality. In a first experiment, participants had to perform an opening or closing gesture with the hand according to the colour of a word. This word could be either “open” or “close”. In congruent trials, i.e., when the word and the body response were semantically identical (open-open or close-close), participants produced the gesture faster. In the second experiment, participants had to say “open” or “close” according to the colour of a square shown within a picture of an open or closed hand. In the same manner, participants were faster to produce a congruent word than an incongruent one. These results demonstrated the bi-directional crosstalk between action and language. Significantly, in a third experiment, which included a neutral condition, the authors showed that the observed effects reflected interference when action and language were incongruent rather than facilitation when they were congruent.

Nevertheless, in that study, the presented actions were static pictures of one part of the body. The impact of observing a point-light biological movement on action verb processing has never been specifically evaluated, and doing so is the aim of the present study. The use of videos makes it possible to present directly the motion that is only suggested when pictures are used. More specifically, point-light displays have the advantage of representing only the kinematics that characterise a living organism, whereas other types of biological information (e.g., skin texture, limb shape) are removed (Johansson 1973). In the present study, we examine whether the observation of a point-light human movement can affect the subsequent processing of an action verb.

Method

Participants

Eighteen French-speaking, 18- to 26-year-old (M = 19, SD = 1.98) university students (12 male, 17 right handed) participated in this experiment. Participants were recruited from the University of Poitiers in exchange for course credit. All participants had normal or corrected-to-normal vision and had no history of motor, perceptual or neurological disorders. Moreover, all participants provided their written informed consent prior to their inclusion in the experiment. They were also naive to the purpose of the study.

Apparatus

The participants sat on a chair in front of a table in a dimly lit room. A computer screen (spatial resolution of 1280 × 800 pixels and refresh rate of 60 Hz) was on the table. A response box was placed on the table between the participants and the computer screen so that the participants could easily provide their responses by pressing the button associated with the “yes” or “no” answer.

Material

The videos used as primes were point-light displays (Johansson 1973) representing human actions (running, walking, jumping, throwing, and pushing). They included 13 points of light located on the main body joints (shoulders, elbows, wrists, hips, knees and ankles) and the head. We used the coordinates provided by a point-light actions corpus that is freely accessible on the following website: http://astro.temple.edu/~tshipley/mocap/dotMovie.html.

There were two types of stimuli: verbs and pseudo-verbs. Pseudo-verbs were orthographically and phonologically legal sequences that were created by changing one or two letters in each verb, resulting in a list of 15 pseudo-verbs and 15 verbs. Verbs could be action verbs that were congruent with the prime (CAV: run, walk, jump, throw, push), action verbs that were incongruent with the prime (IAV: eat, sing, drive, touch, sit), or neutral verbs, namely, verbs that did not describe an action (NAV: confess, know, refuse, choose, have to). All verbs were in French and in the infinitive form, and they were matched, using the lexique.org database (http://www.lexique.org/), for relevant lexical variables including word frequency, number of letters and number of syllables (see Table 1).

Table 1 Mean values of verb frequency, number of letters and number of syllables for congruent action, incongruent action and non-action verbs

Moreover, the imageability of the words was assessed using a questionnaire. Participants rated how easily they could mentally represent each of a set of verbs, using a scale ranging from “very easily” to “with great difficulty”. The questionnaire consisted of 35 verbs, of which some were the verbs used in the task and 20 were distractors. The aim of this questionnaire was to verify the homogeneity of the stimuli and to ensure that action verbs related to the prime and action verbs not related to the prime had equivalent imageability. The statistical analysis revealed that non-action verbs were harder to imagine than action verbs (p < 0.001 for each comparison), whereas no difference appeared between congruent and incongruent action verbs (p = 0.35). These results ensured that any difference obtained during the experimental task between the congruent and incongruent action verbs would reflect a congruency effect rather than differences in the verbal material.

Procedure

The experimental session included 150 trials (5 presentations of each of the 30 stimuli), presented in a random order for each participant. Each trial proceeded as follows (see Fig. 1): first, a fixation cross appeared (500 ms), then a prime video (1000 ms), and finally, following another fixation cross (500 ms), the stimulus (a word) appeared. The stimulus stayed on the screen until the participant responded. The task of the participant was to judge, as quickly and accurately as possible, whether the word that was presented existed in the French language. The “yes” answer was always given with the dominant hand of the participant (right-handed participants pressed the button “5” of the response box, and left-handed participants pressed “1”), whereas the “no” answer was given with the other hand (right-handed participants pressed “1”, and left-handed participants pressed “5”).

Fig. 1

Procedure of the experimental task. The arrow represents the sequence of one trial. A central fixation cross was presented for 500 ms, then a point-light display (run, walk, jump, throw, or push) was displayed for 1000 ms, then a central fixation cross was presented again for 500 ms, and finally the stimulus appeared on the screen. It could be a verb (congruent or incongruent with the prime) or a pseudo-verb. Participants responded “yes” if the stimulus was a French word and “no” if it was not.
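The trial structure described above can be summarised in a short sketch. It is illustrative only: the stimulus labels are placeholders (the actual French items are not listed here), the function names are hypothetical, and the way primes were paired with words is deliberately left out because it is not specified in the text. The conversion to screen refreshes assumes the 60 Hz display mentioned in the Apparatus section.

```python
import random

# Durations (ms) come from the procedure; the refresh rate from the apparatus.
REFRESH_HZ = 60
TIMELINE_MS = [("fixation", 500), ("prime", 1000), ("fixation", 500)]


def ms_to_frames(duration_ms, refresh_hz=REFRESH_HZ):
    """Number of screen refreshes needed to show an event for duration_ms."""
    return round(duration_ms * refresh_hz / 1000)


def build_trials(stimuli, repetitions=5, seed=None):
    """Repeat every stimulus `repetitions` times and shuffle the order."""
    rng = random.Random(seed)
    trials = [s for s in stimuli for _ in range(repetitions)]
    rng.shuffle(trials)
    return trials


def response_keys(right_handed):
    """Map answers to response-box buttons; "yes" is on the dominant hand."""
    if right_handed:
        return {"yes": "5", "no": "1"}
    return {"yes": "1", "no": "5"}


# Placeholder labels (assumption): 15 verbs and 15 pseudo-verbs.
stimuli = [f"verb_{i:02d}" for i in range(1, 16)]
stimuli += [f"pseudo_{i:02d}" for i in range(1, 16)]

trials = build_trials(stimuli, seed=1)  # one shuffled order per participant
frames = [(name, ms_to_frames(ms)) for name, ms in TIMELINE_MS]
```

Using a separate seed (or no seed) for each participant gives every participant a different random order of the same 150 trials.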

After the participants completed the experimental task, a short questionnaire was administered to them. The aim of this questionnaire was to verify whether each action of the primes had been recognised by the participants.

The experiment lasted half an hour with a break taken at the halfway point of the experiment.

Data Analysis

Response time and accuracy were recorded to test the hypothesis that observing an action facilitates the processing of subsequent action verbs. Thus, shorter response times and fewer errors should be obtained for the CAV compared to the IAV and NAV.

Significant differences were examined with an ANOVA, with verb type (congruent action, incongruent action, non-action) as a within-subject factor. Paired comparisons were performed using the Newman-Keuls procedure. The significance level was fixed at p < 0.05.
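For readers less familiar with within-subject designs, the F ratio of a one-way repeated-measures ANOVA can be sketched as follows. This is a minimal illustration in plain Python, not the software used for the reported analysis; the function name is hypothetical, the input is assumed to hold one mean response time per participant and condition, and the Newman-Keuls comparisons are not covered.

```python
def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA.

    data: list of rows, one per participant; each row holds one value per
    condition, with conditions in the same order in every row.
    Returns (F, df_effect, df_error).
    """
    n = len(data)      # number of participants
    k = len(data[0])   # number of conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    # Partition the total variability into condition, participant and
    # residual (condition x participant) components.
    ss_effect = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subjects = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_effect - ss_subjects

    df_effect = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_effect / df_effect) / (ss_error / df_error)
    return f_value, df_effect, df_error


# Made-up numbers for illustration only; these are not data from the study.
demo = [[1, 2, 4], [2, 3, 4], [3, 5, 6]]
f_value, df_effect, df_error = rm_anova_oneway(demo)
```

Removing the between-participant variability from the error term is what distinguishes this analysis from a between-subject ANOVA on the same numbers.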

Results

Because of a high rate of accuracy for each type of stimulus (>96%), the data analyses concerned only the response times for correct answers.
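Restricting the analysis to correct answers amounts to the following aggregation step, shown here as a hedged sketch. The record layout (keys `participant`, `verb_type`, `rt_ms`, `correct`) and the function name are assumptions made for the example, and the trial records are invented.

```python
from collections import defaultdict
from statistics import mean


def mean_correct_rt(trials):
    """Return {(participant, verb_type): mean RT in ms}, correct trials only."""
    buckets = defaultdict(list)
    for t in trials:
        if t["correct"]:
            buckets[(t["participant"], t["verb_type"])].append(t["rt_ms"])
    return {key: mean(rts) for key, rts in buckets.items()}


# Made-up trial records for illustration; these are not data from the study.
demo_trials = [
    {"participant": 1, "verb_type": "CAV", "rt_ms": 560, "correct": True},
    {"participant": 1, "verb_type": "CAV", "rt_ms": 580, "correct": True},
    {"participant": 1, "verb_type": "CAV", "rt_ms": 900, "correct": False},
    {"participant": 1, "verb_type": "IAV", "rt_ms": 610, "correct": True},
]
cell_means = mean_correct_rt(demo_trials)
```

Each participant then contributes one mean per verb type, which is the input format a within-subject ANOVA expects.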

Analysis revealed that response times varied according to the verb type (F(2,36) = 4.526; p < 0.02; ηp² = 0.845). The response times for congruent action verbs (M = 573.76 ms, SD = 89.64) were shorter than those for incongruent action verbs (M = 601.09 ms, SD = 112.8) and non-action verbs (M = 611.82 ms, SD = 125.56). Newman-Keuls post-hoc analyses revealed that these differences were significant (p < 0.05 each). However, there were no differences in the response times between incongruent action verbs and non-action verbs (p > 0.4).

These results point only to a facilitation effect when the stimulus and the prime are congruent, with no sign of interference when they are incongruent (see Fig. 2).

Fig. 2

Mean response time according to verb type: CAV (congruent action verbs), IAV (incongruent action verbs) and NAV (non-action verbs). The error bars indicate the standard error of the mean. An asterisk (*) indicates a significant difference with p < 0.05.

To determine whether the facilitation obtained for congruent verbs was due to the prime and not to a linguistic difference between the congruent and incongruent verbs, a post-experiment was conducted. In this experiment, nineteen French-speaking, 18- to 21-year-old (M = 19, SD = 0.67) university students (15 male, 17 right-handed) performed the same task as that described above. However, no priming effect was induced in this condition. The videos of point-light human actions were replaced by videos representing a scrambled point-light display (a set of points moving with biological kinematics and placed at random spatial positions). In this condition, no difference was observed in the judgements of congruent and incongruent verbs (F(2,36) = 1.3; p > 0.28). These results confirmed that the effect obtained in the main experiment was due to the point-light human movement used as the prime.
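One common way to build such a scrambled display is to give each point its own random spatial offset while leaving its trajectory untouched. The sketch below illustrates that idea under stated assumptions: the function name and data layout are hypothetical, start positions are drawn uniformly within the screen, and this is not necessarily the exact scrambling procedure used for the control videos.

```python
import random


def scramble_display(trajectories, width, height, seed=None):
    """Relocate each marker to a random start position, keeping its motion.

    trajectories: list of markers, each a list of (x, y) positions per frame.
    Every marker receives its own constant offset, so frame-to-frame
    displacements are unchanged while the body configuration is destroyed.
    Later frames are not clipped and may fall outside the screen area.
    """
    rng = random.Random(seed)
    scrambled = []
    for path in trajectories:
        x0, y0 = path[0]
        dx = rng.uniform(0, width) - x0
        dy = rng.uniform(0, height) - y0
        scrambled.append([(x + dx, y + dy) for x, y in path])
    return scrambled


# Two made-up markers over two frames, for illustration only.
demo_paths = [[(0.0, 0.0), (1.0, 2.0)], [(5.0, 5.0), (4.0, 5.5)]]
scrambled = scramble_display(demo_paths, width=1280, height=800, seed=0)
```

Because only the starting positions change, the local kinematics of every point are preserved, which is what allows the control to isolate the contribution of the global human form.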

Discussion

The present study assessed whether perceiving a biological action can affect a lexical decision task. To do so, we compared judgments of verbs after the presentation of a point-light display representing human movements. Our results showed that observing a point-light movement facilitates the processing of the verb related to the action depicted by the display. As we did not counterbalance the verbs across conditions (i.e., congruent action verbs and incongruent action verbs were two distinct lists of verbs), our results could be due to a simple material effect, with the verbs of the first list being easier to process than the verbs of the second list. However, we conducted a post-experiment manipulation check with scrambled point-light displays that allowed us to exclude this possibility. Since no difference was observed between the two lists of verbs with a scrambled point-light display, the facilitation effect can be attributed to the observation of the biological point-light movement.

This finding confirms the existence of a relationship between action observation and action language processes and demonstrates for the first time that point-light human movement can affect subsequent judgments of action verbs. Although the effect of processing action verbs on judgments of point-light display has already been demonstrated (Bidet-Ildei et al. 2011, 2016), the present study shows the reciprocal link. Consequently, our findings confirm with a point-light procedure the bi-directionality of the action-language relationship (Liepelt et al. 2012).

Moreover, our data provide evidence that action observation can affect language processing, which supports the idea that both action observation and action execution can affect language processing and that action execution and action observation share common processes (Iacoboni and Dapretto 2006). Some authors have postulated that the mechanism sustaining the crosstalk between language and action is the activation of common sensorimotor representations (for a review, see Fischer and Zwaan 2008). These sensorimotor representations should facilitate action execution or detection when activated by language (Bidet-Ildei et al. 2011), and, reciprocally, action execution or observation allows people to activate sensorimotor representations that intervene during language comprehension. Our results can be interpreted within this framework: sensorimotor representations of the various actions were activated when observing the point-light displays, thus shortening the time required to determine whether the verb that was presented existed in the French language. Moreover, our study revealed that the observation of point-light human movements is sufficient to affect the first stages of language processing, that is, lexical access and decision. This is in agreement with the idea that the lexical access of action words is facilitated by sensorimotor representations (Hauk and Pulvermüller 2004).

In conclusion, the present study shows for the first time that point-light human movements can affect the subsequent processing of action words. Significantly, the effect is obtained for a biological movement but not for a scrambled point-light movement, which suggests that the priming effect is induced by specific human patterns of motion rather than simply by the presence of dynamic stimuli. Finally, this study highlights the benefit of using point-light displays to assess the relationship between action and language. Because point-light displays can easily be modified, this methodology offers the possibility of modulating the biological aspect of the human movement. Consequently, this study represents a necessary first step before assessing more precisely which parameters of human movement can affect the relationship between action observation and action word processing. Further research will have to investigate this question more specifically.