Embodied cognition, also known as distributed or grounded cognition, posits that cognition does not occur solely in the brain, but that it also employs the rest of the body and the environment (see Barsalou 2008). In other words, the mind is extended beyond the boundaries of the head (Clark and Chalmers 1998). This implies that the capacity limits of working memory and its visuospatial processing components (see Castro-Alonso and Atit this volume, Chap. 2) can also be extended to the body and the environment.

This extension of the mind has been investigated by cognitive load theory, the instructional theory that considers the limitations of working memory and visuospatial processing for learning (see Castro-Alonso et al. this volume-b, Chap. 5). As proposed by Choi et al. (2014), a new model of cognitive load theory can now include the limits set by the body and the environment, and must consider bodily and environmental variables that could affect learning.

When the three agents—brain, body, and environment—act together, learning is usually enhanced. For example, Kiefer and Trumpp (2012) reviewed diverse embodied activities that led to enhanced cognition for processes such as reading and writing, processing numbers, memorizing concepts and objects, and remembering events. Among this diversity of embodied activities, we focus here on object manipulations and gestures.

This chapter has three main aims: (a) to provide different research perspectives explaining the positive effects of embodied cognition on learning and visuospatial processing (Sects. 7.1, 7.2 and 7.3); (b) to describe various investigations relating object manipulation to education in sciences and visuospatial processing (Sect. 7.4); and (c) to give examples of studies that show a positive relationship between gesturing, science learning, and visuospatial processing (Sect. 7.5). Most of the visuospatial instruments and abilities described in this chapter are detailed in Castro-Alonso and Atit (this volume, Chap. 2) and in Castro-Alonso et al. (this volume-a, Chap. 8).

Regarding the first aim of the chapter, various non-mutually exclusive phenomena predict positive effects of embodiment and body actions on learning. We have grouped these phenomena into effects that are triggered: (a) solely by learners executing the actions, and (b) by learners either executing the actions or observing others (e.g., instructors and peers) executing them. These perspectives and examples are summarized in Table 7.1 and described next.

Table 7.1 Phenomena that predict positive effects of embodiment on learning

7.1 Executing Body Actions

We consider three research perspectives that have investigated the positive effects on cognition triggered by executing body actions: (a) offloaded cognition, (b) generative learning, and (c) physical activity. These three areas sometimes overlap. For example, when reviewing the learning effects of taking notes, R. S. Jansen et al. (2017) remarked that these body actions share external storage and encoding benefits, which are related, respectively, to the offloaded cognition and generative learning that we describe next. We also describe in this section the positive effects that physical activity has on cognitive processes.

7.1.1 Offloaded Cognition

As reviewed by Risko and Gilbert (2016), the two embodied mechanisms to offload cognitive activity in the brain involve placing the cognitive demands onto the body or into the world. An example of the former, in which the body helps to process the task, is typically observed in difficult mental rotations, in which tilting the head can reduce the degrees of rotation that must be performed in the mind. An example of the latter, in which the environment helps to process the task, can be observed in mental folding tasks, in which drawing sketches can help in reaching the correct solutions. Usually, body and environment are involved together in offloading cognition, as the following examples with object manipulation and gestures show.

Regarding manipulations, the experiment by Vallée-Tourangeau et al. (2016) compared the mental arithmetic performance of 52 psychology undergraduates and postgraduates (87% females) in two conditions of different embodiment. The embodied condition presented number tokens to the participants, which could be manipulated during the mental calculations. In the non-embodied condition, the participants kept their hands palm down and still on the table. Results showed that, in the groups with total working memory reduced through articulatory suppression (mental repetition of a short word), the embodied condition was more efficient (greater accuracy and fewer errors) than the non-embodied condition. In other words, the interference that reduced working memory was less problematic when the participants could manipulate the number tokens. Arguably, by offloading the arithmetic task to the body and the environment with manipulative tokens, the participants could manage with the few working memory resources left after articulatory suppression.

An example with gesturing for health science tasks is provided by Macken and Ginns (2014), who investigated 42 adults (74% females) studying illustrations and texts about the structure and function of the human heart. Half of the participants were instructed to gesture while studying (e.g., using the finger to make connections between illustrations and texts), and the other half, the control group, did not gesture. Results showed that the gesture condition outperformed the non-gesture group on a retention test of terminology and a test of comprehension. Ginns and Kydd (2019) replicated this study with 30 adults (67% females): the gesture condition again outperformed the non-gesture condition on both the retention and comprehension tests, and also rated the lesson as less difficult.

Hegarty and Steinhoff (1997) provide an example of physics instruction where cognitive processing was offloaded to the environment by note-taking. In two experiments with a total of 186 undergraduates, the authors investigated the instructional effects of making notes on diagrams showing the mechanics of pulley, gear, and lever systems. Mental folding of the participants was assessed with the Paper Folding Test, and the scores were used to perform a median split between low and high mental folding students. Participants with low visuospatial processing obtained better results when allowed to make notes on the diagrams. In other words, the limitations of low mental folders in processing the physics displays were compensated by notes that acted as scaffolds for understanding the visualizations. In contrast, high mental folders did not benefit from note-making. For these high visuospatial processing students, their cognitive capacity was enough to cope with the challenging learning visualizations, so they were not helped by offloading cognition into notes.
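For readers less familiar with the median split procedure used in this and later studies, the following sketch illustrates it with hypothetical Paper Folding Test scores (the participant identifiers and values are invented for illustration, not data from Hegarty and Steinhoff 1997):

```python
# Illustrative sketch of a median split on hypothetical Paper Folding Test scores.
import statistics

paper_folding_scores = {  # hypothetical participants and scores
    "p01": 6, "p02": 14, "p03": 9, "p04": 17, "p05": 11, "p06": 13,
}

# Compute the sample median and classify each participant relative to it.
median_score = statistics.median(paper_folding_scores.values())
groups = {
    pid: ("high" if score > median_score else "low")
    for pid, score in paper_folding_scores.items()
}

print(median_score)  # 12.0
print(groups)        # {'p01': 'low', 'p02': 'high', 'p03': 'low', ...}
```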

In contrast to this supporting evidence for offloaded cognition, there is also a negative side to relying on external devices for our cognitive processes, as has been shown for visual memory. For example, Henkel (2014, Experiment 1) investigated how taking photos of objects in an art museum affected memory for them. In the study, 27 undergraduates (78% females) either took photographs of 15 museum pieces (e.g., paintings, pottery, sculptures) or simply observed another 15 pieces. The next day, a memory test showed that the photographed objects were remembered less often and in less detail than the observed pieces that were not recorded. Thus, when the participants relied on offloading cognition to the environment (the camera), they used their own cognition less effectively to memorize the items (see also the review by Marsh and Rajaram 2019).

7.1.2 Generative Learning

As reviewed by Wittrock (1989), generative cognitive processes involve relating the learning contents to personal knowledge, beliefs, and experience. In other words, the students actively construct meaning from the contents and make them personal. Instructors can teach students several methods to construct this personal meaning. For example, to better understand a text passage, Wittrock (1989) recommended students’ actions such as writing personal questions or summaries, giving examples, and drawing their own graphs or pictures.

It can be noted that these actions also involve offloading cognition. The critical addition is that generative actions are personal and original. For example, although writing a question can offload cognition to the environment, it only becomes an instance of generative learning when the question is the student’s own rather than copied from the teacher.

To the list of actions by Wittrock (1989), Fiorella and Mayer (2016b) added the activities of summarizing and taking notes (see also R. S. Jansen et al. 2017), self-explaining (see also Chi et al. 1994), imagining (see also Ginns et al. 2003), preparing to teach and teaching (see also Fiorella and Mayer 2013; Hoogerheide et al. 2019), and enacting. From this diversity of generative actions, the focus of this chapter is on enacting, chiefly manipulation and gestures. However, in this section we consider the action of drawing, as it is a highly visuospatial generative activity.

Study 1 in Fiorella and Mayer (2017) investigated 108 undergraduates (70% females) drawing maps and illustrations to better understand a biology text about the human respiratory system. Also, students’ visuospatial processing was calculated by averaging the scores of: (a) the Cube Comparisons Test, a common mental rotation test with three-dimensional (3D) shapes; and (b) the Paper Folding Test, a typical instrument of mental folding (see Castro-Alonso and Atit this volume, Chap. 2, for details on these tests). Results showed that both the spatial generative strategies and visuospatial processing independently predicted effective learning from the scientific text. Similarly, in an experiment with 72 undergraduates learning chemistry from a multimedia module, S. P. W. Wu and Rau (2018) reported the effects of drawing chemical structures on paper while studying on the computer. Results showed that the conditions in which the program prompted the students to draw were more efficient (higher learning performance per time on task) than the condition without these illustration prompts. Similarly, when 120 undergraduates (64% females) were randomly assigned to study a geoscience text in different learning conditions, Wiley (2019) observed that the best performance was in the group instructed to sketch the contents.

Nevertheless, generative activities are not always productive for learning. As predicted by cognitive load theory (see Sweller et al. 2011; see also Castro-Alonso et al. this volume-b, Chap. 5), because working memory is limited, generative activity that involves too much working memory processing may interfere with learning. This is noticeable when the learning materials are complex for the students. For example, Ploetzner and Fillisch (2017) investigated 52 undergraduates (83% females) studying a complex animation about a four-stroke engine. The participants were randomly assigned to draw or to reflect on what they observed in the animation. Findings revealed that the overall structures were recognized less frequently in the drawing condition than in the reflection group. Also, in three experiments with a total of 370 university participants (66% females), Stull and Mayer (2007) compared students building concept maps about the reproductive barriers between species with conditions where the maps were already completed. A consistent finding of the three experiments was that the generative actions of building the maps were counterproductive to learning the biology topic. In short, instructors and teachers should pursue balanced learning activities, where generative actions are included in a quantity that is sufficient but not excessive.

7.1.3 Physical Activity

Pothier and Bherer (2016) defined physical activity as body movements produced by skeletal muscles that use energy. This activity includes aerobic training, resistance training, dance, yoga, and tai chi, among others. These diverse embodied activities tend to show positive effects on cognition (Pothier and Bherer 2016). For example, Fenesi et al. (2018) investigated 77 undergraduates (78% females) studying a 50-min video lecture about the perception of forms. The students were randomly assigned to one of three groups. The exercise breaks condition performed three 5-minute breaks involving gross motor movement exercises (e.g., high knees, heel taps, and jumping jacks). The non-exercise breaks condition performed three 5-minute breaks playing a puzzle videogame. The control condition studied the lecture continuously without breaks. A manipulation check revealed that the exercises increased heart rate to approximately 70% of the maximum for young adults, indicating that the exercises were vigorous. The main results showed that the group taking exercise breaks outperformed the non-exercise breaks group and the control condition on both attention and memory scores.

Exercise and sports can also be productive for visuospatial processing, although the type of activity matters. In a study by Moreau et al. (2012), 62 undergraduate students (42% females) attempted the 3D Mental Rotations Test. Subsequently, the participants completed weekly 2-h sport training sessions for a total of 10 months. Half of the participants trained in wrestling, while the other half trained in running. Results showed that the improvements on the Mental Rotations Test after the sports training were significantly higher for wrestling than for running. These findings indicate that not all types of physical activity influence cognition and visuospatial processing equally.

Note that physical activity does not need to involve vigorous exercise or strenuous training sessions. Physical activity that positively influences cognitive processes can also be less energetically demanding, as the examples of manipulations and gestures in science education and visuospatial processing show (see Sects. 7.4 and 7.5). As further evidence, Oppezzo and Schwartz (2014) reported that walking had positive effects on the creative thinking of university students.

In addition, the effects of physical activities on visuospatial processing can be long-lasting. For example, the meta-analysis of 33 samples and 62 effect sizes by Voyer and Jansen (2017) revealed that athletes and musicians outperformed participants without these motor experiences in spatial ability. The overall effect size was d = 0.38. According to the behavioral sciences benchmarks by Cohen (1988), this number represents a small to medium effect size. Although this is correlational evidence, it suggests that the motor skill training entailed by music and sport disciplines may positively influence visuospatial processing over long periods (but see P. Jansen et al. 2016).
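As a reminder (this explanation is not part of the original meta-analysis), the standardized mean difference d for a two-group comparison is the difference between group means divided by the pooled standard deviation, and Cohen’s (1988) benchmarks place small, medium, and large effects at roughly 0.2, 0.5, and 0.8:

$$
d = \frac{\bar{X}_1 - \bar{X}_2}{s_{\text{pooled}}}, \qquad
s_{\text{pooled}} = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}
$$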

7.2 Executing or Observing Body Actions

In addition to solely executing body actions, observing them can also trigger embodied mechanisms productive for learning and visuospatial processing. These observation and imitation mechanisms (e.g., Cracco et al. 2018) are partially triggered by mirror neurons. In arguably the first evidence of these neurons in humans, Fadiga et al. (1995) recorded the excitability of forearm and hand muscles of 12 adult participants. Results showed that the patterns of muscle activation were very similar during the execution of an action and observation of the same action done by another person. Later evidence has supported that these neurons constitute a system that matches action execution and observation in humans (see the mirror neuron system in Rizzolatti and Craighero 2004).

Although executing human body actions tends to be more effective than solely observing these actions and motions (e.g., Jang et al. 2017; Kontra et al. 2015; Stieff et al. 2016; Stull et al. 2018c), both executing and observing human motion trigger the mirror neuron system and are productive for cognitive processes. The following research perspectives describe phenomena in which executing or observing human body actions can be effective for science learning and visuospatial processing.

7.2.1 Survival Cognition

Equipped with the mirror neuron system and similar imitation mechanisms, humans have evolved to learn human body actions and movements relatively easily. These actions are examples of primary biological knowledge, which is largely automatic and more efficient than secondary biological knowledge (see Castro-Alonso et al. this volume-b, Chap. 5; see also Castro-Alonso et al. 2019). These two types of knowledge are linked, respectively, to System 1 and System 2 of dual process theories in psychology (see Barrouillet 2011). Basically, primary biological knowledge has evolved in our species, Homo sapiens, over thousands of generations. As a result, modern humans can deal relatively easily with primary biological tasks, such as human movement tasks, because they are part of the System 1 that has helped us survive in this world (Geary 2002).

Consequently, human body actions, including manipulations and gestures, have evolved for survival and are relatively easy to learn (Paas and Sweller 2012; see also Sweller et al. 2019). Moreover, any other task aligned with a survival scenario will be more efficient cognitively and thus will tend to be easier. For example, Nairne et al. (2009) measured word recall in adults, comparing survival versus non-survival conditions. Survival conditions involved relating the words to hunting or gathering food for the subsistence of the tribe, whereas the non-survival groups related the words to hunting or gathering for a contest. The groups aiding survival of the species outperformed those merely competing, even though all were involved in hunting and gathering. To extend these findings to visualizations, Otgaar et al. (2010, Experiment 1) investigated 75 undergraduates (76% females) memorizing 30 static pictures shown on a computer. Participants were randomly allocated to three conditions. In the survival condition, students rated how relevant the different pictures were for finding food and protecting from predators. In the moving condition, participants rated how important the pictures were if planning to move to a new home. In the pleasantness group, students rated the appeal of each picture. As predicted, analyses revealed that retention was higher in the survival condition than in the other two groups, which did not differ from each other.

An example with visuospatial tasks is provided by Nairne et al. (2012), who reported two experiments involving the visual working memory task known as Object Location Memory. In the experiments, the tasks showed line drawings and compared scenarios of survival versus no survival. In Experiment 1, 52 undergraduates (50% females) were shown 8 drawings of food items in different places on-screen. One group of students was given the instruction that the food collection was essential for survival, while the other group was told that collecting was important to win a contest. In Experiment 2, 72 undergraduates (50% females) were shown 8 drawings of animals. One group was instructed that the animals had to be hunted for survival, while the other group was told that the hunting was to win a contest. Both experiments measured accuracy in recalling the positions of the elements from memory. Both studies revealed that location memory was better in the survival contexts than in the non-survival conditions.

A key aspect of our species’ survival has been our capacity to reproduce, which entails competing for and securing mates (e.g., Geary 2008). In modern societies, these mechanisms involve understanding the behavior of other human beings and communicating between humans, as described next.

7.2.2 Social Cognition

Social cognition belongs to the communicative aspects of survival cognition and is generally more related to observing than to executing body actions. Of the four social principles to facilitate multimedia learning described by Mayer (2014), we apply two in this section: the embodiment principle and the voice principle. The embodiment principle predicts that on-screen instructors are more effective when they use non-verbal communication cues, such as gestures, facial expressions, and looking directly at the camera. In multimedia science modules, this principle has shown positive effects with human instructors (e.g., Pi et al. 2019; Stull et al. 2018a; van Wermeskerken et al. 2018) and cartoon pedagogical agents (e.g., Mayer and DaPra 2012; see Wouters et al. 2008). The voice principle predicts that narrations are more effective when recorded in a human voice rather than a machine voice. Extending the voice principle, instructional effects are usually more substantial when students learn from humans rather than from machines or artificial agents.

Concerning the embodiment principle, Stull et al. (2018a) reported two experiments totaling 107 undergraduates (70% females) who studied organic chemistry videos in one of two formats. In one condition, the male instructor wrote the chemistry contents on a conventional whiteboard. Thus, the social cues from the instructor (e.g., facial expressions, eye contact, and gaze) were not observable, as he wrote on the board with his back to the students. In the other condition, the instructor wrote on a transparent board, so he faced the students through the board on which he wrote the contents. Results on immediate learning tests showed that the transparent board condition performed better.

Similarly, Wang et al. (2019) investigated 58 educational technology undergraduates studying multimedia slides about using graphics editing software. The participants were randomly assigned to two conditions: (a) the gaze group watched an instructor who sometimes looked at the relevant parts of the multimedia, whereas (b) the no-gaze group watched an instructor who always looked at the camera. Results showed that participants in the gaze condition allocated more visual attention to the relevant parts of the multimedia and achieved higher learning scores than participants in the no-gaze group.

The voice principle can be extended to predict that most learning scenarios are more effective when the instructor looks more human and less robotic (e.g., Press et al. 2005). This follows from our evolved human cognitive system, which has been shaped for generations to foster human–human communication rather than human–machine relationships (cf. Geary 2002, 2008). Similarly, learning human hand tasks, including manipulations and gestures, tends to be more effective from videos and animations that show natural movements than from static images without these evolved motions (e.g., Castro-Alonso et al. 2015a; see also Castro-Alonso et al. 2019).

This extension of the voice principle also applies to visuospatial processing. In an experiment with 120 adults (50% females), mostly students, P. Jansen and Lehmann (2013) reported the commonly observed advantage of males over females on mental rotations with 3D figures (see Castro-Alonso and Jansen this volume, Chap. 4). Also, when comparing abstract cube figures to human figures, P. Jansen and Lehmann (2013) observed that rotations with human depictions yielded higher scores than rotations with abstract shapes.

In a follow-up experiment with another 120 adult participants (50% females), Voyer and Jansen (2016) measured differences in mental rotation performance among three groups completing the rotations with different 3D figures. The non-embodied group completed a mental rotation test with abstract 3D cubic shapes. The partially embodied condition attempted the test with cubic shapes that included an attached human head. The fully embodied group performed the rotations with images of 3D human bodies. Results of accuracy and reaction time showed the predicted direction of effects: the group with 3D human bodies outperformed the group with cubic shapes and attached heads, which in turn outperformed the group with abstract cubic shapes (see also Krüger et al. 2014).

Nevertheless, sometimes social cues can also produce adverse learning effects. As predicted by cognitive load theory, presenting many social cues visually could be detrimental to learning, as simultaneously watching the learning contents plus these visual cues could overload the visuospatial processing capacity of the students (see also Castro-Alonso et al. this volume-b, Chap. 5). For example, after the encouraging findings by Stull et al. (2018a) for transparent boards aiding chemistry learning, a follow-up experiment failed to replicate these positive effects. In this later study with 64 undergraduates (69% females), Stull et al. (2018b) did not find learning differences between transparent and conventional whiteboards. Moreover, an eye tracking analysis showed that the social cues of the instructor tended to be distracting in the transparent condition, where students focused less on the learning contents than in the conventional whiteboard group (see also van Wermeskerken et al. 2018).

7.2.3 Signaling

Teachers and instructors can use their body to signal important information. As shown in health sciences (e.g., A. J. Hale et al. 2017) and natural sciences (e.g., Pi et al. 2019), these signaling actions indicate to students when or where the most important pieces of learning information can be found. The effectiveness of signaling has been supported by evidence from diverse educational areas, including science disciplines (see Castro-Alonso et al. this volume-b, Chap. 5; see also van Gog 2014). Moreover, when the human body and its limbs (e.g., arms, hands, and fingers) are the signaling devices, social cognition effects can be triggered in addition to signaling.

Much of the evidence on gestures can be related to the two research perspectives of social cognition and signaling. For example, Ouwehand et al. (2016) investigated whether the gesture of finger pointing helped to memorize the positions of pictures shown in the four quadrants of the screen. In Experiment 1, the 79 adult participants (66% females) were assessed both when pointing to and when naming (verbalizing) the quadrants (e.g., “top left”) as the pictures were presented for the first time (study time). In Experiment 2, the 60 adults (63% females) were assessed both when pointing to and when solely observing the quadrants at study time. Results showed that, when the pictures were shown again (test time), having pointed was more effective than having named the quadrants (Experiment 1) or having only observed them (Experiment 2).

Also, in a series of four experiments totaling 484 university students (71% females), Fiorella and Mayer (2016a) investigated the influence of hands drawing illustrations in videos about a physics topic (the Doppler effect). When groups of participants studying illustrations already drawn (hands not shown) were compared to groups studying the instructors’ hands drawing the illustrations (hands or body shown), the evidence supported showing the hands.

In addition, signaling with human limbs tends to be more effective than signaling with non-human cues, which is also related to the mechanisms of social cognition (voice principle) described above. For example, in an experiment with 84 undergraduates (23% females) studying a video of a photography task explained by a human instructor, Pi et al. (2017) randomly assigned students to human signaling, non-human signaling, or non-signaling conditions. The human signaling was produced by the instructor using her hands to point to the relevant parts in the video, and the non-human signaling involved adding arrows to the relevant parts. Results revealed that human signaling was more effective than both non-human signaling and non-signaling, which did not differ from each other. In an experiment with 75 psychology undergraduates (79% females) studying the formation of lightning through an animation, de Koning and Tabbers (2013) compared a group watching a picture of a hand signaling the learning elements with a group observing an arrow signal. Results showed that the hand signal was more effective than the arrow signal on all learning measures, including written retention, oral retention, and transfer.

Nevertheless, as predicted by cognitive load theory, an excess of signals can be redundant and thus counterproductive to learning. For example, A. J. Hale et al. (2017) advised that medical teachers should not convey too much body language in their lectures, as it could be distracting. Also, Castro-Alonso et al. (2018) conducted an experiment with 104 university students (50% females) memorizing the placement of colored symbols on the screen. Results showed that including static photos of human hands signaling the symbols was counterproductive. Moreover, as shown in the experiments by Castro-Alonso et al. (2014), the negative signaling effects of the static hands were larger when the task involved more visuospatial processing, so less capacity was left to deal with the signals and the visual elements.

7.3 Embodied Cognition in Manipulations and Gestures

A conclusion at this point is that diverse research perspectives support that executing or observing human actions can be productive for learning science topics and processing visuospatial information. We now focus on the two human hand actions most investigated in science education, namely manipulations and gestures.

The research perspectives from the previous sections can describe different examples of beneficial cognitive uses of executing or observing manipulations and gestures. For example, offloaded cognition can explain the beneficial effects of using manipulative tokens for calculations, or how the gesture of tracing with the finger can aid in understanding a machine system. Similarly, the generative learning perspective can be used to explain the positive effect of manipulating anatomical models to obtain a personal angle for study and observation. Likewise, the physical activity rationale would explain the positive effects of making gestures to process mental rotations more rapidly.

Also, survival cognition would explain why it is relatively easy to learn and imitate a human manipulating a chemistry model. Similarly, social cognition predicts that learning biology topics can be boosted if the learner executes or observes the instructor making gestures. Last, the signaling research perspective can explain why it is beneficial to watch the hands drawing a science illustration or pointing to it. In short, different research perspectives can be used to explain why executing or observing manipulations and gestures would influence science learning and visuospatial processing.

In addition to both hand actions influencing science education and visuospatial processing, manipulations and gestures share other similarities. Chu and Kita (2008) positioned these actions on a continuum, in which manipulations were more concrete and gestures tended to be less concrete. In four experiments with adults performing mental rotations, the authors provided evidence that training on these visuospatial tasks occurred in three incremental stages. In the initial stage, mental rotations are dependent on manipulations and also on gestures that connect the hand to the rotated shapes. This was regarded as a basic stage, restricted by both the physical constraints of the manipulative shapes and the anatomical limitations of the hand. In the intermediate stage, mental rotations depend only on gestures (different from those in the previous stage, such as gestures that simulate the movements of the shapes), so only the anatomical limitations of the hand remain. In the advanced stage of mental rotation performance, there is independence from both manipulations and gestures, so there are no physical limitations of the shapes or the hands, and the visuospatial processing becomes internalized. In short, manipulations need an object and are concrete, gestures need the hands and are less concrete, and the least concrete action, which is independent of objects and hands, is internal mental processing.

Castro-Alonso et al. (2015b) described a similar relationship between manipulations and gestures. They argued that manipulations are dependent on manipulative objects, whereas gestures are dependent on hands (see Fig. 7.1). Conversely, manipulations can be independent of hands (e.g., Wong et al. 2009), whereas gestures can be independent of objects (e.g., Ping and Goldin-Meadow 2010). Thus, manipulations and gestures differ in their dependence on manipulatives and hands, respectively.

Fig. 7.1 Manipulations and gestures are human hand actions that differ in their dependence on objects and hands, respectively

7.4 Manipulations

Manipulating objects, and observing instructors or peers using these objects, has shown positive instructional effects in health and natural sciences. However, not all these objects, also known as manipulatives or models, are equally effective instructional assets. For example, Brown et al. (2009) suggested that simpler manipulatives would be more effective than more complex objects containing distracting features. This can also be predicted by cognitive load theory and the redundancy effect (see Castro-Alonso et al. this volume-b, Chap. 5), which discourages adding distracting and redundant information to the learning materials. Moreover, this extra information in the manipulatives can be particularly challenging for students with lower visuospatial capacity.

In this section, we describe the relationships between manipulations, science education, and visuospatial processing. In these research areas, some results have been consistently replicated, as shown in Table 7.2. For example, comparisons between physical and virtual manipulations tend to favor virtual formats. Similarly, when comparing executing versus observing manipulations, more supporting evidence is found for executing the hand tasks. Last, there are some indications that manual training is more effective than mental training alone, as research on rotational tasks has shown.

Table 7.2 Examples of research on manipulations

7.4.1 Manipulations and Science Education

Regarding health sciences, Yammine and Violato (2016) conducted a meta-analysis investigating the effectiveness of physical models versus other materials (e.g., 2D digital images, cadavers, and 3D textbooks) for learning anatomy. Although the meta-analysis was small (16 comparisons and a total of 498 students), it showed an overall medium to large mean effect size of d = 0.73, favoring physical models over the other instructional materials. These positive effects of manipulations on anatomy learning do not require complicated or expensive manipulatives. For example, Chan (2015) described useful low-cost physical models made of simple materials (e.g., apron, T-shirt, hair bands, and pieces of colored paper).

In biology, there are also examples of positive outcomes for manipulations with physical objects. For biomolecular models, Roberts et al. (2005) reported that, in an undergraduate biochemistry course, physical manipulatives of proteins were effective instructional assets and were rated by the students as the most preferred tools. In a study with 32 biology or chemical engineering undergraduates (72% females), Höst et al. (2013) compared the instructional effectiveness of an image versus a physical manipulative for learning about molecular self-assembly. Results of the open-ended questions showed that the manipulative was a more effective tool for understanding this difficult biomolecular topic. Forbes-Lorman et al. (2016) investigated biology and biochemistry university students learning structure–function relationships in proteins. Using physical models of the proteins was beneficial for women but not influential for men, arguably because men tend to have higher visuospatial processing (see Castro-Alonso and Jansen this volume, Chap. 4) and have less need for the offloading scaffolds provided by the manipulative models.

In addition to physical manipulations, current research has also employed computer or virtual formats (e.g., Cui et al. 2017; Skulmowski et al. 2016; Stull et al. 2009). To investigate which format was more effective in organic chemistry instruction, Stull et al. (2013, Experiment 1) recruited 29 university students (55% females). The participants were randomly assigned to execute either virtual and then physical manipulations of models, or physical and then virtual manipulations. Results showed that, independent of the format order, when students employed the virtual models they needed less time to reach accurate answers than with the physical manipulations. Similar findings were reported by Barrett et al. (2015) in a follow-up study with 41 psychology undergraduates (56% females). This greater efficiency of the virtual models can be explained by cognitive load theory. Virtual manipulations, having constrained interactivity, only permitted the motions relevant for the learning topic, whereas physical manipulations allowed more hand motions, including those not relevant for the task. A similar advantage of simulations over real-life laboratory activities is briefly discussed in Castro-Alonso and Fiorella (this volume, Chap. 6). In short, physical manipulations may include extraneous cognitive load that is not essential for learning.

Beyond the distinction between physical and virtual manipulations, there can also be a difference between executing and observing the manipulations. In two experiments, Stull et al. (2018c) investigated university students learning to interpret 2D representations of 3D organic chemistry molecules. Experiment 1 studied 61 students (66% females) in a controlled laboratory setting, whereas Experiment 2 involved 81 students (56% females) attending a lecture in an auditorium. In both experiments, participants in the groups that manipulated the chemistry models obtained higher test scores than students who only observed the instructor’s demonstrations with the models. Similarly, in four experiments with a total sample surpassing 170 adults, Kontra et al. (2015) studied executing versus observing manipulations to learn the physics concept of angular momentum. The manipulation involved holding a set of spinning bicycle wheels by the axle and tilting the axle. The four experiments showed that doing the manipulations was more effective than observing them.

7.4.2 Manipulations and Visuospatial Processing

Arguably, the first indication of a connection between manipulations and visuospatial processing was the study of mental rotation by Shepard and Metzler (1971), in which response time increased linearly with the angular difference between pairs of test figures. In other words, to process the mental rotations between the pairs, participants appeared to be mentally doing something equivalent to physical rotations. In a follow-up study with mental folding, Shepard and Feng (1972) observed a similar outcome, in which the more folds involved, the more time taken to answer. In other words, mental folding also seemed to be equivalent to physically folding and manipulating the pieces of paper.
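This linear pattern can be summarized with a simple illustrative model (the intercept and slope are properties of the original data, not reproduced here):

$$
\mathrm{RT}(\theta) \approx a + b\,\theta
$$

where $\theta$ is the angular disparity between the two figures, $a$ is a baseline response time, and $b$ is the additional time required per degree of mental rotation.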

The effects were replicated in later studies. For example, Wohlschläger and Wohlschläger (1998, Experiment 1) investigated 66 right-handed psychology students randomly assigned to either a mental rotation task or a comparable manual rotation task. In both cases, the same 3D abstract shapes had to be rotated, but in the manual format this was performed by twisting a knob with the right hand. For both the mental and the manual tasks, results showed that the time taken to rotate the shapes was almost identically affected by the angular difference between the shapes. Thus, mental and manual rotations showed analogous response time functions.

In Wohlschläger and Wohlschläger (1998, Experiment 2), interference between the manual and the mental tasks was investigated in 48 right-handed psychology participants. As predicted from common processing, results revealed that manually rotating the knob in the direction opposite to the mental rotations inhibited performance, whereas manually and mentally rotating in the same direction facilitated the response. Wexler et al. (1998) tested whether this interference could also be obtained with 2D shapes. The study investigated 12 adults (50% females) executing on-screen mental rotations with simple 2D figures while performing unseen manual rotations with a joystick. When the direction of rotation for the mental and the manual tasks coincided, the mental rotations were faster and more accurate than when the directions were incompatible. An example of these effects is shown in Fig. 7.2.

Fig. 7.2 Effects when manual rotations (knob) and mental rotations (3D shape) are (a) in the same direction, or (b) in opposite directions

Later, Adams et al. (2014) replicated the interference effects and also investigated different rotational training regimes. In Experiment 1, regarding a mental rotation task, 68 university students (64% females) were randomly assigned to train in manual rotation, mental rotation, or a verbal task (control condition). Manual rotation involved manually aligning the rotation of two abstract 3D blocks on-screen. Mental rotation, as in typical instruments, involved answering whether the two abstract 3D blocks on-screen were the same shape rotated, or mirrored and rotated depictions. Results on a mental rotation task showed that both manual and mental rotation training were more effective than the control condition. Experiment 2 investigated a manual rotation task performed by 65 university participants. Results revealed that manual training, but not mental training, was more effective than the control condition for the manual task. In conclusion, both experiments showed that manual rotation training was effective for both manual and mental rotation tasks, whereas mental rotation training was only useful for the mental rotation task.

As in science education, research on visuospatial processing tasks has also investigated the effects of executing versus observing manipulations. For example, Harman et al. (1999) studied 22 undergraduates (59% females) memorizing novel 3D virtual objects. In a yoked-control design, students rotating the objects on the screen were compared to students observing these manipulations performed by other participants. Results showed that the group doing the manipulative rotations recognized the objects faster than those observing the manipulations. Meijer and van den Broek (2010) conducted a replication experiment controlling for the participants’ level of visuospatial processing. In the study, 36 university students (72% females) were assessed in their 3D mental rotation ability with the Mental Rotations Test. All participants studied novel 3D on-screen objects by: (a) rotating the objects with the computer mouse, and (b) observing the computer doing the rotations. Results revealed that the low mental rotation students performed better when they could rotate the objects. In contrast, middle and high mental rotation students performed similarly when rotating or only observing the objects. In other words, their high visuospatial capacity allowed them to manage the task effectively without needing to execute the manipulations.

The last example of doing versus observing is provided by a task resembling Object Location Memory. In the study, Trewartha et al. (2015, Experiment 1) investigated 12 adult participants assigned to an executing or a watching condition. In the executing group, participants discovered the spatial locations of virtual objects by moving a robotic arm to uncover the objects. In the watching condition, the robotic arm moved by itself. Consistent with the literature favoring executing over observing hand actions, the results showed that the group doing the manipulations to uncover the hidden objects was more accurate than the group solely observing these actions.

7.4.3 Manipulations, Science Education, and Visuospatial Processing

Accumulating evidence supports that individuals with high visuospatial processing profit more from the positive effects of manipulations on science learning than individuals with low visuospatial processing. For example, in the field of anatomy, Stull et al. (2009) reported two experiments with a total of 133 university students (63% females) performing rotational manipulations of a 3D computer model of a bone (the human sixth cervical vertebra). In each experiment, a median split of the scores on the Mental Rotations Test defined low and high spatial ability students. Consistent results in both experiments showed that high mental rotation students outperformed their lower counterparts, being more accurate and direct in executing the manual rotations of the virtual model.

Regarding biology, Huk (2006) examined 106 undergraduate and high school students (67% females) learning the structure of plant and animal cells through interactive multimedia. To measure the mental rotation ability of the students, a 3D instrument was used, namely the Tube Figures Test. Also, half of the sample could manipulate 3D virtual models of the cells, to investigate the effects of manipulation on understanding the cellular structures. Results revealed that only high mental rotation students benefited from manipulating the 3D models. In other words, spatial processing was needed to cope with the mental demands of using the 3D models. Similarly, for chemistry tasks, in two experiments with a total of 267 university students (51% females), Barrett and Hegarty (2016) showed that mental rotation and spatial ability were fundamental for manipulating virtual organic chemistry molecules.

Research about visuospatial processing and science education has also compared virtual and physical manipulations. For example, Stull and Hegarty (2016) conducted two experiments with undergraduate organic chemistry students using models to solve problems about translations between chemical representations. In both experiments, the effectiveness of different virtual and physical models of chemical molecules was compared. Also, in both studies mental rotation was measured with an online version of the Mental Rotations Test. In Experiment 1, which investigated 105 students (54% females), the virtual models presented low fidelity (low action congruence), as they were manipulated using a computer mouse and keyboard.

In contrast, in Experiment 2, with 104 participants (65% females), the virtual models presented high fidelity (high action congruence), as they were manipulated using a virtual reality system with a hand-held device and stereo glasses. The two experiments showed that the groups using models outperformed the control groups in translation accuracy between representations. The type of model did not affect these results, as the virtual models (low and high fidelity) and the physical models were equally effective. It was also reported that mental rotation was a significant predictor of achievement in these molecular translations, although not as influential as the use of manipulatives. In conclusion, these results do not favor computer manipulations over physical formats as clearly as those described in Sect. 7.4.1.

Last, research combining the effects of manipulations, science instruction, and visuospatial processing has also investigated executing versus observing manipulations. For instance, in the realm of anatomy, Jang et al. (2017) examined 76 medical university participants (42% females) studying a 3D virtual model of the inner ear in a stereoscopic 3D environment. Visuospatial processing was measured with the Mental Rotations Test. Results showed that participants who manipulated the model outperformed those who watched the model being manipulated. In addition, among the students who watched the manipulations, higher mental rotation ability predicted higher anatomy learning outcomes. This relationship between mental rotation and anatomy learning was absent in those who manipulated the model. Arguably, manipulating the model demanded less investment of visuospatial processing (mental rotation), whereas only watching relied on this processing to learn the anatomical structures. Consequently, either manipulation or high mental rotation ability was a key asset for understanding the anatomy task.

7.5 Gestures

Gestures are hand motions that convey effective nonverbal communication when executed and observed (see Hall et al. 2019). Although they have been habitually connected to the social cognition and signaling research perspectives, we have shown that gestures are linked to all the embodied perspectives discussed in this chapter. As nonverbal assets, they convey information additional to that of speech, so they are useful tools for learners and instructors. For example, in a meta-analysis of 38 experiments (63 effect sizes; N = 2,396), Hostetter (2011) compared the effects of speech-only versus speech plus gesture conditions on memory or learning. The effect of adding human gestures to speech showed an overall medium size of d = 0.61. The effect was of comparable size whether the performer of the gestures followed a script or gestured spontaneously. The most useful gestures were those conveying a spatial or motor idea, which indicates a relationship between gestures and visuospatial processing.

In addition to the findings on human gesturing, there are also positive effects of gestures produced by cartoon or animated agents. For example, Davis (2018) conducted a meta-analysis of 20 experiments (N = 3,841) and k = 41 pairwise comparisons that contrasted animated agents making gestures versus agents’ static images or voices. The results revealed that the agents that included gestures produced better retention (g = 0.28, k = 7) and near transfer (g = 0.39, k = 16) learning scores than agents not gesturing. These are small effect sizes supporting the inclusion of gestures in animated pedagogical agents.
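For reference (this explanation is not part of Davis 2018), the g metric is Hedges’ g, a version of Cohen’s d that corrects for small-sample bias; one common approximation of the correction is:

$$
g \approx d\left(1 - \frac{3}{4\,df - 1}\right), \qquad df = n_1 + n_2 - 2
$$

so g and d are nearly identical with the large samples typical of meta-analyses.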

In this section, we describe the relationships between gestures, science education, and visuospatial processing. As with manipulations, research on gestures has shown some consistent trends, presented in Table 7.3. For example, comparisons between executing and observing gestures have found more supporting evidence for executing these hand motions. Also, there are consistent results showing that gesturing away from the visual stimuli is counterproductive, whereas gesturing toward the stimuli is productive.

Table 7.3 Examples of research on gestures

7.5.1 Gestures and Science Education

In the meta-analysis just described, Davis (2018) investigated the moderating effects of topic on gesturing by animated agents. Results showed that the near transfer scores tended to be larger for science topics (g = 0.47) than for maths (g = 0.32) and humanities (g = 0.08), although the difference was not significant. This result highlights the importance of gestures for science topics, in this case gestures made by cartoon agents (see also Li et al. 2019).

However, most of the research on gestures for science education deals with humans as executers and observers of gestures. For examples where the students executed the gestures, consider the action of tracing. Tracing is a gesture that comprises finger motion following a path or movement (Hegarty et al. 2005), typically on paper or other surfaces (Ginns et al. 2016).

In an experiment with 10 undergraduates studying static mechanical diagrams, Hegarty et al. (2005, Experiment 1) observed that producing tracing gestures facilitated mentally animating the diagrams and understanding their mechanisms. Tang et al. (2019) randomly assigned 46 school students to either study lesson materials on the water cycle by reading, or to trace out key water cycle processes (e.g., evaporation) while studying. Students who traced while studying subsequently outperformed the control group on both retention and transfer tests.

In addition to these science examples, Ginns et al. (2016) provided two experiments on maths topics. In Experiment 1, involving the spatial topic of triangle geometry, the participants were 52 school boys. In Experiment 2, regarding the non-spatial topic of order of operations, the participants were 54 school students (59% females). In both experiments, the students were randomly assigned to an experimental condition that involved tracing or a control condition without tracing. The results on the transfer tests for both experiments showed that the tracing groups outperformed the non-tracing conditions.

An example beyond executing gestures is the study by Pi et al. (2019), which concerns observing gestures in biology education. In the experiment, 120 university students from diverse disciplines (78% females) studied a video lecture about reproduction and cloning. The video showed a teacher looking into the camera while explaining the content slides at her side. The participants were randomly assigned to one of four learning groups: (a) control (no gazing and no gesturing), (b) gazing only, (c) gesturing only, and (d) gazing and gesturing. In both gazing conditions, the teacher in the video looked at the relevant areas on the slide. In both gesturing conditions, the teacher used fingers and hands to point to the relevant areas. Results showed that the conditions with gesturing significantly outperformed the control group on both retention and transfer tests.

Is it better to execute or to observe gestures for science learning? In line with the previous sections on manipulations, the evidence on gestures also shows a tendency for executing to be better than solely observing the hand actions. For example, Stieff et al. (2016) reported two experiments with organic chemistry undergraduates attempting translations between organic chemistry molecular representations. In Study 1 (N = 70), the participants were randomly allocated to one of three conditions: (a) a control text-only group, (b) observed gestures, and (c) observed and executed gestures. Results showed that the most effective group for molecular equivalencies was the one that watched the experimenter making the gestures and then imitated the hand movements. Also, solely watching the gestures (observed gestures condition) was not more effective than not watching them (control condition). Study 2 (N = 104) replicated these positive results for observing and doing.

7.5.2 Gestures and Visuospatial Processing

The relationship between gestures and visuospatial processing has been supported by experiments showing deleterious effects of gesturing away from the visuospatial task and beneficial effects of gesturing toward the stimuli. Examples of the first line of evidence are provided in the interference experiments by S. Hale et al. (1996), who investigated undergraduates performing single and dual tasks of working memory. One of the tasks was the Location Span Task, which involved memorizing sequences of a mark randomly positioned on a 4 × 4 grid. In Experiment 1A (N = 30), results showed that pointing with the finger away from the stimuli impaired performance on the Location Span Task. A follow-up (Experiment 3, N = 20) revealed that moving the eyes away from the stimuli was also detrimental, and that combining eye movements and pointing away was even more deleterious. Similarly, Lawrence et al. (2001) investigated 18 undergraduates executing a spatial working memory task of memorizing randomly colored positions on a square grid. Results showed that the task was impaired by moving a finger toward a peripheral flash.

Concerning evidence of positive effects of gesturing toward the visuospatial task, Chum et al. (2007) reported two experiments with a total of 37 psychology undergraduates performing spatial working memory tasks in which visual sequences had to be replicated from memory, as in the Corsi Block Tapping Test. As in this test, each sequence included shapes that were placed in different positions. Each experiment compared executing pointing gestures with not executing these gestures. The pointing was aimed at every position of the visual elements in the sequences. Results on this visuospatial working memory test revealed that pointing was more effective than not pointing. An example of these results is given in Fig. 7.3.

Fig. 7.3

Effects when executing a pointing gesture either (a) away from the visuospatial task, or (b) toward the visuospatial task

Another effective gesturing example is provided by So et al. (2015), who investigated 138 undergraduates (54% females) learning difficult map routes. The visuospatial processing of the participants was estimated by combining the scores on a mental folding task (the Paper Folding Test) and a spatial working memory task (the Corsi Block Tapping Test). Groups of students allowed to execute gestures were compared to groups in which gesturing was not allowed. Results revealed that the most important predictor of route recall was being allowed to gesture while memorizing; visuospatial processing, although helpful, had a secondary influence.
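The chapter does not specify how So et al. (2015) combined the two test scores; a common approach is to standardize each test and average the resulting z-scores into one composite. The following minimal sketch illustrates that assumed procedure with invented toy data:

```python
from statistics import mean, stdev

def z_scores(raw_scores):
    """Standardize raw test scores to z-scores (mean 0, SD 1)."""
    m, s = mean(raw_scores), stdev(raw_scores)
    return [(score - m) / s for score in raw_scores]

def visuospatial_composite(paper_folding_scores, corsi_scores):
    """Average each participant's standardized Paper Folding Test and
    Corsi Block Tapping Test scores into one visuospatial index."""
    z_folding = z_scores(paper_folding_scores)
    z_corsi = z_scores(corsi_scores)
    return [(zf + zc) / 2 for zf, zc in zip(z_folding, z_corsi)]

# Invented toy data for three participants (not from So et al., 2015).
paper_folding = [12, 17, 9]   # e.g., items correct on the Paper Folding Test
corsi = [5, 7, 4]             # e.g., Corsi block span
print(visuospatial_composite(paper_folding, corsi))
```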

Last, Göksun et al. (2013) investigated 28 adults executing gestures while performing mental rotations of physical shapes. Low versus high mental rotators were also compared, according to their scores on the Mental Rotations Test. Results showed that low mental rotators produced more gestures while solving the rotations than high mental rotators did. Thus, gesturing was an effective means of offloading cognition, and it was particularly helpful for those at the limits of their visuospatial processing capacity.

7.5.3 Gestures, Science Education, and Visuospatial Processing

The distinction between executing and observing hand actions can also be made here. An example of the beneficial educational effects of executing gestures is provided in the physics disciplines by Hegarty et al. (2005, Experiment 2), who recruited 45 undergraduates to perform mental animations of static mechanical diagrams. To investigate the effects of executing gestures and of visuospatial processing, a group of students performing a spatial tapping interference task was compared to a control group without this load on the visuospatial processor. As predicted, results showed that spatial tapping both prevented the execution of gestures and hindered mental animation of the mechanical systems.

In a follow-up with 60 undergraduates by Hegarty et al. (2005, Experiment 3), the comparison was made between a spatial tapping group, a gesture-restricted group, and a control group (without spatial tapping and allowed to gesture). Results revealed that the gesture-restricted and control groups outperformed the spatial tapping condition. In other words, these mental animation tasks relied more on visuospatial processing (interfered with by spatial tapping) than on gesturing (interfered with by gesture restriction). In all, these two experiments suggest that visuospatial processing is the primary asset for the mental animation of static mechanical diagrams, and that executing gestures may be a secondary but effective resource. This order of effects contrasts with the findings by So et al. (2015), described in the previous section, where executing gestures was more important than visuospatial processing.

Another piece of evidence showing positive effects of executing gestures is the study by Pouw et al. (2016) with 20 adults (75% females) attempting the visual puzzle known as the Tower of Hanoi. In the study, visual working memory was assessed with the Visual Patterns Test. Results showed that, while the participants were solving the puzzle, executing pointing gestures reduced their eye movements. This offloading effect was larger for those with lower scores in the visual working memory test. Thus, these results support that executing gestures can alleviate part of the burden on eye movements and visuospatial processing (see similar findings in Eielts et al. 2018).
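For readers unfamiliar with the puzzle, the Tower of Hanoi requires moving a stack of discs from one peg to another, one disc at a time, without ever placing a larger disc on top of a smaller one. The following minimal recursive sketch (illustrative only, not part of Pouw et al.'s materials) produces the shortest solution, which grows exponentially with the number of discs and therefore places a genuine planning load on the solver:

```python
def tower_of_hanoi(n_discs, source="A", target="C", spare="B", moves=None):
    """Return the shortest move list (2**n_discs - 1 moves) that transfers
    n_discs from the source peg to the target peg, never placing a larger
    disc on top of a smaller one."""
    if moves is None:
        moves = []
    if n_discs == 1:
        moves.append((source, target))
    else:
        tower_of_hanoi(n_discs - 1, source, spare, target, moves)
        moves.append((source, target))
        tower_of_hanoi(n_discs - 1, spare, target, source, moves)
    return moves

# A three-disc puzzle, a size often used in problem-solving studies: 7 moves.
print(tower_of_hanoi(3))
```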

An example of the beneficial effects of observing gestures is provided in the biology fields by Brucker et al. (2015), who investigated 45 university students (69% females) learning about fish movements from dynamic visualizations supplemented with gestures depicting these motions. In addition, visuospatial processing ability was measured with the mental folding task known as the Paper Folding Test. When students watched gestures that corresponded with the fish movements, low visuospatial learners benefited, but these gestures did not affect high visuospatial students. It was argued that high visuospatial students could understand the fish locomotion without the scaffold provided by the observed gestural information.

The last example illustrates the importance of visuospatial processing for understanding gestures, although it involves the observation of gestures about everyday activities rather than science topics. It concerns four experiments, conducted by Y. C. Wu and Coulson (2014), sampling a total of 251 university students (65% females). In the study, photos of activities in which the speech was congruent with the gesture (e.g., describing screwing while moving the hand clockwise) were compared to photos in which speech and gesture were incongruent. Spatial working memory was measured with a computer version of the Corsi Block Tapping Test. Results showed that the students who were fastest to integrate the congruent speech–gesture information were those with higher scores in the spatial working memory test. Moreover, this effect was reduced when the participants performed a simultaneous visuospatial task, but was not affected by a simultaneous verbal task. In conclusion, the experiments supported that visuospatial processing was more necessary than verbal processing for understanding combined gesture and speech information. It can be predicted that, for a science topic described by an instructor with gestures and speech, students with higher visuospatial processing will understand more from observing the gestures than students with lower visuospatial processing.

7.6 Discussion

Embodied cognition phenomena can be triggered when executing or observing body actions. When solely executing body actions, three non-mutually exclusive phenomena can occur, namely: (a) offloaded cognition, (b) generative learning, and (c) physical activity. Offloading cognition to the body and the environment can produce a cognitive boost, which is particularly helpful for students whose visuospatial processing is challenged by the difficulty of the visuospatial information. Generative learning, in addition to allowing cognition to be offloaded, can add a personal touch from the executer; for example, drawing puts information onto the environment (offloaded cognition), but the depictions follow personal styles (generative learning). Last, physical activity, whether vigorous or calm, can boost immediate cognitive performance, and the positive effects can be sustained over time.

In addition, there are phenomena that can be triggered when either executing or observing body actions, which research has termed: (a) survival cognition, (b) social cognition, and (c) signaling. Concerning survival cognition, human cognition evolved to support survival, so present-day cognitive tasks are more effective when they resemble the tasks our ancestors performed for subsistence. One of these tasks was communicating with other humans, so survival and social cognition have equipped us to understand the social cues of others, which works better when these others are humans rather than machines or robots. Last, some of these social cues involve signaling relevant information; in these cases, signaling and social cueing co-occur.

The human movements most researched in relation to these embodied phenomena are object manipulations and gestures, which have been useful assets in diverse fields of the health and natural sciences, including anatomy, biology, chemistry, and physics. Manipulations and gestures are also effective tools for visuospatial processing.

Regarding the type of manipulation, both physical and virtual manipulations have proven effective, but in studies comparing the two formats, the virtual format is usually favored. Another common comparison in manipulation research is between executing the actions and observing others execute them. In these cases, the typical trend is that executing is more effective than only observing.

Concerning gesture research, the findings also show that executing the hand actions tends to be better than solely observing them. However, observing the gestures of human teachers and instructors, as well as of animated pedagogical agents, is also effective for learning health and natural science topics. In these disciplines, encouraging results show how gesturing can help students with lower visuospatial processing.

7.6.1 Instructional Implications for Health and Natural Sciences

Concerning the execution of body actions, many different physical activities, at different levels of energy demand, can have positive effects on cognitive processes. An instructional implication is that teachers could promote low-intensity physical exercise (e.g., walking, manipulations, and gestures) as effective activities for science education.

A second instructional implication considers the survival cognition perspective. Accordingly, learning activities could be framed within survival scenarios, such as hunting wild animals or collecting food to avoid starvation. In principle, any learning task with these added survival cues could be more effective than a version without this subsistence component.

Following the extension of the voice principle toward social cognition, a third implication is that learning tasks should prioritize human–human interactions and similar socially evolved mechanisms. For example, for tasks involving manipulations and gestures, videos or live action may be preferable to static images, and humans performing the hand tasks may be preferable to robots or virtual agents.

A fourth implication, concerning manipulations, is based on the aim of reducing visuospatial information: simple manipulatives should be fostered, as they tend to produce meaningful learning. Similarly, virtual manipulations may be simpler than, and thus preferable to, physical manipulations.

The fifth and last implication concerns gestures. Students should be encouraged to execute gestures while learning science topics, particularly those individuals with lower visuospatial abilities.

7.6.2 Future Research Directions

Regarding the execution of body actions, future research could investigate which movements or actions are best for training visuospatial processing. Similarly, further investigations could search for the most effective intensity and duration of specific physical activity training to boost cognitive functions.

Concerning the observation of hand actions, future research may reveal the best conditions for providing adequate social cognition and signaling without adding visuospatial information that could be difficult to handle, particularly for students with lower working memory capacity.

Future research also needs to investigate further the relationship between science education and visuospatial processing (see Castro-Alonso and Uttal this volume, Chap. 3). For example, to establish better links between visuospatial processing assisting science learning and science education enhancing visuospatial processing (see also Castro-Alonso and Uttal 2019), the addition of manipulation or gesturing actions could be considered. Similarly, interactive multimedia (see Castro-Alonso and Fiorella this volume, Chap. 6) and modern technological devices will provide new instructional possibilities for science education and human hand actions.

As sex and gender influence visuospatial processing and learning (see Castro-Alonso and Jansen this volume, Chap. 4; see also Castro-Alonso et al. 2019), their effects on embodied cognition are worth investigating. For example, research has shown that females tend to use more information than males from observing gestures and nonverbal communication (see Hall et al. 2019), so this effect could be investigated for science learning or visuospatial tasks.

7.6.3 Conclusion

Different research perspectives have investigated the phenomena of embodied cognition, which can be activated when executing or observing human body movements. Two of the most investigated embodied activities are manipulations and gestures, which can be executed and observed for effective science education and visuospatial processing. Regarding manipulations, it seems that virtual manipulatives are more effective than physical models. Regarding gestures, they are valuable assets, sometimes combined with visuospatial processing, for learning health and natural science topics. For both manipulations and gestures, a common finding is that executing these hand actions is more instructionally effective than solely observing them.