Introduction

Many children require instruction to develop good reading skills (Cain, Oakhill, & Lemmon, 2004). Substantial evidence indicates that reading strategies aid reading comprehension, helping children to read by training their adaptation to timing demands of novel knowledge (Cain, Oakhill, Barnes, & Bryant, 2001; Clark, Schoepf, & Hatch, 2018; Förster, Kawohl, & Souvignier, 2018; Reed & Lynn, 2016). However, these empirical studies mostly use text-based materials only, although most informative texts (e.g., scientific articles) include words and pictures. Numerous previous studies have indicated that performance improves when articles contain both text and pictures (Ainsworth, 2006; Butcher, 2014), a finding dubbed the multimedia effect (Mayer, 2009). However, many children do not know how to integrate text and illustrations (Jian, 2019; Jian & Ko, 2017; Hannus & Hyönä, 1999; Mason, Tornatora, & Pluchino, 2013), and do not appear to benefit from the addition of diagrams to scientific articles without prior scientific literacy (Reid & Beveridge, 1986). Thus, children need instruction focused on decoding diagrams and linking relevant information in texts and diagrams.

Scheiter, Schubert, Gerjets, and Stalbovs (2015) found that teaching ninth-grade students to decode diagrams and link text and diagrams did not improve reading comprehension. However, some studies have found that text–diagram instruction improves the reading comprehension of fourth- and sixth-grade students regardless of ability (Jian, 2018a, 2019). The reading instructions developed for all of these studies were based on the cognitive theory of multimedia learning (CTML) (Mayer, 2009), and emphasized the importance of reading diagrams and integrating relevant information between texts and diagrams. Nevertheless, the evidence is mixed, and there is merit in further study on the influence of text–diagram instruction on reading comprehension. In sum, previous empirical research has investigated the influence of text–diagram instruction on reading comprehension through learning outcomes (Jian, 2018a; Scheiter et al., 2015) but has rarely focused on reading processes and real-time processing behavior. Eye-tracking technology is well-suited to examining reading processes (Kim, Vorstius, & Radach, 2018; Rayner, 1998) based on the eye-mind assumption (Just & Carpenter, 1980), which posits that the area where a reader’s eyes fixate is where information is likely being actively processed.

In addition, understanding the short- and long-term effects of reading instruction on reading comprehension is important (Förster et al., 2018), as we need to know how reading strategies are used after acquisition to check whether they were retained. Thus, this study uses immediate and delayed tests to investigate text–diagram instruction effects on reading comprehension, the learning processes underlying illustrated text reading, and changes in processes from immediate to delayed reading situations.

Theoretical Frames and Challenges in Learning Illustrated Texts

This study was grounded in the CTML (Mayer, 2009), a well-established theoretical framework used to develop empirical studies (Jian, 2018a; Mason, Pluchino, & Tornatora, 2013; Schüler, 2017) or design instructional interventions for text–diagram integration (Jian, 2019; Renkl & Scheiter, 2017; Scheiter et al., 2015). The CTML represents the information-processing system that underlies learning of multimedia materials (e.g., illustrated texts). It is based on three cognitive demands: the selection, organization, and integration of representations, carried out iteratively in the learner’s mind. Comprehension of an illustrated text requires readers to first identify the most salient pieces of information within each set of words and diagrams; then group these pieces together into larger propositions; and finally integrate the propositions with their prior knowledge to establish a coherent mental model.

Adding multiple representations to texts creates new comprehension challenges, especially for young readers. Readers must have good representational competence and apply multiple literacy skills to process these texts (cf. review paper, Guo, Zhang, Wright, & McTigue, 2020). Representational competence refers to a set of skills for the recognition, interpretation, transformation, and coordination of external representations used in texts (Kozma, 2003). Young readers whose reading and cognitive abilities are still developing may struggle to meet such cognitive demands (Coleman, McTigue, & Dantzler, 2018; Renkl & Scheiter, 2017). Eye-tracking studies have shown that children’s attention is focused on text sections as opposed to diagrams during illustrated text reading, and children seldom integrate the relevant information from illustrations and texts (Jian, 2018a; Hannus & Hyönä, 1999). One potential explanation for this phenomenon is that children’s cognitive capacity is still developing, so they can attend to only limited information over brief time periods.

Although many studies (Jian & Wu, 2015; Eitel, 2016; Johnson & Mayer, 2012; Schüler, 2017) using undergraduates as participants have supported the CTML (Mayer, 2009), the findings were less conclusive when the CTML was extended to non-adult readers. McTigue (2009) explored the question of whether the CTML applied to middle-school students. She asked sixth-grade students to read two science texts with either no illustrations (control), illustrations of the cycle with labels for each part (parts), illustrations of the cycle with labels for each major process (steps), or illustrations showing the labels for each part and each major process (parts and steps). The results showed that the students did not benefit from diagrams for either text, and thus did not replicate the findings of studies with adult readers as participants.

Promoting reading comprehension of illustrated texts

Strategic prompts and strategy training are major interventions that promote illustrated text comprehension (Renkl & Scheiter, 2017; Van Meter, Cameron, & Waters, 2017). Strategic prompts are process-oriented and involve deployment of prompts throughout experimentation. Self-explanation prompts are frequently studied regarding illustrated text reading (Van Meter et al., 2017), and usually require answering “why” questions (Berthold, Eysink, & Renkl, 2009) or explaining a specific relationship (van der Meij & de Jong, 2011; Van Meter et al., 2017).

Bartholomé and Bromme (2009) prompted their undergraduate students by presenting written instructions for processing text and pictures for five minutes during the learning phase. Unexpectedly, this did not improve reading comprehension. Bartholomé and Bromme argued that conflicts between the prompted approach and learners’ original strategies caused this, as resolving such conflicts consumes cognitive resources and impedes learning. This explanation remained speculative because no process data were collected. Moreover, the intervention may have been too short to rule out alternative explanations. Following up on this, the present study used an eye-tracker to record learners’ reading processes.

Strategy training, in contrast, is preparatory-oriented and involves teaching reading strategies prior to experimentation, on the premise that students who acquire effective reading strategies in advance can then read independently. Unlike text-only reading strategies, strategies for illustrated text have been developed in only a few studies (Jian, 2018a; Scheiter et al., 2015). Based on CTML (Mayer, 2009) and many empirical studies (e.g., using self-explanations to learn illustrated content, Schlag & Ploetzner, 2011; how to read a diagram, Eitel, Scheiter, & Schüler, 2013; and shifting attention from text to diagrams, Hegarty & Just, 1993), Scheiter et al. (2015) developed a nine-step strategy for illustrated text reading comprehension. Initially, the experimental group received training in text–diagram reading strategies, and the control group received a placebo training involving the same amount of cognitive engagement. The ninth-grade students subsequently read an illustrated biology article and completed a test. Unexpectedly, no effect of the taught strategies was observed, and two reasons were hypothesized: first, the practice time provided may have been inadequate to acquire a new strategy; second, students may have had prior reading habits that were difficult to alter in a brief study. Thus, they may have read the experimental material based only on prior strategies. Unfortunately, this could not be confirmed, as the study did not measure any online processing data. Thus, our study sought to address this question by using eye-tracking to record online reading behavior before and after reading instruction.

Alternatively, the findings of Scheiter et al. (2015) could be interpreted as evidence that the nine-step reading strategy exceeds the upper limit of human working memory, which suggests that learning all these steps in such a short time may be too demanding. Adopting a simplified model, a recent study (Jian, 2018a) developed a three-step text–diagram strategy in which relevant information in the text and diagrams was identified and integrated into a coherent mental model. This helped to improve fourth-grade students’ reading comprehension regardless of reading ability.

Overall, strategic prompts and strategy training emphasize the need to guide readers to connect relevant text and pictures. However, the effects of strategic interventions using prompts and instructions on the comprehension of illustrated text remain unclear. This may be attributed to age, number of reading strategy steps, and intervention duration, all of which are considered in the present study.

Eye movement research on how text–diagram instruction changes reading processes

So far, only two studies (Jian, 2018b, 2019) seem to have applied eye-tracking technology to investigate the influence of text–diagram instruction on reading processes. Jian (2018b) investigated whether reading instruction influences learning outcomes and the cognitive processes involved in reading illustrated scientific texts. The results showed that, compared with the control group, the instruction group had better reading comprehension, spent twice the time reading the illustrations, made more saccades between text and illustration, and showed different eye movement patterns.

Jian (2019) investigated the effectiveness of the signaling principle, which involves highlighting relevant elements of text and illustrations with signals to improve learning outcomes (Van Gog, 2014), and its effect on reading processes for sixth-grade students with or without text–diagram reading instruction. The study showed that the signaling principle (Richter, Scheiter, & Eitel, 2016; Van Gog, 2014) could not be extended to children without reading instruction. However, when they received reading instruction and learned the signaling principle, the teaching group had better test performance, different eye movement patterns, and made more saccades between text and diagrams than the label group. Further, approximately 80% of the teaching group looked at the corresponding illustrations when they read the words “Diagram 1” and “Diagram 2,” whereas less than 50% did so in the label group.

These studies show that text–diagram reading instruction influences not only learning outcomes but also reading processes. However, three reasons to conduct an additional study persist. First, only the immediate effects of these reading instructions were measured, and delayed effects were not examined. In immediate testing, learners rely on more superficial representations of the materials, whereas delayed testing relies on mental models (Arndt, Schüler, & Scheiter, 2015). Second, the control groups did not receive any instruction, so the results could not fully exclude a possible placebo effect. Therefore, this study added a placebo group that received reading instruction but not text–diagram integration instruction. Third, those studies used a one-page article of 400 words and two illustrations as reading material. This study investigated whether the effect of text–diagram instruction could be replicated with longer articles of 2–3 pages, over 700 words, and 4–5 illustrations.

Present study and research hypotheses

The present study used an eye-tracker to investigate the immediate and delayed effects of text–diagram reading instruction on reading comprehension and learning processes for illustrated text reading, and whether these effects were moderated by reading ability. To determine reading instruction effects on the comprehension of illustrated texts, the text–diagram strategies developed for the experimental group in this study were based on CTML (Mayer, 2009) and prior empirical studies (Schüler, 2017; Scheiter & Eitel, 2015). They emphasized selecting, organizing, and integrating text–diagram elements and decoding diagram information. Moreover, some principles of general reading instruction were applied, such as including an adequate intervention time; introducing teacher modeling; providing students with a new article to apply the learned strategy; and providing reminders for participants to use the instructed strategies when reading the experimental material (Bartholomé & Bromme, 2009; Scheiter et al., 2015).

There were three research questions in this study. The first research question was whether text–diagram reading instruction had a positive effect on reading comprehension. Consistent with earlier findings (Jian, 2018a), it was expected that reading comprehension performance would be better in the text–diagram instruction group than in the other groups on the immediate test (Hypothesis 1a), and that the effect on the immediate test would be more pronounced than on the delayed test (Hypothesis 1b), because previous research has shown that the effect of reading instruction was more evident on immediate tests than on delayed tests (Manoli, Papadopoulou, & Metallidou, 2016; Torgesen et al., 2001).

The second research question was whether text–diagram reading instruction affected reading processes. Based on the design suggested by CTML (Mayer, 2009), the text–diagram group received reading instruction for decoding diagrams and integrating relevant text–diagram information. This group was expected to have longer fixation durations on scientific illustrations, make more saccades between text and diagrams, and spend a higher proportion of total time attending to scientific illustrations than the other two groups (Hypothesis 2).

The third research question related to the changes in reading processes over time. Participants were tested before, during, immediately after, and 1 month after the intervention. Previous research found that scores on immediate tests were higher than on delayed tests, whether for reading research (Manoli et al., 2016; Torgesen et al., 2001) or written proofs in mathematics (Roy, Inglis, & Alcock, 2017). This study predicted that the intervention effect on reading processes would diminish between the immediate and delayed testing stages (Hypothesis 3). In other words, compared with the reading patterns of the immediate testing stage, those of the delayed stage would be nearer to the original patterns revealed before the intervention.

Methods

Participants and design

In this study, 132 fourth-grade students were recruited with parental consent from an elementary school in Taiwan (Mage = 10.27 years, SD = 0.28 years). Using a standardized reading comprehension test (internal consistency reliability was .82, retest reliability was .94) (Ko, 2006), 66 participants were found to have high reading ability (standard T score above 58, range 58–69, M = 61.5, SD = 3.26) and 66 had low reading ability (standard T score below 55, range 38–54, M = 49.39, SD = 4.61). The T score was used to distinguish reading ability levels to ensure that there were an equal number of participants in the two ability groups. The study used a 3 (reading instruction: text–illustration integration vs. comprehension monitoring vs. no reading instruction) × 2 (reading ability: high vs. low) between-subjects design. High- and low-ability participants were each randomly assigned to one of the three instruction groups, yielding 22 participants in each of the six cells.
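
The assignment described above can be sketched as a stratified random allocation. This is only an illustration: the paper does not report the actual randomization procedure, and the participant IDs, seeds, and function name below are hypothetical.

```python
import random
from collections import Counter

# Group labels follow the three instruction conditions named in the text.
GROUPS = ["text-diagram", "comprehension-monitoring", "control"]


def assign_within_stratum(participant_ids, groups=GROUPS, seed=0):
    """Shuffle one ability stratum and split it evenly across the groups."""
    if len(participant_ids) % len(groups) != 0:
        raise ValueError("stratum size must be divisible by the number of groups")
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    per_cell = len(shuffled) // len(groups)
    return {pid: groups[i // per_cell] for i, pid in enumerate(shuffled)}


# Hypothetical IDs: 66 high-ability (H..) and 66 low-ability (L..) readers.
high = [f"H{i:02d}" for i in range(66)]
low = [f"L{i:02d}" for i in range(66)]

assignment = {
    **assign_within_stratum(high, seed=1),
    **assign_within_stratum(low, seed=2),
}

# Count participants per (ability stratum, instruction group) cell.
cell_sizes = Counter((pid[0], group) for pid, group in assignment.items())
```

Randomizing separately within each ability level is what keeps the six cells at equal size (22 each) in this sketch.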

Materials

Four illustrated biology texts and four reading comprehension tests were used. The four biology texts introduced the structures and functions of certain animals, as is customary in biology texts (Brandstetter, Sandmann, & Florian, 2017; Cheng & Gilbert, 2014; Kragten et al., 2015). They were modified from scientific magazines (Young Newton, 2015, 2016; Young Scientist, 2015, 2016). The first article measured participants’ baseline reading patterns. It described the body structure and functions of a manta ray and the process of filter feeding, and explained possible reasons for a manta ray’s behavior when jumping out of the water. This article had 588 words, two pages, and four illustrations (including the body structure of a manta ray, the mouth of a manta ray, and the processes of filter feeding). The second article was designed to provide an example of the reading strategies. It described the body structure and functions of a turtle, explaining why these body characteristics allow it to swim deep and far, and the reproduction process. This article had 331 words, two pages, and three illustrations (including the body structure of a turtle, and the reproduction process of mating, laying eggs, and hatching young turtles). The third article measured immediate reading behavior after reading instruction. It described the body structure and functions of a dolphin, the process of suckling, skin structure and its adaptation for swimming, and the echolocation system. This article had 772 words, three pages, and five illustrations (including the body structure of a dolphin, skin structure, changes while swimming, and echolocation). The fourth article measured delayed reading behavior approximately 1 month after receiving instruction. It described the body structure and functions of a mantis, the process of hunting, and how mantises threaten their natural enemies. This article had 707 words, two pages, and four illustrations (including the body structure of a mantis, and the hunting process).

To assess learning outcomes of the reading materials for three different instruction interventions, tests were created for dolphin and mantis articles. Each article had ten multiple-choice questions and two essay questions.

All materials and tests were assessed for difficulty and readability by three experts: a professor in reading psychology, a Ph.D. candidate in science education who had taught science courses in elementary schools for several years, and a current elementary school science teacher with a master’s degree in science education. Moreover, all multiple-choice questions underwent a pilot study (N = 120) and item analysis (including distractor analysis, item difficulty, and discrimination) to ensure item quality in the formal reading tests. The internal consistencies (Cronbach’s alphas) of the articles’ reading tests were .63 (for the dolphin article) and .54 (for the mantis article). The test–retest reliability of the immediate and delayed tests of the dolphin article was .66, p < .001. As for the essay questions in the reading tests, agreement between two independent raters was measured using Cohen’s kappa coefficient (.63–.91 for the three reading tests), and inconsistent scores were discussed by the raters to reach consensus.
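
For readers unfamiliar with the internal-consistency index reported above, the following minimal sketch computes Cronbach’s alpha from an item-score matrix using the standard formula, alpha = k / (k − 1) × (1 − sum of item variances / variance of total scores). The demo scores are invented for illustration and are not data from this study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per participant)."""
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical 0/1 scores for five participants on three multiple-choice items.
demo_scores = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
alpha_demo = cronbach_alpha(demo_scores)
```

Alpha approaches 1 when items rise and fall together across participants, and drops when items behave independently, which is why short tests covering varied content often yield modest values.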

Apparatus

Eye movements were recorded using the Eyelink 1000 (SR Research, Canada) eye-tracker system at a sampling rate of 1000 Hz. A chin-and-forehead rest was used to prevent apparent head movement. The movement of the right eye was tracked. The reading materials and tests were presented on a 24-in computer screen with a resolution of 1920 × 1200 pixels. The participants were asked to place themselves at 65 cm from the monitor and their eyes were leveled horizontally and vertically at 46° and 32°, respectively.

Procedure

Students participated in this study individually at their elementary school. The experiment consisted of three stages (Fig. 1). The first stage involved a standardized reading comprehension test (Ko, 2006) to select suitable participants. The second stage began approximately 2 weeks later. It was an eye movement experiment followed by an immediate test. The participants first read an illustrated scientific text while their eye movements were recorded, to ensure that the baseline reading behaviors of the three groups were similar. The participants then received one of the two reading instructions or no instruction. Next, the participants were told to learn the three reading steps and to apply them to two experimental texts. The participants in all groups had an equal training time of 20 min.

Fig. 1 The experimental procedure

The text–diagram integration instructions consisted of the following: first, “Take an overall look at the picture and speculate what the article intends to communicate.” Second, “Please read the sentences carefully and find information from the illustrations, then please imagine putting the sentence content into the respective illustrations and observe the illustrations carefully, for example, the shapes, sizes, relative positions, and relationships between components. Then, connect the information between the text and illustrations.” Finally, “Check how well you understand this article by yourself. Please look at the illustrations and recall the text content to test your reading comprehension and memory. If you have something you cannot say or clearly remember, please go back to the specific sentences to reread them.”

The comprehension monitoring instructions consisted of the following: first, “Please read the article and evaluate how much you comprehend. A score of 3 indicates you totally understand, 2 that you partially understand, and 1 that you do not understand the article content. Which score would you give yourself?” Second, “Please find where and what information you cannot understand in the article and try to use the reading strategies that you previously learned to resolve these misunderstandings.” Finally, “Check how well you understand this article by yourself. Please look at the illustrations and recall the text content to test your reading comprehension and memory. If you have something you cannot say or clearly remember, please go back to the specific sentences to reread them.”

The control group instructions consisted of: “I will give you an article to read. After you finish reading, please tell me what information you read in the article.” Then, the experimenter asked two questions: “What information was in the article?” and “What reading strategies did you adopt when your reading comprehension suffered?” In this group, the experimenter did not teach reading strategies to the participants but asked them which strategies they used. The participants were asked these two questions continuously until 20 min had elapsed, as this was equivalent to the time spent by the other two groups with different instructional interventions.

Using the practice scientific article, a research assistant taught participants individually how to use each strategy. Then, participants practiced these strategies using additional sentence(s) in the same article. If participants failed to use any reading strategy, the research assistant provided further instruction until they mastered all strategies.

Then, participants were instructed to use the strategies when reading another illustrated biology text (immediate test). All participants were informed that they would answer questions after reading. Excluding the reading instructions, the control group followed the same procedures, including being informed that they would answer questions after reading. Participants took approximately 60–70 min to complete this intervention.

A month later, two tests were administered in the delayed phase. Participants first completed the dolphin reading test again (delayed test) and were then instructed to read another illustrated biology text without reading instructions, to evaluate the retention of the acquired reading strategies and their transfer to the reading of new learning material. Participants took approximately 30 min to complete this third stage.

Data selection and measures of eye movements

The effective sample included 113 participants (text–diagram group, n = 38; comprehension monitoring group, n = 39; control group, n = 36), as data from 19 participants were discarded owing to unsuccessful eye-tracker recordings (2), unsuccessful calibrations (7), apparent drift (3), and distractions during the experiment (7). Because the second stage took a long time, some participants (especially those with low reading ability) became distracted during the eye movement experiment. Further, participants who did not read the article completely, who pressed the key to end the trial prematurely, or who had more than one paragraph of the experimental article with no eye fixations were excluded.

The areas of interest (AOIs) defined in this study were the texts and pictures in the illustrated scientific text. Several eye movement indicators were used in this study: total fixation duration (i.e., the sum of all fixation durations located in AOIs), where higher total fixation duration reflected greater cognitive effort processing (Just & Carpenter, 1980); the proportion of total fixation durations (i.e., fixation duration in specific AOIs divided by total fixation duration), reflecting attention distribution on specific regions; and the number of saccades between text and illustration (i.e., the total saccades from text to illustration and from illustration to text), reflecting efforts to integrate text with illustration (Mason et al., 2013).
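
As a minimal sketch of how these three indicators could be derived from a time-ordered fixation sequence: the AOI labels, the `Fixation` record, and the rule that any two consecutive fixations in different AOIs count as one text–illustration saccade are all assumptions made for illustration, not the authors’ actual analysis pipeline.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    aoi: str            # hypothetical labels: "text" or "diagram"
    duration_ms: float  # fixation duration in milliseconds


def summarize(fixations):
    """Compute total fixation duration, AOI proportions, and text-diagram saccades."""
    by_aoi = {"text": 0.0, "diagram": 0.0}
    for f in fixations:
        by_aoi[f.aoi] += f.duration_ms
    total = by_aoi["text"] + by_aoi["diagram"]
    # A saccade between text and illustration is counted whenever two
    # consecutive fixations fall in different AOIs (either direction).
    saccades = sum(1 for prev, cur in zip(fixations, fixations[1:])
                   if prev.aoi != cur.aoi)
    return {
        "total_fixation_duration": total,
        "text_proportion": by_aoi["text"] / total if total else 0.0,
        "diagram_proportion": by_aoi["diagram"] / total if total else 0.0,
        "text_diagram_saccades": saccades,
    }


# Invented example sequence: text, text, diagram, text, diagram.
example = [
    Fixation("text", 200.0),
    Fixation("text", 250.0),
    Fixation("diagram", 300.0),
    Fixation("text", 150.0),
    Fixation("diagram", 100.0),
]
result = summarize(example)
```

In this invented sequence the reader spends 1000 ms in total, 40% of it on the diagram, and crosses between text and diagram three times.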

Results

Learning outcomes

To answer the first research question, whether text–diagram reading instruction had a positive effect on reading comprehension, ANOVAs were conducted with instruction group (text–diagram, comprehension monitoring, and control) and reading ability (high and low) as independent variables, and the reading test scores (immediate, delayed, and new tests) as dependent variables. The descriptive values are shown in Table 1.

Table 1 Means and standard deviations of students’ learning outcomes

Immediate test

In the immediate multiple-choice questions, the results revealed main effects on the instruction groups, F(2, 107) = 4.65, p < .05, η2 = .08, and on reading ability, F(1, 107) = 27.69, p < .001, η2 = .21, but no interaction between instruction groups and reading ability, p > .05. Post hoc comparisons demonstrated that the text–diagram group had significantly higher scores on the immediate multiple-choice test than the other two groups, and high-ability students also significantly outperformed low-ability students, ps < .05. In the immediate essay questions, the results revealed a marginal main effect on the instruction groups, F(2, 107) = 2.51, p = .08, η2 = .05, and a significant effect on reading ability, indicating the high-ability students outperformed low-ability students, F(1, 107) = 23.76, p < .001, η2 = .18, and no interaction effect, p > .05.
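
The paper does not state which variant of η2 is reported. Assuming it is partial eta squared, the effect size can be recovered from the F statistic and its degrees of freedom as η2 = (F × df_effect) / (F × df_effect + df_error). The sketch below applies this to two of the effects above; the function name is illustrative, and small rounding differences from reported values are possible.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic (assumed effect-size variant)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Instruction group effect on immediate multiple-choice scores: F(2, 107) = 4.65
eta_instruction = partial_eta_squared(4.65, 2, 107)

# Reading ability effect on immediate multiple-choice scores: F(1, 107) = 27.69
eta_ability = partial_eta_squared(27.69, 1, 107)
```

Under this assumption, the two values round to .08 and .21, matching the effect sizes reported for these effects.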

Delayed test

In the delayed multiple-choice test, the results revealed a marginal main effect on the instruction groups, F(2, 107) = 2.75, p = .06, η2 = .05, a significant effect on reading ability, F(1, 107) = 19.07, p < .001, η2 = .15, but no interaction effect. Post hoc analyses revealed that the text–diagram group significantly outperformed the comprehension monitoring group; the text–diagram group marginally outperformed the control group, and high-ability students outperformed low-ability students. In the delayed essay questions, the results revealed no significant effect on the instruction groups, a significant effect on reading ability, F(1, 107) = 22.08, p < .001, η2 = .17, and an interaction effect that was marginally significant, F(2, 107) = 2.62, p = .07, η2 = .05. The results of simple main effects revealed that low-ability students showed a main effect on the instruction groups, F(2, 43) = 3.28, p < .05, η2 = .14, indicating that the text–diagram group outperformed the control group; however, for the high-ability students, there were no significant differences among the three groups in the scores on the delayed essay questions, p > .05. In addition, the high-ability students significantly outperformed the low-ability students not only in the control group, p < .001, but also in the comprehension monitoring group, p < .01.

A new test

One month later, the participants were asked to read a new science article without instruction interventions and complete a new test. In the multiple-choice questions, the results revealed a main effect on reading ability, F(1, 106) = 15.77, p < .001, η2 = .13, indicating that high-ability students outperformed the low-ability students; however, neither a main effect for the instruction groups nor an interaction of reading ability and instruction groups was evident, ps > .05. The scores on the essay questions showed similar patterns: there was a main effect on reading ability, F(1, 106) = 22.93, p < .001, η2 = .18, indicating that high-ability students outperformed the low-ability students; however, there was neither an interaction effect nor a main effect on the instruction groups, ps > .05.

Eye movement analysis

To answer the second research question, whether text–diagram reading instruction affected reading processes, ANOVAs were conducted with instruction group (text–diagram, comprehension monitoring, and control) and reading ability (high and low) as independent variables, and the eye movement indicators (total fixation duration on the article, texts, and diagrams; proportion of total fixation duration on the texts and diagrams; and the number of saccades between texts and diagrams) as dependent variables. As the eye movement indicators had different units (e.g., fixation durations, proportions of total fixation durations, and numbers of saccades between text and diagrams), the original values were converted into Z-scores to facilitate description (Fig. 2). Moreover, no interactions involving reading ability were found for the eye movement indicators, indicating that the high- and low-ability groups showed similar changes in eye movement patterns over time. Therefore, the data of all participants were combined, as shown in Fig. 2.
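
The Z-score conversion mentioned above can be sketched as follows. The paper does not specify the reference sample used for standardization (for example, whether values were pooled across groups and articles), so the use of the sample standard deviation over a single list of values is an assumption, and the example durations are invented.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize raw values so indicators with different units share one scale."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (assumed choice)
    if s == 0:
        raise ValueError("cannot standardize values with zero variance")
    return [(v - m) / s for v in values]


# Invented total fixation durations (in seconds) for three readers.
durations_z = z_scores([10, 20, 30])
```

After standardization, durations in milliseconds, proportions, and saccade counts can all be plotted on a common axis, with 0 marking the mean of each indicator.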

Fig. 2 The changing processes of eye movement patterns over time. Note. The manta ray article served as a baseline comparison for the three groups (text–diagram integration, comprehension monitoring, and control) before the instruction intervention. The turtle article was the sample article used during the instruction intervention. The dolphin article was read independently immediately after the instruction. The mantis article was read 1 month after the instruction intervention

First, to confirm that the participants randomly assigned to the three groups (text–diagram, comprehension monitoring, and control) had similar reading patterns before the reading instruction interventions, one-way ANOVAs were conducted on the eye movement data recorded while the participants were reading the first article (the manta ray). The results indicated that there was no main effect on instruction groups with regard to total fixation duration for the article, texts, and diagrams; the proportion of total fixation duration on the texts and diagrams; or the number of saccades between texts and diagrams, ps > .05. The descriptive values are shown in Table 2. Thus, the three groups did not differ significantly in reading patterns before the instruction intervention.

Table 2 Eye movement analyses from reading article one (manta ray), before reading instruction intervention, for all three groups (measured baseline reading patterns)

Second, to answer the question of whether text–diagram reading instruction affected reading processes at the moment of receiving the reading instruction, and whether reading ability moderated this effect, 3 × 2 ANOVAs were conducted on the eye movement data for the second (turtle) article. The descriptive values are shown in Table 3. The results showed main effects of instruction group on the total fixation duration on the article, F(2, 107) = 14.53, p < .001, η² = .21, total fixation duration on the diagrams, F(2, 107) = 47.32, p < .001, η² = .47, proportion of total fixation duration on the texts, F(2, 107) = 34.34, p < .001, η² = .39, proportion of total fixation duration on the diagrams, F(2, 107) = 47.32, p < .001, η² = .47, and the number of saccades between texts and diagrams, F(2, 107) = 54.28, p < .001, η² = .51; however, neither main effects of reading ability nor interaction effects were observed on the five eye movement indicators, ps > .05. Post hoc comparisons demonstrated that the text–diagram integration group had longer total fixation durations on the whole article and the diagram sections, a higher proportion of total fixation duration on diagrams, a lower proportion of total fixation duration on texts, and more saccades between texts and diagrams than the other two groups, ps < .001; however, this was not the case for total fixation duration on the text sections, ps > .05.
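The reported effect sizes can be related to the F statistics with a short worked example. The sketch below assumes that the reported values are partial eta squared, which the text does not state explicitly; under that assumption the effect size can be recovered from F and its degrees of freedom. The function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    # Assumption: the reported effect sizes are partial eta squared
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Total fixation duration on the article: F(2, 107) = 14.53
print(round(partial_eta_squared(14.53, 2, 107), 2))  # 0.21

# Total fixation duration on the diagrams: F(2, 107) = 47.32
print(round(partial_eta_squared(47.32, 2, 107), 2))  # 0.47
```

Both values agree with the effect sizes reported for these two indicators. Small differences can arise for other indicators because the published F values are themselves rounded.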

Table 3 Eye movement analyses for reading the three articles, after receiving the instructions

Third, to answer whether text–diagram reading instruction affected reading processes when an article was read immediately after the instruction, and whether reading ability moderated this effect, 3 × 2 ANOVAs were conducted on the eye movement data for the third (dolphin) article. The descriptive values are shown in Table 3. The results showed main effects of instruction group on total fixation duration on the article, F(2, 107) = 7.38, p < .01, η² = .12, total fixation duration on the diagrams, F(2, 107) = 23.03, p < .001, η² = .30, proportion of total fixation duration on the texts, F(2, 107) = 17.27, p < .001, η² = .24, proportion of total fixation duration on the diagrams, F(2, 107) = 23.03, p < .001, η² = .30, and the number of saccades between texts and diagrams, F(2, 107) = 25.67, p < .001, η² = .32; however, neither main effects of reading ability nor interaction effects were observed on the five eye movement indicators, ps > .05. Post hoc comparisons demonstrated that the text–diagram integration group had longer total fixation durations on the whole article and the diagram sections, a higher proportion of total fixation duration on diagrams, a lower proportion of total fixation duration on texts, and more saccades between texts and diagrams than the other two groups, ps < .05; however, this was not the case for total fixation duration on the text sections, ps > .05.

Fourth, to answer whether text–diagram reading instruction affected reading processes 1 month after the instruction, and whether reading ability moderated this effect, 3 × 2 ANOVAs were conducted on the eye movement data for the fourth (mantis) article. The descriptive values are shown in Table 3. The results showed main effects of instruction group on total fixation duration on the diagrams, F(2, 106) = 10.56, p < .001, η² = .17, proportion of total fixation duration on the texts, F(2, 106) = 8.00, p < .01, η² = .13, proportion of total fixation duration on the diagrams, F(2, 106) = 10.56, p < .001, η² = .17, and the number of saccades between texts and diagrams, F(2, 106) = 12.03, p < .001, η² = .19; however, neither main effects of reading ability nor interaction effects were observed for the five eye movement indicators, ps > .05. Post hoc comparisons demonstrated that the text–diagram integration group had longer total fixation durations on the diagram sections, a higher proportion of total fixation duration on diagrams, a lower proportion of total fixation duration on texts, and more saccades between texts and diagrams than the other two groups, ps < .01; however, this was not the case for total fixation duration on the whole article or on the text sections, ps > .05.

Discussion and conclusion

This study investigated the immediate and delayed effects of text–diagram reading instruction on reading comprehension and learning processes during illustrated text reading. Prior studies focused on the effect of text–diagram instruction on learning outcomes (Jian, 2018a; Scheiter et al., 2015) rather than on learning processes. This study used an eye-tracker to identify the reading processes that may underlie recent findings on the effect of text–diagram integration instruction. In addition, as reading ability is important and influences reading processes (Jian & Ko, 2017; Hannus & Hyönä, 1999), it was included to examine whether it moderated the instruction effects. In line with prior research that measured the immediate effect of reading instruction (Jian, 2018a, b), this study measured the immediate effect and, in addition, the delayed effect of the intervention, as readers may rely on different and more superficial representations when reading articles (Arndt et al., 2015). To control for the possibility of a placebo effect, as raised by an earlier study (Jian, 2018b), this study included a control group that received different instructions.

Corroborating the CTML assumptions (Mayer, 2009), the results confirmed Hypothesis 1a and showed better reading comprehension in the group that received text–diagram instruction. The effect of text–diagram instruction on learning outcomes replicates recent findings (Jian, 2018a, b) showing that text–diagram instruction facilitated reading comprehension on immediate tests for both high- and low-ability readers. Additionally, the results confirmed Hypothesis 1b: the effect of text–diagram reading instruction was larger on the immediate test than on the delayed test. This replicated the findings of previous reading instruction research (Manoli et al., 2016; Torgesen et al., 2001). An encouraging finding of this study was that even after 1 month, the text–diagram instruction was still effective for the low-ability young readers. The scores on multiple-choice questions were higher for the readers who received text–diagram instruction than for the control group readers. The eye-tracking patterns suggested that readers who received the instruction, which emphasized diagram decoding and text–diagram integration, inspected the pictures to form an impression, which supported their understanding of the corresponding text (Eitel, Scheiter, & Schüler, 2012; Lindner, Eitel, Strobel, & Köller, 2017). This supports the view that connecting text and illustrations in multimedia learning is crucial to constructing a good mental model (Mayer, 2009).

Eye movement data identified the reading processes that may underlie the impact of text–diagram instruction (Scheiter et al., 2015). Figure 2 depicts how the reading processes changed over time. Before instruction, all groups (text–diagram, comprehension monitoring, and control) had similar reading processes; their total fixation durations on the article, text, and diagrams, their proportions of total reading time, and their numbers of saccades between text and diagrams were similar while reading the first (manta ray) article. Nevertheless, during and after the instructional intervention, there was evidence that the text–diagram group changed their strategies and reflected on their reading processes. Compared with the other two groups, they spent more time reading the second and third (turtle and dolphin) articles and their diagram sections, spent a higher proportion of total reading time on the diagrams, and made more saccades between text and diagrams. These results confirmed Hypothesis 2a. Interestingly, text–diagram instruction did not change the reading processes on the text sections. The results of this study differed somewhat from those of Scheiter et al. (2015), who used middle-school students as participants. Their study found that the students were likely to use their original strategies to read illustrated texts instead of a newly taught reading strategy. Scheiter et al. considered that middle-school students may have numerous existing reading strategies that cannot be easily changed in a short period of reading training. In contrast, young readers may more easily accept the reading strategies taught to them and apply them when reading new articles. This speculation was supported by the results of this study, which showed that the students who received text–diagram integration instruction used different reading patterns from the other two groups, even when tested 1 month after the instructional intervention.

In addition, it was found that although readers seemed to have formulated a criterion for the time needed to comprehend an article and maintained that time across articles, their comprehension of the articles differed. Figure 2 indicated that readers had similar Z-scores for total fixation duration and numbers of fixations on the first article (before the intervention) and the fourth article (1 month after the intervention). However, although the text–diagram group performed well even after 1 month, they still devoted more cognitive effort and time to applying the learned strategies when reading the fourth article. The instruction effect diminished somewhat over time (confirming Hypothesis 3): the differences in eye movements between the text–diagram group and the other two groups were somewhat smaller on the delayed test but, crucially, remained statistically significant. The results of this study not only replicated previous research concerning learning outcomes on immediate and delayed tests (Manoli et al., 2016; Roy et al., 2017; Torgesen et al., 2001), but also examined the reading processes over time by comparing the results of immediate and delayed tests.

Contributions and limitations

Overall, this study provided both theoretical and instructional contributions to the field. Theoretically, this study confirmed that the CTML (Mayer, 2009) can be generalized to explain young readers’ learning processes during illustrated text reading, provided that they receive well-designed reading strategy instruction. Using eye movement data, this study also identified how these young readers selected, organized, and integrated the textual and pictorial representations. Instructionally, this study provided an effective text–diagram reading instruction intervention for young readers, which differs from the many text-only reading strategies developed previously. Furthermore, the text–diagram integration instruction developed in this study corresponded to the purpose of reading to learn novel knowledge (Cain et al., 2001; Clark et al., 2018; Förster et al., 2018; Reed & Lynn, 2016), and provided evidence that the positive effect can last up to 1 month after the instruction.

However, some limitations remain for future research to address. First, the internal consistency of the reading tests was not very high in this study. Although a pilot study with item analysis (including distractor analysis, item difficulty, and discrimination) was conducted before the formal experiment to ensure the quality of the items, future research needs to improve and verify the internal consistency of the reading tests by revising the test items iteratively. Second, the text–diagram instruction of this study should be tested to verify whether it works in classroom-based interventions. Third, further research could examine the effects of the reading instruction on reading comprehension and reading processes in other subject areas.