Introduction

In scientific articles, diagrams play an important role in supplementing textual explanations of concepts. This has been shown by several studies conducted in the fields of science education and psychology (Ainsworth 1999; Cook 2006; Ferk et al. 2003; Peeck 1993; Schnotz 2002; Slough and McTigue 2010; Slough et al. 2010). The content of diagrams, and the form in which they are presented, can directly affect readers’ interpretation and understanding of the diagrams (Pozzer and Roth 2003; Unsworth 2001). This, in turn, facilitates a text-diagram integration process (Mayer and Moreno 2003; Peeck 1993).

In recent years, diagrams have been used extensively within science textbooks and other scientific publications (Slough et al. 2010). There are many different types of diagrams, classified according to form and function. Among these, analytical diagrams are most commonly used in scientific articles (Roth et al. 2005; Slough et al. 2010). Slough et al. (2010) found that 64.6 % of the diagrams used in fifth- and sixth-grade science textbooks were analytical diagrams, as were 66.0 % of those in eighth- and ninth-grade science textbooks. In addition, 52.9 % of the diagrams in scientific articles were analytical. Analytical diagrams can be used to represent the part-whole relationship within the structure of an object (Kress and van Leeuwen 1996); for example, a diagram of the internal structure of a cell shows the nucleus, cell membrane, etc. In terms of form, this type of diagram shows the names of the individual components of an object and their respective spatial locations. In terms of function, the names and locations correspond to the semantic representation and spatial representation, respectively. If a diagram contains both forms of representation, and its explanatory message corresponds with the text, it will effectively perform its supplementary role. Empirical studies have also shown that this method of explanation helps readers significantly when they are learning scientific concepts (Carney and Levin 2002; Mayer and Gallini 1990).

An increasing number of studies have applied eye-tracking technologies to investigate the cognitive processes involved in the interpretation of diagrams (De Koning et al. 2007, 2010) and of both text and diagrams (Jian and Wu 2012; Hyönä 2010; Rayner et al. 2001). Ratwani et al. (2008) used choropleth diagrams to explain the cognitive processes of reading diagrams. The results showed that readers focused their attention on specific parts of a diagram to answer an information-specific question. However, readers engaged two cognitive components when answering an integration question: visual integration and cognitive integration. Visual integration involves the formation of visual clusters. During this process, readers’ eye movement patterns indicate that attention is being directed to the boundaries between different color regions. Cognitive integration, on the other hand, involves repeated comparisons between the various color zones, with multiple inter-cluster eye movement patterns to derive a quantitative and qualitative explanation. Ratwani et al. (2008) also indicated that labels placed in specific parts of a diagram play an important role during the process of information-specific extraction. However, labels obstruct the formation of visual clusters; therefore, these labels are not useful for answering integration questions. This issue highlights that the use of labels on a diagram merits further investigation.

Previous eye movement research has examined readers’ cognitive processes when reading scientific articles containing diagrams by analyzing eye movement patterns (Hannus and Hyönä 1999; Hegarty 1992; Hegarty and Just 1993; Holsanova et al. 2009). This research was based on dual coding theory (Paivio 1990), which postulates that humans make use of two different systems when processing cognitive information: verbal and pictorial. The former represents textual information, while the latter represents visual images. When humans first come into contact with these two types of information, they temporarily store the information in the sensory buffer. The information is then transferred to short-term memory, becoming semantic representations and spatial representations, respectively. Through cross-reference and linkage between the two types of representation, the cognitive concept of the information being constructed becomes a complete mental model.

Empirical studies have shown that for scientific topics, having the information presented in both text and diagram formats leads to readers’ better understanding and memory performance compared to information that is presented either only by text or by diagrams (Hegarty and Just 1993). In addition to studies using a visual format (information represented by text and/or diagrams), there have also been studies using an auditory format. Subjects were shown a picture and given a verbal description of its contents. The purpose was to compare single and dual forms of representation to see whether the latter causes different eye movement patterns and to measure its effect on learning performance (Ozcelik et al. 2010; She and Chen 2009).

Hegarty (1992) conducted a series of studies on the reading of scientific articles containing diagrams using eye tracking. In eye-tracking research, fixation duration and saccade length are two important eye movement indicators (Rayner 1998). Fixation duration refers to the amount of time that a reader spends fixating his or her eyes on a specific location before switching to another location. Longer fixation durations indicate that a reader needs more time to decode the information that he or she is looking at. Saccade length refers to the distance between two successive fixations. Longer saccade lengths indicate that the reading material is relatively easier for the reader. Undergraduates viewed a diagram illustrating a pulley system and were then asked to determine whether a description of its mode of operation was accurate. Eye movement sequences and fixation durations were the two indicators used in this study. The results showed that the subjects’ eyes moved back and forth repeatedly between several related and consecutive parts of the diagram and between the text and diagram. Rereading certain areas was also observed. The findings indicated that the subjects’ mental activities could not concurrently simulate or process all the details related to the pulley system diagram. A gradual, step-by-step process was needed, whereby the various components of the pulley system were deconstructed to form representations for memory storage. When the effort needed for making representations and memory storage exceeded the cognitive load of a subject, the conceptual representation of the sentences originally stored in working memory would be affected. When the memory trace was diminished, the subject would have to read the sentences again to form a new representation. This resulted in the back and forth eye movement patterns in areas where the concepts of textual and diagram semantics corresponded to each other.
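The two indicators defined above can be illustrated with a short calculation. The following is a minimal sketch, not the procedure used in any of the cited studies; the fixation records, their pixel coordinates, and the function names are hypothetical and serve only to show how a mean fixation duration and saccade lengths could be derived from a sequence of fixations.

```python
from math import hypot
from statistics import mean

# Hypothetical fixation records in temporal order:
# (x position in pixels, y position in pixels, duration in ms).
fixations = [
    (100, 200, 220),
    (160, 200, 180),
    (400, 200, 310),
    (400, 500, 250),
]


def mean_fixation_duration(fixes):
    """Average time (ms) the eyes rested on each fixated location."""
    return mean(duration for _, _, duration in fixes)


def saccade_lengths(fixes):
    """Euclidean distance (pixels) between each pair of successive fixations."""
    return [
        hypot(x2 - x1, y2 - y1)
        for (x1, y1, _), (x2, y2, _) in zip(fixes, fixes[1:])
    ]


print("mean fixation duration (ms):", mean_fixation_duration(fixations))
print("mean saccade length (px):", mean(saccade_lengths(fixations)))
```

In this sketch, saccade length is measured in pixels; converting it to degrees of visual angle would additionally require the screen geometry and viewing distance.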

Hegarty and Just (1993) adopted the premise of the above-stated studies and carried out variations of the experiment. The same diagram of a pulley system was used but with a separate text describing the components of the pulley system and the principles of kinematics. Three different reading conditions were manipulated. One condition provided a pulley system diagram only, another condition provided a text describing how the pulley system operates, and the other provided both text and a diagram for the subjects. The subjects’ level of background knowledge was also considered as an experimental variable. The findings showed that regardless of the level of knowledge, subjects who read both the text and diagram scored significantly better on the comprehension test than the other two groups. The study also found that those subjects with less knowledge of mechanics suffered difficulty in building a mental model of the pulley system. These individuals scored relatively lower on the comprehension test, displayed more eye movements transferred between the text and diagram, and spent a majority of their total fixation duration on examining the different parts of the diagram.

Taken together, these studies suggest a conclusion offered by Hegarty (1992): When looking at both text and diagrams, readers are mainly guided by the text first. Readers’ eyes move back and forth within the textual area to integrate information. At the end of several sentences, readers’ eyes move to the corresponding part of the diagram to make a cross-reference. When looking at a diagram, readers examine local information contained within each component and decode the relationships. Next, readers inspect the relationships between several components, integrate the information, and then form a complete representation of knowledge. This series of studies is based on eye movements regarding how readers allocate their visual attention to different parts of text-diagram articles and utilize sequential reading paths to infer the cognitive processes of constructing a mechanical kinematic representation.

Recently, Mason et al. (2013a) used an eye tracker to investigate the effects of reading a scientific text illustrated by either a labeled or an unlabeled picture. They asked sixth graders to read a scientific article about air pressure. The results revealed that those who read the text accompanied by the labeled illustration performed better at reading comprehension than those in the other two conditions (only text or text with the unlabeled illustration). This suggested that the labels on the illustration provided clues to the corresponding information in the text and helped readers to use the strategy of cross-referencing the text and illustration. This was supported by eye-tracking data showing that the labeled illustration promoted more integrative processing of the learning material, as revealed by the time spent fixating on text segments while reinspecting the illustration. This is an important study on the reading of illustrated text. However, an unresolved question that arose from this study is at which reading stage the label effect on the illustration, which leads to semantic activation, occurs. In other words, does it occur during the initial processing stage of decoding or the late processing stage of comprehension? This question is the main motivation behind the current study.

From the aforementioned literature review, it can be seen that the cognitive process involved in reading an article containing diagrams is quite complex. The respective researchers’ attempts at explaining the strategies adopted by readers can also be subtly discerned (Hegarty 1992; Hegarty et al. 1991; Hegarty and Just 1993; Mason et al. 2013a, b).

Reading strategies are an important topic in the field of reading research, with several studies having been conducted on the various strategies adopted by readers when reading text only (Afflerbach et al. 2008; Yang 2006). However, few studies have focused on the strategies related to reading both text and diagrams. One example is the aforementioned study conducted by Hegarty and Just (1993), which used texts on mechanics as reading materials. They found that most adult readers adopted the text-diagram referencing strategy, reading two or three sentences before focusing their attention on a few relevant parts of the diagram. For advertisements that contain both text and diagrams, research by Rayner et al. (2001) indicates that adult readers generally look at the main headlines before reading the detailed text and, finally, look at the diagram displaying the product.

This series of studies describes the typical text-diagram referencing strategy that most readers adopt when looking at both text and diagrams. However, this research did not distinguish between the different reading strategies adopted by readers. Jian and Wu (2011, 2012) attempted to categorize the strategies that readers adopted when looking at a scientific article containing diagrams and found other reading strategies in addition to the text-diagram referencing strategy. Jian and Wu (2011) asked undergraduates to read a biological article with diagrams. They found that 60 % of the subjects adopted the text-diagram referencing strategy directly, while 35 % looked at the diagram first before reading the text. Jian and Wu (2012) used the same material in a follow-up study. The majority of subjects (67 %) adopted the text-diagram referencing strategy. However, 21 % looked at the diagram first before reading the text, while the remaining 12 % finished reading the text before examining the diagram. Since there are different types of reading strategies, one of the aims of the current study was to investigate whether the various strategies produce different levels of reading efficiency and effectiveness. The text-diagram referencing reading strategy represents readers’ cognitive processes when integrating text and diagrams (Hannus and Hyönä 1999; Hegarty and Just 1993; Holsanova et al. 2009; Johnson and Mayer 2012). However, the exact information from the diagram that is being integrated with the text has not been studied, providing another aim for the current study.

A well-designed diagram should provide two types of information: semantic information from the text and spatial information provided by the configuration of the components (Mayer 2001, 2005; Mayer and Gallini 1990). On the one hand, text labels on a diagram may obstruct the integration of visual clusters (Ratwani et al. 2008). On the other hand, labels can help readers find information on a diagram that is related to content in the text (Hegarty 1992) or facilitate comprehension when extracting semantic information from the diagram (Mayer 1989; Mayer and Gallini 1990). Research on repetition priming has shown that through the repetition of a stimulus word, the amount of time needed by readers to process the same target word later is reduced (Ledoux et al. 2007; Liversedge et al. 2003). Since the stimulus word has triggered information related to the target word, fewer cognitive resources are needed to process the same word later. In the current study, we wanted to find out whether such a facilitative effect happens when reading an article containing diagrams. Specifically, in situations where the semantic information has already been processed by looking at the diagram first, we wanted to know whether that would facilitate the processing of the same words when reading the text afterward.

Another aspect to be studied is the spatial information that is usually found in a diagram, which provides direction for readers to form mental representations of the concepts being learned. The literature has already identified that the arrows and color of a well-designed diagram can do precisely that. Previous studies have also shown that arrows used in a good design should provide readers with guiding messages along the path direction, such that readers would perform better during the cognitive course and conclusion of the learning process compared to other readers who were not provided with those arrows (Heiser and Tversky 2006; Mayer 2001, 2005; Mayer and Gallini 1990). Furthermore, the effective use of coloration can guide readers to pay attention to the related parts of a diagram, thereby facilitating cognitive processing (Boucheix and Lowe 2010). Through the current study, we wanted to further investigate the effectiveness of reading between readers using different strategies to look at both text and diagrams at different points in time during the reading process, as well as when different spatial messages from the diagram were integrated into the text.

The current study had two research questions. First, we investigated the strategies used by adult readers when reading a scientific article containing an analytical diagram and compared the information extracted by readers using different strategies when first examining the diagram. Second, we wanted to find out whether, while reading an analytical diagram, readers were able to obtain information from both the semantic and spatial representations, from only one of them, or from neither.

Several assumptions and predictions were made for the current study. For the first research question, Jian and Wu (2011, 2012) observed that adult readers used multiple reading strategies, rather than only the text-diagram referencing strategy, while reading an illustrated scientific article. We therefore predicted that the participants in this study would show different eye movement scan paths, such as the text-diagram referencing, text-first, and diagram-first paths observed in the aforementioned literature, or other reading paths.

For the second research question, we first assumed that readers would be able to extract semantic information from the diagram. If so, repetition priming should occur (Ledoux et al. 2007; Liversedge et al. 2003): readers who had already preliminarily processed the semantics of a scientific term when looking at the diagram should identify that same term more easily when subsequently reading the text. Hence, readers who read the diagram before the text should spend less time on their first continuous viewing of a scientific brain term in the text than readers who read the text directly without first reading the diagram. In addition, because a diagram has the advantage of conveying the configuration of components (Mayer 2001, 2005; Mayer and Gallini 1990), and because organizing and integrating verbal and pictorial representations helps readers construct a good mental model (Mayer 2001, 2005), we predicted that readers would be able to extract the spatial message from the diagram; hence, the time spent by a diagram-first reader looking back and forth at the spatial paths of the related sentences in the text should be significantly shorter than that spent by a text-first reader. A preliminary spatial representation should have already been formed by the diagram-first reader when looking at the diagram, which would reduce the degree of dependence on the spatial message formed through reading. Hence, the amount of time spent looking back and forth at sentences in the text should be correspondingly reduced. Since a text-first reader has to rely on the messages conveyed by the text to form spatial representations, the reading behavior should involve more looking back and forth among the sentences in the text before a spatial path is formed through the integration of several sentences.

Method

Participants

Seventy-one students recruited from the National Taiwan Normal University in Taiwan participated in this experiment. The participants majored in education, management, arts, and social science. We excluded students who majored in science or medicine because such students were likely to have a neuroscience background and prior knowledge relevant to the reading materials in this study. All participants had normal or corrected-to-normal vision.

Materials

One neuroscience article with a diagram was taken from Scientific American (LeDoux 1994, p. 56) and translated into Chinese. This article describes the pathways of the fear response and awareness in the human brain. The text section comprised 398 Chinese characters, and its content consisted of four emotional-pathway sentences, describing an inferior-cortex pathway, a superior-cortex pathway, a response pathway, and an awareness-evaluation pathway. In addition, two sentences preceding the four emotional-pathway sentences gave a brief introduction to the article, and there was also a concluding sentence. The diagram section of the article had seven technical terms referring to the brain, which were the same as those contained within the text section. In order of appearance in the text, these were thalamus, amygdala, visual cortex, brainstem, anterior cingulate gyrus, hippocampus, and prefrontal cortex. The article was typeset with the text wrapped around the diagram (Fig. 1). The whole article was displayed on a single screen page.

Fig. 1 The reading material in this study (Chinese edition)

The reading comprehension test consisted of ten yes/no questions. Of these questions, three were text-based questions (e.g., Is the hippocampus involved in attention execution?), four were pathway questions (e.g., Does information transformation in the subcortical pathway of one’s brain start from the eyes, continue to the thalamus, amygdala, and finally to the brainstem?), and three were integrating questions (e.g., Does this article discuss the fear response system and neural mechanism of danger awareness?). The article was not on display while participants answered these questions.

Apparatus

Participants’ eye movements were recorded with an EyeLink 1000 eye tracker at a sampling rate of 1,000 Hz. A chin bar was used to minimize head movement. The article, composed of text and a diagram, was displayed on a 19-inch monitor with a resolution of 1,024 × 768 pixels. The diagram was about 16 × 10 cm. The text had two sections, one on the left side of the diagram and the other under the diagram, 11 cm × 6 cm and 16 cm × 10 cm, respectively. Participants sat approximately 65 cm from the monitor. The whole article covered 40° horizontal and 32° vertical visual angles.
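The relation between physical size, viewing distance, and visual angle can be made concrete with the standard formula θ = 2 · arctan(size / (2 · distance)). The sketch below applies it to the reported diagram dimensions and viewing distance; the function name is hypothetical, and the calculation assumes the stimulus is viewed head-on and centered, which the text does not state explicitly.

```python
from math import atan, degrees


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by an object of the given size,
    viewed head-on from the given distance."""
    return degrees(2 * atan(size_cm / (2 * distance_cm)))


# Diagram dimensions and viewing distance as reported in the Apparatus section.
diagram_width_cm, diagram_height_cm = 16, 10
viewing_distance_cm = 65

horizontal = visual_angle_deg(diagram_width_cm, viewing_distance_cm)
vertical = visual_angle_deg(diagram_height_cm, viewing_distance_cm)
print(f"diagram: {horizontal:.1f} deg wide, {vertical:.1f} deg high")
```

The same function can be applied to any stimulus region once its physical size on the screen is known.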

Procedure

Participants were instructed to read an article and then complete the yes/no comprehension questions. There was no time limit for the reading procedure, following previous studies on reading (Hegarty and Just 1993; Jian and Wu 2012), in order to provide a natural reading condition. Therefore, participants set their own reading speed. After eye movement calibration and verification, participants were asked to read a practice article and respond to a couple of comprehension questions. Subsequently, the formal experiment was executed. The ten comprehension questions were displayed one at a time, and the time taken for participants to respond to each question was recorded. Eye movements for reading the article and response time for each comprehension question were recorded at the same time. The experiment lasted for approximately 20–30 min.

Data Selection and Scoring Criterion

Six participants’ eye movement data were excluded due to apparent drift. These participants’ fixations were almost entirely located in the upper (or lower) part of the screen (blank space), whereas the experimental material in this study was displayed in the center of the monitor. In eye-tracking experiments, a crucial first step is to confirm that participants’ fixations show no apparent drift, so that the locations a reader looked at are recorded accurately. The remaining 65 participants provided adequate samples. As in previous eye movement research (Andrews et al. 2004; Jian et al. 2013), fixations shorter than 100 ms were excluded; these accounted for approximately 3 % of all fixations (Fig. 2).
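The fixation-duration criterion described above amounts to a simple filter. The following sketch shows one way to apply the 100 ms threshold and report the proportion of fixations excluded; the duration values and the function name are hypothetical and are not taken from the study's data.

```python
MIN_FIXATION_MS = 100  # exclusion threshold stated in the text


def drop_short_fixations(durations_ms, threshold_ms=MIN_FIXATION_MS):
    """Return (kept durations, proportion of fixations excluded).

    Fixations strictly shorter than the threshold are excluded."""
    kept = [d for d in durations_ms if d >= threshold_ms]
    if not durations_ms:
        return kept, 0.0
    excluded = 1 - len(kept) / len(durations_ms)
    return kept, excluded


# Hypothetical fixation durations (ms) for one participant.
durations = [85, 120, 240, 99, 310, 100, 180, 260, 205, 150]
kept, excluded = drop_short_fixations(durations)
print(f"kept {len(kept)} fixations, excluded {excluded:.0%}")
```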

Fig. 2 Four reading types: upper left, text-diagram referencing type; upper right, diagram-first type; lower left, text-first type; lower right, text-only type. The numbered circles indicate readers’ reading sequences

Following Jian and Wu (2012), we created three groups based on participants’ reading behavior while reading the article. The first was the text-diagram referencing group, in which readers went back and forth between the text and diagram sections. The second was the diagram-first group, in which readers read the diagram section of the article for a while and then read the text section. The third was the text-first group, in which readers read the text section for a while and then referred to the diagram. Two graduate students served as raters and independently judged which group each participant’s eye movements belonged to, according to the above three criteria. If the two raters’ judgments were inconsistent, they discussed the case until a consensus was reached. If disagreement remained, the raters conferred with the experimenter.

Results

Eye Movements and Comprehension for the Four Reading Types

Two raters who were blind to the purpose of this study were invited to rate each participant’s reading pattern. They independently examined each participant’s eye movements. Before making judgments, the raters trained by practicing on eye movement patterns obtained in the pilot study. Disagreements were resolved by discussion until a consensus was reached.

In addition to the three reading patterns classified above, there were two other reading patterns. One was merely reading the text section but not the diagram section, and the other could not be classified as a specific reading type. These five reading types are (1) text-diagram referencing: participants initially read a few sentences and then referred to the diagram. There were many reference behaviors between the text and diagram. (2) Diagram-first: participants read the diagram section or one (or two) sentences prior to the first proper noun, “thalamus,” and then transferred their fixations to the diagram section. Finally, these participants went back to the text section to refer to the relevant information between the text and diagram. Results showed that these participants’ total fixation durations ranged from 3.78 to 18.84 s during initial processing of the diagram. (3) Text-first: participants almost wholly read the text section and then began to read the diagram section. (4) Text-only: participants only read the text section and did not refer to the diagram section. (5) Others: the reading patterns could not be classified into any of the above four reading groups. Results showed that 23 participants belonged to the text-diagram referencing group, nine participants belonged to the diagram-first group, 24 participants belonged to the text-first group, seven participants belonged to the text-only group, and two participants belonged to the others group. The inter-rater reliability was 1.
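The text does not specify which statistic underlies the reported inter-rater reliability of 1. As an illustration only, the sketch below computes two common candidates, percent agreement and Cohen's kappa, for two raters who assign identical labels to all 65 participants. The group sizes are those reported above; the function names and the assumption of identical labels are ours.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of cases on which the two raters assigned the same label."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1:  # both raters used one and the same category throughout
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Group sizes reported in the text.
group_sizes = {
    "text-diagram referencing": 23,
    "diagram-first": 9,
    "text-first": 24,
    "text-only": 7,
    "others": 2,
}
# Assumption: both raters assigned exactly these labels to every participant.
rater_a = [label for label, n in group_sizes.items() for _ in range(n)]
rater_b = list(rater_a)

print(len(rater_a), percent_agreement(rater_a, rater_b), cohens_kappa(rater_a, rater_b))
```

Under this assumption both measures equal 1, which is consistent with the reported value; any disagreement between raters would lower kappa more sharply than percent agreement.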

Accuracy and reaction times on the comprehension test for the four reading groups are shown in Table 1. Results showed that there were significant differences among comprehension question types for the four reading groups, F(3, 59) = 3.64, p = .018, η² = .16. Post hoc comparisons revealed that the diagram-first group (85 %) was more accurate on text-based questions than the text-only group (43 %). Participants in the text-diagram referencing (75 %) and text-first (74 %) groups were also more accurate on the text-based questions than were text-only participants. However, participants did not differ on accuracy for the pathway and integrating questions, ps > .10. As for reaction time, participants in the four reading groups differed significantly, F(3, 59) = 2.97, p = .039, η² = .13. Post hoc comparisons revealed that participants in the text-diagram referencing (10.06 s), diagram-first (10.37 s), and text-first groups (8.57 s) had significantly longer reaction times than the text-only group (6.00 s).
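The reported effect sizes can be checked against the F values. For a one-way between-groups design, η² = SS_between / SS_total, which can be rewritten as (F · df_between) / (F · df_between + df_within); the same expression gives partial η² in more complex designs. The sketch below applies this identity to the two F values above. The function name is hypothetical, and whether the authors computed η² or partial η² is an assumption.

```python
def eta_squared_from_f(f_value, df_between, df_within):
    """Effect size recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_between) / (f_value * df_between + df_within)


# F values reported in the text, each with df = (3, 59).
reported_f = {"accuracy": 3.64, "reaction time": 2.97}

for measure, f_value in reported_f.items():
    print(measure, round(eta_squared_from_f(f_value, 3, 59), 2))
```

Rounded to two decimals, the results agree with the effect sizes reported above (.16 and .13).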

Table 1 Accuracy and reaction time of the comprehension test

Participants’ eye movements for the four reading groups are shown in Table 2. First, we considered the whole article as an area of interest (AOI). There were significant differences in total fixation durations for participants within the four reading groups, F(3, 59) = 2.83, p = .046, η² = .13. Post hoc comparisons revealed that participants in the text-diagram referencing (126.03 s) group had longer total fixation durations than participants in the text-only (67.56 s) group. However, on measures of mean single fixation duration and mean saccade length, there were no significant differences between groups, ps > .10.

Table 2 Means for eye movement measures for participants of four reading types

We next used the diagram section as an AOI. By definition, participants in the text-only group did not read the diagram. In addition, there were no significant differences in total fixation durations on the diagram and reading time ratio of the diagram for participants in the text-diagram referencing, diagram-first, and text-first groups, ps > .10. When using the text section as an AOI, participants in the text-only group spent all of their time reading the text section. Thus, the total fixation duration ratio of the text section was significantly higher for this group than for the other three reading groups, F(3, 59) = 21.08, p < .001, η² = .52. In addition, participants in the text-first group had a higher total fixation duration ratio for the text section than did participants in the diagram-first group (80 vs. 71 %).

As for total fixation durations on the text AOI, there were no significant differences between the four reading groups, p > .10. As for the number of saccades between the text and diagram AOIs, there were significant differences between the four reading groups, F(3, 59) = 8.24, p < .001, η² = .30. Post hoc comparisons revealed that participants in the text-diagram referencing group (15.52 times) made more saccades between the text and diagram AOIs than did participants in the text-first (6.75 times) and text-only (0.00 times) groups. Participants in the diagram-first group (14.78 times) also made more saccades between the text and diagram AOIs than did participants in the text-only group.
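The number of saccades between the text and diagram AOIs can be obtained by counting changes of AOI in the temporal sequence of fixations. The following is a minimal sketch under stated assumptions: each fixation has already been assigned an AOI label, fixations falling outside both AOIs are ignored, and the labels and function name are hypothetical rather than taken from the study's analysis software.

```python
def count_transitions(aoi_sequence):
    """Count gaze shifts between the text and diagram AOIs, in either direction.

    Fixations outside both AOIs (any other label) are ignored."""
    relevant = [aoi for aoi in aoi_sequence if aoi in ("text", "diagram")]
    return sum(prev != curr for prev, curr in zip(relevant, relevant[1:]))


# Hypothetical AOI label for each successive fixation of one reader.
sequence = ["text", "text", "diagram", "diagram", "text", "blank", "text", "diagram"]
print(count_transitions(sequence))

# A reader who never fixates the diagram produces no transitions.
print(count_transitions(["text"] * 5))
```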

The experimenter wondered whether the text-only participants took the task seriously. However, the raters observed that these participants read the whole article without skipping any sentences; a few participants even read the article several times. Moreover, text-only participants did not differ significantly from the other three reading groups in terms of total fixation durations, saccade length, and total fixation durations on the text section. Therefore, the different reading patterns might be due to different reading styles or strategies instead of the text-only participants not taking the task seriously.

Detailed Eye Movements for the Participants in the Diagram-First and Text-First Groups

We analyzed eye movements for participants in the diagram-first and text-first groups toward the technical brain words in the text (named “technical words AOIs”) to assess whether readers were able to construct the semantic representation from the diagram section. Then, we took the brain-pathway sentences in the text (named “pathway sentences AOIs”) to examine whether readers were able to construct the pathway-representation from the diagram section.

If there was no fixation within a specific analysis area, then this area was not taken into account during calculation but regarded as missing data. Therefore, the degrees of freedom for seven technical brain words may not be the same.

Results showed that, across the seven technical words AOIs, participants in the two reading groups did not differ significantly in terms of mean total fixation duration, first fixation duration, gaze duration, and rereading time, ps > .10. For the individual technical words AOIs, the results showed that the participants in the diagram-first group tended to spend a shorter total fixation duration on the prefrontal cortex than participants in the text-first group, t(25) = −1.69, p = .10, d = −.76. These two reading groups did not differ significantly on total fixation duration for the other six technical words AOIs, ps > .10. The diagram-first participants spent shorter first fixation and gaze durations on the anterior cingulate gyrus AOI than the text-first participants, t(26) = −2.12, p = .044, d = −.68; t(26) = −2.27, p = .034, d = −.71. These two reading groups did not differ significantly on first fixation and gaze durations for the other six technical words AOIs, ps > .10. In addition, the diagram-first participants had shorter rereading times on the thalamus AOI than the text-first participants, t(25) = −2.26, p = .033, d = −1.17. These two reading groups did not differ significantly on rereading times for the other six technical words AOIs, ps > .10.
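The four AOI measures reported above are not defined in the text. The sketch below uses definitions that are common in eye movement research and should be read as assumptions: first fixation duration is the duration of the first fixation on the AOI; gaze duration is the sum of fixations from first entering the AOI until first leaving it; total fixation duration is the sum of all fixations on the AOI; and rereading time is total fixation duration minus gaze duration. An AOI that is never fixated is returned as missing data, in line with the treatment described above. The trial data and function name are hypothetical.

```python
def aoi_measures(fixations, target):
    """Compute AOI measures from (aoi_label, duration_ms) pairs in temporal order.

    Returns None when the target AOI was never fixated (missing data)."""
    durations_on_target = [d for aoi, d in fixations if aoi == target]
    if not durations_on_target:
        return None

    # Index of the first fixation that lands on the target AOI.
    first_index = next(i for i, (aoi, _) in enumerate(fixations) if aoi == target)

    # Gaze duration: consecutive fixations on the target during the first visit.
    gaze = 0
    for aoi, duration in fixations[first_index:]:
        if aoi != target:
            break
        gaze += duration

    total = sum(durations_on_target)
    return {
        "first_fixation": fixations[first_index][1],
        "gaze": gaze,
        "total": total,
        "rereading": total - gaze,
    }


# Hypothetical fixation sequence: two first-pass fixations on "thalamus",
# followed by one later return to it.
trial = [("w1", 200), ("thalamus", 230), ("thalamus", 180), ("w2", 210), ("thalamus", 250)]
print(aoi_measures(trial, "thalamus"))
print(aoi_measures(trial, "amygdala"))
```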

Results for the four emotional-pathway sentence AOIs for the diagram-first and text-first participants are shown in Table 3. The diagram-first participants had significantly shorter rereading times on the superior-cortical pathway sentence AOI than the text-first participants, t(31) = −2.45, p = .021, d = −.81. However, the two groups did not differ significantly on the other three pathway sentence AOIs, ps > .10.

Table 3 Rereading time (s) on the emotional-pathway sentences in the text section for the participants of diagram-first and text-first groups

The above results showed no differences between the diagram-first and text-first participants for most of the pathway sentence AOIs. This might be because there were too many common words in the text, which diluted the effect of the technical brain words on eye movement patterns. Therefore, we conducted another analysis using only the technical brain words within each sentence, combined, as the unit of analysis. However, the eye movement patterns were similar to those for the emotional-pathway sentences that included common words. Diagram-first participants had significantly shorter rereading times on the superior-cortex pathway sentence AOI than the text-first participants, t(31) = −2.63, p = .013, d = −.85, but not on the other three pathway sentence AOIs, ps > .10.
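The effect sizes (d) reported alongside the t statistics can be read as standardized mean differences. The sketch below assumes Cohen's d computed with a pooled standard deviation, which is a common convention but is not stated in the text. The function name and the values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical rereading times (s) on one pathway sentence AOI.
diagram_first = [1.0, 2.0, 3.0]
text_first = [2.0, 3.0, 4.0]

d = cohens_d(diagram_first, text_first)
print(f"d = {d:.2f}")  # d = -1.00
```

With the diagram-first group entered first, a negative d means that group had shorter durations, which matches the sign convention of the values reported above.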

Discussion

There were two main findings in this study. First, we found that readers used other reading strategies while reading illustrated texts, in addition to the text-diagram referencing strategy commonly cited in the literature (Hegarty 1992; Hegarty and Just 1993). Second, we resolved the question that arose from the study of Mason et al. (2013a, b): at which reading stage do the labels on an illustration lead to semantic activation? Does this occur during the initial processing stage of decoding or the late processing stage of comprehension? We found that it occurs during initial processing. These two findings are discussed and compared with those of previous studies below.

Previous studies (Hegarty 1992; Hegarty and Just 1993) made use of experimental materials related to mechanics (such as the pulley system and the tank of a water closet) to arrive at the conclusion that the text-diagram referencing strategy is the main reading style used. However, our findings, and those of Jian and Wu (2012), show that when materials related to biology are used, a significant proportion of readers adopted other reading strategies, including text-first and diagram-first.

The findings of these two sets of studies differed even though both used adult readers as participants and the same type of experimental materials (containing both text and diagrams). One possible explanation for the discrepancy is the nature of the topics being read. The reading materials used by Hegarty (1992) and Hegarty and Just (1993) described mechanical systems, which can be understood only by relying heavily on spatial information. Hence, it was natural for readers to adopt a text-diagram referencing strategy, through which the spatial information in the diagram could be transformed into internal spatial representations. When this information was integrated with semantic representations of the text in working memory, the interaction of the two resulted in a dynamic representation, leading to comprehension of the overall mechanical system. In contrast, this study and that of Jian and Wu (2012) used reading materials related to biology. Readers could extract semantic information from the text and build a preliminary spatial representation to a certain extent; thus, they were relatively less dependent on the information provided by the labeled diagrams. This in turn led to the text-first reading strategy, in which readers focused entirely on the text before referring to the labeled diagrams. A small proportion of the readers did not refer to the labeled diagrams at all.

Besides these studies that examined adult readers, a recent pioneering study (Mason et al. 2013b) indicated that young readers also used different reading strategies while reading an illustrated scientific text, and that participants’ reading behavior was highly correlated with reading comprehension. Eye movement data showed that the more the participants referred to the text and diagram information, the better their learning performance was. Conversely, readers who seldom referred to the text and diagram information had lower reading comprehension scores. The result of our study was similar to that of Mason et al. (2013a, b): readers who looked only at the text and ignored the diagram performed worse than those who referred to both the text and the diagram frequently.

Despite the adoption of different strategies, the reading behavior of the three groups (text-diagram referencing, diagram-first, and text-first) all involved making references to the labeled diagrams, even though the referencing was performed at different time points during the reading process. There was no significant difference between the groups in terms of the ratio of the total time spent on focused gazing and time spent examining the labeled diagrams. This indicates that diagrams play an important role in helping readers to grasp concepts (Ainsworth 1999; Cook 2006; Ferk et al. 2003; Peeck 1993; Schnotz 2002).

To summarize, the reading behavior of the three groups of subjects involved examining the labeled diagrams, and the amount of time they spent reading was generally similar. The test performance of all three groups was equally good, indicating that the three reading strategies did not differ in efficiency or effectiveness. On the other hand, the subjects who did not look at the labeled diagrams at all did not perform as well, especially on questions that tested basic factual knowledge and required memorization of the related text. This was probably due to poorer memory retention (Jian and Wu 2012).

This result provided empirical support for dual coding theory (Paivio 1990), which might explain the poorer performance of the text-only group. According to this theory, encoding information in both verbal and pictorial representations leads to better performance than encoding it in a single representation. Another possible reason for the poorer performance of the text-only group might be that they did not take the task as seriously as the other groups. However, we could not determine whether the participants in the text-only group did not take the reading task seriously or were simply readers with poor reading strategies who did not refer to diagram information while reading an illustrated text. That is to say, the text-only group in this study might have contained both kinds of readers. Table 2 showed that the text-only group spent an average of 53.16 s (total fixation duration) reading the text (398 Chinese characters). Although this average reading time was shorter than that of the other three groups, the text-only group still spent a reasonable amount of time reading the text section, and their eye-fixation patterns showed that, like the other groups, they scanned the whole text rather than skipping sentences. In addition, the text-only group had saccade lengths similar to those of the other three groups while reading the article. In general, longer saccades indicate that the reading material is easier for the reader (Rayner 1998; Jian et al. 2013). Thus, the text-only group apparently did not find the article easier or more difficult than the other three groups did.

Semantic Representations of Scientific Diagrams

In terms of the processing of the semantic information related to the seven scientific terms, there was a tendency (although not a statistically significant one) for the diagram-first group to have, on average, shorter total fixation durations, first fixation durations, gaze durations, and rereading times than the text-first group. Next, the two groups were compared on each of the seven scientific terms in the text section individually. We found that whenever there was a significant (or marginally significant) difference in eye movements between the two groups, the durations were consistently shorter for the diagram-first group. This indicated a tendency for the diagram-first group to process the semantic information from the diagram first. This also indicates that labeled diagrams facilitate semantic processing in the initial stage of decoding words rather than in the later stage of comprehension, as demonstrated by the difference between the diagram-first and text-first groups in first fixation duration while reading the text. We have thus resolved the question stemming from the study of Mason et al. (2013a, b).

In theory, this finding provisionally extends the repetition priming effect (Ledoux et al. 2007; Liversedge et al. 2003) to other media of expression; specifically, a semantic stimulus presented as a label on a diagram can also facilitate the processing of its meaning when it is repeated in word form in the text. In practice, this finding is consistent with previous research (Mayer 1989; Mayer and Gallini 1990) showing that labels on a diagram that correspond to content in the text facilitate comprehension and concept learning.

The lack of significant differences in the two groups’ average scores on the seven scientific terms could be due to the subjects in the diagram-first group having used reading strategies that were not completely identical. For example, some of these subjects might have read each of the scientific terms on the labeled diagrams individually, while others could have processed the semantic information of only selected scientific terms. Still others might not have processed the semantic information within the labeled diagrams at all. As a result, the group’s average performance was reduced when the processing of semantic information was measured. Given that this group had only nine subjects, the statistical analysis could easily be distorted even if only a minority varied in their reading strategies, which would lead to inconsistencies within the group in terms of its average performance. However, under natural reading conditions, it is reasonable to find that the groups of readers adopting different reading types are unequal in size (e.g., nine diagram-first readers and 23 text-diagram referencing readers). Previous research (Mason et al. 2013a, b) found similar results: fourth-grade students varied in performance while reading a scientific article accompanied by a diagram, with 23 students highly integrating the text and diagram and seven students integrating them less.

Spatial Representations of Scientific Diagrams

For the processing of spatial information, we found that readers were able to extract this information from the labeled diagrams and use it to form representations of the various neural pathways used to convey messages in the brain. This conclusion confirms that diagrams are capable of depicting the configuration of components (Mayer 1989; Heiser and Tversky 2006). The supporting evidence is that the diagram-first group had significantly shorter rereading times than the text-first group on sentences about neural pathways in the text. However, the significant difference between the two groups applied only to the superior-cortex pathway and not to the other three neural pathways. One possible reason for this result is that the superior-cortex pathway was shown the most clearly within the diagram. The other three neural pathways had been revised during the preparation of the experimental material, causing their visual presentation to be relatively less distinctive.

To verify this possibility, we subsequently interviewed five social science undergraduates who shared similar backgrounds with the subjects of this study but did not take part in the study. When asked about what information they had seen from the labeled diagrams, most of the interviewees mentioned the superior-cortex pathway. Specifically, they were able to make the following statement: “After a person sees a snake, the image of the snake goes from the eyes to the hypothalamus, followed by the visual cortex. The message is then relayed to the amygdala before being sent to the brain stem…” Most of the interviewees did not pick out from the labeled diagrams the way messages could be relayed by the other three neural pathways. This indicates that the effectiveness of the spatial information conveyed by the labeled diagrams could have been affected by varying degrees of perceptual clarity. For future studies, researchers may wish to consider designing and using clearly printed diagrams in order to test whether readers are indeed able to form adequate pathway representations.

One further finding is worth noting. Although text-first and diagram-first readers spent similar total reading times on the text and diagram parts of the article, text-first readers made significantly fewer saccades between the text and the diagram than diagram-first readers. This result might be explained by the fact that the participants in this study were adult readers. Even though they had little prior knowledge of brain science, they had mature reading ability and automatic word recognition, so they were capable of learning new knowledge from reading words to some extent (Brusnighan and Folk 2012). Once they had learned some of the semantic and spatial information in the brain-pathway article, they may have relied less on diagram information, resulting in less integration of text and diagram. This text-directed reading is also well documented in empirical research (Hannus and Hyönä 1999; Hegarty and Just 1993; Jian and Wu 2012; Rayner et al. 2001).

Research Limitations

Although this study yielded some interesting and important findings, it had limitations. First, this was an exploratory study that did not manipulate any experimental variables; therefore, the findings cannot establish cause and effect. Second, we used only one article on biology as the reading material, so the findings about different reading strategies and reading types might be generalizable only to reading materials on biology. The reading strategies found in this study may not fully apply to articles on other topics or with other layouts. Third, our participants were undergraduates with limited knowledge of the brain; therefore, the reading strategies and eye movements observed in this study may generalize to adult readers with little background knowledge but not to adolescents or children, or to adult readers with more knowledge.

There is a relative lack of research on the strategies used for reading both text and diagrams in the literature as compared to research that focuses purely on reading text or looking at diagrams. This study attempted to address the deficiency in this area. We observed that in addition to the text-diagram referencing reading strategy, there were other strategies used by readers when reading articles containing diagrams. Furthermore, the role played by diagrams in the overall understanding of an article will affect the type of reading strategies adopted by readers. Finally, it has been verified that there are indeed readers who concurrently process both semantic and spatial information from diagrams. Thus, we can conclude that the use of clearly presented diagrams in scientific articles is important because these help readers process both the semantic and spatial information contained within the diagrams, thereby facilitating an overall understanding of the article.